All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
To: "Maciej W. Rozycki" <macro@orcam.me.uk>
Cc: Paul Cercueil <paul@crapouillou.net>,
	Nathan Chancellor <nathan@kernel.org>,
	Tiezhu Yang <yangtiezhu@loongson.cn>,
	linux-mips@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] MIPS: Rewrite `csum_tcpudp_nofold' in plain C
Date: Mon, 23 May 2022 11:40:05 +0200	[thread overview]
Message-ID: <20220523094005.GB6296@alpha.franken.de> (raw)
In-Reply-To: <alpine.DEB.2.21.2205222035380.52080@angie.orcam.me.uk>

On Sun, May 22, 2022 at 09:48:14PM +0100, Maciej W. Rozycki wrote:
> Recent commit 198688edbf77 ("MIPS: Fix inline asm input/output type 
> mismatch in checksum.h used with Clang") introduced a code size and 
> performance regression with 64-bit code emitted for `csum_tcpudp_nofold' 
> by GCC, caused by a redundant truncation operation produced due to a 
> data type change made to the variable associated with the inline 
> assembly's output operand.
> 
> The intent previously expressed here with operands and constraints for 
> optimal code was to have the output operand share a register with one 
> inputs, both of a different integer type each.  This is perfectly valid 
> with the MIPS psABI where a register can hold integer data of different 
> types and the assembly code used here makes data stored in the output 
> register match the data type used with the output operand, however it 
> has turned out impossible to express this arrangement in source code 
> such as to satisfy LLVM, apparently due to the compiler's internal 
> limitations.
> 
> There is nothing peculiar about the inline assembly `csum_tcpudp_nofold' 
> includes however, though it does choose assembly instructions carefully.
> 
> Rewrite this piece of assembly in plain C then, using corresponding C 
> language operations, making GCC produce the same assembly instructions, 
> possibly shuffled, in the general case and sometimes actually fewer of 
> them where an input is constant, because the compiler does not have to 
> reload it to a register (operand constraints could be adjusted for that, 
> but the plain C approach is cleaner anyway).
> 
> Example code size changes are as follows, for a 32-bit configuration:
> 
>       text       data        bss      total filename
>    5920480    1347236     126592    7394308 vmlinux-old
>    5920480    1347236     126592    7394308 vmlinux-now
>    5919728    1347236     126592    7393556 vmlinux-c
> 
> and for a 64-bit configuration:
> 
>       text       data        bss      total filename
>    6024112    1790828     225728    8040668 vmlinux-old
>    6024128    1790828     225728    8040684 vmlinux-now
>    6023760    1790828     225728    8040316 vmlinux-c
> 
> respectively, where "old" is with the commit referred reverted, "now" is 
> with no change, and "c" is with this change applied.
> 
> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
> ---
> Hi,
> 
>  I have visually inspected code produced and verified this change to boot 
> with TCP networking performing just fine, both with a 32-bit and a 64-bit 
> configuration.  Sadly with the little endianness only, because in the 
> course of this verification I have discovered the core card of my Malta 
> board bit the dust a few days ago, apparently in a permanent manner, and I 
> have no other big-endian MIPS system available here to try.
> 
>  The only difference between the two endiannesses is the left-shift 
> operation on (proto + len) however, which doesn't happen for big-endian 
> configurations, so the little endianness should in principle provide 
> enough coverage.
> 
>  Also I'm leaving it to LLVM folks to verify, however this is plain C, so 
> it is expected to just work.
> 
>  Please apply.
> 
>   Maciej
> ---
>  arch/mips/include/asm/checksum.h |   71 ++++++++++++++++++---------------------

applied to mips-next.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.                                                [ RFC1925, 2.3 ]

  reply	other threads:[~2022-05-23 10:30 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-22 20:48 [PATCH] MIPS: Rewrite `csum_tcpudp_nofold' in plain C Maciej W. Rozycki
2022-05-23  9:40 ` Thomas Bogendoerfer [this message]
2022-05-24 16:38 ` Florian Fainelli
2022-05-24 17:18   ` Maciej W. Rozycki
2022-05-24 18:30     ` Florian Fainelli
2022-05-25 15:01       ` Maciej W. Rozycki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220523094005.GB6296@alpha.franken.de \
    --to=tsbogend@alpha.franken.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=macro@orcam.me.uk \
    --cc=nathan@kernel.org \
    --cc=paul@crapouillou.net \
    --cc=yangtiezhu@loongson.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.