linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stafford Horne <shorne@gmail.com>
To: Arnd Bergmann <arnd@kernel.org>
Cc: linux-arch <linux-arch@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Vineet Gupta <vgupta@synopsys.com>, Arnd Bergmann <arnd@arndb.de>,
	Jonas Bonn <jonas@southpole.se>,
	Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>,
	Openrisc <openrisc@lists.librecores.org>,
	"Linux-Kernel@Vger. Kernel. Org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC 02/12] openrisc: always use unaligned-struct header
Date: Sat, 8 May 2021 08:02:17 +0900	[thread overview]
Message-ID: <CAAfxs77U4-ojjmsX1uQ9QwhnKa69Aqcs=br+H-Yc7E7RNODqvQ@mail.gmail.com> (raw)
In-Reply-To: <20210507220813.365382-3-arnd@kernel.org>

On Sat, May 8, 2021 at 7:10 AM Arnd Bergmann <arnd@kernel.org> wrote:
>
> From: Arnd Bergmann <arnd@arndb.de>
>
> openrisc is the only architecture using the linux/unaligned/*memmove
> infrastructure. There is a comment saying that this version is more
> efficient, but this was added in 2011 before the openrisc gcc port
> was merged upstream.
>
> I checked a couple of files to see what the actual difference is with
> the mainline gcc (9.4 and 11.1), and found that the generic header
> seems to produce better code now, regardless of the gcc version.
>
> Specifically, the be_memmove leads to allocating a stack slot and
> copying the data one byte at a time, then reading the whole word
> from the stack:
>
> 00000000 <test_get_unaligned_memmove>:
>    0:   9c 21 ff f4     l.addi r1,r1,-12
>    4:   d4 01 10 04     l.sw 4(r1),r2
>    8:   8e 63 00 00     l.lbz r19,0(r3)
>    c:   9c 41 00 0c     l.addi r2,r1,12
>   10:   8e 23 00 01     l.lbz r17,1(r3)
>   14:   db e2 9f f4     l.sb -12(r2),r19
>   18:   db e2 8f f5     l.sb -11(r2),r17
>   1c:   8e 63 00 02     l.lbz r19,2(r3)
>   20:   8e 23 00 03     l.lbz r17,3(r3)
>   24:   d4 01 48 08     l.sw 8(r1),r9
>   28:   db e2 9f f6     l.sb -10(r2),r19
>   2c:   db e2 8f f7     l.sb -9(r2),r17
>   30:   85 62 ff f4     l.lwz r11,-12(r2)
>   34:   85 21 00 08     l.lwz r9,8(r1)
>   38:   84 41 00 04     l.lwz r2,4(r1)
>   3c:   44 00 48 00     l.jr r9
>   40:   9c 21 00 0c     l.addi r1,r1,12
>
> while the be_struct version reads each byte into a register
> and does a shift to the right position:
>
> 00000000 <test_get_unaligned_struct>:
>    0:   9c 21 ff f8     l.addi r1,r1,-8
>    4:   8e 63 00 00     l.lbz r19,0(r3)
>    8:   aa 20 00 18     l.ori r17,r0,0x18
>    c:   e2 73 88 08     l.sll r19,r19,r17
>   10:   8d 63 00 01     l.lbz r11,1(r3)
>   14:   aa 20 00 10     l.ori r17,r0,0x10
>   18:   e1 6b 88 08     l.sll r11,r11,r17
>   1c:   e1 6b 98 04     l.or r11,r11,r19
>   20:   8e 23 00 02     l.lbz r17,2(r3)
>   24:   aa 60 00 08     l.ori r19,r0,0x8
>   28:   e2 31 98 08     l.sll r17,r17,r19
>   2c:   d4 01 10 00     l.sw 0(r1),r2
>   30:   d4 01 48 04     l.sw 4(r1),r9
>   34:   9c 41 00 08     l.addi r2,r1,8
>   38:   e2 31 58 04     l.or r17,r17,r11
>   3c:   8d 63 00 03     l.lbz r11,3(r3)
>   40:   e1 6b 88 04     l.or r11,r11,r17
>   44:   84 41 00 00     l.lwz r2,0(r1)
>   48:   85 21 00 04     l.lwz r9,4(r1)
>   4c:   44 00 48 00     l.jr r9
>   50:   9c 21 00 08     l.addi r1,r1,8
>
> I don't know how the loads/store perform compared to the shift version
> on a particular microarchitecture, but my guess is that the shifts
> are better.

Thanks for doing the investigation on this as well.

Load stores are slow like on most architectures.  WIth caching it will
be faster, but
still not faster than the shifts.  So this looks good to me.

> In the trivial example, the struct version is a few instructions longer,
> but building a whole kernel shows an overall reduction in code size,
> presumably because it now has to manage fewer stack slots:
>
>    text    data     bss     dec     hex filename
> 4792010  181480   82324 5055814  4d2546 vmlinux-unaligned-memmove
> 4790642  181480   82324 5054446  4d1fee vmlinux-unaligned-struct

That's a plus.

> Remove the memmove version completely and let openrisc use the same
> code as everyone else, as a simplification.
>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Acked-by: Stafford Horne <shorne@gmail.com>

> ---
>  arch/openrisc/include/asm/unaligned.h | 47 ---------------------------
>  include/linux/unaligned/be_memmove.h  | 37 ---------------------
>  include/linux/unaligned/le_memmove.h  | 37 ---------------------
>  include/linux/unaligned/memmove.h     | 46 --------------------------
>  4 files changed, 167 deletions(-)
>  delete mode 100644 arch/openrisc/include/asm/unaligned.h
>  delete mode 100644 include/linux/unaligned/be_memmove.h
>  delete mode 100644 include/linux/unaligned/le_memmove.h
>  delete mode 100644 include/linux/unaligned/memmove.h

Thanks again,

-Stafford

  reply	other threads:[~2021-05-07 23:02 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-07 22:07 [RFC 0/12] Unify asm/unaligned.h around struct helper Arnd Bergmann
2021-05-07 22:07 ` [RFC 01/12] asm-generic: use asm-generic/unaligned.h for most architectures Arnd Bergmann
2021-05-07 23:02   ` Thomas Gleixner
2021-05-10 10:16   ` Geert Uytterhoeven
2021-05-10 13:12     ` Arnd Bergmann
2021-05-07 22:07 ` [RFC 02/12] openrisc: always use unaligned-struct header Arnd Bergmann
2021-05-07 23:02   ` Stafford Horne [this message]
2021-05-08 11:42   ` David Laight
2021-05-07 22:07 ` [RFC 03/12] sh: remove unaligned access for sh4a Arnd Bergmann
2021-05-10 21:11   ` Rob Landley
2021-05-07 22:07 ` [RFC 04/12] m68k: select CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS Arnd Bergmann
2021-05-10 10:18   ` Geert Uytterhoeven
2021-05-07 22:07 ` [RFC 05/12] powerpc: use linux/unaligned/le_struct.h on LE power7 Arnd Bergmann
2021-05-07 22:07 ` [RFC 06/12] asm-generic: unaligned: remove byteshift helpers Arnd Bergmann
2021-05-08 11:38   ` Arnd Bergmann
2021-05-07 22:07 ` [RFC 07/12] asm-generic: unaligned always use struct helpers Arnd Bergmann
2021-05-07 22:07 ` [RFC 08/12] partitions: msdos: fix one-byte get_unaligned() Arnd Bergmann
2021-05-07 22:07 ` [RFC 09/12] apparmor: use get_unaligned() only for multi-byte words Arnd Bergmann
2021-05-10  8:17   ` John Johansen
2021-05-07 22:07 ` [RFC 10/12] mwifiex: re-fix for unaligned accesses Arnd Bergmann
2021-05-07 22:07 ` [RFC 11/12] netpoll: avoid put_unaligned() on single character Arnd Bergmann
2021-05-07 22:07 ` [RFC 12/12] asm-generic: simplify asm/unaligned.h Arnd Bergmann
2021-05-07 23:54   ` Linus Torvalds
2021-05-08  9:28     ` Arnd Bergmann
2021-05-08 15:23       ` Linus Torvalds
2021-05-08 11:03   ` David Laight
2021-05-08 14:18     ` David Laight
2021-05-10  6:39   ` Geert Uytterhoeven
2021-05-07 22:07 ` [RFC 0/12] Unify asm/unaligned.h around struct helper Arnd Bergmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAAfxs77U4-ojjmsX1uQ9QwhnKa69Aqcs=br+H-Yc7E7RNODqvQ@mail.gmail.com' \
    --to=shorne@gmail.com \
    --cc=arnd@arndb.de \
    --cc=arnd@kernel.org \
    --cc=jonas@southpole.se \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=openrisc@lists.librecores.org \
    --cc=stefan.kristiansson@saunalahti.fi \
    --cc=torvalds@linux-foundation.org \
    --cc=vgupta@synopsys.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).