linux-sh.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/13] Unify asm/unaligned.h around struct helper
@ 2021-05-14 10:00 Arnd Bergmann
  2021-05-14 10:00 ` [PATCH v2 03/13] sh: remove unaligned access for sh4a Arnd Bergmann
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Arnd Bergmann @ 2021-05-14 10:00 UTC (permalink / raw)
  To: linux-arch
  Cc: Linus Torvalds, Vineet Gupta, Arnd Bergmann, Amitkumar Karwar,
	Benjamin Herrenschmidt, Borislav Petkov, Eric Dumazet,
	Florian Fainelli, Ganapathi Bhat, Geert Uytterhoeven,
	H. Peter Anvin, Ingo Molnar, Jakub Kicinski, James Morris,
	Jens Axboe, John Johansen, Jonas Bonn, Kalle Valo,
	Michael Ellerman, Paul Mackerras, Rich Felker,
	Richard Russon (FlatCap),
	Russell King, Serge E. Hallyn, Sharvari Harisangam,
	Stafford Horne, Stefan Kristiansson, Thomas Gleixner,
	Vladimir Oltean, Xinming Hu, Yoshinori Sato, x86, linux-kernel,
	linux-arm-kernel, linux-m68k, linux-crypto, openrisc,
	linuxppc-dev, linux-sh, sparclinux, linux-ntfs-dev, linux-block,
	linux-wireless, netdev, linux-security-module

From: Arnd Bergmann <arnd@arndb.de>

The get_unaligned()/put_unaligned() helpers are traditionally architecture
specific, with the two main variants being the "access-ok.h" version
that assumes unaligned pointer accesses always work on a particular
architecture, and the "le-struct.h" version that casts the data to a
byte aligned type before dereferencing, for architectures that cannot
always do unaligned accesses in hardware.

Based on the discussion linked below, it appears that the access-ok
version is not realiable on any architecture, but the struct version
probably has no downsides. This series changes the code to use the
same implementation on all architectures, addressing the few exceptions
separately.

I've included this version in the asm-generic tree for 5.14 already,
addressing the few issues that were pointed out in the RFC. If there
are any remaining problems, I hope those can be addressed as follow-up
patches.

        Arnd

Link: https://lore.kernel.org/lkml/75d07691-1e4f-741f-9852-38c0b4f520bc@synopsys.com/
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363
Link: https://lore.kernel.org/lkml/20210507220813.365382-14-arnd@kernel.org/
Link: git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git unaligned-rework-v2


Arnd Bergmann (13):
  asm-generic: use asm-generic/unaligned.h for most architectures
  openrisc: always use unaligned-struct header
  sh: remove unaligned access for sh4a
  m68k: select CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
  powerpc: use linux/unaligned/le_struct.h on LE power7
  asm-generic: unaligned: remove byteshift helpers
  asm-generic: unaligned always use struct helpers
  partitions: msdos: fix one-byte get_unaligned()
  apparmor: use get_unaligned() only for multi-byte words
  mwifiex: re-fix for unaligned accesses
  netpoll: avoid put_unaligned() on single character
  asm-generic: uaccess: 1-byte access is always aligned
  asm-generic: simplify asm/unaligned.h

 arch/alpha/include/asm/unaligned.h          |  12 --
 arch/arm/include/asm/unaligned.h            |  27 ---
 arch/ia64/include/asm/unaligned.h           |  12 --
 arch/m68k/Kconfig                           |   1 +
 arch/m68k/include/asm/unaligned.h           |  26 ---
 arch/microblaze/include/asm/unaligned.h     |  27 ---
 arch/mips/crypto/crc32-mips.c               |   2 +-
 arch/openrisc/include/asm/unaligned.h       |  47 -----
 arch/parisc/include/asm/unaligned.h         |   6 +-
 arch/powerpc/include/asm/unaligned.h        |  22 ---
 arch/sh/include/asm/unaligned-sh4a.h        | 199 --------------------
 arch/sh/include/asm/unaligned.h             |  13 --
 arch/sparc/include/asm/unaligned.h          |  11 --
 arch/x86/include/asm/unaligned.h            |  15 --
 arch/xtensa/include/asm/unaligned.h         |  29 ---
 block/partitions/ldm.h                      |   2 +-
 block/partitions/msdos.c                    |   2 +-
 drivers/net/wireless/marvell/mwifiex/pcie.c |  10 +-
 include/asm-generic/uaccess.h               |   4 +-
 include/asm-generic/unaligned.h             | 141 +++++++++++---
 include/linux/unaligned/access_ok.h         |  68 -------
 include/linux/unaligned/be_byteshift.h      |  71 -------
 include/linux/unaligned/be_memmove.h        |  37 ----
 include/linux/unaligned/be_struct.h         |  37 ----
 include/linux/unaligned/generic.h           | 115 -----------
 include/linux/unaligned/le_byteshift.h      |  71 -------
 include/linux/unaligned/le_memmove.h        |  37 ----
 include/linux/unaligned/le_struct.h         |  37 ----
 include/linux/unaligned/memmove.h           |  46 -----
 net/core/netpoll.c                          |   4 +-
 security/apparmor/policy_unpack.c           |   2 +-
 31 files changed, 131 insertions(+), 1002 deletions(-)
 delete mode 100644 arch/alpha/include/asm/unaligned.h
 delete mode 100644 arch/arm/include/asm/unaligned.h
 delete mode 100644 arch/ia64/include/asm/unaligned.h
 delete mode 100644 arch/m68k/include/asm/unaligned.h
 delete mode 100644 arch/microblaze/include/asm/unaligned.h
 delete mode 100644 arch/openrisc/include/asm/unaligned.h
 delete mode 100644 arch/powerpc/include/asm/unaligned.h
 delete mode 100644 arch/sh/include/asm/unaligned-sh4a.h
 delete mode 100644 arch/sh/include/asm/unaligned.h
 delete mode 100644 arch/sparc/include/asm/unaligned.h
 delete mode 100644 arch/x86/include/asm/unaligned.h
 delete mode 100644 arch/xtensa/include/asm/unaligned.h
 delete mode 100644 include/linux/unaligned/access_ok.h
 delete mode 100644 include/linux/unaligned/be_byteshift.h
 delete mode 100644 include/linux/unaligned/be_memmove.h
 delete mode 100644 include/linux/unaligned/be_struct.h
 delete mode 100644 include/linux/unaligned/generic.h
 delete mode 100644 include/linux/unaligned/le_byteshift.h
 delete mode 100644 include/linux/unaligned/le_memmove.h
 delete mode 100644 include/linux/unaligned/le_struct.h
 delete mode 100644 include/linux/unaligned/memmove.h

-- 
2.29.2

Cc: Amitkumar Karwar <amitkarwar@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ganapathi Bhat <ganapathi017@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: James Morris <jmorris@namei.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: "Richard Russon (FlatCap)" <ldm@flatcap.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Sharvari Harisangam <sharvari.harisangam@nxp.com>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vladimir Oltean <vladimir.oltean@nxp.com>
Cc: Xinming Hu <huxinming820@gmail.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-crypto@vger.kernel.org
Cc: openrisc@lists.librecores.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-ntfs-dev@lists.sourceforge.net
Cc: linux-block@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-security-module@vger.kernel.org



^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2 03/13] sh: remove unaligned access for sh4a
  2021-05-14 10:00 [PATCH v2 00/13] Unify asm/unaligned.h around struct helper Arnd Bergmann
@ 2021-05-14 10:00 ` Arnd Bergmann
  2021-05-14 10:34   ` John Paul Adrian Glaubitz
  2021-05-14 17:32 ` [PATCH v2 00/13] Unify asm/unaligned.h around struct helper Linus Torvalds
  2021-12-16 17:29 ` Ard Biesheuvel
  2 siblings, 1 reply; 18+ messages in thread
From: Arnd Bergmann @ 2021-05-14 10:00 UTC (permalink / raw)
  To: linux-arch
  Cc: Linus Torvalds, Vineet Gupta, Arnd Bergmann, Yoshinori Sato,
	Rich Felker, linux-sh, linux-kernel

From: Arnd Bergmann <arnd@arndb.de>

Unlike every other architecture, sh4a uses an inline asm implementation
for get_unaligned(). I have shown that this produces better object
code than the asm-generic version. However, there are very few users of
arch/sh/ overall, and most of those seem to use sh4 rather than sh4a CPU
cores, so it seems not worth keeping the complexity in the architecture
independent code.

Change over to the generic version to allow simplifying that in a
follow-up patch.

If there are sh4a users that want the best performance, it would probably
be best to add support for the movua instruction in gcc itself, as this
would not just help get_unaligned() callers but any code that accesses
a __packed variable in user space or kernel.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/sh/include/asm/unaligned-sh4a.h | 199 ---------------------------
 arch/sh/include/asm/unaligned.h      |  13 --
 2 files changed, 212 deletions(-)
 delete mode 100644 arch/sh/include/asm/unaligned-sh4a.h
 delete mode 100644 arch/sh/include/asm/unaligned.h

diff --git a/arch/sh/include/asm/unaligned-sh4a.h b/arch/sh/include/asm/unaligned-sh4a.h
deleted file mode 100644
index d311f00ed530..000000000000
--- a/arch/sh/include/asm/unaligned-sh4a.h
+++ /dev/null
@@ -1,199 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __ASM_SH_UNALIGNED_SH4A_H
-#define __ASM_SH_UNALIGNED_SH4A_H
-
-/*
- * SH-4A has support for unaligned 32-bit loads, and 32-bit loads only.
- * Support for 64-bit accesses are done through shifting and masking
- * relative to the endianness. Unaligned stores are not supported by the
- * instruction encoding, so these continue to use the packed
- * struct.
- *
- * The same note as with the movli.l/movco.l pair applies here, as long
- * as the load is guaranteed to be inlined, nothing else will hook in to
- * r0 and we get the return value for free.
- *
- * NOTE: Due to the fact we require r0 encoding, care should be taken to
- * avoid mixing these heavily with other r0 consumers, such as the atomic
- * ops. Failure to adhere to this can result in the compiler running out
- * of spill registers and blowing up when building at low optimization
- * levels. See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34777.
- */
-#include <linux/unaligned/packed_struct.h>
-#include <linux/types.h>
-#include <asm/byteorder.h>
-
-static inline u16 sh4a_get_unaligned_cpu16(const u8 *p)
-{
-#ifdef __LITTLE_ENDIAN
-	return p[0] | p[1] << 8;
-#else
-	return p[0] << 8 | p[1];
-#endif
-}
-
-static __always_inline u32 sh4a_get_unaligned_cpu32(const u8 *p)
-{
-	unsigned long unaligned;
-
-	__asm__ __volatile__ (
-		"movua.l	@%1, %0\n\t"
-		 : "=z" (unaligned)
-		 : "r" (p)
-	);
-
-	return unaligned;
-}
-
-/*
- * Even though movua.l supports auto-increment on the read side, it can
- * only store to r0 due to instruction encoding constraints, so just let
- * the compiler sort it out on its own.
- */
-static inline u64 sh4a_get_unaligned_cpu64(const u8 *p)
-{
-#ifdef __LITTLE_ENDIAN
-	return (u64)sh4a_get_unaligned_cpu32(p + 4) << 32 |
-		    sh4a_get_unaligned_cpu32(p);
-#else
-	return (u64)sh4a_get_unaligned_cpu32(p) << 32 |
-		    sh4a_get_unaligned_cpu32(p + 4);
-#endif
-}
-
-static inline u16 get_unaligned_le16(const void *p)
-{
-	return le16_to_cpu(sh4a_get_unaligned_cpu16(p));
-}
-
-static inline u32 get_unaligned_le32(const void *p)
-{
-	return le32_to_cpu(sh4a_get_unaligned_cpu32(p));
-}
-
-static inline u64 get_unaligned_le64(const void *p)
-{
-	return le64_to_cpu(sh4a_get_unaligned_cpu64(p));
-}
-
-static inline u16 get_unaligned_be16(const void *p)
-{
-	return be16_to_cpu(sh4a_get_unaligned_cpu16(p));
-}
-
-static inline u32 get_unaligned_be32(const void *p)
-{
-	return be32_to_cpu(sh4a_get_unaligned_cpu32(p));
-}
-
-static inline u64 get_unaligned_be64(const void *p)
-{
-	return be64_to_cpu(sh4a_get_unaligned_cpu64(p));
-}
-
-static inline void nonnative_put_le16(u16 val, u8 *p)
-{
-	*p++ = val;
-	*p++ = val >> 8;
-}
-
-static inline void nonnative_put_le32(u32 val, u8 *p)
-{
-	nonnative_put_le16(val, p);
-	nonnative_put_le16(val >> 16, p + 2);
-}
-
-static inline void nonnative_put_le64(u64 val, u8 *p)
-{
-	nonnative_put_le32(val, p);
-	nonnative_put_le32(val >> 32, p + 4);
-}
-
-static inline void nonnative_put_be16(u16 val, u8 *p)
-{
-	*p++ = val >> 8;
-	*p++ = val;
-}
-
-static inline void nonnative_put_be32(u32 val, u8 *p)
-{
-	nonnative_put_be16(val >> 16, p);
-	nonnative_put_be16(val, p + 2);
-}
-
-static inline void nonnative_put_be64(u64 val, u8 *p)
-{
-	nonnative_put_be32(val >> 32, p);
-	nonnative_put_be32(val, p + 4);
-}
-
-static inline void put_unaligned_le16(u16 val, void *p)
-{
-#ifdef __LITTLE_ENDIAN
-	__put_unaligned_cpu16(val, p);
-#else
-	nonnative_put_le16(val, p);
-#endif
-}
-
-static inline void put_unaligned_le32(u32 val, void *p)
-{
-#ifdef __LITTLE_ENDIAN
-	__put_unaligned_cpu32(val, p);
-#else
-	nonnative_put_le32(val, p);
-#endif
-}
-
-static inline void put_unaligned_le64(u64 val, void *p)
-{
-#ifdef __LITTLE_ENDIAN
-	__put_unaligned_cpu64(val, p);
-#else
-	nonnative_put_le64(val, p);
-#endif
-}
-
-static inline void put_unaligned_be16(u16 val, void *p)
-{
-#ifdef __BIG_ENDIAN
-	__put_unaligned_cpu16(val, p);
-#else
-	nonnative_put_be16(val, p);
-#endif
-}
-
-static inline void put_unaligned_be32(u32 val, void *p)
-{
-#ifdef __BIG_ENDIAN
-	__put_unaligned_cpu32(val, p);
-#else
-	nonnative_put_be32(val, p);
-#endif
-}
-
-static inline void put_unaligned_be64(u64 val, void *p)
-{
-#ifdef __BIG_ENDIAN
-	__put_unaligned_cpu64(val, p);
-#else
-	nonnative_put_be64(val, p);
-#endif
-}
-
-/*
- * While it's a bit non-obvious, even though the generic le/be wrappers
- * use the __get/put_xxx prefixing, they actually wrap in to the
- * non-prefixed get/put_xxx variants as provided above.
- */
-#include <linux/unaligned/generic.h>
-
-#ifdef __LITTLE_ENDIAN
-# define get_unaligned __get_unaligned_le
-# define put_unaligned __put_unaligned_le
-#else
-# define get_unaligned __get_unaligned_be
-# define put_unaligned __put_unaligned_be
-#endif
-
-#endif /* __ASM_SH_UNALIGNED_SH4A_H */
diff --git a/arch/sh/include/asm/unaligned.h b/arch/sh/include/asm/unaligned.h
deleted file mode 100644
index 0c92e2c73af4..000000000000
--- a/arch/sh/include/asm/unaligned.h
+++ /dev/null
@@ -1,13 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_SH_UNALIGNED_H
-#define _ASM_SH_UNALIGNED_H
-
-#ifdef CONFIG_CPU_SH4A
-/* SH-4A can handle unaligned loads in a relatively neutered fashion. */
-#include <asm/unaligned-sh4a.h>
-#else
-/* Otherwise, SH can't handle unaligned accesses. */
-#include <asm-generic/unaligned.h>
-#endif
-
-#endif /* _ASM_SH_UNALIGNED_H */
-- 
2.29.2


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 03/13] sh: remove unaligned access for sh4a
  2021-05-14 10:00 ` [PATCH v2 03/13] sh: remove unaligned access for sh4a Arnd Bergmann
@ 2021-05-14 10:34   ` John Paul Adrian Glaubitz
  2021-05-14 12:22     ` Arnd Bergmann
  0 siblings, 1 reply; 18+ messages in thread
From: John Paul Adrian Glaubitz @ 2021-05-14 10:34 UTC (permalink / raw)
  To: Arnd Bergmann, linux-arch
  Cc: Linus Torvalds, Vineet Gupta, Arnd Bergmann, Yoshinori Sato,
	Rich Felker, linux-sh, linux-kernel

Hi Arnd!

On 5/14/21 12:00 PM, Arnd Bergmann wrote:
> Unlike every other architecture, sh4a uses an inline asm implementation
> for get_unaligned(). I have shown that this produces better object
> code than the asm-generic version. However, there are very few users of
> arch/sh/ overall, and most of those seem to use sh4 rather than sh4a CPU
> cores, so it seems not worth keeping the complexity in the architecture
> independent code.

My Renesas SH4-Boards actually run an sh4a-Kernel, not an sh4-Kernel:

root@tirpitz:~> uname -a
Linux tirpitz 5.11.0-rc4-00012-g10c03c5bf422 #161 PREEMPT Mon Jan 18 21:10:17 CET 2021 sh4a GNU/Linux
root@tirpitz:~>

So, if this change reduces performance on sh4a, I would rather not merge it.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaubitz@debian.org
`. `'   Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 03/13] sh: remove unaligned access for sh4a
  2021-05-14 10:34   ` John Paul Adrian Glaubitz
@ 2021-05-14 12:22     ` Arnd Bergmann
  2021-05-15 15:36       ` John Paul Adrian Glaubitz
  0 siblings, 1 reply; 18+ messages in thread
From: Arnd Bergmann @ 2021-05-14 12:22 UTC (permalink / raw)
  To: John Paul Adrian Glaubitz
  Cc: linux-arch, Linus Torvalds, Vineet Gupta, Yoshinori Sato,
	Rich Felker, Linux-sh list, Linux Kernel Mailing List

On Fri, May 14, 2021 at 12:34 PM John Paul Adrian Glaubitz
<glaubitz@physik.fu-berlin.de> wrote:
>
> Hi Arnd!
>
> On 5/14/21 12:00 PM, Arnd Bergmann wrote:
> > Unlike every other architecture, sh4a uses an inline asm implementation
> > for get_unaligned(). I have shown that this produces better object
> > code than the asm-generic version. However, there are very few users of
> > arch/sh/ overall, and most of those seem to use sh4 rather than sh4a CPU
> > cores, so it seems not worth keeping the complexity in the architecture
> > independent code.
>
> My Renesas SH4-Boards actually run an sh4a-Kernel, not an sh4-Kernel:
>
> root@tirpitz:~> uname -a
> Linux tirpitz 5.11.0-rc4-00012-g10c03c5bf422 #161 PREEMPT Mon Jan 18 21:10:17 CET 2021 sh4a GNU/Linux
> root@tirpitz:~>
>
> So, if this change reduces performance on sh4a, I would rather not merge it.

It only makes a difference in very specific scenarios in which unaligned
accesses are done in a fast path, e.g. when forwarding network packet
at a high rate on a big-endian kernel (little-endian kernels wouldn't run into
this on IP headers). If you have a use case for this machine on which the
you can show a performance regression, I can add a patch on top to put
the optimized sh4a get_unaligned_le32() back. Dropping this patch
altogether would make the series much more complex because most of
the associated code gets removed in the end.

As I mentioned, supporting "movua" in the compiler likely has a much
larger impact on performance, as it would also help in user space, and
it should improve the networking case on little-endian kernels by replacing
the four separate byte loads/shift pairs with a movua plus a byteswap.

Not sure if there are gcc developers that have an active interest in sh4a
support any more.

      Arnd

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 00/13] Unify asm/unaligned.h around struct helper
  2021-05-14 10:00 [PATCH v2 00/13] Unify asm/unaligned.h around struct helper Arnd Bergmann
  2021-05-14 10:00 ` [PATCH v2 03/13] sh: remove unaligned access for sh4a Arnd Bergmann
@ 2021-05-14 17:32 ` Linus Torvalds
  2021-05-14 18:51   ` Vineet Gupta
  2021-05-14 19:31   ` Arnd Bergmann
  2021-12-16 17:29 ` Ard Biesheuvel
  2 siblings, 2 replies; 18+ messages in thread
From: Linus Torvalds @ 2021-05-14 17:32 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arch, Vineet Gupta, Arnd Bergmann, Amitkumar Karwar,
	Benjamin Herrenschmidt, Borislav Petkov, Eric Dumazet,
	Florian Fainelli, Ganapathi Bhat, Geert Uytterhoeven,
	H. Peter Anvin, Ingo Molnar, Jakub Kicinski, James Morris,
	Jens Axboe, John Johansen, Jonas Bonn, Kalle Valo,
	Michael Ellerman, Paul Mackerras, Rich Felker,
	Richard Russon (FlatCap),
	Russell King, Serge E. Hallyn, Sharvari Harisangam,
	Stafford Horne, Stefan Kristiansson, Thomas Gleixner,
	Vladimir Oltean, Xinming Hu, Yoshinori Sato,
	the arch/x86 maintainers, Linux Kernel Mailing List, Linux ARM,
	linux-m68k, Linux Crypto Mailing List, openrisc, linuxppc-dev,
	Linux-sh list, linux-sparc, linux-ntfs-dev, linux-block,
	linux-wireless, Netdev, LSM List

On Fri, May 14, 2021 at 3:02 AM Arnd Bergmann <arnd@kernel.org> wrote:
>
> I've included this version in the asm-generic tree for 5.14 already,
> addressing the few issues that were pointed out in the RFC. If there
> are any remaining problems, I hope those can be addressed as follow-up
> patches.

This continues to look great to me, and now has the even simpler
remaining implementation.

I'd be tempted to just pull it in for 5.13, but I guess we don't
actually have any _outstanding_ bug in this area (the bug was in our
zlib code, required -O3 to trigger, has been fixed now, and the biggy
case didn't even use "get_unaligned()").

So I guess your 5.14 timing is the right thing to do.

        Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 00/13] Unify asm/unaligned.h around struct helper
  2021-05-14 17:32 ` [PATCH v2 00/13] Unify asm/unaligned.h around struct helper Linus Torvalds
@ 2021-05-14 18:51   ` Vineet Gupta
  2021-05-14 19:22     ` Linus Torvalds
  2021-05-14 19:31   ` Arnd Bergmann
  1 sibling, 1 reply; 18+ messages in thread
From: Vineet Gupta @ 2021-05-14 18:51 UTC (permalink / raw)
  To: Linus Torvalds, Arnd Bergmann
  Cc: linux-arch, Arnd Bergmann, Amitkumar Karwar,
	Benjamin Herrenschmidt, Borislav Petkov, Eric Dumazet,
	Florian Fainelli, Ganapathi Bhat, Geert Uytterhoeven,
	H. Peter Anvin, Ingo Molnar, Jakub Kicinski, James Morris,
	Jens Axboe, John Johansen, Jonas Bonn, Kalle Valo,
	Michael Ellerman, Paul Mackerras, Rich Felker,
	Richard Russon (FlatCap),
	Russell King, Serge E. Hallyn, Sharvari Harisangam,
	Stafford Horne, Stefan Kristiansson, Thomas Gleixner,
	Vladimir Oltean, Xinming Hu, Yoshinori Sato,
	the arch/x86 maintainers, Linux Kernel Mailing List, Linux ARM,
	linux-m68k, Linux Crypto Mailing List, openrisc, linuxppc-dev,
	Linux-sh list, linux-sparc, linux-ntfs-dev, linux-block,
	linux-wireless, Netdev, LSM List

On 5/14/21 10:32 AM, Linus Torvalds wrote:
> On Fri, May 14, 2021 at 3:02 AM Arnd Bergmann <arnd@kernel.org> wrote:
>> I've included this version in the asm-generic tree for 5.14 already,
>> addressing the few issues that were pointed out in the RFC. If there
>> are any remaining problems, I hope those can be addressed as follow-up
>> patches.
> This continues to look great to me, and now has the even simpler
> remaining implementation.
>
> I'd be tempted to just pull it in for 5.13, but I guess we don't
> actually have any _outstanding_ bug in this area (the bug was in our
> zlib code, required -O3 to trigger, has been fixed now,

Wasn't the new zlib code slated for 5.14. I don't see it in your master yet

>   and the biggy
> case didn't even use "get_unaligned()").

Indeed this series is sort of orthogonal to that bug, but IMO that bug 
still exists in 5.13 for -O3 build, granted that is not enabled for !ARC.

-Vineet

>
> So I guess your 5.14 timing is the right thing to do.
>
>          Linus


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 00/13] Unify asm/unaligned.h around struct helper
  2021-05-14 18:51   ` Vineet Gupta
@ 2021-05-14 19:22     ` Linus Torvalds
  2021-05-14 19:45       ` Vineet Gupta
  0 siblings, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2021-05-14 19:22 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Arnd Bergmann, linux-arch, Arnd Bergmann, Amitkumar Karwar,
	Benjamin Herrenschmidt, Borislav Petkov, Eric Dumazet,
	Florian Fainelli, Ganapathi Bhat, Geert Uytterhoeven,
	H. Peter Anvin, Ingo Molnar, Jakub Kicinski, James Morris,
	Jens Axboe, John Johansen, Jonas Bonn, Kalle Valo,
	Michael Ellerman, Paul Mackerras, Rich Felker,
	Richard Russon (FlatCap),
	Russell King, Serge E. Hallyn, Sharvari Harisangam,
	Stafford Horne, Stefan Kristiansson, Thomas Gleixner,
	Vladimir Oltean, Xinming Hu, Yoshinori Sato,
	the arch/x86 maintainers, Linux Kernel Mailing List, Linux ARM,
	linux-m68k, Linux Crypto Mailing List, openrisc, linuxppc-dev,
	Linux-sh list, linux-sparc, linux-ntfs-dev, linux-block,
	linux-wireless, Netdev, LSM List

On Fri, May 14, 2021 at 11:52 AM Vineet Gupta
<Vineet.Gupta1@synopsys.com> wrote:
>
> Wasn't the new zlib code slated for 5.14. I don't see it in your master yet

You're right, I never actually committed it, since it was specific to
ARC and -O3 and I wasn't entirely happy with the amount of testing it
got (with Heiko pointing out that the s390 stuff needed more fixes for
the change).

So in fact it's not even queued up for 5.14 due to this all, I just dropped it.

> >   and the biggy
> > case didn't even use "get_unaligned()").
>
> Indeed this series is sort of orthogonal to that bug, but IMO that bug
> still exists in 5.13 for -O3 build, granted that is not enabled for !ARC.

Right, the zlib bug is still there.

But Arnd's series wouldn't even fix it: right now inffast has its own
- ugly and slow - special 2-byte-only version of "get_unaligned()",
called "get_unaligned16()".

And because it's ugly and slow, it's not actually used for
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.

Vineet - maybe the fix is to not take my patch to update to a newer
zlib, but to just fix inffast to use the proper get_unaligned(). Then
Arnd's series _would_ actually fix all this..

              Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 00/13] Unify asm/unaligned.h around struct helper
  2021-05-14 17:32 ` [PATCH v2 00/13] Unify asm/unaligned.h around struct helper Linus Torvalds
  2021-05-14 18:51   ` Vineet Gupta
@ 2021-05-14 19:31   ` Arnd Bergmann
  1 sibling, 0 replies; 18+ messages in thread
From: Arnd Bergmann @ 2021-05-14 19:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-arch, Vineet Gupta, Amitkumar Karwar,
	Benjamin Herrenschmidt, Borislav Petkov, Eric Dumazet,
	Florian Fainelli, Ganapathi Bhat, Geert Uytterhoeven,
	H. Peter Anvin, Ingo Molnar, Jakub Kicinski, James Morris,
	Jens Axboe, John Johansen, Jonas Bonn, Kalle Valo,
	Michael Ellerman, Paul Mackerras, Rich Felker,
	Richard Russon (FlatCap),
	Russell King, Serge E. Hallyn, Sharvari Harisangam,
	Stafford Horne, Stefan Kristiansson, Thomas Gleixner,
	Vladimir Oltean, Xinming Hu, Yoshinori Sato,
	the arch/x86 maintainers, Linux Kernel Mailing List, Linux ARM,
	linux-m68k, Linux Crypto Mailing List, Openrisc, linuxppc-dev,
	Linux-sh list, linux-sparc, linux-ntfs-dev, linux-block,
	linux-wireless, Netdev, LSM List

On Fri, May 14, 2021 at 7:32 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Fri, May 14, 2021 at 3:02 AM Arnd Bergmann <arnd@kernel.org> wrote:
> >
> > I've included this version in the asm-generic tree for 5.14 already,
> > addressing the few issues that were pointed out in the RFC. If there
> > are any remaining problems, I hope those can be addressed as follow-up
> > patches.
>
> This continues to look great to me, and now has the even simpler
> remaining implementation.
>
> I'd be tempted to just pull it in for 5.13, but I guess we don't
> actually have any _outstanding_ bug in this area (the bug was in our
> zlib code, required -O3 to trigger, has been fixed now, and the biggy
> case didn't even use "get_unaligned()").
>
> So I guess your 5.14 timing is the right thing to do.

Yes, I think that's best, just in case something does come up. While all the
object code I looked at does appear better, this is one of those areas that
can be hard to pinpoint if we hit a regression in a particular combination of
architecture+compiler+source file.

I have pushed a signed tag to
https://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git
asm-generic-unaligned-5.14

and plan to send that in the 5.14 merge window unless you decide to
take it now after all.

        Arnd

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 00/13] Unify asm/unaligned.h around struct helper
  2021-05-14 19:22     ` Linus Torvalds
@ 2021-05-14 19:45       ` Vineet Gupta
  2021-05-14 20:19         ` Linus Torvalds
  0 siblings, 1 reply; 18+ messages in thread
From: Vineet Gupta @ 2021-05-14 19:45 UTC (permalink / raw)
  To: Linus Torvalds, Vineet Gupta
  Cc: Arnd Bergmann, linux-arch, Arnd Bergmann, Amitkumar Karwar,
	Benjamin Herrenschmidt, Borislav Petkov, Eric Dumazet,
	Florian Fainelli, Ganapathi Bhat, Geert Uytterhoeven,
	H. Peter Anvin, Ingo Molnar, Jakub Kicinski, James Morris,
	Jens Axboe, John Johansen, Jonas Bonn, Kalle Valo,
	Michael Ellerman, Paul Mackerras, Rich Felker,
	Richard Russon (FlatCap),
	Russell King, Serge E. Hallyn, Sharvari Harisangam,
	Stafford Horne, Stefan Kristiansson, Thomas Gleixner,
	Vladimir Oltean, Xinming Hu, Yoshinori Sato,
	the arch/x86 maintainers, Linux Kernel Mailing List, Linux ARM,
	linux-m68k, Linux Crypto Mailing List, openrisc, linuxppc-dev,
	Linux-sh list, linux-sparc, linux-ntfs-dev, linux-block,
	linux-wireless, Netdev, LSM List

On 5/14/21 12:22 PM, Linus Torvalds wrote:
> On Fri, May 14, 2021 at 11:52 AM Vineet Gupta
> <Vineet.Gupta1@synopsys.com> wrote:
>> Wasn't the new zlib code slated for 5.14. I don't see it in your master yet
> You're right, I never actually committed it, since it was specific to
> ARC and -O3

Well, not really, the issue manifested in ARC O3 testing, but I showed 
the problem existed for arm64 gcc too.

> and I wasn't entirely happy with the amount of testing it
> got (with Heiko pointing out that the s390 stuff needed more fixes for
> the change).

With his addon patch everything seemed hunky dory.

> The patch below is required on top of your patch to make it compile
> for s390 as well.
> Tested with kernel image decompression, and also btrfs with file
> compression; both software and hardware compression.
> Everything seems to work.

> So in fact it's not even queued up for 5.14 due to this all, I just dropped it.

But Why. Can't we throw it in linux-next for 5.14. I promise to test it 
- and will likely hit any corner cases. Also for the time being we could 
force just that file/files to build for -O3 to stress test the aspects 
that were fragile.

>>>    and the biggy
>>> case didn't even use "get_unaligned()").
>> Indeed this series is sort of orthogonal to that bug, but IMO that bug
>> still exists in 5.13 for -O3 build, granted that is not enabled for !ARC.
> Right, the zlib bug is still there.
>
> But Arnd's series wouldn't even fix it: right now inffast has its own
> - ugly and slow - special 2-byte-only version of "get_unaligned()",
> called "get_unaligned16()".

I know that's why said they are orthogonal.


> And because it's ugly and slow, it's not actually used for
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.
>
> Vineet - maybe the fix is to not take my patch to update to a newer
> zlib, but to just fix inffast to use the proper get_unaligned(). Then
> Arnd's series _would_ actually fix all this..

OK if you say so.

-Vineet

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 00/13] Unify asm/unaligned.h around struct helper
  2021-05-14 19:45       ` Vineet Gupta
@ 2021-05-14 20:19         ` Linus Torvalds
  0 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2021-05-14 20:19 UTC (permalink / raw)
  To: Vineet Gupta
  Cc: Arnd Bergmann, linux-arch, Arnd Bergmann, Amitkumar Karwar,
	Benjamin Herrenschmidt, Borislav Petkov, Eric Dumazet,
	Florian Fainelli, Ganapathi Bhat, Geert Uytterhoeven,
	H. Peter Anvin, Ingo Molnar, Jakub Kicinski, James Morris,
	Jens Axboe, John Johansen, Jonas Bonn, Kalle Valo,
	Michael Ellerman, Paul Mackerras, Rich Felker,
	Richard Russon (FlatCap),
	Russell King, Serge E. Hallyn, Sharvari Harisangam,
	Stafford Horne, Stefan Kristiansson, Thomas Gleixner,
	Vladimir Oltean, Xinming Hu, Yoshinori Sato,
	the arch/x86 maintainers, Linux Kernel Mailing List, Linux ARM,
	linux-m68k, Linux Crypto Mailing List, openrisc, linuxppc-dev,
	Linux-sh list, linux-sparc, linux-ntfs-dev, linux-block,
	linux-wireless, Netdev, LSM List

On Fri, May 14, 2021 at 12:45 PM Vineet Gupta
<Vineet.Gupta1@synopsys.com> wrote:
>
> Well, not really, the issue manifested in ARC O3 testing, but I showed
> the problem existed for arm64 gcc too.

.. but not with a supported kernel configuration.

> > So in fact it's not even queued up for 5.14 due to this all, I just dropped it.
>
> But Why.

I just didn't have time to deal with it during the merge window. If
you keep it alive, that's all fine and good.

                Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 03/13] sh: remove unaligned access for sh4a
  2021-05-14 12:22     ` Arnd Bergmann
@ 2021-05-15 15:36       ` John Paul Adrian Glaubitz
  2021-05-15 20:10         ` Arnd Bergmann
  0 siblings, 1 reply; 18+ messages in thread
From: John Paul Adrian Glaubitz @ 2021-05-15 15:36 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-arch, Linus Torvalds, Vineet Gupta, Yoshinori Sato,
	Rich Felker, Linux-sh list, Linux Kernel Mailing List

Hi Arnd!

On 5/14/21 2:22 PM, Arnd Bergmann wrote:
>> My Renesas SH4-Boards actually run an sh4a-Kernel, not an sh4-Kernel:
>>
>> root@tirpitz:~> uname -a
>> Linux tirpitz 5.11.0-rc4-00012-g10c03c5bf422 #161 PREEMPT Mon Jan 18 21:10:17 CET 2021 sh4a GNU/Linux
>> root@tirpitz:~>
>>
>> So, if this change reduces performance on sh4a, I would rather not merge it.
> 
> It only makes a difference in very specific scenarios in which unaligned
> accesses are done in a fast path, e.g. when forwarding network packet
> at a high rate on a big-endian kernel (little-endian kernels wouldn't run into
> this on IP headers). If you have a use case for this machine on which the
> you can show a performance regression, I can add a patch on top to put
> the optimized sh4a get_unaligned_le32() back. Dropping this patch
> altogether would make the series much more complex because most of
> the associated code gets removed in the end.

Hmm, okay. But why does code which sits below arch/sh have to be removed anyway?

I don't fully understand why it poses any maintenance burden/

> As I mentioned, supporting "movua" in the compiler likely has a much
> larger impact on performance, as it would also help in user space, and
> it should improve the networking case on little-endian kernels by replacing
> the four separate byte loads/shift pairs with a movua plus a byteswap.

The problem is that - at least in Debian - we use the sh4 baseline while the kernel
supports both sh4 and sh4a, so we can't use any of these instructions in userland at
the moment.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaubitz@debian.org
`. `'   Freie Universitaet Berlin - glaubitz@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 03/13] sh: remove unaligned access for sh4a
  2021-05-15 15:36       ` John Paul Adrian Glaubitz
@ 2021-05-15 20:10         ` Arnd Bergmann
  0 siblings, 0 replies; 18+ messages in thread
From: Arnd Bergmann @ 2021-05-15 20:10 UTC (permalink / raw)
  To: John Paul Adrian Glaubitz
  Cc: linux-arch, Linus Torvalds, Vineet Gupta, Yoshinori Sato,
	Rich Felker, Linux-sh list, Linux Kernel Mailing List

On Sat, May 15, 2021 at 5:36 PM John Paul Adrian Glaubitz
<glaubitz@physik.fu-berlin.de> wrote:
> On 5/14/21 2:22 PM, Arnd Bergmann wrote:
> >> My Renesas SH4-Boards actually run an sh4a-Kernel, not an sh4-Kernel:
> >>
> >> root@tirpitz:~> uname -a
> >> Linux tirpitz 5.11.0-rc4-00012-g10c03c5bf422 #161 PREEMPT Mon Jan 18 21:10:17 CET 2021 sh4a GNU/Linux
> >> root@tirpitz:~>
> >>
> >> So, if this change reduces performance on sh4a, I would rather not merge it.
> >
> > It only makes a difference in very specific scenarios in which unaligned
> > accesses are done in a fast path, e.g. when forwarding network packet
> > at a high rate on a big-endian kernel (little-endian kernels wouldn't run into
> > this on IP headers). If you have a use case for this machine on which the
> > you can show a performance regression, I can add a patch on top to put
> > the optimized sh4a get_unaligned_le32() back. Dropping this patch
> > altogether would make the series much more complex because most of
> > the associated code gets removed in the end.
>
> Hmm, okay. But why does code which sits below arch/sh have to be removed anyway?
>
> I don't fully understand why it poses any maintenance burden/

What  I'm removing is the part that lets architectures override the
generic version.

> > As I mentioned, supporting "movua" in the compiler likely has a much
> > larger impact on performance, as it would also help in user space, and
> > it should improve the networking case on little-endian kernels by replacing
> > the four separate byte loads/shift pairs with a movua plus a byteswap.
>
> The problem is that - at least in Debian - we use the sh4 baseline while the kernel
> supports both sh4 and sh4a, so we can't use any of these instructions in userland at
> the moment.

I tried building an sh7785lcr_defconfig with and without the patch,
and found that
the only affected files are:

- in-kernel nfs client
- crc32c/sha1/sha256 hash functions
- device probing for libata, scsi-core, scsi-disk, hid, r8168
  (should not matter after boot)
- msdos partition parsing

Any nfs client performance difference is probably not even measurable even
at gigabit ethernet speed.
I see that the hash functions are notably different, but I don't know if the
output from the new generic code is actually better or worse than the
original. If you do think this is important, please try the version from

https://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git
unaligned-sh4a

against the version without the last change in that series. If you can find
a relevant test case that exercises it, you may want to add a custom
implementation of the hash functions as well.

       Arnd

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 00/13] Unify asm/unaligned.h around struct helper
  2021-05-14 10:00 [PATCH v2 00/13] Unify asm/unaligned.h around struct helper Arnd Bergmann
  2021-05-14 10:00 ` [PATCH v2 03/13] sh: remove unaligned access for sh4a Arnd Bergmann
  2021-05-14 17:32 ` [PATCH v2 00/13] Unify asm/unaligned.h around struct helper Linus Torvalds
@ 2021-12-16 17:29 ` Ard Biesheuvel
  2021-12-16 17:42   ` Linus Torvalds
                     ` (2 more replies)
  2 siblings, 3 replies; 18+ messages in thread
From: Ard Biesheuvel @ 2021-12-16 17:29 UTC (permalink / raw)
  To: Arnd Bergmann, Jason A. Donenfeld, johannes, Kees Cook, Nick Desaulniers
  Cc: linux-arch, Linus Torvalds, Vineet Gupta, Arnd Bergmann,
	Amitkumar Karwar, Benjamin Herrenschmidt, Borislav Petkov,
	Eric Dumazet, Florian Fainelli, Ganapathi Bhat,
	Geert Uytterhoeven, H. Peter Anvin, Ingo Molnar, Jakub Kicinski,
	James Morris, Jens Axboe, John Johansen, Jonas Bonn, Kalle Valo,
	Michael Ellerman, Paul Mackerras, Rich Felker,
	Richard Russon (FlatCap),
	Russell King, Serge E. Hallyn, Sharvari Harisangam,
	Stafford Horne, Stefan Kristiansson, Thomas Gleixner,
	Vladimir Oltean, Xinming Hu, Yoshinori Sato, X86 ML,
	Linux Kernel Mailing List, Linux ARM, linux-m68k,
	Linux Crypto Mailing List, openrisc,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	linux-sh, open list:SPARC + UltraSPARC (sparc/sparc64),
	linux-ntfs-dev, linux-block, linux-wireless,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	linux-security-module

Hi Arnd,

(replying to an old thread as this came up in the discussion regarding
misaligned loads and stored in siphash() when compiled for ARM
[f7e5b9bfa6c8820407b64eabc1f29c9a87e8993d])

On Fri, 14 May 2021 at 12:02, Arnd Bergmann <arnd@kernel.org> wrote:
>
> From: Arnd Bergmann <arnd@arndb.de>
>
> The get_unaligned()/put_unaligned() helpers are traditionally architecture
> specific, with the two main variants being the "access-ok.h" version
> that assumes unaligned pointer accesses always work on a particular
> architecture, and the "le-struct.h" version that casts the data to a
> byte aligned type before dereferencing, for architectures that cannot
> always do unaligned accesses in hardware.
>
> Based on the discussion linked below, it appears that the access-ok
> version is not realiable on any architecture, but the struct version
> probably has no downsides. This series changes the code to use the
> same implementation on all architectures, addressing the few exceptions
> separately.
>
> I've included this version in the asm-generic tree for 5.14 already,
> addressing the few issues that were pointed out in the RFC. If there
> are any remaining problems, I hope those can be addressed as follow-up
> patches.
>

I think this series is a huge improvement, but it does not solve the
UB problem completely. As we found, there are open issues in the GCC
bugzilla regarding assumptions in the compiler that aligned quantities
either overlap entirely or not at all. (e.g.,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363)

CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is used in many places to
conditionally emit code that violates C alignment rules. E.g., there
is this example in Documentation/core-api/unaligned-memory-access.rst:

bool ether_addr_equal(const u8 *addr1, const u8 *addr2)
{
#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
  u32 fold = ((*(const u32 *)addr1) ^ (*(const u32 *)addr2)) |
             ((*(const u16 *)(addr1 + 4)) ^ (*(const u16 *)(addr2 + 4)));
  return fold == 0;
#else
...

(which now deviates from its actual implementation, but the point is
the same) where CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is used in the
wrong way (IMHO).

The pattern seems to be

#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
  // ignore alignment rules, just cast to a more aligned pointer type
#else
  // use unaligned accessors, which could be either cheap or expensive,
  // depending on the architecture
#endif

whereas the following pattern makes more sense, I think, and does not
violate any C rules in the common case:

#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
  // use unaligned accessors, which are cheap or even entirely free
#else
  // avoid unaligned accessors, as they are expensive; instead, reorganize
  // the data so we don't need them (similar to setting NET_IP_ALIGN to 2)
#endif

The only remaining problem here is reinterpreting a char* pointer to a
u32*, e.g., for accessing the IP address in an Ethernet frame when
NET_IP_ALIGN == 2, which could suffer from the same UB problem again,
as I understand it.

In the 32-bit ARM case (v6+) [which is admittedly an outlier] this
makes a substantial difference, as ARMv6 does have efficient unaligned
accessors (load/store word or halfword may be used on misaligned
addresses) but requires that load/store double-word and load/store
multiple are only used on 32-bit aligned addresses. GCC does the right
thing with the unaligned accessors, but blindly casting away
misalignment may result in alignment traps if the compiler happened to
emit load-double or load-multiple instructions for the memory access
in question.

Jason already verifed that in the siphash() case, the aligned and
unaligned versions of the code actually compile to the same machine
code on x86, as the unaligned accessors just disappear. I suspect this
to be the case for many instances where
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is being used, mostly in the
networking stack.

So I intend to dig a bit deeper into this, and perhaps propose some
changes where the interpretation of
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is documented more clearly, and
tweaked according to my suggestion above (while ensuring that codegen
does not suffer, of course)

Thoughts, concerns, objections?


--
Ard.




> Link: https://lore.kernel.org/lkml/75d07691-1e4f-741f-9852-38c0b4f520bc@synopsys.com/
> Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363
> Link: https://lore.kernel.org/lkml/20210507220813.365382-14-arnd@kernel.org/
> Link: git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git unaligned-rework-v2
>
>
> Arnd Bergmann (13):
>   asm-generic: use asm-generic/unaligned.h for most architectures
>   openrisc: always use unaligned-struct header
>   sh: remove unaligned access for sh4a
>   m68k: select CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
>   powerpc: use linux/unaligned/le_struct.h on LE power7
>   asm-generic: unaligned: remove byteshift helpers
>   asm-generic: unaligned always use struct helpers
>   partitions: msdos: fix one-byte get_unaligned()
>   apparmor: use get_unaligned() only for multi-byte words
>   mwifiex: re-fix for unaligned accesses
>   netpoll: avoid put_unaligned() on single character
>   asm-generic: uaccess: 1-byte access is always aligned
>   asm-generic: simplify asm/unaligned.h
>
>  arch/alpha/include/asm/unaligned.h          |  12 --
>  arch/arm/include/asm/unaligned.h            |  27 ---
>  arch/ia64/include/asm/unaligned.h           |  12 --
>  arch/m68k/Kconfig                           |   1 +
>  arch/m68k/include/asm/unaligned.h           |  26 ---
>  arch/microblaze/include/asm/unaligned.h     |  27 ---
>  arch/mips/crypto/crc32-mips.c               |   2 +-
>  arch/openrisc/include/asm/unaligned.h       |  47 -----
>  arch/parisc/include/asm/unaligned.h         |   6 +-
>  arch/powerpc/include/asm/unaligned.h        |  22 ---
>  arch/sh/include/asm/unaligned-sh4a.h        | 199 --------------------
>  arch/sh/include/asm/unaligned.h             |  13 --
>  arch/sparc/include/asm/unaligned.h          |  11 --
>  arch/x86/include/asm/unaligned.h            |  15 --
>  arch/xtensa/include/asm/unaligned.h         |  29 ---
>  block/partitions/ldm.h                      |   2 +-
>  block/partitions/msdos.c                    |   2 +-
>  drivers/net/wireless/marvell/mwifiex/pcie.c |  10 +-
>  include/asm-generic/uaccess.h               |   4 +-
>  include/asm-generic/unaligned.h             | 141 +++++++++++---
>  include/linux/unaligned/access_ok.h         |  68 -------
>  include/linux/unaligned/be_byteshift.h      |  71 -------
>  include/linux/unaligned/be_memmove.h        |  37 ----
>  include/linux/unaligned/be_struct.h         |  37 ----
>  include/linux/unaligned/generic.h           | 115 -----------
>  include/linux/unaligned/le_byteshift.h      |  71 -------
>  include/linux/unaligned/le_memmove.h        |  37 ----
>  include/linux/unaligned/le_struct.h         |  37 ----
>  include/linux/unaligned/memmove.h           |  46 -----
>  net/core/netpoll.c                          |   4 +-
>  security/apparmor/policy_unpack.c           |   2 +-
>  31 files changed, 131 insertions(+), 1002 deletions(-)
>  delete mode 100644 arch/alpha/include/asm/unaligned.h
>  delete mode 100644 arch/arm/include/asm/unaligned.h
>  delete mode 100644 arch/ia64/include/asm/unaligned.h
>  delete mode 100644 arch/m68k/include/asm/unaligned.h
>  delete mode 100644 arch/microblaze/include/asm/unaligned.h
>  delete mode 100644 arch/openrisc/include/asm/unaligned.h
>  delete mode 100644 arch/powerpc/include/asm/unaligned.h
>  delete mode 100644 arch/sh/include/asm/unaligned-sh4a.h
>  delete mode 100644 arch/sh/include/asm/unaligned.h
>  delete mode 100644 arch/sparc/include/asm/unaligned.h
>  delete mode 100644 arch/x86/include/asm/unaligned.h
>  delete mode 100644 arch/xtensa/include/asm/unaligned.h
>  delete mode 100644 include/linux/unaligned/access_ok.h
>  delete mode 100644 include/linux/unaligned/be_byteshift.h
>  delete mode 100644 include/linux/unaligned/be_memmove.h
>  delete mode 100644 include/linux/unaligned/be_struct.h
>  delete mode 100644 include/linux/unaligned/generic.h
>  delete mode 100644 include/linux/unaligned/le_byteshift.h
>  delete mode 100644 include/linux/unaligned/le_memmove.h
>  delete mode 100644 include/linux/unaligned/le_struct.h
>  delete mode 100644 include/linux/unaligned/memmove.h
>
> --
> 2.29.2
>
> Cc: Amitkumar Karwar <amitkarwar@gmail.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Florian Fainelli <f.fainelli@gmail.com>
> Cc: Ganapathi Bhat <ganapathi017@gmail.com>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: James Morris <jmorris@namei.org>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: John Johansen <john.johansen@canonical.com>
> Cc: Jonas Bonn <jonas@southpole.se>
> Cc: Kalle Valo <kvalo@codeaurora.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Rich Felker <dalias@libc.org>
> Cc: "Richard Russon (FlatCap)" <ldm@flatcap.org>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: "Serge E. Hallyn" <serge@hallyn.com>
> Cc: Sharvari Harisangam <sharvari.harisangam@nxp.com>
> Cc: Stafford Horne <shorne@gmail.com>
> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Vladimir Oltean <vladimir.oltean@nxp.com>
> Cc: Xinming Hu <huxinming820@gmail.com>
> Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
> Cc: x86@kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-m68k@lists.linux-m68k.org
> Cc: linux-crypto@vger.kernel.org
> Cc: openrisc@lists.librecores.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-sh@vger.kernel.org
> Cc: sparclinux@vger.kernel.org
> Cc: linux-ntfs-dev@lists.sourceforge.net
> Cc: linux-block@vger.kernel.org
> Cc: linux-wireless@vger.kernel.org
> Cc: netdev@vger.kernel.org
> Cc: linux-arch@vger.kernel.org
> Cc: linux-security-module@vger.kernel.org
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 00/13] Unify asm/unaligned.h around struct helper
  2021-12-16 17:29 ` Ard Biesheuvel
@ 2021-12-16 17:42   ` Linus Torvalds
  2021-12-16 17:49   ` David Laight
  2021-12-16 18:56   ` Segher Boessenkool
  2 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2021-12-16 17:42 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Arnd Bergmann, Jason A. Donenfeld, Johannes Berg, Kees Cook,
	Nick Desaulniers, linux-arch, Vineet Gupta, Arnd Bergmann,
	Amitkumar Karwar, Benjamin Herrenschmidt, Borislav Petkov,
	Eric Dumazet, Florian Fainelli, Ganapathi Bhat,
	Geert Uytterhoeven, H. Peter Anvin, Ingo Molnar, Jakub Kicinski,
	James Morris, Jens Axboe, John Johansen, Jonas Bonn, Kalle Valo,
	Michael Ellerman, Paul Mackerras, Rich Felker,
	Richard Russon (FlatCap),
	Russell King, Serge E. Hallyn, Sharvari Harisangam,
	Stafford Horne, Stefan Kristiansson, Thomas Gleixner,
	Vladimir Oltean, Xinming Hu, Yoshinori Sato, X86 ML,
	Linux Kernel Mailing List, Linux ARM, linux-m68k,
	Linux Crypto Mailing List, openrisc,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Linux-sh list, open list:SPARC + UltraSPARC (sparc/sparc64),
	linux-ntfs-dev, linux-block, linux-wireless,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	LSM List

On Thu, Dec 16, 2021 at 9:29 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is used in many places to
> conditionally emit code that violates C alignment rules. E.g., there
> is this example in Documentation/core-api/unaligned-memory-access.rst:
>
> bool ether_addr_equal(const u8 *addr1, const u8 *addr2)
> {
> #ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
>   u32 fold = ((*(const u32 *)addr1) ^ (*(const u32 *)addr2)) |
>              ((*(const u16 *)(addr1 + 4)) ^ (*(const u16 *)(addr2 + 4)));
>   return fold == 0;
> #else

It probably works fine in practice - the one case we had was really
pretty special, and about the vectorizer doing odd things.

But I think we should strive to convert these to use
"get_unaligned()", since code generation is fine. It still often makes
sense to have that test for the config variable, simply because the
approach might be different if we know unaligned accesses are slow.

So I'll happily take patches that do obvious conversions to
get_unaligned() where they make sense, but I don't think we should
consider this some huge hard requirement.

                 Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v2 00/13] Unify asm/unaligned.h around struct helper
  2021-12-16 17:29 ` Ard Biesheuvel
  2021-12-16 17:42   ` Linus Torvalds
@ 2021-12-16 17:49   ` David Laight
  2021-12-16 18:56   ` Segher Boessenkool
  2 siblings, 0 replies; 18+ messages in thread
From: David Laight @ 2021-12-16 17:49 UTC (permalink / raw)
  To: 'Ard Biesheuvel',
	Arnd Bergmann, Jason A. Donenfeld, johannes, Kees Cook,
	Nick Desaulniers
  Cc: Rich Felker, linux-sh, Amitkumar Karwar, Russell King,
	Eric Dumazet, Paul Mackerras, H. Peter Anvin,
	open list:SPARC + UltraSPARC (sparc/sparc64),
	Thomas Gleixner, linux-arch, Florian Fainelli, Yoshinori Sato,
	X86 ML, James Morris, Ingo Molnar, Geert Uytterhoeven, Linux ARM,
	Richard Russon (FlatCap),
	Jakub Kicinski, Serge E. Hallyn, Jonas Bonn, Arnd Bergmann,
	Ganapathi Bhat, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Stefan Kristiansson, linux-block, linux-m68k, openrisc,
	Borislav Petkov, Stafford Horne, Kalle Valo, Jens Axboe,
	John Johansen, Xinming Hu, Vineet Gupta, linux-wireless,
	Linux Kernel Mailing List, Vladimir Oltean, linux-ntfs-dev,
	linux-security-module, Linux Crypto Mailing List,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Linus Torvalds, Sharvari Harisangam

From: Ard Biesheuvel
> Sent: 16 December 2021 17:30
> 
> Hi Arnd,
> 
> (replying to an old thread as this came up in the discussion regarding
> misaligned loads and stored in siphash() when compiled for ARM
> [f7e5b9bfa6c8820407b64eabc1f29c9a87e8993d])
> 
> On Fri, 14 May 2021 at 12:02, Arnd Bergmann <arnd@kernel.org> wrote:
> >
> > From: Arnd Bergmann <arnd@arndb.de>
> >
> > The get_unaligned()/put_unaligned() helpers are traditionally architecture
> > specific, with the two main variants being the "access-ok.h" version
> > that assumes unaligned pointer accesses always work on a particular
> > architecture, and the "le-struct.h" version that casts the data to a
> > byte aligned type before dereferencing, for architectures that cannot
> > always do unaligned accesses in hardware.

I'm pretty sure the compiler is allowed to 'read through' that cast
and still do an aligned access.
It has always been hard to get the compiler to 'forget' about known/expected
alignment - typically trying to stop memcpy() faulting on sparc.
Real function calls are usually required - but LTO may scupper that.

> >
> > Based on the discussion linked below, it appears that the access-ok
> > version is not realiable on any architecture, but the struct version
> > probably has no downsides. This series changes the code to use the
> > same implementation on all architectures, addressing the few exceptions
> > separately.
> >
> > I've included this version in the asm-generic tree for 5.14 already,
> > addressing the few issues that were pointed out in the RFC. If there
> > are any remaining problems, I hope those can be addressed as follow-up
> > patches.
> >
> 
> I think this series is a huge improvement, but it does not solve the
> UB problem completely. As we found, there are open issues in the GCC
> bugzilla regarding assumptions in the compiler that aligned quantities
> either overlap entirely or not at all. (e.g.,
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363)

I think we can stop the compiler merging unaligned requests by adding a byte-sized
memory barrier for the base address before and after the access.
That should still support complex addressing modes (esp on x86).

Another option is to do the misaligned access from within an asm statement.
While architecture dependant, it only really depends on the syntax of the ld/st
instruction.
The compiler can't merge those because it doesn't know whether the data is
'frobbed' before/after the memory access.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 00/13] Unify asm/unaligned.h around struct helper
  2021-12-16 17:29 ` Ard Biesheuvel
  2021-12-16 17:42   ` Linus Torvalds
  2021-12-16 17:49   ` David Laight
@ 2021-12-16 18:56   ` Segher Boessenkool
  2021-12-17 12:34     ` David Laight
  2 siblings, 1 reply; 18+ messages in thread
From: Segher Boessenkool @ 2021-12-16 18:56 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Arnd Bergmann, Jason A. Donenfeld, johannes, Kees Cook,
	Nick Desaulniers, Rich Felker, linux-sh, Amitkumar Karwar,
	Russell King, Eric Dumazet, Paul Mackerras, H. Peter Anvin,
	open list:SPARC + UltraSPARC (sparc/sparc64),
	Thomas Gleixner, linux-arch, Florian Fainelli, Yoshinori Sato,
	X86 ML, James Morris, Ingo Molnar, Geert Uytterhoeven, Linux ARM,
	Richard Russon (FlatCap),
	Jakub Kicinski, Serge E. Hallyn, Jonas Bonn, Arnd Bergmann,
	Ganapathi Bhat, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Stefan Kristiansson, linux-block, linux-m68k, openrisc,
	Borislav Petkov, Stafford Horne, Kalle Valo, Jens Axboe,
	John Johansen, Xinming Hu, Vineet Gupta, linux-wireless,
	Linux Kernel Mailing List, Vladimir Oltean, linux-ntfs-dev,
	linux-security-module, Linux Crypto Mailing List,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Linus Torvalds, Sharvari Harisangam

On Thu, Dec 16, 2021 at 06:29:40PM +0100, Ard Biesheuvel wrote:
> I think this series is a huge improvement, but it does not solve the
> UB problem completely. As we found, there are open issues in the GCC
> bugzilla regarding assumptions in the compiler that aligned quantities
> either overlap entirely or not at all. (e.g.,
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363)

That isn't open, it was closed as INVALID back in May.

(Naturally) aligned quantities only overlap if they are the same datum.
This follows directly from the definition of (naturally) aligned.  There
is no mystery here.

All unaligned data need to be marked up properly.

> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is used in many places to
> conditionally emit code that violates C alignment rules.

Most of this is ABI, not C.  It is the ABI that requires certain
alignments.  Ignoring that plain does not work, but even if it would
you will end up with much slower generated code.

> whereas the following pattern makes more sense, I think, and does not
> violate any C rules in the common case:
> 
> #ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
>   // use unaligned accessors, which are cheap or even entirely free
> #else
>   // avoid unaligned accessors, as they are expensive; instead, reorganize
>   // the data so we don't need them (similar to setting NET_IP_ALIGN to 2)
> #endif

Yes, this looks more reasonable.

> The only remaining problem here is reinterpreting a char* pointer to a
> u32*, e.g., for accessing the IP address in an Ethernet frame when
> NET_IP_ALIGN == 2, which could suffer from the same UB problem again,
> as I understand it.

The problem is never casting a pointer to pointer to character type, and
then later back to an appriopriate pointer type.  These things are both
required to work.  The problem always is accessing something as if it
was something of another type, which is not valid C.  This however is
exactly what -fno-strict-aliasing allows, so that works as well.

But this does not have much to do with alignment.


Segher

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v2 00/13] Unify asm/unaligned.h around struct helper
  2021-12-16 18:56   ` Segher Boessenkool
@ 2021-12-17 12:34     ` David Laight
  2021-12-17 13:35       ` Segher Boessenkool
  0 siblings, 1 reply; 18+ messages in thread
From: David Laight @ 2021-12-17 12:34 UTC (permalink / raw)
  To: 'Segher Boessenkool', Ard Biesheuvel
  Cc: linux-wireless, Jason A. Donenfeld, Rich Felker, linux-sh,
	Richard Russon (FlatCap),
	X86 ML, Amitkumar Karwar, James Morris, Eric Dumazet,
	Paul Mackerras, linux-m68k, H. Peter Anvin,
	open list:SPARC + UltraSPARC (sparc/sparc64),
	Stafford Horne, linux-arch, Florian Fainelli, Yoshinori Sato,
	Russell King, Linus Torvalds, Ingo Molnar, Geert Uytterhoeven,
	Kalle Valo, Vladimir Oltean, Jakub Kicinski, Serge E. Hallyn,
	Jonas Bonn, Kees Cook, Arnd Bergmann, Ganapathi Bhat,
	Stefan Kristiansson, linux-block, openrisc, Borislav Petkov,
	Thomas Gleixner, Linux ARM, Jens Axboe, Arnd Bergmann,
	John Johansen, Xinming Hu, Vineet Gupta, Nick Desaulniers,
	Linux Kernel Mailing List, linux-ntfs-dev, linux-security-module,
	Linux Crypto Mailing List,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	johannes, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Sharvari Harisangam

From: Segher Boessenkool
> Sent: 16 December 2021 18:56
...
> > The only remaining problem here is reinterpreting a char* pointer to a
> > u32*, e.g., for accessing the IP address in an Ethernet frame when
> > NET_IP_ALIGN == 2, which could suffer from the same UB problem again,
> > as I understand it.
> 
> The problem is never casting a pointer to pointer to character type, and
> then later back to an appriopriate pointer type.
> These things are both required to work.

I think that is true of 'void *', not 'char *'.
'char' is special in that 'strict aliasing' doesn't apply to it.
(Which is actually a pain sometimes.)

> The problem always is accessing something as if it
> was something of another type, which is not valid C.  This however is
> exactly what -fno-strict-aliasing allows, so that works as well.

IIRC the C language only allows you to have pointers to valid data items.
(Since they can only be generated by the & operator on a valid item.)
Indirecting any other pointer is probably UB!

This (sort of) allows the compiler to 'look through' casts to find
what the actual type is (or might be).
It can then use that information to make optimisation choices.
This has caused grief with memcpy() calls that are trying to copy
a structure that the coder knows is misaligned to an aligned buffer.

So while *(unaligned_ptr *)char_ptr probably has to work.
If the compiler can see *(unaligned_ptr *)(char *)int_ptr it can
assume the alignment of the 'int_ptr' and do a single aligned access.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2 00/13] Unify asm/unaligned.h around struct helper
  2021-12-17 12:34     ` David Laight
@ 2021-12-17 13:35       ` Segher Boessenkool
  0 siblings, 0 replies; 18+ messages in thread
From: Segher Boessenkool @ 2021-12-17 13:35 UTC (permalink / raw)
  To: David Laight
  Cc: Ard Biesheuvel, linux-wireless, Jason A. Donenfeld, Rich Felker,
	linux-sh, Richard Russon (FlatCap),
	X86 ML, Amitkumar Karwar, James Morris, Eric Dumazet,
	Paul Mackerras, linux-m68k, H. Peter Anvin,
	open list:SPARC + UltraSPARC (sparc/sparc64),
	Stafford Horne, linux-arch, Florian Fainelli, Yoshinori Sato,
	Russell King, Linus Torvalds, Ingo Molnar, Geert Uytterhoeven,
	Kalle Valo, Vladimir Oltean, Jakub Kicinski, Serge E. Hallyn,
	Jonas Bonn, Kees Cook, Arnd Bergmann, Ganapathi Bhat,
	Stefan Kristiansson, linux-block, openrisc, Borislav Petkov,
	Thomas Gleixner, Linux ARM, Jens Axboe, Arnd Bergmann,
	John Johansen, Xinming Hu, Vineet Gupta, Nick Desaulniers,
	Linux Kernel Mailing List, linux-ntfs-dev, linux-security-module,
	Linux Crypto Mailing List,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	johannes, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Sharvari Harisangam

On Fri, Dec 17, 2021 at 12:34:53PM +0000, David Laight wrote:
> From: Segher Boessenkool
> > Sent: 16 December 2021 18:56
> ...
> > > The only remaining problem here is reinterpreting a char* pointer to a
> > > u32*, e.g., for accessing the IP address in an Ethernet frame when
> > > NET_IP_ALIGN == 2, which could suffer from the same UB problem again,
> > > as I understand it.
> > 
> > The problem is never casting a pointer to pointer to character type, and
> > then later back to an appriopriate pointer type.
> > These things are both required to work.
> 
> I think that is true of 'void *', not 'char *'.

No, see 6.3.2.3/7.  Both are allowed (and behave the same in fact).

> 'char' is special in that 'strict aliasing' doesn't apply to it.
> (Which is actually a pain sometimes.)

That has nothing to do with it.  Yes, you can validly access any memory
as a character type, but that has nothing to do with what pointer casts
are allowed and which are not.

> > The problem always is accessing something as if it
> > was something of another type, which is not valid C.  This however is
> > exactly what -fno-strict-aliasing allows, so that works as well.
> 
> IIRC the C language only allows you to have pointers to valid data items.
> (Since they can only be generated by the & operator on a valid item.)

Not so.  For example you are explicitly allowed to have pointers one
past the last element of an array (and do arithmetic on that!), and of
course null pointers are a thing.

C allows you to make up pointers from integers as well.  This is
perfectly fine to do.  Accessing anything via such pointers might well
be not standard C, of course.

> Indirecting any other pointer is probably UB!

If a pointer points to an object, indirecting it gives an lvalue of that
object.  It does not matter how you got that pointer, all that matters
is that it points at a valid object.

> This (sort of) allows the compiler to 'look through' casts to find
> what the actual type is (or might be).
> It can then use that information to make optimisation choices.
> This has caused grief with memcpy() calls that are trying to copy
> a structure that the coder knows is misaligned to an aligned buffer.

This is 6.5/7.

Alignment is 6.2.8 but it doesn't actually come into play at all here.

> So while *(unaligned_ptr *)char_ptr probably has to work.

Only if the original pointer points to an object that is correct
(including correctly aligned) for such an lvalue.

> If the compiler can see *(unaligned_ptr *)(char *)int_ptr it can
> assume the alignment of the 'int_ptr' and do a single aligned access.

It is undefined behaviour to have an address in int_ptr that is not
correctly aligned for whatever type it points to.


Segher

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2021-12-17 13:54 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-14 10:00 [PATCH v2 00/13] Unify asm/unaligned.h around struct helper Arnd Bergmann
2021-05-14 10:00 ` [PATCH v2 03/13] sh: remove unaligned access for sh4a Arnd Bergmann
2021-05-14 10:34   ` John Paul Adrian Glaubitz
2021-05-14 12:22     ` Arnd Bergmann
2021-05-15 15:36       ` John Paul Adrian Glaubitz
2021-05-15 20:10         ` Arnd Bergmann
2021-05-14 17:32 ` [PATCH v2 00/13] Unify asm/unaligned.h around struct helper Linus Torvalds
2021-05-14 18:51   ` Vineet Gupta
2021-05-14 19:22     ` Linus Torvalds
2021-05-14 19:45       ` Vineet Gupta
2021-05-14 20:19         ` Linus Torvalds
2021-05-14 19:31   ` Arnd Bergmann
2021-12-16 17:29 ` Ard Biesheuvel
2021-12-16 17:42   ` Linus Torvalds
2021-12-16 17:49   ` David Laight
2021-12-16 18:56   ` Segher Boessenkool
2021-12-17 12:34     ` David Laight
2021-12-17 13:35       ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).