linux-kernel.vger.kernel.org archive mirror
* [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64
@ 2016-10-21 20:32 Yury Norov
  2016-10-21 20:33 ` [PATCH 01/18] 32-bit ABI: introduce ARCH_32BIT_OFF_T config option Yury Norov
                   ` (21 more replies)
  0 siblings, 22 replies; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:32 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf

This series enables ILP32 mode for aarch64 and, as supporting work,
introduces the ARCH_32BIT_OFF_T configuration option, which is enabled for
existing 32-bit architectures but disabled for new arches (so 64-bit
off_t is used by new userspace).

This version is based on kernel v4.9-rc1.  It works with glibc-2.24
and has been tested with LTP.

This version contains ABI changes and should be used with the new glibc
version. See the links below.

This is an RFC because there is still no agreement on which kind of
delousing of the register top halves we prefer, and the choice affects the
ABI. In this patchset, the top halves of x0-x7 are cleared (only w0-w7 are
kept) for each syscall in the assembler entry code.

The alternative approach is to introduce compat wrappers, which is a
little faster for natively routed syscalls (~2.6% for a syscall with
no payload) but much more complicated.
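
To illustrate what "delousing" means here, below is a minimal userspace
sketch in C. It is not the entry.S code from this series; struct demo_regs
and demo_delouse_args() are hypothetical names used only for this example.
It models zero-extending the low 32 bits of each argument register so that
stale data in bits 63:32 never reaches a syscall handler.

```c
#include <stdint.h>

#define NR_SYSCALL_ARGS 8	/* x0-x7 carry syscall arguments on arm64 */

/* Hypothetical model of the saved user registers; not the kernel's pt_regs. */
struct demo_regs {
	uint64_t x[NR_SYSCALL_ARGS];
};

/*
 * Keep only the low 32 bits (the "w" view) of each argument register.
 * The cast to uint32_t drops bits 63:32, and the assignment back to a
 * uint64_t zero-extends the result.
 */
void demo_delouse_args(struct demo_regs *regs)
{
	for (int i = 0; i < NR_SYSCALL_ARGS; i++)
		regs->x[i] = (uint32_t)regs->x[i];
}
```

Doing this once at entry covers every syscall unconditionally. The wrapper
approach would instead apply the same truncation per syscall, and only to
the arguments that need it.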

Patch 1 may be applied separately from the other patches in the series.

v3: https://lkml.org/lkml/2014/9/3/704
v4: https://lkml.org/lkml/2015/4/13/691
v5: https://lkml.org/lkml/2015/9/29/911
v6: https://lkml.org/lkml/2016/5/23/661
v7: RFC nowrap:  https://lkml.org/lkml/2016/6/17/990
v7: RFC2 nowrap: https://lkml.org/lkml/2016/8/17/245
v7: RFC3 nowrap: https://lkml.org/lkml/2016/8/17/245
 - rebased on kernel 4.9-rc1;
 - setrlimit(), getrlimit() special handling is dropped.
   rlim_t is still 64-bit, but glibc is forced to use sys_prlimit64(),
   and redirection is not needed anymore;
 - sys_stat() and sys_stat64() redirection is dropped. Glibc defines
   aarch32-compatible struct stat instead;
 - sys_fcntl() redirection is dropped. Glibc sets proper definitions for
   requests instead;
 - renameat() is disabled for aarch64/ilp32. Glibc is forced to use renameat2();
 - __ARCH_WANT_SYNC_FILE_RANGE2 is enabled for aarch64/ilp32 to force it to use
   sys_sync_file_range2 instead of sys_sync_file_range, like aarch32;
 - VDSO code is refactored. Version is switched to 4.9;
 - comments and documentation are revised;
 - checkpatch.pl errors are fixed.

Links:
Kernel: https://github.com/norov/linux/tree/ilp32-v4.9
glibc:  https://github.com/norov/glibc/tree/ilp32-2.24-dev1

Andrew Pinski (6):
  arm64: rename COMPAT to AARCH32_EL0 in Kconfig
  arm64: ensure the kernel is compiled for LP64
  arm64:uapi: set __BITS_PER_LONG correctly for ILP32 and LP64
  arm64: ilp32: add sys_ilp32.c and a separate table (in entry.S) to use
    it
  arm64: ilp32: introduce ilp32-specific handlers for sigframe and
    ucontext
  arm64:ilp32: add ARM64_ILP32 to Kconfig

Philipp Tomsich (1):
  arm64:ilp32: add vdso-ilp32 and use for signal return

Yury Norov (11):
  32-bit ABI: introduce ARCH_32BIT_OFF_T config option
  arm64: ilp32: add documentation on the ILP32 ABI for ARM64
  thread: move thread bits accessors to separated file
  arm64: introduce is_a32_task and is_a32_thread (for AArch32 compat)
  arm64: ilp32: add is_ilp32_compat_{task,thread} and TIF_32BIT_AARCH64
  arm64: introduce binfmt_elf32.c
  arm64: ilp32: introduce binfmt_ilp32.c
  arm64: ilp32: share aarch32 syscall handlers
  arm64: signal: share lp64 signal routines to ilp32
  arm64: signal32: move ilp32 and aarch32 common code to separated file
  arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32

 Documentation/arm64/ilp32.txt                 |  46 +++++++
 arch/Kconfig                                  |   4 +
 arch/arc/Kconfig                              |   1 +
 arch/arm/Kconfig                              |   1 +
 arch/arm64/Kconfig                            |  19 ++-
 arch/arm64/Makefile                           |   5 +
 arch/arm64/include/asm/compat.h               |  19 +--
 arch/arm64/include/asm/elf.h                  |  29 +++--
 arch/arm64/include/asm/fpsimd.h               |   2 +-
 arch/arm64/include/asm/ftrace.h               |   2 +-
 arch/arm64/include/asm/hwcap.h                |   6 +-
 arch/arm64/include/asm/is_compat.h            |  90 +++++++++++++
 arch/arm64/include/asm/memory.h               |   5 +-
 arch/arm64/include/asm/processor.h            |  11 +-
 arch/arm64/include/asm/ptrace.h               |   2 +-
 arch/arm64/include/asm/seccomp.h              |   2 +-
 arch/arm64/include/asm/signal32.h             |   9 +-
 arch/arm64/include/asm/signal32_common.h      |  27 ++++
 arch/arm64/include/asm/signal_common.h        |  33 +++++
 arch/arm64/include/asm/signal_ilp32.h         |  38 ++++++
 arch/arm64/include/asm/syscall.h              |   2 +-
 arch/arm64/include/asm/thread_info.h          |   4 +-
 arch/arm64/include/asm/unistd.h               |   8 +-
 arch/arm64/include/asm/unistd32.h             |   2 +-
 arch/arm64/include/asm/vdso.h                 |   6 +
 arch/arm64/include/uapi/asm/bitsperlong.h     |   9 +-
 arch/arm64/include/uapi/asm/unistd.h          |  12 ++
 arch/arm64/kernel/Makefile                    |  18 ++-
 arch/arm64/kernel/asm-offsets.c               |   9 +-
 arch/arm64/kernel/binfmt_elf32.c              |  31 +++++
 arch/arm64/kernel/binfmt_ilp32.c              |  97 ++++++++++++++
 arch/arm64/kernel/cpufeature.c                |   8 +-
 arch/arm64/kernel/cpuinfo.c                   |  20 +--
 arch/arm64/kernel/entry.S                     |  34 ++++-
 arch/arm64/kernel/entry32.S                   |  80 ------------
 arch/arm64/kernel/entry32_common.S            | 107 ++++++++++++++++
 arch/arm64/kernel/entry_ilp32.S               |  22 ++++
 arch/arm64/kernel/head.S                      |   2 +-
 arch/arm64/kernel/hw_breakpoint.c             |  10 +-
 arch/arm64/kernel/perf_regs.c                 |   2 +-
 arch/arm64/kernel/process.c                   |   7 +-
 arch/arm64/kernel/ptrace.c                    | 110 ++++++++++++++--
 arch/arm64/kernel/signal.c                    | 102 +++++++++------
 arch/arm64/kernel/signal32.c                  | 107 ----------------
 arch/arm64/kernel/signal32_common.c           | 135 ++++++++++++++++++++
 arch/arm64/kernel/signal_ilp32.c              | 174 ++++++++++++++++++++++++++
 arch/arm64/kernel/sys32.c                     |   1 +
 arch/arm64/kernel/sys_ilp32.c                 | 100 +++++++++++++++
 arch/arm64/kernel/traps.c                     |   5 +-
 arch/arm64/kernel/vdso-ilp32/.gitignore       |   2 +
 arch/arm64/kernel/vdso-ilp32/Makefile         |  74 +++++++++++
 arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S     |  33 +++++
 arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S |  95 ++++++++++++++
 arch/arm64/kernel/vdso.c                      |  70 +++++++++--
 arch/arm64/kernel/vdso/gettimeofday.S         |  20 ++-
 arch/arm64/kernel/vdso/vdso.S                 |   6 +-
 arch/blackfin/Kconfig                         |   1 +
 arch/cris/Kconfig                             |   1 +
 arch/frv/Kconfig                              |   1 +
 arch/h8300/Kconfig                            |   1 +
 arch/hexagon/Kconfig                          |   1 +
 arch/m32r/Kconfig                             |   1 +
 arch/m68k/Kconfig                             |   1 +
 arch/metag/Kconfig                            |   1 +
 arch/microblaze/Kconfig                       |   1 +
 arch/mips/Kconfig                             |   1 +
 arch/mn10300/Kconfig                          |   1 +
 arch/nios2/Kconfig                            |   1 +
 arch/openrisc/Kconfig                         |   1 +
 arch/parisc/Kconfig                           |   1 +
 arch/powerpc/Kconfig                          |   1 +
 arch/score/Kconfig                            |   1 +
 arch/sh/Kconfig                               |   1 +
 arch/sparc/Kconfig                            |   1 +
 arch/tile/Kconfig                             |   1 +
 arch/tile/kernel/compat.c                     |   3 +
 arch/unicore32/Kconfig                        |   1 +
 arch/x86/Kconfig                              |   1 +
 arch/x86/um/Kconfig                           |   1 +
 arch/xtensa/Kconfig                           |   1 +
 drivers/clocksource/arm_arch_timer.c          |   2 +-
 include/linux/fcntl.h                         |   2 +-
 include/linux/ptrace.h                        |   6 +
 include/linux/thread_bits.h                   |  54 ++++++++
 include/linux/thread_info.h                   |  44 +------
 include/uapi/asm-generic/unistd.h             |   5 +-
 kernel/ptrace.c                               |  10 +-
 87 files changed, 1635 insertions(+), 389 deletions(-)
 create mode 100644 Documentation/arm64/ilp32.txt
 create mode 100644 arch/arm64/include/asm/is_compat.h
 create mode 100644 arch/arm64/include/asm/signal32_common.h
 create mode 100644 arch/arm64/include/asm/signal_common.h
 create mode 100644 arch/arm64/include/asm/signal_ilp32.h
 create mode 100644 arch/arm64/kernel/binfmt_elf32.c
 create mode 100644 arch/arm64/kernel/binfmt_ilp32.c
 create mode 100644 arch/arm64/kernel/entry32_common.S
 create mode 100644 arch/arm64/kernel/entry_ilp32.S
 create mode 100644 arch/arm64/kernel/signal32_common.c
 create mode 100644 arch/arm64/kernel/signal_ilp32.c
 create mode 100644 arch/arm64/kernel/sys_ilp32.c
 create mode 100644 arch/arm64/kernel/vdso-ilp32/.gitignore
 create mode 100644 arch/arm64/kernel/vdso-ilp32/Makefile
 create mode 100644 arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S
 create mode 100644 arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S
 create mode 100644 include/linux/thread_bits.h

-- 
2.7.4

* [PATCH 01/18] 32-bit ABI: introduce ARCH_32BIT_OFF_T config option
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-10-24 16:30   ` Chris Metcalf
  2016-10-21 20:33 ` [PATCH 02/18] arm64: ilp32: add documentation on the ILP32 ABI for ARM64 Yury Norov
                   ` (20 subsequent siblings)
  21 siblings, 1 reply; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf

All new 32-bit architectures should have a 64-bit off_t type, but existing
architectures have 32-bit ones.

To handle this, a new config option is added to arch/Kconfig. ARCH_32BIT_OFF_T
is disabled by default and can only be enabled on non-64-bit architectures.
All existing 32-bit architectures enable it explicitly here.

The new option affects force_o_largefile() behaviour. Namely, if off_t is
64 bits long, we have no reason to prevent the user from opening big files.

For the syscalls sys_openat() and sys_open_by_handle_at(), force_o_largefile()
is called to set the O_LARGEFILE flag, and this is the only difference
compared to the compat versions. All compat ABIs have already switched to
64-bit off_t, except tile. So the compat versions of these syscalls are not
needed anymore. Tile is handled explicitly.
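
As an illustration of the flag handling described above, here is a small
standalone C model. It is a sketch, not kernel code: the demo_* names and
the DEMO_O_LARGEFILE value are made up for this example, and the boolean
parameter stands in for IS_ENABLED(CONFIG_ARCH_32BIT_OFF_T).

```c
#include <stdbool.h>

/* Illustrative value only; the real O_LARGEFILE is architecture-specific. */
#define DEMO_O_LARGEFILE 0x8000

/*
 * Model of the new force_o_largefile() definition from this patch:
 * !IS_ENABLED(CONFIG_ARCH_32BIT_OFF_T).
 */
bool demo_force_o_largefile(bool arch_32bit_off_t)
{
	return !arch_32bit_off_t;
}

/* Flag adjustment done on the open path before the file is opened. */
int demo_fixup_open_flags(int flags, bool arch_32bit_off_t)
{
	if (demo_force_o_largefile(arch_32bit_off_t))
		flags |= DEMO_O_LARGEFILE;
	return flags;
}
```

With the option disabled, the flag is always added, so no separate compat
entry point is needed. With the option enabled, the flags pass through
unchanged and userspace has to request O_LARGEFILE itself.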

Note that even if an architecture has only a 64-bit off_t in the kernel
(arc, c6x, h8300, hexagon, metag, nios2, openrisc, tile32 and unicore32),
a libc may use a 32-bit off_t, and therefore want to limit the file size
to 4GB unless specified differently in the open flags.

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
---
 arch/Kconfig                      | 4 ++++
 arch/arc/Kconfig                  | 1 +
 arch/arm/Kconfig                  | 1 +
 arch/blackfin/Kconfig             | 1 +
 arch/cris/Kconfig                 | 1 +
 arch/frv/Kconfig                  | 1 +
 arch/h8300/Kconfig                | 1 +
 arch/hexagon/Kconfig              | 1 +
 arch/m32r/Kconfig                 | 1 +
 arch/m68k/Kconfig                 | 1 +
 arch/metag/Kconfig                | 1 +
 arch/microblaze/Kconfig           | 1 +
 arch/mips/Kconfig                 | 1 +
 arch/mn10300/Kconfig              | 1 +
 arch/nios2/Kconfig                | 1 +
 arch/openrisc/Kconfig             | 1 +
 arch/parisc/Kconfig               | 1 +
 arch/powerpc/Kconfig              | 1 +
 arch/score/Kconfig                | 1 +
 arch/sh/Kconfig                   | 1 +
 arch/sparc/Kconfig                | 1 +
 arch/tile/Kconfig                 | 1 +
 arch/tile/kernel/compat.c         | 3 +++
 arch/unicore32/Kconfig            | 1 +
 arch/x86/Kconfig                  | 1 +
 arch/x86/um/Kconfig               | 1 +
 arch/xtensa/Kconfig               | 1 +
 include/linux/fcntl.h             | 2 +-
 include/uapi/asm-generic/unistd.h | 5 ++---
 29 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 659bdd0..ec06a71 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -234,6 +234,10 @@ config ARCH_THREAD_STACK_ALLOCATOR
 config ARCH_WANTS_DYNAMIC_TASK_STRUCT
 	bool
 
+config ARCH_32BIT_OFF_T
+	bool
+	depends on !64BIT
+
 config HAVE_REGS_AND_STACK_ACCESS_API
 	bool
 	help
diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index ecd1237..3e8dfd6 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -9,6 +9,7 @@
 config ARC
 	def_bool y
 	select ARCH_SUPPORTS_ATOMIC_RMW if ARC_HAS_LLSC
+	select ARCH_32BIT_OFF_T
 	select BUILDTIME_EXTABLE_SORT
 	select CLKSRC_OF
 	select CLONE_BACKWARDS
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index b5d529f..ff8b8b2 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1,6 +1,7 @@
 config ARM
 	bool
 	default y
+	select ARCH_32BIT_OFF_T
 	select ARCH_CLOCKSOURCE_DATA
 	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_HAS_ELF_RANDOMIZE
diff --git a/arch/blackfin/Kconfig b/arch/blackfin/Kconfig
index 3c1bd64..26418e7 100644
--- a/arch/blackfin/Kconfig
+++ b/arch/blackfin/Kconfig
@@ -12,6 +12,7 @@ config RWSEM_XCHGADD_ALGORITHM
 
 config BLACKFIN
 	def_bool y
+	select ARCH_32BIT_OFF_T
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_DYNAMIC_FTRACE
diff --git a/arch/cris/Kconfig b/arch/cris/Kconfig
index 71b758d..8c059f0 100644
--- a/arch/cris/Kconfig
+++ b/arch/cris/Kconfig
@@ -50,6 +50,7 @@ config LOCKDEP_SUPPORT
 config CRIS
 	bool
 	default y
+	select ARCH_32BIT_OFF_T
 	select HAVE_IDE
 	select GENERIC_ATOMIC64
 	select HAVE_UID16
diff --git a/arch/frv/Kconfig b/arch/frv/Kconfig
index eefd9a4..2f14904 100644
--- a/arch/frv/Kconfig
+++ b/arch/frv/Kconfig
@@ -1,6 +1,7 @@
 config FRV
 	bool
 	default y
+	select ARCH_32BIT_OFF_T
 	select HAVE_IDE
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_PERF_EVENTS
diff --git a/arch/h8300/Kconfig b/arch/h8300/Kconfig
index 3ae8525..29bbcb1 100644
--- a/arch/h8300/Kconfig
+++ b/arch/h8300/Kconfig
@@ -1,5 +1,6 @@
 config H8300
         def_bool y
+	select ARCH_32BIT_OFF_T
 	select GENERIC_ATOMIC64
 	select HAVE_UID16
 	select VIRT_TO_BUS
diff --git a/arch/hexagon/Kconfig b/arch/hexagon/Kconfig
index 1941e4b..bbcea8c 100644
--- a/arch/hexagon/Kconfig
+++ b/arch/hexagon/Kconfig
@@ -3,6 +3,7 @@ comment "Linux Kernel Configuration for Hexagon"
 
 config HEXAGON
 	def_bool y
+	select ARCH_32BIT_OFF_T
 	select HAVE_OPROFILE
 	# Other pending projects/to-do items.
 	# select HAVE_REGS_AND_STACK_ACCESS_API
diff --git a/arch/m32r/Kconfig b/arch/m32r/Kconfig
index 3cc8498..efa10d3 100644
--- a/arch/m32r/Kconfig
+++ b/arch/m32r/Kconfig
@@ -1,6 +1,7 @@
 config M32R
 	bool
 	default y
+	select ARCH_32BIT_OFF_T
 	select HAVE_IDE
 	select HAVE_OPROFILE
 	select INIT_ALL_POSSIBLE
diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index d140206..ed6f90c 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -1,6 +1,7 @@
 config M68K
 	bool
 	default y
+	select ARCH_32BIT_OFF_T
 	select ARCH_MIGHT_HAVE_PC_PARPORT if ISA
 	select HAVE_IDE
 	select HAVE_AOUT if MMU
diff --git a/arch/metag/Kconfig b/arch/metag/Kconfig
index 5b7a45d..c337192 100644
--- a/arch/metag/Kconfig
+++ b/arch/metag/Kconfig
@@ -1,5 +1,6 @@
 config METAG
 	def_bool y
+	select ARCH_32BIT_OFF_T
 	select EMBEDDED
 	select GENERIC_ATOMIC64
 	select GENERIC_CLOCKEVENTS
diff --git a/arch/microblaze/Kconfig b/arch/microblaze/Kconfig
index 86f6572..3a6146b 100644
--- a/arch/microblaze/Kconfig
+++ b/arch/microblaze/Kconfig
@@ -1,5 +1,6 @@
 config MICROBLAZE
 	def_bool y
+	select ARCH_32BIT_OFF_T
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_WANT_IPC_PARSE_VERSION
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index b3c5bde..a01da24 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -1,6 +1,7 @@
 config MIPS
 	bool
 	default y
+	select ARCH_32BIT_OFF_T if !64BIT
 	select ARCH_SUPPORTS_UPROBES
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_MIGHT_HAVE_PC_SERIO
diff --git a/arch/mn10300/Kconfig b/arch/mn10300/Kconfig
index 38e3494..c44c699 100644
--- a/arch/mn10300/Kconfig
+++ b/arch/mn10300/Kconfig
@@ -1,6 +1,7 @@
 config MN10300
 	def_bool y
 	select HAVE_EXIT_THREAD
+	select ARCH_32BIT_OFF_T
 	select HAVE_OPROFILE
 	select HAVE_UID16
 	select GENERIC_IRQ_SHOW
diff --git a/arch/nios2/Kconfig b/arch/nios2/Kconfig
index 51a56c8..f9273c9 100644
--- a/arch/nios2/Kconfig
+++ b/arch/nios2/Kconfig
@@ -1,5 +1,6 @@
 config NIOS2
 	def_bool y
+	select ARCH_32BIT_OFF_T
 	select CLKSRC_OF
 	select GENERIC_ATOMIC64
 	select GENERIC_CLOCKEVENTS
diff --git a/arch/openrisc/Kconfig b/arch/openrisc/Kconfig
index 489e7f9..c4c96c9 100644
--- a/arch/openrisc/Kconfig
+++ b/arch/openrisc/Kconfig
@@ -5,6 +5,7 @@
 
 config OPENRISC
 	def_bool y
+	select ARCH_32BIT_OFF_T
 	select OF
 	select OF_EARLY_FLATTREE
 	select IRQ_DOMAIN
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index 71c4a3a..025ae12 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -1,5 +1,6 @@
 config PARISC
 	def_bool y
+	select ARCH_32BIT_OFF_T if !64BIT
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select HAVE_IDE
 	select HAVE_OPROFILE
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 65fba4c..22178eb 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -80,6 +80,7 @@ config ARCH_HAS_DMA_SET_COHERENT_MASK
 config PPC
 	bool
 	default y
+	select ARCH_32BIT_OFF_T if PPC32
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_MIGHT_HAVE_PC_SERIO
 	select BINFMT_ELF
diff --git a/arch/score/Kconfig b/arch/score/Kconfig
index 507d631..0a9484b 100644
--- a/arch/score/Kconfig
+++ b/arch/score/Kconfig
@@ -2,6 +2,7 @@ menu "Machine selection"
 
 config SCORE
        def_bool y
+       select ARCH_32BIT_OFF_T
        select GENERIC_IRQ_SHOW
        select GENERIC_IOMAP
        select GENERIC_ATOMIC64
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index ee08695..1f99eb3 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -56,6 +56,7 @@ config SUPERH
 
 config SUPERH32
 	def_bool ARCH = "sh"
+	select ARCH_32BIT_OFF_T
 	select HAVE_KPROBES
 	select HAVE_KRETPROBES
 	select HAVE_IOREMAP_PROT if MMU && !X2TLB
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index b23c76b..36ef669 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -46,6 +46,7 @@ config SPARC
 
 config SPARC32
 	def_bool !64BIT
+	select ARCH_32BIT_OFF_T
 	select GENERIC_ATOMIC64
 	select CLZ_TAB
 	select HAVE_UID16
diff --git a/arch/tile/Kconfig b/arch/tile/Kconfig
index 4583c03..845dcbd 100644
--- a/arch/tile/Kconfig
+++ b/arch/tile/Kconfig
@@ -3,6 +3,7 @@
 
 config TILE
 	def_bool y
+	select ARCH_32BIT_OFF_T if !64BIT
 	select ARCH_HAS_DEVMEM_IS_ALLOWED
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select ARCH_WANT_FRAME_POINTERS
diff --git a/arch/tile/kernel/compat.c b/arch/tile/kernel/compat.c
index bdaf71d..b38a898 100644
--- a/arch/tile/kernel/compat.c
+++ b/arch/tile/kernel/compat.c
@@ -103,6 +103,9 @@ COMPAT_SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned int, offset_high,
 #define compat_sys_readahead sys32_readahead
 #define sys_llseek compat_sys_llseek
 
+#define sys_openat             compat_sys_openat
+#define sys_open_by_handle_at  compat_sys_open_by_handle_at
+
 /* Call the assembly trampolines where necessary. */
 #define compat_sys_rt_sigreturn _compat_sys_rt_sigreturn
 #define sys_clone _sys_clone
diff --git a/arch/unicore32/Kconfig b/arch/unicore32/Kconfig
index 0769066..cc642f9 100644
--- a/arch/unicore32/Kconfig
+++ b/arch/unicore32/Kconfig
@@ -1,6 +1,7 @@
 config UNICORE32
 	def_bool y
 	select ARCH_HAS_DEVMEM_IS_ALLOWED
+	select ARCH_32BIT_OFF_T
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_MIGHT_HAVE_PC_SERIO
 	select HAVE_MEMBLOCK
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index bada636..52d19b4 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -20,6 +20,7 @@ config X86
 	select ACPI_LEGACY_TABLES_LOOKUP	if ACPI
 	select ACPI_SYSTEM_POWER_STATES_SUPPORT	if ACPI
 	select ANON_INODES
+	select ARCH_32BIT_OFF_T			if X86_32
 	select ARCH_CLOCKSOURCE_DATA
 	select ARCH_DISCARD_MEMBLOCK
 	select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
diff --git a/arch/x86/um/Kconfig b/arch/x86/um/Kconfig
index ed56a1c..8436bcd 100644
--- a/arch/x86/um/Kconfig
+++ b/arch/x86/um/Kconfig
@@ -21,6 +21,7 @@ config 64BIT
 config X86_32
 	def_bool !64BIT
 	select HAVE_AOUT
+	select ARCH_32BIT_OFF_T
 	select ARCH_WANT_IPC_PARSE_VERSION
 	select MODULES_USE_ELF_REL
 	select CLONE_BACKWARDS
diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig
index f610586..90c062d 100644
--- a/arch/xtensa/Kconfig
+++ b/arch/xtensa/Kconfig
@@ -3,6 +3,7 @@ config ZONE_DMA
 
 config XTENSA
 	def_bool y
+	select ARCH_32BIT_OFF_T
 	select ARCH_WANT_FRAME_POINTERS
 	select ARCH_WANT_IPC_PARSE_VERSION
 	select BUILDTIME_EXTABLE_SORT
diff --git a/include/linux/fcntl.h b/include/linux/fcntl.h
index 76ce329..46960a1 100644
--- a/include/linux/fcntl.h
+++ b/include/linux/fcntl.h
@@ -5,7 +5,7 @@
 
 
 #ifndef force_o_largefile
-#define force_o_largefile() (BITS_PER_LONG != 32)
+#define force_o_largefile() (!IS_ENABLED(CONFIG_ARCH_32BIT_OFF_T))
 #endif
 
 #if BITS_PER_LONG == 32
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 9b1462e..a6062be 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -178,7 +178,7 @@ __SYSCALL(__NR_fchownat, sys_fchownat)
 #define __NR_fchown 55
 __SYSCALL(__NR_fchown, sys_fchown)
 #define __NR_openat 56
-__SC_COMP(__NR_openat, sys_openat, compat_sys_openat)
+__SYSCALL(__NR_openat, sys_openat)
 #define __NR_close 57
 __SYSCALL(__NR_close, sys_close)
 #define __NR_vhangup 58
@@ -676,8 +676,7 @@ __SYSCALL(__NR_fanotify_mark, sys_fanotify_mark)
 #define __NR_name_to_handle_at         264
 __SYSCALL(__NR_name_to_handle_at, sys_name_to_handle_at)
 #define __NR_open_by_handle_at         265
-__SC_COMP(__NR_open_by_handle_at, sys_open_by_handle_at, \
-	  compat_sys_open_by_handle_at)
+__SYSCALL(__NR_open_by_handle_at, sys_open_by_handle_at)
 #define __NR_clock_adjtime 266
 __SC_COMP(__NR_clock_adjtime, sys_clock_adjtime, compat_sys_clock_adjtime)
 #define __NR_syncfs 267
-- 
2.7.4

* [PATCH 02/18] arm64: ilp32: add documentation on the ILP32 ABI for ARM64
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
  2016-10-21 20:33 ` [PATCH 01/18] 32-bit ABI: introduce ARCH_32BIT_OFF_T config option Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-10-24 16:36   ` Chris Metcalf
  2016-10-21 20:33 ` [PATCH 03/18] arm64: rename COMPAT to AARCH32_EL0 in Kconfig Yury Norov
                   ` (19 subsequent siblings)
  21 siblings, 1 reply; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf

Based on Andrew Pinski's patch-series.

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
---
 Documentation/arm64/ilp32.txt | 46 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)
 create mode 100644 Documentation/arm64/ilp32.txt

diff --git a/Documentation/arm64/ilp32.txt b/Documentation/arm64/ilp32.txt
new file mode 100644
index 0000000..b96c18f
--- /dev/null
+++ b/Documentation/arm64/ilp32.txt
@@ -0,0 +1,46 @@
+ILP32 AARCH64 SYSCALL ABI
+=========================
+
+This document describes the ILP32 syscall ABI and where it differs
+from the generic compat linux syscall interface.
+
+AARCH64/ILP32 userspace can potentially access top halves of registers that
+are passed as syscall arguments, so such registers (w0-w7) are deloused.
+
+AARCH64/ILP32 provides the following types as 64-bit (compared to AARCH32):
+ino_t       is u64 type.
+off_t       is s64 type.
+blkcnt_t    is s64 type.
+fsblkcnt_t  is u64 type.
+fsfilcnt_t  is u64 type.
+rlim_t      is u64 type.
+
+AARCH64/ILP32 ABI uses standard syscall table which can be found at
+include/uapi/asm-generic/unistd.h, with the exceptions listed below.
+
+Syscalls which pass 64bit values are handled by the code shared from
+AARCH32 and pass that value as a pair. The following syscalls are affected:
+fadvise64_64()
+fallocate()
+ftruncate64()
+pread64()
+pwrite64()
+readahead()
+sync_file_range()
+truncate64()
+sys_mmap()
+
+ptrace() syscall is handled by compat version.
+
+shmat() syscall is handled by non-compat handler as aarch64/ilp32 has no
+limitation on 4-pages alignment for shared memory.
+
+statfs() and fstatfs() take the size of struct statfs as an argument.
+It is calculated differently in kernel and user spaces. So AARCH32 handlers
+are taken to handle it.
+
+struct rt_sigframe is redefined and contains struct compat_siginfo,
+as compat syscalls expect, and struct ilp32_sigframe, to handle
+AARCH64 register set and 32-bit userspace register representation.
+
+elf_gregset_t is taken from lp64 to handle registers properly.
-- 
2.7.4
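
For readers unfamiliar with the "passed as a pair" convention mentioned in
ilp32.txt above, the following is a minimal C sketch of splitting a 64-bit
argument into two 32-bit halves and joining them again. The helper names
are hypothetical and do not appear in the series. Which register of the
pair carries which half is not modelled here.

```c
#include <stdint.h>

/* Split a 64-bit value into the two 32-bit halves passed in separate registers. */
void demo_split_u64(uint64_t val, uint32_t *lo, uint32_t *hi)
{
	*lo = (uint32_t)val;		/* bits 31:0 */
	*hi = (uint32_t)(val >> 32);	/* bits 63:32 */
}

/* What a handler does on entry: rebuild the 64-bit value from the pair. */
uint64_t demo_join_u64(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}
```

Because the entry code clears the top halves of the argument registers, a
64-bit offset cannot travel in a single register for ILP32 tasks, which is
why the pair-based handlers are shared with AARCH32.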

* [PATCH 03/18] arm64: rename COMPAT to AARCH32_EL0 in Kconfig
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
  2016-10-21 20:33 ` [PATCH 01/18] 32-bit ABI: introduce ARCH_32BIT_OFF_T config option Yury Norov
  2016-10-21 20:33 ` [PATCH 02/18] arm64: ilp32: add documentation on the ILP32 ABI for ARM64 Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-10-21 20:33 ` [PATCH 04/18] arm64: ensure the kernel is compiled for LP64 Yury Norov
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf, Andrew Pinski,
	Andrew Pinski, Bamvor Jian Zhang

From: Andrew Pinski <apinski@cavium.com>

This patchset adds ILP32 ABI support. In addition to AARCH32,
which is binary-compatible with ARM, ILP32 is (mostly) ABI-compatible.

From now on, the AARCH32_EL0 (former COMPAT) config option means support for
AARCH32 userspace, ARM64_ILP32 means support for the ILP32 ABI (see next
patches), and COMPAT indicates that one of them, or both, is enabled.

Where needed, CONFIG_COMPAT is changed over to use CONFIG_AARCH32_EL0 instead.

Signed-off-by: Andrew Pinski <Andrew.Pinski@caviumnetworks.com>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
Signed-off-by: Bamvor Jian Zhang <bamvor.zhangjian@linaro.org>
---
 arch/arm64/Kconfig                   | 10 ++++++++--
 arch/arm64/include/asm/fpsimd.h      |  2 +-
 arch/arm64/include/asm/hwcap.h       |  4 ++--
 arch/arm64/include/asm/processor.h   |  6 +++---
 arch/arm64/include/asm/ptrace.h      |  2 +-
 arch/arm64/include/asm/seccomp.h     |  2 +-
 arch/arm64/include/asm/signal32.h    |  6 ++++--
 arch/arm64/include/asm/unistd.h      |  2 +-
 arch/arm64/kernel/Makefile           |  2 +-
 arch/arm64/kernel/asm-offsets.c      |  2 +-
 arch/arm64/kernel/cpufeature.c       |  8 ++++----
 arch/arm64/kernel/cpuinfo.c          | 20 +++++++++++---------
 arch/arm64/kernel/entry.S            |  6 +++---
 arch/arm64/kernel/head.S             |  2 +-
 arch/arm64/kernel/ptrace.c           |  8 ++++----
 arch/arm64/kernel/traps.c            |  2 +-
 arch/arm64/kernel/vdso.c             |  4 ++--
 drivers/clocksource/arm_arch_timer.c |  2 +-
 18 files changed, 50 insertions(+), 40 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 30398db..0cd786e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -396,7 +396,7 @@ config ARM64_ERRATUM_834220
 
 config ARM64_ERRATUM_845719
 	bool "Cortex-A53: 845719: a load might read incorrect data"
-	depends on COMPAT
+	depends on AARCH32_EL0
 	default y
 	help
 	  This option adds an alternative code sequence to work around ARM
@@ -725,7 +725,7 @@ config FORCE_MAX_ZONEORDER
 
 menuconfig ARMV8_DEPRECATED
 	bool "Emulate deprecated/obsolete ARMv8 instructions"
-	depends on COMPAT
+	depends on AARCH32_EL0
 	help
 	  Legacy software support may require certain instructions
 	  that have been deprecated or obsoleted in the architecture.
@@ -995,8 +995,14 @@ menu "Userspace binary formats"
 source "fs/Kconfig.binfmt"
 
 config COMPAT
+	bool
+	depends on AARCH32_EL0
+
+config AARCH32_EL0
 	bool "Kernel support for 32-bit EL0"
+	def_bool y
 	depends on ARM64_4K_PAGES || EXPERT
+	select COMPAT
 	select COMPAT_BINFMT_ELF
 	select HAVE_UID16
 	select OLD_SIGSUSPEND3
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 50f559f..63b19f1 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -52,7 +52,7 @@ struct fpsimd_partial_state {
 };
 
 
-#if defined(__KERNEL__) && defined(CONFIG_COMPAT)
+#if defined(__KERNEL__) && defined(CONFIG_AARCH32_EL0)
 /* Masks for extracting the FPSR and FPCR from the FPSCR */
 #define VFP_FPSCR_STAT_MASK	0xf800009f
 #define VFP_FPSCR_CTRL_MASK	0x07f79f00
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 400b80b..2c7fc5d 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -46,7 +46,7 @@
  */
 #define ELF_HWCAP		(elf_hwcap)
 
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 #define COMPAT_ELF_HWCAP	(compat_elf_hwcap)
 #define COMPAT_ELF_HWCAP2	(compat_elf_hwcap2)
 extern unsigned int compat_elf_hwcap, compat_elf_hwcap2;
@@ -54,7 +54,7 @@ extern unsigned int compat_elf_hwcap, compat_elf_hwcap2;
 
 enum {
 	CAP_HWCAP = 1,
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 	CAP_COMPAT_HWCAP,
 	CAP_COMPAT_HWCAP2,
 #endif
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index df2e53d..6173a7b 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -79,7 +79,7 @@ struct cpu_context {
 struct thread_struct {
 	struct cpu_context	cpu_context;	/* cpu context */
 	unsigned long		tp_value;	/* TLS register */
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 	unsigned long		tp2_value;
 #endif
 	struct fpsimd_state	fpsimd_state;
@@ -88,7 +88,7 @@ struct thread_struct {
 	struct debug_info	debug;		/* debugging */
 };
 
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 #define task_user_tls(t)						\
 ({									\
 	unsigned long *__tls;						\
@@ -119,7 +119,7 @@ static inline void start_thread(struct pt_regs *regs, unsigned long pc,
 	regs->sp = sp;
 }
 
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 static inline void compat_start_thread(struct pt_regs *regs, unsigned long pc,
 				       unsigned long sp)
 {
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index ada08b5..f5ca5f5 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -125,7 +125,7 @@ struct pt_regs {
 
 #define arch_has_single_step()	(1)
 
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 #define compat_thumb_mode(regs) \
 	(((regs)->pstate & COMPAT_PSR_T_BIT))
 #else
diff --git a/arch/arm64/include/asm/seccomp.h b/arch/arm64/include/asm/seccomp.h
index c76fac9..00ef0bf 100644
--- a/arch/arm64/include/asm/seccomp.h
+++ b/arch/arm64/include/asm/seccomp.h
@@ -13,7 +13,7 @@
 
 #include <asm/unistd.h>
 
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 #define __NR_seccomp_read_32		__NR_compat_read
 #define __NR_seccomp_write_32		__NR_compat_write
 #define __NR_seccomp_exit_32		__NR_compat_exit
diff --git a/arch/arm64/include/asm/signal32.h b/arch/arm64/include/asm/signal32.h
index eeaa975..e68fcce 100644
--- a/arch/arm64/include/asm/signal32.h
+++ b/arch/arm64/include/asm/signal32.h
@@ -17,7 +17,9 @@
 #define __ASM_SIGNAL32_H
 
 #ifdef __KERNEL__
-#ifdef CONFIG_COMPAT
+
+#ifdef CONFIG_AARCH32_EL0
+
 #include <linux/compat.h>
 
 #define AARCH32_KERN_SIGRET_CODE_OFFSET	0x500
@@ -47,6 +49,6 @@ static inline int compat_setup_rt_frame(int usig, struct ksignal *ksig, sigset_t
 static inline void compat_setup_restart_syscall(struct pt_regs *regs)
 {
 }
-#endif /* CONFIG_COMPAT */
+#endif /* CONFIG_AARCH32_EL0 */
 #endif /* __KERNEL__ */
 #endif /* __ASM_SIGNAL32_H */
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index e78ac26..fe9d6c1 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -13,7 +13,7 @@
  * You should have received a copy of the GNU General Public License
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 #define __ARCH_WANT_COMPAT_SYS_GETDENTS64
 #define __ARCH_WANT_COMPAT_STAT64
 #define __ARCH_WANT_SYS_GETHOSTNAME
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 7d66bba..8a19fda 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -27,7 +27,7 @@ OBJCOPYFLAGS := --prefix-symbols=__efistub_
 $(obj)/%.stub.o: $(obj)/%.o FORCE
 	$(call if_changed,objcopy)
 
-arm64-obj-$(CONFIG_COMPAT)		+= sys32.o kuser32.o signal32.o 	\
+arm64-obj-$(CONFIG_AARCH32_EL0)		+= sys32.o kuser32.o signal32.o 	\
 					   sys_compat.o entry32.o
 arm64-obj-$(CONFIG_FUNCTION_TRACER)	+= ftrace.o entry-ftrace.o
 arm64-obj-$(CONFIG_MODULES)		+= arm64ksyms.o module.o
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 4a2f0f0..d8d7086 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -65,7 +65,7 @@ int main(void)
   DEFINE(S_X28,			offsetof(struct pt_regs, regs[28]));
   DEFINE(S_LR,			offsetof(struct pt_regs, regs[30]));
   DEFINE(S_SP,			offsetof(struct pt_regs, sp));
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
   DEFINE(S_COMPAT_SP,		offsetof(struct pt_regs, compat_sp));
 #endif
   DEFINE(S_PSTATE,		offsetof(struct pt_regs, pstate));
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index b3ac0c4..12805ee 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -32,7 +32,7 @@
 unsigned long elf_hwcap __read_mostly;
 EXPORT_SYMBOL_GPL(elf_hwcap);
 
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 #define COMPAT_ELF_HWCAP_DEFAULT	\
 				(COMPAT_HWCAP_HALF|COMPAT_HWCAP_THUMB|\
 				 COMPAT_HWCAP_FAST_MULT|COMPAT_HWCAP_EDSP|\
@@ -859,7 +859,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
 };
 
 static const struct arm64_cpu_capabilities compat_elf_hwcaps[] = {
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 	HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_AES_SHIFT, FTR_UNSIGNED, 2, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_PMULL),
 	HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_AES_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_AES),
 	HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_SHA1_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_SHA1),
@@ -875,7 +875,7 @@ static void __init cap_set_elf_hwcap(const struct arm64_cpu_capabilities *cap)
 	case CAP_HWCAP:
 		elf_hwcap |= cap->hwcap;
 		break;
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 	case CAP_COMPAT_HWCAP:
 		compat_elf_hwcap |= (u32)cap->hwcap;
 		break;
@@ -898,7 +898,7 @@ static bool cpus_have_elf_hwcap(const struct arm64_cpu_capabilities *cap)
 	case CAP_HWCAP:
 		rc = (elf_hwcap & cap->hwcap) != 0;
 		break;
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 	case CAP_COMPAT_HWCAP:
 		rc = (compat_elf_hwcap & (u32)cap->hwcap) != 0;
 		break;
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index c742df5..b76c759 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -134,15 +134,17 @@ static int c_show(struct seq_file *m, void *v)
 		 */
 		seq_puts(m, "Features\t:");
 		if (compat) {
-#ifdef CONFIG_COMPAT
-			for (j = 0; compat_hwcap_str[j]; j++)
-				if (compat_elf_hwcap & (1 << j))
-					seq_printf(m, " %s", compat_hwcap_str[j]);
-
-			for (j = 0; compat_hwcap2_str[j]; j++)
-				if (compat_elf_hwcap2 & (1 << j))
-					seq_printf(m, " %s", compat_hwcap2_str[j]);
-#endif /* CONFIG_COMPAT */
+#ifdef CONFIG_AARCH32_EL0
+			if (personality(current->personality) == PER_LINUX32) {
+				for (j = 0; compat_hwcap_str[j]; j++)
+					if (compat_elf_hwcap & (1 << j))
+						seq_printf(m, " %s", compat_hwcap_str[j]);
+
+				for (j = 0; compat_hwcap2_str[j]; j++)
+					if (compat_elf_hwcap2 & (1 << j))
+						seq_printf(m, " %s", compat_hwcap2_str[j]);
+			}
+#endif /* CONFIG_AARCH32_EL0 */
 		} else {
 			for (j = 0; hwcap_str[j]; j++)
 				if (elf_hwcap & (1 << j))
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 223d54a..b6fb14b 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -271,7 +271,7 @@ ENTRY(vectors)
 	ventry	el0_fiq_invalid			// FIQ 64-bit EL0
 	ventry	el0_error_invalid		// Error 64-bit EL0
 
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 	ventry	el0_sync_compat			// Synchronous 32-bit EL0
 	ventry	el0_irq_compat			// IRQ 32-bit EL0
 	ventry	el0_fiq_invalid_compat		// FIQ 32-bit EL0
@@ -311,7 +311,7 @@ el0_error_invalid:
 	inv_entry 0, BAD_ERROR
 ENDPROC(el0_error_invalid)
 
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 el0_fiq_invalid_compat:
 	inv_entry 0, BAD_FIQ, 32
 ENDPROC(el0_fiq_invalid_compat)
@@ -479,7 +479,7 @@ el0_sync:
 	b.ge	el0_dbg
 	b	el0_inv
 
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 	.align	6
 el0_sync_compat:
 	kernel_entry 0, 32
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 427f6d3..10cb017 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -575,7 +575,7 @@ CPU_LE(	movk	x0, #0x30d0, lsl #16	)	// Clear EE and E0E on LE systems
 	msr	cptr_el2, x0			// Disable copro. traps to EL2
 1:
 
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 	msr	hstr_el2, xzr			// Disable CP15 traps to EL2
 #endif
 
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index e0c81da..1d6f43e 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -183,7 +183,7 @@ static void ptrace_hbptriggered(struct perf_event *bp,
 		.si_addr	= (void __user *)(bkpt->trigger),
 	};
 
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 	int i;
 
 	if (!is_compat_task())
@@ -758,7 +758,7 @@ static const struct user_regset_view user_aarch64_view = {
 	.regsets = aarch64_regsets, .n = ARRAY_SIZE(aarch64_regsets)
 };
 
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 #include <linux/compat.h>
 
 enum compat_regset {
@@ -1293,11 +1293,11 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
 
 	return ret;
 }
-#endif /* CONFIG_COMPAT */
+#endif /* CONFIG_AARCH32_EL0 */
 
 const struct user_regset_view *task_user_regset_view(struct task_struct *task)
 {
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 	/*
 	 * Core dumping of 32-bit tasks or compat ptrace requests must use the
 	 * user_aarch32_view compatible with arm32. Native ptrace requests on
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 5ff020f..14a08a0 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -526,7 +526,7 @@ long compat_arm_syscall(struct pt_regs *regs);
 
 asmlinkage long do_ni_syscall(struct pt_regs *regs)
 {
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 	long ret;
 	if (is_compat_task()) {
 		ret = compat_arm_syscall(regs);
diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index a2c2478..7f822cd 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -49,7 +49,7 @@ static union {
 } vdso_data_store __page_aligned_data;
 struct vdso_data *vdso_data = &vdso_data_store.data;
 
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 /*
  * Create and map the vectors page for AArch32 tasks.
  */
@@ -108,7 +108,7 @@ int aarch32_setup_vectors_page(struct linux_binprm *bprm, int uses_interp)
 
 	return PTR_ERR_OR_ZERO(ret);
 }
-#endif /* CONFIG_COMPAT */
+#endif /* CONFIG_AARCH32_EL0 */
 
 static struct vm_special_mapping vdso_spec[2] __ro_after_init = {
 	{
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
index 73c487d..0ed1b62 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -418,7 +418,7 @@ static void arch_timer_evtstrm_enable(int divider)
 			| ARCH_TIMER_VIRT_EVT_EN;
 	arch_timer_set_cntkctl(cntkctl);
 	elf_hwcap |= HWCAP_EVTSTRM;
-#ifdef CONFIG_COMPAT
+#ifdef CONFIG_AARCH32_EL0
 	compat_elf_hwcap |= COMPAT_HWCAP_EVTSTRM;
 #endif
 }
-- 
2.7.4

* [PATCH 04/18] arm64: ensure the kernel is compiled for LP64
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (2 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 03/18] arm64: rename COMPAT to AARCH32_EL0 in Kconfig Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-10-21 20:33 ` [PATCH 05/18] arm64:uapi: set __BITS_PER_LONG correctly for ILP32 and LP64 Yury Norov
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf, Andrew Pinski,
	Andrew Pinski

From: Andrew Pinski <apinski@cavium.com>

The kernel must be compiled as an LP64 binary on ARM64, even when the
compiler defaults to generating code for the ILP32 ABI. Consequently,
explicitly pass '-mabi=lp64' (supported by gcc-4.9 and newer) to the
compiler and assembler, and select the matching linker emulation
('-maarch64linux', or '-maarch64linuxb' for big-endian).

Signed-off-by: Andrew Pinski <Andrew.Pinski@caviumnetworks.com>
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Reviewed-by: David Daney <ddaney@caviumnetworks.com>
---
 arch/arm64/Makefile | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index ab51aed..80eb000 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -42,15 +42,20 @@ KBUILD_CFLAGS	+= -fno-asynchronous-unwind-tables
 KBUILD_CFLAGS	+= $(call cc-option, -mpc-relative-literal-loads)
 KBUILD_AFLAGS	+= $(lseinstr)
 
+KBUILD_CFLAGS	+= $(call cc-option,-mabi=lp64)
+KBUILD_AFLAGS	+= $(call cc-option,-mabi=lp64)
+
 ifeq ($(CONFIG_CPU_BIG_ENDIAN), y)
 KBUILD_CPPFLAGS	+= -mbig-endian
 AS		+= -EB
 LD		+= -EB
+LDFLAGS		+= -maarch64linuxb
 UTS_MACHINE	:= aarch64_be
 else
 KBUILD_CPPFLAGS	+= -mlittle-endian
 AS		+= -EL
 LD		+= -EL
+LDFLAGS		+= -maarch64linux
 UTS_MACHINE	:= aarch64
 endif
 
-- 
2.7.4

* [PATCH 05/18] arm64:uapi: set __BITS_PER_LONG correctly for ILP32 and LP64
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (3 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 04/18] arm64: ensure the kernel is compiled for LP64 Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-10-21 20:33 ` [PATCH 06/18] thread: move thread bits accessors to separated file Yury Norov
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf, Andrew Pinski

From: Andrew Pinski <apinski@cavium.com>

Define __BITS_PER_LONG depending on the ABI used (i.e. check whether
__ILP32__ or __LP64__ is defined).  This is necessary for glibc to
determine the appropriate type definitions for the system call interface.
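
For illustration only (not part of the patch): a minimal userspace sketch
of the same selection logic, cross-checked against what the compiler
actually uses for 'long'. EXAMPLE_BITS_PER_LONG and
example_bits_per_long_matches() are hypothetical names made up for this
sketch, and it assumes a GCC-compatible compiler that predefines
__LP64__ or __ILP32__.

```c
#include <limits.h>	/* CHAR_BIT */

/*
 * Hypothetical stand-in for __BITS_PER_LONG. The real definition lives in
 * arch/arm64/include/uapi/asm/bitsperlong.h; this only mirrors its logic.
 */
#if defined(__LP64__)
# define EXAMPLE_BITS_PER_LONG 64
#elif defined(__ILP32__)
# define EXAMPLE_BITS_PER_LONG 32
#else
# error "Neither LP64 nor ILP32: unsupported ABI in this sketch"
#endif

/* Returns 1 if the macro-derived width equals the real width of 'long'. */
static int example_bits_per_long_matches(void)
{
	return EXAMPLE_BITS_PER_LONG == (int)(sizeof(long) * CHAR_BIT);
}
```

Building this once with -mabi=lp64 and once with -mabi=ilp32 should select
a different branch each time, while the cross-check holds in both cases.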

Signed-off-by: Andrew Pinski <apinski@cavium.com>
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Reviewed-by: David Daney <ddaney@caviumnetworks.com>
---
 arch/arm64/include/uapi/asm/bitsperlong.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/uapi/asm/bitsperlong.h b/arch/arm64/include/uapi/asm/bitsperlong.h
index fce9c29..ab61d68 100644
--- a/arch/arm64/include/uapi/asm/bitsperlong.h
+++ b/arch/arm64/include/uapi/asm/bitsperlong.h
@@ -16,7 +16,14 @@
 #ifndef __ASM_BITSPERLONG_H
 #define __ASM_BITSPERLONG_H
 
-#define __BITS_PER_LONG 64
+#if defined(__LP64__)
+/* Assuming __LP64__ will be defined for native ELF64's and not for ILP32. */
+# define __BITS_PER_LONG 64
+#elif defined(__ILP32__)
+# define __BITS_PER_LONG 32
+#else
+# error "Neither LP64 nor ILP32: unsupported ABI in asm/bitsperlong.h"
+#endif
 
 #include <asm-generic/bitsperlong.h>
 
-- 
2.7.4

* [PATCH 06/18] thread: move thread bits accessors to separated file
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (4 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 05/18] arm64:uapi: set __BITS_PER_LONG correctly for ILP32 and LP64 Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-10-21 20:33 ` [PATCH 07/18] arm64: introduce is_a32_task and is_a32_thread (for AArch32 compat) Yury Norov
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf

The thread bits accessors may be used by low-level code, so move them
into a separate header, <linux/thread_bits.h>, to avoid circular
dependencies between header files.

The specific circular dependency is caused by the WARN_ON() macro
added in commit edd63a27 ("set_restore_sigmask() is never called
without SIGPENDING (and never should be)").
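
For illustration only (not part of the patch): a rough userspace model of
what the moved accessors do. All example_* names and struct
example_thread_info are hypothetical stand-ins. The kernel versions use
the atomic set_bit()/clear_bit()/test_bit() family, whereas this sketch
uses plain, non-atomic bit operations.

```c
/* Hypothetical stand-in for struct thread_info; only the flags word. */
struct example_thread_info {
	unsigned long flags;
};

/* Same bit number that arm64 uses for TIF_32BIT in this series. */
#define EXAMPLE_TIF_32BIT 22

static void example_set_ti_thread_flag(struct example_thread_info *ti, int flag)
{
	ti->flags |= 1UL << flag;	/* kernel: atomic set_bit() */
}

static void example_clear_ti_thread_flag(struct example_thread_info *ti, int flag)
{
	ti->flags &= ~(1UL << flag);	/* kernel: atomic clear_bit() */
}

static int example_test_ti_thread_flag(const struct example_thread_info *ti, int flag)
{
	return (int)((ti->flags >> flag) & 1UL);
}

/* Returns the previous value of the flag, then sets it (not atomic here). */
static int example_test_and_set_ti_thread_flag(struct example_thread_info *ti, int flag)
{
	int old = example_test_ti_thread_flag(ti, flag);

	example_set_ti_thread_flag(ti, flag);
	return old;
}
```

The accessors need nothing beyond the bit operations and the thread_info
layout, which is why they can sit in a small header of their own.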

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
---
 include/linux/thread_bits.h | 54 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/thread_info.h | 44 +-----------------------------------
 2 files changed, 55 insertions(+), 43 deletions(-)
 create mode 100644 include/linux/thread_bits.h

diff --git a/include/linux/thread_bits.h b/include/linux/thread_bits.h
new file mode 100644
index 0000000..ed788b0
--- /dev/null
+++ b/include/linux/thread_bits.h
@@ -0,0 +1,54 @@
+
+/* thread_bits.h: common low-level thread bits accessors */
+
+#ifndef _LINUX_THREAD_BITS_H
+#define _LINUX_THREAD_BITS_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/bitops.h>
+#include <asm/thread_info.h>
+
+/*
+ * flag set/clear/test wrappers
+ * - pass TIF_xxxx constants to these functions
+ */
+
+static inline void set_ti_thread_flag(struct thread_info *ti, int flag)
+{
+	set_bit(flag, (unsigned long *)&ti->flags);
+}
+
+static inline void clear_ti_thread_flag(struct thread_info *ti, int flag)
+{
+	clear_bit(flag, (unsigned long *)&ti->flags);
+}
+
+static inline int test_and_set_ti_thread_flag(struct thread_info *ti, int flag)
+{
+	return test_and_set_bit(flag, (unsigned long *)&ti->flags);
+}
+
+static inline int test_and_clear_ti_thread_flag(struct thread_info *ti, int flag)
+{
+	return test_and_clear_bit(flag, (unsigned long *)&ti->flags);
+}
+
+static inline int test_ti_thread_flag(struct thread_info *ti, int flag)
+{
+	return test_bit(flag, (unsigned long *)&ti->flags);
+}
+
+#define set_thread_flag(flag) \
+	set_ti_thread_flag(current_thread_info(), flag)
+#define clear_thread_flag(flag) \
+	clear_ti_thread_flag(current_thread_info(), flag)
+#define test_and_set_thread_flag(flag) \
+	test_and_set_ti_thread_flag(current_thread_info(), flag)
+#define test_and_clear_thread_flag(flag) \
+	test_and_clear_ti_thread_flag(current_thread_info(), flag)
+#define test_thread_flag(flag) \
+	test_ti_thread_flag(current_thread_info(), flag)
+
+#endif /* !__ASSEMBLY__ */
+#endif /* _LINUX_THREAD_BITS_H */
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 45f004e..f6e3239 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -65,8 +65,7 @@ struct restart_block {
 
 extern long do_no_restart_syscall(struct restart_block *parm);
 
-#include <linux/bitops.h>
-#include <asm/thread_info.h>
+#include <linux/thread_bits.h>
 
 #ifdef __KERNEL__
 
@@ -77,47 +76,6 @@ extern long do_no_restart_syscall(struct restart_block *parm);
 # define THREADINFO_GFP		(GFP_KERNEL_ACCOUNT | __GFP_NOTRACK)
 #endif
 
-/*
- * flag set/clear/test wrappers
- * - pass TIF_xxxx constants to these functions
- */
-
-static inline void set_ti_thread_flag(struct thread_info *ti, int flag)
-{
-	set_bit(flag, (unsigned long *)&ti->flags);
-}
-
-static inline void clear_ti_thread_flag(struct thread_info *ti, int flag)
-{
-	clear_bit(flag, (unsigned long *)&ti->flags);
-}
-
-static inline int test_and_set_ti_thread_flag(struct thread_info *ti, int flag)
-{
-	return test_and_set_bit(flag, (unsigned long *)&ti->flags);
-}
-
-static inline int test_and_clear_ti_thread_flag(struct thread_info *ti, int flag)
-{
-	return test_and_clear_bit(flag, (unsigned long *)&ti->flags);
-}
-
-static inline int test_ti_thread_flag(struct thread_info *ti, int flag)
-{
-	return test_bit(flag, (unsigned long *)&ti->flags);
-}
-
-#define set_thread_flag(flag) \
-	set_ti_thread_flag(current_thread_info(), flag)
-#define clear_thread_flag(flag) \
-	clear_ti_thread_flag(current_thread_info(), flag)
-#define test_and_set_thread_flag(flag) \
-	test_and_set_ti_thread_flag(current_thread_info(), flag)
-#define test_and_clear_thread_flag(flag) \
-	test_and_clear_ti_thread_flag(current_thread_info(), flag)
-#define test_thread_flag(flag) \
-	test_ti_thread_flag(current_thread_info(), flag)
-
 #define tif_need_resched() test_thread_flag(TIF_NEED_RESCHED)
 
 #ifndef CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES
-- 
2.7.4

* [PATCH 07/18] arm64: introduce is_a32_task and is_a32_thread (for AArch32 compat)
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (5 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 06/18] thread: move thread bits accessors to separated file Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-10-21 20:33 ` [PATCH 08/18] arm64: ilp32: add is_ilp32_compat_{task,thread} and TIF_32BIT_AARCH64 Yury Norov
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf, Andrew Pinski,
	Bamvor Zhang Jian

Based on a patch by Andrew Pinski.

This patch introduces is_a32_compat_task() and is_a32_compat_thread(),
which make it easy to distinguish an AArch32-specific task or thread
from a generic compat one. The helpers live in <asm/is_compat.h> to
keep the header dependencies manageable.

Some files include both <linux/compat.h> and <asm/compat.h>. This is
redundant because <linux/compat.h> already includes <asm/compat.h>, so
the extra <asm/compat.h> includes are removed.
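
For illustration only (not part of the patch): a userspace sketch of the
layering between the AArch32-specific predicate and the generic compat
predicate. All example_* names are hypothetical, and
EXAMPLE_CONFIG_AARCH32_EL0 merely models the Kconfig option. That a later
patch in the series extends the generic predicate for ILP32 is an
assumption based on the title of patch 08/18.

```c
/* Hypothetical stand-in for struct thread_info; only the flags word. */
struct example_thread_info {
	unsigned long flags;
};

#define EXAMPLE_TIF_32BIT 22		/* AArch32 task bit, as in the patch */
#define EXAMPLE_CONFIG_AARCH32_EL0 1	/* set to 0 to model AArch32 EL0 disabled */

#if EXAMPLE_CONFIG_AARCH32_EL0
/* AArch32-specific check: true only for tasks flagged as AArch32. */
static int example_is_a32_compat_thread(const struct example_thread_info *ti)
{
	return (int)((ti->flags >> EXAMPLE_TIF_32BIT) & 1UL);
}
#else
/* With AArch32 EL0 support disabled the check collapses to a constant. */
static int example_is_a32_compat_thread(const struct example_thread_info *ti)
{
	(void)ti;
	return 0;
}
#endif

/*
 * Generic compat check. In this patch it simply forwards to the AArch32
 * check; presumably an ILP32 condition would be added to it later.
 */
static int example_is_compat_thread(const struct example_thread_info *ti)
{
	return example_is_a32_compat_thread(ti);
}
```

Callers that care about the AArch32 register layout or audit architecture
would use the a32 variant, while callers that only care about a 32-bit
address space would stay on the generic one.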

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Andrew Pinski <Andrew.Pinski@caviumnetworks.com>
Signed-off-by: Bamvor Zhang Jian <bamvor.zhangjian@linaro.org>
---
 arch/arm64/include/asm/compat.h      | 19 ++---------
 arch/arm64/include/asm/elf.h         | 10 +++---
 arch/arm64/include/asm/ftrace.h      |  2 +-
 arch/arm64/include/asm/is_compat.h   | 64 ++++++++++++++++++++++++++++++++++++
 arch/arm64/include/asm/memory.h      |  5 +--
 arch/arm64/include/asm/processor.h   |  5 +--
 arch/arm64/include/asm/syscall.h     |  2 +-
 arch/arm64/include/asm/thread_info.h |  2 +-
 arch/arm64/kernel/hw_breakpoint.c    | 10 +++---
 arch/arm64/kernel/perf_regs.c        |  2 +-
 arch/arm64/kernel/process.c          |  7 ++--
 arch/arm64/kernel/ptrace.c           | 11 +++----
 arch/arm64/kernel/signal.c           |  4 +--
 arch/arm64/kernel/traps.c            |  3 +-
 14 files changed, 98 insertions(+), 48 deletions(-)
 create mode 100644 arch/arm64/include/asm/is_compat.h

diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h
index eb8432b..df2f72d 100644
--- a/arch/arm64/include/asm/compat.h
+++ b/arch/arm64/include/asm/compat.h
@@ -24,6 +24,8 @@
 #include <linux/types.h>
 #include <linux/sched.h>
 
+#include <asm/is_compat.h>
+
 #define COMPAT_USER_HZ		100
 #ifdef __AARCH64EB__
 #define COMPAT_UTS_MACHINE	"armv8b\0\0"
@@ -298,23 +300,6 @@ struct compat_shmid64_ds {
 	compat_ulong_t __unused5;
 };
 
-static inline int is_compat_task(void)
-{
-	return test_thread_flag(TIF_32BIT);
-}
-
-static inline int is_compat_thread(struct thread_info *thread)
-{
-	return test_ti_thread_flag(thread, TIF_32BIT);
-}
-
-#else /* !CONFIG_COMPAT */
-
-static inline int is_compat_thread(struct thread_info *thread)
-{
-	return 0;
-}
-
 #endif /* CONFIG_COMPAT */
 #endif /* __KERNEL__ */
 #endif /* __ASM_COMPAT_H */
diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index a55384f..6a9049b 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -16,6 +16,10 @@
 #ifndef __ASM_ELF_H
 #define __ASM_ELF_H
 
+#ifndef __ASSEMBLY__
+#include <linux/compat.h>
+#endif
+
 #include <asm/hwcap.h>
 
 /*
@@ -153,13 +157,9 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm,
 				       int uses_interp);
 
 /* 1GB of VA */
-#ifdef CONFIG_COMPAT
-#define STACK_RND_MASK			(test_thread_flag(TIF_32BIT) ? \
+#define STACK_RND_MASK			(is_compat_task() ? \
 						0x7ff >> (PAGE_SHIFT - 12) : \
 						0x3ffff >> (PAGE_SHIFT - 12))
-#else
-#define STACK_RND_MASK			(0x3ffff >> (PAGE_SHIFT - 12))
-#endif
 
 #ifdef __AARCH64EB__
 #define COMPAT_ELF_PLATFORM		("v8b")
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index caa955f..0feb28a 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -54,7 +54,7 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
 #define ARCH_TRACE_IGNORE_COMPAT_SYSCALLS
 static inline bool arch_trace_is_compat_syscall(struct pt_regs *regs)
 {
-	return is_compat_task();
+	return is_a32_compat_task();
 }
 #endif /* ifndef __ASSEMBLY__ */
 
diff --git a/arch/arm64/include/asm/is_compat.h b/arch/arm64/include/asm/is_compat.h
new file mode 100644
index 0000000..8dba5ca
--- /dev/null
+++ b/arch/arm64/include/asm/is_compat.h
@@ -0,0 +1,64 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ASM_IS_COMPAT_H
+#define __ASM_IS_COMPAT_H
+#ifndef __ASSEMBLY__
+
+#include <linux/thread_bits.h>
+
+#ifdef CONFIG_AARCH32_EL0
+
+static inline int is_a32_compat_task(void)
+{
+	return test_thread_flag(TIF_32BIT);
+}
+
+static inline int is_a32_compat_thread(struct thread_info *thread)
+{
+	return test_ti_thread_flag(thread, TIF_32BIT);
+}
+
+#else
+
+static inline int is_a32_compat_task(void)
+
+{
+	return 0;
+}
+
+static inline int is_a32_compat_thread(struct thread_info *thread)
+{
+	return 0;
+}
+
+#endif /* CONFIG_AARCH32_EL0 */
+
+#ifdef CONFIG_COMPAT
+
+static inline int is_compat_task(void)
+{
+	return is_a32_compat_task();
+}
+
+#endif /* CONFIG_COMPAT */
+
+static inline int is_compat_thread(struct thread_info *thread)
+{
+	return is_a32_compat_thread(thread);
+}
+
+
+#endif /* !__ASSEMBLY__ */
+#endif /* __ASM_IS_COMPAT_H */
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index ba62df8..39497ae 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -26,6 +26,7 @@
 #include <linux/types.h>
 #include <asm/bug.h>
 #include <asm/sizes.h>
+#include <asm/is_compat.h>
 
 /*
  * Allow for constants defined here to be used from assembly code
@@ -78,9 +79,9 @@
 
 #ifdef CONFIG_COMPAT
 #define TASK_SIZE_32		UL(0x100000000)
-#define TASK_SIZE		(test_thread_flag(TIF_32BIT) ? \
+#define TASK_SIZE		(is_compat_task() ?		\
 				TASK_SIZE_32 : TASK_SIZE_64)
-#define TASK_SIZE_OF(tsk)	(test_tsk_thread_flag(tsk, TIF_32BIT) ? \
+#define TASK_SIZE_OF(tsk)	(is_compat_thread(task_thread_info(tsk)) ? \
 				TASK_SIZE_32 : TASK_SIZE_64)
 #else
 #define TASK_SIZE		TASK_SIZE_64
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 6173a7b..49a046a 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -30,6 +30,7 @@
 #include <linux/string.h>
 
 #include <asm/alternative.h>
+#include <asm/is_compat.h>
 #include <asm/fpsimd.h>
 #include <asm/hw_breakpoint.h>
 #include <asm/lse.h>
@@ -40,7 +41,7 @@
 #define STACK_TOP_MAX		TASK_SIZE_64
 #ifdef CONFIG_COMPAT
 #define AARCH32_VECTORS_BASE	0xffff0000
-#define STACK_TOP		(test_thread_flag(TIF_32BIT) ? \
+#define STACK_TOP		(is_compat_task() ? \
 				AARCH32_VECTORS_BASE : STACK_TOP_MAX)
 #else
 #define STACK_TOP		STACK_TOP_MAX
@@ -92,7 +93,7 @@ struct thread_struct {
 #define task_user_tls(t)						\
 ({									\
 	unsigned long *__tls;						\
-	if (is_compat_thread(task_thread_info(t)))			\
+	if (is_a32_compat_thread(task_thread_info(t)))			\
 		__tls = &(t)->thread.tp2_value;				\
 	else								\
 		__tls = &(t)->thread.tp_value;				\
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index 709a574..ce09641 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -113,7 +113,7 @@ static inline void syscall_set_arguments(struct task_struct *task,
  */
 static inline int syscall_get_arch(void)
 {
-	if (is_compat_task())
+	if (is_a32_compat_task())
 		return AUDIT_ARCH_ARM;
 
 	return AUDIT_ARCH_AARCH64;
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index e9ea5a6..e12411f 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -121,7 +121,7 @@ static inline struct thread_info *current_thread_info(void)
 #define TIF_FREEZE		19
 #define TIF_RESTORE_SIGMASK	20
 #define TIF_SINGLESTEP		21
-#define TIF_32BIT		22	/* 32bit process */
+#define TIF_32BIT		22	/* AARCH32 process */
 
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index 948b731..4c14957 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -168,7 +168,7 @@ enum hw_breakpoint_ops {
 	HW_BREAKPOINT_RESTORE
 };
 
-static int is_compat_bp(struct perf_event *bp)
+static int is_a32_compat_bp(struct perf_event *bp)
 {
 	struct task_struct *tsk = bp->hw.target;
 
@@ -179,7 +179,7 @@ static int is_compat_bp(struct perf_event *bp)
 	 * deprecated behaviour if we use unaligned watchpoints in
 	 * AArch64 state.
 	 */
-	return tsk && is_compat_thread(task_thread_info(tsk));
+	return tsk && is_a32_compat_thread(task_thread_info(tsk));
 }
 
 /**
@@ -439,7 +439,7 @@ static int arch_build_bp_info(struct perf_event *bp)
 	 * Watchpoints can be of length 1, 2, 4 or 8 bytes.
 	 */
 	if (info->ctrl.type == ARM_BREAKPOINT_EXECUTE) {
-		if (is_compat_bp(bp)) {
+		if (is_a32_compat_bp(bp)) {
 			if (info->ctrl.len != ARM_BREAKPOINT_LEN_2 &&
 			    info->ctrl.len != ARM_BREAKPOINT_LEN_4)
 				return -EINVAL;
@@ -496,7 +496,7 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
 	 * AArch32 tasks expect some simple alignment fixups, so emulate
 	 * that here.
 	 */
-	if (is_compat_bp(bp)) {
+	if (is_a32_compat_bp(bp)) {
 		if (info->ctrl.len == ARM_BREAKPOINT_LEN_8)
 			alignment_mask = 0x7;
 		else
@@ -685,7 +685,7 @@ static int watchpoint_handler(unsigned long addr, unsigned int esr,
 
 		info = counter_arch_bp(wp);
 		/* AArch32 watchpoints are either 4 or 8 bytes aligned. */
-		if (is_compat_task()) {
+		if (is_a32_compat_task()) {
 			if (info->ctrl.len == ARM_BREAKPOINT_LEN_8)
 				alignment_mask = 0x7;
 			else
diff --git a/arch/arm64/kernel/perf_regs.c b/arch/arm64/kernel/perf_regs.c
index 3f62b35..a79058f 100644
--- a/arch/arm64/kernel/perf_regs.c
+++ b/arch/arm64/kernel/perf_regs.c
@@ -45,7 +45,7 @@ int perf_reg_validate(u64 mask)
 
 u64 perf_reg_abi(struct task_struct *task)
 {
-	if (is_compat_thread(task_thread_info(task)))
+	if (is_a32_compat_thread(task_thread_info(task)))
 		return PERF_SAMPLE_REGS_ABI_32;
 	else
 		return PERF_SAMPLE_REGS_ABI_64;
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 27b2f13..b78f80d 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -47,7 +47,6 @@
 #include <trace/events/power.h>
 
 #include <asm/alternative.h>
-#include <asm/compat.h>
 #include <asm/cacheflush.h>
 #include <asm/fpsimd.h>
 #include <asm/mmu_context.h>
@@ -204,7 +203,7 @@ static void tls_thread_flush(void)
 {
 	write_sysreg(0, tpidr_el0);
 
-	if (is_compat_task()) {
+	if (is_a32_compat_task()) {
 		current->thread.tp_value = 0;
 
 		/*
@@ -256,7 +255,7 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
 		*task_user_tls(p) = read_sysreg(tpidr_el0);
 
 		if (stack_start) {
-			if (is_compat_thread(task_thread_info(p)))
+			if (is_a32_compat_thread(task_thread_info(p)))
 				childregs->compat_sp = stack_start;
 			else
 				childregs->sp = stack_start;
@@ -293,7 +292,7 @@ static void tls_thread_switch(struct task_struct *next)
 	*task_user_tls(current) = tpidr;
 
 	tpidr = *task_user_tls(next);
-	tpidrro = is_compat_thread(task_thread_info(next)) ?
+	tpidrro = is_a32_compat_thread(task_thread_info(next)) ?
 		  next->thread.tp_value : 0;
 
 	write_sysreg(tpidr, tpidr_el0);
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 1d6f43e..1d075ed 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -38,7 +38,6 @@
 #include <linux/tracehook.h>
 #include <linux/elf.h>
 
-#include <asm/compat.h>
 #include <asm/debug-monitors.h>
 #include <asm/pgtable.h>
 #include <asm/syscall.h>
@@ -186,7 +185,7 @@ static void ptrace_hbptriggered(struct perf_event *bp,
 #ifdef CONFIG_AARCH32_EL0
 	int i;
 
-	if (!is_compat_task())
+	if (!is_a32_compat_task())
 		goto send_sig;
 
 	for (i = 0; i < ARM_MAX_BRP; ++i) {
@@ -1304,9 +1303,9 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task)
 	 * 32-bit children use an extended user_aarch32_ptrace_view to allow
 	 * access to the TLS register.
 	 */
-	if (is_compat_task())
+	if (is_a32_compat_task())
 		return &user_aarch32_view;
-	else if (is_compat_thread(task_thread_info(task)))
+	else if (is_a32_compat_thread(task_thread_info(task)))
 		return &user_aarch32_ptrace_view;
 #endif
 	return &user_aarch64_view;
@@ -1333,7 +1332,7 @@ static void tracehook_report_syscall(struct pt_regs *regs,
 	 * A scratch register (ip(r12) on AArch32, x7 on AArch64) is
 	 * used to denote syscall entry/exit:
 	 */
-	regno = (is_compat_task() ? 12 : 7);
+	regno = (is_a32_compat_task() ? 12 : 7);
 	saved_reg = regs->regs[regno];
 	regs->regs[regno] = dir;
 
@@ -1444,7 +1443,7 @@ int valid_user_regs(struct user_pt_regs *regs, struct task_struct *task)
 	if (!test_tsk_thread_flag(task, TIF_SINGLESTEP))
 		regs->pstate &= ~DBG_SPSR_SS;
 
-	if (is_compat_thread(task_thread_info(task)))
+	if (is_a32_compat_thread(task_thread_info(task)))
 		return valid_compat_regs(regs);
 	else
 		return valid_native_regs(regs);
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 404dd67..f90cdf5 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -276,7 +276,7 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
 
 static void setup_restart_syscall(struct pt_regs *regs)
 {
-	if (is_compat_task())
+	if (is_a32_compat_task())
 		compat_setup_restart_syscall(regs);
 	else
 		regs->regs[8] = __NR_restart_syscall;
@@ -295,7 +295,7 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
 	/*
 	 * Set up the stack frame
 	 */
-	if (is_compat_task()) {
+	if (is_a32_compat_task()) {
 		if (ksig->ka.sa.sa_flags & SA_SIGINFO)
 			ret = compat_setup_rt_frame(usig, ksig, oldset, regs);
 		else
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 14a08a0..3644ddc 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -18,6 +18,7 @@
  */
 
 #include <linux/bug.h>
+#include <linux/compat.h>
 #include <linux/signal.h>
 #include <linux/personality.h>
 #include <linux/kallsyms.h>
@@ -528,7 +529,7 @@ asmlinkage long do_ni_syscall(struct pt_regs *regs)
 {
 #ifdef CONFIG_AARCH32_EL0
 	long ret;
-	if (is_compat_task()) {
+	if (is_a32_compat_task()) {
 		ret = compat_arm_syscall(regs);
 		if (ret != -ENOSYS)
 			return ret;
-- 
2.7.4

* [PATCH 08/18] arm64: ilp32: add is_ilp32_compat_{task,thread} and TIF_32BIT_AARCH64
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (6 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 07/18] arm64: introduce is_a32_task and is_a32_thread (for AArch32 compat) Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-10-21 20:33 ` [PATCH 09/18] arm64: introduce binfmt_elf32.c Yury Norov
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf, Andrew Pinski

ILP32 tasks need to be distinguished from LP64 and AArch32 tasks.
This patch adds the helper functions is_ilp32_compat_{task,thread} and
the thread flag TIF_32BIT_AARCH64 for that purpose, in preparation
for the following patches in the ILP32 patchset.

For consistency, SET_PERSONALITY is changed here accordingly.
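
To make the flag combinations concrete, here is a minimal user-space
sketch of the two thread flags and the predicates built on them. It is
an illustration only: struct model_thread and every model_* name are
hypothetical stand-ins, not kernel symbols. Only the two bit numbers
are taken from the thread_info.h hunk of this patch.

```c
#include <stdbool.h>

/* Bit numbers taken from the thread_info.h hunk of this patch. */
#define TIF_32BIT		22	/* AArch32 process */
#define TIF_32BIT_AARCH64	23	/* ILP32 process on AArch64 */

/* Hypothetical stand-in for the flags word of struct thread_info. */
struct model_thread {
	unsigned long flags;
};

static bool model_test_flag(const struct model_thread *t, int bit)
{
	return (t->flags >> bit) & 1UL;
}

static void model_assign_flag(struct model_thread *t, int bit, bool on)
{
	if (on)
		t->flags |= 1UL << bit;
	else
		t->flags &= ~(1UL << bit);
}

/* Same predicates as is_a32_compat_thread() and is_ilp32_compat_thread(). */
static bool model_is_a32(const struct model_thread *t)
{
	return model_test_flag(t, TIF_32BIT);
}

static bool model_is_ilp32(const struct model_thread *t)
{
	return model_test_flag(t, TIF_32BIT_AARCH64);
}

/* is_compat_thread() is true for either 32-bit ABI. */
static bool model_is_compat(const struct model_thread *t)
{
	return model_is_a32(t) || model_is_ilp32(t);
}

/* Model of SET_PERSONALITY: native LP64, both flags cleared. */
static void model_set_personality_lp64(struct model_thread *t)
{
	model_assign_flag(t, TIF_32BIT_AARCH64, false);
	model_assign_flag(t, TIF_32BIT, false);
}

/* Model of COMPAT_SET_PERSONALITY: AArch32, only TIF_32BIT set. */
static void model_set_personality_a32(struct model_thread *t)
{
	model_assign_flag(t, TIF_32BIT_AARCH64, false);
	model_assign_flag(t, TIF_32BIT, true);
}
```

Clearing TIF_32BIT_AARCH64 in both variants matters because exec() can
replace an image of one ABI with an image of another, so a stale flag
from the previous image must not survive.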

Signed-off-by: Andrew Pinski <Andrew.Pinski@caviumnetworks.com>
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Reviewed-by: David Daney <ddaney@caviumnetworks.com>
---
 arch/arm64/include/asm/elf.h         | 13 +++++++++++--
 arch/arm64/include/asm/is_compat.h   | 30 ++++++++++++++++++++++++++++--
 arch/arm64/include/asm/thread_info.h |  2 ++
 3 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index 6a9049b..f259fe8 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -142,7 +142,11 @@ typedef struct user_fpsimd_state elf_fpregset_t;
  */
 #define ELF_PLAT_INIT(_r, load_addr)	(_r)->regs[0] = 0
 
-#define SET_PERSONALITY(ex)		clear_thread_flag(TIF_32BIT);
+#define SET_PERSONALITY(ex)		\
+do {						\
+	clear_thread_flag(TIF_32BIT_AARCH64);	\
+	clear_thread_flag(TIF_32BIT);		\
+} while (0)
 
 /* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
 #define ARCH_DLINFO							\
@@ -183,7 +187,12 @@ typedef compat_elf_greg_t		compat_elf_gregset_t[COMPAT_ELF_NGREG];
 					 ((x)->e_flags & EF_ARM_EABI_MASK))
 
 #define compat_start_thread		compat_start_thread
-#define COMPAT_SET_PERSONALITY(ex)	set_thread_flag(TIF_32BIT);
+#define COMPAT_SET_PERSONALITY(ex)		\
+do {						\
+	clear_thread_flag(TIF_32BIT_AARCH64);	\
+	set_thread_flag(TIF_32BIT);		\
+} while (0)
+
 #define COMPAT_ARCH_DLINFO
 extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
 				      int uses_interp);
diff --git a/arch/arm64/include/asm/is_compat.h b/arch/arm64/include/asm/is_compat.h
index 8dba5ca..7726beb 100644
--- a/arch/arm64/include/asm/is_compat.h
+++ b/arch/arm64/include/asm/is_compat.h
@@ -45,18 +45,44 @@ static inline int is_a32_compat_thread(struct thread_info *thread)
 
 #endif /* CONFIG_AARCH32_EL0 */
 
+#ifdef CONFIG_ARM64_ILP32
+
+static inline int is_ilp32_compat_task(void)
+{
+	return test_thread_flag(TIF_32BIT_AARCH64);
+}
+
+static inline int is_ilp32_compat_thread(struct thread_info *thread)
+{
+	return test_ti_thread_flag(thread, TIF_32BIT_AARCH64);
+}
+
+#else
+
+static inline int is_ilp32_compat_task(void)
+{
+	return 0;
+}
+
+static inline int is_ilp32_compat_thread(struct thread_info *thread)
+{
+	return 0;
+}
+
+#endif /* CONFIG_ARM64_ILP32 */
+
 #ifdef CONFIG_COMPAT
 
 static inline int is_compat_task(void)
 {
-	return is_a32_compat_task();
+	return is_a32_compat_task() || is_ilp32_compat_task();
 }
 
 #endif /* CONFIG_COMPAT */
 
 static inline int is_compat_thread(struct thread_info *thread)
 {
-	return is_a32_compat_thread(thread);
+	return is_a32_compat_thread(thread) || is_ilp32_compat_thread(thread);
 }
 
 
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index e12411f..680aca5 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -122,6 +122,7 @@ static inline struct thread_info *current_thread_info(void)
 #define TIF_RESTORE_SIGMASK	20
 #define TIF_SINGLESTEP		21
 #define TIF_32BIT		22	/* AARCH32 process */
+#define TIF_32BIT_AARCH64	23	/* 32 bit process on AArch64(ILP32) */
 
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
@@ -133,6 +134,7 @@ static inline struct thread_info *current_thread_info(void)
 #define _TIF_SYSCALL_TRACEPOINT	(1 << TIF_SYSCALL_TRACEPOINT)
 #define _TIF_SECCOMP		(1 << TIF_SECCOMP)
 #define _TIF_32BIT		(1 << TIF_32BIT)
+#define _TIF_32BIT_AARCH64	(1 << TIF_32BIT_AARCH64)
 
 #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
 				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE)
-- 
2.7.4

* [PATCH 09/18] arm64: introduce binfmt_elf32.c
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (7 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 08/18] arm64: ilp32: add is_ilp32_compat_{task,thread} and TIF_32BIT_AARCH64 Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-12-05 15:10   ` Catalin Marinas
  2016-10-21 20:33 ` [PATCH 10/18] arm64: ilp32: introduce binfmt_ilp32.c Yury Norov
                   ` (12 subsequent siblings)
  21 siblings, 1 reply; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf

As we support more than one compat format, it looks more reasonable
not to build fs/compat_binfmt_elf.c directly. A custom binfmt_elf32.c lets
us move the AArch32-specific definitions there, which makes the code more
maintainable and readable.

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
---
 arch/arm64/Kconfig               |  1 -
 arch/arm64/include/asm/hwcap.h   |  2 --
 arch/arm64/kernel/Makefile       |  2 +-
 arch/arm64/kernel/binfmt_elf32.c | 31 +++++++++++++++++++++++++++++++
 4 files changed, 32 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm64/kernel/binfmt_elf32.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0cd786e..9efa86a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1003,7 +1003,6 @@ config AARCH32_EL0
 	def_bool y
 	depends on ARM64_4K_PAGES || EXPERT
 	select COMPAT
-	select COMPAT_BINFMT_ELF
 	select HAVE_UID16
 	select OLD_SIGSUSPEND3
 	select COMPAT_OLD_SIGACTION
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 2c7fc5d..99dfd92 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -47,8 +47,6 @@
 #define ELF_HWCAP		(elf_hwcap)
 
 #ifdef CONFIG_AARCH32_EL0
-#define COMPAT_ELF_HWCAP	(compat_elf_hwcap)
-#define COMPAT_ELF_HWCAP2	(compat_elf_hwcap2)
 extern unsigned int compat_elf_hwcap, compat_elf_hwcap2;
 #endif
 
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 8a19fda..abe5040 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -28,7 +28,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
 	$(call if_changed,objcopy)
 
 arm64-obj-$(CONFIG_AARCH32_EL0)		+= sys32.o kuser32.o signal32.o 	\
-					   sys_compat.o entry32.o
+					   sys_compat.o entry32.o binfmt_elf32.o
 arm64-obj-$(CONFIG_FUNCTION_TRACER)	+= ftrace.o entry-ftrace.o
 arm64-obj-$(CONFIG_MODULES)		+= arm64ksyms.o module.o
 arm64-obj-$(CONFIG_ARM64_MODULE_PLTS)	+= module-plts.o
diff --git a/arch/arm64/kernel/binfmt_elf32.c b/arch/arm64/kernel/binfmt_elf32.c
new file mode 100644
index 0000000..aec1c8a
--- /dev/null
+++ b/arch/arm64/kernel/binfmt_elf32.c
@@ -0,0 +1,31 @@
+/*
+ * Support for AArch32 Linux ELF binaries.
+ */
+
+/* AArch32 EABI. */
+#define EF_ARM_EABI_MASK		0xff000000
+
+#define compat_start_thread		compat_start_thread
+#define COMPAT_SET_PERSONALITY(ex)		\
+do {						\
+	clear_thread_flag(TIF_32BIT_AARCH64);	\
+	set_thread_flag(TIF_32BIT);		\
+} while (0)
+
+#define COMPAT_ARCH_DLINFO
+#define COMPAT_ELF_HWCAP		(compat_elf_hwcap)
+#define COMPAT_ELF_HWCAP2		(compat_elf_hwcap2)
+
+#ifdef __AARCH64EB__
+#define COMPAT_ELF_PLATFORM		("v8b")
+#else
+#define COMPAT_ELF_PLATFORM		("v8l")
+#endif
+
+#define compat_arch_setup_additional_pages \
+					aarch32_setup_vectors_page
+struct linux_binprm;
+extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
+				      int uses_interp);
+
+#include "../../../fs/compat_binfmt_elf.c"
-- 
2.7.4

* [PATCH 10/18] arm64: ilp32: introduce binfmt_ilp32.c
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (8 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 09/18] arm64: introduce binfmt_elf32.c Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-12-05 15:38   ` Catalin Marinas
  2016-10-21 20:33 ` [PATCH 11/18] arm64: ilp32: share aarch32 syscall handlers Yury Norov
                   ` (11 subsequent siblings)
  21 siblings, 1 reply; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf,
	Bamvor Zhang Jian

binfmt_ilp32.c is needed to handle ILP32 binaries.
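
As a rough illustration of how such a binary is recognised, the sketch
below restates the elf_check_arch() condition from this patch as plain
user-space C. The trimmed struct model_elfhdr and the MODEL_* names are
hypothetical and exist only to keep the example self-contained. The
numeric values are the standard ELF constants.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard ELF constants, redefined under MODEL_ names for this sketch. */
#define MODEL_EI_NIDENT		16
#define MODEL_EI_CLASS		4	/* index of the class byte in e_ident */
#define MODEL_ELFCLASS32	1
#define MODEL_ELFCLASS64	2
#define MODEL_EM_ARM		40
#define MODEL_EM_AARCH64	183

/* Hypothetical, trimmed ELF header: only the fields the check reads. */
struct model_elfhdr {
	unsigned char e_ident[MODEL_EI_NIDENT];
	uint16_t e_type;
	uint16_t e_machine;
};

/* Same condition as elf_check_arch() in binfmt_ilp32.c. */
static bool model_is_ilp32_elf(const struct model_elfhdr *x)
{
	return x->e_machine == MODEL_EM_AARCH64 &&
	       x->e_ident[MODEL_EI_CLASS] == MODEL_ELFCLASS32;
}
```

An LP64 image fails the class test and an AArch32 image fails the
machine test, so neither is claimed by this loader.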

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Bamvor Zhang Jian <bamvor.zhangjian@linaro.org>
---
 arch/arm64/include/asm/elf.h     |  6 +++
 arch/arm64/kernel/Makefile       |  1 +
 arch/arm64/kernel/binfmt_ilp32.c | 97 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 104 insertions(+)
 create mode 100644 arch/arm64/kernel/binfmt_ilp32.c

diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index f259fe8..be29dde 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -175,10 +175,16 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm,
 
 #define COMPAT_ELF_ET_DYN_BASE		(2 * TASK_SIZE_32 / 3)
 
+#ifndef USE_AARCH64_GREG
 /* AArch32 registers. */
 #define COMPAT_ELF_NGREG		18
 typedef unsigned int			compat_elf_greg_t;
 typedef compat_elf_greg_t		compat_elf_gregset_t[COMPAT_ELF_NGREG];
+#else /* AArch64 registers for AARCH64/ILP32 */
+#define COMPAT_ELF_NGREG	ELF_NGREG
+#define compat_elf_greg_t	elf_greg_t
+#define compat_elf_gregset_t	elf_gregset_t
+#endif
 
 /* AArch32 EABI. */
 #define EF_ARM_EABI_MASK		0xff000000
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index abe5040..f661888 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -29,6 +29,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
 
 arm64-obj-$(CONFIG_AARCH32_EL0)		+= sys32.o kuser32.o signal32.o 	\
 					   sys_compat.o entry32.o binfmt_elf32.o
+arm64-obj-$(CONFIG_ARM64_ILP32)		+= binfmt_ilp32.o
 arm64-obj-$(CONFIG_FUNCTION_TRACER)	+= ftrace.o entry-ftrace.o
 arm64-obj-$(CONFIG_MODULES)		+= arm64ksyms.o module.o
 arm64-obj-$(CONFIG_ARM64_MODULE_PLTS)	+= module-plts.o
diff --git a/arch/arm64/kernel/binfmt_ilp32.c b/arch/arm64/kernel/binfmt_ilp32.c
new file mode 100644
index 0000000..759066e
--- /dev/null
+++ b/arch/arm64/kernel/binfmt_ilp32.c
@@ -0,0 +1,97 @@
+/*
+ * Support for ILP32 Linux/aarch64 ELF binaries.
+ */
+#define USE_AARCH64_GREG
+
+#include <linux/elfcore-compat.h>
+#include <linux/time.h>
+
+#undef	ELF_CLASS
+#define ELF_CLASS	ELFCLASS32
+
+#undef	elfhdr
+#undef	elf_phdr
+#undef	elf_shdr
+#undef	elf_note
+#undef	elf_addr_t
+#define elfhdr		elf32_hdr
+#define elf_phdr	elf32_phdr
+#define elf_shdr	elf32_shdr
+#define elf_note	elf32_note
+#define elf_addr_t	Elf32_Addr
+
+/*
+ * Some data types as stored in coredump.
+ */
+#define user_long_t		compat_long_t
+#define user_siginfo_t		compat_siginfo_t
+#define copy_siginfo_to_user	copy_siginfo_to_user32
+
+/*
+ * The machine-dependent core note format types are defined in elfcore-compat.h,
+ * which requires asm/elf.h to define compat_elf_gregset_t et al.
+ */
+#define elf_prstatus	compat_elf_prstatus
+#define elf_prpsinfo	compat_elf_prpsinfo
+
+/*
+ * Compat version of cputime_to_compat_timeval, perhaps this
+ * should be an inline in <linux/compat.h>.
+ */
+static void cputime_to_compat_timeval(const cputime_t cputime,
+				      struct compat_timeval *value)
+{
+	struct timeval tv;
+
+	cputime_to_timeval(cputime, &tv);
+	value->tv_sec = tv.tv_sec;
+	value->tv_usec = tv.tv_usec;
+}
+
+#undef cputime_to_timeval
+#define cputime_to_timeval cputime_to_compat_timeval
+
+/* AARCH64 ILP32 EABI. */
+#undef elf_check_arch
+#define elf_check_arch(x)		(((x)->e_machine == EM_AARCH64)	\
+					&& (x)->e_ident[EI_CLASS] == ELFCLASS32)
+
+#undef SET_PERSONALITY
+#define SET_PERSONALITY(ex)						\
+do {									\
+	set_thread_flag(TIF_32BIT_AARCH64);				\
+	clear_thread_flag(TIF_32BIT);					\
+} while (0)
+
+#undef ARCH_DLINFO
+#define ARCH_DLINFO							\
+do {									\
+	NEW_AUX_ENT(AT_SYSINFO_EHDR,					\
+		    (elf_addr_t)(long)current->mm->context.vdso);	\
+} while (0)
+
+#undef ELF_PLATFORM
+#ifdef __AARCH64EB__
+#define ELF_PLATFORM		("aarch64_be:ilp32")
+#else
+#define ELF_PLATFORM		("aarch64:ilp32")
+#endif
+
+#undef ELF_ET_DYN_BASE
+#define ELF_ET_DYN_BASE COMPAT_ELF_ET_DYN_BASE
+
+#undef ELF_HWCAP
+#undef ELF_HWCAP2
+#define ELF_HWCAP			((u32) elf_hwcap)
+#define ELF_HWCAP2			((u32) (elf_hwcap >> 32))
+
+/*
+ * Rename a few of the symbols that binfmt_elf.c will define.
+ * These are all local so the names don't really matter, but it
+ * might make some debugging less confusing not to duplicate them.
+ */
+#define elf_format		compat_elf_format
+#define init_elf_binfmt		init_compat_elf_binfmt
+#define exit_elf_binfmt		exit_compat_elf_binfmt
+
+#include "../../../fs/binfmt_elf.c"
-- 
2.7.4

* [PATCH 11/18] arm64: ilp32: share aarch32 syscall handlers
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (9 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 10/18] arm64: ilp32: introduce binfmt_ilp32.c Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-12-05 17:12   ` Catalin Marinas
  2016-10-21 20:33 ` [PATCH 12/18] arm64: ilp32: add sys_ilp32.c and a separate table (in entry.S) to use it Yury Norov
                   ` (10 subsequent siblings)
  21 siblings, 1 reply; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf

off_t is passed in a register pair, just as in AArch32.
This patch shares the corresponding AArch32 handlers with the
ILP32 code.
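
For readers unfamiliar with the register-pair convention, the sketch
below models in C what a wrapper such as compat_sys_pread64_wrapper does
with regs_to_64. It assumes little-endian ordering (low word in the
first register of the pair), and the model_* names are hypothetical
helpers, not kernel functions.

```c
#include <stdint.h>

/*
 * Little-endian model of regs_to_64: the first register of the pair
 * carries the low word, the second the high word.
 */
static uint64_t model_regs_to_64(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/*
 * Hypothetical model of compat_sys_pread64_wrapper: the offset arrives
 * split across argument registers 4 and 5 and is merged into the 64-bit
 * position that sys_pread64() expects as its fourth argument.
 */
static uint64_t model_pread64_pos(const uint32_t regs[8])
{
	return model_regs_to_64(regs[4], regs[5]);
}
```

The model takes 32-bit inputs, so the upper halves are dropped by
construction. The assembly wrappers instead rely on the upper 32 bits of
the source registers already being zero.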

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
---
 arch/arm64/kernel/Makefile         |   1 +
 arch/arm64/kernel/entry32.S        |  80 ---------------------------
 arch/arm64/kernel/entry32_common.S | 107 +++++++++++++++++++++++++++++++++++++
 3 files changed, 108 insertions(+), 80 deletions(-)
 create mode 100644 arch/arm64/kernel/entry32_common.S

diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index f661888..9123bb8 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -30,6 +30,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
 arm64-obj-$(CONFIG_AARCH32_EL0)		+= sys32.o kuser32.o signal32.o 	\
 					   sys_compat.o entry32.o binfmt_elf32.o
 arm64-obj-$(CONFIG_ARM64_ILP32)		+= binfmt_ilp32.o
+arm64-obj-$(CONFIG_COMPAT)		+= entry32_common.o
 arm64-obj-$(CONFIG_FUNCTION_TRACER)	+= ftrace.o entry-ftrace.o
 arm64-obj-$(CONFIG_MODULES)		+= arm64ksyms.o module.o
 arm64-obj-$(CONFIG_ARM64_MODULE_PLTS)	+= module-plts.o
diff --git a/arch/arm64/kernel/entry32.S b/arch/arm64/kernel/entry32.S
index f332d5d..4bede03 100644
--- a/arch/arm64/kernel/entry32.S
+++ b/arch/arm64/kernel/entry32.S
@@ -39,83 +39,3 @@ ENTRY(compat_sys_rt_sigreturn_wrapper)
 	mov	x0, sp
 	b	compat_sys_rt_sigreturn
 ENDPROC(compat_sys_rt_sigreturn_wrapper)
-
-ENTRY(compat_sys_statfs64_wrapper)
-	mov	w3, #84
-	cmp	w1, #88
-	csel	w1, w3, w1, eq
-	b	compat_sys_statfs64
-ENDPROC(compat_sys_statfs64_wrapper)
-
-ENTRY(compat_sys_fstatfs64_wrapper)
-	mov	w3, #84
-	cmp	w1, #88
-	csel	w1, w3, w1, eq
-	b	compat_sys_fstatfs64
-ENDPROC(compat_sys_fstatfs64_wrapper)
-
-/*
- * Note: off_4k (w5) is always in units of 4K. If we can't do the
- * requested offset because it is not page-aligned, we return -EINVAL.
- */
-ENTRY(compat_sys_mmap2_wrapper)
-#if PAGE_SHIFT > 12
-	tst	w5, #~PAGE_MASK >> 12
-	b.ne	1f
-	lsr	w5, w5, #PAGE_SHIFT - 12
-#endif
-	b	sys_mmap_pgoff
-1:	mov	x0, #-EINVAL
-	ret
-ENDPROC(compat_sys_mmap2_wrapper)
-
-/*
- * Wrappers for AArch32 syscalls that either take 64-bit parameters
- * in registers or that take 32-bit parameters which require sign
- * extension.
- */
-ENTRY(compat_sys_pread64_wrapper)
-	regs_to_64	x3, x4, x5
-	b	sys_pread64
-ENDPROC(compat_sys_pread64_wrapper)
-
-ENTRY(compat_sys_pwrite64_wrapper)
-	regs_to_64	x3, x4, x5
-	b	sys_pwrite64
-ENDPROC(compat_sys_pwrite64_wrapper)
-
-ENTRY(compat_sys_truncate64_wrapper)
-	regs_to_64	x1, x2, x3
-	b	sys_truncate
-ENDPROC(compat_sys_truncate64_wrapper)
-
-ENTRY(compat_sys_ftruncate64_wrapper)
-	regs_to_64	x1, x2, x3
-	b	sys_ftruncate
-ENDPROC(compat_sys_ftruncate64_wrapper)
-
-ENTRY(compat_sys_readahead_wrapper)
-	regs_to_64	x1, x2, x3
-	mov	w2, w4
-	b	sys_readahead
-ENDPROC(compat_sys_readahead_wrapper)
-
-ENTRY(compat_sys_fadvise64_64_wrapper)
-	mov	w6, w1
-	regs_to_64	x1, x2, x3
-	regs_to_64	x2, x4, x5
-	mov	w3, w6
-	b	sys_fadvise64_64
-ENDPROC(compat_sys_fadvise64_64_wrapper)
-
-ENTRY(compat_sys_sync_file_range2_wrapper)
-	regs_to_64	x2, x2, x3
-	regs_to_64	x3, x4, x5
-	b	sys_sync_file_range2
-ENDPROC(compat_sys_sync_file_range2_wrapper)
-
-ENTRY(compat_sys_fallocate_wrapper)
-	regs_to_64	x2, x2, x3
-	regs_to_64	x3, x4, x5
-	b	sys_fallocate
-ENDPROC(compat_sys_fallocate_wrapper)
diff --git a/arch/arm64/kernel/entry32_common.S b/arch/arm64/kernel/entry32_common.S
new file mode 100644
index 0000000..f4a5e4d
--- /dev/null
+++ b/arch/arm64/kernel/entry32_common.S
@@ -0,0 +1,107 @@
+/*
+ * Compat system call wrappers
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Authors: Will Deacon <will.deacon@arm.com>
+ *	    Catalin Marinas <catalin.marinas@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+#include <linux/const.h>
+
+#include <asm/assembler.h>
+#include <asm/asm-offsets.h>
+#include <asm/errno.h>
+#include <asm/page.h>
+
+/*
+ * Note: off_4k (w5) is always in units of 4K. If we can't do the
+ * requested offset because it is not page-aligned, we return -EINVAL.
+ */
+ENTRY(compat_sys_mmap2_wrapper)
+#if PAGE_SHIFT > 12
+	tst	w5, #~PAGE_MASK >> 12
+	b.ne	1f
+	lsr	w5, w5, #PAGE_SHIFT - 12
+#endif
+	b	sys_mmap_pgoff
+1:	mov	x0, #-EINVAL
+	ret
+ENDPROC(compat_sys_mmap2_wrapper)
+
+/*
+ * Wrappers for AArch32 syscalls that either take 64-bit parameters
+ * in registers or that take 32-bit parameters which require sign
+ * extension.
+ */
+ENTRY(compat_sys_pread64_wrapper)
+	regs_to_64	x3, x4, x5
+	b	sys_pread64
+ENDPROC(compat_sys_pread64_wrapper)
+
+ENTRY(compat_sys_pwrite64_wrapper)
+	regs_to_64	x3, x4, x5
+	b	sys_pwrite64
+ENDPROC(compat_sys_pwrite64_wrapper)
+
+ENTRY(compat_sys_truncate64_wrapper)
+	regs_to_64	x1, x2, x3
+	b	sys_truncate
+ENDPROC(compat_sys_truncate64_wrapper)
+
+ENTRY(compat_sys_ftruncate64_wrapper)
+	regs_to_64	x1, x2, x3
+	b	sys_ftruncate
+ENDPROC(compat_sys_ftruncate64_wrapper)
+
+ENTRY(compat_sys_readahead_wrapper)
+	regs_to_64	x1, x2, x3
+	mov	w2, w4
+	b	sys_readahead
+ENDPROC(compat_sys_readahead_wrapper)
+
+ENTRY(compat_sys_fadvise64_64_wrapper)
+	mov	w6, w1
+	regs_to_64	x1, x2, x3
+	regs_to_64	x2, x4, x5
+	mov	w3, w6
+	b	sys_fadvise64_64
+ENDPROC(compat_sys_fadvise64_64_wrapper)
+
+ENTRY(compat_sys_sync_file_range2_wrapper)
+	regs_to_64	x2, x2, x3
+	regs_to_64	x3, x4, x5
+	b	sys_sync_file_range2
+ENDPROC(compat_sys_sync_file_range2_wrapper)
+
+ENTRY(compat_sys_fallocate_wrapper)
+	regs_to_64	x2, x2, x3
+	regs_to_64	x3, x4, x5
+	b	sys_fallocate
+ENDPROC(compat_sys_fallocate_wrapper)
+
+ENTRY(compat_sys_statfs64_wrapper)
+	mov	w3, #84
+	cmp	w1, #88
+	csel	w1, w3, w1, eq
+	b	compat_sys_statfs64
+ENDPROC(compat_sys_statfs64_wrapper)
+
+ENTRY(compat_sys_fstatfs64_wrapper)
+	mov	w3, #84
+	cmp	w1, #88
+	csel	w1, w3, w1, eq
+	b	compat_sys_fstatfs64
+ENDPROC(compat_sys_fstatfs64_wrapper)
-- 
2.7.4

* [PATCH 12/18] arm64: ilp32: add sys_ilp32.c and a separate table (in entry.S) to use it
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (10 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 11/18] arm64: ilp32: share aarch32 syscall handlers Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-10-21 20:33 ` [PATCH 13/18] arm64: signal: share lp64 signal routines to ilp32 Yury Norov
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf, Andrew Pinski,
	Andrew Pinski, Bamvor Zhang Jian

From: Andrew Pinski <apinski@cavium.com>

Add a separate syscall table for ILP32, which dispatches either to the native
LP64 system call implementations or to the compat syscalls, as appropriate.
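
The entry-path decision added to el0_svc can be sketched in C as
follows. This is a simplified model, not the kernel code: the enum, the
model_* functions and the flat regs[] array are hypothetical, and only
the TIF_32BIT_AARCH64 bit number comes from this series.

```c
#include <stdint.h>

#define TIF_32BIT_AARCH64	23	/* bit number from patch 08 of this series */

/* Hypothetical labels for the two tables el0_svc can select. */
enum model_table { MODEL_LP64_TABLE, MODEL_ILP32_TABLE };

/*
 * Model of delouse_input_regs: "mov wN, wN" zero-extends the register,
 * which clears bits 63:32 of x0-x7.
 */
static void model_delouse_input_regs(uint64_t regs[8])
{
	for (int i = 0; i < 8; i++)
		regs[i] = (uint32_t)regs[i];
}

/* Pick the table from the thread flags; only ILP32 tasks are deloused. */
static enum model_table model_el0_svc(uint64_t ti_flags, uint64_t regs[8])
{
	if (!(ti_flags & (UINT64_C(1) << TIF_32BIT_AARCH64)))
		return MODEL_LP64_TABLE;

	model_delouse_input_regs(regs);
	return MODEL_ILP32_TABLE;
}
```

LP64 tasks take the early return, so their 64-bit arguments reach the
handlers untouched.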

Signed-off-by: Andrew Pinski <Andrew.Pinski@caviumnetworks.com>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Bamvor Zhang Jian <bamvor.zhangjian@linaro.org>
---
 arch/arm64/include/asm/unistd.h      |   8 ++-
 arch/arm64/include/uapi/asm/unistd.h |  12 +++++
 arch/arm64/kernel/Makefile           |   2 +-
 arch/arm64/kernel/entry.S            |  28 +++++++++-
 arch/arm64/kernel/sys_ilp32.c        | 100 +++++++++++++++++++++++++++++++++++
 5 files changed, 145 insertions(+), 5 deletions(-)
 create mode 100644 arch/arm64/kernel/sys_ilp32.c

diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index fe9d6c1..851cc8a 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -13,13 +13,17 @@
  * You should have received a copy of the GNU General Public License
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
+
+#ifdef CONFIG_COMPAT
+#define __ARCH_WANT_COMPAT_STAT64
+#define __ARCH_WANT_SYS_LLSEEK
+#endif
+
 #ifdef CONFIG_AARCH32_EL0
 #define __ARCH_WANT_COMPAT_SYS_GETDENTS64
-#define __ARCH_WANT_COMPAT_STAT64
 #define __ARCH_WANT_SYS_GETHOSTNAME
 #define __ARCH_WANT_SYS_PAUSE
 #define __ARCH_WANT_SYS_GETPGRP
-#define __ARCH_WANT_SYS_LLSEEK
 #define __ARCH_WANT_SYS_NICE
 #define __ARCH_WANT_SYS_SIGPENDING
 #define __ARCH_WANT_SYS_SIGPROCMASK
diff --git a/arch/arm64/include/uapi/asm/unistd.h b/arch/arm64/include/uapi/asm/unistd.h
index 043d17a..b4cd688 100644
--- a/arch/arm64/include/uapi/asm/unistd.h
+++ b/arch/arm64/include/uapi/asm/unistd.h
@@ -14,6 +14,18 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+/*
+ * Use AARCH32 interface for sys_sync_file_range() as it passes 64-bit arguments.
+ */
+#if defined(__ILP32__) || defined(__SYSCALL_COMPAT)
+#define __ARCH_WANT_SYNC_FILE_RANGE2
+#endif
+
+/*
+ * AARCH64/ILP32 is introduced after renameat() was replaced with renameat2().
+ */
+#if !(defined(__ILP32__) || defined(__SYSCALL_COMPAT))
 #define __ARCH_WANT_RENAMEAT
+#endif
 
 #include <asm-generic/unistd.h>
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 9123bb8..06070f5 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -29,7 +29,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
 
 arm64-obj-$(CONFIG_AARCH32_EL0)		+= sys32.o kuser32.o signal32.o 	\
 					   sys_compat.o entry32.o binfmt_elf32.o
-arm64-obj-$(CONFIG_ARM64_ILP32)		+= binfmt_ilp32.o
+arm64-obj-$(CONFIG_ARM64_ILP32)		+= binfmt_ilp32.o sys_ilp32.o
 arm64-obj-$(CONFIG_COMPAT)		+= entry32_common.o
 arm64-obj-$(CONFIG_FUNCTION_TRACER)	+= ftrace.o entry-ftrace.o
 arm64-obj-$(CONFIG_MODULES)		+= arm64ksyms.o module.o
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index b6fb14b..b152aab 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -249,6 +249,23 @@ tsk	.req	x28		// current thread_info
 
 	.text
 
+#ifdef CONFIG_ARM64_ILP32
+/*
+ * AARCH64/ILP32. Zero top halves of x0-x7
+ * registers as userspace may put garbage there.
+ */
+	.macro	delouse_input_regs
+	mov w0, w0
+	mov w1, w1
+	mov w2, w2
+	mov w3, w3
+	mov w4, w4
+	mov w5, w5
+	mov w6, w6
+	mov w7, w7
+	.endm
+#endif
+
 /*
  * Exception vectors.
  */
@@ -517,6 +534,7 @@ el0_svc_compat:
 	 * AArch32 syscall handling
 	 */
 	adrp	stbl, compat_sys_call_table	// load compat syscall table pointer
+	ldr     x16, [tsk, #TI_FLAGS]
 	uxtw	scno, w7			// syscall number in w7 (r7)
 	mov     sc_nr, #__NR_compat_syscalls
 	b	el0_svc_naked
@@ -739,15 +757,21 @@ ENDPROC(ret_from_fork)
 	.align	6
 el0_svc:
 	adrp	stbl, sys_call_table		// load syscall table pointer
+	ldr	x16, [tsk, #TI_FLAGS]
 	uxtw	scno, w8			// syscall number in w8
 	mov	sc_nr, #__NR_syscalls
+#ifdef CONFIG_ARM64_ILP32
+	tst	x16, #_TIF_32BIT_AARCH64
+	b.eq	el0_svc_naked			// We are using LP64  syscall table
+	adrp	stbl, sys_call_ilp32_table	// load ilp32 syscall table pointer
+	delouse_input_regs
+#endif
 el0_svc_naked:					// compat entry point
 	stp	x0, scno, [sp, #S_ORIG_X0]	// save the original x0 and syscall number
 	enable_dbg_and_irq
 	ct_user_exit 1
 
-	ldr	x16, [tsk, #TI_FLAGS]		// check for syscall hooks
-	tst	x16, #_TIF_SYSCALL_WORK
+	tst	x16, #_TIF_SYSCALL_WORK		// check for syscall hooks
 	b.ne	__sys_trace
 	cmp     scno, sc_nr                     // check upper syscall limit
 	b.hs	ni_sys
diff --git a/arch/arm64/kernel/sys_ilp32.c b/arch/arm64/kernel/sys_ilp32.c
new file mode 100644
index 0000000..fbf2f00
--- /dev/null
+++ b/arch/arm64/kernel/sys_ilp32.c
@@ -0,0 +1,100 @@
+/*
+ * AArch64/ILP32-specific system call implementation
+ *
+ * Copyright (C) 2016 Cavium Inc.
+ * Author: Andrew Pinski <apinski@cavium.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define __SYSCALL_COMPAT
+
+#include <linux/compiler.h>
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/msg.h>
+#include <linux/export.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/syscalls.h>
+#include <linux/compat.h>
+#include <asm-generic/syscalls.h>
+
+/*
+ * AARCH32 requires 4-page alignment for shared memory,
+ * but AARCH64 requires only 1 page. This is the only difference
+ * between compat and native sys_shmat(), so ILP32 just picks
+ * the AARCH64 version.
+ */
+#define compat_sys_shmat		sys_shmat
+
+/*
+ * ILP32 needs special handling for some ptrace requests.
+ */
+#define sys_ptrace			compat_sys_ptrace
+
+/*
+ * Using AARCH32 interface for syscalls that take 64-bit
+ * parameters in registers.
+ */
+#define compat_sys_fadvise64_64		compat_sys_fadvise64_64_wrapper
+#define compat_sys_fallocate		compat_sys_fallocate_wrapper
+#define compat_sys_ftruncate64		compat_sys_ftruncate64_wrapper
+#define compat_sys_pread64		compat_sys_pread64_wrapper
+#define compat_sys_pwrite64		compat_sys_pwrite64_wrapper
+#define compat_sys_readahead		compat_sys_readahead_wrapper
+#define compat_sys_sync_file_range2	compat_sys_sync_file_range2_wrapper
+#define compat_sys_truncate64		compat_sys_truncate64_wrapper
+#define sys_mmap2			compat_sys_mmap2_wrapper
+
+/*
+ * Using AARCH32 interface for syscalls that take the size of
+ * struct statfs as an argument, as it's calculated differently
+ * in kernel and user spaces.
+ */
+#define compat_sys_fstatfs64		compat_sys_fstatfs64_wrapper
+#define compat_sys_statfs64		compat_sys_statfs64_wrapper
+
+/*
+ * Using custom wrapper for rt_sigreturn() to handle custom
+ * struct rt_sigframe.
+ */
+#define compat_sys_rt_sigreturn        ilp32_sys_rt_sigreturn_wrapper
+
+asmlinkage long compat_sys_fstatfs64_wrapper(void);
+asmlinkage long compat_sys_statfs64_wrapper(void);
+asmlinkage long compat_sys_fadvise64_64_wrapper(void);
+asmlinkage long compat_sys_fallocate_wrapper(void);
+asmlinkage long compat_sys_ftruncate64_wrapper(void);
+asmlinkage long compat_sys_mmap2_wrapper(void);
+asmlinkage long compat_sys_pread64_wrapper(void);
+asmlinkage long compat_sys_pwrite64_wrapper(void);
+asmlinkage long compat_sys_readahead_wrapper(void);
+asmlinkage long compat_sys_sync_file_range2_wrapper(void);
+asmlinkage long compat_sys_truncate64_wrapper(void);
+asmlinkage long ilp32_sys_rt_sigreturn_wrapper(void);
+
+#include <asm/syscall.h>
+
+#undef __SYSCALL
+#define __SYSCALL(nr, sym)	[nr] = sym,
+
+/*
+ * The sys_call_ilp32_table array must be 4K aligned to be accessible from
+ * kernel/entry.S.
+ */
+void *sys_call_ilp32_table[__NR_syscalls] __aligned(4096) = {
+	[0 ... __NR_syscalls - 1] = sys_ni_syscall,
+#include <asm/unistd.h>
+};
-- 
2.7.4


* [PATCH 13/18] arm64: signal: share lp64 signal routines to ilp32
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (11 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 12/18] arm64: ilp32: add sys_ilp32.c and a separate table (in entry.S) to use it Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-10-21 20:33 ` [PATCH 14/18] arm64: signal32: move ilp32 and aarch32 common code to separated file Yury Norov
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf,
	Bamvor Zhang Jian

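Make the LP64 signal routines non-static, declare them in
signal_common.h, and split the ABI-independent part of struct
rt_sigframe into struct sigframe, so that the code can be reused
by ilp32.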
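
For illustration only, here is a userspace sketch of the layout split
this patch makes and of how the frame-pointer offset is derived from
it. It is not kernel code: the stub types, their sizes, and the helpers
frame_pointer_for() and check_fp_pos() are assumptions made up for the
example. Only the struct nesting and the RT_SIGFRAME_FP_POS expression
follow the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel types; the sizes are illustrative only. */
struct siginfo_stub  { unsigned char bytes[128]; };
struct ucontext_stub { unsigned char bytes[256]; };

/* ABI-independent part of the frame, as split out by the patch. */
struct sigframe {
	struct ucontext_stub uc;
	uint64_t fp;
	uint64_t lr;
};

/* LP64 frame: siginfo followed by the shared part. */
struct rt_sigframe {
	struct siginfo_stub info;
	struct sigframe sig;
};

/* Same expression as RT_SIGFRAME_FP_POS in the patch. */
#define RT_SIGFRAME_FP_POS (offsetof(struct rt_sigframe, sig)	\
			+ offsetof(struct sigframe, fp))

/* Hypothetical helper: the value setup_return() stores in x29. */
static inline uint64_t frame_pointer_for(uint64_t sp, size_t fp_pos)
{
	return sp + fp_pos;
}

/* The two-step offset must equal the direct nested-member offset. */
static inline void check_fp_pos(void)
{
	assert(RT_SIGFRAME_FP_POS == offsetof(struct rt_sigframe, sig.fp));
}
```

Passing the offset as a parameter is what lets a second frame layout
call setup_return() with its own value, without the function knowing
either struct.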

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Bamvor Zhang Jian <bamvor.zhangjian@linaro.org>
---
 arch/arm64/include/asm/signal_common.h | 33 ++++++++++++
 arch/arm64/kernel/signal.c             | 93 +++++++++++++++++++++-------------
 2 files changed, 92 insertions(+), 34 deletions(-)
 create mode 100644 arch/arm64/include/asm/signal_common.h

diff --git a/arch/arm64/include/asm/signal_common.h b/arch/arm64/include/asm/signal_common.h
new file mode 100644
index 0000000..756ed2c
--- /dev/null
+++ b/arch/arm64/include/asm/signal_common.h
@@ -0,0 +1,33 @@
+/*
+ * Copyright (C) 1995-2009 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2016 Cavium Networks.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ASM_SIGNAL_COMMON_H
+#define __ASM_SIGNAL_COMMON_H
+
+#include <linux/uaccess.h>
+#include <asm/ucontext.h>
+#include <asm/fpsimd.h>
+
+int preserve_fpsimd_context(struct fpsimd_context __user *ctx);
+int restore_fpsimd_context(struct fpsimd_context __user *ctx);
+int setup_sigcontext(struct sigcontext __user *uc_mcontext, struct pt_regs *regs);
+int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *uc_mcontext);
+void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
+			void __user *frame, off_t fp_pos, int usig);
+
+#endif /* __ASM_SIGNAL_COMMON_H */
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index f90cdf5..478d6c5 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -34,18 +34,26 @@
 #include <asm/fpsimd.h>
 #include <asm/signal32.h>
 #include <asm/vdso.h>
+#include <asm/signal_common.h>
+
+#define RT_SIGFRAME_FP_POS (offsetof(struct rt_sigframe, sig)	\
+			+ offsetof(struct sigframe, fp))
+
+struct sigframe {
+	struct ucontext uc;
+	u64 fp;
+	u64 lr;
+};
 
 /*
  * Do a signal return; undo the signal stack. These are aligned to 128-bit.
  */
 struct rt_sigframe {
 	struct siginfo info;
-	struct ucontext uc;
-	u64 fp;
-	u64 lr;
+	struct sigframe sig;
 };
 
-static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
+int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
 {
 	struct fpsimd_state *fpsimd = &current->thread.fpsimd_state;
 	int err;
@@ -65,7 +73,7 @@ static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
 	return err ? -EFAULT : 0;
 }
 
-static int restore_fpsimd_context(struct fpsimd_context __user *ctx)
+int restore_fpsimd_context(struct fpsimd_context __user *ctx)
 {
 	struct fpsimd_state fpsimd;
 	__u32 magic, size;
@@ -93,22 +101,30 @@ static int restore_fpsimd_context(struct fpsimd_context __user *ctx)
 }
 
 static int restore_sigframe(struct pt_regs *regs,
-			    struct rt_sigframe __user *sf)
+			    struct sigframe __user *sf)
 {
 	sigset_t set;
-	int i, err;
-	void *aux = sf->uc.uc_mcontext.__reserved;
-
+	int err;
 	err = __copy_from_user(&set, &sf->uc.uc_sigmask, sizeof(set));
 	if (err == 0)
 		set_current_blocked(&set);
 
+	err |= restore_sigcontext(regs, &sf->uc.uc_mcontext);
+	return err;
+}
+
+
+int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *uc_mcontext)
+{
+	int i, err = 0;
+	void *aux = uc_mcontext->__reserved;
+
 	for (i = 0; i < 31; i++)
-		__get_user_error(regs->regs[i], &sf->uc.uc_mcontext.regs[i],
+		__get_user_error(regs->regs[i], &uc_mcontext->regs[i],
 				 err);
-	__get_user_error(regs->sp, &sf->uc.uc_mcontext.sp, err);
-	__get_user_error(regs->pc, &sf->uc.uc_mcontext.pc, err);
-	__get_user_error(regs->pstate, &sf->uc.uc_mcontext.pstate, err);
+	__get_user_error(regs->sp, &uc_mcontext->sp, err);
+	__get_user_error(regs->pc, &uc_mcontext->pc, err);
+	__get_user_error(regs->pstate, &uc_mcontext->pstate, err);
 
 	/*
 	 * Avoid sys_rt_sigreturn() restarting.
@@ -145,10 +161,10 @@ asmlinkage long sys_rt_sigreturn(struct pt_regs *regs)
 	if (!access_ok(VERIFY_READ, frame, sizeof (*frame)))
 		goto badframe;
 
-	if (restore_sigframe(regs, frame))
+	if (restore_sigframe(regs, &frame->sig))
 		goto badframe;
 
-	if (restore_altstack(&frame->uc.uc_stack))
+	if (restore_altstack(&frame->sig.uc.uc_stack))
 		goto badframe;
 
 	return regs->regs[0];
@@ -162,27 +178,36 @@ asmlinkage long sys_rt_sigreturn(struct pt_regs *regs)
 	return 0;
 }
 
-static int setup_sigframe(struct rt_sigframe __user *sf,
+static int setup_sigframe(struct sigframe __user *sf,
 			  struct pt_regs *regs, sigset_t *set)
 {
-	int i, err = 0;
-	void *aux = sf->uc.uc_mcontext.__reserved;
-	struct _aarch64_ctx *end;
+	int err = 0;
 
 	/* set up the stack frame for unwinding */
 	__put_user_error(regs->regs[29], &sf->fp, err);
 	__put_user_error(regs->regs[30], &sf->lr, err);
+	err |= __copy_to_user(&sf->uc.uc_sigmask, set, sizeof(*set));
+	err |= setup_sigcontext(&sf->uc.uc_mcontext, regs);
+
+	return err;
+}
+
+int setup_sigcontext(struct sigcontext __user *uc_mcontext,
+			struct pt_regs *regs)
+{
+	void *aux = uc_mcontext->__reserved;
+	struct _aarch64_ctx *end;
+	int i, err = 0;
 
 	for (i = 0; i < 31; i++)
-		__put_user_error(regs->regs[i], &sf->uc.uc_mcontext.regs[i],
+		__put_user_error(regs->regs[i], &uc_mcontext->regs[i],
 				 err);
-	__put_user_error(regs->sp, &sf->uc.uc_mcontext.sp, err);
-	__put_user_error(regs->pc, &sf->uc.uc_mcontext.pc, err);
-	__put_user_error(regs->pstate, &sf->uc.uc_mcontext.pstate, err);
 
-	__put_user_error(current->thread.fault_address, &sf->uc.uc_mcontext.fault_address, err);
+	__put_user_error(regs->sp, &uc_mcontext->sp, err);
+	__put_user_error(regs->pc, &uc_mcontext->pc, err);
+	__put_user_error(regs->pstate, &uc_mcontext->pstate, err);
 
-	err |= __copy_to_user(&sf->uc.uc_sigmask, set, sizeof(*set));
+	__put_user_error(current->thread.fault_address, &uc_mcontext->fault_address, err);
 
 	if (err == 0) {
 		struct fpsimd_context *fpsimd_ctx =
@@ -229,14 +254,14 @@ static struct rt_sigframe __user *get_sigframe(struct ksignal *ksig,
 	return frame;
 }
 
-static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
-			 void __user *frame, int usig)
+void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
+			 void __user *frame, off_t fp_pos, int usig)
 {
 	__sigrestore_t sigtramp;
 
 	regs->regs[0] = usig;
 	regs->sp = (unsigned long)frame;
-	regs->regs[29] = regs->sp + offsetof(struct rt_sigframe, fp);
+	regs->regs[29] = regs->sp + fp_pos;
 	regs->pc = (unsigned long)ka->sa.sa_handler;
 
 	if (ka->sa.sa_flags & SA_RESTORER)
@@ -257,17 +282,17 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
 	if (!frame)
 		return 1;
 
-	__put_user_error(0, &frame->uc.uc_flags, err);
-	__put_user_error(NULL, &frame->uc.uc_link, err);
+	__put_user_error(0, &frame->sig.uc.uc_flags, err);
+	__put_user_error(NULL, &frame->sig.uc.uc_link, err);
 
-	err |= __save_altstack(&frame->uc.uc_stack, regs->sp);
-	err |= setup_sigframe(frame, regs, set);
+	err |= __save_altstack(&frame->sig.uc.uc_stack, regs->sp);
+	err |= setup_sigframe(&frame->sig, regs, set);
 	if (err == 0) {
-		setup_return(regs, &ksig->ka, frame, usig);
+		setup_return(regs, &ksig->ka, frame, RT_SIGFRAME_FP_POS, usig);
 		if (ksig->ka.sa.sa_flags & SA_SIGINFO) {
 			err |= copy_siginfo_to_user(&frame->info, &ksig->info);
 			regs->regs[1] = (unsigned long)&frame->info;
-			regs->regs[2] = (unsigned long)&frame->uc;
+			regs->regs[2] = (unsigned long)&frame->sig.uc;
 		}
 	}
 
-- 
2.7.4


* [PATCH 14/18] arm64: signal32: move ilp32 and aarch32 common code to separated file
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (12 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 13/18] arm64: signal: share lp64 signal routines to ilp32 Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-12-05 16:18   ` Catalin Marinas
  2016-10-21 20:33 ` [PATCH 15/18] arm64: ilp32: introduce ilp32-specific handlers for sigframe and ucontext Yury Norov
                   ` (7 subsequent siblings)
  21 siblings, 1 reply; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
---
 arch/arm64/include/asm/signal32.h        |   3 +
 arch/arm64/include/asm/signal32_common.h |  27 +++++++
 arch/arm64/kernel/Makefile               |   2 +-
 arch/arm64/kernel/signal32.c             | 107 ------------------------
 arch/arm64/kernel/signal32_common.c      | 135 +++++++++++++++++++++++++++++++
 5 files changed, 166 insertions(+), 108 deletions(-)
 create mode 100644 arch/arm64/include/asm/signal32_common.h
 create mode 100644 arch/arm64/kernel/signal32_common.c

diff --git a/arch/arm64/include/asm/signal32.h b/arch/arm64/include/asm/signal32.h
index e68fcce..1c4ede7 100644
--- a/arch/arm64/include/asm/signal32.h
+++ b/arch/arm64/include/asm/signal32.h
@@ -13,6 +13,9 @@
  * You should have received a copy of the GNU General Public License
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
+
+#include <asm/signal32_common.h>
+
 #ifndef __ASM_SIGNAL32_H
 #define __ASM_SIGNAL32_H
 
diff --git a/arch/arm64/include/asm/signal32_common.h b/arch/arm64/include/asm/signal32_common.h
new file mode 100644
index 0000000..36c1ebc
--- /dev/null
+++ b/arch/arm64/include/asm/signal32_common.h
@@ -0,0 +1,27 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SIGNAL32_COMMON_H
+#define __ASM_SIGNAL32_COMMON_H
+
+#ifdef CONFIG_COMPAT
+
+int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from);
+int copy_siginfo_from_user32(siginfo_t *to, compat_siginfo_t __user *from);
+
+int put_sigset_t(compat_sigset_t __user *uset, sigset_t *set);
+int get_sigset_t(sigset_t *set, const compat_sigset_t __user *uset);
+
+#endif /* CONFIG_COMPAT */
+
+#endif /* __ASM_SIGNAL32_COMMON_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 06070f5..fdc0052 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -30,7 +30,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
 arm64-obj-$(CONFIG_AARCH32_EL0)		+= sys32.o kuser32.o signal32.o 	\
 					   sys_compat.o entry32.o binfmt_elf32.o
 arm64-obj-$(CONFIG_ARM64_ILP32)		+= binfmt_ilp32.o sys_ilp32.o
-arm64-obj-$(CONFIG_COMPAT)		+= entry32_common.o
+arm64-obj-$(CONFIG_COMPAT)		+= entry32_common.o signal32_common.o
 arm64-obj-$(CONFIG_FUNCTION_TRACER)	+= ftrace.o entry-ftrace.o
 arm64-obj-$(CONFIG_MODULES)		+= arm64ksyms.o module.o
 arm64-obj-$(CONFIG_ARM64_MODULE_PLTS)	+= module-plts.o
diff --git a/arch/arm64/kernel/signal32.c b/arch/arm64/kernel/signal32.c
index b7063de..f2c1a38 100644
--- a/arch/arm64/kernel/signal32.c
+++ b/arch/arm64/kernel/signal32.c
@@ -103,113 +103,6 @@ struct compat_rt_sigframe {
 
 #define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))
 
-static inline int put_sigset_t(compat_sigset_t __user *uset, sigset_t *set)
-{
-	compat_sigset_t	cset;
-
-	cset.sig[0] = set->sig[0] & 0xffffffffull;
-	cset.sig[1] = set->sig[0] >> 32;
-
-	return copy_to_user(uset, &cset, sizeof(*uset));
-}
-
-static inline int get_sigset_t(sigset_t *set,
-			       const compat_sigset_t __user *uset)
-{
-	compat_sigset_t s32;
-
-	if (copy_from_user(&s32, uset, sizeof(*uset)))
-		return -EFAULT;
-
-	set->sig[0] = s32.sig[0] | (((long)s32.sig[1]) << 32);
-	return 0;
-}
-
-int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from)
-{
-	int err;
-
-	if (!access_ok(VERIFY_WRITE, to, sizeof(*to)))
-		return -EFAULT;
-
-	/* If you change siginfo_t structure, please be sure
-	 * this code is fixed accordingly.
-	 * It should never copy any pad contained in the structure
-	 * to avoid security leaks, but must copy the generic
-	 * 3 ints plus the relevant union member.
-	 * This routine must convert siginfo from 64bit to 32bit as well
-	 * at the same time.
-	 */
-	err = __put_user(from->si_signo, &to->si_signo);
-	err |= __put_user(from->si_errno, &to->si_errno);
-	err |= __put_user((short)from->si_code, &to->si_code);
-	if (from->si_code < 0)
-		err |= __copy_to_user(&to->_sifields._pad, &from->_sifields._pad,
-				      SI_PAD_SIZE);
-	else switch (from->si_code & __SI_MASK) {
-	case __SI_KILL:
-		err |= __put_user(from->si_pid, &to->si_pid);
-		err |= __put_user(from->si_uid, &to->si_uid);
-		break;
-	case __SI_TIMER:
-		 err |= __put_user(from->si_tid, &to->si_tid);
-		 err |= __put_user(from->si_overrun, &to->si_overrun);
-		 err |= __put_user(from->si_int, &to->si_int);
-		break;
-	case __SI_POLL:
-		err |= __put_user(from->si_band, &to->si_band);
-		err |= __put_user(from->si_fd, &to->si_fd);
-		break;
-	case __SI_FAULT:
-		err |= __put_user((compat_uptr_t)(unsigned long)from->si_addr,
-				  &to->si_addr);
-#ifdef BUS_MCEERR_AO
-		/*
-		 * Other callers might not initialize the si_lsb field,
-		 * so check explicitly for the right codes here.
-		 */
-		if (from->si_signo == SIGBUS &&
-		    (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO))
-			err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
-#endif
-		break;
-	case __SI_CHLD:
-		err |= __put_user(from->si_pid, &to->si_pid);
-		err |= __put_user(from->si_uid, &to->si_uid);
-		err |= __put_user(from->si_status, &to->si_status);
-		err |= __put_user(from->si_utime, &to->si_utime);
-		err |= __put_user(from->si_stime, &to->si_stime);
-		break;
-	case __SI_RT: /* This is not generated by the kernel as of now. */
-	case __SI_MESGQ: /* But this is */
-		err |= __put_user(from->si_pid, &to->si_pid);
-		err |= __put_user(from->si_uid, &to->si_uid);
-		err |= __put_user(from->si_int, &to->si_int);
-		break;
-	case __SI_SYS:
-		err |= __put_user((compat_uptr_t)(unsigned long)
-				from->si_call_addr, &to->si_call_addr);
-		err |= __put_user(from->si_syscall, &to->si_syscall);
-		err |= __put_user(from->si_arch, &to->si_arch);
-		break;
-	default: /* this is just in case for now ... */
-		err |= __put_user(from->si_pid, &to->si_pid);
-		err |= __put_user(from->si_uid, &to->si_uid);
-		break;
-	}
-	return err;
-}
-
-int copy_siginfo_from_user32(siginfo_t *to, compat_siginfo_t __user *from)
-{
-	if (copy_from_user(to, from, __ARCH_SI_PREAMBLE_SIZE) ||
-	    copy_from_user(to->_sifields._pad,
-			   from->_sifields._pad, SI_PAD_SIZE))
-		return -EFAULT;
-
-	return 0;
-}
-
 /*
  * VFP save/restore code.
  *
diff --git a/arch/arm64/kernel/signal32_common.c b/arch/arm64/kernel/signal32_common.c
new file mode 100644
index 0000000..c8cba96
--- /dev/null
+++ b/arch/arm64/kernel/signal32_common.c
@@ -0,0 +1,135 @@
+/*
+ * Based on arch/arm/kernel/signal.c
+ *
+ * Copyright (C) 1995-2009 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ * Modified by Will Deacon <will.deacon@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/compat.h>
+#include <linux/signal.h>
+#include <linux/ratelimit.h>
+
+#include <asm/esr.h>
+#include <asm/fpsimd.h>
+#include <asm/signal32_common.h>
+#include <asm/uaccess.h>
+#include <asm/unistd.h>
+
+int put_sigset_t(compat_sigset_t __user *uset, sigset_t *set)
+{
+	compat_sigset_t	cset;
+
+	cset.sig[0] = set->sig[0] & 0xffffffffull;
+	cset.sig[1] = set->sig[0] >> 32;
+
+	return copy_to_user(uset, &cset, sizeof(*uset));
+}
+
+int get_sigset_t(sigset_t *set, const compat_sigset_t __user *uset)
+{
+	compat_sigset_t s32;
+
+	if (copy_from_user(&s32, uset, sizeof(*uset)))
+		return -EFAULT;
+
+	set->sig[0] = s32.sig[0] | (((long)s32.sig[1]) << 32);
+	return 0;
+}
+
+int copy_siginfo_to_user32(compat_siginfo_t __user *to, const siginfo_t *from)
+{
+	int err;
+
+	if (!access_ok(VERIFY_WRITE, to, sizeof(*to)))
+		return -EFAULT;
+
+	/* If you change siginfo_t structure, please be sure
+	 * this code is fixed accordingly.
+	 * It should never copy any pad contained in the structure
+	 * to avoid security leaks, but must copy the generic
+	 * 3 ints plus the relevant union member.
+	 * This routine must convert siginfo from 64bit to 32bit as well
+	 * at the same time.
+	 */
+	err = __put_user(from->si_signo, &to->si_signo);
+	err |= __put_user(from->si_errno, &to->si_errno);
+	err |= __put_user((short)from->si_code, &to->si_code);
+	if (from->si_code < 0)
+		err |= __copy_to_user(&to->_sifields._pad, &from->_sifields._pad,
+				      SI_PAD_SIZE);
+	else switch (from->si_code & __SI_MASK) {
+	case __SI_KILL:
+		err |= __put_user(from->si_pid, &to->si_pid);
+		err |= __put_user(from->si_uid, &to->si_uid);
+		break;
+	case __SI_TIMER:
+		err |= __put_user(from->si_tid, &to->si_tid);
+		err |= __put_user(from->si_overrun, &to->si_overrun);
+		err |= __put_user(from->si_int, &to->si_int);
+		break;
+	case __SI_POLL:
+		err |= __put_user(from->si_band, &to->si_band);
+		err |= __put_user(from->si_fd, &to->si_fd);
+		break;
+	case __SI_FAULT:
+		err |= __put_user((compat_uptr_t)(unsigned long)from->si_addr,
+				  &to->si_addr);
+#ifdef BUS_MCEERR_AO
+		/*
+		 * Other callers might not initialize the si_lsb field,
+		 * so check explicitly for the right codes here.
+		 */
+		if (from->si_signo == SIGBUS &&
+		    (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO))
+			err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
+#endif
+		break;
+	case __SI_CHLD:
+		err |= __put_user(from->si_pid, &to->si_pid);
+		err |= __put_user(from->si_uid, &to->si_uid);
+		err |= __put_user(from->si_status, &to->si_status);
+		err |= __put_user(from->si_utime, &to->si_utime);
+		err |= __put_user(from->si_stime, &to->si_stime);
+		break;
+	case __SI_RT: /* This is not generated by the kernel as of now. */
+	case __SI_MESGQ: /* But this is */
+		err |= __put_user(from->si_pid, &to->si_pid);
+		err |= __put_user(from->si_uid, &to->si_uid);
+		err |= __put_user(from->si_int, &to->si_int);
+		break;
+	case __SI_SYS:
+		err |= __put_user((compat_uptr_t)(unsigned long)
+				from->si_call_addr, &to->si_call_addr);
+		err |= __put_user(from->si_syscall, &to->si_syscall);
+		err |= __put_user(from->si_arch, &to->si_arch);
+		break;
+	default: /* this is just in case for now ... */
+		err |= __put_user(from->si_pid, &to->si_pid);
+		err |= __put_user(from->si_uid, &to->si_uid);
+		break;
+	}
+	return err;
+}
+
+int copy_siginfo_from_user32(siginfo_t *to, compat_siginfo_t __user *from)
+{
+	if (copy_from_user(to, from, __ARCH_SI_PREAMBLE_SIZE) ||
+	    copy_from_user(to->_sifields._pad,
+			   from->_sifields._pad, SI_PAD_SIZE))
+		return -EFAULT;
+
+	return 0;
+}
-- 
2.7.4


* [PATCH 15/18] arm64: ilp32: introduce ilp32-specific handlers for sigframe and ucontext
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (13 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 14/18] arm64: signal32: move ilp32 and aarch32 common code to separated file Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-10-21 20:33 ` [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32 Yury Norov
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf, Andrew Pinski,
	Andrew Pinski

From: Andrew Pinski <apinski@cavium.com>

ILP32 uses AARCH32 compat structures and syscall handlers for signals.
But ILP32 struct rt_sigframe and ucontext differ from both LP64 and
AARCH32, so a specific mechanism is needed to handle them.
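
One visible difference is the signal mask: the ILP32 ucontext stores a
compat-style mask of two 32-bit words, padded up to the 1024 bits that
glibc reserves, while the kernel works on a single 64-bit word. A rough
userspace sketch of that conversion follows. The stub type names, the
helper names, and ILP32_SIGMASK_PAD are assumptions for the example;
the arithmetic mirrors put_sigset_t() and get_sigset_t() and the size
of the unused[] padding in this series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-ins: on arm64 the kernel sigset_t holds one 64-bit word,
 * compat_sigset_t holds two 32-bit words.
 */
typedef struct { uint64_t sig[1]; } ksigset_stub;
typedef struct { uint32_t sig[2]; } compat_sigset_stub;

/* Mirrors put_sigset_t(): split the 64-bit mask, low word first. */
static inline void sigset_to_compat(compat_sigset_stub *c,
				    const ksigset_stub *k)
{
	c->sig[0] = (uint32_t)(k->sig[0] & 0xffffffffull);
	c->sig[1] = (uint32_t)(k->sig[0] >> 32);
}

/* Mirrors get_sigset_t(): rebuild the 64-bit mask from two words. */
static inline void sigset_from_compat(ksigset_stub *k,
				      const compat_sigset_stub *c)
{
	k->sig[0] = c->sig[0] | ((uint64_t)c->sig[1] << 32);
}

/* Padding that rounds the stored mask up to 1024 bits (128 bytes). */
enum { ILP32_SIGMASK_PAD = 1024 / 8 - sizeof(compat_sigset_stub) };
```

The padding keeps the mask area the same size as glibc's sigset_t, so
uc_mcontext stays last in the structure and can still be extended.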

Signed-off-by: Andrew Pinski <Andrew.Pinski@caviumnetworks.com>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
---
 arch/arm64/include/asm/signal_ilp32.h |  38 ++++++++
 arch/arm64/kernel/Makefile            |   3 +-
 arch/arm64/kernel/entry_ilp32.S       |  22 +++++
 arch/arm64/kernel/signal.c            |   3 +
 arch/arm64/kernel/signal_ilp32.c      | 174 ++++++++++++++++++++++++++++++++++
 5 files changed, 239 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/signal_ilp32.h
 create mode 100644 arch/arm64/kernel/entry_ilp32.S
 create mode 100644 arch/arm64/kernel/signal_ilp32.c

diff --git a/arch/arm64/include/asm/signal_ilp32.h b/arch/arm64/include/asm/signal_ilp32.h
new file mode 100644
index 0000000..d3210d8
--- /dev/null
+++ b/arch/arm64/include/asm/signal_ilp32.h
@@ -0,0 +1,38 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <asm/signal32_common.h>
+#include <asm/signal_common.h>
+
+#ifndef __ASM_SIGNAL_ILP32_H
+#define __ASM_SIGNAL_ILP32_H
+
+#ifdef CONFIG_ARM64_ILP32
+
+#include <linux/compat.h>
+
+int ilp32_setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
+			  struct pt_regs *regs);
+
+#else
+
+static inline int ilp32_setup_rt_frame(int usig, struct ksignal *ksig,
+					sigset_t *set, struct pt_regs *regs)
+{
+	return -ENOSYS;
+}
+
+#endif /* CONFIG_ARM64_ILP32 */
+
+#endif /* __ASM_SIGNAL_ILP32_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index fdc0052..af400fb 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -29,7 +29,8 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
 
 arm64-obj-$(CONFIG_AARCH32_EL0)		+= sys32.o kuser32.o signal32.o 	\
 					   sys_compat.o entry32.o binfmt_elf32.o
-arm64-obj-$(CONFIG_ARM64_ILP32)		+= binfmt_ilp32.o sys_ilp32.o
+arm64-obj-$(CONFIG_ARM64_ILP32)		+= binfmt_ilp32.o sys_ilp32.o 		\
+					   signal_ilp32.o entry_ilp32.o
 arm64-obj-$(CONFIG_COMPAT)		+= entry32_common.o signal32_common.o
 arm64-obj-$(CONFIG_FUNCTION_TRACER)	+= ftrace.o entry-ftrace.o
 arm64-obj-$(CONFIG_MODULES)		+= arm64ksyms.o module.o
diff --git a/arch/arm64/kernel/entry_ilp32.S b/arch/arm64/kernel/entry_ilp32.S
new file mode 100644
index 0000000..a8bb94b
--- /dev/null
+++ b/arch/arm64/kernel/entry_ilp32.S
@@ -0,0 +1,22 @@
+/*
+ * ILP32 system call wrappers
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+
+ENTRY(ilp32_sys_rt_sigreturn_wrapper)
+	mov	x0, sp
+	b	ilp32_sys_rt_sigreturn
+ENDPROC(ilp32_sys_rt_sigreturn_wrapper)
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 478d6c5..1b130f4 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -35,6 +35,7 @@
 #include <asm/signal32.h>
 #include <asm/vdso.h>
 #include <asm/signal_common.h>
+#include <asm/signal_ilp32.h>
 
 #define RT_SIGFRAME_FP_POS (offsetof(struct rt_sigframe, sig)	\
 			+ offsetof(struct sigframe, fp))
@@ -325,6 +326,8 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
 			ret = compat_setup_rt_frame(usig, ksig, oldset, regs);
 		else
 			ret = compat_setup_frame(usig, ksig, oldset, regs);
+	} else if (is_ilp32_compat_task()) {
+		ret = ilp32_setup_rt_frame(usig, ksig, oldset, regs);
 	} else {
 		ret = setup_rt_frame(usig, ksig, oldset, regs);
 	}
diff --git a/arch/arm64/kernel/signal_ilp32.c b/arch/arm64/kernel/signal_ilp32.c
new file mode 100644
index 0000000..6f9b7aa
--- /dev/null
+++ b/arch/arm64/kernel/signal_ilp32.c
@@ -0,0 +1,174 @@
+/*
+ * Based on arch/arm/kernel/signal.c
+ *
+ * Copyright (C) 1995-2009 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2016 Cavium Networks.
+ * Yury Norov <ynorov@caviumnetworks.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/compat.h>
+#include <linux/signal.h>
+#include <linux/syscalls.h>
+#include <linux/ratelimit.h>
+
+#include <asm/esr.h>
+#include <asm/fpsimd.h>
+#include <asm/signal_ilp32.h>
+#include <asm/uaccess.h>
+#include <asm/unistd.h>
+#include <asm/ucontext.h>
+
+
+#define ILP32_RT_SIGFRAME_FP_POS (offsetof(struct ilp32_rt_sigframe, sig)	\
+			+ offsetof(struct ilp32_sigframe, fp))
+
+struct ilp32_ucontext {
+	u32		uc_flags;
+	u32		uc_link;
+	compat_stack_t  uc_stack;
+	compat_sigset_t uc_sigmask;
+	/* glibc uses a 1024-bit sigset_t */
+	__u8		unused[1024 / 8 - sizeof(compat_sigset_t)];
+	/* last for future expansion */
+	struct sigcontext uc_mcontext;
+};
+
+struct ilp32_sigframe {
+	struct ilp32_ucontext uc;
+	u64 fp;
+	u64 lr;
+};
+
+struct ilp32_rt_sigframe {
+	struct compat_siginfo info;
+	struct ilp32_sigframe sig;
+};
+
+static int restore_ilp32_sigframe(struct pt_regs *regs,
+			struct ilp32_sigframe __user *sf)
+{
+	int err;
+	sigset_t set;
+
+	err = get_sigset_t(&set, &sf->uc.uc_sigmask);
+	if (err == 0)
+		set_current_blocked(&set);
+	err |= restore_sigcontext(regs, &sf->uc.uc_mcontext);
+	return err;
+}
+
+static int setup_ilp32_sigframe(struct ilp32_sigframe __user *sf,
+				struct pt_regs *regs, sigset_t *set)
+{
+	int err = 0;
+
+	/* set up the stack frame for unwinding */
+	__put_user_error(regs->regs[29], &sf->fp, err);
+	__put_user_error(regs->regs[30], &sf->lr, err);
+
+	err |= put_sigset_t(&sf->uc.uc_sigmask, set);
+	err |= setup_sigcontext(&sf->uc.uc_mcontext, regs);
+	return err;
+}
+
+asmlinkage long ilp32_sys_rt_sigreturn(struct pt_regs *regs)
+{
+	struct ilp32_rt_sigframe __user *frame;
+
+	/* Always make any pending restarted system calls return -EINTR */
+	current->restart_block.fn = do_no_restart_syscall;
+
+	/*
+	 * Since we stacked the signal frame on a 128-bit boundary,
+	 * 'sp' should be 16-byte aligned here.  If it's
+	 * not, then the user is trying to mess with us.
+	 */
+	if (regs->sp & 15)
+		goto badframe;
+
+	frame = (struct ilp32_rt_sigframe __user *) regs->sp;
+
+	if (!access_ok(VERIFY_READ, frame, sizeof(*frame)))
+		goto badframe;
+
+	if (restore_ilp32_sigframe(regs, &frame->sig))
+		goto badframe;
+
+	if (compat_restore_altstack(&frame->sig.uc.uc_stack))
+		goto badframe;
+
+	return regs->regs[0];
+
+badframe:
+	if (show_unhandled_signals)
+		pr_info_ratelimited("%s[%d]: bad frame in %s: pc=%08llx sp=%08llx\n",
+				    current->comm, task_pid_nr(current),
+				    __func__, regs->pc, regs->sp);
+	force_sig(SIGSEGV, current);
+
+	return 0;
+}
+
+static struct ilp32_rt_sigframe __user *ilp32_get_sigframe(struct ksignal *ksig,
+						struct pt_regs *regs)
+{
+	unsigned long sp, sp_top;
+	struct ilp32_rt_sigframe __user *frame;
+
+	sp = sp_top = sigsp(regs->sp, ksig);
+
+	sp = (sp - sizeof(struct ilp32_rt_sigframe)) & ~15;
+	frame = (struct ilp32_rt_sigframe __user *)sp;
+
+	/*
+	 * Check that we can actually write to the signal frame.
+	 */
+	if (!access_ok(VERIFY_WRITE, frame, sp_top - sp))
+		frame = NULL;
+
+	return frame;
+}
+
+/*
+ * ILP32 signal handling routines called from signal.c
+ */
+int ilp32_setup_rt_frame(int usig, struct ksignal *ksig,
+			  sigset_t *set, struct pt_regs *regs)
+{
+	struct ilp32_rt_sigframe __user *frame;
+	int err = 0;
+
+	frame = ilp32_get_sigframe(ksig, regs);
+
+	if (!frame)
+		return 1;
+
+	err |= copy_siginfo_to_user32(&frame->info, &ksig->info);
+
+	__put_user_error(0, &frame->sig.uc.uc_flags, err);
+	__put_user_error(0, &frame->sig.uc.uc_link, err);
+
+	err |= __compat_save_altstack(&frame->sig.uc.uc_stack, regs->sp);
+	err |= setup_ilp32_sigframe(&frame->sig, regs, set);
+	if (err == 0) {
+		setup_return(regs, &ksig->ka,
+				frame, ILP32_RT_SIGFRAME_FP_POS, usig);
+		regs->regs[1] = (unsigned long)&frame->info;
+		regs->regs[2] = (unsigned long)&frame->sig.uc;
+	}
+
+	return err;
+}
-- 
2.7.4

* [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (14 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 15/18] arm64: ilp32: introduce ilp32-specific handlers for sigframe and ucontext Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-12-05 16:34   ` Catalin Marinas
  2016-10-21 20:33 ` [PATCH 17/18] arm64:ilp32: add vdso-ilp32 and use for signal return Yury Norov
                   ` (5 subsequent siblings)
  21 siblings, 1 reply; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf,
	Bamvor Zhang Jian

A new aarch32 ptrace syscall handler is introduced so that the task type
does not have to be detected at run time.
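
As a rough userspace illustration of the idea (not kernel code; every name
below is hypothetical), each ABI gets its own syscall table entry, so the
handler that runs already knows which kind of task called it and needs no
is-this-aarch32-or-ilp32 check of its own:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags, used only to show which handler ran. */
enum handler_id { HANDLER_A32 = 1, HANDLER_ILP32 = 2 };

typedef long (*ptrace_fn)(long request);

/* Stand-in for the aarch32-only request handler. */
static long a32_ptrace(long request)
{
	(void)request;
	return HANDLER_A32;
}

/* Stand-in for the ilp32 request handler. */
static long ilp32_ptrace(long request)
{
	(void)request;
	return HANDLER_ILP32;
}

/* One table per ABI; the table is chosen once, at syscall entry. */
struct syscall_table {
	ptrace_fn ptrace;
};

static const struct syscall_table aarch32_table = { .ptrace = a32_ptrace };
static const struct syscall_table ilp32_table   = { .ptrace = ilp32_ptrace };

/* Dispatch through the table; no task-type test on this path. */
static long do_ptrace(const struct syscall_table *t, long request)
{
	assert(t != NULL && t->ptrace != NULL);
	return t->ptrace(request);
}
```

In the patch itself the same effect comes from pointing __NR_ptrace in
unistd32.h at compat_sys_aarch32_ptrace, while compat_arch_ptrace() is left
for ilp32.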

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Bamvor Zhang Jian <bamvor.zhangjian@linaro.org>
Signed-off-by: Chengming Zhou <zhouchengming1@huawei.com>
---
 arch/arm64/include/asm/unistd32.h |  2 +-
 arch/arm64/kernel/ptrace.c        | 91 ++++++++++++++++++++++++++++++++++++++-
 arch/arm64/kernel/sys32.c         |  1 +
 include/linux/ptrace.h            |  6 +++
 kernel/ptrace.c                   | 10 ++---
 5 files changed, 103 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index b7e8ef1..6da7cbd 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -74,7 +74,7 @@ __SYSCALL(__NR_getuid, sys_getuid16)
 			/* 25 was sys_stime */
 __SYSCALL(25, sys_ni_syscall)
 #define __NR_ptrace 26
-__SYSCALL(__NR_ptrace, compat_sys_ptrace)
+__SYSCALL(__NR_ptrace, compat_sys_aarch32_ptrace)
 			/* 27 was sys_alarm */
 __SYSCALL(27, sys_ni_syscall)
 			/* 28 was sys_fstat */
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 1d075ed..ac542c9 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -29,6 +29,7 @@
 #include <linux/user.h>
 #include <linux/seccomp.h>
 #include <linux/security.h>
+#include <linux/syscalls.h>
 #include <linux/init.h>
 #include <linux/signal.h>
 #include <linux/uaccess.h>
@@ -40,6 +41,7 @@
 
 #include <asm/debug-monitors.h>
 #include <asm/pgtable.h>
+#include <asm/signal32_common.h>
 #include <asm/syscall.h>
 #include <asm/traps.h>
 #include <asm/system_misc.h>
@@ -1215,7 +1217,7 @@ static int compat_ptrace_sethbpregs(struct task_struct *tsk, compat_long_t num,
 }
 #endif	/* CONFIG_HAVE_HW_BREAKPOINT */
 
-long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
+static long compat_a32_ptrace(struct task_struct *child, compat_long_t request,
 			compat_ulong_t caddr, compat_ulong_t cdata)
 {
 	unsigned long addr = caddr;
@@ -1292,8 +1294,95 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
 
 	return ret;
 }
+
+COMPAT_SYSCALL_DEFINE4(aarch32_ptrace, compat_long_t, request, compat_long_t, pid,
+		       compat_long_t, addr, compat_long_t, data)
+{
+	struct task_struct *child;
+	long ret;
+
+	if (request == PTRACE_TRACEME) {
+		ret = ptrace_traceme();
+		goto out;
+	}
+
+	child = ptrace_get_task_struct(pid);
+	if (IS_ERR(child)) {
+		ret = PTR_ERR(child);
+		goto out;
+	}
+
+	if (request == PTRACE_ATTACH || request == PTRACE_SEIZE) {
+		ret = ptrace_attach(child, request, addr, data);
+		goto out_put_task_struct;
+	}
+
+	ret = ptrace_check_attach(child, request == PTRACE_KILL ||
+				  request == PTRACE_INTERRUPT);
+	if (!ret) {
+		ret = compat_a32_ptrace(child, request, addr, data);
+		if (ret || request != PTRACE_DETACH)
+			ptrace_unfreeze_traced(child);
+	}
+
+ out_put_task_struct:
+	put_task_struct(child);
+ out:
+	return ret;
+}
+
 #endif /* CONFIG_AARCH32_EL0 */
 
+#ifdef CONFIG_ARM64_ILP32
+
+long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
+			compat_ulong_t caddr, compat_ulong_t cdata)
+{
+	sigset_t new_set;
+
+	switch (request) {
+	case PTRACE_GETSIGMASK:
+		if (caddr != sizeof(compat_sigset_t))
+			return -EINVAL;
+
+		return put_sigset_t((compat_sigset_t __user *) (u64) cdata,
+					&child->blocked);
+
+	case PTRACE_SETSIGMASK:
+		if (caddr != sizeof(compat_sigset_t))
+			return -EINVAL;
+
+		if (get_sigset_t(&new_set, (compat_sigset_t __user *) (u64) cdata))
+			return -EFAULT;
+
+		sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
+
+		/*
+		 * Every thread does recalc_sigpending() after resume, so
+		 * retarget_shared_pending() and recalc_sigpending() are not
+		 * called here.
+		 */
+		spin_lock_irq(&child->sighand->siglock);
+		child->blocked = new_set;
+		spin_unlock_irq(&child->sighand->siglock);
+
+		return 0;
+
+	default:
+		return compat_ptrace_request(child, request, caddr, cdata);
+	}
+}
+
+#elif defined(CONFIG_COMPAT)
+
+long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
+		compat_ulong_t caddr, compat_ulong_t cdata)
+{
+	return 0;
+}
+
+#endif
+
 const struct user_regset_view *task_user_regset_view(struct task_struct *task)
 {
 #ifdef CONFIG_AARCH32_EL0
diff --git a/arch/arm64/kernel/sys32.c b/arch/arm64/kernel/sys32.c
index a40b134..3752443 100644
--- a/arch/arm64/kernel/sys32.c
+++ b/arch/arm64/kernel/sys32.c
@@ -38,6 +38,7 @@ asmlinkage long compat_sys_fadvise64_64_wrapper(void);
 asmlinkage long compat_sys_sync_file_range2_wrapper(void);
 asmlinkage long compat_sys_fallocate_wrapper(void);
 asmlinkage long compat_sys_mmap2_wrapper(void);
+asmlinkage long compat_sys_aarch32_ptrace(void);
 
 #undef __SYSCALL
 #define __SYSCALL(nr, sym)	[nr] = sym,
diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 504c98a..75887a0 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -97,6 +97,12 @@ int generic_ptrace_peekdata(struct task_struct *tsk, unsigned long addr,
 			    unsigned long data);
 int generic_ptrace_pokedata(struct task_struct *tsk, unsigned long addr,
 			    unsigned long data);
+int ptrace_traceme(void);
+struct task_struct *ptrace_get_task_struct(pid_t pid);
+int ptrace_attach(struct task_struct *task, long request,
+			 unsigned long addr, unsigned long flags);
+int ptrace_check_attach(struct task_struct *child, bool ignore_state);
+void ptrace_unfreeze_traced(struct task_struct *task);
 
 /**
  * ptrace_parent - return the task that is tracing the given task
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 2a99027..5638880 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -138,7 +138,7 @@ static bool ptrace_freeze_traced(struct task_struct *task)
 	return ret;
 }
 
-static void ptrace_unfreeze_traced(struct task_struct *task)
+void ptrace_unfreeze_traced(struct task_struct *task)
 {
 	if (task->state != __TASK_TRACED)
 		return;
@@ -170,7 +170,7 @@ static void ptrace_unfreeze_traced(struct task_struct *task)
  * RETURNS:
  * 0 on success, -ESRCH if %child is not ready.
  */
-static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
+int ptrace_check_attach(struct task_struct *child, bool ignore_state)
 {
 	int ret = -ESRCH;
 
@@ -294,7 +294,7 @@ bool ptrace_may_access(struct task_struct *task, unsigned int mode)
 	return !err;
 }
 
-static int ptrace_attach(struct task_struct *task, long request,
+int ptrace_attach(struct task_struct *task, long request,
 			 unsigned long addr,
 			 unsigned long flags)
 {
@@ -408,7 +408,7 @@ static int ptrace_attach(struct task_struct *task, long request,
  * Performs checks and sets PT_PTRACED.
  * Should be used by all ptrace implementations for PTRACE_TRACEME.
  */
-static int ptrace_traceme(void)
+int ptrace_traceme(void)
 {
 	int ret = -EPERM;
 
@@ -1057,7 +1057,7 @@ int ptrace_request(struct task_struct *child, long request,
 	return ret;
 }
 
-static struct task_struct *ptrace_get_task_struct(pid_t pid)
+struct task_struct *ptrace_get_task_struct(pid_t pid)
 {
 	struct task_struct *child;
 
-- 
2.7.4

* [PATCH 17/18] arm64:ilp32: add vdso-ilp32 and use for signal return
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (15 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32 Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-10-21 20:33 ` [PATCH 18/18] arm64:ilp32: add ARM64_ILP32 to Kconfig Yury Norov
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf,
	Bamvor Zhang Jian

From: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>

The ILP32 vDSO exports the following symbols:
 __kernel_rt_sigreturn;
 __kernel_gettimeofday;
 __kernel_clock_gettime;
 __kernel_clock_getres.

The kernel selects which shared object to map based on the result of
is_ilp32_compat_task() in arch/arm64/kernel/vdso.c, and substitutes the
corresponding pages and vm_special_mapping spec.
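
A minimal userspace sketch of that selection, assuming made-up page counts
and hypothetical names (struct vdso_image, select_vdso and vdso_mapping_len
are not in the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor for one vDSO flavour. */
struct vdso_image {
	const char *name;
	unsigned long pages;	/* code pages only, data page excluded */
};

/* Page counts are invented for the example. */
static const struct vdso_image lp64_image  = { "lp64", 2 };
static const struct vdso_image ilp32_image = { "ilp32", 1 };

/* Pick the image by task type; lp64 is the default. */
static const struct vdso_image *select_vdso(bool is_ilp32)
{
	return is_ilp32 ? &ilp32_image : &lp64_image;
}

/* Mapping length is the code pages plus one data page. */
static unsigned long vdso_mapping_len(const struct vdso_image *img,
				      unsigned long page_size)
{
	assert(img != NULL);
	return (img->pages + 1) * page_size;
}
```

This mirrors arch_setup_additional_pages() in the diff below, where
vdso_text_len is computed from the selected page count and PAGE_SIZE is
added for the data page.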

Adjusted to move the data page before the code pages, in sync with
commit 601255ae3c98 ("arm64: vdso: move data page before code pages").

Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Bamvor Zhang Jian <bamvor.zhangjian@linaro.org>
---
 arch/arm64/include/asm/vdso.h                 |  6 ++
 arch/arm64/kernel/Makefile                    | 11 ++++
 arch/arm64/kernel/asm-offsets.c               |  7 ++
 arch/arm64/kernel/signal.c                    |  2 +
 arch/arm64/kernel/vdso-ilp32/.gitignore       |  2 +
 arch/arm64/kernel/vdso-ilp32/Makefile         | 74 +++++++++++++++++++++
 arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S     | 33 ++++++++++
 arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S | 95 +++++++++++++++++++++++++++
 arch/arm64/kernel/vdso.c                      | 66 ++++++++++++++++---
 arch/arm64/kernel/vdso/gettimeofday.S         | 20 +++++-
 arch/arm64/kernel/vdso/vdso.S                 |  6 +-
 11 files changed, 306 insertions(+), 16 deletions(-)
 create mode 100644 arch/arm64/kernel/vdso-ilp32/.gitignore
 create mode 100644 arch/arm64/kernel/vdso-ilp32/Makefile
 create mode 100644 arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S
 create mode 100644 arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S

diff --git a/arch/arm64/include/asm/vdso.h b/arch/arm64/include/asm/vdso.h
index 839ce00..649a9a4 100644
--- a/arch/arm64/include/asm/vdso.h
+++ b/arch/arm64/include/asm/vdso.h
@@ -29,6 +29,12 @@
 
 #include <generated/vdso-offsets.h>
 
+#ifdef CONFIG_ARM64_ILP32
+#include <generated/vdso-ilp32-offsets.h>
+#else
+#define vdso_offset_sigtramp_ilp32
+#endif
+
 #define VDSO_SYMBOL(base, name)						   \
 ({									   \
 	(void *)(vdso_offset_##name - VDSO_LBASE + (unsigned long)(base)); \
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index af400fb..43e680a 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -55,6 +55,17 @@ arm64-obj-$(CONFIG_KEXEC)		+= machine_kexec.o relocate_kernel.o	\
 					   cpu-reset.o
 
 obj-y					+= $(arm64-obj-y) vdso/ probes/
+obj-$(CONFIG_ARM64_ILP32)		+= vdso-ilp32/
 obj-m					+= $(arm64-obj-m)
 head-y					:= head.o
 extra-y					+= $(head-y) vmlinux.lds
+
+# vDSO - this must be built first to generate the symbol offsets
+$(call objectify,$(arm64-obj-y)): $(obj)/vdso/vdso-offsets.h
+$(obj)/vdso/vdso-offsets.h: $(obj)/vdso
+
+ifeq ($(CONFIG_ARM64_ILP32),y)
+# vDSO - this must be built first to generate the symbol offsets
+$(call objectify,$(arm64-obj-y)): $(obj)/vdso-ilp32/vdso-ilp32-offsets.h
+$(obj)/vdso-ilp32/vdso-ilp32-offsets.h: $(obj)/vdso-ilp32
+endif
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index d8d7086..8f844b9 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -119,6 +119,13 @@ int main(void)
   DEFINE(TSPEC_TV_SEC,		offsetof(struct timespec, tv_sec));
   DEFINE(TSPEC_TV_NSEC,		offsetof(struct timespec, tv_nsec));
   BLANK();
+#ifdef CONFIG_COMPAT
+  DEFINE(COMPAT_TVAL_TV_SEC,	offsetof(struct compat_timeval, tv_sec));
+  DEFINE(COMPAT_TVAL_TV_USEC,	offsetof(struct compat_timeval, tv_usec));
+  DEFINE(COMPAT_TSPEC_TV_SEC,	offsetof(struct compat_timespec, tv_sec));
+  DEFINE(COMPAT_TSPEC_TV_NSEC,	offsetof(struct compat_timespec, tv_nsec));
+  BLANK();
+#endif
   DEFINE(TZ_MINWEST,		offsetof(struct timezone, tz_minuteswest));
   DEFINE(TZ_DSTTIME,		offsetof(struct timezone, tz_dsttime));
   BLANK();
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 1b130f4..72f68f0 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -267,6 +267,8 @@ void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
 
 	if (ka->sa.sa_flags & SA_RESTORER)
 		sigtramp = ka->sa.sa_restorer;
+	else if (is_ilp32_compat_task())
+		sigtramp = VDSO_SYMBOL(current->mm->context.vdso, sigtramp_ilp32);
 	else
 		sigtramp = VDSO_SYMBOL(current->mm->context.vdso, sigtramp);
 
diff --git a/arch/arm64/kernel/vdso-ilp32/.gitignore b/arch/arm64/kernel/vdso-ilp32/.gitignore
new file mode 100644
index 0000000..61806c3
--- /dev/null
+++ b/arch/arm64/kernel/vdso-ilp32/.gitignore
@@ -0,0 +1,2 @@
+vdso-ilp32.lds
+vdso-ilp32-offsets.h
diff --git a/arch/arm64/kernel/vdso-ilp32/Makefile b/arch/arm64/kernel/vdso-ilp32/Makefile
new file mode 100644
index 0000000..0671e88
--- /dev/null
+++ b/arch/arm64/kernel/vdso-ilp32/Makefile
@@ -0,0 +1,74 @@
+#
+# Building a vDSO image for AArch64 ILP32.
+#
+# Author: Will Deacon <will.deacon@arm.com>
+# Heavily based on the vDSO Makefiles for other archs.
+#
+
+obj-ilp32-vdso := gettimeofday-ilp32.o note-ilp32.o sigreturn-ilp32.o
+
+# Build rules
+targets := $(obj-ilp32-vdso) vdso-ilp32.so vdso-ilp32.so.dbg
+obj-ilp32-vdso := $(addprefix $(obj)/, $(obj-ilp32-vdso))
+
+ccflags-y := -shared -fno-common -fno-builtin
+ccflags-y += -nostdlib -Wl,-soname=linux-ilp32-vdso.so.1 \
+		$(call cc-ldoption, -Wl$(comma)--hash-style=sysv)
+
+obj-y += vdso-ilp32.o
+extra-y += vdso-ilp32.lds vdso-ilp32-offsets.h
+CPPFLAGS_vdso-ilp32.lds += -P -C -U$(ARCH) -mabi=ilp32
+
+# Force dependency (incbin is bad)
+$(obj)/vdso-ilp32.o : $(obj)/vdso-ilp32.so
+
+# Link rule for the .so file, .lds has to be first
+$(obj)/vdso-ilp32.so.dbg: $(src)/vdso-ilp32.lds $(obj-ilp32-vdso)
+	$(call if_changed,vdso-ilp32ld)
+
+# Strip rule for the .so file
+$(obj)/%.so: OBJCOPYFLAGS := -S
+$(obj)/%.so: $(obj)/%.so.dbg FORCE
+	$(call if_changed,objcopy)
+
+# Generate VDSO offsets using helper script
+gen-vdsosym := $(srctree)/$(src)/../vdso/gen_vdso_offsets.sh
+quiet_cmd_vdsosym = VDSOSYM $@
+define cmd_vdsosym
+	$(NM) $< | $(gen-vdsosym) | LC_ALL=C sort > $@ && \
+	cp $@ include/generated/
+endef
+
+$(obj)/vdso-ilp32-offsets.h: $(obj)/vdso-ilp32.so.dbg FORCE
+	$(call if_changed,vdsosym)
+
+# Assembly rules for the .S files
+#$(obj-ilp32-vdso): %.o: $(src)/../vdso/$(subst -ilp32,,%.S)
+#	$(call if_changed_dep,vdso-ilp32as)
+
+$(obj)/gettimeofday-ilp32.o: $(src)/../vdso/gettimeofday.S
+	$(call if_changed_dep,vdso-ilp32as)
+
+$(obj)/note-ilp32.o: $(src)/../vdso/note.S
+	$(call if_changed_dep,vdso-ilp32as)
+
+# This one should be fine because ILP32 uses the same generic
+# __NR_rt_sigreturn syscall number.
+$(obj)/sigreturn-ilp32.o: $(src)/../vdso/sigreturn.S
+	$(call if_changed_dep,vdso-ilp32as)
+
+# Actual build commands
+quiet_cmd_vdso-ilp32ld = VDSOILP32L $@
+      cmd_vdso-ilp32ld = $(CC) $(c_flags) -mabi=ilp32  -Wl,-n -Wl,-T $^ -o $@
+quiet_cmd_vdso-ilp32as = VDSOILP32A $@
+      cmd_vdso-ilp32as = $(CC) $(a_flags) -mabi=ilp32 -c -o $@ $<
+
+# Install commands for the unstripped file
+quiet_cmd_vdso_install = INSTALL $@
+      cmd_vdso_install = cp $(obj)/$@.dbg $(MODLIB)/vdso/$@
+
+vdso-ilp32.so: $(obj)/vdso-ilp32.so.dbg
+	@mkdir -p $(MODLIB)/vdso
+	$(call cmd,vdso_install)
+
+vdso_install: vdso-ilp32.so
diff --git a/arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S b/arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S
new file mode 100644
index 0000000..46ac072
--- /dev/null
+++ b/arch/arm64/kernel/vdso-ilp32/vdso-ilp32.S
@@ -0,0 +1,33 @@
+/*
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <linux/const.h>
+#include <asm/page.h>
+
+	__PAGE_ALIGNED_DATA
+
+	.globl vdso_ilp32_start, vdso_ilp32_end
+	.balign PAGE_SIZE
+vdso_ilp32_start:
+	.incbin "arch/arm64/kernel/vdso-ilp32/vdso-ilp32.so"
+	.balign PAGE_SIZE
+vdso_ilp32_end:
+
+	.previous
diff --git a/arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S b/arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S
new file mode 100644
index 0000000..3b564ca
--- /dev/null
+++ b/arch/arm64/kernel/vdso-ilp32/vdso-ilp32.lds.S
@@ -0,0 +1,95 @@
+/*
+ * GNU linker script for the VDSO library.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ * Heavily based on the vDSO linker scripts for other archs.
+ */
+
+#include <linux/const.h>
+#include <asm/page.h>
+#include <asm/vdso.h>
+
+SECTIONS
+{
+	PROVIDE(_vdso_data = . - PAGE_SIZE);
+	. = VDSO_LBASE + SIZEOF_HEADERS;
+
+	.hash		: { *(.hash) }			:text
+	.gnu.hash	: { *(.gnu.hash) }
+	.dynsym		: { *(.dynsym) }
+	.dynstr		: { *(.dynstr) }
+	.gnu.version	: { *(.gnu.version) }
+	.gnu.version_d	: { *(.gnu.version_d) }
+	.gnu.version_r	: { *(.gnu.version_r) }
+
+	.note		: { *(.note.*) }		:text	:note
+
+	. = ALIGN(16);
+
+	.text		: { *(.text*) }			:text	=0xd503201f
+	PROVIDE (__etext = .);
+	PROVIDE (_etext = .);
+	PROVIDE (etext = .);
+
+	.eh_frame_hdr	: { *(.eh_frame_hdr) }		:text	:eh_frame_hdr
+	.eh_frame	: { KEEP (*(.eh_frame)) }	:text
+
+	.dynamic	: { *(.dynamic) }		:text	:dynamic
+
+	.rodata		: { *(.rodata*) }		:text
+
+	_end = .;
+	PROVIDE(end = .);
+
+	/DISCARD/	: {
+		*(.note.GNU-stack)
+		*(.data .data.* .gnu.linkonce.d.* .sdata*)
+		*(.bss .sbss .dynbss .dynsbss)
+	}
+}
+
+/*
+ * We must supply the ELF program headers explicitly to get just one
+ * PT_LOAD segment, and set the flags explicitly to make segments read-only.
+ */
+PHDRS
+{
+	text		PT_LOAD		FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */
+	dynamic		PT_DYNAMIC	FLAGS(4);		/* PF_R */
+	note		PT_NOTE		FLAGS(4);		/* PF_R */
+	eh_frame_hdr	PT_GNU_EH_FRAME;
+}
+
+/*
+ * This controls what symbols we export from the DSO.
+ */
+VERSION
+{
+	LINUX_4.9 {
+	global:
+		__kernel_rt_sigreturn;
+		__kernel_gettimeofday;
+		__kernel_clock_gettime;
+		__kernel_clock_getres;
+	local: *;
+	};
+}
+
+/*
+ * Make the sigreturn code visible to the kernel.
+ */
+VDSO_sigtramp_ilp32		= __kernel_rt_sigreturn;
diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index 7f822cd..3f884e1 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -37,8 +37,13 @@
 #include <asm/vdso.h>
 #include <asm/vdso_datapage.h>
 
-extern char vdso_start, vdso_end;
-static unsigned long vdso_pages __ro_after_init;
+extern char vdso_lp64_start, vdso_lp64_end;
+static unsigned long vdso_lp64_pages __ro_after_init;
+
+#ifdef CONFIG_ARM64_ILP32
+extern char vdso_ilp32_start, vdso_ilp32_end;
+static unsigned long vdso_ilp32_pages __ro_after_init;
+#endif
 
 /*
  * The vDSO data page.
@@ -110,7 +115,17 @@ int aarch32_setup_vectors_page(struct linux_binprm *bprm, int uses_interp)
 }
 #endif /* CONFIG_AARCH32_EL0 */
 
-static struct vm_special_mapping vdso_spec[2] __ro_after_init = {
+static struct vm_special_mapping vdso_lp64_spec[2] __ro_after_init = {
+	{
+		.name	= "[vvar]",
+	},
+	{
+		.name	= "[vdso]",
+	},
+};
+
+#ifdef CONFIG_ARM64_ILP32
+static struct vm_special_mapping vdso_ilp32_spec[2] __ro_after_init = {
 	{
 		.name	= "[vvar]",
 	},
@@ -118,20 +133,26 @@ static struct vm_special_mapping vdso_spec[2] __ro_after_init = {
 		.name	= "[vdso]",
 	},
 };
+#endif
 
-static int __init vdso_init(void)
+static int __init vdso_init(char *vdso_start, char *vdso_end,
+					  unsigned long *vdso_pagesp,
+					  struct vm_special_mapping *vdso_spec)
 {
 	int i;
+	unsigned long vdso_pages;
 	struct page **vdso_pagelist;
 
-	if (memcmp(&vdso_start, "\177ELF", 4)) {
+	if (memcmp(vdso_start, "\177ELF", 4)) {
 		pr_err("vDSO is not a valid ELF object!\n");
 		return -EINVAL;
 	}
 
-	vdso_pages = (&vdso_end - &vdso_start) >> PAGE_SHIFT;
+	vdso_pages = (vdso_end - vdso_start) >> PAGE_SHIFT;
+	*vdso_pagesp = vdso_pages;
 	pr_info("vdso: %ld pages (%ld code @ %p, %ld data @ %p)\n",
-		vdso_pages + 1, vdso_pages, &vdso_start, 1L, vdso_data);
+					vdso_pages + 1, vdso_pages,
+					vdso_start, 1L, vdso_data);
 
 	/* Allocate the vDSO pagelist, plus a page for the data. */
 	vdso_pagelist = kcalloc(vdso_pages + 1, sizeof(struct page *),
@@ -144,14 +165,30 @@ static int __init vdso_init(void)
 
 	/* Grab the vDSO code pages. */
 	for (i = 0; i < vdso_pages; i++)
-		vdso_pagelist[i + 1] = pfn_to_page(PHYS_PFN(__pa(&vdso_start)) + i);
+		vdso_pagelist[i + 1] =
+			pfn_to_page(PHYS_PFN(__pa(vdso_start)) + i);
 
 	vdso_spec[0].pages = &vdso_pagelist[0];
 	vdso_spec[1].pages = &vdso_pagelist[1];
 
 	return 0;
 }
-arch_initcall(vdso_init);
+
+static int __init vdso_lp64_init(void)
+{
+	return vdso_init(&vdso_lp64_start, &vdso_lp64_end,
+				&vdso_lp64_pages, vdso_lp64_spec);
+}
+arch_initcall(vdso_lp64_init);
+
+#ifdef CONFIG_ARM64_ILP32
+static int __init vdso_ilp32_init(void)
+{
+	return vdso_init(&vdso_ilp32_start, &vdso_ilp32_end,
+				&vdso_ilp32_pages, vdso_ilp32_spec);
+}
+arch_initcall(vdso_ilp32_init);
+#endif
 
 int arch_setup_additional_pages(struct linux_binprm *bprm,
 				int uses_interp)
@@ -159,8 +196,17 @@ int arch_setup_additional_pages(struct linux_binprm *bprm,
 	struct mm_struct *mm = current->mm;
 	unsigned long vdso_base, vdso_text_len, vdso_mapping_len;
 	void *ret;
+	unsigned long pages = vdso_lp64_pages;
+	struct vm_special_mapping *vdso_spec = vdso_lp64_spec;
+
+#ifdef CONFIG_ARM64_ILP32
+	if (is_ilp32_compat_task()) {
+		pages = vdso_ilp32_pages;
+		vdso_spec = vdso_ilp32_spec;
+	}
+#endif
 
-	vdso_text_len = vdso_pages << PAGE_SHIFT;
+	vdso_text_len = pages << PAGE_SHIFT;
 	/* Be sure to map the data page */
 	vdso_mapping_len = vdso_text_len + PAGE_SIZE;
 
diff --git a/arch/arm64/kernel/vdso/gettimeofday.S b/arch/arm64/kernel/vdso/gettimeofday.S
index e00b467..062a33d 100644
--- a/arch/arm64/kernel/vdso/gettimeofday.S
+++ b/arch/arm64/kernel/vdso/gettimeofday.S
@@ -25,6 +25,16 @@
 #define NSEC_PER_SEC_LO16	0xca00
 #define NSEC_PER_SEC_HI16	0x3b9a
 
+#ifdef __LP64__
+#define PTR_REG(n)	x##n
+#define OFFSET(n)	n
+#define DELOUSE(n)
+#else
+#define PTR_REG(n)	w##n
+#define OFFSET(n)	COMPAT_##n
+#define DELOUSE(n)	mov     w##n, w##n
+#endif
+
 vdso_data	.req	x6
 seqcnt		.req	w7
 w_tmp		.req	w8
@@ -119,7 +129,7 @@ x_tmp		.req	x8
 	.if \shift == 1
 	lsr	x11, x11, x12
 	.endif
-	stp	x10, x11, [x1, #TSPEC_TV_SEC]
+	stp     PTR_REG(10), PTR_REG(11), [x1, #OFFSET(TSPEC_TV_SEC)]
 	mov	x0, xzr
 	ret
 	.endm
@@ -136,6 +146,8 @@ x_tmp		.req	x8
 /* int __kernel_gettimeofday(struct timeval *tv, struct timezone *tz); */
 ENTRY(__kernel_gettimeofday)
 	.cfi_startproc
+	DELOUSE(0)
+	DELOUSE(1)
 	adr	vdso_data, _vdso_data
 	/* If tv is NULL, skip to the timezone code. */
 	cbz	x0, 2f
@@ -160,7 +172,7 @@ ENTRY(__kernel_gettimeofday)
 	mov	x13, #1000
 	lsl	x13, x13, x12
 	udiv	x11, x11, x13
-	stp	x10, x11, [x0, #TVAL_TV_SEC]
+	stp	PTR_REG(10), PTR_REG(11), [x0, #OFFSET(TVAL_TV_SEC)]
 2:
 	/* If tz is NULL, return 0. */
 	cbz	x1, 3f
@@ -182,6 +194,7 @@ ENDPROC(__kernel_gettimeofday)
 /* int __kernel_clock_gettime(clockid_t clock_id, struct timespec *tp); */
 ENTRY(__kernel_clock_gettime)
 	.cfi_startproc
+	DELOUSE(1)
 	cmp	w0, #JUMPSLOT_MAX
 	b.hi	syscall
 	adr	vdso_data, _vdso_data
@@ -297,6 +310,7 @@ ENDPROC(__kernel_clock_gettime)
 /* int __kernel_clock_getres(clockid_t clock_id, struct timespec *res); */
 ENTRY(__kernel_clock_getres)
 	.cfi_startproc
+	DELOUSE(1)
 	cmp	w0, #CLOCK_REALTIME
 	ccmp	w0, #CLOCK_MONOTONIC, #0x4, ne
 	ccmp	w0, #CLOCK_MONOTONIC_RAW, #0x4, ne
@@ -311,7 +325,7 @@ ENTRY(__kernel_clock_getres)
 	ldr	x2, 6f
 2:
 	cbz	w1, 3f
-	stp	xzr, x2, [x1]
+	stp	PTR_REG(zr), PTR_REG(2), [x1]
 
 3:	/* res == NULL. */
 	mov	w0, wzr
diff --git a/arch/arm64/kernel/vdso/vdso.S b/arch/arm64/kernel/vdso/vdso.S
index 82379a7..a40ae24 100644
--- a/arch/arm64/kernel/vdso/vdso.S
+++ b/arch/arm64/kernel/vdso/vdso.S
@@ -21,12 +21,12 @@
 #include <linux/const.h>
 #include <asm/page.h>
 
-	.globl vdso_start, vdso_end
+	.globl vdso_lp64_start, vdso_lp64_end
 	.section .rodata
 	.balign PAGE_SIZE
-vdso_start:
+vdso_lp64_start:
 	.incbin "arch/arm64/kernel/vdso/vdso.so"
 	.balign PAGE_SIZE
-vdso_end:
+vdso_lp64_end:
 
 	.previous
-- 
2.7.4

* [PATCH 18/18] arm64:ilp32: add ARM64_ILP32 to Kconfig
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (16 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 17/18] arm64:ilp32: add vdso-ilp32 and use for signal return Yury Norov
@ 2016-10-21 20:33 ` Yury Norov
  2016-10-28 12:46 ` ILP32 for ARM64 - testing with lmbench Yury Norov
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-10-21 20:33 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, ynorov, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf, Andrew Pinski,
	Andrew Pinski

From: Andrew Pinski <apinski@cavium.com>

This patch adds the config option for ILP32.

Signed-off-by: Andrew Pinski <Andrew.Pinski@caviumnetworks.com>
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Reviewed-by: David Daney <ddaney@caviumnetworks.com>
---
 arch/arm64/Kconfig | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9efa86a..07e177f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -996,7 +996,7 @@ source "fs/Kconfig.binfmt"
 
 config COMPAT
 	bool
-	depends on AARCH32_EL0
+	depends on AARCH32_EL0 || ARM64_ILP32
 
 config AARCH32_EL0
 	bool "Kernel support for 32-bit EL0"
@@ -1018,6 +1018,14 @@ config AARCH32_EL0
 
 	  If you want to execute 32-bit userspace applications, say Y.
 
+config ARM64_ILP32
+	bool "Kernel support for ILP32"
+	select COMPAT
+	help
+	  This option enables support for AArch64 ILP32 user space.  ILP32
+	  is an ABI where long and pointers are 32 bits wide, but which uses the
+	  AArch64 instruction set.
+
 config SYSVIPC_COMPAT
 	def_bool y
 	depends on COMPAT && SYSVIPC
-- 
2.7.4

* Re: [PATCH 01/18] 32-bit ABI: introduce ARCH_32BIT_OFF_T config option
  2016-10-21 20:33 ` [PATCH 01/18] 32-bit ABI: introduce ARCH_32BIT_OFF_T config option Yury Norov
@ 2016-10-24 16:30   ` Chris Metcalf
  2016-10-24 22:22     ` Arnd Bergmann
  0 siblings, 1 reply; 64+ messages in thread
From: Chris Metcalf @ 2016-10-24 16:30 UTC (permalink / raw)
  To: Yury Norov, arnd, catalin.marinas, linux-arm-kernel,
	linux-kernel, linux-doc, linux-arch
  Cc: schwidefsky, heiko.carstens, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1

On 10/21/2016 4:33 PM, Yury Norov wrote:
> All new 32-bit architectures should have 64-bit off_t type, but existing
> architectures has 32-bit ones.
>
> [...]
> For syscalls sys_openat() and sys_open_by_handle_at() force_o_largefile()
> is called, to set O_LARGEFILE flag, and this is the only difference
> comparing to compat versions. All compat ABIs are already turned to use
> 64-bit off_t, except tile. So, compat versions for this syscalls are not
> needed anymore. Tile is handled explicitly.
>
> [...]
> --- a/arch/tile/kernel/compat.c
> +++ b/arch/tile/kernel/compat.c
> @@ -103,6 +103,9 @@ COMPAT_SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned int, offset_high,
>   #define compat_sys_readahead sys32_readahead
>   #define sys_llseek compat_sys_llseek
>   
> +#define sys_openat             compat_sys_openat
> +#define sys_open_by_handle_at  compat_sys_open_by_handle_at
> +
>   /* Call the assembly trampolines where necessary. */
>   #define compat_sys_rt_sigreturn _compat_sys_rt_sigreturn
>   #define sys_clone _sys_clone

This patch accomplishes two goals that could be completely separated.
It's confusing to have them mixed in the same patch without any
discussion of why they are in the same patch.

First, you want to modify the default <asm/unistd.h> behavior for
compat syscalls so that the default is sys_openat (etc) rather than
the existing compat_sys_openat, and then use that new behavior for
arm64 ILP32.  This lets you force O_LARGEFILE for arm64 ILP32 to
support having a 64-bit off_t at all times.  To do that, you fix the
asm-generic header, and then make tile have a special override.
This seems reasonable enough.

Second, you introduce ARCH_32BIT_OFF_T basically as a synonym for
"BITS_PER_WORD == 32", so that new 32-bit architectures can choose not
to enable it.  This is fine in the abstract, but I'm a bit troubled by
the fact that you are not actually introducing a new 32-bit
architecture here (just a new 32-bit mode for the arm 64-bit kernel).
Shouldn't this part of the change wait until someone actually has a
new 32-bit kernel to drive this forward?

If you want to push forward the ARCH_32BIT_OFF_T change in the absence
of an architecture that supports it, I would think it would be a lot
less confusing to have these two in separate patches, and make it
clear that the ARCH_32BIT_OFF_T change is just laying groundwork
for some hypothetical future architecture.

The existing commit language itself is also confusing. You write "All
compat ABIs are already turned to use 64-bit off_t, except tile."
First, I'm not sure what you mean by "turned" here.  And, tile is just
one of many compat ABIs that allow O_LARGEFILE not to be part of the
open call: see arm64's AArch32 ABI, MIPS o32, s390 31-bit emulation,
sparc64's 32-bit mode, and of course x86's 32-bit compat mode.
Presumably your point here is that tile is the only pre-existing
architecture that #includes <asm/unistd.h> to create its compat
syscall table, and so I think "all except tile" here is particularly
confusing, since there are no architectures except tile that use the
__SYSCALL_COMPAT functionality in the current tree.
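
The difference between the native and compat open paths discussed above
can be sketched in plain C.  This is a user-space model, not kernel code:
all sketch_ names and the flag value are hypothetical, and tying the
"force" decision to a per-ABI 32-bit off_t switch is an assumption about
what ARCH_32BIT_OFF_T is meant to select.

```c
#include <stdbool.h>

/* Hypothetical flag value for illustration; the real O_LARGEFILE
 * value is architecture-specific. */
#define SKETCH_O_LARGEFILE 0x8000u

/* Hypothetical per-ABI switch standing in for ARCH_32BIT_OFF_T. */
struct sketch_abi {
	bool off_t_is_32bit;
};

/* Model of force_o_largefile(): force the flag unless off_t is 32-bit. */
bool sketch_force_o_largefile(const struct sketch_abi *abi)
{
	return !abi->off_t_is_32bit;
}

/* Model of the native open path: the flag may be added on entry. */
unsigned int sketch_native_open_flags(const struct sketch_abi *abi,
				      unsigned int flags)
{
	if (sketch_force_o_largefile(abi))
		flags |= SKETCH_O_LARGEFILE;
	return flags;
}

/* Model of the compat open path: flags are passed through unchanged. */
unsigned int sketch_compat_open_flags(unsigned int flags)
{
	return flags;
}
```

With off_t_is_32bit set to false (the ILP32 case in this model), the
native path always sets the flag, so routing ILP32 through sys_openat
rather than compat_sys_openat is enough to get O_LARGEFILE behavior.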

-- 
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 02/18] arm64: ilp32: add documentation on the ILP32 ABI for ARM64
  2016-10-21 20:33 ` [PATCH 02/18] arm64: ilp32: add documentation on the ILP32 ABI for ARM64 Yury Norov
@ 2016-10-24 16:36   ` Chris Metcalf
  2016-10-27  9:40     ` Yury Norov
  0 siblings, 1 reply; 64+ messages in thread
From: Chris Metcalf @ 2016-10-24 16:36 UTC (permalink / raw)
  To: Yury Norov, arnd, catalin.marinas, linux-arm-kernel,
	linux-kernel, linux-doc, linux-arch
  Cc: schwidefsky, heiko.carstens, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1

On 10/21/2016 4:33 PM, Yury Norov wrote:
> Based on Andrew Pinski's patch-series.
>
> Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
> ---
>   Documentation/arm64/ilp32.txt | 46 +++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 46 insertions(+)
>   create mode 100644 Documentation/arm64/ilp32.txt
>
> diff --git a/Documentation/arm64/ilp32.txt b/Documentation/arm64/ilp32.txt
> new file mode 100644
> index 0000000..b96c18f
> --- /dev/null
> +++ b/Documentation/arm64/ilp32.txt
> @@ -0,0 +1,46 @@
> +ILP32 AARCH64 SYSCALL ABI
> +=========================
> +
> +This document describes the ILP32 syscall ABI and where it differs
> +from the generic compat linux syscall interface.
> +
> +AARCH64/ILP32 userspace can potentially access top halves of registers that
> +are passed as syscall arguments, so such registers (w0-w7) are deloused.

I'm not sure what "potentially access" here means: I think what you want to say
is that userspace can pass garbage in the top half, but you should be clearer about
what you mean here.  Also, you shouldn't use "deloused" here, since it's not a term
that's defined elsewhere in the kernel, even though it's been used colloquially on LKML.
Provide an actual implementation definition, like "have their top 32 bits zeroed".

> +AARCH64/ILP32 provides next types turned to 64-bit (comparing to AARCH32):

What does "turned" mean here?  And "next types" isn't standard English; you want
to say something like "the following types".  Likewise later with "next syscalls".
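
As a concrete reading of "have their top 32 bits zeroed", here is a
minimal C sketch that models the eight argument registers as 64-bit
values.  The function and macro names are hypothetical; the series itself
does this in the assembler entry code, which is not reproduced here.

```c
#include <stdint.h>

/* The documentation text quoted above names w0-w7. */
#define SKETCH_NR_SYSCALL_REGS 8

/* Keep only the low 32 bits (the wN view) of each 64-bit xN register. */
void sketch_zero_top_halves(uint64_t regs[SKETCH_NR_SYSCALL_REGS])
{
	for (int i = 0; i < SKETCH_NR_SYSCALL_REGS; i++)
		regs[i] &= UINT64_C(0xffffffff);
}
```

After the call, any garbage that userspace left in bits 63:32 of a
register is gone, and the low halves are unchanged.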

-- 
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 01/18] 32-bit ABI: introduce ARCH_32BIT_OFF_T config option
  2016-10-24 16:30   ` Chris Metcalf
@ 2016-10-24 22:22     ` Arnd Bergmann
  2016-10-27  9:29       ` Yury Norov
  0 siblings, 1 reply; 64+ messages in thread
From: Arnd Bergmann @ 2016-10-24 22:22 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Yury Norov, catalin.marinas, linux-arm-kernel, linux-kernel,
	linux-doc, linux-arch, schwidefsky, heiko.carstens, pinskia,
	broonie, joseph, christoph.muellner, bamvor.zhangjian,
	szabolcs.nagy, klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor,
	kilobyte, geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1

On Monday, October 24, 2016 12:30:47 PM CEST Chris Metcalf wrote:
> On 10/21/2016 4:33 PM, Yury Norov wrote:
> > All new 32-bit architectures should have 64-bit off_t type, but existing
> > architectures has 32-bit ones.
> >
> > [...]
> > For syscalls sys_openat() and sys_open_by_handle_at() force_o_largefile()
> > is called, to set O_LARGEFILE flag, and this is the only difference
> > comparing to compat versions. All compat ABIs are already turned to use
> > 64-bit off_t, except tile. So, compat versions for this syscalls are not
> > needed anymore. Tile is handled explicitly.
> >
> > [...]
> > --- a/arch/tile/kernel/compat.c
> > +++ b/arch/tile/kernel/compat.c
> > @@ -103,6 +103,9 @@ COMPAT_SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned int, offset_high,
> >   #define compat_sys_readahead sys32_readahead
> >   #define sys_llseek compat_sys_llseek
> >   
> > +#define sys_openat             compat_sys_openat
> > +#define sys_open_by_handle_at  compat_sys_open_by_handle_at
> > +
> >   /* Call the assembly trampolines where necessary. */
> >   #define compat_sys_rt_sigreturn _compat_sys_rt_sigreturn
> >   #define sys_clone _sys_clone
> 
> This patch accomplishes two goals that could be completely separated.
> It's confusing to have them mixed in the same patch without any
> discussion of why they are in the same patch.
> 
> First, you want to modify the default <asm/unistd.h> behavior for
> compat syscalls so that the default is sys_openat (etc) rather than
> the existing compat_sys_openat, and then use that new behavior for
> arm64 ILP32.  This lets you force O_LARGEFILE for arm64 ILP32 to
> support having a 64-bit off_t at all times.  To do that, you fix the
> asm-generic header, and then make tile have a special override.
> This seems reasonable enough.
> 
> Second, you introduce ARCH_32BIT_OFF_T basically as a synonym for
> "BITS_PER_WORD == 32", so that new 32-bit architectures can choose not
> to enable it.  This is fine in the abstract, but I'm a bit troubled by
> the fact that you are not actually introducing a new 32-bit
> architecture here (just a new 32-bit mode for the arm 64-bit kernel).
> Shouldn't this part of the change wait until someone actually has a
> new 32-bit kernel to drive this forward?

I asked for this specifically because we identified the problem
during the review of the aarch64 ilp32 code, and it might not
be noticed in the next architecture submission.

The most important aspect from my perspective is that the new
ilp32 ABI on aarch64 behaves the same way that any native 32-bit
architecture does, and when we change the default, it should
be done for both compat mode and native mode at the same time.

> If you want to push forward the ARCH_32BIT_OFF_T change in the absence
> of an architecture that supports it, I would think it would be a lot
> less confusing to have these two in separate patches, and make it
> clear that the ARCH_32BIT_OFF_T change is just laying groundwork
> for some hypothetical future architecture.
> 
> The existing commit language itself is also confusing. You write "All
> compat ABIs are already turned to use 64-bit off_t, except tile."
> First, I'm not sure what you mean by "turned" here.  And, tile is just
> one of many compat ABIs that allow O_LARGEFILE not to be part of the
> open call: see arm64's AArch32 ABI, MIPS o32, s390 31-bit emulation,
> sparc64's 32-bit mode, and of course x86's 32-bit compat mode.
> Presumably your point here is that tile is the only pre-existing
> architecture that #includes <asm/unistd.h> to create its compat
> syscall table, and so I think "all except tile" here is particularly
> confusing, since there are no architectures except tile that use the
> __SYSCALL_COMPAT functionality in the current tree.

Agreed, this could be made clearer, and splitting the patch up
in two also seems reasonable, though I didn't see it as important.

	Arnd

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 01/18] 32-bit ABI: introduce ARCH_32BIT_OFF_T config option
  2016-10-24 22:22     ` Arnd Bergmann
@ 2016-10-27  9:29       ` Yury Norov
  0 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-10-27  9:29 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Chris Metcalf, catalin.marinas, linux-arm-kernel, linux-kernel,
	linux-doc, linux-arch, schwidefsky, heiko.carstens, pinskia,
	broonie, joseph, christoph.muellner, bamvor.zhangjian,
	szabolcs.nagy, klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor,
	kilobyte, geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1

On Tue, Oct 25, 2016 at 12:22:47AM +0200, Arnd Bergmann wrote:
> On Monday, October 24, 2016 12:30:47 PM CEST Chris Metcalf wrote:
> > On 10/21/2016 4:33 PM, Yury Norov wrote:
> > > All new 32-bit architectures should have 64-bit off_t type, but existing
> > > architectures has 32-bit ones.
> > >
> > > [...]
> > > For syscalls sys_openat() and sys_open_by_handle_at() force_o_largefile()
> > > is called, to set O_LARGEFILE flag, and this is the only difference
> > > comparing to compat versions. All compat ABIs are already turned to use
> > > 64-bit off_t, except tile. So, compat versions for this syscalls are not
> > > needed anymore. Tile is handled explicitly.
> > >
> > > [...]
> > > --- a/arch/tile/kernel/compat.c
> > > +++ b/arch/tile/kernel/compat.c
> > > @@ -103,6 +103,9 @@ COMPAT_SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned int, offset_high,
> > >   #define compat_sys_readahead sys32_readahead
> > >   #define sys_llseek compat_sys_llseek
> > >   
> > > +#define sys_openat             compat_sys_openat
> > > +#define sys_open_by_handle_at  compat_sys_open_by_handle_at
> > > +
> > >   /* Call the assembly trampolines where necessary. */
> > >   #define compat_sys_rt_sigreturn _compat_sys_rt_sigreturn
> > >   #define sys_clone _sys_clone
> > 
> > This patch accomplishes two goals that could be completely separated.
> > It's confusing to have them mixed in the same patch without any
> > discussion of why they are in the same patch.
> > 
> > First, you want to modify the default <asm/unistd.h> behavior for
> > compat syscalls so that the default is sys_openat (etc) rather than
> > the existing compat_sys_openat, and then use that new behavior for
> > arm64 ILP32.  This lets you force O_LARGEFILE for arm64 ILP32 to
> > support having a 64-bit off_t at all times.  To do that, you fix the
> > asm-generic header, and then make tile have a special override.
> > This seems reasonable enough.
> > 
> > Second, you introduce ARCH_32BIT_OFF_T basically as a synonym for
> > "BITS_PER_WORD == 32", so that new 32-bit architectures can choose not
> > to enable it.  This is fine in the abstract, but I'm a bit troubled by
> > the fact that you are not actually introducing a new 32-bit
> > architecture here (just a new 32-bit mode for the arm 64-bit kernel).
> > Shouldn't this part of the change wait until someone actually has a
> > new 32-bit kernel to drive this forward?
> 
> I asked for this specifically because we identified the problem
> during the review of the aarch64 ilp32 code, and it might not
> be noticed in the next architecture submission.
> 
> The most important aspect from my perspective is that the new
> ilp32 ABI on aarch64 behaves the same way that any native 32-bit
> architecture does, and when we change the default, it should
> be done for both compat mode and native mode at the same time.
> 
> > If you want to push forward the ARCH_32BIT_OFF_T change in the absence
> > of an architecture that supports it, I would think it would be a lot
> > less confusing to have these two in separate patches, and make it
> > clear that the ARCH_32BIT_OFF_T change is just laying groundwork
> > for some hypothetical future architecture.
> > 
> > The existing commit language itself is also confusing. You write "All
> > compat ABIs are already turned to use 64-bit off_t, except tile."
> > First, I'm not sure what you mean by "turned" here.  And, tile is just
> > one of many compat ABIs that allow O_LARGEFILE not to be part of the
> > open call: see arm64's AArch32 ABI, MIPS o32, s390 31-bit emulation,
> > sparc64's 32-bit mode, and of course x86's 32-bit compat mode.
> > Presumably your point here is that tile is the only pre-existing
> > architecture that #includes <asm/unistd.h> to create its compat
> > syscall table, and so I think "all except tile" here is particularly
> > confusing, since there are no architectures except tile that use the
> > __SYSCALL_COMPAT functionality in the current tree.
> 
> Agreed, this could be made clearer, and splitting the patch up
> in two also seems reasonable, though I didn't see it as important.
> 
> 	Arnd

Previously this was a separate series of 2 patches, and it was even
acked by Arnd, but it was not submitted:
http://lists-archives.com/linux-kernel/28471253-32-bit-abi-introduce-arch_32bit_off_t-config-option.html

I can restore that small series within aarch64/ilp32 for the next iteration,
or resend it separately if you would rather submit it before aarch64/ilp32
(which I would prefer).

Yury

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 02/18] arm64: ilp32: add documentation on the ILP32 ABI for ARM64
  2016-10-24 16:36   ` Chris Metcalf
@ 2016-10-27  9:40     ` Yury Norov
  0 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-10-27  9:40 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch, schwidefsky, heiko.carstens, pinskia, broonie,
	joseph, christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1

Hi Chris,

Thank you for the comments.

On Mon, Oct 24, 2016 at 12:36:27PM -0400, Chris Metcalf wrote:
> On 10/21/2016 4:33 PM, Yury Norov wrote:
> >Based on Andrew Pinski's patch-series.
> >
> >Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
> >---
> >  Documentation/arm64/ilp32.txt | 46 +++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 46 insertions(+)
> >  create mode 100644 Documentation/arm64/ilp32.txt
> >
> >diff --git a/Documentation/arm64/ilp32.txt b/Documentation/arm64/ilp32.txt
> >new file mode 100644
> >index 0000000..b96c18f
> >--- /dev/null
> >+++ b/Documentation/arm64/ilp32.txt
> >@@ -0,0 +1,46 @@
> >+ILP32 AARCH64 SYSCALL ABI
> >+=========================
> >+
> >+This document describes the ILP32 syscall ABI and where it differs
> >+from the generic compat linux syscall interface.
> >+
> >+AARCH64/ILP32 userspace can potentially access top halves of registers that
> >+are passed as syscall arguments, so such registers (w0-w7) are deloused.
> 
> I'm not sure what "potentially access" here means: I think what you want to say
> is that userspace can pass garbage in the top half, but you should be clearer about
> what you mean here. 

Yes. Will change.

> Also, you shouldn't use "deloused" here, since it's not a term
> that's defined elsewhere in the kernel, even though it's been used colloquially on LKML.
> Provide an actual implementation definition, like "have their top 32 bits zeroed".

Agreed.
In fact, 'delouse' is used in the name of the corresponding macro in
include/linux/compat.h:

#ifndef __SC_DELOUSE
#define __SC_DELOUSE(t,v) ((t)(unsigned long)(v))
#endif

But it's not appropriate for documentation.
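
For illustration, a small user-space sketch of what the cast in that
macro does when v holds a full 64-bit register value and the declared
argument type t is 32 bits wide.  SKETCH_DELOUSE copies the macro body
quoted above under a non-reserved name; the helper function and the
choice of argument types are assumptions made for this example.

```c
#include <stdint.h>

/* Same body as the __SC_DELOUSE definition quoted above. */
#define SKETCH_DELOUSE(t, v) ((t)(unsigned long)(v))

/* Narrow a raw register value to a 32-bit argument: the cast to the
 * 32-bit type discards whatever was in the top half. */
uint32_t sketch_delouse_u32(uint64_t reg)
{
	return SKETCH_DELOUSE(uint32_t, reg);
}
```

The top half is dropped as a side effect of the type conversion, so in
this default form the result depends entirely on the declared type t.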

> 
> >+AARCH64/ILP32 provides next types turned to 64-bit (comparing to AARCH32):
> 
> What does "turned" mean here?  And "next types" isn't standard English; you want
> to say something like "the following types".  Likewise later with "next syscalls".

Thanks, will change.

Yury

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: ILP32 for ARM64 - testing with lmbench
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (17 preceding siblings ...)
  2016-10-21 20:33 ` [PATCH 18/18] arm64:ilp32: add ARM64_ILP32 to Kconfig Yury Norov
@ 2016-10-28 12:46 ` Yury Norov
  2016-11-17  3:28   ` Zhangjian (Bamvor)
  2016-11-07  8:23 ` ILP32 for ARM64: testing with glibc testsuite Yury Norov
                   ` (2 subsequent siblings)
  21 siblings, 1 reply; 64+ messages in thread
From: Yury Norov @ 2016-10-28 12:46 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf, sellcey

[Add Steve Ellcey, thanks for testing on ThunderX]

Lmbench-3.0-a9 testing was performed on a ThunderX machine to check that
the ILP32 series does not add performance regressions for LP64. The test
summary is in the table below. Our measurements don't show a
significant performance regression for LP64 if the ILP32 code is merged,
whether it is enabled or disabled.

               ILP32 enabled      ILP32 disabled     Standard kernel
null syscall   0.1066 ( 95.09%)   0.1121 (100.00%)   0.1121
stat           1.3947 (100.60%)   1.3814 ( 99.64%)   1.3864
fstat          0.4459 ( 98.56%)   0.4344 ( 96.02%)   0.4524
open/close     4.0606 (100.38%)   4.0411 ( 99.90%)   4.0453
read           0.4819 ( 96.11%)   0.5014 (100.00%)   0.5014

(Percentages are relative to the standard kernel.)

Tested with Linux 4.8 because 4.9-rc1 is not yet fixed for ThunderX.
Other system details are below.
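
The percentages in the table are consistent with dividing each
measurement by the standard-kernel value in the same row, for example
0.1066 / 0.1121 for the null syscall with ILP32 enabled.  A minimal C
sketch of that calculation (the function name is hypothetical):

```c
/* Measurement on a patched kernel as a percentage of the baseline
 * (standard kernel) measurement from the same row. */
double sketch_percent_of_baseline(double measured, double baseline)
{
	return 100.0 * measured / baseline;
}
```

A value below 100% means the patched kernel reported a smaller number
than the standard kernel for that test.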

Yury.

ubuntu@crb6:~$ uname -a
Linux crb6 4.8.0+ #3 SMP Thu Oct 27 11:01:32 PDT 2016 aarch64 aarch64 aarch64 GNU/Linux

ubuntu@crb6:~$ cat /proc/meminfo
MemTotal:       132011948 kB
MemFree:        131442672 kB
MemAvailable:   130695764 kB
Buffers:           15696 kB
Cached:            88088 kB
SwapCached:            0 kB
Active:            82760 kB
Inactive:          41336 kB
Active(anon):      20880 kB
Inactive(anon):     8576 kB
Active(file):      61880 kB
Inactive(file):    32760 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      128920572 kB
SwapFree:       128920572 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:         20544 kB
Mapped:            19780 kB
Shmem:              9060 kB
Slab:              78804 kB
SReclaimable:      27372 kB
SUnreclaim:        51432 kB
KernelStack:        8336 kB
PageTables:          820 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    194926544 kB
Committed_AS:     256324 kB
VmallocTotal:   135290290112 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

ubuntu@crb6:~$ cat /proc/cpuinfo
processor	: 0
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 1
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 2
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 3
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 4
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 5
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 6
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 7
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 8
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 9
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 10
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 11
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 12
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 13
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 14
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 15
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 16
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 17
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 18
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 19
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 20
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 21
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 22
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 23
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 24
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 25
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 26
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 27
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 28
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 29
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 30
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 31
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 32
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 33
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 34
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 35
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 36
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 37
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 38
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 39
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 40
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 41
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 42
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 43
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 44
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 45
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 46
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

processor	: 47
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 0

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: ILP32 for ARM64: testing with glibc testsuite
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (18 preceding siblings ...)
  2016-10-28 12:46 ` ILP32 for ARM64 - testing with lmbench Yury Norov
@ 2016-11-07  8:23 ` Yury Norov
  2016-11-09  9:56   ` Yury Norov
  2016-11-30  5:02 ` [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
  2016-12-18  7:08 ` Yury Norov
  21 siblings, 1 reply; 64+ messages in thread
From: Yury Norov @ 2016-11-07  8:23 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch, GNU C Library
  Cc: schwidefsky, heiko.carstens, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf,
	Adhemerval Zanella, Steve Ellcey

[-- Attachment #1: Type: text/plain, Size: 2684 bytes --]

Hi all,

[add libc-alpha mail list]

For libc-alpha: this is part of the LKML submission with the latest
patches for aarch64/ilp32.
https://www.spinics.net/lists/arm-kernel/msg537846.html

The glibc that I use also includes consolidation patches from Adhemerval
Zanella and me that are not yet in glibc master. The full series is:
https://github.com/norov/glibc/tree/ilp32-2.24-dev2

Below are the results of a glibc testsuite run for aarch64/lp64
in different configurations. The column names mean:
kvgv: kernel is vanilla, glibc is vanilla;
kdgv: kernel has ilp32 patches applied, but ilp32 is disabled in config; 
      glibc is vanilla;
kegv: kernel has ilp32 patches applied and ilp32 is enabled, glibc is vanilla;
kege: kernel patches are applied and enabled, glibc patches are applied.

Only lines that differ are shown. Full results are in the attached archive.
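
For anyone who wants to rebuild a table of differing lines from summary
files, here is a minimal sketch. It is not the tooling used for this report.
It assumes each run is available as text with one `STATUS: test-name` line
per test, and the names `parse_sum` and `diff_rows` are made up for
illustration. A test missing from a run is reported as NA.

```python
def parse_sum(text):
    """Map test name -> status from lines like 'PASS: elf/tst-tls1'."""
    results = {}
    for line in text.splitlines():
        status, sep, name = line.partition(": ")
        if sep and name.strip():
            results[name.strip()] = status.strip()
    return results


def diff_rows(runs):
    """runs: dict of column label -> parsed results (insertion order kept).

    Returns (test, [status per column]) for every test whose status is not
    identical in all columns; a test absent from a run is reported as NA.
    """
    all_tests = sorted(set().union(*(r.keys() for r in runs.values())))
    rows = []
    for test in all_tests:
        statuses = [run.get(test, "NA") for run in runs.values()]
        if len(set(statuses)) > 1:
            rows.append((test, statuses))
    return rows


# Toy inputs for illustration only, not real testsuite output files.
kvgv = parse_sum("PASS: csu/tst-atomic\nFAIL: malloc/tst-trim1\nPASS: elf/tst-tls1\n")
kege = parse_sum("FAIL: csu/tst-atomic\nPASS: malloc/tst-trim1\nPASS: elf/tst-tls1\n")

for test, statuses in diff_rows({"kvgv": kvgv, "kege": kege}):
    print(test, *statuses, sep="\t")
```

Tests with the same status in every column (elf/tst-tls1 in the toy input)
are dropped, which keeps the table short.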

I haven't analyzed the regressions in depth yet, so any ideas or suggestions are appreciated.

Yury.

Test					kvgv	kdgv	kegv	kege
conform/ISO/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL
conform/ISO11/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL
conform/ISO99/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL
conform/POSIX/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL
conform/POSIX/sys/stat.h/linknamespace	PASS 	PASS 	PASS	FAIL
conform/UNIX98/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL
conform/XOPEN2K/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL
conform/XPG3/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL
conform/XPG4/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL
csu/tst-atomic				PASS 	PASS 	PASS	FAIL
elf/check-localplt			PASS 	PASS 	PASS	FAIL
iconvdata/mtrace-tst-loading		PASS	FAIL	PASS 	PASS
iconvdata/tst-loading			PASS	FAIL	PASS 	PASS
io/check-installed-headers-c		PASS 	PASS 	PASS	FAIL
io/check-installed-headers-cxx		PASS 	PASS 	PASS	FAIL
malloc/tst-malloc-backtrace		FAIL	PASS	PASS	PASS
malloc/tst-malloc-thread-exit		FAIL	PASS	PASS	PASS
malloc/tst-malloc-usable		FAIL	PASS	PASS	PASS
malloc/tst-mallocfork			FAIL	PASS	PASS	PASS
malloc/tst-mallocstate			FAIL	PASS	PASS	PASS
malloc/tst-mallopt			FAIL	PASS	PASS	PASS
malloc/tst-mcheck			FAIL	PASS	PASS	PASS
malloc/tst-memalign			FAIL	PASS	PASS	PASS
malloc/tst-obstack			FAIL	PASS	PASS	PASS
malloc/tst-posix_memalign		FAIL	PASS	PASS	PASS
malloc/tst-pvalloc			FAIL	PASS	PASS	PASS
malloc/tst-realloc			FAIL	PASS	PASS	PASS
malloc/tst-scratch_buffer		FAIL	PASS	PASS	PASS
malloc/tst-trim1			FAIL	PASS	PASS	PASS
nptl/tst-eintr4				PASS 	PASS 	PASS	NA
posix/tst-regex2			PASS	FAIL	FAIL	FAIL
posix/tst-getaddrinfo4			PASS	PASS	FAIL	FAIL
posix/tst-getaddrinfo5			PASS	PASS	FAIL	FAIL
sysvipc/test-sysvmsg			NA	NA	NA	FAIL
sysvipc/test-sysvsem			NA	NA	NA	FAIL
sysvipc/test-sysvshm			NA	NA	NA	FAIL

[-- Attachment #2: lp64.sum.tar.gz --]
[-- Type: application/gzip, Size: 37007 bytes --]

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: ILP32 for ARM64: testing with glibc testsuite
  2016-11-07  8:23 ` ILP32 for ARM64: testing with glibc testsuite Yury Norov
@ 2016-11-09  9:56   ` Yury Norov
  2016-11-16 11:22     ` Maxim Kuvyrkov
  0 siblings, 1 reply; 64+ messages in thread
From: Yury Norov @ 2016-11-09  9:56 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch, GNU C Library
  Cc: schwidefsky, heiko.carstens, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf,
	Adhemerval Zanella, Steve Ellcey

On Mon, Nov 07, 2016 at 01:53:59PM +0530, Yury Norov wrote:
> Hi all,
> 
> [add libc-alpha mail list]
> 
> For libc-alpha: this is the part of LKML submission with latest
> patches for aarch64/ilp32.
> https://www.spinics.net/lists/arm-kernel/msg537846.html
> 
> Glibc that I use has also included consolidation patches from Adhemerval
> Zanella and me that are still not in the glibc master. The full series is:
> https://github.com/norov/glibc/tree/ilp32-2.24-dev2
> 
> Below is the results of glibc testsuite run for aarch64/lp64
> in different configurations. Column names meaning:
> kvgv: kernel is vanilla, glibc is vanilla;
> kdgv: kernel has ilp32 patches applied, but ilp32 is disabled in config; 
>       glibc is vanilla;
> kegv: kernel has ilp32 patches applied and ilp32 is enabled, glibc is vanilla;
> kege: kernel patches are applied and enabled, glibc patches are applied.
> 
> Only different lines are shown. Full results are in attached archive. 
 
The same, plus ILP32 regressions:

Test					kvgv	kdgv	kegv	kege	ilp32
conform/ISO/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
conform/ISO11/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
conform/ISO99/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
conform/POSIX/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
conform/POSIX/sys/stat.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
conform/UNIX98/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
conform/XOPEN2K/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
conform/XPG3/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
conform/XPG4/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
csu/tst-atomic				PASS 	PASS 	PASS	FAIL	PASS
elf/check-localplt			PASS 	PASS 	PASS	FAIL	FAIL
iconvdata/mtrace-tst-loading		PASS	FAIL	PASS 	PASS	FAIL
iconvdata/tst-loading			PASS	FAIL	PASS 	PASS	PASS
io/check-installed-headers-c		PASS 	PASS 	PASS	FAIL	FAIL
io/check-installed-headers-cxx		PASS 	PASS 	PASS	FAIL	FAIL
malloc/tst-malloc-backtrace		FAIL	PASS	PASS	PASS	PASS
malloc/tst-malloc-thread-exit		FAIL	PASS	PASS	PASS	PASS
malloc/tst-malloc-usable		FAIL	PASS	PASS	PASS	PASS
malloc/tst-mallocfork			FAIL	PASS	PASS	PASS	PASS
malloc/tst-mallocstate			FAIL	PASS	PASS	PASS	PASS
malloc/tst-mallopt			FAIL	PASS	PASS	PASS	PASS
malloc/tst-mcheck			FAIL	PASS	PASS	PASS	PASS
malloc/tst-memalign			FAIL	PASS	PASS	PASS	PASS
malloc/tst-obstack			FAIL	PASS	PASS	PASS	PASS
malloc/tst-posix_memalign		FAIL	PASS	PASS	PASS	PASS
malloc/tst-pvalloc			FAIL	PASS	PASS	PASS	PASS
malloc/tst-realloc			FAIL	PASS	PASS	PASS	PASS
malloc/tst-scratch_buffer		FAIL	PASS	PASS	PASS	PASS
malloc/tst-trim1			FAIL	PASS	PASS	PASS	PASS
nptl/tst-eintr4				PASS 	PASS 	PASS	NA	NA
posix/tst-regex2			PASS	FAIL	FAIL	FAIL	FAIL
posix/tst-getaddrinfo4			PASS	PASS	FAIL	FAIL	PASS
posix/tst-getaddrinfo5			PASS	PASS	FAIL	FAIL	PASS
sysvipc/test-sysvmsg			NA	NA	NA	FAIL	PASS
sysvipc/test-sysvsem			NA	NA	NA	FAIL	PASS
sysvipc/test-sysvshm			NA	NA	NA	FAIL	PASS

c++-types-check				PASS	PASS	PASS	PASS	FAIL
debug/tst-backtrace4			PASS	PASS	PASS	PASS	FAIL
elf/check-abi-libc			PASS	PASS	PASS	PASS	FAIL
elf/tst-tls1				PASS	PASS	PASS	PASS	FAIL
elf/tst-tls1-static			PASS	PASS	PASS	PASS	FAIL
elf/tst-tls2				PASS	PASS	PASS	PASS	FAIL
elf/tst-tls2-static			PASS	PASS	PASS	PASS	FAIL
elf/tst-tls3				PASS	PASS	PASS	PASS	FAIL
math/check-abi-libm			PASS	PASS	PASS	PASS	FAIL
misc/tst-writev				PASS	PASS	PASS	PASS	NA
nptl/tst-cancel-self-canceltype		PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel1			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel10			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel11			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel13			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel15			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel16			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel17			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel18			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel2			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel20			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel21			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel24			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel25			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel26			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel27			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel3			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel4			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel5			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel6			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancel7			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx10			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx11			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx13			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx15			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx16			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx17			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx18			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx2			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx20			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx21			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx3			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx4			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx5			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx6			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cancelx7			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cleanup4			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cleanupx4			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cond-except			PASS	PASS	PASS	PASS	FAIL
nptl/tst-cond7				PASS	PASS	PASS	PASS	FAIL
nptl/tst-cond8				PASS	PASS	PASS	PASS	FAIL
nptl/tst-fini1				PASS	PASS	PASS	PASS	FAIL
nptl/tst-initializers1			PASS	PASS	PASS	PASS	FAIL
nptl/tst-initializers1-c11		PASS	PASS	PASS	PASS	FAIL
nptl/tst-initializers1-c89		PASS	PASS	PASS	PASS	FAIL
nptl/tst-initializers1-c99		PASS	PASS	PASS	PASS	FAIL
nptl/tst-initializers1-gnu11		PASS	PASS	PASS	PASS	FAIL
nptl/tst-initializers1-gnu89		PASS	PASS	PASS	PASS	FAIL
nptl/tst-initializers1-gnu99		PASS	PASS	PASS	PASS	FAIL
nptl/tst-join5				PASS	PASS	PASS	PASS	FAIL
nptl/tst-key3				PASS	PASS	PASS	PASS	FAIL
nptl/tst-mutex8				PASS	PASS	PASS	PASS	FAIL
nptl/tst-mutexpi8			PASS	PASS	PASS	PASS	FAIL
nptl/tst-once3				PASS	PASS	PASS	PASS	FAIL
nptl/tst-once4				PASS	PASS	PASS	PASS	FAIL
nptl/tst-oncex3				PASS	PASS	PASS	PASS	FAIL
nptl/tst-oncex4				PASS	PASS	PASS	PASS	FAIL
nptl/tst-rwlock15			PASS	PASS	PASS	PASS	FAIL
nptl/tst-rwlock8			PASS	PASS	PASS	PASS	FAIL
nptl/tst-rwlock9			PASS	PASS	PASS	PASS	FAIL
nptl/tst-sem11				PASS	PASS	PASS	PASS	FAIL
nptl/tst-sem12				PASS	PASS	PASS	PASS	FAIL
posix/bug-regex24			PASS	PASS	PASS	PASS	FAIL
rt/tst-mqueue1				PASS	PASS	PASS	PASS	FAIL
rt/tst-mqueue2				PASS	PASS	PASS	PASS	FAIL
rt/tst-mqueue4				PASS	PASS	PASS	PASS	FAIL
rt/tst-mqueue7				PASS	PASS	PASS	PASS	FAIL
rt/tst-mqueue8				PASS	PASS	PASS	PASS	FAIL
rt/tst-mqueue8x				PASS	PASS	PASS	PASS	FAIL
stdlib/tst-makecontext3			PASS	PASS	PASS	PASS	FAIL

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: ILP32 for ARM64: testing with glibc testsuite
  2016-11-09  9:56   ` Yury Norov
@ 2016-11-16 11:22     ` Maxim Kuvyrkov
  2016-11-17 15:50       ` Catalin Marinas
  2016-11-17 21:45       ` Steve Ellcey
  0 siblings, 2 replies; 64+ messages in thread
From: Maxim Kuvyrkov @ 2016-11-16 11:22 UTC (permalink / raw)
  To: Yury Norov
  Cc: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch, GNU C Library, schwidefsky, heiko.carstens,
	Andrew Pinski, broonie, Joseph S. Myers, christoph.muellner,
	bamvor.zhangjian, Szabolcs Nagy, klimov.linux, Nathan_Lynch,
	agraf, Prasun Kapoor, kilobyte, Geert Uytterhoeven,
	Dr. Philipp Tomsich, manuel.montezelo, linyongting, davem,
	zhouchengming1, cmetcalf, Adhemerval Zanella, Steve Ellcey

> On Nov 9, 2016, at 1:56 PM, Yury Norov <ynorov@caviumnetworks.com> wrote:
> 
> On Mon, Nov 07, 2016 at 01:53:59PM +0530, Yury Norov wrote:
>> Hi all,
>> 
>> [add libc-alpha mail list]
>> 
>> For libc-alpha: this is the part of LKML submission with latest
>> patches for aarch64/ilp32.
>> https://www.spinics.net/lists/arm-kernel/msg537846.html
>> 
>> Glibc that I use has also included consolidation patches from Adhemerval
>> Zanella and me that are still not in the glibc master. The full series is:
>> https://github.com/norov/glibc/tree/ilp32-2.24-dev2
>> 
>> Below is the results of glibc testsuite run for aarch64/lp64
>> in different configurations. Column names meaning:
>> kvgv: kernel is vanilla, glibc is vanilla;
>> kdgv: kernel has ilp32 patches applied, but ilp32 is disabled in config; 
>>      glibc is vanilla;
>> kegv: kernel has ilp32 patches applied and ilp32 is enabled, glibc is vanilla;
>> kege: kernel patches are applied and enabled, glibc patches are applied.
>> 
>> Only different lines are shown. Full results are in attached archive. 

Hi Yury,

The general requirement for merging ILP32 glibc patches is that LP64 does not regress in any reasonable configuration.  This means that there should be 0 regressions between kvgv and kvge -- i.e., glibc in LP64 mode with and without ILP32 patches does not regress on the vanilla kernel.  The kvge configuration is not in your testing matrix, and I suggest you make sure it has no regressions before fixing the more "advanced" configuration of kege.

Ideally, there should be no regressions between kvgv and kege configurations, but I don't consider this to be a requirement for glibc acceptance of ILP32 patches, since any regressions between kvge and kege configurations are likely to be on the kernel side.

Speculating on the kernel requirements for the ILP32 kernel patchset, I think there should be 0 regressions between kvgv and kdgv configurations, where you have only 3 tests to investigate and fix.

[I do appreciate that there are progressions in your results as well, but the glibc policy is that they do not offset regressions.]
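
To make the acceptance rule concrete, here is a minimal sketch of the check. It is only an illustration, not existing glibc tooling, and the function names `classify` and `acceptable` are hypothetical. The data is an excerpt of the kvgv and kdgv columns from the quoted table.

```python
def classify(baseline, candidate):
    """Split tests into regressions (PASS -> FAIL) and progressions (FAIL -> PASS)."""
    regressions, progressions = [], []
    for test, before in baseline.items():
        after = candidate.get(test)
        if before == "PASS" and after == "FAIL":
            regressions.append(test)
        elif before == "FAIL" and after == "PASS":
            progressions.append(test)
    return sorted(regressions), sorted(progressions)


def acceptable(baseline, candidate):
    """Progressions never offset regressions: any regression fails the check."""
    regressions, _ = classify(baseline, candidate)
    return not regressions


# Excerpt of the kvgv and kdgv columns (not the full table).
kvgv = {
    "csu/tst-atomic": "PASS",
    "iconvdata/mtrace-tst-loading": "PASS",
    "iconvdata/tst-loading": "PASS",
    "posix/tst-regex2": "PASS",
    "malloc/tst-mallopt": "FAIL",
    "malloc/tst-realloc": "FAIL",
}
kdgv = {
    "csu/tst-atomic": "PASS",
    "iconvdata/mtrace-tst-loading": "FAIL",
    "iconvdata/tst-loading": "FAIL",
    "posix/tst-regex2": "FAIL",
    "malloc/tst-mallopt": "PASS",
    "malloc/tst-realloc": "PASS",
}

regressions, progressions = classify(kvgv, kdgv)
print("regressions: ", regressions)
print("progressions:", progressions)
print("acceptable:  ", acceptable(kvgv, kdgv))
```

The two lists are kept separate on purpose, so that a pile of progressions cannot hide a single regression.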

The above only concerns LP64 support in kernel and glibc.

Regarding ILP32 runtime, my opinion is that it is acceptable for ILP32 to have extra failures compared to LP64, since these are not regressions, but, rather, failures of a new configuration.  From a superficial glance it seems that ILP32 linknamespace support requires attention, as well as stack unwinding (judging from NPTL failures).


--
Maxim Kuvyrkov
www.linaro.org



> 
> The same, plus ILP32 regressions:
> 
> Test					kvgv	kdgv	kegv	kege	ilp32
> conform/ISO/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
> conform/ISO11/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
> conform/ISO99/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
> conform/POSIX/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
> conform/POSIX/sys/stat.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
> conform/UNIX98/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
> conform/XOPEN2K/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
> conform/XPG3/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
> conform/XPG4/stdio.h/linknamespace	PASS 	PASS 	PASS	FAIL	FAIL
> csu/tst-atomic				PASS 	PASS 	PASS	FAIL	PASS
> elf/check-localplt			PASS 	PASS 	PASS	FAIL	FAIL
> iconvdata/mtrace-tst-loading		PASS	FAIL	PASS 	PASS	FAIL
> iconvdata/tst-loading			PASS	FAIL	PASS 	PASS	PASS
> io/check-installed-headers-c		PASS 	PASS 	PASS	FAIL	FAIL
> io/check-installed-headers-cxx		PASS 	PASS 	PASS	FAIL	FAIL
> malloc/tst-malloc-backtrace		FAIL	PASS	PASS	PASS	PASS
> malloc/tst-malloc-thread-exit		FAIL	PASS	PASS	PASS	PASS
> malloc/tst-malloc-usable		FAIL	PASS	PASS	PASS	PASS
> malloc/tst-mallocfork			FAIL	PASS	PASS	PASS	PASS
> malloc/tst-mallocstate			FAIL	PASS	PASS	PASS	PASS
> malloc/tst-mallopt			FAIL	PASS	PASS	PASS	PASS
> malloc/tst-mcheck			FAIL	PASS	PASS	PASS	PASS
> malloc/tst-memalign			FAIL	PASS	PASS	PASS	PASS
> malloc/tst-obstack			FAIL	PASS	PASS	PASS	PASS
> malloc/tst-posix_memalign		FAIL	PASS	PASS	PASS	PASS
> malloc/tst-pvalloc			FAIL	PASS	PASS	PASS	PASS
> malloc/tst-realloc			FAIL	PASS	PASS	PASS	PASS
> malloc/tst-scratch_buffer		FAIL	PASS	PASS	PASS	PASS
> malloc/tst-trim1			FAIL	PASS	PASS	PASS	PASS
> nptl/tst-eintr4				PASS 	PASS 	PASS	NA	NA
> posix/tst-regex2			PASS	FAIL	FAIL	FAIL	FAIL
> posix/tst-getaddrinfo4			PASS	PASS	FAIL	FAIL	PASS
> posix/tst-getaddrinfo5			PASS	PASS	FAIL	FAIL	PASS
> sysvipc/test-sysvmsg			NA	NA	NA	FAIL	PASS
> sysvipc/test-sysvsem			NA	NA	NA	FAIL	PASS
> sysvipc/test-sysvshm			NA	NA	NA	FAIL	PASS
> 
> c++-types-check				PASS	PASS	PASS	PASS	FAIL
> debug/tst-backtrace4			PASS	PASS	PASS	PASS	FAIL
> elf/check-abi-libc			PASS	PASS	PASS	PASS	FAIL
> elf/tst-tls1				PASS	PASS	PASS	PASS	FAIL
> elf/tst-tls1-static			PASS	PASS	PASS	PASS	FAIL
> elf/tst-tls2				PASS	PASS	PASS	PASS	FAIL
> elf/tst-tls2-static			PASS	PASS	PASS	PASS	FAIL
> elf/tst-tls3				PASS	PASS	PASS	PASS	FAIL
> math/check-abi-libm			PASS	PASS	PASS	PASS	FAIL
> misc/tst-writev				PASS	PASS	PASS	PASS   	NA  
> nptl/tst-cancel-self-canceltype		PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel1			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel10			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel11			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel13			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel15			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel16			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel17			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel18			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel2			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel20			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel21			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel24			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel25			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel26			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel27			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel3			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel4			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel5			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel6			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancel7			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx10			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx11			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx13			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx15			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx16			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx17			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx18			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx2			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx20			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx21			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx3			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx4			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx5			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx6			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cancelx7			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cleanup4			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cleanupx4			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cond-except			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cond7				PASS	PASS	PASS	PASS	FAIL
> nptl/tst-cond8				PASS	PASS	PASS	PASS	FAIL
> nptl/tst-fini1				PASS	PASS	PASS	PASS	FAIL
> nptl/tst-initializers1			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-initializers1-c11		PASS	PASS	PASS	PASS	FAIL
> nptl/tst-initializers1-c89		PASS	PASS	PASS	PASS	FAIL
> nptl/tst-initializers1-c99		PASS	PASS	PASS	PASS	FAIL
> nptl/tst-initializers1-gnu11		PASS	PASS	PASS	PASS	FAIL
> nptl/tst-initializers1-gnu89		PASS	PASS	PASS	PASS	FAIL
> nptl/tst-initializers1-gnu99		PASS	PASS	PASS	PASS	FAIL
> nptl/tst-join5				PASS	PASS	PASS	PASS	FAIL
> nptl/tst-key3				PASS	PASS	PASS	PASS	FAIL
> nptl/tst-mutex8				PASS	PASS	PASS	PASS	FAIL
> nptl/tst-mutexpi8			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-once3				PASS	PASS	PASS	PASS	FAIL
> nptl/tst-once4				PASS	PASS	PASS	PASS	FAIL
> nptl/tst-oncex3				PASS	PASS	PASS	PASS	FAIL
> nptl/tst-oncex4				PASS	PASS	PASS	PASS	FAIL
> nptl/tst-rwlock15			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-rwlock8			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-rwlock9			PASS	PASS	PASS	PASS	FAIL
> nptl/tst-sem11				PASS	PASS	PASS	PASS	FAIL
> nptl/tst-sem12				PASS	PASS	PASS	PASS	FAIL
> posix/bug-regex24			PASS	PASS	PASS	PASS	FAIL
> rt/tst-mqueue1				PASS	PASS	PASS	PASS	FAIL
> rt/tst-mqueue2				PASS	PASS	PASS	PASS	FAIL
> rt/tst-mqueue4				PASS	PASS	PASS	PASS	FAIL
> rt/tst-mqueue7				PASS	PASS	PASS	PASS	FAIL
> rt/tst-mqueue8				PASS	PASS	PASS	PASS	FAIL
> rt/tst-mqueue8x				PASS	PASS	PASS	PASS	FAIL
> stdlib/tst-makecontext3			PASS	PASS	PASS	PASS	FAIL

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: ILP32 for ARM64 - testing with lmbench
  2016-10-28 12:46 ` ILP32 for ARM64 - testing with lmbench Yury Norov
@ 2016-11-17  3:28   ` Zhangjian (Bamvor)
  2016-11-17  5:02     ` Maxim Kuvyrkov
  0 siblings, 1 reply; 64+ messages in thread
From: Zhangjian (Bamvor) @ 2016-11-17  3:28 UTC (permalink / raw)
  To: Yury Norov, arnd, catalin.marinas, linux-arm-kernel,
	linux-kernel, linux-doc, linux-arch
  Cc: schwidefsky, heiko.carstens, pinskia, broonie, joseph,
	christoph.muellner, szabolcs.nagy, klimov.linux, Nathan_Lynch,
	agraf, Prasun.Kapoor, kilobyte, geert, philipp.tomsich,
	manuel.montezelo, linyongting, maxim.kuvyrkov, davem,
	zhouchengming1, cmetcalf, sellcey, hanjun.guo, Ding Tianhong,
	Zhangjian (Bamvor)

Hi, all

I tested SPECint on aarch64 LP64 with aarch32 EL0 disabled and enabled, and
compared the results with a kernel without the ILP32 patches (4.8-rc6) on our
arm64 board. I found that the difference (ILP32 disabled vs. ILP32 unmerged) is
bigger when aarch32 EL0 is enabled than when it is disabled. bzip2, mcf, hmmer
and libquantum show the top four differences [1]. Note that bigger is better in
the SPECint test.

To confirm the above results, I retested these four test cases in a
reportable way (see the command at the end). The result [2] shows that
libquantum decreases by 2.09% when ILP32 is enabled and aarch32 is on. I think it is in
significant.

The lmbench results are not stable on my board. I plan to dig into that later.

[1] The following results were obtained with --size=ref --iterations=3.
1.1 Test when aarch32_el0 is enabled.
                         ILP32 disabled        base line
       400.perlbench            100.00%             100%
       401.bzip2                 99.35%             100%
       403.gcc                  100.26%             100%
       429.mcf                  102.75%             100%
       445.gobmk                100.00%             100%
       456.hmmer                 95.66%             100%
       458.sjeng                100.00%             100%
       462.libquantum           100.00%             100%
       471.omnetpp              100.59%             100%
       473.astar                 99.66%             100%
       483.xalancbmk             99.10%             100%

1.2 Test when aarch32_el0 is disabled
                         ILP32 disabled         base line
       400.perlbench            100.22%              100%
       401.bzip2                100.95%              100%
       403.gcc                  100.20%              100%
       429.mcf                  100.76%              100%
       445.gobmk                100.36%              100%
       456.hmmer                 97.94%              100%
       458.sjeng                 99.73%              100%
       462.libquantum            98.72%              100%
       471.omnetpp              100.86%              100%
       473.astar                 99.15%              100%
       483.xalancbmk            100.08%              100%

[2] The following results were obtained with: runspec --config=my.cfg --size=test,train,ref --noreportable --tune=base,peak --iterations=3 bzip2 mcf hmmer libquantum
2.1 Test when aarch32_el0 is enabled.
                          ILP32_enabled         base line
       401.bzip2                100.82%              100%
       429.mcf                  100.18%              100%
       456.hmmer                 99.64%              100%
       462.libquantum            97.91%              100%
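
For reference, the percentage change quoted above follows directly from the
normalized scores in table 2.1. A minimal sketch of the arithmetic, where the
helper name `relative_change` is only illustrative:

```python
def relative_change(score_pct, baseline_pct=100.0):
    """Change of a normalized score relative to the baseline, in percent."""
    return (score_pct - baseline_pct) / baseline_pct * 100.0


# Normalized scores from table 2.1 (baseline is 100% in every row).
results_2_1 = {
    "401.bzip2": 100.82,
    "429.mcf": 100.18,
    "456.hmmer": 99.64,
    "462.libquantum": 97.91,
}

for name, score in results_2_1.items():
    print(f"{name:16s} {relative_change(score):+.2f}%")

# Benchmark with the largest drop relative to the baseline.
worst = min(results_2_1, key=results_2_1.get)
print("largest drop:", worst)
```

Since bigger is better, a negative value means the score dropped relative to
the baseline.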

Regards

Bamvor

On 2016/10/28 20:46, Yury Norov wrote:
> [Add Steve Ellcey, thanks for testing on ThunderX]
>
> Lmbench-3.0-a9 testing is performed on ThunderX machine to check that
> ILP32 series does not add performance regressions for LP64. Test
> summary is in the table below. Our measurements doesn't show
> significant performance regression of LP64 if ILP32 code is merged,
> both enabled or disabled.
>
>                ILP32 enabled   ILP32  disabled   Standard Kernel
> null syscall   0.1066          0.1121            0.1121
>                95.09%          100.00%
>
> stat           1.3947          1.3814            1.3864
>                100.60%         99.64%
>
> fstat          0.4459          0.4344            0.4524
>                98.56%          96.02%
>
> open/close     4.0606          4.0411            4.0453
>                100.38%         99.90%
>
> read           0.4819          0.5014            0.5014
>                96.11%          100.00%
>
> Tested with linux 4.8 because 4.9-rc1 is not fixed yet for ThunderX.
> Other system details below.
>
> Yury.
>
> ubuntu@crb6:~$ uname -a
> Linux crb6 4.8.0+ #3 SMP Thu Oct 27 11:01:32 PDT 2016 aarch64 aarch64 aarch64 GNU/Linux
>
> ubuntu@crb6:~$ cat /proc/meminfo
> MemTotal:       132011948 kB
> MemFree:        131442672 kB
> MemAvailable:   130695764 kB
> Buffers:           15696 kB
> Cached:            88088 kB
> SwapCached:            0 kB
> Active:            82760 kB
> Inactive:          41336 kB
> Active(anon):      20880 kB
> Inactive(anon):     8576 kB
> Active(file):      61880 kB
> Inactive(file):    32760 kB
> Unevictable:           0 kB
> Mlocked:               0 kB
> SwapTotal:      128920572 kB
> SwapFree:       128920572 kB
> Dirty:                 0 kB
> Writeback:             0 kB
> AnonPages:         20544 kB
> Mapped:            19780 kB
> Shmem:              9060 kB
> Slab:              78804 kB
> SReclaimable:      27372 kB
> SUnreclaim:        51432 kB
> KernelStack:        8336 kB
> PageTables:          820 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:    194926544 kB
> Committed_AS:     256324 kB
> VmallocTotal:   135290290112 kB
> VmallocUsed:           0 kB
> VmallocChunk:          0 kB
> AnonHugePages:         0 kB
> ShmemHugePages:        0 kB
> ShmemPmdMapped:        0 kB
> CmaTotal:              0 kB
> CmaFree:               0 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
>
> ubuntu@crb6:~$ cat /proc/cpuinfo
> processor	: 0
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 1
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 2
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 3
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 4
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 5
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 6
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 7
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 8
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 9
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 10
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 11
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 12
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 13
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 14
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 15
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 16
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 17
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 18
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 19
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> processor	: 20
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>
> [...]
>
> processor	: 47
> BogoMIPS	: 200.00
> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
> CPU implementer	: 0x43
> CPU architecture: 8
> CPU variant	: 0x1
> CPU part	: 0x0a1
> CPU revision	: 0
>

* Re: ILP32 for ARM64 - testing with lmbench
  2016-11-17  3:28   ` Zhangjian (Bamvor)
@ 2016-11-17  5:02     ` Maxim Kuvyrkov
  2016-11-17  7:48       ` Zhangjian (Bamvor)
  0 siblings, 1 reply; 64+ messages in thread
From: Maxim Kuvyrkov @ 2016-11-17  5:02 UTC (permalink / raw)
  To: Zhangjian (Bamvor)
  Cc: Yury Norov, arnd, catalin.marinas, linux-arm-kernel,
	linux-kernel, linux-doc, linux-arch, schwidefsky, heiko.carstens,
	Andrew Pinski, broonie, Joseph S. Myers, christoph.muellner,
	Szabolcs Nagy, klimov.linux, Nathan_Lynch, agraf, Prasun Kapoor,
	kilobyte, Geert Uytterhoeven, Dr. Philipp Tomsich,
	manuel.montezelo, linyongting, David Miller, zhouchengming1,
	cmetcalf, sellcey, hanjun.guo, Ding Tianhong

Hi Bamvor,

I'm surprised that you see this much difference from ILP32 patches on SPEC CPU2006int at all.  The SPEC CPU2006 benchmarks spend almost no time in the kernel syscalls.  I can imagine memory, TLB, and cache handling in the kernel could affect CPU2006 benchmarks.  Do ILP32 patches touch code in those areas?

Other than that, it would be interesting to check what the variance is between the 3 iterations of benchmark runs.  Could you check what relative standard deviation is between the 3 iterations -- (STDEV(RUN1, RUN2, RUN3) / RUNselected)?

For reference, in my [non-ILP32] benchmarking I see 1.1% for 401.bzip2,  0.8% for 429.mcf, 0.2% for 456.hmmer, and 0.1% for 462.libquantum.
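
A minimal sketch of the relative standard deviation asked for above, STDEV(RUN1, RUN2, RUN3) / RUNselected, expressed as a percentage. The run times in the example are hypothetical placeholders, not measurements from this thread, and taking the median as the "selected" run is an assumption about how the reported run is chosen.

```python
import statistics


def relative_stdev(runs, selected=None):
    """Return STDEV(runs) / selected as a percentage.

    `runs` holds the per-iteration results of one benchmark.
    `selected` is the run that gets reported; if omitted, the median
    is used (an assumption, see the note above).
    """
    if selected is None:
        selected = statistics.median(runs)
    # statistics.stdev is the sample standard deviation (n - 1 divisor).
    return 100.0 * statistics.stdev(runs) / selected


if __name__ == "__main__":
    # Hypothetical run times in seconds, for illustration only.
    runs = [412.0, 415.0, 409.0]
    print(f"relative stdev: {relative_stdev(runs):.2f}%")
```

With only three iterations the estimate is coarse, so it is more useful as a noise floor than as a precise figure.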

--
Maxim Kuvyrkov
www.linaro.org



> On Nov 17, 2016, at 7:28 AM, Zhangjian (Bamvor) <bamvor.zhangjian@huawei.com> wrote:
> 
> Hi, all
> 
> I tested SPECint for aarch64 LP64 with aarch32 el0 disabled and enabled,
> and compared against a kernel without the ILP32 patches merged (4.8-rc6) on
> our arm64 board. I found that the difference (ILP32 disabled / ILP32
> unmerged) is bigger when aarch32 el0 is enabled than when it is disabled.
> bzip2, mcf, hmmer and libquantum show the four largest differences[1].
> Note that bigger is better in the SPECint results.
>
> To confirm the above results, I retested these four test cases in a
> reportable way (see the command at the end). The result[2] shows that
> libquantum drops by 2.09% with ILP32 enabled and aarch32 el0 on. I think
> it is insignificant.
>
> The lmbench results are not stable on my board. I plan to dig into it later.
>
> [1] The following results were obtained with --size=ref --iterations=3.
> 1.1 Test when aarch32_el0 is enabled.
>                        ILP32 disabled        base line
>      400.perlbench            100.00%             100%
>      401.bzip2                 99.35%             100%
>      403.gcc                  100.26%             100%
>      429.mcf                  102.75%             100%
>      445.gobmk                100.00%             100%
>      456.hmmer                 95.66%             100%
>      458.sjeng                100.00%             100%
>      462.libquantum           100.00%             100%
>      471.omnetpp              100.59%             100%
>      473.astar                 99.66%             100%
>      483.xalancbmk             99.10%             100%
> 
> 1.2 Test when aarch32_el0 is disabled
>                        ILP32 disabled         base line
>      400.perlbench            100.22%              100%
>      401.bzip2                100.95%              100%
>      403.gcc                  100.20%              100%
>      429.mcf                  100.76%              100%
>      445.gobmk                100.36%              100%
>      456.hmmer                 97.94%              100%
>      458.sjeng                 99.73%              100%
>      462.libquantum            98.72%              100%
>      471.omnetpp              100.86%              100%
>      473.astar                 99.15%              100%
>      483.xalancbmk            100.08%              100%
> 
> [2] The following results were obtained with: runspec --config=my.cfg --size=test,train,ref --noreportable --tune=base,peak --iterations=3 bzip2 mcf hmmer libquantum
> 2.1 Test when aarch32_el0 is enabled.
>                         ILP32_enabled         base line
>      401.bzip2                100.82%              100%
>      429.mcf                  100.18%              100%
>      456.hmmer                 99.64%              100%
>      462.libquantum            97.91%              100%
> 
> Regards
> 
> Bamvor
> 
> On 2016/10/28 20:46, Yury Norov wrote:
>> [Add Steve Ellcey, thanks for testing on ThunderX]
>> 
>> Lmbench-3.0-a9 testing was performed on a ThunderX machine to check that
>> the ILP32 series does not add performance regressions for LP64. The test
>> summary is in the table below. Our measurements don't show a significant
>> performance regression for LP64 when the ILP32 code is merged, whether it
>> is enabled or disabled.
>> 
>>               ILP32 enabled   ILP32  disabled   Standard Kernel
>> null syscall   0.1066          0.1121            0.1121
>>               95.09%          100.00%
>> 
>> stat           1.3947          1.3814            1.3864
>>               100.60%         99.64%
>> 
>> fstat          0.4459          0.4344            0.4524
>>               98.56%          96.02%
>> 
>> open/close     4.0606          4.0411            4.0453
>>               100.38%         99.90%
>> 
>> read           0.4819          0.5014            0.5014
>>               96.11%          100.00%
>> 
>> Tested with linux 4.8 because 4.9-rc1 is not fixed yet for ThunderX.
>> Other system details below.
>> 
>> Yury.
>> 
>> ubuntu@crb6:~$ uname -a
>> Linux crb6 4.8.0+ #3 SMP Thu Oct 27 11:01:32 PDT 2016 aarch64 aarch64 aarch64 GNU/Linux
>> 
>> ubuntu@crb6:~$ cat /proc/meminfo
>> MemTotal:       132011948 kB
>> MemFree:        131442672 kB
>> MemAvailable:   130695764 kB
>> Buffers:           15696 kB
>> Cached:            88088 kB
>> SwapCached:            0 kB
>> Active:            82760 kB
>> Inactive:          41336 kB
>> Active(anon):      20880 kB
>> Inactive(anon):     8576 kB
>> Active(file):      61880 kB
>> Inactive(file):    32760 kB
>> Unevictable:           0 kB
>> Mlocked:               0 kB
>> SwapTotal:      128920572 kB
>> SwapFree:       128920572 kB
>> Dirty:                 0 kB
>> Writeback:             0 kB
>> AnonPages:         20544 kB
>> Mapped:            19780 kB
>> Shmem:              9060 kB
>> Slab:              78804 kB
>> SReclaimable:      27372 kB
>> SUnreclaim:        51432 kB
>> KernelStack:        8336 kB
>> PageTables:          820 kB
>> NFS_Unstable:          0 kB
>> Bounce:                0 kB
>> WritebackTmp:          0 kB
>> CommitLimit:    194926544 kB
>> Committed_AS:     256324 kB
>> VmallocTotal:   135290290112 kB
>> VmallocUsed:           0 kB
>> VmallocChunk:          0 kB
>> AnonHugePages:         0 kB
>> ShmemHugePages:        0 kB
>> ShmemPmdMapped:        0 kB
>> CmaTotal:              0 kB
>> CmaFree:               0 kB
>> HugePages_Total:       0
>> HugePages_Free:        0
>> HugePages_Rsvd:        0
>> HugePages_Surp:        0
>> Hugepagesize:       2048 kB
>> 
>> ubuntu@crb6:~$ cat /proc/cpuinfo
>> processor	: 0
>> BogoMIPS	: 200.00
>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer	: 0x43
>> CPU architecture: 8
>> CPU variant	: 0x1
>> CPU part	: 0x0a1
>> CPU revision	: 0
>> 
>> [...]
>> 
>> processor	: 47
>> BogoMIPS	: 200.00
>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>> CPU implementer	: 0x43
>> CPU architecture: 8
>> CPU variant	: 0x1
>> CPU part	: 0x0a1
>> CPU revision	: 0
>> 
> 

* Re: ILP32 for ARM64 - testing with lmbench
  2016-11-17  5:02     ` Maxim Kuvyrkov
@ 2016-11-17  7:48       ` Zhangjian (Bamvor)
  2016-12-05 10:16         ` Zhangjian (Bamvor)
  0 siblings, 1 reply; 64+ messages in thread
From: Zhangjian (Bamvor) @ 2016-11-17  7:48 UTC (permalink / raw)
  To: Maxim Kuvyrkov
  Cc: Yury Norov, arnd, catalin.marinas, linux-arm-kernel,
	linux-kernel, linux-doc, linux-arch, schwidefsky, heiko.carstens,
	Andrew Pinski, broonie, Joseph S. Myers, christoph.muellner,
	Szabolcs Nagy, klimov.linux, Nathan_Lynch, agraf, Prasun Kapoor,
	kilobyte, Geert Uytterhoeven, Dr. Philipp Tomsich,
	manuel.montezelo, linyongting, David Miller, zhouchengming1,
	cmetcalf, sellcey, hanjun.guo, Ding Tianhong, Zhangjian (Bamvor)

Hi, Maxim

On 2016/11/17 13:02, Maxim Kuvyrkov wrote:
> Hi Bamvor,
>
> I'm surprised that you see this much difference from ILP32 patches on SPEC CPU2006int at all.  The SPEC CPU2006 benchmarks spend almost no time in the kernel syscalls.  I can imagine memory, TLB, and cache handling in the kernel could affect CPU2006 benchmarks.  Do ILP32 patches touch code in those areas?
>
> Other than that, it would be interesting to check what the variance is between the 3 iterations of benchmark runs.  Could you check what relative standard deviation is between the 3 iterations -- (STDEV(RUN1, RUN2, RUN3) / RUNselected)?
>
> For reference, in my [non-ILP32] benchmarking I see 1.1% for 401.bzip2,  0.8% for 429.mcf, 0.2% for 456.hmmer, and 0.1% for 462.libquantum.
Here is my result:
                     ILP32_merged    ILP32_unmerged
       401.bzip2            0.31%            0.26%
       429.mcf              1.61%            1.36%
       456.hmmer            1.37%            1.57%
       462.libquantum       0.29%            0.28%
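
One way to read these numbers against the earlier [2] table is to check whether each score delta is larger than the run-to-run noise. The sketch below does that with the figures quoted in this thread. The factor k=2 is an arbitrary assumption, and so is the idea that the deviations above were measured in the same configuration as the [2] scores.

```python
# Scores from the [2] table (ILP32 enabled, baseline = 100%).
scores = {
    "401.bzip2": 100.82,
    "429.mcf": 100.18,
    "456.hmmer": 99.64,
    "462.libquantum": 97.91,
}

# Relative standard deviations (%) from the table above.
rsd_merged = {"401.bzip2": 0.31, "429.mcf": 1.61,
              "456.hmmer": 1.37, "462.libquantum": 0.29}
rsd_unmerged = {"401.bzip2": 0.26, "429.mcf": 1.36,
                "456.hmmer": 1.57, "462.libquantum": 0.28}


def exceeds_noise(score_pct, rsd_a, rsd_b, k=2.0):
    """True if |score - 100%| is larger than k times the worse of the two RSDs.

    k is a hypothetical threshold, not something taken from the thread.
    """
    delta = abs(score_pct - 100.0)
    return delta > k * max(rsd_a, rsd_b)


if __name__ == "__main__":
    for name, score in scores.items():
        flag = exceeds_noise(score, rsd_merged[name], rsd_unmerged[name])
        print(f"{name:16s} delta={score - 100.0:+.2f}%  above noise: {flag}")
```

This is only a rough screen. Three iterations per benchmark are too few for a proper significance test.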

Regards

Bamvor

>
> --
> Maxim Kuvyrkov
> www.linaro.org
>
>
>
>> On Nov 17, 2016, at 7:28 AM, Zhangjian (Bamvor) <bamvor.zhangjian@huawei.com> wrote:
>>
>> Hi, all
>>
>> I tested SPECint for aarch64 LP64 with aarch32 el0 disabled and enabled,
>> and compared against a kernel without the ILP32 patches merged (4.8-rc6) on
>> our arm64 board. I found that the difference (ILP32 disabled / ILP32
>> unmerged) is bigger when aarch32 el0 is enabled than when it is disabled.
>> bzip2, mcf, hmmer and libquantum show the four largest differences[1].
>> Note that bigger is better in the SPECint results.
>>
>> To confirm the above results, I retested these four test cases in a
>> reportable way (see the command at the end). The result[2] shows that
>> libquantum drops by 2.09% with ILP32 enabled and aarch32 el0 on. I think
>> it is insignificant.
>>
>> The lmbench results are not stable on my board. I plan to dig into it later.
>>
>> [1] The following results were obtained with --size=ref --iterations=3.
>> 1.1 Test when aarch32_el0 is enabled.
>>                        ILP32 disabled        base line
>>      400.perlbench            100.00%             100%
>>      401.bzip2                 99.35%             100%
>>      403.gcc                  100.26%             100%
>>      429.mcf                  102.75%             100%
>>      445.gobmk                100.00%             100%
>>      456.hmmer                 95.66%             100%
>>      458.sjeng                100.00%             100%
>>      462.libquantum           100.00%             100%
>>      471.omnetpp              100.59%             100%
>>      473.astar                 99.66%             100%
>>      483.xalancbmk             99.10%             100%
>>
>> 1.2 Test when aarch32_el0 is disabled
>>                        ILP32 disabled         base line
>>      400.perlbench            100.22%              100%
>>      401.bzip2                100.95%              100%
>>      403.gcc                  100.20%              100%
>>      429.mcf                  100.76%              100%
>>      445.gobmk                100.36%              100%
>>      456.hmmer                 97.94%              100%
>>      458.sjeng                 99.73%              100%
>>      462.libquantum            98.72%              100%
>>      471.omnetpp              100.86%              100%
>>      473.astar                 99.15%              100%
>>      483.xalancbmk            100.08%              100%
>>
>> [2] The following results were obtained with: runspec --config=my.cfg --size=test,train,ref --noreportable --tune=base,peak --iterations=3 bzip2 mcf hmmer libquantum
>> 2.1 Test when aarch32_el0 is enabled.
>>                         ILP32_enabled         base line
>>      401.bzip2                100.82%              100%
>>      429.mcf                  100.18%              100%
>>      456.hmmer                 99.64%              100%
>>      462.libquantum            97.91%              100%
>>
>> Regards
>>
>> Bamvor
>>
>> On 2016/10/28 20:46, Yury Norov wrote:
>>> [Add Steve Ellcey, thanks for testing on ThunderX]
>>>
>>> Lmbench-3.0-a9 testing was performed on a ThunderX machine to check that
>>> the ILP32 series does not add performance regressions for LP64. The test
>>> summary is in the table below. Our measurements don't show a significant
>>> performance regression for LP64 when the ILP32 code is merged, whether it
>>> is enabled or disabled.
>>>
>>>               ILP32 enabled   ILP32  disabled   Standard Kernel
>>> null syscall   0.1066          0.1121            0.1121
>>>               95.09%          100.00%
>>>
>>> stat           1.3947          1.3814            1.3864
>>>               100.60%         99.64%
>>>
>>> fstat          0.4459          0.4344            0.4524
>>>               98.56%          96.02%
>>>
>>> open/close     4.0606          4.0411            4.0453
>>>               100.38%         99.90%
>>>
>>> read           0.4819          0.5014            0.5014
>>>               96.11%          100.00%
>>>
>>> Tested with linux 4.8 because 4.9-rc1 is not fixed yet for ThunderX.
>>> Other system details below.
>>>
>>> Yury.
>>>
>>> ubuntu@crb6:~$ uname -a
>>> Linux crb6 4.8.0+ #3 SMP Thu Oct 27 11:01:32 PDT 2016 aarch64 aarch64 aarch64 GNU/Linux
>>>
>>> ubuntu@crb6:~$ cat /proc/meminfo
>>> MemTotal:       132011948 kB
>>> MemFree:        131442672 kB
>>> MemAvailable:   130695764 kB
>>> Buffers:           15696 kB
>>> Cached:            88088 kB
>>> SwapCached:            0 kB
>>> Active:            82760 kB
>>> Inactive:          41336 kB
>>> Active(anon):      20880 kB
>>> Inactive(anon):     8576 kB
>>> Active(file):      61880 kB
>>> Inactive(file):    32760 kB
>>> Unevictable:           0 kB
>>> Mlocked:               0 kB
>>> SwapTotal:      128920572 kB
>>> SwapFree:       128920572 kB
>>> Dirty:                 0 kB
>>> Writeback:             0 kB
>>> AnonPages:         20544 kB
>>> Mapped:            19780 kB
>>> Shmem:              9060 kB
>>> Slab:              78804 kB
>>> SReclaimable:      27372 kB
>>> SUnreclaim:        51432 kB
>>> KernelStack:        8336 kB
>>> PageTables:          820 kB
>>> NFS_Unstable:          0 kB
>>> Bounce:                0 kB
>>> WritebackTmp:          0 kB
>>> CommitLimit:    194926544 kB
>>> Committed_AS:     256324 kB
>>> VmallocTotal:   135290290112 kB
>>> VmallocUsed:           0 kB
>>> VmallocChunk:          0 kB
>>> AnonHugePages:         0 kB
>>> ShmemHugePages:        0 kB
>>> ShmemPmdMapped:        0 kB
>>> CmaTotal:              0 kB
>>> CmaFree:               0 kB
>>> HugePages_Total:       0
>>> HugePages_Free:        0
>>> HugePages_Rsvd:        0
>>> HugePages_Surp:        0
>>> Hugepagesize:       2048 kB
>>>
>>> ubuntu@crb6:~$ cat /proc/cpuinfo
>>> processor	: 0
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> [...]
>>>
>>> processor	: 23
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 24
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 25
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 26
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 27
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 28
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 29
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 30
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 31
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 32
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 33
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 34
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 35
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 36
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 37
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 38
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 39
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 40
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 41
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 42
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 43
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 44
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 45
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 46
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>> processor	: 47
>>> BogoMIPS	: 200.00
>>> Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>> CPU implementer	: 0x43
>>> CPU architecture: 8
>>> CPU variant	: 0x1
>>> CPU part	: 0x0a1
>>> CPU revision	: 0
>>>
>>
>


* Re: ILP32 for ARM64: testing with glibc testsuite
  2016-11-16 11:22     ` Maxim Kuvyrkov
@ 2016-11-17 15:50       ` Catalin Marinas
  2016-11-17 21:45       ` Steve Ellcey
  1 sibling, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2016-11-17 15:50 UTC (permalink / raw)
  To: Maxim Kuvyrkov
  Cc: Yury Norov, arnd, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch, GNU C Library, schwidefsky, heiko.carstens,
	Andrew Pinski, broonie, Joseph S. Myers, christoph.muellner,
	bamvor.zhangjian, Szabolcs Nagy, klimov.linux, Nathan_Lynch,
	agraf, Prasun Kapoor, kilobyte, Geert Uytterhoeven,
	Dr. Philipp Tomsich, manuel.montezelo, linyongting, davem,
	zhouchengming1, cmetcalf, Adhemerval Zanella, Steve Ellcey

On Wed, Nov 16, 2016 at 03:22:26PM +0400, Maxim Kuvyrkov wrote:
> Regarding ILP32 runtime, my opinion is that it is acceptable for ILP32
> to have extra failures compared to LP64, since these are not
> regressions, but, rather, failures of a new configuration.

I disagree with this. We definitely need to understand why they fail,
otherwise we run the risk of potential glibc or kernel implementation
bugs becoming ABI.

-- 
Catalin


* Re: ILP32 for ARM64: testing with glibc testsuite
  2016-11-16 11:22     ` Maxim Kuvyrkov
  2016-11-17 15:50       ` Catalin Marinas
@ 2016-11-17 21:45       ` Steve Ellcey
  2016-12-05  9:58         ` Zhangjian (Bamvor)
  1 sibling, 1 reply; 64+ messages in thread
From: Steve Ellcey @ 2016-11-17 21:45 UTC (permalink / raw)
  To: Maxim Kuvyrkov, Yury Norov
  Cc: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch, GNU C Library, schwidefsky, heiko.carstens,
	Andrew Pinski, broonie, Joseph S. Myers, christoph.muellner,
	bamvor.zhangjian, Szabolcs Nagy, klimov.linux, Nathan_Lynch,
	agraf, Prasun Kapoor, kilobyte, Geert Uytterhoeven,
	Dr. Philipp Tomsich, manuel.montezelo, linyongting, davem,
	zhouchengming1, cmetcalf, Adhemerval Zanella

On Wed, 2016-11-16 at 15:22 +0400, Maxim Kuvyrkov wrote:
> > 
> > On Nov 9, 2016, at 1:56 PM, Yury Norov <ynorov@caviumnetworks.com>
> > wrote:
> > 
> > > 
> > > Below are the results of glibc testsuite run for aarch64/lp64

I have been running the glibc testsuite as well.  I have only run it on
an ILP32 enabled kernel.  Using that kernel, top-of-tree glibc, and the
ILP32 glibc patches I have no LP64 regressions.  There are 5 failures
in LP64 mode but I get them with vanilla top-of-tree glibc sources too.
They are:
	nptl/eintr1 (I actually don't run this because it kills the 'make check')
	debug/tst-backtrace5
	debug/tst-backtrace6
	nptl/tst-stack4
	nptl/tst-thread_local1

In ILP32 mode I get 33 failures, they include the above failures (minus
nptl/tst-thread_local1) plus:

	c++-types-check
	conform/ISO11/inttypes.h/conform
	conform/ISO11/stdint.h/conform
	conform/ISO99/inttypes.h/conform
	conform/ISO99/stdint.h/conform
	conform/POSIX2008/inttypes.h/conform
	conform/POSIX2008/stdint.h/conform
	conform/XOPEN2K/inttypes.h/conform
	conform/XOPEN2K/stdint.h/conform
	conform/XOPEN2K8/inttypes.h/conform
	conform/XOPEN2K8/stdint.h/conform
	elf/tst-tls1
	elf/tst-tls1-static
	elf/tst-tls2
	elf/tst-tls2-static
	elf/tst-tls3
	math/check-abi-libm
	math/test-double
	math/test-double-finite
	math/test-float
	math/test-float-finite
	misc/tst-sync_file_range
	nptl/tst-cancel26
	nptl/tst-cancel27
	nptl/tst-sem3
	rt/tst-mqueue1
	rt/tst-mqueue2
	rt/tst-mqueue4
	rt/tst-mqueue7
	stdlib/tst-makecontext3

I am currently looking at these ILP32 regressions (starting with the
tls failures) to see if I can figure out what is happening with them.

Steve Ellcey
sellcey@caviumnetworks.com


* Re: [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (19 preceding siblings ...)
  2016-11-07  8:23 ` ILP32 for ARM64: testing with glibc testsuite Yury Norov
@ 2016-11-30  5:02 ` Yury Norov
  2016-11-30  6:52   ` Adam Borowski
  2016-12-18  7:08 ` Yury Norov
  21 siblings, 1 reply; 64+ messages in thread
From: Yury Norov @ 2016-11-30  5:02 UTC (permalink / raw)
  To: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch
  Cc: schwidefsky, heiko.carstens, pinskia, broonie, joseph,
	christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, kilobyte,
	geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf

On Fri, Oct 21, 2016 at 11:32:59PM +0300, Yury Norov wrote:
> This series enables aarch64 with ilp32 mode, and as supporting work,
> introduces ARCH_32BIT_OFF_T configuration option that is enabled for
> existing 32-bit architectures but disabled for new arches (so 64-bit
> off_t is used by new userspace).
> 
> This version is based on kernel v4.9-rc1.  It works with glibc-2.24,
> and tested with LTP.
> 
> This version contains ABI changes, and should be used with new glibc
> version. See links below.
> 
> This is RFC because there is still no solid understanding what type
> of registers top-halves delousing we prefer and it affects ABI. In
> this patchset, w0-w7 are cleared for each syscall in assembler entry.
> 
> The alternative approach is in introducing compat wrappers which is
> little faster for natively routed syscalls (~2.6% for syscall with
> no payload) but much more complicated.

Hi all,

Steve Ellcey submitted glibc patches for ILP32:
https://www.sourceware.org/ml/libc-alpha/2016-11/msg01071.html
It implicitly assumes that the kernel clears the top halves of registers
for all syscalls in assembly entry. Those patches are going to be taken.
If that happens, we will no longer have a choice on the kernel side about
how to clear the top halves.
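
As an illustration only, the "delousing" discussed here can be modelled as
masking off the upper 32 bits of the first eight argument registers before
the syscall handler sees them. The sketch below is a toy model under that
assumption, not the kernel code (which is arm64 assembly); the names MASK32
and delouse_syscall_args are hypothetical.

```python
# Toy model of clearing register top halves on syscall entry.
# Assumption: registers are modelled as plain 64-bit integers in a list,
# where index i stands for register x<i>.

MASK32 = 0xFFFFFFFF  # keeps the low 32 bits (the w<i> view of x<i>)


def delouse_syscall_args(xregs):
    """Return a copy of xregs with the top halves of x0..x7 zeroed.

    Registers beyond x7 are returned unchanged.
    """
    return [v & MASK32 if i < 8 else v for i, v in enumerate(xregs)]


if __name__ == "__main__":
    # Userspace left garbage in the upper 32 bits of every register.
    regs = [0xDEADBEEF00000000 | i for i in range(10)]
    for i, v in enumerate(delouse_syscall_args(regs)):
        print(f"x{i} = {v:#018x}")
```

With the wrapper approach, the same masking would instead be applied per
syscall, and only to the arguments that need it.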

For me the current version is OK, and I see no problems with it. I am
just writing this email as a reminder that it's still an RFC, and this is
the last chance to get back to wrappers.

Yury.


* Re: [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64
  2016-11-30  5:02 ` [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
@ 2016-11-30  6:52   ` Adam Borowski
  0 siblings, 0 replies; 64+ messages in thread
From: Adam Borowski @ 2016-11-30  6:52 UTC (permalink / raw)
  To: Yury Norov
  Cc: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch, schwidefsky, heiko.carstens, pinskia, broonie,
	joseph, christoph.muellner, bamvor.zhangjian, szabolcs.nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor, geert,
	philipp.tomsich, manuel.montezelo, linyongting, maxim.kuvyrkov,
	davem, zhouchengming1, cmetcalf

On Wed, Nov 30, 2016 at 10:32:09AM +0530, Yury Norov wrote:
> On Fri, Oct 21, 2016 at 11:32:59PM +0300, Yury Norov wrote:
> > This series enables aarch64 with ilp32 mode, and as supporting work,
> > introduces ARCH_32BIT_OFF_T configuration option that is enabled for
> > existing 32-bit architectures but disabled for new arches (so 64-bit
> > off_t is used by new userspace).
> 
> Hi all,
> 
> Steve Ellcey submitted glibc patches for ILP32:
> https://www.sourceware.org/ml/libc-alpha/2016-11/msg01071.html
> It implicitly assumes that the kernel clears the top halves of registers
> for all syscalls in assembly entry. Those patches are going to be taken.
> If that happens, we will no longer have a choice on the kernel side about
> how to clear the top halves.

Since a while ago, there's a package "arch-test" in Debian that empirically
enumerates architectures executable by the running kernel (and loaded
binfmts), by trying small test programs for each.  The list of architectures
it knows does include arm64ilp32.

For most archs the test is just {write(1, "ok\n"); _exit(0);} unless there's
some difference from baseline that should be checked for, like dmb (ARMv7)
on armhf or mtvsrd (POWER8) on ppc64el.  I could scribble in the top half of
a register to test the delousing, but it's not like alternate versions of
the ABI are expected in the wild...


There's another issue: name.  A stalled request to add it to dpkg's cputable
(https://bugs.debian.org/824742) uses "arm64ilp32" and "arm64ilp32be" which
are unwieldy.  Even the discussion uses "ilp32" -- probably too generic. 
https://wiki.linaro.org/Platform/arm64-ilp32 mentions both.  I've heard
"a32" somewhere.  I have no stake here (I'm on the CC list as a x32 not arm
porter...), but if you want to choose a color for this bikeshed, the time
is now.


Meow!
-- 
The bill declaring Jesus as the King of Poland fails to specify whether
the addition is at the top or end of the list of kings.  What should the
historians do?


* Re: ILP32 for ARM64: testing with glibc testsuite
  2016-11-17 21:45       ` Steve Ellcey
@ 2016-12-05  9:58         ` Zhangjian (Bamvor)
  2016-12-05 10:07           ` Andreas Schwab
  0 siblings, 1 reply; 64+ messages in thread
From: Zhangjian (Bamvor) @ 2016-12-05  9:58 UTC (permalink / raw)
  To: Steve Ellcey, Maxim Kuvyrkov, Yury Norov
  Cc: arnd, catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch, GNU C Library, schwidefsky, heiko.carstens,
	Andrew Pinski, broonie, Joseph S. Myers, christoph.muellner,
	Szabolcs Nagy, klimov.linux, Nathan_Lynch, agraf, Prasun Kapoor,
	kilobyte, Geert Uytterhoeven, Dr. Philipp Tomsich,
	manuel.montezelo, linyongting, davem, zhouchengming1, cmetcalf,
	Adhemerval Zanella, Zhangjian (Bamvor),
	Ding Tianhong, Hanjun Guo, jijun (D),
	chenjianguo3, liupeifeng (A)

Hi, Steve

On 2016/11/18 5:45, Steve Ellcey wrote:
> On Wed, 2016-11-16 at 15:22 +0400, Maxim Kuvyrkov wrote:
>>>
>>> On Nov 9, 2016, at 1:56 PM, Yury Norov <ynorov@caviumnetworks.com>
>>> wrote:
>>>
>>>>
>>>> Below are the results of glibc testsuite run for aarch64/lp64
>
> I have been running the glibc testsuite as well.  I have only run it on
> an ILP32 enabled kernel.  Using that kernel, top-of-tree glibc, and the
> ILP32 glibc patches I have no LP64 regressions.  There are 5 failures
> in LP64 mode but I get them with vanilla top-of-tree glibc sources too.
> They are:
> 	nptl/eintr1 (I actually don't run this because it kills the 'make check')
> 	debug/tst-backtrace5
> 	debug/tst-backtrace6
> 	nptl/tst-stack4
> 	nptl/tst-thread_local1
>
> In ILP32 mode I get 33 failures, they include the above failures (minus
> nptl/tst-thread_local1) plus:
>
> 	c++-types-check
> 	conform/ISO11/inttypes.h/conform
> 	conform/ISO11/stdint.h/conform
> 	conform/ISO99/inttypes.h/conform
> 	conform/ISO99/stdint.h/conform
> 	conform/POSIX2008/inttypes.h/conform
> 	conform/POSIX2008/stdint.h/conform
> 	conform/XOPEN2K/inttypes.h/conform
> 	conform/XOPEN2K/stdint.h/conform
> 	conform/XOPEN2K8/inttypes.h/conform
> 	conform/XOPEN2K8/stdint.h/conform
> 	elf/tst-tls1
> 	elf/tst-tls1-static
> 	elf/tst-tls2
> 	elf/tst-tls2-static
> 	elf/tst-tls3
> 	math/check-abi-libm
> 	math/test-double
> 	math/test-double-finite
> 	math/test-float
> 	math/test-float-finite
> 	misc/tst-sync_file_range
> 	nptl/tst-cancel26
> 	nptl/tst-cancel27
> 	nptl/tst-sem3
> 	rt/tst-mqueue1
> 	rt/tst-mqueue2
> 	rt/tst-mqueue4
> 	rt/tst-mqueue7
> 	stdlib/tst-makecontext3
>
> I am currently looking at these ILP32 regressions (starting with the
> tls failures) to see if I can figure out what is happening with them.
Is there any progress on this? We could collaborate to fix those issues.

Regards

Bamvor
>
> Steve Ellcey
> sellcey@caviumnetworks.com
>


* Re: ILP32 for ARM64: testing with glibc testsuite
  2016-12-05  9:58         ` Zhangjian (Bamvor)
@ 2016-12-05 10:07           ` Andreas Schwab
  2016-12-05 10:24             ` Zhangjian (Bamvor)
  2016-12-05 19:33             ` Steve Ellcey
  0 siblings, 2 replies; 64+ messages in thread
From: Andreas Schwab @ 2016-12-05 10:07 UTC (permalink / raw)
  To: Zhangjian (Bamvor)
  Cc: Steve Ellcey, Maxim Kuvyrkov, Yury Norov, arnd, catalin.marinas,
	linux-arm-kernel, linux-kernel, linux-doc, linux-arch,
	GNU C Library, schwidefsky, heiko.carstens, Andrew Pinski,
	broonie, Joseph S. Myers, christoph.muellner, Szabolcs Nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun Kapoor, kilobyte,
	Geert Uytterhoeven, Dr. Philipp Tomsich, manuel.montezelo,
	linyongting, davem, <zh

On Dec 05 2016, "Zhangjian (Bamvor)" <bamvor.zhangjian@huawei.com> wrote:

> Is there any progress on this? We could collaborate to fix those issues.

All the elf/nptl/rt fails should be fixed by the recent binutils fixes.

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


* Re: ILP32 for ARM64 - testing with lmbench
  2016-11-17  7:48       ` Zhangjian (Bamvor)
@ 2016-12-05 10:16         ` Zhangjian (Bamvor)
  2016-12-05 14:13           ` Catalin Marinas
  0 siblings, 1 reply; 64+ messages in thread
From: Zhangjian (Bamvor) @ 2016-12-05 10:16 UTC (permalink / raw)
  To: Maxim Kuvyrkov
  Cc: Yury Norov, arnd, catalin.marinas, linux-arm-kernel,
	linux-kernel, linux-doc, linux-arch, schwidefsky, heiko.carstens,
	Andrew Pinski, broonie, Joseph S. Myers, christoph.muellner,
	Szabolcs Nagy, klimov.linux, Nathan_Lynch, agraf, Prasun Kapoor,
	kilobyte, Geert Uytterhoeven, Dr. Philipp Tomsich,
	manuel.montezelo, linyongting, David Miller, zhouchengming1,
	cmetcalf, sellcey, hanjun.guo, Ding Tianhong, Zhangjian (Bamvor),
	GNU C Library, matt.spencer

Hi, Catalin, Guys

Do you have any suggestions for the next step in upstreaming ILP32?
There are already test results for lmbench and specint. Do you think they are OK, or is more data needed to prove there is no regression?
I have also noticed that there are ILP32 failures in the glibc testsuite. Is that the only blocker for merging ILP32 (on the technical side)?

We appreciate any feedback or suggestions and hope we can collaborate to move the upstreaming forward.

(cc libc-alpha to get more input).

Thanks

Bamvor

On 2016/11/17 15:48, Zhangjian (Bamvor) wrote:
> Hi, Maxim
>
> On 2016/11/17 13:02, Maxim Kuvyrkov wrote:
>> Hi Bamvor,
>>
>> I'm surprised that you see this much difference from ILP32 patches on SPEC CPU2006int at all.  The SPEC CPU2006 benchmarks spend almost no time in the kernel syscalls.  I can imagine memory, TLB,
>> and cache handling in the kernel could affect CPU2006 benchmarks.  Do ILP32 patches touch code in those areas?
>>
>> Other than that, it would be interesting to check what the variance is between the 3 iterations of benchmark runs.  Could you check what relative standard deviation is between the 3 iterations --
>> (STDEV(RUN1, RUN2, RUN3) / RUNselected)?
>>
>> For reference, in my [non-ILP32] benchmarking I see 1.1% for 401.bzip2,  0.8% for 429.mcf, 0.2% for 456.hmmer, and 0.1% for 462.libquantum.
> Here is my result:
>                     ILP32_merged    ILP32_unmerged
>       401.bzip2            0.31%            0.26%
>       429.mcf              1.61%            1.36%
>       456.hmmer            1.37%            1.57%
>       462.libquantum       0.29%            0.28%
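
The relative standard deviation asked about above,
STDEV(RUN1, RUN2, RUN3) / RUNselected, could be computed along these lines.
This is a minimal sketch: it assumes the sample standard deviation (what a
spreadsheet STDEV computes) and assumes the selected run is the median of
the iterations unless one is passed in. The function name and the run
values in the usage part are made up for illustration and are not
measurements from this thread.

```python
import statistics


def relative_stdev(runs, selected=None):
    """Return stdev(runs) / selected as a fraction (0.01 means 1%).

    runs     -- per-iteration scores or times, at least two values
    selected -- the run used as the reported result; defaults to the
                median of runs (an assumption, see the text above)
    """
    if selected is None:
        selected = statistics.median(runs)
    return statistics.stdev(runs) / selected


if __name__ == "__main__":
    # Hypothetical scores from three iterations of one benchmark.
    runs = [100.0, 101.0, 102.0]
    print(f"relative stdev: {relative_stdev(runs):.2%}")
```

A difference between two kernels that is smaller than this figure for the
same benchmark is hard to tell apart from run-to-run noise.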
>
> Regards
>
> Bamvor
>
>>
>> --
>> Maxim Kuvyrkov
>> www.linaro.org
>>
>>
>>
>>> On Nov 17, 2016, at 7:28 AM, Zhangjian (Bamvor) <bamvor.zhangjian@huawei.com> wrote:
>>>
>>> Hi, all
>>>
>>> I tested specint for aarch64 LP64 with aarch32 el0 disabled and enabled,
>>> and compared against the ILP32 unmerged kernel (4.8-rc6) on our arm64 board. I found
>>> that the difference (ILP32 disabled/ILP32 unmerged) is bigger when aarch32 el0 is
>>> enabled than with the aarch32 el0 disabled kernel. bzip2, mcf, hmmer, and
>>> libquantum show the top four differences[1]. Note that bigger is better in
>>> the specint test.
>>>
>>> To confirm the above results, I retested these four test cases in a
>>> reportable way (see the command at the end). The result[2] shows that
>>> libquantum decrease -2.09% after ILP32 enabled and aarch32 on. I think it is in
>>> significant.
>>>
>>> The lmbench results are not stable on my board. I plan to dig into it later.
>>>
>>> [1] The following results were obtained with --size=ref --iterations=3.
>>> 1.1 Test when aarch32_el0 is enabled.
>>>                        ILP32 disabled        base line
>>>      400.perlbench            100.00%             100%
>>>      401.bzip2                 99.35%             100%
>>>      403.gcc                  100.26%             100%
>>>      429.mcf                  102.75%             100%
>>>      445.gobmk                100.00%             100%
>>>      456.hmmer                 95.66%             100%
>>>      458.sjeng                100.00%             100%
>>>      462.libquantum           100.00%             100%
>>>      471.omnetpp              100.59%             100%
>>>      473.astar                 99.66%             100%
>>>      483.xalancbmk             99.10%             100%
>>>
>>> 1.2 Test when aarch32_el0 is disabled
>>>                        ILP32 disabled         base line
>>>      400.perlbench            100.22%              100%
>>>      401.bzip2                100.95%              100%
>>>      403.gcc                  100.20%              100%
>>>      429.mcf                  100.76%              100%
>>>      445.gobmk                100.36%              100%
>>>      456.hmmer                 97.94%              100%
>>>      458.sjeng                 99.73%              100%
>>>      462.libquantum            98.72%              100%
>>>      471.omnetpp              100.86%              100%
>>>      473.astar                 99.15%              100%
>>>      483.xalancbmk            100.08%              100%
>>>
>>> [2] The following results were obtained with: runspec --config=my.cfg --size=test,train,ref --noreportable --tune=base,peak --iterations=3 bzip2 mcf hmmer libquantum
>>> 2.1 Test when aarch32_el0 is enabled.
>>>                         ILP32_enabled         base line
>>>      401.bzip2                100.82%              100%
>>>      429.mcf                  100.18%              100%
>>>      456.hmmer                 99.64%              100%
>>>      462.libquantum            97.91%              100%
>>>
>>> Regards
>>>
>>> Bamvor
>>>
>>> On 2016/10/28 20:46, Yury Norov wrote:
>>>> [Add Steve Ellcey, thanks for testing on ThunderX]
>>>>
>>>> Lmbench-3.0-a9 testing is performed on ThunderX machine to check that
>>>> ILP32 series does not add performance regressions for LP64. Test
>>>> summary is in the table below. Our measurements don't show a
>>>> significant performance regression for LP64 if the ILP32 code is merged,
>>>> whether enabled or disabled.
>>>>
>>>>               ILP32 enabled   ILP32  disabled   Standard Kernel
>>>> null syscall   0.1066          0.1121            0.1121
>>>>               95.09%          100.00%
>>>>
>>>> stat           1.3947          1.3814            1.3864
>>>>               100.60%         99.64%
>>>>
>>>> fstat          0.4459          0.4344            0.4524
>>>>               98.56%          96.02%
>>>>
>>>> open/close     4.0606          4.0411            4.0453
>>>>               100.38%         99.90%
>>>>
>>>> read           0.4819          0.5014            0.5014
>>>>               96.11%          100.00%
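
The percentage rows in the table above appear to be each measurement
divided by the Standard Kernel value for the same test. A small sketch of
that arithmetic, using the numbers from the table; the helper name
percent_of_baseline is hypothetical:

```python
def percent_of_baseline(value, baseline):
    """Return value as a percentage of baseline."""
    return 100.0 * value / baseline


# (test name, ILP32 enabled, ILP32 disabled, standard kernel),
# copied from the lmbench table above.
RESULTS = [
    ("null syscall", 0.1066, 0.1121, 0.1121),
    ("stat", 1.3947, 1.3814, 1.3864),
    ("fstat", 0.4459, 0.4344, 0.4524),
    ("open/close", 4.0606, 4.0411, 4.0453),
    ("read", 0.4819, 0.5014, 0.5014),
]

if __name__ == "__main__":
    for name, enabled, disabled, standard in RESULTS:
        print(
            f"{name:<13}"
            f"{percent_of_baseline(enabled, standard):8.2f}%"
            f"{percent_of_baseline(disabled, standard):8.2f}%"
        )
```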
>>>>
>>>> Tested with linux 4.8 because 4.9-rc1 is not fixed yet for ThunderX.
>>>> Other system details below.
>>>>
>>>> Yury.
>>>>
>>>> ubuntu@crb6:~$ uname -a
>>>> Linux crb6 4.8.0+ #3 SMP Thu Oct 27 11:01:32 PDT 2016 aarch64 aarch64 aarch64 GNU/Linux
>>>>
>>>> ubuntu@crb6:~$ cat /proc/meminfo
>>>> MemTotal:       132011948 kB
>>>> MemFree:        131442672 kB
>>>> MemAvailable:   130695764 kB
>>>> Buffers:           15696 kB
>>>> Cached:            88088 kB
>>>> SwapCached:            0 kB
>>>> Active:            82760 kB
>>>> Inactive:          41336 kB
>>>> Active(anon):      20880 kB
>>>> Inactive(anon):     8576 kB
>>>> Active(file):      61880 kB
>>>> Inactive(file):    32760 kB
>>>> Unevictable:           0 kB
>>>> Mlocked:               0 kB
>>>> SwapTotal:      128920572 kB
>>>> SwapFree:       128920572 kB
>>>> Dirty:                 0 kB
>>>> Writeback:             0 kB
>>>> AnonPages:         20544 kB
>>>> Mapped:            19780 kB
>>>> Shmem:              9060 kB
>>>> Slab:              78804 kB
>>>> SReclaimable:      27372 kB
>>>> SUnreclaim:        51432 kB
>>>> KernelStack:        8336 kB
>>>> PageTables:          820 kB
>>>> NFS_Unstable:          0 kB
>>>> Bounce:                0 kB
>>>> WritebackTmp:          0 kB
>>>> CommitLimit:    194926544 kB
>>>> Committed_AS:     256324 kB
>>>> VmallocTotal:   135290290112 kB
>>>> VmallocUsed:           0 kB
>>>> VmallocChunk:          0 kB
>>>> AnonHugePages:         0 kB
>>>> ShmemHugePages:        0 kB
>>>> ShmemPmdMapped:        0 kB
>>>> CmaTotal:              0 kB
>>>> CmaFree:               0 kB
>>>> HugePages_Total:       0
>>>> HugePages_Free:        0
>>>> HugePages_Rsvd:        0
>>>> HugePages_Surp:        0
>>>> Hugepagesize:       2048 kB
>>>>
>>>> ubuntu@crb6:~$ cat /proc/cpuinfo
>>>> processor    : 0
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 1
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 2
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 3
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 4
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 5
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 6
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 7
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 8
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 9
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 10
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 11
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 12
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 13
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 14
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 15
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 16
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 17
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 18
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 19
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 20
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 21
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 22
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 23
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 24
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 25
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 26
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 27
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 28
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 29
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 30
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 31
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 32
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 33
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 34
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 35
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 36
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 37
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 38
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 39
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 40
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 41
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 42
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 43
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 44
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 45
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 46
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>> processor    : 47
>>>> BogoMIPS    : 200.00
>>>> Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics
>>>> CPU implementer    : 0x43
>>>> CPU architecture: 8
>>>> CPU variant    : 0x1
>>>> CPU part    : 0x0a1
>>>> CPU revision    : 0
>>>>
>>>
>>
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: ILP32 for ARM64: testing with glibc testsuite
  2016-12-05 10:07           ` Andreas Schwab
@ 2016-12-05 10:24             ` Zhangjian (Bamvor)
  2016-12-06  5:29               ` Yury Norov
  2016-12-05 19:33             ` Steve Ellcey
  1 sibling, 1 reply; 64+ messages in thread
From: Zhangjian (Bamvor) @ 2016-12-05 10:24 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Steve Ellcey, Maxim Kuvyrkov, Yury Norov, arnd, catalin.marinas,
	linux-arm-kernel, linux-kernel, linux-doc, linux-arch,
	GNU C Library, schwidefsky, heiko.carstens, Andrew Pinski,
	broonie, Joseph S. Myers, christoph.muellner, Szabolcs Nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun Kapoor, kilobyte,
	Geert Uytterhoeven, Dr. Philipp Tomsich, manuel.montezelo,
	linyongting, davem, Zhangjian (Bamvor)



On 2016/12/5 18:07, Andreas Schwab wrote:
> On Dez 05 2016, "Zhangjian (Bamvor)" <bamvor.zhangjian@huawei.com> wrote:
>
>> Is there any progress on it? We could collaborate to fix those issues.
>
> All the elf/nptl/rt fails should be fixed by the recent binutils fixes.
Cool. How about the conform and other failures?

Regards

Bamvor
>
> Andreas.
>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: ILP32 for ARM64 - testing with lmbench
  2016-12-05 10:16         ` Zhangjian (Bamvor)
@ 2016-12-05 14:13           ` Catalin Marinas
  2016-12-11 12:08             ` Yury Norov
  0 siblings, 1 reply; 64+ messages in thread
From: Catalin Marinas @ 2016-12-05 14:13 UTC (permalink / raw)
  To: Zhangjian (Bamvor)
  Cc: Maxim Kuvyrkov, linux-doc, Szabolcs Nagy, heiko.carstens,
	cmetcalf, Yury Norov, Dr. Philipp Tomsich, matt.spencer,
	Joseph S. Myers, linux-arch, zhouchengming1, sellcey,
	Prasun Kapoor, agraf, Geert Uytterhoeven, Ding Tianhong,
	kilobyte, manuel.montezelo, arnd, Andrew Pinski, linyongting,
	klimov.linux, broonie, linux-arm-kernel, GNU C Library,
	Nathan_Lynch, linux-kernel, hanjun.guo, schwidefsky,
	David Miller, christoph.muellner

On Mon, Dec 05, 2016 at 06:16:09PM +0800, Zhangjian (Bamvor) wrote:
> Do you have a suggestion for the next step in upstreaming ILP32?

I mentioned the steps a few times before. I'm pasting them again here:

1. Complete the review of the Linux patches and ABI (no merge yet)
2. Review the corresponding glibc patches (no merge yet)
3. Ask (Linaro, Cavium) for toolchain + filesystem (pre-built and more
   than just busybox) to be able to reproduce the testing in ARM
4. More testing (LTP, trinity, performance regressions etc.)
5. Move the ILP32 PCS out of beta (based on the results from 4)
6. Check the market again to see if anyone still needs ILP32
7. Based on 6, decide whether to merge the kernel and glibc patches

What's not explicitly mentioned in step 4 is glibc testing. Point 5 is
ARM's responsibility (toolchain folk).

> There are already test results for lmbench and specint. Do you think they
> are OK, or do you need more data to prove there is no regression?

I would need to reproduce the tests myself, see step 3.

> I have also noticed that there are ILP32 failures in the glibc testsuite.
> Is that the only blocker for merging ILP32 (on the technical side)?

It's probably not the only blocker but I have to review the kernel
patches again to make sure. I'd also like to see whether the libc-alpha
community is ok with the glibc counterpart (but don't merge the patches
until the ABI is agreed on both sides).

On performance, I want to make sure there are no regressions on
AArch32/compat and AArch64/LP64.

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/18] arm64: introduce binfmt_elf32.c
  2016-10-21 20:33 ` [PATCH 09/18] arm64: introduce binfmt_elf32.c Yury Norov
@ 2016-12-05 15:10   ` Catalin Marinas
  2016-12-14  9:39     ` Yury Norov
  0 siblings, 1 reply; 64+ messages in thread
From: Catalin Marinas @ 2016-12-05 15:10 UTC (permalink / raw)
  To: Yury Norov
  Cc: arnd, linux-arm-kernel, linux-kernel, linux-doc, linux-arch,
	szabolcs.nagy, heiko.carstens, cmetcalf, philipp.tomsich, joseph,
	zhouchengming1, Prasun.Kapoor, agraf, geert, kilobyte,
	manuel.montezelo, pinskia, linyongting, klimov.linux, broonie,
	bamvor.zhangjian, maxim.kuvyrkov, Nathan_Lynch, schwidefsky,
	davem, christoph.muellner

On Fri, Oct 21, 2016 at 11:33:08PM +0300, Yury Norov wrote:
> As we support more than one compat format, it seems more reasonable
> not to use fs/compat_binfmt.c. A custom binfmt_elf32.c allows moving the
> aarch32-specific definitions there and makes the code more maintainable
> and readable.

Can you remind me why we need this patch (rather than using the default
fs/compat_binfmt_elf.c which you include here anyway)?

> --- /dev/null
> +++ b/arch/arm64/kernel/binfmt_elf32.c
> @@ -0,0 +1,31 @@
> +/*
> + * Support for AArch32 Linux ELF binaries.
> + */
> +
> +/* AArch32 EABI. */
> +#define EF_ARM_EABI_MASK		0xff000000
> +
> +#define compat_start_thread		compat_start_thread
> +#define COMPAT_SET_PERSONALITY(ex)		\
> +do {						\
> +	clear_thread_flag(TIF_32BIT_AARCH64);	\
> +	set_thread_flag(TIF_32BIT);		\
> +} while (0)

You introduce this here but it seems to still be present in asm/elf.h.
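
For readers following the thread, the effect of the quoted macro can be
modelled in plain C. This is only a sketch: the bit positions and helper
names below are placeholders rather than the kernel's real TIF_* values
or accessors, and it assumes that TIF_32BIT marks an AArch32 task while
TIF_32BIT_AARCH64 marks an AArch64/ILP32 task.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit positions, not the kernel's real TIF_* numbers. */
#define TIF_32BIT		(1u << 0)	/* assumed: AArch32 compat task */
#define TIF_32BIT_AARCH64	(1u << 1)	/* assumed: AArch64/ILP32 task */

/* Model of the quoted COMPAT_SET_PERSONALITY(): mark the task as AArch32. */
static inline unsigned int set_aarch32_personality(unsigned int flags)
{
	flags &= ~TIF_32BIT_AARCH64;	/* clear_thread_flag(TIF_32BIT_AARCH64) */
	flags |= TIF_32BIT;		/* set_thread_flag(TIF_32BIT) */
	return flags;
}

/* Hypothetical helpers that test which 32-bit ABI a task uses. */
static inline bool is_aarch32_task(unsigned int flags)
{
	return (flags & TIF_32BIT) != 0;
}

static inline bool is_ilp32_task(unsigned int flags)
{
	return (flags & TIF_32BIT_AARCH64) != 0;
}
```

Starting from a task that has only TIF_32BIT_AARCH64 set, the model
leaves it with only TIF_32BIT set, so the two flags never describe the
same task at once.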

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 10/18] arm64: ilp32: introduce binfmt_ilp32.c
  2016-10-21 20:33 ` [PATCH 10/18] arm64: ilp32: introduce binfmt_ilp32.c Yury Norov
@ 2016-12-05 15:38   ` Catalin Marinas
  2016-12-21 18:56     ` Yury Norov
  0 siblings, 1 reply; 64+ messages in thread
From: Catalin Marinas @ 2016-12-05 15:38 UTC (permalink / raw)
  To: Yury Norov
  Cc: arnd, linux-arm-kernel, linux-kernel, linux-doc, linux-arch,
	szabolcs.nagy, heiko.carstens, cmetcalf, philipp.tomsich, joseph,
	zhouchengming1, Prasun.Kapoor, agraf, geert, kilobyte,
	manuel.montezelo, pinskia, linyongting, klimov.linux, broonie,
	bamvor.zhangjian, Bamvor Zhang Jian, maxim.kuvyrkov,
	Nathan_Lynch, schwidefsky, davem, christoph.muellner

On Fri, Oct 21, 2016 at 11:33:09PM +0300, Yury Norov wrote:
> binfmt_ilp32.c is needed to handle ILP32 binaries
> 
> Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
> Signed-off-by: Bamvor Zhang Jian <bamvor.zhangjian@linaro.org>
> ---
>  arch/arm64/include/asm/elf.h     |  6 +++
>  arch/arm64/kernel/Makefile       |  1 +
>  arch/arm64/kernel/binfmt_ilp32.c | 97 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 104 insertions(+)
>  create mode 100644 arch/arm64/kernel/binfmt_ilp32.c
> 
> diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> index f259fe8..be29dde 100644
> --- a/arch/arm64/include/asm/elf.h
> +++ b/arch/arm64/include/asm/elf.h
> @@ -175,10 +175,16 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm,
>  
>  #define COMPAT_ELF_ET_DYN_BASE		(2 * TASK_SIZE_32 / 3)
>  
> +#ifndef USE_AARCH64_GREG
>  /* AArch32 registers. */
>  #define COMPAT_ELF_NGREG		18
>  typedef unsigned int			compat_elf_greg_t;
>  typedef compat_elf_greg_t		compat_elf_gregset_t[COMPAT_ELF_NGREG];
> +#else /* AArch64 registers for AARCH64/ILP32 */
> +#define COMPAT_ELF_NGREG	ELF_NGREG
> +#define compat_elf_greg_t	elf_greg_t
> +#define compat_elf_gregset_t	elf_gregset_t
> +#endif

I think you only need compat_elf_gregset_t definition here and leave the
other two undefined.
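
A rough sketch of how that could look, compilable outside the kernel.
The elf_greg_t and ELF_NGREG definitions at the top are stand-ins added
only so the fragment builds on its own (the value 34 is an assumption);
the rest follows the quoted hunk with the two extra #defines dropped.

```c
#include <assert.h>

/* Stand-ins so the fragment compiles on its own (assumed values). */
typedef unsigned long long elf_greg_t;
#define ELF_NGREG 34
typedef elf_greg_t elf_gregset_t[ELF_NGREG];

#ifndef USE_AARCH64_GREG
/* AArch32 registers. */
#define COMPAT_ELF_NGREG		18
typedef unsigned int			compat_elf_greg_t;
typedef compat_elf_greg_t		compat_elf_gregset_t[COMPAT_ELF_NGREG];
#else /* AArch64 registers for AArch64/ILP32 */
/* Only the register-set type is redirected; the other two stay undefined. */
#define compat_elf_gregset_t	elf_gregset_t
#endif
```

Without USE_AARCH64_GREG the register set is 18 32-bit words; with it,
compat_elf_gregset_t is simply the native elf_gregset_t.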

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 14/18] arm64: signal32: move ilp32 and aarch32 common code to separated file
  2016-10-21 20:33 ` [PATCH 14/18] arm64: signal32: move ilp32 and aarch32 common code to separated file Yury Norov
@ 2016-12-05 16:18   ` Catalin Marinas
  2016-12-06  9:36     ` Yury Norov
  0 siblings, 1 reply; 64+ messages in thread
From: Catalin Marinas @ 2016-12-05 16:18 UTC (permalink / raw)
  To: Yury Norov
  Cc: arnd, linux-arm-kernel, linux-kernel, linux-doc, linux-arch,
	szabolcs.nagy, heiko.carstens, cmetcalf, philipp.tomsich, joseph,
	zhouchengming1, Prasun.Kapoor, agraf, geert, kilobyte,
	manuel.montezelo, pinskia, linyongting, klimov.linux, broonie,
	bamvor.zhangjian, maxim.kuvyrkov, Nathan_Lynch, schwidefsky,
	davem, christoph.muellner

On Fri, Oct 21, 2016 at 11:33:13PM +0300, Yury Norov wrote:
> Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>

Please add some description, even if it means copying the subject.

> ---
>  arch/arm64/include/asm/signal32.h        |   3 +
>  arch/arm64/include/asm/signal32_common.h |  27 +++++++
>  arch/arm64/kernel/Makefile               |   2 +-
>  arch/arm64/kernel/signal32.c             | 107 ------------------------
>  arch/arm64/kernel/signal32_common.c      | 135 +++++++++++++++++++++++++++++++
>  5 files changed, 166 insertions(+), 108 deletions(-)
>  create mode 100644 arch/arm64/include/asm/signal32_common.h
>  create mode 100644 arch/arm64/kernel/signal32_common.c

I wonder whether you can make such patches more readable by setting
"diff.renames" to "copy" in your gitconfig (unless it's set already and
Git cannot detect partial file code moving/copying).

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32
  2016-10-21 20:33 ` [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32 Yury Norov
@ 2016-12-05 16:34   ` Catalin Marinas
  2016-12-06  6:25     ` Yury Norov
  0 siblings, 1 reply; 64+ messages in thread
From: Catalin Marinas @ 2016-12-05 16:34 UTC (permalink / raw)
  To: Yury Norov
  Cc: arnd, linux-arm-kernel, linux-kernel, linux-doc, linux-arch,
	szabolcs.nagy, heiko.carstens, cmetcalf, philipp.tomsich, joseph,
	zhouchengming1, Prasun.Kapoor, agraf, geert, kilobyte,
	manuel.montezelo, pinskia, linyongting, klimov.linux, broonie,
	bamvor.zhangjian, Bamvor Zhang Jian, maxim.kuvyrkov,
	Nathan_Lynch, schwidefsky, davem, christoph.muellner

On Fri, Oct 21, 2016 at 11:33:15PM +0300, Yury Norov wrote:
> New aarch32 ptrace syscall handler is introduced to avoid run-time
> detection of the task type.

What's wrong with the run-time detection? If it's just to avoid a
negligible overhead, I would rather keep the code simpler by avoiding
duplicating the generic compat_sys_ptrace().
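
To illustrate what run-time detection means here, a minimal model in C.
Everything in it is hypothetical: the enum names and the dispatch
function are made up for illustration and are not the code from the
patch or from the generic compat_sys_ptrace().

```c
#include <assert.h>

/* Hypothetical task types sharing one compat ptrace entry point. */
enum task_abi { TASK_AARCH32, TASK_ILP32 };

/* Hypothetical request-handling paths. */
enum ptrace_path { PATH_AARCH32_SPECIFIC, PATH_GENERIC_COMPAT };

/*
 * Run-time detection: a single entry point checks the task type on every
 * call and picks the handling path, instead of wiring a separate handler
 * into each syscall table.
 */
static inline enum ptrace_path compat_ptrace_dispatch(enum task_abi abi)
{
	if (abi == TASK_AARCH32)
		return PATH_AARCH32_SPECIFIC;
	return PATH_GENERIC_COMPAT;
}
```

The cost of this variant is one extra branch per ptrace call; the
alternative avoids the branch but needs a second syscall handler.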

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 11/18] arm64: ilp32: share aarch32 syscall handlers
  2016-10-21 20:33 ` [PATCH 11/18] arm64: ilp32: share aarch32 syscall handlers Yury Norov
@ 2016-12-05 17:12   ` Catalin Marinas
  2016-12-06  7:32     ` Yury Norov
  0 siblings, 1 reply; 64+ messages in thread
From: Catalin Marinas @ 2016-12-05 17:12 UTC (permalink / raw)
  To: Yury Norov
  Cc: arnd, linux-arm-kernel, linux-kernel, linux-doc, linux-arch,
	szabolcs.nagy, heiko.carstens, cmetcalf, philipp.tomsich, joseph,
	zhouchengming1, Prasun.Kapoor, agraf, geert, kilobyte,
	manuel.montezelo, pinskia, linyongting, klimov.linux, broonie,
	bamvor.zhangjian, maxim.kuvyrkov, Nathan_Lynch, schwidefsky,
	davem, christoph.muellner

On Fri, Oct 21, 2016 at 11:33:10PM +0300, Yury Norov wrote:
> off_t is passed in a register pair, just like in aarch32.
> In this patch, the corresponding aarch32 handlers are shared with the
> ilp32 code.
[...]
> +/*
> + * Note: off_4k (w5) is always in units of 4K. If we can't do the
> + * requested offset because it is not page-aligned, we return -EINVAL.
> + */
> +ENTRY(compat_sys_mmap2_wrapper)
> +#if PAGE_SHIFT > 12
> +	tst	w5, #~PAGE_MASK >> 12
> +	b.ne	1f
> +	lsr	w5, w5, #PAGE_SHIFT - 12
> +#endif
> +	b	sys_mmap_pgoff
> +1:	mov	x0, #-EINVAL
> +	ret
> +ENDPROC(compat_sys_mmap2_wrapper)

For compat sys_mmap2, the pgoff argument is in multiples of 4K. This was
traditionally used for architectures where off_t is 32-bit to allow
mapping files to 2^44.

Since off_t is 64-bit with AArch64/ILP32, should we just pass the off_t
as a 64-bit value in two different registers (w5 and w6)?
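
To make the arithmetic concrete, here is a rough C model of the offset
handling in the quoted wrapper. It is a sketch only: the function names
are made up for illustration, and the page shift is passed as a
parameter rather than taken from the kernel's PAGE_SHIFT.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MMAP2_UNIT_SHIFT 12	/* mmap2 offsets count 4 KiB units */

/* Byte offset named by a 32-bit mmap2 offset argument. */
static inline uint64_t mmap2_byte_offset(uint32_t off_4k)
{
	return (uint64_t)off_4k << MMAP2_UNIT_SHIFT;
}

/*
 * Model of the quoted wrapper: convert 4 KiB units into units of the
 * page size, or fail with -EINVAL if the offset is not page-aligned.
 */
static inline int mmap2_to_pgoff(uint32_t off_4k, unsigned int page_shift,
				 uint32_t *pgoff)
{
	unsigned int extra;

	/* Only page sizes of 4 KiB and above make sense in this model. */
	if (page_shift < MMAP2_UNIT_SHIFT || page_shift > 31)
		return -EINVAL;

	extra = page_shift - MMAP2_UNIT_SHIFT;
	if (off_4k & ((1u << extra) - 1))	/* tst w5, #~PAGE_MASK >> 12 */
		return -EINVAL;

	*pgoff = off_4k >> extra;		/* lsr w5, w5, #PAGE_SHIFT - 12 */
	return 0;
}
```

With 64 KiB pages (page_shift 16), an off_4k of 16 becomes page 1, while
an off_4k of 1 is rejected. The largest byte offset a 32-bit argument
can name is 0xFFFFFFFF000, just under 2^44.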

-- 
Catalin

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: ILP32 for ARM64: testing with glibc testsuite
  2016-12-05 10:07           ` Andreas Schwab
  2016-12-05 10:24             ` Zhangjian (Bamvor)
@ 2016-12-05 19:33             ` Steve Ellcey
  2016-12-06  8:31               ` Andreas Schwab
  1 sibling, 1 reply; 64+ messages in thread
From: Steve Ellcey @ 2016-12-05 19:33 UTC (permalink / raw)
  To: Andreas Schwab, Zhangjian (Bamvor)
  Cc: Maxim Kuvyrkov, Yury Norov, arnd, catalin.marinas,
	linux-arm-kernel, linux-kernel, linux-doc, linux-arch,
	GNU C Library, schwidefsky, heiko.carstens, Andrew Pinski,
	broonie, Joseph S. Myers, christoph.muellner, Szabolcs Nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun Kapoor, kilobyte,
	Geert Uytterhoeven, Dr. Philipp Tomsich, manuel.montezelo,
	linyongting, davem, zh

On Mon, 2016-12-05 at 11:07 +0100, Andreas Schwab wrote:
> On Dez 05 2016, "Zhangjian (Bamvor)" <bamvor.zhangjian@huawei.com>
> wrote:
> 
> > 
> > Is there any progress on it? We could collaborate to fix those
> > issues.
> All the elf/nptl/rt fails should be fixed by the recent binutils
> fixes.
> 
> Andreas.

I am using binutils ToT and Yury's latest patch
(https://sourceware.org/ml/binutils/2016-12/msg00039.html) and I am still
seeing some nptl and rt failures in the glibc testsuite, specifically:

FAIL: nptl/tst-cancel26
FAIL: nptl/tst-cancel27
FAIL: nptl/tst-stack4
FAIL: rt/tst-mqueue1
FAIL: rt/tst-mqueue2
FAIL: rt/tst-mqueue4
FAIL: rt/tst-mqueue7

Steve Ellcey
sellcey@caviumnetworks.com

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: ILP32 for ARM64: testing with glibc testsuite
  2016-12-05 10:24             ` Zhangjian (Bamvor)
@ 2016-12-06  5:29               ` Yury Norov
  0 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-12-06  5:29 UTC (permalink / raw)
  To: Zhangjian (Bamvor)
  Cc: Andreas Schwab, Steve Ellcey, Maxim Kuvyrkov, arnd,
	catalin.marinas, linux-arm-kernel, linux-kernel, linux-doc,
	linux-arch, GNU C Library, schwidefsky, heiko.carstens,
	Andrew Pinski, broonie, Joseph S. Myers, christoph.muellner,
	Szabolcs Nagy, klimov.linux, Nathan_Lynch, agraf, Prasun Kapoor,
	kilobyte, Geert Uytterhoeven, Dr. Philipp Tomsich,
	manuel.montezelo, linyongting, davem

On Mon, Dec 05, 2016 at 06:24:11PM +0800, Zhangjian (Bamvor) wrote:
> 
> 
> On 2016/12/5 18:07, Andreas Schwab wrote:
> >On Dez 05 2016, "Zhangjian (Bamvor)" <bamvor.zhangjian@huawei.com> wrote:
> >
> >>Is there any progress on it? We could collaborate to fix those issues.
> >
> >All the elf/nptl/rt fails should be fixed by the recent binutils fixes.
> Cool. How about the conform and other failures?

I think conform is only a local problem on my side. I use a pretty
non-standard environment for building and testing: cross-compilation
plus qemu. Steve builds and runs the tests natively, and he doesn't see
those regressions.

Yury

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32
  2016-12-05 16:34   ` Catalin Marinas
@ 2016-12-06  6:25     ` Yury Norov
  2016-12-06  6:30       ` Yury Norov
  2016-12-07 16:59       ` Catalin Marinas
  0 siblings, 2 replies; 64+ messages in thread
From: Yury Norov @ 2016-12-06  6:25 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: arnd, linux-arm-kernel, linux-kernel, linux-doc, linux-arch,
	szabolcs.nagy, heiko.carstens, cmetcalf, philipp.tomsich, joseph,
	zhouchengming1, Prasun.Kapoor, agraf, geert, kilobyte,
	manuel.montezelo, pinskia, linyongting, klimov.linux, broonie,
	bamvor.zhangjian, Bamvor Zhang Jian, maxim.kuvyrkov,
	Nathan_Lynch, schwidefsky, davem, christoph.muellner

On Mon, Dec 05, 2016 at 04:34:23PM +0000, Catalin Marinas wrote:
> On Fri, Oct 21, 2016 at 11:33:15PM +0300, Yury Norov wrote:
> > New aarch32 ptrace syscall handler is introduced to avoid run-time
> > detection of the task type.
> 
> What's wrong with the run-time detection? If it's just to avoid a
> negligible overhead, I would rather keep the code simpler by avoiding
> duplicating the generic compat_sys_ptrace().

Nothing wrong. This is how Arnd asked me to do it. You already asked this
question: http://lkml.iu.edu/hypermail/linux/kernel/1604.3/00930.html

If it still looks weird to you, I can switch back to run-time
detection in ptrace. But I'd like to hear Arnd's opinion.

Yury.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32
  2016-12-06  6:25     ` Yury Norov
@ 2016-12-06  6:30       ` Yury Norov
  2016-12-07 16:59       ` Catalin Marinas
  1 sibling, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-12-06  6:30 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-doc, szabolcs.nagy, heiko.carstens, cmetcalf,
	philipp.tomsich, joseph, linux-arch, zhouchengming1,
	Prasun.Kapoor, agraf, geert, kilobyte, manuel.montezelo, arnd,
	pinskia, linyongting, klimov.linux, broonie, bamvor.zhangjian,
	Bamvor Zhang Jian, linux-arm-kernel, maxim.kuvyrkov,
	Nathan_Lynch, linux-kernel, schwidefsky, davem,
	christoph.muellner

On Tue, Dec 06, 2016 at 11:55:08AM +0530, Yury Norov wrote:
> On Mon, Dec 05, 2016 at 04:34:23PM +0000, Catalin Marinas wrote:
> > On Fri, Oct 21, 2016 at 11:33:15PM +0300, Yury Norov wrote:
> > > New aarch32 ptrace syscall handler is introduced to avoid run-time
> > > detection of the task type.
> > 
> > What's wrong with the run-time detection? If it's just to avoid a
> > negligible overhead, I would rather keep the code simpler by avoiding
> > duplicating the generic compat_sys_ptrace().
> 
> Nothing wrong. This is how Arnd asked me to do it. You already asked this
> question: http://lkml.iu.edu/hypermail/linux/kernel/1604.3/00930.html
> 
> If it's still looking weird to you, I can switch back to runtime
> ptrace. But I'd like to see Arnd's opinion.
 
 This is Arnd's email:
 https://patchwork.kernel.org/patch/7980521/

 Yury.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 11/18] arm64: ilp32: share aarch32 syscall handlers
  2016-12-05 17:12   ` Catalin Marinas
@ 2016-12-06  7:32     ` Yury Norov
  0 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-12-06  7:32 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: arnd, linux-arm-kernel, linux-kernel, linux-doc, linux-arch,
	szabolcs.nagy, heiko.carstens, cmetcalf, philipp.tomsich, joseph,
	zhouchengming1, Prasun.Kapoor, agraf, geert, kilobyte,
	manuel.montezelo, pinskia, linyongting, klimov.linux, broonie,
	bamvor.zhangjian, maxim.kuvyrkov, Nathan_Lynch, schwidefsky,
	davem, christoph.muellner

On Mon, Dec 05, 2016 at 05:12:43PM +0000, Catalin Marinas wrote:
> On Fri, Oct 21, 2016 at 11:33:10PM +0300, Yury Norov wrote:
> > off_t is passed in a register pair, just like in aarch32.
> > In this patch, the corresponding aarch32 handlers are shared with the
> > ilp32 code.
> [...]
> > +/*
> > + * Note: off_4k (w5) is always in units of 4K. If we can't do the
> > + * requested offset because it is not page-aligned, we return -EINVAL.
> > + */
> > +ENTRY(compat_sys_mmap2_wrapper)
> > +#if PAGE_SHIFT > 12
> > +	tst	w5, #~PAGE_MASK >> 12
> > +	b.ne	1f
> > +	lsr	w5, w5, #PAGE_SHIFT - 12
> > +#endif
> > +	b	sys_mmap_pgoff
> > +1:	mov	x0, #-EINVAL
> > +	ret
> > +ENDPROC(compat_sys_mmap2_wrapper)
> 
> For compat sys_mmap2, the pgoff argument is in multiples of 4K. This was
> traditionally used for architectures where off_t is 32-bit to allow
> mapping files to 2^44.
> 
> Since off_t is 64-bit with AArch64/ILP32, should we just pass the off_t
> as a 64-bit value in two different registers (w5 and w6)?

The current glibc implementation would break for 64-bit off_t if I do
what you suggest:

sysdeps/unix/sysv/linux/generic/wordsize-32/mmap.c:

__ptr_t
__mmap (__ptr_t addr, size_t len, int prot, int flags, int fd, off_t offset)
{
  if (offset & (MMAP_PAGE_UNIT - 1))
    {
      __set_errno (EINVAL);
      return MAP_FAILED;
    }
  return (__ptr_t) INLINE_SYSCALL (mmap2, 6, addr, len, prot, flags, fd,
                                   offset / MMAP_PAGE_UNIT);
}

weak_alias (__mmap, mmap)

So it requires changes both in glibc and in kernel. I can do it. But
I'd like to collect opinions of kernel and glibc developers before
starting it. 
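
To show the two calling schemes side by side, a small C sketch of the
user-space side follows. It is not glibc code: the helper names and the
reg_pair struct are made up, the MMAP_PAGE_UNIT value of 4096 is an
assumption, and which register would carry the low half is also an
assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MMAP_PAGE_UNIT 4096ULL	/* assumed value */

/*
 * Current scheme: one 32-bit argument carrying offset / 4096.
 * Offsets of 2^44 bytes or more do not fit and would be truncated by
 * the cast; this model does not check for that.
 */
static inline int offset_to_mmap2_arg(uint64_t offset, uint32_t *arg)
{
	if (offset & (MMAP_PAGE_UNIT - 1))
		return -EINVAL;
	*arg = (uint32_t)(offset / MMAP_PAGE_UNIT);
	return 0;
}

/* Alternative scheme: the byte offset split across two 32-bit registers. */
struct reg_pair {
	uint32_t lo;	/* assumed to travel in the first register */
	uint32_t hi;	/* assumed to travel in the second register */
};

static inline struct reg_pair offset_to_reg_pair(uint64_t offset)
{
	struct reg_pair p = {
		.lo = (uint32_t)offset,
		.hi = (uint32_t)(offset >> 32),
	};
	return p;
}

/* What the receiving side would do to rebuild the 64-bit offset. */
static inline uint64_t reg_pair_to_offset(struct reg_pair p)
{
	return ((uint64_t)p.hi << 32) | p.lo;
}
```

The two schemes put different values in the same argument slot, which is
why a change on one side alone breaks the other.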

Yury

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: ILP32 for ARM64: testing with glibc testsuite
  2016-12-05 19:33             ` Steve Ellcey
@ 2016-12-06  8:31               ` Andreas Schwab
  0 siblings, 0 replies; 64+ messages in thread
From: Andreas Schwab @ 2016-12-06  8:31 UTC (permalink / raw)
  To: Steve Ellcey
  Cc: Zhangjian (Bamvor),
	Maxim Kuvyrkov, Yury Norov, arnd, catalin.marinas,
	linux-arm-kernel, linux-kernel, linux-doc, linux-arch,
	GNU C Library, schwidefsky, heiko.carstens, Andrew Pinski,
	broonie, Joseph S. Myers, christoph.muellner, Szabolcs Nagy,
	klimov.linux, Nathan_Lynch, agraf, Prasun Kapoor, kilobyte,
	Geert Uytterhoeven, Dr. Philipp Tomsich, manuel.montezelo,
	linyongting, davem, zh

On Dez 05 2016, Steve Ellcey <sellcey@caviumnetworks.com> wrote:

> FAIL: nptl/tst-cancel26
> FAIL: nptl/tst-cancel27

> FAIL: rt/tst-mqueue1
> FAIL: rt/tst-mqueue2
> FAIL: rt/tst-mqueue4
> FAIL: rt/tst-mqueue7

I don't see these failures.  Maybe you need to rebuild libgcc?

https://build.opensuse.org/package/live_build_log/devel:ARM:AArch64:ILP32/glibc-testsuite/standard/aarch64_ilp32

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 14/18] arm64: signal32: move ilp32 and aarch32 common code to separated file
  2016-12-05 16:18   ` Catalin Marinas
@ 2016-12-06  9:36     ` Yury Norov
  0 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-12-06  9:36 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: arnd, linux-arm-kernel, linux-kernel, linux-doc, linux-arch,
	szabolcs.nagy, heiko.carstens, cmetcalf, philipp.tomsich, joseph,
	zhouchengming1, Prasun.Kapoor, agraf, geert, kilobyte,
	manuel.montezelo, pinskia, linyongting, klimov.linux, broonie,
	bamvor.zhangjian, maxim.kuvyrkov, Nathan_Lynch, schwidefsky,
	davem, christoph.muellner

On Mon, Dec 05, 2016 at 04:18:24PM +0000, Catalin Marinas wrote:
> On Fri, Oct 21, 2016 at 11:33:13PM +0300, Yury Norov wrote:
> > Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
> 
> Please add some description, even if it means copying the subject.
> 
> > ---
> >  arch/arm64/include/asm/signal32.h        |   3 +
> >  arch/arm64/include/asm/signal32_common.h |  27 +++++++
> >  arch/arm64/kernel/Makefile               |   2 +-
> >  arch/arm64/kernel/signal32.c             | 107 ------------------------
> >  arch/arm64/kernel/signal32_common.c      | 135 +++++++++++++++++++++++++++++++
> >  5 files changed, 166 insertions(+), 108 deletions(-)
> >  create mode 100644 arch/arm64/include/asm/signal32_common.h
> >  create mode 100644 arch/arm64/kernel/signal32_common.c
> 
> I wonder whether you can make such patches more readable by setting
> "diff.renames" to "copy" in your gitconfig (unless it's set already and
> Git cannot detect partial file code moving/copying).

I tried "git format-patch -C --find-copies-harder" - the same result.

Yury

* Re: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32
  2016-12-06  6:25     ` Yury Norov
  2016-12-06  6:30       ` Yury Norov
@ 2016-12-07 16:59       ` Catalin Marinas
  2016-12-07 20:40         ` Arnd Bergmann
  1 sibling, 1 reply; 64+ messages in thread
From: Catalin Marinas @ 2016-12-07 16:59 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-doc, szabolcs.nagy, heiko.carstens, cmetcalf,
	philipp.tomsich, joseph, linux-arch, zhouchengming1,
	Prasun.Kapoor, agraf, geert, kilobyte, manuel.montezelo, arnd,
	pinskia, linyongting, klimov.linux, broonie, bamvor.zhangjian,
	Bamvor Zhang Jian, linux-arm-kernel, maxim.kuvyrkov,
	Nathan_Lynch, linux-kernel, schwidefsky, davem,
	christoph.muellner

On Tue, Dec 06, 2016 at 11:55:08AM +0530, Yury Norov wrote:
> On Mon, Dec 05, 2016 at 04:34:23PM +0000, Catalin Marinas wrote:
> > On Fri, Oct 21, 2016 at 11:33:15PM +0300, Yury Norov wrote:
> > > New aarch32 ptrace syscall handler is introduced to avoid run-time
> > > detection of the task type.
> > 
> > What's wrong with the run-time detection? If it's just to avoid a
> > negligible overhead, I would rather keep the code simpler by avoiding
> > duplicating the generic compat_sys_ptrace().
> 
> Nothing wrong. This is how Arnd asked me to do. You already asked this
> question: http://lkml.iu.edu/hypermail/linux/kernel/1604.3/00930.html

Hmm, I completely forgot about this ;). There is still an advantage to
doing run-time checking if we avoid touching core code (less acks to
gather and less code duplication).

Let's see what Arnd says but the initial patch looked simpler.

-- 
Catalin

* Re: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32
  2016-12-07 16:59       ` Catalin Marinas
@ 2016-12-07 20:40         ` Arnd Bergmann
  2016-12-08 13:12           ` Catalin Marinas
  2017-01-05 20:40           ` Yury Norov
  0 siblings, 2 replies; 64+ messages in thread
From: Arnd Bergmann @ 2016-12-07 20:40 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Yury Norov, linux-doc, szabolcs.nagy, heiko.carstens, cmetcalf,
	philipp.tomsich, joseph, linux-arch, zhouchengming1,
	Prasun.Kapoor, agraf, geert, kilobyte, manuel.montezelo, pinskia,
	linyongting, klimov.linux, broonie, bamvor.zhangjian,
	Bamvor Zhang Jian, linux-arm-kernel, maxim.kuvyrkov,
	Nathan_Lynch, linux-kernel, schwidefsky, davem,
	christoph.muellner

On Wednesday, December 7, 2016 4:59:13 PM CET Catalin Marinas wrote:
> On Tue, Dec 06, 2016 at 11:55:08AM +0530, Yury Norov wrote:
> > On Mon, Dec 05, 2016 at 04:34:23PM +0000, Catalin Marinas wrote:
> > > On Fri, Oct 21, 2016 at 11:33:15PM +0300, Yury Norov wrote:
> > > > New aarch32 ptrace syscall handler is introduced to avoid run-time
> > > > detection of the task type.
> > > 
> > > What's wrong with the run-time detection? If it's just to avoid a
> > > negligible overhead, I would rather keep the code simpler by avoiding
> > > duplicating the generic compat_sys_ptrace().
> > 
> > Nothing wrong. This is how Arnd asked me to do. You already asked this
> > question: http://lkml.iu.edu/hypermail/linux/kernel/1604.3/00930.html
> 
> Hmm, I completely forgot about this ;). There is still an advantage to
> doing run-time checking if we avoid touching core code (less acks to
> gather and less code duplication).
> 
> Let's see what Arnd says but the initial patch looked simpler.

I don't currently have either version of the patch in my inbox
(the archive is on a different machine), but in general I'd still
think it's best to avoid the runtime check for aarch64-ilp32
altogether. I'd have to look at the overall kernel source to
see if it's worth avoiding one or two instances though, or
if there are an overwhelming number of other checks that we
can't avoid at all.

Regarding ptrace, I notice that arch/tile doesn't even use
the compat entry point for its ilp32 user space on 64-bit
kernels, it just calls the regular 64-bit one. Would that
help here?

	Arnd

* Re: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32
  2016-12-07 20:40         ` Arnd Bergmann
@ 2016-12-08 13:12           ` Catalin Marinas
  2017-01-05 20:40           ` Yury Norov
  1 sibling, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2016-12-08 13:12 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-doc, szabolcs.nagy, heiko.carstens, cmetcalf, Yury Norov,
	philipp.tomsich, joseph, linux-arch, zhouchengming1,
	Prasun.Kapoor, agraf, geert, kilobyte, manuel.montezelo, pinskia,
	linyongting, klimov.linux, broonie, bamvor.zhangjian,
	Bamvor Zhang Jian, linux-arm-kernel, maxim.kuvyrkov,
	Nathan_Lynch, linux-kernel, schwidefsky, davem,
	christoph.muellner

On Wed, Dec 07, 2016 at 09:40:13PM +0100, Arnd Bergmann wrote:
> On Wednesday, December 7, 2016 4:59:13 PM CET Catalin Marinas wrote:
> > On Tue, Dec 06, 2016 at 11:55:08AM +0530, Yury Norov wrote:
> > > On Mon, Dec 05, 2016 at 04:34:23PM +0000, Catalin Marinas wrote:
> > > > On Fri, Oct 21, 2016 at 11:33:15PM +0300, Yury Norov wrote:
> > > > > New aarch32 ptrace syscall handler is introduced to avoid run-time
> > > > > detection of the task type.
> > > > 
> > > > What's wrong with the run-time detection? If it's just to avoid a
> > > > negligible overhead, I would rather keep the code simpler by avoiding
> > > > duplicating the generic compat_sys_ptrace().
> > > 
> > > Nothing wrong. This is how Arnd asked me to do. You already asked this
> > > question: http://lkml.iu.edu/hypermail/linux/kernel/1604.3/00930.html
> > 
> > Hmm, I completely forgot about this ;). There is still an advantage to
> > doing run-time checking if we avoid touching core code (less acks to
> > gather and less code duplication).
> > 
> > Let's see what Arnd says but the initial patch looked simpler.
> 
> I don't currently have either version of the patch in my inbox
> (the archive is on a different machine), but in general I'd still
> think it's best to avoid the runtime check for aarch64-ilp32
> altogether. I'd have to look at the overall kernel source to
> see if it's worth avoiding one or two instances though, or
> if there are an overwhelming number of other checks that we
> can't avoid at all.

Just in case you haven't found them already, current version:

https://marc.info/?l=linux-arm-kernel&m=147708276818318&w=2

Original version:

https://patchwork.kernel.org/patch/7980521/

The old one looks more readable and given that ptrace is not really a
fast path, I'm not too worried about run-time checks.

> Regarding ptrace, I notice that arch/tile doesn't even use
> the compat entry point for its ilp32 user space on 64-bit
> kernels, it just calls the regular 64-bit one. Would that
> help here?

I don't know whether it would work, we have incompatible siginfo_t on
AArch64/ILP32.
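
To illustrate the kind of mismatch meant here, a minimal user-space sketch,
assuming the incompatibility comes from pointer-sized fields. The struct and
function names are hypothetical stand-ins, not the kernel's siginfo_t or any
kernel helper:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical siginfo-like record as a 64-bit (LP64) task sees it. */
struct demo_siginfo_lp64 {
	int32_t  signo;
	int32_t  code;
	uint64_t addr;		/* user pointer, 64 bits wide */
};

/* The same record as an ILP32 task sees it: pointers are 32 bits wide. */
struct demo_siginfo_ilp32 {
	int32_t  signo;
	int32_t  code;
	uint32_t addr;		/* user pointer, 32 bits wide */
};

/*
 * Convert field by field, as a compat layer has to do, because the two
 * layouts cannot simply be copied over each other.
 * Returns 0 on success, -1 if the address does not fit in 32 bits.
 */
static int demo_to_ilp32(struct demo_siginfo_ilp32 *to,
			 const struct demo_siginfo_lp64 *from)
{
	assert(to && from);
	if (from->addr > UINT32_MAX)
		return -1;
	to->signo = from->signo;
	to->code = from->code;
	to->addr = (uint32_t)from->addr;
	return 0;
}
```

Because the two structs differ in size, an entry point that expects one layout
cannot be handed the other without a conversion step like demo_to_ilp32().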

-- 
Catalin

* Re: ILP32 for ARM64 - testing with lmbench
  2016-12-05 14:13           ` Catalin Marinas
@ 2016-12-11 12:08             ` Yury Norov
  0 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-12-11 12:08 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Zhangjian (Bamvor),
	Maxim Kuvyrkov, linux-doc, Szabolcs Nagy, heiko.carstens,
	cmetcalf, Dr. Philipp Tomsich, matt.spencer, Joseph S. Myers,
	linux-arch, zhouchengming1, sellcey, Prasun Kapoor, agraf,
	Geert Uytterhoeven, Ding Tianhong, kilobyte, manuel.montezelo,
	arnd, Andrew Pinski, linyongting, klimov.linux, broonie,
	linux-arm-kernel, GNU C Library, Nathan_Lynch, linux-kernel,
	hanjun.guo, schwidefsky, David Miller, christoph.muellner

On Mon, Dec 05, 2016 at 02:13:12PM +0000, Catalin Marinas wrote:
> On Mon, Dec 05, 2016 at 06:16:09PM +0800, Zhangjian (Bamvor) wrote:
> > Do you have suggestion of next move of upstreaming ILP32?
> 
> I mentioned the steps a few times before. I'm pasting them again here:
> 
> 1. Complete the review of the Linux patches and ABI (no merge yet)
> 2. Review the corresponding glibc patches (no merge yet)
> 3. Ask (Linaro, Cavium) for toolchain + filesystem (pre-built and more
>    than just busybox) to be able to reproduce the testing in ARM
> 4. More testing (LTP, trinity, performance regressions etc.)
> 5. Move the ILP32 PCS out of beta (based on the results from 4)
> 6. Check the market again to see if anyone still needs ILP32
> 7. Based on 6, decide whether to merge the kernel and glibc patches
> 
> What's not explicitly mentioned in step 4 is glibc testing. Point 5 is
> ARM's responsibility (toolchain folk).
> 
> > There are already test results for lmbench and specint. Do you think they
> > are OK, or do you need more data to prove there is no regression?
> 
> I would need to reproduce the tests myself, see step 3.

Hi Catalin,

> 3. Ask (Linaro, Cavium) for toolchain + filesystem (pre-built and more
>    than just busybox) to be able to reproduce the testing in ARM

This is Andrew's toolchain, which I use to build the kernel, glibc, binutils, etc.:
https://drive.google.com/open?id=0B93nHerV55yNVlVKaXpOOHQtbW8
It's not the latest build, but it works well for me.

This archive contains a 4.9-rc8 kernel, an initrd, a sys-root, and a qemu image
based on ilp32 busybox:
https://drive.google.com/open?id=0B93nHerV55yNbVo0bko0bWlQeFE

I can start Linux on qemu and run basic commands and tests in ilp32
mode. This is my first attempt at creating a rootfs, and it is very basic:
busybox + sys-root. But it lets me start lp64 and ilp32 apps (you will find
an example there). If you need something more, let me know and I'll add
it. You can also use any professional distro with this ilp32-enabled
kernel; just copy the sys-root there (this is what I actually do: I run
Ubuntu 14 daily).

BTW, it is of course a good idea to build and test an ilp32 user
environment, but in real life I think ilp32 apps will run in an lp64
userspace.

> 4. More testing (LTP, trinity, performance regressions etc.)

I also built and ran trinity. After ~24 hours I found all trinity
threads stalled for lp64, and after another 24 hours I found it
running but slower for ilp32. The kernel was alive in both cases.

Yury.

* Re: [PATCH 09/18] arm64: introduce binfmt_elf32.c
  2016-12-05 15:10   ` Catalin Marinas
@ 2016-12-14  9:39     ` Yury Norov
  0 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2016-12-14  9:39 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: arnd, linux-arm-kernel, linux-kernel, linux-doc, linux-arch,
	szabolcs.nagy, heiko.carstens, cmetcalf, philipp.tomsich, joseph,
	zhouchengming1, Prasun.Kapoor, agraf, geert, kilobyte,
	manuel.montezelo, pinskia, linyongting, klimov.linux, broonie,
	bamvor.zhangjian, maxim.kuvyrkov, Nathan_Lynch, schwidefsky,
	davem, christoph.muellner

On Mon, Dec 05, 2016 at 03:10:19PM +0000, Catalin Marinas wrote:
> On Fri, Oct 21, 2016 at 11:33:08PM +0300, Yury Norov wrote:
> > As we support more than one compat formats, it looks more reasonable
> > to not use fs/compat_binfmt.c. Custom binfmt_elf32.c allows to move aarch32
> > specific definitions there and make code more maintainable and readable.
> 
> Can you remind me why we need this patch (rather than using the default
> fs/compat_binfmt_elf.c which you include here anyway)?

https://patchwork.kernel.org/patch/8756121/

This is mostly to avoid runtime checks and hide some re-definitions
for aarch32 from ilp32, to avoid re-re-definition.
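
A rough sketch of the idea, assuming a user-space model: each binfmt file sets
its own macros and then pulls in the shared loader source, so the format check
is fixed at compile time instead of being tested at run time. Here the shared
source is modelled by a macro (DEMO_DEFINE_LOADER) rather than by a #include,
and all demo_* / DEMO_* names are hypothetical. The ILP32 check shown (AArch64
machine with a 32-bit ELF class) is an assumption, not taken from this message:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_EM_ARM		40		/* ELF e_machine: 32-bit ARM */
#define DEMO_EM_AARCH64		183		/* ELF e_machine: AArch64 */
#define DEMO_ELFCLASS32		1		/* ELF class: 32-bit objects */
#define DEMO_EF_ARM_EABI_MASK	0xff000000u	/* same value as in the patch */

/* Hypothetical, trimmed-down ELF header. */
struct demo_ehdr {
	uint8_t  elf_class;
	uint16_t machine;
	uint32_t flags;
};

/* Per-format acceptance checks, chosen at compile time. */
#define demo_aarch32_check_arch(x) \
	((x)->machine == DEMO_EM_ARM && ((x)->flags & DEMO_EF_ARM_EABI_MASK))
#define demo_ilp32_check_arch(x) \
	((x)->machine == DEMO_EM_AARCH64 && (x)->elf_class == DEMO_ELFCLASS32)

/* Stand-in for the shared loader source: one copy is generated per format. */
#define DEMO_DEFINE_LOADER(name, check_arch)			\
static int name(const struct demo_ehdr *x)			\
{								\
	assert(x);						\
	return check_arch(x) ? 0 : -1;	/* 0 means accepted */	\
}

DEMO_DEFINE_LOADER(demo_load_aarch32, demo_aarch32_check_arch)
DEMO_DEFINE_LOADER(demo_load_ilp32, demo_ilp32_check_arch)
```

Neither generated loader contains a branch on the task type; each one only
knows its own format, which is the property the separate files are after.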

> 
> > --- /dev/null
> > +++ b/arch/arm64/kernel/binfmt_elf32.c
> > @@ -0,0 +1,31 @@
> > +/*
> > + * Support for AArch32 Linux ELF binaries.
> > + */
> > +
> > +/* AArch32 EABI. */
> > +#define EF_ARM_EABI_MASK		0xff000000
> > +
> > +#define compat_start_thread		compat_start_thread
> > +#define COMPAT_SET_PERSONALITY(ex)		\
> > +do {						\
> > +	clear_thread_flag(TIF_32BIT_AARCH64);	\
> > +	set_thread_flag(TIF_32BIT);		\
> > +} while (0)
> 
> You introduce this here but it seems to still be present in asm/elf.h.

Hmm... Maybe the hunk that deletes it from asm/elf.h was dropped during
some rebase. Thank you for the catch. I'll check it again.

Yury

* Re: [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64
  2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
                   ` (20 preceding siblings ...)
  2016-11-30  5:02 ` [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
@ 2016-12-18  7:08 ` Yury Norov
  2017-01-06 14:47   ` Catalin Marinas
  21 siblings, 1 reply; 64+ messages in thread
From: Yury Norov @ 2016-12-18  7:08 UTC (permalink / raw)
  To: arnd, catalin.marinas
  Cc: schwidefsky, heiko.carstens, pinskia, broonie, joseph,
	linux-arm-kernel, linux-kernel, linux-doc, bamvor.zhangjian,
	szabolcs.nagy, klimov.linux, Nathan_Lynch, agraf, Prasun.Kapoor,
	kilobyte, geert, philipp.tomsich, manuel.montezelo, linyongting,
	maxim.kuvyrkov, davem, zhouchengming1, cmetcalf

On Fri, Oct 21, 2016 at 11:32:59PM +0300, Yury Norov wrote:
> This series enables aarch64 with ilp32 mode, and as supporting work,
> introduces ARCH_32BIT_OFF_T configuration option that is enabled for
> existing 32-bit architectures but disabled for new arches (so 64-bit
> off_t is used by new userspace).
> 
> This version is based on kernel v4.9-rc1.  It works with glibc-2.24,
> and tested with LTP.
 
Hi Arnd, Catalin

For the last few days I have been trying to rebase this series on current
master, and I see significant conflicts and regressions. In fact, every time
I rebase on the next rc1, I feel like I'm playing roulette.

This is not a significant problem now, because it's almost certain
that this series will not get into 4.10, for reasons not related to
kernel code, and I have time to deal with regressions. But in general,
I'd like to try my patches on top of other candidates for the next merge
window. I cannot read all emails on LKML, but I can easily detect
problems and join the discussion at an early stage if I see any.

This is probably a noob question, and there are well-known branches,
like Andrew Morton's. But at this stage it's very important to
have this series prepared for merge, so I'd prefer to ask about it.

Yury.

* Re: [PATCH 10/18] arm64: ilp32: introduce binfmt_ilp32.c
  2016-12-05 15:38   ` Catalin Marinas
@ 2016-12-21 18:56     ` Yury Norov
  2017-01-06 14:48       ` Catalin Marinas
  0 siblings, 1 reply; 64+ messages in thread
From: Yury Norov @ 2016-12-21 18:56 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: arnd, linux-arm-kernel, linux-kernel, linux-doc, linux-arch,
	szabolcs.nagy, heiko.carstens, cmetcalf, philipp.tomsich, joseph,
	zhouchengming1, Prasun.Kapoor, agraf, geert, kilobyte,
	manuel.montezelo, pinskia, linyongting, klimov.linux, broonie,
	bamvor.zhangjian, Bamvor Zhang Jian, maxim.kuvyrkov,
	Nathan_Lynch, schwidefsky, davem, christoph.muellner

On Mon, Dec 05, 2016 at 03:38:01PM +0000, Catalin Marinas wrote:
> On Fri, Oct 21, 2016 at 11:33:09PM +0300, Yury Norov wrote:
> > binfmt_ilp32.c is needed to handle ILP32 binaries
> > 
> > Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
> > Signed-off-by: Bamvor Zhang Jian <bamvor.zhangjian@linaro.org>
> > ---
> >  arch/arm64/include/asm/elf.h     |  6 +++
> >  arch/arm64/kernel/Makefile       |  1 +
> >  arch/arm64/kernel/binfmt_ilp32.c | 97 ++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 104 insertions(+)
> >  create mode 100644 arch/arm64/kernel/binfmt_ilp32.c
> > 
> > diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> > index f259fe8..be29dde 100644
> > --- a/arch/arm64/include/asm/elf.h
> > +++ b/arch/arm64/include/asm/elf.h
> > @@ -175,10 +175,16 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm,
> >  
> >  #define COMPAT_ELF_ET_DYN_BASE		(2 * TASK_SIZE_32 / 3)
> >  
> > +#ifndef USE_AARCH64_GREG
> >  /* AArch32 registers. */
> >  #define COMPAT_ELF_NGREG		18
> >  typedef unsigned int			compat_elf_greg_t;
> >  typedef compat_elf_greg_t		compat_elf_gregset_t[COMPAT_ELF_NGREG];
> > +#else /* AArch64 registers for AARCH64/ILP32 */
> > +#define COMPAT_ELF_NGREG	ELF_NGREG
> > +#define compat_elf_greg_t	elf_greg_t
> > +#define compat_elf_gregset_t	elf_gregset_t
> > +#endif
> 
> I think you only need compat_elf_gregset_t definition here and leave the
> other two undefined.

I checked everything here again, and found that almost all compat defines
can be moved to the corresponding binfmt files. If everything is OK, I'll
incorporate the following patch into the series.

Yury

--
diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index abb75f5..76f0a5c 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -176,30 +176,10 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm,
 
 #define COMPAT_ELF_ET_DYN_BASE		(2 * TASK_SIZE_32 / 3)
 
-#ifndef USE_AARCH64_GREG
 /* AArch32 registers. */
 #define COMPAT_ELF_NGREG		18
 typedef unsigned int			compat_elf_greg_t;
 typedef compat_elf_greg_t		compat_elf_gregset_t[COMPAT_ELF_NGREG];
-#else /* AArch64 registers for AARCH64/ILP32 */
-#define COMPAT_ELF_NGREG	ELF_NGREG
-#define compat_elf_greg_t	elf_greg_t
-#define compat_elf_gregset_t	elf_gregset_t
-#endif
-
-/* AArch32 EABI. */
-#define EF_ARM_EABI_MASK		0xff000000
-#define compat_elf_check_arch(x)	(system_supports_32bit_el0() && \
-					 ((x)->e_machine == EM_ARM) && \
-					 ((x)->e_flags & EF_ARM_EABI_MASK))
-
-#define compat_start_thread		compat_start_thread
-#define COMPAT_ARCH_DLINFO
-extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
-				      int uses_interp);
-#define compat_arch_setup_additional_pages \
-					aarch32_setup_vectors_page
-
 #endif /* CONFIG_COMPAT */
 
 #endif /* !__ASSEMBLY__ */
diff --git a/arch/arm64/kernel/binfmt_elf32.c b/arch/arm64/kernel/binfmt_elf32.c
index 99a4cf2..7c38a22 100644
--- a/arch/arm64/kernel/binfmt_elf32.c
+++ b/arch/arm64/kernel/binfmt_elf32.c
@@ -17,16 +17,16 @@
 #define COMPAT_ELF_HWCAP		(compat_elf_hwcap)
 #define COMPAT_ELF_HWCAP2		(compat_elf_hwcap2)
 
-#ifdef __AARCH64EB__
-#define COMPAT_ELF_PLATFORM		("v8b")
-#else
-#define COMPAT_ELF_PLATFORM		("v8l")
-#endif
-
 #define compat_arch_setup_additional_pages \
 					aarch32_setup_vectors_page
 struct linux_binprm;
 extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
 				      int uses_interp);
 
+/* AArch32 EABI. */
+#define compat_elf_check_arch(x)	(system_supports_32bit_el0() && \
+					 ((x)->e_machine == EM_ARM) && \
+					 ((x)->e_flags & EF_ARM_EABI_MASK))
+
+
 #include "../../../fs/compat_binfmt_elf.c"
diff --git a/arch/arm64/kernel/binfmt_ilp32.c b/arch/arm64/kernel/binfmt_ilp32.c
index dd62467..ec4a412 100644
--- a/arch/arm64/kernel/binfmt_ilp32.c
+++ b/arch/arm64/kernel/binfmt_ilp32.c
@@ -1,7 +1,9 @@
 /*
  * Support for ILP32 Linux/aarch64 ELF binaries.
  */
-#define USE_AARCH64_GREG
+
+#undef compat_elf_gregset_t
+#define compat_elf_gregset_t	elf_gregset_t
 
 #include <linux/elfcore-compat.h>
 #include <linux/time.h>
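
For readers following the last hunk, a minimal user-space sketch of what the
#undef/#define pair does, assuming the name is a typedef at that point (as it
is in asm/elf.h above). The demo_* names are hypothetical; 18 is the AArch32
register count from the patch, and 34 is an assumed AArch64 count used only
for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_COMPAT_NGREG	18	/* AArch32 count, as in the patch */
#define DEMO_NGREG		34	/* assumed AArch64 count, illustration only */

typedef uint32_t		demo_compat_greg_t;
typedef demo_compat_greg_t	demo_compat_gregset_t[DEMO_COMPAT_NGREG];
typedef uint64_t		demo_greg_t;
typedef demo_greg_t		demo_gregset_t[DEMO_NGREG];

/* Code compiled before the override sees the AArch32 typedef. */
static size_t demo_aarch32_regset_size(void)
{
	return sizeof(demo_compat_gregset_t);
}

/*
 * The #undef is a no-op here because a typedef is not a macro; the #define
 * then makes every later use of the name expand to the native type.
 */
#undef demo_compat_gregset_t
#define demo_compat_gregset_t	demo_gregset_t

/* Code compiled after the override sees the native register set. */
static size_t demo_ilp32_regset_size(void)
{
	assert(sizeof(demo_compat_gregset_t) == sizeof(demo_gregset_t));
	return sizeof(demo_compat_gregset_t);
}
```

The order matters: the override only affects code that is compiled after it,
which is why it has to come before the generic compat headers are included.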

* Re: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32
  2016-12-07 20:40         ` Arnd Bergmann
  2016-12-08 13:12           ` Catalin Marinas
@ 2017-01-05 20:40           ` Yury Norov
  2017-01-06 14:36             ` Catalin Marinas
  1 sibling, 1 reply; 64+ messages in thread
From: Yury Norov @ 2017-01-05 20:40 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Catalin Marinas, linux-doc, szabolcs.nagy, heiko.carstens,
	cmetcalf, philipp.tomsich, joseph, linux-arch, zhouchengming1,
	Prasun.Kapoor, agraf, geert, kilobyte, manuel.montezelo, pinskia,
	linyongting, klimov.linux, broonie, bamvor.zhangjian,
	Bamvor Zhang Jian, linux-arm-kernel, maxim.kuvyrkov,
	Nathan_Lynch, linux-kernel, schwidefsky, davem,
	christoph.muellner

On Wed, Dec 07, 2016 at 09:40:13PM +0100, Arnd Bergmann wrote:
> On Wednesday, December 7, 2016 4:59:13 PM CET Catalin Marinas wrote:
> > On Tue, Dec 06, 2016 at 11:55:08AM +0530, Yury Norov wrote:
> > > On Mon, Dec 05, 2016 at 04:34:23PM +0000, Catalin Marinas wrote:
> > > > On Fri, Oct 21, 2016 at 11:33:15PM +0300, Yury Norov wrote:
> > > > > New aarch32 ptrace syscall handler is introduced to avoid run-time
> > > > > detection of the task type.
> > > > 
> > > > What's wrong with the run-time detection? If it's just to avoid a
> > > > negligible overhead, I would rather keep the code simpler by avoiding
> > > > duplicating the generic compat_sys_ptrace().
> > > 
> > > Nothing wrong. This is how Arnd asked me to do. You already asked this
> > > question: http://lkml.iu.edu/hypermail/linux/kernel/1604.3/00930.html
> > 
> > Hmm, I completely forgot about this ;). There is still an advantage to
> > doing run-time checking if we avoid touching core code (less acks to
> > gather and less code duplication).
> > 
> > Let's see what Arnd says but the initial patch looked simpler.
> 
> I don't currently have either version of the patch in my inbox
> (the archive is on a different machine), but in general I'd still
> think it's best to avoid the runtime check for aarch64-ilp32
> altogether. I'd have to look at the overall kernel source to
> see if it's worth avoiding one or two instances though, or
> if there are an overwhelming number of other checks that we
> can't avoid at all.
> 
> Regarding ptrace, I notice that arch/tile doesn't even use
> the compat entry point for its ilp32 user space on 64-bit
> kernels, it just calls the regular 64-bit one. Would that
> help here?

ILP32 tasks have a unique context that is unlike either aarch64 or aarch32,
so we need a dedicated ptrace handler. I prepared a patch for
ptrace with runtime ABI detection, as Catalin suggested; see:
https://github.com/norov/linux/commit/1f66dc22a4450b192e83458f2c3cc0e79f53e670

If it's OK, I'd like to update the submission.
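
For reference, runtime ABI detection in this sense can be sketched in user
space roughly as below. This is not the code from the commit linked above:
the demo_* names are hypothetical, and using a TIF_32BIT_AARCH64-style flag
to mark ILP32 tasks is an assumption based on the flag names used elsewhere
in this series:

```c
#include <assert.h>
#include <string.h>

#define DEMO_TIF_32BIT		(1u << 0)	/* assumed: AArch32 task */
#define DEMO_TIF_32BIT_AARCH64	(1u << 1)	/* assumed: ILP32 task */

enum demo_abi { DEMO_ABI_LP64, DEMO_ABI_AARCH32, DEMO_ABI_ILP32 };

/* Hypothetical stand-in for the per-task flags word. */
struct demo_task {
	unsigned int flags;
};

/* Classify the task once, at run time, from its flags. */
static enum demo_abi demo_task_abi(const struct demo_task *t)
{
	assert(t);
	if (t->flags & DEMO_TIF_32BIT_AARCH64)
		return DEMO_ABI_ILP32;
	if (t->flags & DEMO_TIF_32BIT)
		return DEMO_ABI_AARCH32;
	return DEMO_ABI_LP64;
}

/* One shared entry point that picks the request handler per ABI. */
static const char *demo_ptrace_handler(const struct demo_task *t)
{
	switch (demo_task_abi(t)) {
	case DEMO_ABI_ILP32:
		return "ilp32";
	case DEMO_ABI_AARCH32:
		return "aarch32";
	default:
		return "native";
	}
}
```

The cost is one flag test per ptrace call, in exchange for keeping a single
entry point instead of duplicating it per ABI.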

Yury

* Re: [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32
  2017-01-05 20:40           ` Yury Norov
@ 2017-01-06 14:36             ` Catalin Marinas
  0 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2017-01-06 14:36 UTC (permalink / raw)
  To: Yury Norov
  Cc: Arnd Bergmann, linux-doc, szabolcs.nagy, heiko.carstens,
	cmetcalf, philipp.tomsich, joseph, linux-arch, zhouchengming1,
	Prasun.Kapoor, agraf, geert, kilobyte, manuel.montezelo, pinskia,
	linyongting, klimov.linux, broonie, bamvor.zhangjian,
	Bamvor Zhang Jian, linux-arm-kernel, maxim.kuvyrkov,
	Nathan_Lynch, linux-kernel, schwidefsky, davem,
	christoph.muellner

On Fri, Jan 06, 2017 at 02:10:03AM +0530, Yury Norov wrote:
> On Wed, Dec 07, 2016 at 09:40:13PM +0100, Arnd Bergmann wrote:
> > On Wednesday, December 7, 2016 4:59:13 PM CET Catalin Marinas wrote:
> > > On Tue, Dec 06, 2016 at 11:55:08AM +0530, Yury Norov wrote:
> > > > On Mon, Dec 05, 2016 at 04:34:23PM +0000, Catalin Marinas wrote:
> > > > > On Fri, Oct 21, 2016 at 11:33:15PM +0300, Yury Norov wrote:
> > > > > > New aarch32 ptrace syscall handler is introduced to avoid run-time
> > > > > > detection of the task type.
> > > > > 
> > > > > What's wrong with the run-time detection? If it's just to avoid a
> > > > > negligible overhead, I would rather keep the code simpler by avoiding
> > > > > duplicating the generic compat_sys_ptrace().
> > > > 
> > > > Nothing wrong. This is how Arnd asked me to do. You already asked this
> > > > question: http://lkml.iu.edu/hypermail/linux/kernel/1604.3/00930.html
> > > 
> > > Hmm, I completely forgot about this ;). There is still an advantage to
> > > doing run-time checking if we avoid touching core code (less acks to
> > > gather and less code duplication).
> > > 
> > > Let's see what Arnd says but the initial patch looked simpler.
> > 
> > I don't currently have either version of the patch in my inbox
> > (the archive is on a different machine), but in general I'd still
> > think it's best to avoid the runtime check for aarch64-ilp32
> > altogether. I'd have to look at the overall kernel source to
> > see if it's worth avoiding one or two instances though, or
> > if there are an overwhelming number of other checks that we
> > can't avoid at all.
> > 
> > Regarding ptrace, I notice that arch/tile doesn't even use
> > the compat entry point for its ilp32 user space on 64-bit
> > kernels, it just calls the regular 64-bit one. Would that
> > help here?
> 
> ILP32 tasks has unique context that is not like aarch64 or aarch32,
> so we have to have unique ptrace handler. I prepared the patch for
> ptrace with runtime ABI detection, as Catalin said, see there:
> https://github.com/norov/linux/commit/1f66dc22a4450b192e83458f2c3cc0e79f53e670
> 
> If it's OK, I'd like to update submission.

This looks better to me (and even better if you no longer need to touch
the generic ptrace code).

-- 
Catalin

* Re: [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64
  2016-12-18  7:08 ` Yury Norov
@ 2017-01-06 14:47   ` Catalin Marinas
  2017-01-09  8:30     ` Yury Norov
  0 siblings, 1 reply; 64+ messages in thread
From: Catalin Marinas @ 2017-01-06 14:47 UTC (permalink / raw)
  To: Yury Norov
  Cc: arnd, linux-doc, szabolcs.nagy, heiko.carstens, cmetcalf,
	philipp.tomsich, joseph, zhouchengming1, Prasun.Kapoor, agraf,
	geert, kilobyte, manuel.montezelo, pinskia, linyongting,
	klimov.linux, broonie, bamvor.zhangjian, linux-arm-kernel,
	maxim.kuvyrkov, Nathan_Lynch, linux-kernel, schwidefsky, davem

On Sun, Dec 18, 2016 at 12:38:23PM +0530, Yury Norov wrote:
> On Fri, Oct 21, 2016 at 11:32:59PM +0300, Yury Norov wrote:
> > This series enables aarch64 with ilp32 mode, and as supporting work,
> > introduces ARCH_32BIT_OFF_T configuration option that is enabled for
> > existing 32-bit architectures but disabled for new arches (so 64-bit
> > off_t is used by new userspace).
> > 
> > This version is based on kernel v4.9-rc1.  It works with glibc-2.24,
> > and tested with LTP.
>  
> Hi Arnd, Catalin
> 
> For last few days I'm trying to rebase this series on current master,
> and I see significant conflicts and regressions. In fact, every time
> I rebase on next rc1, I feel like I play a roulette.
> 
> This is not a significant problem now because it's almost for sure
> that this series will not get into 4.10, for reasons not related to
> kernel code. And I have time to deal with regressions. But in general,
> I'd like to try my patches on top of other candidates for next merge
> window. I cannot read all emails in LKML, but I can easily detect
> problems and join to the discussion at early stage if I see any problem.
> 
> This is probably a noob question, and there are well-known branches,
> like Andrew Morton's one. But at this stage it's very important to
> have this series prepared for merge, and I'd prefer to ask about it.

I'm not entirely sure what the question is. For development, you could
base your series on a final release, e.g. 4.9. For reviews and
especially if you are targeting a certain merging window, it's useful to
rebase your patches on a fairly recent -rc, e.g. 4.10-rc3. I would
entirely skip any non-tagged kernel states (like middle of the merging
window) or out of tree branches. There may be a case to rebase on some
other developer's branch but only if there is a dependency that can't be
avoided and usually with prior agreement from both the respective
developer (as not to rebase the branch) and the involved maintainers.

-- 
Catalin

* Re: [PATCH 10/18] arm64: ilp32: introduce binfmt_ilp32.c
  2016-12-21 18:56     ` Yury Norov
@ 2017-01-06 14:48       ` Catalin Marinas
  0 siblings, 0 replies; 64+ messages in thread
From: Catalin Marinas @ 2017-01-06 14:48 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-doc, szabolcs.nagy, heiko.carstens, cmetcalf,
	philipp.tomsich, joseph, linux-arch, zhouchengming1,
	Prasun.Kapoor, agraf, geert, kilobyte, manuel.montezelo, arnd,
	pinskia, linyongting, klimov.linux, broonie, bamvor.zhangjian,
	Bamvor Zhang Jian, linux-arm-kernel, maxim.kuvyrkov,
	Nathan_Lynch, linux-kernel, schwidefsky, davem,
	christoph.muellner

On Thu, Dec 22, 2016 at 12:26:40AM +0530, Yury Norov wrote:
> On Mon, Dec 05, 2016 at 03:38:01PM +0000, Catalin Marinas wrote:
> > On Fri, Oct 21, 2016 at 11:33:09PM +0300, Yury Norov wrote:
> > > binfmt_ilp32.c is needed to handle ILP32 binaries
> > > 
> > > Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
> > > Signed-off-by: Bamvor Zhang Jian <bamvor.zhangjian@linaro.org>
> > > ---
> > >  arch/arm64/include/asm/elf.h     |  6 +++
> > >  arch/arm64/kernel/Makefile       |  1 +
> > >  arch/arm64/kernel/binfmt_ilp32.c | 97 ++++++++++++++++++++++++++++++++++++++++
> > >  3 files changed, 104 insertions(+)
> > >  create mode 100644 arch/arm64/kernel/binfmt_ilp32.c
> > > 
> > > diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> > > index f259fe8..be29dde 100644
> > > --- a/arch/arm64/include/asm/elf.h
> > > +++ b/arch/arm64/include/asm/elf.h
> > > @@ -175,10 +175,16 @@ extern int arch_setup_additional_pages(struct linux_binprm *bprm,
> > >  
> > >  #define COMPAT_ELF_ET_DYN_BASE		(2 * TASK_SIZE_32 / 3)
> > >  
> > > +#ifndef USE_AARCH64_GREG
> > >  /* AArch32 registers. */
> > >  #define COMPAT_ELF_NGREG		18
> > >  typedef unsigned int			compat_elf_greg_t;
> > >  typedef compat_elf_greg_t		compat_elf_gregset_t[COMPAT_ELF_NGREG];
> > > +#else /* AArch64 registers for AARCH64/ILP32 */
> > > +#define COMPAT_ELF_NGREG	ELF_NGREG
> > > +#define compat_elf_greg_t	elf_greg_t
> > > +#define compat_elf_gregset_t	elf_gregset_t
> > > +#endif
> > 
> > I think you only need compat_elf_gregset_t definition here and leave the
> > other two undefined.
> 
> I checked everything here again, and found that almost all compat defines
> may be moved to corresponding binfmt files. If everything is OK, I'll
> incorporate next patch to the series

It seems fine at a quick look but I'll have to see the final patch.

-- 
Catalin

* Re: [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64
  2017-01-06 14:47   ` Catalin Marinas
@ 2017-01-09  8:30     ` Yury Norov
  0 siblings, 0 replies; 64+ messages in thread
From: Yury Norov @ 2017-01-09  8:30 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: arnd, linux-doc, szabolcs.nagy, heiko.carstens, cmetcalf,
	philipp.tomsich, joseph, zhouchengming1, Prasun.Kapoor, agraf,
	geert, kilobyte, manuel.montezelo, pinskia, linyongting,
	klimov.linux, broonie, bamvor.zhangjian, linux-arm-kernel,
	maxim.kuvyrkov, Nathan_Lynch, linux-kernel, schwidefsky, davem

On Fri, Jan 06, 2017 at 02:47:04PM +0000, Catalin Marinas wrote:
> On Sun, Dec 18, 2016 at 12:38:23PM +0530, Yury Norov wrote:
> > On Fri, Oct 21, 2016 at 11:32:59PM +0300, Yury Norov wrote:
> > > This series enables aarch64 with ilp32 mode, and as supporting work,
> > > introduces ARCH_32BIT_OFF_T configuration option that is enabled for
> > > existing 32-bit architectures but disabled for new arches (so 64-bit
> > > off_t is used by new userspace).
> > > 
> > > This version is based on kernel v4.9-rc1.  It works with glibc-2.24,
> > > and tested with LTP.
> >  
> > Hi Arnd, Catalin
> > 
> > For last few days I'm trying to rebase this series on current master,
> > and I see significant conflicts and regressions. In fact, every time
> > I rebase on next rc1, I feel like I play a roulette.
> > 
> > This is not a significant problem now because it is almost certain
> > that this series will not get into 4.10, for reasons not related to
> > kernel code, and I have time to deal with regressions. But in general,
> > I'd like to try my patches on top of other candidates for the next merge
> > window. I cannot read all emails on LKML, but I can easily detect
> > problems and join the discussion at an early stage if I see any.
> > 
> > This is probably a noob question, and there are well-known branches,
> > like Andrew Morton's one. But at this stage it's very important to
> > have this series prepared for merge, and I'd prefer to ask about it.
> 
> I'm not entirely sure what the question is. For development, you could
> base your series on a final release, e.g. 4.9. For reviews and
> especially if you are targeting a certain merging window, it's useful to
> rebase your patches on a fairly recent -rc, e.g. 4.10-rc3. I would
> entirely skip any non-tagged kernel states (like middle of the merging
> window) or out of tree branches. There may be a case to rebase on some
> other developer's branch but only if there is a dependency that can't be
> avoided and usually with prior agreement from both the respective
> developer (so as not to rebase the branch) and the involved maintainers.

Hi Catalin, 4.10-rcX is good enough, but I also need to be sure that
when the merge window opens I will not find my series broken by
conflicts, because the merge window is only 2 weeks long and there is
not much time to investigate and fix all bugs properly.

Anyway, linux-next is what I need, as Chris mentioned.

Yury

end of thread, other threads:[~2017-01-09  8:30 UTC | newest]

Thread overview: 64+ messages
2016-10-21 20:32 [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
2016-10-21 20:33 ` [PATCH 01/18] 32-bit ABI: introduce ARCH_32BIT_OFF_T config option Yury Norov
2016-10-24 16:30   ` Chris Metcalf
2016-10-24 22:22     ` Arnd Bergmann
2016-10-27  9:29       ` Yury Norov
2016-10-21 20:33 ` [PATCH 02/18] arm64: ilp32: add documentation on the ILP32 ABI for ARM64 Yury Norov
2016-10-24 16:36   ` Chris Metcalf
2016-10-27  9:40     ` Yury Norov
2016-10-21 20:33 ` [PATCH 03/18] arm64: rename COMPAT to AARCH32_EL0 in Kconfig Yury Norov
2016-10-21 20:33 ` [PATCH 04/18] arm64: ensure the kernel is compiled for LP64 Yury Norov
2016-10-21 20:33 ` [PATCH 05/18] arm64:uapi: set __BITS_PER_LONG correctly for ILP32 and LP64 Yury Norov
2016-10-21 20:33 ` [PATCH 06/18] thread: move thread bits accessors to separated file Yury Norov
2016-10-21 20:33 ` [PATCH 07/18] arm64: introduce is_a32_task and is_a32_thread (for AArch32 compat) Yury Norov
2016-10-21 20:33 ` [PATCH 08/18] arm64: ilp32: add is_ilp32_compat_{task,thread} and TIF_32BIT_AARCH64 Yury Norov
2016-10-21 20:33 ` [PATCH 09/18] arm64: introduce binfmt_elf32.c Yury Norov
2016-12-05 15:10   ` Catalin Marinas
2016-12-14  9:39     ` Yury Norov
2016-10-21 20:33 ` [PATCH 10/18] arm64: ilp32: introduce binfmt_ilp32.c Yury Norov
2016-12-05 15:38   ` Catalin Marinas
2016-12-21 18:56     ` Yury Norov
2017-01-06 14:48       ` Catalin Marinas
2016-10-21 20:33 ` [PATCH 11/18] arm64: ilp32: share aarch32 syscall handlers Yury Norov
2016-12-05 17:12   ` Catalin Marinas
2016-12-06  7:32     ` Yury Norov
2016-10-21 20:33 ` [PATCH 12/18] arm64: ilp32: add sys_ilp32.c and a separate table (in entry.S) to use it Yury Norov
2016-10-21 20:33 ` [PATCH 13/18] arm64: signal: share lp64 signal routines to ilp32 Yury Norov
2016-10-21 20:33 ` [PATCH 14/18] arm64: signal32: move ilp32 and aarch32 common code to separated file Yury Norov
2016-12-05 16:18   ` Catalin Marinas
2016-12-06  9:36     ` Yury Norov
2016-10-21 20:33 ` [PATCH 15/18] arm64: ilp32: introduce ilp32-specific handlers for sigframe and ucontext Yury Norov
2016-10-21 20:33 ` [PATCH 16/18] arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32 Yury Norov
2016-12-05 16:34   ` Catalin Marinas
2016-12-06  6:25     ` Yury Norov
2016-12-06  6:30       ` Yury Norov
2016-12-07 16:59       ` Catalin Marinas
2016-12-07 20:40         ` Arnd Bergmann
2016-12-08 13:12           ` Catalin Marinas
2017-01-05 20:40           ` Yury Norov
2017-01-06 14:36             ` Catalin Marinas
2016-10-21 20:33 ` [PATCH 17/18] arm64:ilp32: add vdso-ilp32 and use for signal return Yury Norov
2016-10-21 20:33 ` [PATCH 18/18] arm64:ilp32: add ARM64_ILP32 to Kconfig Yury Norov
2016-10-28 12:46 ` ILP32 for ARM64 - testing with lmbench Yury Norov
2016-11-17  3:28   ` Zhangjian (Bamvor)
2016-11-17  5:02     ` Maxim Kuvyrkov
2016-11-17  7:48       ` Zhangjian (Bamvor)
2016-12-05 10:16         ` Zhangjian (Bamvor)
2016-12-05 14:13           ` Catalin Marinas
2016-12-11 12:08             ` Yury Norov
2016-11-07  8:23 ` ILP32 for ARM64: testing with glibc testsuite Yury Norov
2016-11-09  9:56   ` Yury Norov
2016-11-16 11:22     ` Maxim Kuvyrkov
2016-11-17 15:50       ` Catalin Marinas
2016-11-17 21:45       ` Steve Ellcey
2016-12-05  9:58         ` Zhangjian (Bamvor)
2016-12-05 10:07           ` Andreas Schwab
2016-12-05 10:24             ` Zhangjian (Bamvor)
2016-12-06  5:29               ` Yury Norov
2016-12-05 19:33             ` Steve Ellcey
2016-12-06  8:31               ` Andreas Schwab
2016-11-30  5:02 ` [RFC3 nowrap: PATCH v7 00/18] ILP32 for ARM64 Yury Norov
2016-11-30  6:52   ` Adam Borowski
2016-12-18  7:08 ` Yury Norov
2017-01-06 14:47   ` Catalin Marinas
2017-01-09  8:30     ` Yury Norov
