* [PATCH 00/22] arm64: Memory Tagging Extension user-space support
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

Hi,

This series proposes the initial user-space support for the ARMv8.5
Memory Tagging Extension [1]. The patches are also available on this
branch:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux devel/mte

Short description extracted from the MTE whitepaper [2]:

  MTE aims to increase the memory safety of code written in unsafe
  languages without requiring source changes, and in some cases, without
  requiring recompilation. The Arm Memory Tagging Extension implements
  lock and key access to memory. Locks can be set on memory and keys
  provided during memory access. If the key matches the lock, the access
  is permitted. If it does not match, an error is reported. Memory
  locations are tagged by adding four bits of metadata to each 16 bytes
  of physical memory. This is the Tag Granule. Tagging memory implements
  the lock. Pointers, and therefore virtual addresses, are modified to
  contain the key. In order to implement the key bits without requiring
  larger pointers MTE uses the Top Byte Ignore (TBI) feature of the
  ARMv8-A Architecture. When TBI is enabled, the top byte of a virtual
  address is ignored when using it as an input for address translation.
  This allows the top byte to store metadata.
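
As a purely illustrative sketch (not from the whitepaper or this
series): the key lives in bits 59:56 of the pointer, i.e. within the
TBI top byte. Real code would use the MTE instructions (e.g. IRG) or a
C library allocator rather than setting it by hand:

  #include <stdint.h>

  /* illustrative only: insert a 4-bit key into the logical tag bits */
  static inline void *set_logical_tag(void *ptr, uint8_t tag)
  {
          uintptr_t p = (uintptr_t)ptr;

          p &= ~((uintptr_t)0xf << 56);        /* clear the old tag */
          p |= (uintptr_t)(tag & 0xf) << 56;   /* insert the new key */
          return (void *)p;
  }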

The rough outline of this series, apart from some clean-up patches:

1. Enable detection of the MTE feature by the kernel.

2. Switch the linear map to use the Normal-Tagged memory attribute so
   that the kernel can read/write the tags in memory (a.k.a. allocation
   tags).

3. Handle tags in {clear,copy}_page() and memcmp_pages().

4. User tag fault exception handling and SIGSEGV injection.

5. PROT_MTE support to enable tag checks/accesses in user-space,
   together with new arch_calc_vm_flag_bits() and arch_validate_flags()
   hooks.

6. User control of the tag check fault mode and tag exclusion via
   prctl(), built on top of the existing PR_{SET,GET}_TAGGED_ADDR_CTRL
   calls (see the sketch after this list).

7. Documentation of the user ABI with a C example (though such MTE
   enabling and allocation tagging are expected to live in a C library).
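
A condensed sketch of the flow in points 5 and 6 above. PROT_MTE and
PR_MTE_TCF_SYNC are introduced by later patches in this series, so the
values below are illustrative; check the final uapi headers
(PR_SET_TAGGED_ADDR_CTRL and PR_TAGGED_ADDR_ENABLE are already in
mainline):

  #include <sys/mman.h>
  #include <sys/prctl.h>

  #ifndef PROT_MTE
  #define PROT_MTE		0x20
  #endif
  #ifndef PR_MTE_TCF_SYNC
  #define PR_MTE_TCF_SYNC	(1UL << 1)
  #endif

  int main(void)
  {
          /* point 6: tagged addresses with synchronous tag check faults */
          if (prctl(PR_SET_TAGGED_ADDR_CTRL,
                    PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC, 0, 0, 0))
                  return 1;

          /* point 5: map memory with tag checking enabled */
          char *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_MTE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
          return buf == MAP_FAILED;
  }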

For libc people interested in MTE, I suggest reading the last patch with
the ABI documentation.

Missing bits before upstreaming:

- Swap support. Currently ARM64_MTE (default n) selects ARCH_NO_SWAP.
  The hooks added for the similar SPARC ADI feature are not sufficient
  for race-free saving and restoring of the MTE metadata in swapped
  pages. A separate patch series will be posted once this is
  implemented.

- Related to the above is suspend to disk.

- ptrace() support to be able to access the tags in the memory of a
  different process, something like {PEEK,POKE}_TAG.

- User coredumps currently do not contain the tags.

- kselftests (work in progress)

- Clarify whether mmap(tagged_addr, PROT_MTE) pre-tags the memory with
  the tag given in the tagged_addr hint. Strong justification is
  required for this as it would force arm64 to disable the zero page.

- Clarify with the hardware architects whether CPUID checking is
  sufficient or additional description via FDT or ACPI is required.

[1] https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/enhancing-memory-safety
[2] https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/Arm_Memory_Tagging_Extension_Whitepaper.pdf

Catalin Marinas (13):
  kbuild: Add support for 'as-instr' to be used in Kconfig files
  arm64: alternative: Allow alternative_insn to always issue the first
    instruction
  arm64: Use macros instead of hard-coded constants for MAIR_EL1
  arm64: mte: Use Normal Tagged attributes for the linear map
  arm64: mte: Assembler macros and default architecture for .S files
  arm64: Tags-aware memcmp_pages() implementation
  mm: Introduce arch_calc_vm_flag_bits()
  arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  mm: Introduce arch_validate_flags()
  arm64: mte: Validate the PROT_MTE request via arch_validate_flags()
  mm: Allow arm64 mmap(PROT_MTE) on RAM-based files
  arm64: mte: Allow user control of the tag check mode via prctl()
  arm64: mte: Allow user control of the excluded tags via prctl()

Dave Martin (1):
  mm: Reserve asm-generic prot flags 0x10 and 0x20 for arch use

Vincenzo Frascino (8):
  arm64: mte: system register definitions
  arm64: mte: CPU feature detection and initial sysreg configuration
  arm64: mte: Tags-aware clear_page() implementation
  arm64: mte: Tags-aware copy_page() implementation
  arm64: mte: Add specific SIGSEGV codes
  arm64: mte: Handle synchronous and asynchronous tag check faults
  arm64: mte: Kconfig entry
  arm64: mte: Add Memory Tagging Extension documentation

 Documentation/arm64/cpu-feature-registers.rst |   4 +
 Documentation/arm64/elf_hwcaps.rst            |   4 +
 Documentation/arm64/index.rst                 |   1 +
 .../arm64/memory-tagging-extension.rst        | 229 ++++++++++++++++++
 arch/arm64/Kconfig                            |  32 +++
 arch/arm64/include/asm/alternative.h          |   8 +-
 arch/arm64/include/asm/assembler.h            |  16 ++
 arch/arm64/include/asm/cpucaps.h              |   5 +-
 arch/arm64/include/asm/cpufeature.h           |   6 +
 arch/arm64/include/asm/hwcap.h                |   1 +
 arch/arm64/include/asm/kvm_arm.h              |   3 +-
 arch/arm64/include/asm/memory.h               |  17 +-
 arch/arm64/include/asm/mman.h                 |  78 ++++++
 arch/arm64/include/asm/mte.h                  |  11 +
 arch/arm64/include/asm/page.h                 |   4 +-
 arch/arm64/include/asm/pgtable-prot.h         |   2 +
 arch/arm64/include/asm/pgtable.h              |   7 +-
 arch/arm64/include/asm/processor.h            |   4 +
 arch/arm64/include/asm/sysreg.h               |  70 ++++++
 arch/arm64/include/asm/thread_info.h          |   4 +-
 arch/arm64/include/uapi/asm/hwcap.h           |   2 +
 arch/arm64/include/uapi/asm/mman.h            |  14 ++
 arch/arm64/include/uapi/asm/ptrace.h          |   1 +
 arch/arm64/kernel/cpufeature.c                |  59 +++++
 arch/arm64/kernel/cpuinfo.c                   |   2 +
 arch/arm64/kernel/entry.S                     |  17 ++
 arch/arm64/kernel/process.c                   | 141 ++++++++++-
 arch/arm64/kernel/ptrace.c                    |   2 +-
 arch/arm64/kernel/signal.c                    |   8 +
 arch/arm64/lib/Makefile                       |   2 +
 arch/arm64/lib/clear_page.S                   |   7 +-
 arch/arm64/lib/copy_page.S                    |  23 ++
 arch/arm64/lib/mte.S                          |  46 ++++
 arch/arm64/mm/Makefile                        |   1 +
 arch/arm64/mm/cmppages.c                      |  26 ++
 arch/arm64/mm/dump.c                          |   4 +
 arch/arm64/mm/fault.c                         |   9 +-
 arch/arm64/mm/mmu.c                           |  22 +-
 arch/arm64/mm/proc.S                          |  31 ++-
 fs/proc/task_mmu.c                            |   3 +
 include/linux/mm.h                            |   8 +
 include/linux/mman.h                          |  20 +-
 include/uapi/asm-generic/mman-common.h        |   2 +
 include/uapi/asm-generic/siginfo.h            |   9 +-
 include/uapi/linux/prctl.h                    |   9 +
 mm/mmap.c                                     |   9 +
 mm/mprotect.c                                 |   8 +
 mm/shmem.c                                    |   3 +
 mm/util.c                                     |   2 +-
 scripts/Kconfig.include                       |   4 +
 50 files changed, 958 insertions(+), 42 deletions(-)
 create mode 100644 Documentation/arm64/memory-tagging-extension.rst
 create mode 100644 arch/arm64/include/asm/mman.h
 create mode 100644 arch/arm64/include/asm/mte.h
 create mode 100644 arch/arm64/include/uapi/asm/mman.h
 create mode 100644 arch/arm64/lib/mte.S
 create mode 100644 arch/arm64/mm/cmppages.c




* [PATCH 01/22] mm: Reserve asm-generic prot flags 0x10 and 0x20 for arch use
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch, Dave Martin, Arnd Bergmann

From: Dave Martin <Dave.Martin@arm.com>

The asm-generic/mman.h definitions are used by a few architectures that
also define arch-specific PROT flags with values 0x10 and 0x20. This
currently applies to sparc and powerpc for 0x10, while arm64 will soon
join with both 0x10 and 0x20.

To help future maintainers, document the use of these flags in the
asm-generic header too.

Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
[catalin.marinas@arm.com: reserve 0x20 as well]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 include/uapi/asm-generic/mman-common.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index c160a5354eb6..f94f65d429be 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -11,6 +11,8 @@
 #define PROT_WRITE	0x2		/* page can be written */
 #define PROT_EXEC	0x4		/* page can be executed */
 #define PROT_SEM	0x8		/* page may be used for atomic ops */
+/*			0x10		   reserved for arch-specific use */
+/*			0x20		   reserved for arch-specific use */
 #define PROT_NONE	0x0		/* page can not be accessed */
 #define PROT_GROWSDOWN	0x01000000	/* mprotect flag: extend change to start of growsdown vma */
 #define PROT_GROWSUP	0x02000000	/* mprotect flag: extend change to end of growsup vma */



* [PATCH 02/22] kbuild: Add support for 'as-instr' to be used in Kconfig files
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch, Masahiro Yamada, linux-kbuild, Vladimir Murzin

Similar to 'cc-option' or 'ld-option', it is occasionally necessary to
check whether the assembler supports certain ISA extensions. In the
arm64 code we currently do this in the Makefile with an additional define:

lseinstr := $(call as-instr,.arch_extension lse,-DCONFIG_AS_LSE=1)

Add the 'as-instr' option so that it can be used in Kconfig directly:

	def_bool $(as-instr,.arch_extension lse)
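
This will allow the ARM64_MTE Kconfig entry added later in this series
to depend on assembler support for the MTE instructions.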

Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: linux-kbuild@vger.kernel.org
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 scripts/Kconfig.include | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/scripts/Kconfig.include b/scripts/Kconfig.include
index d4adfbe42690..9d07e59cbdf7 100644
--- a/scripts/Kconfig.include
+++ b/scripts/Kconfig.include
@@ -31,6 +31,10 @@ cc-option = $(success,$(CC) -Werror $(CLANG_FLAGS) $(1) -E -x c /dev/null -o /de
 # Return y if the linker supports <flag>, n otherwise
 ld-option = $(success,$(LD) -v $(1))
 
+# $(as-instr,<instr>)
+# Return y if the assembler supports <instr>, n otherwise
+as-instr = $(success,printf "%b\n" "$(1)" | $(CC) $(CLANG_FLAGS) -c -x assembler -o /dev/null -)
+
 # check if $(CC) and $(LD) exist
 $(error-if,$(failure,command -v $(CC)),compiler '$(CC)' not found)
 $(error-if,$(failure,command -v $(LD)),linker '$(LD)' not found)



* [PATCH 03/22] arm64: alternative: Allow alternative_insn to always issue the first instruction
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

There are situations where we want to disable only the alternative part
of a block based on a config option while keeping the first
instruction. Improve the alternative_insn assembler macro to take a
'first_insn' argument, defaulting to 0 to preserve the current
behaviour.
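
For example, the clear_page() patch later in this series uses the new
argument so that the 'dc zva' is always issued and only replaced by
'stzgm' when MTE is present:

	alternative_insn "dc zva, x0", "stzgm xzr, [x0]", \
			 ARM64_MTE, IS_ENABLED(CONFIG_ARM64_MTE), 1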

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/alternative.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
index b9f8d787eea9..b4d3ffe16ca6 100644
--- a/arch/arm64/include/asm/alternative.h
+++ b/arch/arm64/include/asm/alternative.h
@@ -101,7 +101,11 @@ static inline void apply_alternatives_module(void *start, size_t length) { }
 	.byte \alt_len
 .endm
 
-.macro alternative_insn insn1, insn2, cap, enable = 1
+/*
+ * Disable the whole block if enable == 0, unless first_insn == 1 in which
+ * case insn1 will always be issued but without an alternative insn2.
+ */
+.macro alternative_insn insn1, insn2, cap, enable = 1, first_insn = 0
 	.if \enable
 661:	\insn1
 662:	.pushsection .altinstructions, "a"
@@ -112,6 +116,8 @@ static inline void apply_alternatives_module(void *start, size_t length) { }
 664:	.popsection
 	.org	. - (664b-663b) + (662b-661b)
 	.org	. - (662b-661b) + (664b-663b)
+	.elseif \first_insn
+	\insn1
 	.endif
 .endm
 



* [PATCH 04/22] arm64: Use macros instead of hard-coded constants for MAIR_EL1
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

Currently, the arm64 __cpu_setup has hard-coded constants for the memory
attributes that go into the MAIR_EL1 register. Define proper macros in
asm/sysreg.h and make use of them in proc.S.
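
For example, MAIR_ATTRIDX(MAIR_ATTR_NORMAL, MT_NORMAL) shifts the 0xff
attribute into byte 4 of MAIR_EL1 (MT_NORMAL being index 4), matching
the previous hard-coded MAIR(0xff, MT_NORMAL) value.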

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/sysreg.h | 12 ++++++++++++
 arch/arm64/mm/proc.S            | 27 ++++++++++-----------------
 2 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 6e919fafb43d..e21470337c5e 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -538,6 +538,18 @@
 			 SCTLR_EL1_NTWE | SCTLR_ELx_IESB | SCTLR_EL1_SPAN |\
 			 ENDIAN_SET_EL1 | SCTLR_EL1_UCI  | SCTLR_EL1_RES1)
 
+/* MAIR_ELx memory attributes (used by Linux) */
+#define MAIR_ATTR_DEVICE_nGnRnE		UL(0x00)
+#define MAIR_ATTR_DEVICE_nGnRE		UL(0x04)
+#define MAIR_ATTR_DEVICE_GRE		UL(0x0c)
+#define MAIR_ATTR_NORMAL_NC		UL(0x44)
+#define MAIR_ATTR_NORMAL_WT		UL(0xbb)
+#define MAIR_ATTR_NORMAL		UL(0xff)
+#define MAIR_ATTR_MASK			UL(0xff)
+
+/* Position the attr at the correct index */
+#define MAIR_ATTRIDX(attr, idx)		((attr) << ((idx) * 8))
+
 /* id_aa64isar0 */
 #define ID_AA64ISAR0_TS_SHIFT		52
 #define ID_AA64ISAR0_FHM_SHIFT		48
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index a1e0592d1fbc..55f715957b36 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -42,7 +42,14 @@
 #define TCR_KASAN_FLAGS 0
 #endif
 
-#define MAIR(attr, mt)	((attr) << ((mt) * 8))
+/* Default MAIR_EL1 */
+#define MAIR_EL1_SET							\
+	(MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRnE, MT_DEVICE_nGnRnE) |	\
+	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRE, MT_DEVICE_nGnRE) |	\
+	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_GRE, MT_DEVICE_GRE) |		\
+	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_NC, MT_NORMAL_NC) |		\
+	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL, MT_NORMAL) |			\
+	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_WT, MT_NORMAL_WT))
 
 #ifdef CONFIG_CPU_PM
 /**
@@ -416,23 +423,9 @@ ENTRY(__cpu_setup)
 	enable_dbg				// since this is per-cpu
 	reset_pmuserenr_el0 x0			// Disable PMU access from EL0
 	/*
-	 * Memory region attributes for LPAE:
-	 *
-	 *   n = AttrIndx[2:0]
-	 *			n	MAIR
-	 *   DEVICE_nGnRnE	000	00000000
-	 *   DEVICE_nGnRE	001	00000100
-	 *   DEVICE_GRE		010	00001100
-	 *   NORMAL_NC		011	01000100
-	 *   NORMAL		100	11111111
-	 *   NORMAL_WT		101	10111011
+	 * Memory region attributes
 	 */
-	ldr	x5, =MAIR(0x00, MT_DEVICE_nGnRnE) | \
-		     MAIR(0x04, MT_DEVICE_nGnRE) | \
-		     MAIR(0x0c, MT_DEVICE_GRE) | \
-		     MAIR(0x44, MT_NORMAL_NC) | \
-		     MAIR(0xff, MT_NORMAL) | \
-		     MAIR(0xbb, MT_NORMAL_WT)
+	mov_q	x5, MAIR_EL1_SET
 	msr	mair_el1, x5
 	/*
 	 * Prepare SCTLR



* [PATCH 05/22] arm64: mte: system register definitions
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

Add Memory Tagging Extension system register definitions together with
the relevant bitfields.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/kvm_arm.h     |  1 +
 arch/arm64/include/asm/sysreg.h      | 50 ++++++++++++++++++++++++++++
 arch/arm64/include/uapi/asm/ptrace.h |  1 +
 arch/arm64/kernel/ptrace.c           |  2 +-
 4 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 6e5d839f42b5..0b25f9a81c57 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -12,6 +12,7 @@
 #include <asm/types.h>
 
 /* Hyp Configuration Register (HCR) bits */
+#define HCR_ATA		(UL(1) << 56)
 #define HCR_FWB		(UL(1) << 46)
 #define HCR_API		(UL(1) << 41)
 #define HCR_APK		(UL(1) << 40)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index e21470337c5e..609d1b3238dd 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -171,6 +171,8 @@
 #define SYS_SCTLR_EL1			sys_reg(3, 0, 1, 0, 0)
 #define SYS_ACTLR_EL1			sys_reg(3, 0, 1, 0, 1)
 #define SYS_CPACR_EL1			sys_reg(3, 0, 1, 0, 2)
+#define SYS_RGSR_EL1			sys_reg(3, 0, 1, 0, 5)
+#define SYS_GCR_EL1			sys_reg(3, 0, 1, 0, 6)
 
 #define SYS_ZCR_EL1			sys_reg(3, 0, 1, 2, 0)
 
@@ -208,6 +210,8 @@
 #define SYS_ERXADDR_EL1			sys_reg(3, 0, 5, 4, 3)
 #define SYS_ERXMISC0_EL1		sys_reg(3, 0, 5, 5, 0)
 #define SYS_ERXMISC1_EL1		sys_reg(3, 0, 5, 5, 1)
+#define SYS_TFSR_EL1			sys_reg(3, 0, 5, 6, 0)
+#define SYS_TFSRE0_EL1			sys_reg(3, 0, 5, 6, 1)
 
 #define SYS_FAR_EL1			sys_reg(3, 0, 6, 0, 0)
 #define SYS_PAR_EL1			sys_reg(3, 0, 7, 4, 0)
@@ -358,6 +362,7 @@
 
 #define SYS_CCSIDR_EL1			sys_reg(3, 1, 0, 0, 0)
 #define SYS_CLIDR_EL1			sys_reg(3, 1, 0, 0, 1)
+#define SYS_GMID_EL1			sys_reg(3, 1, 0, 0, 4)
 #define SYS_AIDR_EL1			sys_reg(3, 1, 0, 0, 7)
 
 #define SYS_CSSELR_EL1			sys_reg(3, 2, 0, 0, 0)
@@ -411,6 +416,7 @@
 #define SYS_ESR_EL2			sys_reg(3, 4, 5, 2, 0)
 #define SYS_VSESR_EL2			sys_reg(3, 4, 5, 2, 3)
 #define SYS_FPEXC32_EL2			sys_reg(3, 4, 5, 3, 0)
+#define SYS_TFSR_EL2			sys_reg(3, 4, 5, 6, 0)
 #define SYS_FAR_EL2			sys_reg(3, 4, 6, 0, 0)
 
 #define SYS_VDISR_EL2			sys_reg(3, 4, 12, 1,  1)
@@ -467,6 +473,7 @@
 #define SYS_AFSR0_EL12			sys_reg(3, 5, 5, 1, 0)
 #define SYS_AFSR1_EL12			sys_reg(3, 5, 5, 1, 1)
 #define SYS_ESR_EL12			sys_reg(3, 5, 5, 2, 0)
+#define SYS_TFSR_EL12			sys_reg(3, 5, 5, 6, 0)
 #define SYS_FAR_EL12			sys_reg(3, 5, 6, 0, 0)
 #define SYS_MAIR_EL12			sys_reg(3, 5, 10, 2, 0)
 #define SYS_AMAIR_EL12			sys_reg(3, 5, 10, 3, 0)
@@ -482,6 +489,14 @@
 
 /* Common SCTLR_ELx flags. */
 #define SCTLR_ELx_DSSBS	(BIT(44))
+#define SCTLR_ELx_ATA	(BIT(43))
+
+#define SCTLR_ELx_TCF_SHIFT	40
+#define SCTLR_ELx_TCF_SYNC	(UL(0x1) << SCTLR_ELx_TCF_SHIFT)
+#define SCTLR_ELx_TCF_ASYNC	(UL(0x2) << SCTLR_ELx_TCF_SHIFT)
+#define SCTLR_ELx_TCF_MASK	(UL(0x3) << SCTLR_ELx_TCF_SHIFT)
+
+#define SCTLR_ELx_ITFSB	(BIT(37))
 #define SCTLR_ELx_ENIA	(BIT(31))
 #define SCTLR_ELx_ENIB	(BIT(30))
 #define SCTLR_ELx_ENDA	(BIT(27))
@@ -510,6 +525,13 @@
 #endif
 
 /* SCTLR_EL1 specific flags. */
+#define SCTLR_EL1_ATA0		(BIT(42))
+
+#define SCTLR_EL1_TCF0_SHIFT	38
+#define SCTLR_EL1_TCF0_SYNC	(UL(0x1) << SCTLR_EL1_TCF0_SHIFT)
+#define SCTLR_EL1_TCF0_ASYNC	(UL(0x2) << SCTLR_EL1_TCF0_SHIFT)
+#define SCTLR_EL1_TCF0_MASK	(UL(0x3) << SCTLR_EL1_TCF0_SHIFT)
+
 #define SCTLR_EL1_UCI		(BIT(26))
 #define SCTLR_EL1_E0E		(BIT(24))
 #define SCTLR_EL1_SPAN		(BIT(23))
@@ -544,6 +566,7 @@
 #define MAIR_ATTR_DEVICE_GRE		UL(0x0c)
 #define MAIR_ATTR_NORMAL_NC		UL(0x44)
 #define MAIR_ATTR_NORMAL_WT		UL(0xbb)
+#define MAIR_ATTR_NORMAL_TAGGED		UL(0xf0)
 #define MAIR_ATTR_NORMAL		UL(0xff)
 #define MAIR_ATTR_MASK			UL(0xff)
 
@@ -611,11 +634,16 @@
 
 /* id_aa64pfr1 */
 #define ID_AA64PFR1_SSBS_SHIFT		4
+#define ID_AA64PFR1_MTE_SHIFT		8
 
 #define ID_AA64PFR1_SSBS_PSTATE_NI	0
 #define ID_AA64PFR1_SSBS_PSTATE_ONLY	1
 #define ID_AA64PFR1_SSBS_PSTATE_INSNS	2
 
+#define ID_AA64PFR1_MTE_NI		0x0
+#define ID_AA64PFR1_MTE_EL0		0x1
+#define ID_AA64PFR1_MTE			0x2
+
 /* id_aa64zfr0 */
 #define ID_AA64ZFR0_SM4_SHIFT		40
 #define ID_AA64ZFR0_SHA3_SHIFT		32
@@ -746,6 +774,28 @@
 #define CPACR_EL1_ZEN_EL0EN	(BIT(17)) /* enable EL0 access, if EL1EN set */
 #define CPACR_EL1_ZEN		(CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN)
 
+/* TCR EL1 Bit Definitions */
+#define SYS_TCR_EL1_TCMA1	(BIT(58))
+#define SYS_TCR_EL1_TCMA0	(BIT(57))
+
+/* GCR_EL1 Definitions */
+#define SYS_GCR_EL1_RRND	(BIT(16))
+#define SYS_GCR_EL1_EXCL_MASK	0xffffUL
+
+/* RGSR_EL1 Definitions */
+#define SYS_RGSR_EL1_TAG_MASK	0xfUL
+#define SYS_RGSR_EL1_SEED_SHIFT	8
+#define SYS_RGSR_EL1_SEED_MASK	0xffffUL
+
+/* GMID_EL1 field definitions */
+#define SYS_GMID_EL1_BS_SHIFT	0
+#define SYS_GMID_EL1_BS_SIZE	4
+
+/* TFSR{,E0}_EL1 bit definitions */
+#define SYS_TFSR_EL1_TF0_SHIFT	0
+#define SYS_TFSR_EL1_TF1_SHIFT	1
+#define SYS_TFSR_EL1_TF0	(UL(1) << SYS_TFSR_EL1_TF0_SHIFT)
+#define SYS_TFSR_EL1_TF1	(UL(1) << SYS_TFSR_EL1_TF1_SHIFT)
 
 /* Safe value for MPIDR_EL1: Bit31:RES1, Bit30:U:0, Bit24:MT:0 */
 #define SYS_MPIDR_SAFE_VAL	(BIT(31))
diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h
index 7ed9294e2004..0ae569ec7960 100644
--- a/arch/arm64/include/uapi/asm/ptrace.h
+++ b/arch/arm64/include/uapi/asm/ptrace.h
@@ -49,6 +49,7 @@
 #define PSR_SSBS_BIT	0x00001000
 #define PSR_PAN_BIT	0x00400000
 #define PSR_UAO_BIT	0x00800000
+#define PSR_TCO_BIT	0x02000000
 #define PSR_V_BIT	0x10000000
 #define PSR_C_BIT	0x20000000
 #define PSR_Z_BIT	0x40000000
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 6771c399d40c..05dcb4853b8d 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1852,7 +1852,7 @@ void syscall_trace_exit(struct pt_regs *regs)
  * We also reserve IL for the kernel; SS is handled dynamically.
  */
 #define SPSR_EL1_AARCH64_RES0_BITS \
-	(GENMASK_ULL(63, 32) | GENMASK_ULL(27, 25) | GENMASK_ULL(23, 22) | \
+	(GENMASK_ULL(63, 32) | GENMASK_ULL(27, 26) | GENMASK_ULL(23, 22) | \
 	 GENMASK_ULL(20, 13) | GENMASK_ULL(11, 10) | GENMASK_ULL(5, 5))
 #define SPSR_EL1_AARCH32_RES0_BITS \
 	(GENMASK_ULL(63, 32) | GENMASK_ULL(22, 22) | GENMASK_ULL(20, 20))



* [PATCH 06/22] arm64: mte: CPU feature detection and initial sysreg configuration
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

Add the cpufeature and hwcap entries to detect the presence of MTE on
the boot CPUs (primary and secondaries). Any late secondary CPU that
does not support the feature will be parked if MTE was detected during
boot.

In addition, add the minimum SCTLR_EL1 and HCR_EL2 bits for enabling
MTE. Without a subsequent setting of MAIR_EL1, these bits have no
effect on tag checking.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/cpucaps.h    |  5 ++++-
 arch/arm64/include/asm/cpufeature.h |  6 ++++++
 arch/arm64/include/asm/hwcap.h      |  1 +
 arch/arm64/include/asm/kvm_arm.h    |  2 +-
 arch/arm64/include/asm/sysreg.h     |  1 +
 arch/arm64/include/uapi/asm/hwcap.h |  2 ++
 arch/arm64/kernel/cpufeature.c      | 29 +++++++++++++++++++++++++++++
 arch/arm64/kernel/cpuinfo.c         |  2 ++
 8 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index b92683871119..8a56604ec04e 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -56,7 +56,10 @@
 #define ARM64_WORKAROUND_CAVIUM_TX2_219_PRFM	46
 #define ARM64_WORKAROUND_1542419		47
 #define ARM64_WORKAROUND_1319367		48
+/* 49 reserved for ARM64_HAS_E0PD */
+/* 50 reserved for ARM64_BTI */
+#define ARM64_MTE				51
 
-#define ARM64_NCAPS				49
+#define ARM64_NCAPS				52
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 4261d55e8506..7d464797eff6 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -607,6 +607,12 @@ static inline bool system_uses_irq_prio_masking(void)
 	       cpus_have_const_cap(ARM64_HAS_IRQ_PRIO_MASKING);
 }
 
+static inline bool system_supports_mte(void)
+{
+	return IS_ENABLED(CONFIG_ARM64_MTE) &&
+		cpus_have_const_cap(ARM64_MTE);
+}
+
 static inline bool system_has_prio_mask_debugging(void)
 {
 	return IS_ENABLED(CONFIG_ARM64_DEBUG_PRIORITY_MASKING) &&
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 3d2f2472a36c..a98b3347c4a4 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -86,6 +86,7 @@
 #define KERNEL_HWCAP_SVESM4		__khwcap2_feature(SVESM4)
 #define KERNEL_HWCAP_FLAGM2		__khwcap2_feature(FLAGM2)
 #define KERNEL_HWCAP_FRINT		__khwcap2_feature(FRINT)
+#define KERNEL_HWCAP_MTE		__khwcap2_feature(MTE)
 
 /*
  * This yields a mask that user programs can use to figure out what
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 0b25f9a81c57..37bcb93b376c 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -78,7 +78,7 @@
 			 HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | HCR_TLOR | \
 			 HCR_FMO | HCR_IMO)
 #define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF)
-#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK)
+#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK | HCR_ATA)
 #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
 
 /* TCR_EL2 Registers bits */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 609d1b3238dd..9e5753272f4b 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -558,6 +558,7 @@
 			 SCTLR_EL1_SA0  | SCTLR_EL1_SED  | SCTLR_ELx_I    |\
 			 SCTLR_EL1_DZE  | SCTLR_EL1_UCT                   |\
 			 SCTLR_EL1_NTWE | SCTLR_ELx_IESB | SCTLR_EL1_SPAN |\
+			 SCTLR_ELx_ITFSB| SCTLR_ELx_ATA  | SCTLR_EL1_ATA0 |\
 			 ENDIAN_SET_EL1 | SCTLR_EL1_UCI  | SCTLR_EL1_RES1)
 
 /* MAIR_ELx memory attributes (used by Linux) */
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index a1e72886b30c..57a29c9b29d9 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -65,5 +65,7 @@
 #define HWCAP2_SVESM4		(1 << 6)
 #define HWCAP2_FLAGM2		(1 << 7)
 #define HWCAP2_FRINT		(1 << 8)
+/* bit 9 reserved for HWCAP2_BTI */
+#define HWCAP2_MTE		(1 << 10)
 
 #endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 04cf64e9f0c9..a3eea2cce6b0 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -172,6 +172,8 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
 
 static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = {
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_SSBS_SHIFT, 4, ID_AA64PFR1_SSBS_PSTATE_NI),
+	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_MTE),
+		       FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_MTE_SHIFT, 4, ID_AA64PFR1_MTE_NI),
 	ARM64_FTR_END,
 };
 
@@ -1267,6 +1269,17 @@ static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry,
 }
 #endif
 
+#ifdef CONFIG_ARM64_MTE
+static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
+{
+	write_sysreg_s(SYS_GCR_EL1_RRND, SYS_GCR_EL1);
+	write_sysreg_s(0, SYS_TFSR_EL1);
+	write_sysreg_s(0, SYS_TFSRE0_EL1);
+
+	isb();
+}
+#endif /* CONFIG_ARM64_MTE */
+
 static const struct arm64_cpu_capabilities arm64_features[] = {
 	{
 		.desc = "GIC system register CPU interface",
@@ -1567,6 +1580,19 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.min_field_value = 1,
 	},
 #endif
+#ifdef CONFIG_ARM64_MTE
+	{
+		.desc = "Memory Tagging Extension",
+		.capability = ARM64_MTE,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.matches = has_cpuid_feature,
+		.sys_reg = SYS_ID_AA64PFR1_EL1,
+		.field_pos = ID_AA64PFR1_MTE_SHIFT,
+		.min_field_value = ID_AA64PFR1_MTE,
+		.sign = FTR_UNSIGNED,
+		.cpu_enable = cpu_enable_mte,
+	},
+#endif /* CONFIG_ARM64_MTE */
 	{},
 };
 
@@ -1666,6 +1692,9 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
 	HWCAP_MULTI_CAP(ptr_auth_hwcap_addr_matches, CAP_HWCAP, KERNEL_HWCAP_PACA),
 	HWCAP_MULTI_CAP(ptr_auth_hwcap_gen_matches, CAP_HWCAP, KERNEL_HWCAP_PACG),
 #endif
+#ifdef CONFIG_ARM64_MTE
+	HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_MTE_SHIFT, FTR_UNSIGNED, ID_AA64PFR1_MTE, CAP_HWCAP, KERNEL_HWCAP_MTE),
+#endif /* CONFIG_ARM64_MTE */
 	{},
 };
 
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 56bba746da1c..6f686e64610e 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -84,6 +84,8 @@ static const char *const hwcap_str[] = {
 	"svesm4",
 	"flagm2",
 	"frint",
+	"",		/* reserved for BTI */
+	"mte",
 	NULL
 };
 



* [PATCH 07/22] arm64: mte: Use Normal Tagged attributes for the linear map
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

Once user space is given access to tagged memory, the kernel must be
able to clear/save/restore the tags visible to the user. This is done
via the linear mapping, therefore map it as such. The new
MT_NORMAL_TAGGED index for MAIR_EL1 is initially mapped as Normal
memory and later changed to Normal Tagged via the cpufeature
infrastructure. From the perspective of mismatched attribute aliases,
the Tagged attribute is considered a permission, so such aliasing does
not lead to undefined behaviour.

The empty_zero_page is cleared (again) to ensure that the tags it
contains are zeroed. The actual tags-aware clear_page() implementation
is part of a subsequent patch.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/memory.h       |  1 +
 arch/arm64/include/asm/pgtable-prot.h |  2 ++
 arch/arm64/kernel/cpufeature.c        | 30 +++++++++++++++++++++++++++
 arch/arm64/mm/dump.c                  |  4 ++++
 arch/arm64/mm/mmu.c                   | 22 ++++++++++++++++++--
 arch/arm64/mm/proc.S                  |  8 +++++--
 6 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index a4f9ca5479b0..55994ab362ae 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -145,6 +145,7 @@
 #define MT_NORMAL_NC		3
 #define MT_NORMAL		4
 #define MT_NORMAL_WT		5
+#define MT_NORMAL_TAGGED	6
 
 /*
  * Memory types for Stage-2 translation
diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index 8dc6c5cdabe6..ef1e565c3a79 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -37,6 +37,7 @@
 #define PROT_NORMAL_NC		(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_NC))
 #define PROT_NORMAL_WT		(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_WT))
 #define PROT_NORMAL		(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL))
+#define PROT_NORMAL_TAGGED	(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_TAGGED))
 
 #define PROT_SECT_DEVICE_nGnRE	(PROT_SECT_DEFAULT | PMD_SECT_PXN | PMD_SECT_UXN | PMD_ATTRINDX(MT_DEVICE_nGnRE))
 #define PROT_SECT_NORMAL	(PROT_SECT_DEFAULT | PMD_SECT_PXN | PMD_SECT_UXN | PMD_ATTRINDX(MT_NORMAL))
@@ -46,6 +47,7 @@
 #define _HYP_PAGE_DEFAULT	_PAGE_DEFAULT
 
 #define PAGE_KERNEL		__pgprot(PROT_NORMAL)
+#define PAGE_KERNEL_TAGGED	__pgprot(PROT_NORMAL_TAGGED)
 #define PAGE_KERNEL_RO		__pgprot((PROT_NORMAL & ~PTE_WRITE) | PTE_RDONLY)
 #define PAGE_KERNEL_ROX		__pgprot((PROT_NORMAL & ~(PTE_WRITE | PTE_PXN)) | PTE_RDONLY)
 #define PAGE_KERNEL_EXEC	__pgprot(PROT_NORMAL & ~PTE_PXN)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index a3eea2cce6b0..06f3f6677284 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1272,12 +1272,42 @@ static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry,
 #ifdef CONFIG_ARM64_MTE
 static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
 {
+	u64 mair;
+
 	write_sysreg_s(SYS_GCR_EL1_RRND, SYS_GCR_EL1);
 	write_sysreg_s(0, SYS_TFSR_EL1);
 	write_sysreg_s(0, SYS_TFSRE0_EL1);
 
+	/*
+	 * Update the MT_NORMAL_TAGGED index in MAIR_EL1. Tag checking is
+	 * disabled for the kernel, so there won't be any observable effect
+	 * other than allowing the kernel to read and write tags.
+	 */
+	mair = read_sysreg_s(SYS_MAIR_EL1);
+	mair &= ~MAIR_ATTRIDX(MAIR_ATTR_MASK, MT_NORMAL_TAGGED);
+	mair |= MAIR_ATTRIDX(MAIR_ATTR_NORMAL_TAGGED, MT_NORMAL_TAGGED);
+	write_sysreg_s(mair, SYS_MAIR_EL1);
+
 	isb();
 }
+
+static int __init system_enable_mte(void)
+{
+	if (!system_supports_mte())
+		return 0;
+
+	/* Ensure the TLB does not have stale MAIR attributes */
+	flush_tlb_all();
+
+	/*
+	 * Clear the zero page (again) so that tags are reset. This needs to
+	 * be done via the linear map which has the Tagged attribute.
+	 */
+	clear_page(lm_alias(empty_zero_page));
+
+	return 0;
+}
+core_initcall(system_enable_mte);
 #endif /* CONFIG_ARM64_MTE */
 
 static const struct arm64_cpu_capabilities arm64_features[] = {
diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
index 0a920b538a89..1f75a71e63f2 100644
--- a/arch/arm64/mm/dump.c
+++ b/arch/arm64/mm/dump.c
@@ -163,6 +163,10 @@ static const struct prot_bits pte_bits[] = {
 		.mask	= PTE_ATTRINDX_MASK,
 		.val	= PTE_ATTRINDX(MT_NORMAL),
 		.set	= "MEM/NORMAL",
+	}, {
+		.mask	= PTE_ATTRINDX_MASK,
+		.val	= PTE_ATTRINDX(MT_NORMAL_TAGGED),
+		.set	= "MEM/NORMAL-TAGGED",
 	}
 };
 
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 5a3b15a14a7f..a039a5540cd1 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -120,7 +120,7 @@ static bool pgattr_change_is_safe(u64 old, u64 new)
 	 * The following mapping attributes may be updated in live
 	 * kernel mappings without the need for break-before-make.
 	 */
-	static const pteval_t mask = PTE_PXN | PTE_RDONLY | PTE_WRITE | PTE_NG;
+	pteval_t mask = PTE_PXN | PTE_RDONLY | PTE_WRITE | PTE_NG;
 
 	/* creating or taking down mappings is always safe */
 	if (old == 0 || new == 0)
@@ -134,6 +134,19 @@ static bool pgattr_change_is_safe(u64 old, u64 new)
 	if (old & ~new & PTE_NG)
 		return false;
 
+	if (system_supports_mte()) {
+		/*
+		 * Changing the memory type between Normal and Normal-Tagged
+		 * is safe since Tagged is considered a permission attribute
+		 * from the mismatched attribute aliases perspective.
+		 */
+		if ((old & PTE_ATTRINDX_MASK) == PTE_ATTRINDX(MT_NORMAL) ||
+		    (old & PTE_ATTRINDX_MASK) == PTE_ATTRINDX(MT_NORMAL_TAGGED) ||
+		    (new & PTE_ATTRINDX_MASK) == PTE_ATTRINDX(MT_NORMAL) ||
+		    (new & PTE_ATTRINDX_MASK) == PTE_ATTRINDX(MT_NORMAL_TAGGED))
+			mask |= PTE_ATTRINDX_MASK;
+	}
+
 	return ((old ^ new) & ~mask) == 0;
 }
 
@@ -488,7 +501,12 @@ static void __init map_mem(pgd_t *pgdp)
 		if (memblock_is_nomap(reg))
 			continue;
 
-		__map_memblock(pgdp, start, end, PAGE_KERNEL, flags);
+		/*
+		 * The linear map must allow allocation tags reading/writing
+		 * if MTE is present. Otherwise, it has the same attributes as
+		 * PAGE_KERNEL.
+		 */
+		__map_memblock(pgdp, start, end, PAGE_KERNEL_TAGGED, flags);
 	}
 
 	/*
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 55f715957b36..a8ba4078aa84 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -42,14 +42,18 @@
 #define TCR_KASAN_FLAGS 0
 #endif
 
-/* Default MAIR_EL1 */
+/*
+ * Default MAIR_EL1. MT_NORMAL_TAGGED is initially mapped as Normal memory and
+ * changed later to Normal Tagged if the system supports MTE.
+ */
 #define MAIR_EL1_SET							\
 	(MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRnE, MT_DEVICE_nGnRnE) |	\
 	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRE, MT_DEVICE_nGnRE) |	\
 	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_GRE, MT_DEVICE_GRE) |		\
 	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_NC, MT_NORMAL_NC) |		\
 	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL, MT_NORMAL) |			\
-	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_WT, MT_NORMAL_WT))
+	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_WT, MT_NORMAL_WT) |		\
+	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL, MT_NORMAL_TAGGED))
 
 #ifdef CONFIG_CPU_PM
 /**



* [PATCH 08/22] arm64: mte: Assembler macros and default architecture for .S files
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

Add the multitag_transfer_size macro to the arm64 assembler.h, together
with '.arch armv8.5-a+memtag' when CONFIG_ARM64_MTE is enabled.
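
The macro reads GMID_EL1.BS and computes 4 << BS, which the subsequent
copy_page() and mte_memcmp_pages() patches use as the address stride
for their LDGM/STGM loops.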

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/assembler.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index b8cf7c85ffa2..74a649a0b6e6 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -21,8 +21,13 @@
 #include <asm/page.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
+#include <asm/sysreg.h>
 #include <asm/thread_info.h>
 
+#ifdef CONFIG_ARM64_MTE
+	.arch armv8.5-a+memtag
+#endif
+
 	.macro save_and_disable_daif, flags
 	mrs	\flags, daif
 	msr	daifset, #0xf
@@ -756,4 +761,15 @@ USER(\label, ic	ivau, \tmp2)			// invalidate I line PoU
 .Lyield_out_\@ :
 	.endm
 
+/*
+ * multitag_transfer_size - set \reg to the block size that is accessed by the
+ * LDGM/STGM instructions.
+ */
+	.macro	multitag_transfer_size, reg, tmp
+	mrs_s	\reg, SYS_GMID_EL1
+	ubfx	\reg, \reg, #SYS_GMID_EL1_BS_SHIFT, #SYS_GMID_EL1_BS_SIZE
+	mov	\tmp, #4
+	lsl	\reg, \tmp, \reg
+	.endm
+
 #endif	/* __ASM_ASSEMBLER_H */



* [PATCH 09/22] arm64: mte: Tags-aware clear_page() implementation
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

When the Memory Tagging Extension is enabled, the tags need to be set
to zero when a page is cleared, as they are visible to the user.

Introduce an MTE-aware clear_page() which clears the tags in addition
to the data.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/lib/clear_page.S | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
index 78a9ef66288a..575cea03f68a 100644
--- a/arch/arm64/lib/clear_page.S
+++ b/arch/arm64/lib/clear_page.S
@@ -5,7 +5,9 @@
 
 #include <linux/linkage.h>
 #include <linux/const.h>
+#include <asm/alternative.h>
 #include <asm/assembler.h>
+#include <asm/cpufeature.h>
 #include <asm/page.h>
 
 /*
@@ -19,8 +21,9 @@ ENTRY(clear_page)
 	and	w1, w1, #0xf
 	mov	x2, #4
 	lsl	x1, x2, x1
-
-1:	dc	zva, x0
+1:
+alternative_insn "dc zva, x0", "stzgm xzr, [x0]", \
+			 ARM64_MTE, IS_ENABLED(CONFIG_ARM64_MTE), 1
 	add	x0, x0, x1
 	tst	x0, #(PAGE_SIZE - 1)
 	b.ne	1b



* [PATCH 10/22] arm64: mte: Tags-aware copy_page() implementation
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

When the Memory Tagging Extension is enabled, the tags need to be
preserved across a page copy (e.g. for copy-on-write).

Introduce an MTE-aware copy_page() which preserves the tags across the
page copy.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/lib/copy_page.S | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/arch/arm64/lib/copy_page.S b/arch/arm64/lib/copy_page.S
index bbb8562396af..970b7a20da70 100644
--- a/arch/arm64/lib/copy_page.S
+++ b/arch/arm64/lib/copy_page.S
@@ -25,6 +25,29 @@ alternative_if ARM64_HAS_NO_HW_PREFETCH
 	prfm	pldl1strm, [x1, #384]
 alternative_else_nop_endif
 
+#ifdef CONFIG_ARM64_MTE
+alternative_if_not ARM64_MTE
+	b	2f
+alternative_else_nop_endif
+	/*
+	 * Copy tags if MTE has been enabled.
+	 */
+	mov	x2, x0
+	mov	x3, x1
+
+	multitag_transfer_size x7, x5
+1:
+	ldgm	x4, [x3]
+	stgm	x4, [x2]
+
+	add	x2, x2, x7
+	add	x3, x3, x7
+
+	tst	x2, #(PAGE_SIZE - 1)
+	b.ne	1b
+2:
+#endif
+
 	ldp	x2, x3, [x1]
 	ldp	x4, x5, [x1, #16]
 	ldp	x6, x7, [x1, #32]



* [PATCH 11/22] arm64: Tags-aware memcmp_pages() implementation
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

When the Memory Tagging Extension is enabled, two pages are identical
only if both their data and tags are identical.

Make the generic memcmp_pages() a __weak function and add an
arm64-specific implementation which takes care of the tags comparison.
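
On a mismatching 64-bit block of tags, the assembly below uses rbit and
clz to locate the least significant differing 4-bit tag, so the sign of
the returned value reflects the lowest-address difference.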

Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/mte.h | 11 +++++++++
 arch/arm64/lib/Makefile      |  2 ++
 arch/arm64/lib/mte.S         | 46 ++++++++++++++++++++++++++++++++++++
 arch/arm64/mm/Makefile       |  1 +
 arch/arm64/mm/cmppages.c     | 26 ++++++++++++++++++++
 mm/util.c                    |  2 +-
 6 files changed, 87 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/mte.h
 create mode 100644 arch/arm64/lib/mte.S
 create mode 100644 arch/arm64/mm/cmppages.c

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
new file mode 100644
index 000000000000..64e814273659
--- /dev/null
+++ b/arch/arm64/include/asm/mte.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_MTE_H
+#define __ASM_MTE_H
+
+#ifndef __ASSEMBLY__
+
+/* Memory Tagging API */
+int mte_memcmp_pages(const void *page1_addr, const void *page2_addr);
+
+#endif /* __ASSEMBLY__ */
+#endif /* __ASM_MTE_H  */
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index c21b936dc01d..178879a754b0 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -16,3 +16,5 @@ lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
 obj-$(CONFIG_CRC32) += crc32.o
 
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
+
+obj-$(CONFIG_ARM64_MTE) += mte.o
diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
new file mode 100644
index 000000000000..d41955ab4134
--- /dev/null
+++ b/arch/arm64/lib/mte.S
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2019 ARM Ltd.
+ */
+#include <linux/linkage.h>
+
+#include <asm/assembler.h>
+
+/*
+ * Compare tags of two pages
+ *   x0 - page1 address
+ *   x1 - page2 address
+ * Returns:
+ *   w0 - negative, zero or positive value if the tag in the first page is
+ *	  less than, equal to or greater than the tag in the second page
+ */
+ENTRY(mte_memcmp_pages)
+	multitag_transfer_size x7, x5
+1:
+	ldgm	x2, [x0]
+	ldgm	x3, [x1]
+
+	eor	x4, x2, x3
+	cbnz	x4, 2f
+
+	add	x0, x0, x7
+	add	x1, x1, x7
+
+	tst	x0, #(PAGE_SIZE - 1)
+	b.ne	1b
+
+	mov	w0, #0
+	ret
+2:
+	rbit	x4, x4
+	clz	x4, x4			// count the least significant equal bits
+	and	x4, x4, #~3		// round down to a multiple of 4 (bits per tag)
+
+	lsr	x2, x2, x4		// remove equal tags
+	lsr	x3, x3, x4
+
+	lsl	w2, w2, #28		// compare the differing tags
+	sub	w0, w2, w3, lsl #28
+
+	ret
+ENDPROC(mte_memcmp_pages)
diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
index 849c1df3d214..7a556e9a19dc 100644
--- a/arch/arm64/mm/Makefile
+++ b/arch/arm64/mm/Makefile
@@ -8,6 +8,7 @@ obj-$(CONFIG_ARM64_PTDUMP_CORE)	+= dump.o
 obj-$(CONFIG_ARM64_PTDUMP_DEBUGFS)	+= ptdump_debugfs.o
 obj-$(CONFIG_NUMA)		+= numa.o
 obj-$(CONFIG_DEBUG_VIRTUAL)	+= physaddr.o
+obj-$(CONFIG_ARM64_MTE)		+= cmppages.o
 KASAN_SANITIZE_physaddr.o	+= n
 
 obj-$(CONFIG_KASAN)		+= kasan_init.o
diff --git a/arch/arm64/mm/cmppages.c b/arch/arm64/mm/cmppages.c
new file mode 100644
index 000000000000..943c1877e014
--- /dev/null
+++ b/arch/arm64/mm/cmppages.c
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2019 ARM Ltd.
+ */
+
+#include <linux/mm.h>
+#include <linux/string.h>
+
+#include <asm/cpufeature.h>
+#include <asm/mte.h>
+
+int memcmp_pages(struct page *page1, struct page *page2)
+{
+	char *addr1, *addr2;
+	int ret;
+
+	addr1 = page_address(page1);
+	addr2 = page_address(page2);
+
+	ret = memcmp(addr1, addr2, PAGE_SIZE);
+	/* if page content identical, check the tags */
+	if (ret == 0 && system_supports_mte())
+		ret = mte_memcmp_pages(addr1, addr2);
+
+	return ret;
+}
diff --git a/mm/util.c b/mm/util.c
index 988d11e6c17c..662fb3da6d01 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -899,7 +899,7 @@ int get_cmdline(struct task_struct *task, char *buffer, int buflen)
 	return res;
 }
 
-int memcmp_pages(struct page *page1, struct page *page2)
+int __weak memcmp_pages(struct page *page1, struct page *page2)
 {
 	char *addr1, *addr2;
 	int ret;



* [PATCH 12/22] arm64: mte: Add specific SIGSEGV codes
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch, Arnd Bergmann

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

Add MTE-specific SIGSEGV codes to siginfo.h.

Note that for MTE we are reusing the same SPARC ADI codes because the
two functionalities are similar and cannot coexist on the same system.

Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
[catalin.marinas@arm.com: renamed precise/imprecise to sync/async]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 include/uapi/asm-generic/siginfo.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index cb3d6c267181..a5184a5438c6 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -227,8 +227,13 @@ typedef struct siginfo {
 # define SEGV_PKUERR	4	/* failed protection key checks */
 #endif
 #define SEGV_ACCADI	5	/* ADI not enabled for mapped object */
-#define SEGV_ADIDERR	6	/* Disrupting MCD error */
-#define SEGV_ADIPERR	7	/* Precise MCD exception */
+#ifdef __aarch64__
+# define SEGV_MTEAERR	6	/* Asynchronous MTE error */
+# define SEGV_MTESERR	7	/* Synchronous MTE exception */
+#else
+# define SEGV_ADIDERR	6	/* Disrupting MCD error */
+# define SEGV_ADIPERR	7	/* Precise MCD exception */
+#endif
 #define NSIGSEGV	7
 
 /*
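
A minimal user-space sketch of telling the two new si_codes apart in a
SIGSEGV handler (the values are taken from the hunk above; glibc
headers may not define the names yet):

  #include <signal.h>
  #include <string.h>
  #include <unistd.h>

  #ifndef SEGV_MTEAERR
  #define SEGV_MTEAERR	6	/* asynchronous MTE error */
  #define SEGV_MTESERR	7	/* synchronous MTE exception */
  #endif

  static void segv_handler(int sig, siginfo_t *info, void *ucontext)
  {
          const char *msg =
                  info->si_code == SEGV_MTESERR ? "sync tag check fault\n" :
                  info->si_code == SEGV_MTEAERR ? "async tag check fault\n" :
                                                  "other SIGSEGV\n";

          /* write() is async-signal-safe, unlike printf() */
          write(STDERR_FILENO, msg, strlen(msg));
          _exit(1);
  }

  /* install with SA_SIGINFO so si_code is populated:
   *   struct sigaction sa = { .sa_sigaction = segv_handler,
   *                           .sa_flags = SA_SIGINFO };
   *   sigaction(SIGSEGV, &sa, NULL);
   */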



* [PATCH 13/22] arm64: mte: Handle synchronous and asynchronous tag check faults
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

The Memory Tagging Extension has two modes of notifying a tag check
fault at EL0, configurable through the SCTLR_EL1.TCF0 field:

1. Synchronous raising of a Data Abort exception with DFSC 17.
2. Asynchronous setting of a cumulative bit in TFSRE0_EL1.

Add the exception handler for the synchronous case and, for the
asynchronous case, handle the TFSRE0_EL1.TF0 bit by setting a new TIF
flag which is consumed in do_notify_resume().
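
In the synchronous case the faulting address is reported precisely by
the Data Abort; in the asynchronous case the fault is only accumulated
in TFSRE0_EL1 and reported later without a faulting address, hence the
separate SEGV_MTEAERR code added in the previous patch.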

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/thread_info.h |  4 +++-
 arch/arm64/kernel/entry.S            | 17 +++++++++++++++++
 arch/arm64/kernel/process.c          |  7 +++++++
 arch/arm64/kernel/signal.c           |  8 ++++++++
 arch/arm64/mm/fault.c                |  9 ++++++++-
 5 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index f0cec4160136..f759a0215a71 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -63,6 +63,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
 #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
 #define TIF_FSCHECK		5	/* Check FS is USER_DS on return */
+#define TIF_MTE_ASYNC_FAULT	6	/* MTE Asynchronous Tag Check Fault */
 #define TIF_NOHZ		7
 #define TIF_SYSCALL_TRACE	8	/* syscall trace active */
 #define TIF_SYSCALL_AUDIT	9	/* syscall auditing */
@@ -93,10 +94,11 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define _TIF_FSCHECK		(1 << TIF_FSCHECK)
 #define _TIF_32BIT		(1 << TIF_32BIT)
 #define _TIF_SVE		(1 << TIF_SVE)
+#define _TIF_MTE_ASYNC_FAULT	(1 << TIF_MTE_ASYNC_FAULT)
 
 #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
 				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
-				 _TIF_UPROBE | _TIF_FSCHECK)
+				 _TIF_UPROBE | _TIF_FSCHECK | _TIF_MTE_ASYNC_FAULT)
 
 #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
 				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 7c6a0a41676f..c221a539e61d 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -144,6 +144,22 @@ alternative_cb_end
 #endif
 	.endm
 
+	// Check for MTE asynchronous tag check faults
+	.macro check_mte_async_tcf, flgs, tmp
+#ifdef CONFIG_ARM64_MTE
+alternative_if_not ARM64_MTE
+	b	1f
+alternative_else_nop_endif
+	mrs_s	\tmp, SYS_TFSRE0_EL1
+	tbz	\tmp, #SYS_TFSR_EL1_TF0_SHIFT, 1f
+	// Asynchronous TCF occurred at EL0, set the TI flag
+	orr	\flgs, \flgs, #_TIF_MTE_ASYNC_FAULT
+	str	\flgs, [tsk, #TSK_TI_FLAGS]
+	msr_s	SYS_TFSRE0_EL1, xzr
+1:
+#endif
+	.endm
+
 	.macro	kernel_entry, el, regsize = 64
 	.if	\regsize == 32
 	mov	w0, w0				// zero upper 32 bits of x0
@@ -171,6 +187,7 @@ alternative_cb_end
 	ldr	x19, [tsk, #TSK_TI_FLAGS]	// since we can unmask debug
 	disable_step_tsk x19, x20		// exceptions when scheduling.
 
+	check_mte_async_tcf x19, x22
 	apply_ssbd 1, x22, x23
 
 	.else
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 71f788cd2b18..dd98d539894e 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -317,12 +317,19 @@ static void flush_tagged_addr_state(void)
 		clear_thread_flag(TIF_TAGGED_ADDR);
 }
 
+static void flush_mte_state(void)
+{
+	if (system_supports_mte())
+		clear_thread_flag(TIF_MTE_ASYNC_FAULT);
+}
+
 void flush_thread(void)
 {
 	fpsimd_flush_thread();
 	tls_thread_flush();
 	flush_ptrace_hw_breakpoint(current);
 	flush_tagged_addr_state();
+	flush_mte_state();
 }
 
 void release_thread(struct task_struct *dead_task)
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index dd2cdc0d5be2..41fae64af82a 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -730,6 +730,9 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
 	regs->regs[29] = (unsigned long)&user->next_frame->fp;
 	regs->pc = (unsigned long)ka->sa.sa_handler;
 
+	/* TCO (Tag Check Override) always cleared for signal handlers */
+	regs->pstate &= ~PSR_TCO_BIT;
+
 	if (ka->sa.sa_flags & SA_RESTORER)
 		sigtramp = ka->sa.sa_restorer;
 	else
@@ -921,6 +924,11 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 			if (thread_flags & _TIF_UPROBE)
 				uprobe_notify_resume(regs);
 
+			if (thread_flags & _TIF_MTE_ASYNC_FAULT) {
+				clear_thread_flag(TIF_MTE_ASYNC_FAULT);
+				force_signal_inject(SIGSEGV, SEGV_MTEAERR, 0);
+			}
+
 			if (thread_flags & _TIF_SIGPENDING)
 				do_signal(regs);
 
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 077b02a2d4d3..ef3bfa2bf2b1 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -660,6 +660,13 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
 	return 0;
 }
 
+static int do_tag_check_fault(unsigned long addr, unsigned int esr,
+			      struct pt_regs *regs)
+{
+	do_bad_area(addr, esr, regs);
+	return 0;
+}
+
 static const struct fault_info fault_info[] = {
 	{ do_bad,		SIGKILL, SI_KERNEL,	"ttbr address size fault"	},
 	{ do_bad,		SIGKILL, SI_KERNEL,	"level 1 address size fault"	},
@@ -678,7 +685,7 @@ static const struct fault_info fault_info[] = {
 	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 2 permission fault"	},
 	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 3 permission fault"	},
 	{ do_sea,		SIGBUS,  BUS_OBJERR,	"synchronous external abort"	},
-	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 17"			},
+	{ do_tag_check_fault,	SIGSEGV, SEGV_MTESERR,	"synchronous tag check fault"	},
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 18"			},
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 19"			},
 	{ do_sea,		SIGKILL, SI_KERNEL,	"level 0 (translation table walk)"	},


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 14/22] mm: Introduce arch_calc_vm_flag_bits()
  2019-12-11 18:40 [PATCH 00/22] arm64: Memory Tagging Extension user-space support Catalin Marinas
                   ` (12 preceding siblings ...)
  2019-12-11 18:40 ` [PATCH 13/22] arm64: mte: Handle synchronous and asynchronous tag check faults Catalin Marinas
@ 2019-12-11 18:40 ` Catalin Marinas
  2019-12-11 18:40 ` [PATCH 15/22] arm64: mte: Add PROT_MTE support to mmap() and mprotect() Catalin Marinas
                   ` (8 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

Similarly to arch_calc_vm_prot_bits(), introduce a dummy
arch_calc_vm_flag_bits() invoked from calc_vm_flag_bits(). This macro
can be overridden by architectures to insert specific VM_* flags derived
from the mmap() MAP_* flags.
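
As an illustration, an architecture would override the hook from its
<asm/mman.h> along these lines (VM_SAMPLE is a made-up flag; the real
arm64 implementation is introduced later in this series):

  static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
  {
          /* translate an mmap() MAP_* request into an arch VM_* flag */
          return (flags & MAP_ANONYMOUS) ? VM_SAMPLE : 0;
  }
  #define arch_calc_vm_flag_bits arch_calc_vm_flag_bits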

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 include/linux/mman.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/mman.h b/include/linux/mman.h
index 4b08e9c9c538..c53b194c11a1 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -81,6 +81,10 @@ static inline void vm_unacct_memory(long pages)
 #define arch_calc_vm_prot_bits(prot, pkey) 0
 #endif
 
+#ifndef arch_calc_vm_flag_bits
+#define arch_calc_vm_flag_bits(flags) 0
+#endif
+
 #ifndef arch_vm_get_page_prot
 #define arch_vm_get_page_prot(vm_flags) __pgprot(0)
 #endif
@@ -131,7 +135,8 @@ calc_vm_flag_bits(unsigned long flags)
 	return _calc_vm_trans(flags, MAP_GROWSDOWN,  VM_GROWSDOWN ) |
 	       _calc_vm_trans(flags, MAP_DENYWRITE,  VM_DENYWRITE ) |
 	       _calc_vm_trans(flags, MAP_LOCKED,     VM_LOCKED    ) |
-	       _calc_vm_trans(flags, MAP_SYNC,	     VM_SYNC      );
+	       _calc_vm_trans(flags, MAP_SYNC,	     VM_SYNC      ) |
+	       arch_calc_vm_flag_bits(flags);
 }
 
 unsigned long vm_commit_limit(void);


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 15/22] arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  2019-12-11 18:40 [PATCH 00/22] arm64: Memory Tagging Extension user-space support Catalin Marinas
                   ` (13 preceding siblings ...)
  2019-12-11 18:40 ` [PATCH 14/22] mm: Introduce arch_calc_vm_flag_bits() Catalin Marinas
@ 2019-12-11 18:40 ` Catalin Marinas
  2020-01-21 22:06   ` Peter Collingbourne
  2019-12-11 18:40 ` [PATCH 16/22] mm: Introduce arch_validate_flags() Catalin Marinas
                   ` (7 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

To enable tagging on a memory range, the user must explicitly opt in via
a new PROT_MTE flag passed to mmap() or mprotect(). Since this is a new
memory type in the AttrIndx field of a pte, simplify the or'ing of these
bits over the protection_map[] attributes by making MT_NORMAL index 0.

There are two conditions for arch_vm_get_page_prot() to return the
MT_NORMAL_TAGGED memory type: (1) the user requested it via PROT_MTE,
registered as VM_MTE in the vm_flags, and (2) the vma supports MTE,
decided during the mmap() call (only) and registered as VM_MTE_ALLOWED.

arch_calc_vm_prot_bits() is responsible for registering the user request
as VM_MTE. The newly introduced arch_calc_vm_flag_bits() sets
VM_MTE_ALLOWED if the mapping is MAP_ANONYMOUS. An MTE-capable
filesystem (RAM-based) may be able to set VM_MTE_ALLOWED during its
mmap() file ops call.

In addition, update VM_DATA_DEFAULT_FLAGS to allow mprotect(PROT_MTE) on
stack or brk area.

The Linux mmap() syscall currently ignores unknown PROT_* flags. In the
presence of MTE, an mmap(PROT_MTE) on a file which does not support MTE
will not report an error and the memory will not be mapped as Normal
Tagged. For consistency, mprotect(PROT_MTE) will not report an error
either if the memory range does not support MTE. Two subsequent patches
in the series will propose tightening of this behaviour.
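
As a usage sketch (error handling elided; PROT_MTE comes from the new
uapi header added by this patch):

  #include <sys/mman.h>

  #ifndef PROT_MTE
  #define PROT_MTE	0x20	/* arch/arm64/include/uapi/asm/mman.h */
  #endif

  static void *map_tagged(size_t size)
  {
          /* request the Tagged memory type at mmap() time... */
          void *p = mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_MTE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

          /* ...or enable it later on an existing anonymous range */
          mprotect(p, size, PROT_READ | PROT_WRITE | PROT_MTE);

          return p;
  }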

Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/memory.h    | 18 +++++----
 arch/arm64/include/asm/mman.h      | 64 ++++++++++++++++++++++++++++++
 arch/arm64/include/asm/page.h      |  4 +-
 arch/arm64/include/asm/pgtable.h   |  7 +++-
 arch/arm64/include/uapi/asm/mman.h | 14 +++++++
 fs/proc/task_mmu.c                 |  3 ++
 include/linux/mm.h                 |  8 ++++
 7 files changed, 109 insertions(+), 9 deletions(-)
 create mode 100644 arch/arm64/include/asm/mman.h
 create mode 100644 arch/arm64/include/uapi/asm/mman.h

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 55994ab362ae..f0e535895a78 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -138,14 +138,18 @@
 
 /*
  * Memory types available.
+ *
+ * IMPORTANT: MT_NORMAL must be index 0 since vm_get_page_prot() may 'or' in
+ *	      the MT_NORMAL_TAGGED memory type for PROT_MTE mappings. Note
+ *	      that protection_map[] only contains MT_NORMAL attributes.
  */
-#define MT_DEVICE_nGnRnE	0
-#define MT_DEVICE_nGnRE		1
-#define MT_DEVICE_GRE		2
-#define MT_NORMAL_NC		3
-#define MT_NORMAL		4
-#define MT_NORMAL_WT		5
-#define MT_NORMAL_TAGGED	6
+#define MT_NORMAL		0
+#define MT_NORMAL_TAGGED	1
+#define MT_NORMAL_NC		2
+#define MT_NORMAL_WT		3
+#define MT_DEVICE_nGnRnE	4
+#define MT_DEVICE_nGnRE		5
+#define MT_DEVICE_GRE		6
 
 /*
  * Memory types for Stage-2 translation
diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
new file mode 100644
index 000000000000..c77a23869223
--- /dev/null
+++ b/arch/arm64/include/asm/mman.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_MMAN_H__
+#define __ASM_MMAN_H__
+
+#include <uapi/asm/mman.h>
+
+/*
+ * There are two conditions required for returning a Normal Tagged memory type
+ * in arch_vm_get_page_prot(): (1) the user requested it via PROT_MTE passed
+ * to mmap() or mprotect() and (2) the corresponding vma supports MTE. We
+ * register (1) as VM_MTE in the vma->vm_flags and (2) as VM_MTE_ALLOWED. Note
+ * that the latter can only be set during the mmap() call since mprotect()
+ * does not accept MAP_* flags.
+ */
+static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
+						   unsigned long pkey)
+{
+	if (!system_supports_mte())
+		return 0;
+
+	if (prot & PROT_MTE)
+		return VM_MTE;
+
+	return 0;
+}
+#define arch_calc_vm_prot_bits arch_calc_vm_prot_bits
+
+static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
+{
+	if (!system_supports_mte())
+		return 0;
+
+	/*
+	 * Only allow MTE on anonymous mappings as these are guaranteed to be
+	 * backed by tags-capable memory. The vm_flags may be overridden by a
+	 * filesystem supporting MTE (RAM-based).
+	 */
+	if (flags & MAP_ANONYMOUS)
+		return VM_MTE_ALLOWED;
+
+	return 0;
+}
+#define arch_calc_vm_flag_bits arch_calc_vm_flag_bits
+
+static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
+{
+	return (vm_flags & VM_MTE) && (vm_flags & VM_MTE_ALLOWED) ?
+		__pgprot(PTE_ATTRINDX(MT_NORMAL_TAGGED)) :
+		__pgprot(0);
+}
+#define arch_vm_get_page_prot arch_vm_get_page_prot
+
+static inline bool arch_validate_prot(unsigned long prot, unsigned long addr)
+{
+	unsigned long supported = PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM;
+
+	if (system_supports_mte())
+		supported |= PROT_MTE;
+
+	return (prot & ~supported) == 0;
+}
+#define arch_validate_prot arch_validate_prot
+
+#endif /* !__ASM_MMAN_H__ */
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index d39ddb258a04..10d71f927b70 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -32,9 +32,11 @@ extern int pfn_valid(unsigned long);
 
 #endif /* !__ASSEMBLY__ */
 
+/* Used for stack and brk memory ranges */
 #define VM_DATA_DEFAULT_FLAGS \
 	(((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
-	 VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+	 VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC | \
+	 VM_MTE_ALLOWED)
 
 #include <asm-generic/getorder.h>
 
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 5d15b4735a0e..e5e2cb6f2f3c 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -661,8 +661,13 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
 
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
+	/*
+	 * Normal and Normal-Tagged are two different memory types and indices
+	 * in MAIR_EL1. The mask below has to include PTE_ATTRINDX_MASK.
+	 */
 	const pteval_t mask = PTE_USER | PTE_PXN | PTE_UXN | PTE_RDONLY |
-			      PTE_PROT_NONE | PTE_VALID | PTE_WRITE;
+			      PTE_PROT_NONE | PTE_VALID | PTE_WRITE |
+			      PTE_ATTRINDX_MASK;
 	/* preserve the hardware dirty information */
 	if (pte_hw_dirty(pte))
 		pte = pte_mkdirty(pte);
diff --git a/arch/arm64/include/uapi/asm/mman.h b/arch/arm64/include/uapi/asm/mman.h
new file mode 100644
index 000000000000..d7677ee84878
--- /dev/null
+++ b/arch/arm64/include/uapi/asm/mman.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI__ASM_MMAN_H
+#define _UAPI__ASM_MMAN_H
+
+#include <asm-generic/mman.h>
+
+/*
+ * The generic mman.h file reserves 0x10 and 0x20 for arch-specific PROT_*
+ * flags.
+ */
+/* 0x10 reserved for PROT_BTI */
+#define PROT_MTE	 0x20		/* Normal Tagged mapping */
+
+#endif /* !_UAPI__ASM_MMAN_H */
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 9442631fd4af..34bc9e0b4896 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -677,6 +677,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 		[ilog2(VM_MERGEABLE)]	= "mg",
 		[ilog2(VM_UFFD_MISSING)]= "um",
 		[ilog2(VM_UFFD_WP)]	= "uw",
+#ifdef CONFIG_ARM64_MTE
+		[ilog2(VM_MTE)]		= "mt",
+#endif
 #ifdef CONFIG_ARCH_HAS_PKEYS
 		/* These come out via ProtectionKey: */
 		[ilog2(VM_PKEY_BIT0)]	= "",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c97ea3b694e6..cf59b4558bbe 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -340,6 +340,14 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MPX		VM_NONE
 #endif
 
+#if defined(CONFIG_ARM64_MTE)
+# define VM_MTE		VM_HIGH_ARCH_0	/* Use Tagged memory for access control */
+# define VM_MTE_ALLOWED	VM_HIGH_ARCH_1	/* Tagged memory permitted */
+#else
+# define VM_MTE		VM_NONE
+# define VM_MTE_ALLOWED	VM_NONE
+#endif
+
 #ifndef VM_GROWSUP
 # define VM_GROWSUP	VM_NONE
 #endif


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 16/22] mm: Introduce arch_validate_flags()
  2019-12-11 18:40 [PATCH 00/22] arm64: Memory Tagging Extension user-space support Catalin Marinas
                   ` (14 preceding siblings ...)
  2019-12-11 18:40 ` [PATCH 15/22] arm64: mte: Add PROT_MTE support to mmap() and mprotect() Catalin Marinas
@ 2019-12-11 18:40 ` Catalin Marinas
  2019-12-11 18:40 ` [PATCH 17/22] arm64: mte: Validate the PROT_MTE request via arch_validate_flags() Catalin Marinas
                   ` (6 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

Similarly to arch_validate_prot() called from do_mprotect_pkey(), an
architecture may need to sanity-check the new vm_flags.

Define a dummy function always returning true. In addition to
do_mprotect_pkey(), also invoke it from mmap_region() prior to updating
vma->vm_page_prot to allow the architecture code to veto potentially
inconsistent vm_flags.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 include/linux/mman.h | 13 +++++++++++++
 mm/mmap.c            |  9 +++++++++
 mm/mprotect.c        |  8 ++++++++
 3 files changed, 30 insertions(+)

diff --git a/include/linux/mman.h b/include/linux/mman.h
index c53b194c11a1..686fa6c98ce7 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -103,6 +103,19 @@ static inline bool arch_validate_prot(unsigned long prot, unsigned long addr)
 #define arch_validate_prot arch_validate_prot
 #endif
 
+#ifndef arch_validate_flags
+/*
+ * This is called from mprotect() with the updated vma->vm_flags.
+ *
+ * Returns true if the VM_* flags are valid.
+ */
+static inline bool arch_validate_flags(unsigned long flags)
+{
+	return true;
+}
+#define arch_validate_flags arch_validate_flags
+#endif
+
 /*
  * Optimisation macro.  It is equivalent to:
  *      (x & bit1) ? bit2 : 0
diff --git a/mm/mmap.c b/mm/mmap.c
index 9c648524e4dc..433355c5bdf1 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1804,6 +1804,15 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		vma_set_anonymous(vma);
 	}
 
+	/* Allow architectures to sanity-check the vm_flags */
+	if (!arch_validate_flags(vma->vm_flags)) {
+		error = -EINVAL;
+		if (file)
+			goto unmap_and_free_vma;
+		else
+			goto free_vma;
+	}
+
 	vma_link(mm, vma, prev, rb_link, rb_parent);
 	/* Once vma denies write, undo our temporary denial count */
 	if (file) {
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 7a8e84f86831..787b071bcbfd 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -543,6 +543,14 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 			goto out;
 		}
 
+		/*
+		 * Allow architectures to sanity-check the new flags.
+		 */
+		if (!arch_validate_flags(newflags)) {
+			error = -EINVAL;
+			goto out;
+		}
+
 		error = security_file_mprotect(vma, reqprot, prot);
 		if (error)
 			goto out;


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 17/22] arm64: mte: Validate the PROT_MTE request via arch_validate_flags()
  2019-12-11 18:40 [PATCH 00/22] arm64: Memory Tagging Extension user-space support Catalin Marinas
                   ` (15 preceding siblings ...)
  2019-12-11 18:40 ` [PATCH 16/22] mm: Introduce arch_validate_flags() Catalin Marinas
@ 2019-12-11 18:40 ` Catalin Marinas
  2019-12-11 18:40 ` [PATCH 18/22] mm: Allow arm64 mmap(PROT_MTE) on RAM-based files Catalin Marinas
                   ` (5 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

Make use of the newly introduced arch_validate_flags() hook to
sanity-check the PROT_MTE request passed to mmap() and mprotect(). If
the mapping does not support MTE, these syscalls will return -EINVAL.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/mman.h | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
index c77a23869223..5c356d1ca266 100644
--- a/arch/arm64/include/asm/mman.h
+++ b/arch/arm64/include/asm/mman.h
@@ -44,7 +44,11 @@ static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
 
 static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
 {
-	return (vm_flags & VM_MTE) && (vm_flags & VM_MTE_ALLOWED) ?
+	/*
+	 * Checking for VM_MTE only is sufficient since arch_validate_flags()
+	 * does not permit (VM_MTE & !VM_MTE_ALLOWED).
+	 */
+	return (vm_flags & VM_MTE) ?
 		__pgprot(PTE_ATTRINDX(MT_NORMAL_TAGGED)) :
 		__pgprot(0);
 }
@@ -61,4 +65,14 @@ static inline bool arch_validate_prot(unsigned long prot, unsigned long addr)
 }
 #define arch_validate_prot arch_validate_prot
 
+static inline bool arch_validate_flags(unsigned long flags)
+{
+	if (!system_supports_mte())
+		return true;
+
+	/* only allow VM_MTE if VM_MTE_ALLOWED has been set previously */
+	return !(flags & VM_MTE) || (flags & VM_MTE_ALLOWED);
+}
+#define arch_validate_flags arch_validate_flags
+
 #endif /* !__ASM_MMAN_H__ */


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 18/22] mm: Allow arm64 mmap(PROT_MTE) on RAM-based files
  2019-12-11 18:40 [PATCH 00/22] arm64: Memory Tagging Extension user-space support Catalin Marinas
                   ` (16 preceding siblings ...)
  2019-12-11 18:40 ` [PATCH 17/22] arm64: mte: Validate the PROT_MTE request via arch_validate_flags() Catalin Marinas
@ 2019-12-11 18:40 ` Catalin Marinas
  2019-12-11 18:40 ` [PATCH 19/22] arm64: mte: Allow user control of the tag check mode via prctl() Catalin Marinas
                   ` (4 subsequent siblings)
  22 siblings, 0 replies; 51+ messages in thread
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

Since arm64 memory (allocation) tags can only be stored in RAM, mapping
files with PROT_MTE is not allowed by default. RAM-based files like
those in a tmpfs mount or memfd_create() can support memory tagging, so
update the vm_flags accordingly in shmem_mmap().
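
A sketch of the case this enables (memfd_create() needs _GNU_SOURCE and
a recent glibc; PROT_MTE is the arm64 uapi flag introduced earlier in
the series):

  #define _GNU_SOURCE
  #include <sys/mman.h>
  #include <unistd.h>

  #ifndef PROT_MTE
  #define PROT_MTE	0x20	/* arch/arm64/include/uapi/asm/mman.h */
  #endif

  static void *map_tagged_memfd(size_t size)
  {
          int fd = memfd_create("tagged", 0);

          ftruncate(fd, size);
          /* shmem_mmap() sets VM_MTE_ALLOWED, so PROT_MTE is accepted */
          return mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_MTE,
                      MAP_SHARED, fd, 0);
  }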

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 mm/shmem.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/shmem.c b/mm/shmem.c
index 165fa6332993..1b1753f90e2d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2225,6 +2225,9 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
 			vma->vm_flags &= ~(VM_MAYWRITE);
 	}
 
+	/* arm64 - allow memory tagging on RAM-based files */
+	vma->vm_flags |= VM_MTE_ALLOWED;
+
 	file_accessed(file);
 	vma->vm_ops = &shmem_vm_ops;
 	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGE_PAGECACHE) &&


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 19/22] arm64: mte: Allow user control of the tag check mode via prctl()
  2019-12-11 18:40 [PATCH 00/22] arm64: Memory Tagging Extension user-space support Catalin Marinas
                   ` (17 preceding siblings ...)
  2019-12-11 18:40 ` [PATCH 18/22] mm: Allow arm64 mmap(PROT_MTE) on RAM-based files Catalin Marinas
@ 2019-12-11 18:40 ` Catalin Marinas
  2019-12-19 20:32   ` Peter Collingbourne
  2019-12-27 14:34   ` [PATCH 19/22] arm64: mte: Allow user control of the tag check mode via prctl() Kevin Brodsky
  2019-12-11 18:40 ` [PATCH 20/22] arm64: mte: Allow user control of the excluded tags " Catalin Marinas
                   ` (3 subsequent siblings)
  22 siblings, 2 replies; 51+ messages in thread
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

By default, even if PROT_MTE is set on a memory range, there is no tag
check fault reporting (SIGSEGV). Introduce a set of options to the
existing prctl(PR_SET_TAGGED_ADDR_CTRL) to allow user control of the tag
check fault mode:

  PR_MTE_TCF_NONE  - no reporting (default)
  PR_MTE_TCF_SYNC  - synchronous tag check fault reporting
  PR_MTE_TCF_ASYNC - asynchronous tag check fault reporting

These options translate into the corresponding SCTLR_EL1.TCF0 bitfield,
context-switched by the kernel. Note that uaccess done by the kernel is
not checked and cannot be configured by the user.
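
For example, a thread would enable synchronous tag check fault reporting
with (constants from the include/uapi/linux/prctl.h hunk below; assumes
a libc exposing the new definitions):

  #include <stdio.h>
  #include <sys/prctl.h>

  int main(void)
  {
          if (prctl(PR_SET_TAGGED_ADDR_CTRL,
                    PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC, 0, 0, 0)) {
                  perror("PR_SET_TAGGED_ADDR_CTRL");
                  return 1;
          }
          return 0;
  }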

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/processor.h |   3 +
 arch/arm64/kernel/process.c        | 119 +++++++++++++++++++++++++++--
 include/uapi/linux/prctl.h         |   6 ++
 3 files changed, 123 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 5ba63204d078..91aa270afc7d 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -148,6 +148,9 @@ struct thread_struct {
 #ifdef CONFIG_ARM64_PTR_AUTH
 	struct ptrauth_keys	keys_user;
 #endif
+#ifdef CONFIG_ARM64_MTE
+	u64			sctlr_tcf0;
+#endif
 };
 
 static inline void arch_thread_struct_whitelist(unsigned long *offset,
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index dd98d539894e..47ce98f47253 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -317,11 +317,22 @@ static void flush_tagged_addr_state(void)
 		clear_thread_flag(TIF_TAGGED_ADDR);
 }
 
+#ifdef CONFIG_ARM64_MTE
+static void flush_mte_state(void)
+{
+	if (!system_supports_mte())
+		return;
+
+	/* clear any pending asynchronous tag fault */
+	clear_thread_flag(TIF_MTE_ASYNC_FAULT);
+	/* disable tag checking */
+	current->thread.sctlr_tcf0 = 0;
+}
+#else
 static void flush_mte_state(void)
 {
-	if (system_supports_mte())
-		clear_thread_flag(TIF_MTE_ASYNC_FAULT);
 }
+#endif
 
 void flush_thread(void)
 {
@@ -484,6 +495,29 @@ static void ssbs_thread_switch(struct task_struct *next)
 		set_ssbs_bit(regs);
 }
 
+#ifdef CONFIG_ARM64_MTE
+static void update_sctlr_el1_tcf0(u64 tcf0)
+{
+	/* no need for ISB since this only affects EL0, implicit with ERET */
+	sysreg_clear_set(sctlr_el1, SCTLR_EL1_TCF0_MASK, tcf0);
+}
+
+/* Handle MTE thread switch */
+static void mte_thread_switch(struct task_struct *next)
+{
+	if (!system_supports_mte())
+		return;
+
+	/* avoid expensive SCTLR_EL1 accesses if no change */
+	if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
+		update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
+}
+#else
+static void mte_thread_switch(struct task_struct *next)
+{
+}
+#endif
+
 /*
  * We store our current task in sp_el0, which is clobbered by userspace. Keep a
  * shadow copy so that we can restore this upon entry from userspace.
@@ -514,6 +548,7 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
 	uao_thread_switch(next);
 	ptrauth_thread_switch(next);
 	ssbs_thread_switch(next);
+	mte_thread_switch(next);
 
 	/*
 	 * Complete any pending TLB or cache maintenance on this CPU in case
@@ -574,6 +609,67 @@ void arch_setup_new_exec(void)
 	ptrauth_thread_init_user(current);
 }
 
+#ifdef CONFIG_ARM64_MTE
+static long set_mte_ctrl(unsigned long arg)
+{
+	u64 tcf0;
+
+	if (!system_supports_mte())
+		return 0;
+
+	switch (arg & PR_MTE_TCF_MASK) {
+	case PR_MTE_TCF_NONE:
+		tcf0 = 0;
+		break;
+	case PR_MTE_TCF_SYNC:
+		tcf0 = SCTLR_EL1_TCF0_SYNC;
+		break;
+	case PR_MTE_TCF_ASYNC:
+		tcf0 = SCTLR_EL1_TCF0_ASYNC;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/*
+	 * mte_thread_switch() checks current->thread.sctlr_tcf0 as an
+	 * optimisation. Disable preemption so that it does not see
+	 * the variable update before the SCTLR_EL1.TCF0 one.
+	 */
+	preempt_disable();
+	current->thread.sctlr_tcf0 = tcf0;
+	update_sctlr_el1_tcf0(tcf0);
+	preempt_enable();
+
+	return 0;
+}
+
+static long get_mte_ctrl(void)
+{
+	if (!system_supports_mte())
+		return 0;
+
+	switch (current->thread.sctlr_tcf0) {
+	case SCTLR_EL1_TCF0_SYNC:
+		return PR_MTE_TCF_SYNC;
+	case SCTLR_EL1_TCF0_ASYNC:
+		return PR_MTE_TCF_ASYNC;
+	}
+
+	return 0;
+}
+#else
+static long set_mte_ctrl(unsigned long arg)
+{
+	return 0;
+}
+
+static long get_mte_ctrl(void)
+{
+	return 0;
+}
+#endif
+
 #ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
 /*
  * Control the relaxed ABI allowing tagged user addresses into the kernel.
@@ -582,9 +678,15 @@ static unsigned int tagged_addr_disabled;
 
 long set_tagged_addr_ctrl(unsigned long arg)
 {
+	unsigned long valid_mask = PR_TAGGED_ADDR_ENABLE;
+
 	if (is_compat_task())
 		return -EINVAL;
-	if (arg & ~PR_TAGGED_ADDR_ENABLE)
+
+	if (system_supports_mte())
+		valid_mask |= PR_MTE_TCF_MASK;
+
+	if (arg & ~valid_mask)
 		return -EINVAL;
 
 	/*
@@ -594,6 +696,9 @@ long set_tagged_addr_ctrl(unsigned long arg)
 	if (arg & PR_TAGGED_ADDR_ENABLE && tagged_addr_disabled)
 		return -EINVAL;
 
+	if (set_mte_ctrl(arg) != 0)
+		return -EINVAL;
+
 	update_thread_flag(TIF_TAGGED_ADDR, arg & PR_TAGGED_ADDR_ENABLE);
 
 	return 0;
@@ -601,13 +706,17 @@ long set_tagged_addr_ctrl(unsigned long arg)
 
 long get_tagged_addr_ctrl(void)
 {
+	long ret = 0;
+
 	if (is_compat_task())
 		return -EINVAL;
 
 	if (test_thread_flag(TIF_TAGGED_ADDR))
-		return PR_TAGGED_ADDR_ENABLE;
+		ret = PR_TAGGED_ADDR_ENABLE;
 
-	return 0;
+	ret |= get_mte_ctrl();
+
+	return ret;
 }
 
 /*
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 7da1b37b27aa..5e9323e66a38 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -233,5 +233,11 @@ struct prctl_mm_map {
 #define PR_SET_TAGGED_ADDR_CTRL		55
 #define PR_GET_TAGGED_ADDR_CTRL		56
 # define PR_TAGGED_ADDR_ENABLE		(1UL << 0)
+/* MTE tag check fault modes */
+# define PR_MTE_TCF_SHIFT		1
+# define PR_MTE_TCF_NONE		(0UL << PR_MTE_TCF_SHIFT)
+# define PR_MTE_TCF_SYNC		(1UL << PR_MTE_TCF_SHIFT)
+# define PR_MTE_TCF_ASYNC		(2UL << PR_MTE_TCF_SHIFT)
+# define PR_MTE_TCF_MASK		(3UL << PR_MTE_TCF_SHIFT)
 
 #endif /* _LINUX_PRCTL_H */


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 20/22] arm64: mte: Allow user control of the excluded tags via prctl()
  2019-12-11 18:40 [PATCH 00/22] arm64: Memory Tagging Extension user-space support Catalin Marinas
                   ` (18 preceding siblings ...)
  2019-12-11 18:40 ` [PATCH 19/22] arm64: mte: Allow user control of the tag check mode via prctl() Catalin Marinas
@ 2019-12-11 18:40 ` Catalin Marinas
  2019-12-16 14:20   ` Kevin Brodsky
  2019-12-11 18:40 ` [PATCH 21/22] arm64: mte: Kconfig entry Catalin Marinas
                   ` (2 subsequent siblings)
  22 siblings, 1 reply; 51+ messages in thread
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

The IRG, ADDG and SUBG instructions insert a random tag in the resulting
address. Certain tags can be excluded via the GCR_EL1.Exclude bitmap
when, for example, the user wants a certain colour for freed buffers.
Since the GCR_EL1 register is not accessible at EL0, extend the
prctl(PR_SET_TAGGED_ADDR_CTRL) interface to include a 16-bit field in
the first argument for controlling the excluded tags. This setting is
per-thread.
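
For example, building on the prctl() sketch from the previous patch, a
thread would exclude tags 0 and 15 from random generation (a set bit in
the 16-bit field excludes the corresponding tag) with:

  unsigned long excl = (1UL << 0) | (1UL << 15);

  prctl(PR_SET_TAGGED_ADDR_CTRL,
        PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC |
        (excl << PR_MTE_EXCL_SHIFT),
        0, 0, 0);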

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/processor.h |  1 +
 arch/arm64/include/asm/sysreg.h    |  7 +++++++
 arch/arm64/kernel/process.c        | 27 +++++++++++++++++++++++----
 include/uapi/linux/prctl.h         |  3 +++
 4 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 91aa270afc7d..5b6988035334 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -150,6 +150,7 @@ struct thread_struct {
 #endif
 #ifdef CONFIG_ARM64_MTE
 	u64			sctlr_tcf0;
+	u64			gcr_excl;
 #endif
 };
 
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 9e5753272f4b..b6bb6d31f1cd 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -901,6 +901,13 @@
 		write_sysreg(__scs_new, sysreg);			\
 } while (0)
 
+#define sysreg_clear_set_s(sysreg, clear, set) do {			\
+	u64 __scs_val = read_sysreg_s(sysreg);				\
+	u64 __scs_new = (__scs_val & ~(u64)(clear)) | (set);		\
+	if (__scs_new != __scs_val)					\
+		write_sysreg_s(__scs_new, sysreg);			\
+} while (0)
+
 #endif
 
 #endif	/* __ASM_SYSREG_H */
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 47ce98f47253..5ec6889795fc 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -502,6 +502,15 @@ static void update_sctlr_el1_tcf0(u64 tcf0)
 	sysreg_clear_set(sctlr_el1, SCTLR_EL1_TCF0_MASK, tcf0);
 }
 
+static void update_gcr_el1_excl(u64 excl)
+{
+	/*
+	 * No need for ISB since this only affects EL0 currently, implicit
+	 * with ERET.
+	 */
+	sysreg_clear_set_s(SYS_GCR_EL1, SYS_GCR_EL1_EXCL_MASK, excl);
+}
+
 /* Handle MTE thread switch */
 static void mte_thread_switch(struct task_struct *next)
 {
@@ -511,6 +520,7 @@ static void mte_thread_switch(struct task_struct *next)
 	/* avoid expensive SCTLR_EL1 accesses if no change */
 	if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
 		update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
+	update_gcr_el1_excl(next->thread.gcr_excl);
 }
 #else
 static void mte_thread_switch(struct task_struct *next)
@@ -641,22 +651,31 @@ static long set_mte_ctrl(unsigned long arg)
 	update_sctlr_el1_tcf0(tcf0);
 	preempt_enable();
 
+	current->thread.gcr_excl = (arg & PR_MTE_EXCL_MASK) >> PR_MTE_EXCL_SHIFT;
+	update_gcr_el1_excl(current->thread.gcr_excl);
+
 	return 0;
 }
 
 static long get_mte_ctrl(void)
 {
+	unsigned long ret;
+
 	if (!system_supports_mte())
 		return 0;
 
+	ret = current->thread.gcr_excl << PR_MTE_EXCL_SHIFT;
+
 	switch (current->thread.sctlr_tcf0) {
 	case SCTLR_EL1_TCF0_SYNC:
-		return PR_MTE_TCF_SYNC;
+		ret |= PR_MTE_TCF_SYNC;
+		break;
 	case SCTLR_EL1_TCF0_ASYNC:
-		return PR_MTE_TCF_ASYNC;
+		ret |= PR_MTE_TCF_ASYNC;
+		break;
 	}
 
-	return 0;
+	return ret;
 }
 #else
 static long set_mte_ctrl(unsigned long arg)
@@ -684,7 +703,7 @@ long set_tagged_addr_ctrl(unsigned long arg)
 		return -EINVAL;
 
 	if (system_supports_mte())
-		valid_mask |= PR_MTE_TCF_MASK;
+		valid_mask |= PR_MTE_TCF_MASK | PR_MTE_EXCL_MASK;
 
 	if (arg & ~valid_mask)
 		return -EINVAL;
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 5e9323e66a38..749de5ab4f9f 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -239,5 +239,8 @@ struct prctl_mm_map {
 # define PR_MTE_TCF_SYNC		(1UL << PR_MTE_TCF_SHIFT)
 # define PR_MTE_TCF_ASYNC		(2UL << PR_MTE_TCF_SHIFT)
 # define PR_MTE_TCF_MASK		(3UL << PR_MTE_TCF_SHIFT)
+/* MTE tag exclusion mask */
+# define PR_MTE_EXCL_SHIFT		3
+# define PR_MTE_EXCL_MASK		(0xffffUL << PR_MTE_EXCL_SHIFT)
 
 #endif /* _LINUX_PRCTL_H */


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 21/22] arm64: mte: Kconfig entry
  2019-12-11 18:40 [PATCH 00/22] arm64: Memory Tagging Extension user-space support Catalin Marinas
                   ` (19 preceding siblings ...)
  2019-12-11 18:40 ` [PATCH 20/22] arm64: mte: Allow user control of the excluded tags " Catalin Marinas
@ 2019-12-11 18:40 ` Catalin Marinas
  2019-12-11 18:40 ` [PATCH 22/22] arm64: mte: Add Memory Tagging Extension documentation Catalin Marinas
  2019-12-13 18:05 ` [PATCH 00/22] arm64: Memory Tagging Extension user-space support Peter Collingbourne
  22 siblings, 0 replies; 51+ messages in thread
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

Add Memory Tagging Extension support to the arm64 kbuild.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/Kconfig | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b1b4476ddb83..9a119c402bde 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1484,6 +1484,38 @@ config ARM64_PTR_AUTH
 
 endmenu
 
+menu "ARMv8.5 architectural features"
+
+config ARM64_AS_HAS_MTE
+	def_bool $(as-instr,.arch armv8.5-a+memtag)
+
+config ARM64_MTE
+	bool "Memory Tagging Extension support"
+	depends on ARM64_AS_HAS_MTE && ARM64_TAGGED_ADDR_ABI
+	select ARCH_USES_HIGH_VMA_FLAGS
+	select ARCH_NO_SWAP
+	help
+	  Memory Tagging (part of the ARMv8.5 Extensions) provides
+	  architectural support for run-time, always-on detection of
+	  various classes of memory error to aid with software debugging
+	  to eliminate vulnerabilities arising from memory-unsafe
+	  languages.
+
+	  This option enables the support for the Memory Tagging
+	  Extension at EL0 (i.e. for userspace).
+
+	  Selecting this option allows the feature to be detected at
+	  runtime. Any secondary CPU not implementing this feature will
+	  not be allowed a late bring-up.
+
+	  Userspace binaries that want to use this feature must
+	  explicitly opt in. The mechanism for the userspace is
+	  described in:
+
+	  Documentation/arm64/memory-tagging-extension.rst.
+
+endmenu
+
 config ARM64_SVE
 	bool "ARM Scalable Vector Extension support"
 	default y


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 22/22] arm64: mte: Add Memory Tagging Extension documentation
  2019-12-11 18:40 [PATCH 00/22] arm64: Memory Tagging Extension user-space support Catalin Marinas
                   ` (20 preceding siblings ...)
  2019-12-11 18:40 ` [PATCH 21/22] arm64: mte: Kconfig entry Catalin Marinas
@ 2019-12-11 18:40 ` Catalin Marinas
  2019-12-24 15:03   ` Kevin Brodsky
  2019-12-13 18:05 ` [PATCH 00/22] arm64: Memory Tagging Extension user-space support Peter Collingbourne
  22 siblings, 1 reply; 51+ messages in thread
From: Catalin Marinas @ 2019-12-11 18:40 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Kevin Brodsky, Andrey Konovalov, linux-mm,
	linux-arch

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

Memory Tagging Extension (part of the ARMv8.5 Extensions) provides
a mechanism to detect the sources of memory-related errors which
may be vulnerable to exploitation, including bounds violations,
use-after-free, use-after-return, use-out-of-scope and use before
initialization errors.

Add Memory Tagging Extension documentation for the arm64 linux
kernel support.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 Documentation/arm64/cpu-feature-registers.rst |   4 +
 Documentation/arm64/elf_hwcaps.rst            |   4 +
 Documentation/arm64/index.rst                 |   1 +
 .../arm64/memory-tagging-extension.rst        | 229 ++++++++++++++++++
 4 files changed, 238 insertions(+)
 create mode 100644 Documentation/arm64/memory-tagging-extension.rst

diff --git a/Documentation/arm64/cpu-feature-registers.rst b/Documentation/arm64/cpu-feature-registers.rst
index b6e44884e3ad..67305a5f613a 100644
--- a/Documentation/arm64/cpu-feature-registers.rst
+++ b/Documentation/arm64/cpu-feature-registers.rst
@@ -172,8 +172,12 @@ infrastructure:
      +------------------------------+---------+---------+
      | Name                         |  bits   | visible |
      +------------------------------+---------+---------+
+     | MTE                          | [11-8]  |    y    |
+     +------------------------------+---------+---------+
      | SSBS                         | [7-4]   |    y    |
      +------------------------------+---------+---------+
+     | BT                           | [3-0]   |    n    |
+     +------------------------------+---------+---------+
 
 
   4) MIDR_EL1 - Main ID Register
diff --git a/Documentation/arm64/elf_hwcaps.rst b/Documentation/arm64/elf_hwcaps.rst
index 7fa3d215ae6a..0f52d22c28af 100644
--- a/Documentation/arm64/elf_hwcaps.rst
+++ b/Documentation/arm64/elf_hwcaps.rst
@@ -204,6 +204,10 @@ HWCAP2_FRINT
 
     Functionality implied by ID_AA64ISAR1_EL1.FRINTTS == 0b0001.
 
+HWCAP2_MTE
+
+    Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0010.
+    Documentation/arm64/memory-tagging-extension.rst.
 
 4. Unused AT_HWCAP bits
 -----------------------
diff --git a/Documentation/arm64/index.rst b/Documentation/arm64/index.rst
index 5c0c69dc58aa..82970c6d384f 100644
--- a/Documentation/arm64/index.rst
+++ b/Documentation/arm64/index.rst
@@ -13,6 +13,7 @@ ARM64 Architecture
     hugetlbpage
     legacy_instructions
     memory
+    memory-tagging-extension
     pointer-authentication
     silicon-errata
     sve
diff --git a/Documentation/arm64/memory-tagging-extension.rst b/Documentation/arm64/memory-tagging-extension.rst
new file mode 100644
index 000000000000..ae02f0771971
--- /dev/null
+++ b/Documentation/arm64/memory-tagging-extension.rst
@@ -0,0 +1,229 @@
+===============================================
+Memory Tagging Extension (MTE) in AArch64 Linux
+===============================================
+
+Authors: Vincenzo Frascino <vincenzo.frascino@arm.com>
+         Catalin Marinas <catalin.marinas@arm.com>
+
+Date: 2019-11-29
+
+This document describes the provision of the Memory Tagging Extension
+functionality in AArch64 Linux.
+
+Introduction
+============
+
+ARMv8.5 based processors introduce the Memory Tagging Extension (MTE)
+feature. MTE is built on top of the ARMv8.0 virtual address tagging TBI
+(Top Byte Ignore) feature and allows software to access a 4-bit
+allocation tag for each 16-byte granule in the physical address space.
+Such a memory range must be mapped with the Normal-Tagged memory
+attribute. A logical tag is derived from bits 59-56 of the virtual
+address used for the memory access. A CPU with MTE enabled will compare
+the logical tag against the allocation tag and potentially raise an
+exception on a mismatch, subject to system register configuration.
+
+Userspace Support
+=================
+
+Memory Tagging Extension Linux support depends on AArch64 Tagged Address
+ABI being enabled in the kernel. For more details on AArch64 Tagged
+Address ABI refer to Documentation/arm64/tagged-address-abi.rst.
+
+When ``CONFIG_ARM64_MTE`` is selected and Memory Tagging Extension is
+supported by the hardware, the kernel advertises the feature to
+userspace via ``HWCAP2_MTE``.
+
+PROT_MTE
+--------
+
+To access the allocation tags, a user process must enable the Tagged
+memory attribute on an address range using a new ``prot`` flag for
+``mmap()`` and ``mprotect()``:
+
+``PROT_MTE`` - Pages allow access to the MTE allocation tags.
+
+The allocation tag is set to 0 when such pages are first mapped in the
+user address space and preserved on copy-on-write. ``MAP_SHARED`` is
+supported and the allocation tags can be shared between processes.
+
+**Note**: ``PROT_MTE`` is only supported on ``MAP_ANONYMOUS`` and
+RAM-based file mappings (``tmpfs``, ``memfd``). Passing it to other
+types of mapping will result in ``-EINVAL`` returned by these system
+calls.
+
+**Note**: The ``PROT_MTE`` flag (and corresponding memory type) cannot
+be cleared by ``mprotect()``. If this is desirable, ``munmap()``
+(followed by ``mmap()``) must be used.
+
+Tag Check Faults
+----------------
+
+When ``PROT_MTE`` is enabled on an address range and a mismatch between
+the logical and allocation tags occurs on access, there are three
+configurable behaviours:
+
+- *Ignore* - This is the default mode. The CPU (and kernel) ignores the
+  tag check fault.
+
+- *Synchronous* - The kernel raises a ``SIGSEGV`` synchronously, with
+  ``.si_code = SEGV_MTESERR`` and ``.si_addr = <fault-address>``. The
+  memory access is not performed.
+
+- *Asynchronous* - The kernel raises a ``SIGSEGV``, in the current
+  thread, asynchronously following one or multiple tag check faults,
+  with ``.si_code = SEGV_MTEAERR`` and ``.si_addr = 0``.
+
+**Note**: There are no *match-all* logical tags available for user
+applications.
+
+The user can select the above modes, per thread, using the
+``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where
+``flags`` contains one of the following values in the ``PR_MTE_TCF_MASK``
+bit-field:
+
+- ``PR_MTE_TCF_NONE``  - *Ignore* tag check faults
+- ``PR_MTE_TCF_SYNC``  - *Synchronous* tag check fault mode
+- ``PR_MTE_TCF_ASYNC`` - *Asynchronous* tag check fault mode
+
+Tag checking can also be disabled for a user thread by setting the
+``PSTATE.TCO`` bit with ``MSR TCO, #1``.
+
+**Note**: Signal handlers are always invoked with ``PSTATE.TCO = 0``,
+irrespective of the interrupted context.
+
+**Note**: Kernel accesses to user memory (e.g. ``read()`` system call)
+do not generate a tag check fault.
+
+Excluding Tags in the ``IRG``, ``ADDG`` and ``SUBG`` instructions
+-----------------------------------------------------------------
+
+The architecture allows certain tags to be excluded from the randomly
+generated set via the ``GCR_EL1.Exclude`` bit-field. This can be configured,
+per thread, using the ``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)``
+system call where ``flags`` contains the exclusion bitmap in the
+``PR_MTE_EXCL_MASK`` bit-field.
+
+Example of correct usage
+========================
+
+*MTE Example code*
+
+.. code-block:: c
+
+    /*
+     * To be compiled with -march=armv8.5-a+memtag
+     */
+    #include <errno.h>
+    #include <stdio.h>
+    #include <stdlib.h>
+    #include <unistd.h>
+    #include <sys/auxv.h>
+    #include <sys/mman.h>
+    #include <sys/prctl.h>
+
+    /*
+     * From arch/arm64/include/uapi/asm/hwcap.h
+     */
+    #define HWCAP2_MTE              (1 << 10)
+
+    /*
+     * From arch/arm64/include/uapi/asm/mman.h
+     */
+    #define PROT_MTE                 0x20
+
+    /*
+     * From include/uapi/linux/prctl.h
+     */
+    #define PR_SET_TAGGED_ADDR_CTRL 55
+    #define PR_GET_TAGGED_ADDR_CTRL 56
+    # define PR_TAGGED_ADDR_ENABLE  (1UL << 0)
+    # define PR_MTE_TCF_SHIFT       1
+    # define PR_MTE_TCF_NONE        (0UL << PR_MTE_TCF_SHIFT)
+    # define PR_MTE_TCF_SYNC        (1UL << PR_MTE_TCF_SHIFT)
+    # define PR_MTE_TCF_ASYNC       (2UL << PR_MTE_TCF_SHIFT)
+    # define PR_MTE_TCF_MASK        (3UL << PR_MTE_TCF_SHIFT)
+    # define PR_MTE_EXCL_SHIFT      3
+    # define PR_MTE_EXCL_MASK       (0xffffUL << PR_MTE_EXCL_SHIFT)
+
+    /*
+     * Insert a random logical tag into the given pointer.
+     */
+    #define insert_random_tag(ptr) ({                       \
+            __u64 __val;                                    \
+            asm("irg %0, %1" : "=r" (__val) : "r" (ptr));   \
+            __val;                                          \
+    })
+
+    /*
+     * Set the allocation tag on the destination address.
+     */
+    #define set_tag(tag, addr) do {                                 \
+            asm volatile("stg %0, [%1]" : : "r" (tag), "r" (addr)); \
+    } while (0)
+
+    int main()
+    {
+            unsigned long *a;
+            unsigned long page_sz = getpagesize();
+            unsigned long hwcap2 = getauxval(AT_HWCAP2);
+
+            /* check if MTE is present */
+            if (!(hwcap2 & HWCAP2_MTE))
+                    return -1;
+
+            /*
+             * Enable the tagged address ABI, synchronous MTE tag check faults and
+             * exclude tag 0 from the randomly generated set.
+             */
+            if (prctl(PR_SET_TAGGED_ADDR_CTRL,
+                      PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC | (1 << PR_MTE_EXCL_SHIFT),
+                      0, 0, 0)) {
+                    perror("prctl() failed");
+                    return -1;
+            }
+
+            a = mmap(0, page_sz, PROT_READ | PROT_WRITE,
+                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+            if (a == MAP_FAILED) {
+                    perror("mmap() failed");
+                    return -1;
+            }
+
+            /*
+             * Enable MTE on the above anonymous mmap. The flag could be passed
+             * directly to mmap() and skip this step.
+             */
+            if (mprotect(a, page_sz, PROT_READ | PROT_WRITE | PROT_MTE)) {
+                    perror("mprotect() failed");
+                    return -1;
+            }
+
+            /* access with the default tag (0) */
+            a[0] = 1;
+            a[1] = 2;
+
+            printf("a[0] = %lu a[1] = %lu\n", a[0], a[1]);
+
+            /* set the logical and allocation tags */
+            a = (unsigned long *)insert_random_tag(a);
+            set_tag(a, a);
+
+            printf("%p\n", a);
+
+            /* non-zero tag access */
+            a[0] = 3;
+            printf("a[0] = %lu a[1] = %lu\n", a[0], a[1]);
+
+            /*
+             * If MTE is enabled correctly the next instruction will generate an
+             * exception.
+             */
+            printf("Expecting SIGSEGV...\n");
+            a[2] = 0xdead;
+
+            /* this should not be printed in the PR_MTE_TCF_SYNC mode */
+            printf("...done\n");
+
+            return 0;
+    }


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 01/22] mm: Reserve asm-generic prot flags 0x10 and 0x20 for arch use
  2019-12-11 18:40 ` [PATCH 01/22] mm: Reserve asm-generic prot flags 0x10 and 0x20 for arch use Catalin Marinas
@ 2019-12-11 19:26   ` Arnd Bergmann
  0 siblings, 0 replies; 51+ messages in thread
From: Arnd Bergmann @ 2019-12-11 19:26 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linux ARM, Will Deacon, Marc Zyngier, Vincenzo Frascino,
	Szabolcs Nagy, Richard Earnshaw, Kevin Brodsky, Andrey Konovalov,
	Linux-MM, linux-arch, Dave Martin

On Wed, Dec 11, 2019 at 7:40 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> From: Dave Martin <Dave.Martin@arm.com>
>
> The asm-generic/mman.h definitions are used by a few architectures that
> also define arch-specific PROT flags with value 0x10 and 0x20. This
> currently applies to sparc and powerpc for 0x10, while arm64 will soon
> join with 0x10 and 0x20.
>
> To help future maintainers, document the use of this flag in the
> asm-generic header too.
>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> [catalin.marinas@arm.com: reserve 0x20 as well]
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Acked-by: Arnd Bergmann <arnd@arndb.de>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 12/22] arm64: mte: Add specific SIGSEGV codes
  2019-12-11 18:40 ` [PATCH 12/22] arm64: mte: Add specific SIGSEGV codes Catalin Marinas
@ 2019-12-11 19:31   ` Arnd Bergmann
  2019-12-12  9:34     ` Catalin Marinas
  2019-12-12 18:26     ` Eric W. Biederman
  0 siblings, 2 replies; 51+ messages in thread
From: Arnd Bergmann @ 2019-12-11 19:31 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linux ARM, Will Deacon, Marc Zyngier, Vincenzo Frascino,
	Szabolcs Nagy, Richard Earnshaw, Kevin Brodsky, Andrey Konovalov,
	Linux-MM, linux-arch, Eric W. Biederman, Al Viro

On Wed, Dec 11, 2019 at 7:40 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> From: Vincenzo Frascino <vincenzo.frascino@arm.com>
>
> Add MTE-specific SIGSEGV codes to siginfo.h.
>
> Note that for MTE we are reusing the same SPARC ADI codes because
> the two functionalities are similar and they cannot coexist on the same
> system.
>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
> [catalin.marinas@arm.com: renamed precise/imprecise to sync/async]
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  include/uapi/asm-generic/siginfo.h | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
> index cb3d6c267181..a5184a5438c6 100644
> --- a/include/uapi/asm-generic/siginfo.h
> +++ b/include/uapi/asm-generic/siginfo.h
> @@ -227,8 +227,13 @@ typedef struct siginfo {
>  # define SEGV_PKUERR   4       /* failed protection key checks */
>  #endif
>  #define SEGV_ACCADI    5       /* ADI not enabled for mapped object */
> -#define SEGV_ADIDERR   6       /* Disrupting MCD error */
> -#define SEGV_ADIPERR   7       /* Precise MCD exception */
> +#ifdef __aarch64__
> +# define SEGV_MTEAERR  6       /* Asynchronous MTE error */
> +# define SEGV_MTESERR  7       /* Synchronous MTE exception */
> +#else
> +# define SEGV_ADIDERR  6       /* Disrupting MCD error */
> +# define SEGV_ADIPERR  7       /* Precise MCD exception */
> +#endif

SEGV_ADIPERR/SEGV_ADIDERR were added together with SEGV_ACCADI,
it seems a bit odd to make only two of them conditional but not the others.

I think we are generally working towards having the same constants
across architectures even for features that only exist on one of them.

Adding Al and Eric to Cc, maybe they have another suggestion on what
constants should be used.

     Arnd


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/22] kbuild: Add support for 'as-instr' to be used in Kconfig files
  2019-12-11 18:40 ` [PATCH 02/22] kbuild: Add support for 'as-instr' to be used in Kconfig files Catalin Marinas
@ 2019-12-12  5:03   ` Masahiro Yamada
  0 siblings, 0 replies; 51+ messages in thread
From: Masahiro Yamada @ 2019-12-12  5:03 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, Will Deacon, Marc Zyngier, Vincenzo Frascino,
	Szabolcs Nagy, Richard Earnshaw, Kevin Brodsky, Andrey Konovalov,
	linux-mm, linux-arch, Linux Kbuild mailing list, Vladimir Murzin

On Thu, Dec 12, 2019 at 3:40 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> Similar to 'cc-option' or 'ld-option', it is occasionally necessary to
> check whether the assembler supports certain ISA extensions. In the
> arm64 code we currently do this in Makefile with an additional define:
>
> lseinstr := $(call as-instr,.arch_extension lse,-DCONFIG_AS_LSE=1)
>
> Add the 'as-instr' option so that it can be used in Kconfig directly:
>
>         def_bool $(as-instr,.arch_extension lse)
>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: linux-kbuild@vger.kernel.org
> Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---

Please feel free to apply this to the arm64 tree.
Acked-by: Masahiro Yamada <masahiroy@kernel.org>

>  scripts/Kconfig.include | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/scripts/Kconfig.include b/scripts/Kconfig.include
> index d4adfbe42690..9d07e59cbdf7 100644
> --- a/scripts/Kconfig.include
> +++ b/scripts/Kconfig.include
> @@ -31,6 +31,10 @@ cc-option = $(success,$(CC) -Werror $(CLANG_FLAGS) $(1) -E -x c /dev/null -o /de
>  # Return y if the linker supports <flag>, n otherwise
>  ld-option = $(success,$(LD) -v $(1))
>
> +# $(as-instr,<instr>)
> +# Return y if the assembler supports <instr>, n otherwise
> +as-instr = $(success,printf "%b\n" "$(1)" | $(CC) $(CLANG_FLAGS) -c -x assembler -o /dev/null -)
> +
>  # check if $(CC) and $(LD) exist
>  $(error-if,$(failure,command -v $(CC)),compiler '$(CC)' not found)
>  $(error-if,$(failure,command -v $(LD)),linker '$(LD)' not found)



-- 
Best Regards
Masahiro Yamada


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 12/22] arm64: mte: Add specific SIGSEGV codes
  2019-12-11 19:31   ` Arnd Bergmann
@ 2019-12-12  9:34     ` Catalin Marinas
  2019-12-12 18:26     ` Eric W. Biederman
  1 sibling, 0 replies; 51+ messages in thread
From: Catalin Marinas @ 2019-12-12  9:34 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linux ARM, Will Deacon, Marc Zyngier, Vincenzo Frascino,
	Szabolcs Nagy, Richard Earnshaw, Kevin Brodsky, Andrey Konovalov,
	Linux-MM, linux-arch, Eric W. Biederman, Al Viro

On Wed, Dec 11, 2019 at 08:31:28PM +0100, Arnd Bergmann wrote:
> On Wed, Dec 11, 2019 at 7:40 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> >
> > From: Vincenzo Frascino <vincenzo.frascino@arm.com>
> >
> > Add MTE-specific SIGSEGV codes to siginfo.h.
> >
> > Note that for MTE we are reusing the same SPARC ADI codes because
> > the two functionalities are similar and they cannot coexist on the same
> > system.
> >
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
> > [catalin.marinas@arm.com: renamed precise/imprecise to sync/async]
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > ---
> >  include/uapi/asm-generic/siginfo.h | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
> > index cb3d6c267181..a5184a5438c6 100644
> > --- a/include/uapi/asm-generic/siginfo.h
> > +++ b/include/uapi/asm-generic/siginfo.h
> > @@ -227,8 +227,13 @@ typedef struct siginfo {
> >  # define SEGV_PKUERR   4       /* failed protection key checks */
> >  #endif
> >  #define SEGV_ACCADI    5       /* ADI not enabled for mapped object */
> > -#define SEGV_ADIDERR   6       /* Disrupting MCD error */
> > -#define SEGV_ADIPERR   7       /* Precise MCD exception */
> > +#ifdef __aarch64__
> > +# define SEGV_MTEAERR  6       /* Asynchronous MTE error */
> > +# define SEGV_MTESERR  7       /* Synchronous MTE exception */
> > +#else
> > +# define SEGV_ADIDERR  6       /* Disrupting MCD error */
> > +# define SEGV_ADIPERR  7       /* Precise MCD exception */
> > +#endif
> 
> SEGV_ADIPERR/SEGV_ADIDERR were added together with SEGV_ACCADI;
> it seems a bit odd to make only two of them conditional but not the others.

Ah, I missed this. I think we should drop the #ifdef entirely. There is
no harm in having two different macros with the same value.
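
Something like this, i.e. a sketch of the unconditional variant keeping
the values from this patch:

#define SEGV_ACCADI    5       /* ADI not enabled for mapped object */
#define SEGV_ADIDERR   6       /* Disrupting MCD error */
#define SEGV_MTEAERR   6       /* Asynchronous MTE error */
#define SEGV_ADIPERR   7       /* Precise MCD exception */
#define SEGV_MTESERR   7       /* Synchronous MTE exception */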

> I think we are generally working towards having the same constants
> across architectures even for features that only exist on one of them.

I'd rather keep both the ARM and SPARC naming here as the behaviour may
be subtly different between the two architectures. IIUC, the disrupting
SPARC MCD error means a memory corruption trap sent to the hypervisor.
On ARM MTE, the asynchronous tag check fault is pretty much a benign
setting of a status flag. The kernel, when detecting this flag, injects
a SIGSEGV on the ret_to_user path. If there's no switch into the
kernel, a user program cannot become aware of the asynchronous MTE tag
check fault.

We also don't have the equivalent of ACCADI.
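
For illustration, a user program would tell the two apart roughly like
this (a minimal sketch, assuming the SEGV_MTE*ERR values from this
series and using only async-signal-safe calls in the handler):

#include <signal.h>
#include <string.h>
#include <unistd.h>

#ifndef SEGV_MTEAERR
# define SEGV_MTEAERR	6	/* values from this series */
# define SEGV_MTESERR	7
#endif

static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
	const char *msg = (info->si_code == SEGV_MTESERR) ?
		"sync tag fault, si_addr is the faulting address\n" :
		"async tag fault, imprecise, si_addr not useful\n";
	write(STDERR_FILENO, msg, strlen(msg));
	_exit(1);
}

int main(void)
{
	struct sigaction sa = {
		.sa_sigaction = segv_handler,
		.sa_flags = SA_SIGINFO,
	};
	sigaction(SIGSEGV, &sa, NULL);
	/* ... PROT_MTE mappings and prctl() setup elided ... */
	return 0;
}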

-- 
Catalin


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 12/22] arm64: mte: Add specific SIGSEGV codes
  2019-12-11 19:31   ` Arnd Bergmann
  2019-12-12  9:34     ` Catalin Marinas
@ 2019-12-12 18:26     ` Eric W. Biederman
  2019-12-17 17:48       ` Catalin Marinas
  1 sibling, 1 reply; 51+ messages in thread
From: Eric W. Biederman @ 2019-12-12 18:26 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Catalin Marinas, Linux ARM, Will Deacon, Marc Zyngier,
	Vincenzo Frascino, Szabolcs Nagy, Richard Earnshaw,
	Kevin Brodsky, Andrey Konovalov, Linux-MM, linux-arch, Al Viro

Arnd Bergmann <arnd@arndb.de> writes:

> On Wed, Dec 11, 2019 at 7:40 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>>
>> From: Vincenzo Frascino <vincenzo.frascino@arm.com>
>>
>> Add MTE-specific SIGSEGV codes to siginfo.h.
>>
>> Note that for MTE we are reusing the same SPARC ADI codes
>> the two functionalities are similar and they cannot coexist on the same
>> system.

Please Please Please don't do that.

It is actively harmful to have architecture specific si_code values.
As it makes maintenance much more difficult.

Especially as the si_codes are part of a union discriminator.

If your functionality is identical, reuse the numbers; otherwise please
just select the next numbers not yet used.

We have at least 256 si_codes per signal (2**32 if we really need
them), so there is no need to reuse numbers.
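
Concretely, something like this (a sketch only; the final numbers are
yours to pick):

#define SEGV_ACCADI    5       /* ADI not enabled for mapped object */
#define SEGV_ADIDERR   6       /* Disrupting MCD error */
#define SEGV_ADIPERR   7       /* Precise MCD exception */
#define SEGV_MTEAERR   8       /* Asynchronous ARM MTE error */
#define SEGV_MTESERR   9       /* Synchronous ARM MTE exception */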

The practical problem is that architecture specific si_codes start
turning kernel/signal.c into #ifdef soup, and we lose a lot of
basic compile coverage because of that.  In turn not compiling the code
leads to bit-rot in all kinds of weird places.



Now, as far as the observation that this is almost the same as other
functionality: why can't this fit the existing interface exposed to
userspace?  Sometimes there are good reasons, but technology gets a lot
more uptake and testing when the same interfaces are more widely
available.

Eric

p.s. As for coexistence there is always the possibility that one chip
in a cpu family does support one thing and another chip in a cpu
family supports another.  So userspace may have to cope with the
situation even if an individual chip doesn't.

I remember a similar case where sparc had several distinct page table
formats and we had a single kernel that had to cope with them all.


>> Cc: Arnd Bergmann <arnd@arndb.de>
>> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
>> [catalin.marinas@arm.com: renamed precise/imprecise to sync/async]
>> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
>> ---
>>  include/uapi/asm-generic/siginfo.h | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
>> index cb3d6c267181..a5184a5438c6 100644
>> --- a/include/uapi/asm-generic/siginfo.h
>> +++ b/include/uapi/asm-generic/siginfo.h
>> @@ -227,8 +227,13 @@ typedef struct siginfo {
>>  # define SEGV_PKUERR   4       /* failed protection key checks */
>>  #endif
>>  #define SEGV_ACCADI    5       /* ADI not enabled for mapped object */
>> -#define SEGV_ADIDERR   6       /* Disrupting MCD error */
>> -#define SEGV_ADIPERR   7       /* Precise MCD exception */
>> +#ifdef __aarch64__
>> +# define SEGV_MTEAERR  6       /* Asynchronous MTE error */
>> +# define SEGV_MTESERR  7       /* Synchronous MTE exception */
>> +#else
>> +# define SEGV_ADIDERR  6       /* Disrupting MCD error */
>> +# define SEGV_ADIPERR  7       /* Precise MCD exception */
>> +#endif
>
> SEGV_ADIPERR/SEGV_ADIDERR were added together with SEGV_ACCADI;
> it seems a bit odd to make only two of them conditional but not the others.
>
> I think we are generally working towards having the same constants
> across architectures even for features that only exist on one of them.
>
> Adding Al and Eric to Cc, maybe they have another suggestion on what
> constants should be used.
>
>      Arnd


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 00/22] arm64: Memory Tagging Extension user-space support
  2019-12-11 18:40 [PATCH 00/22] arm64: Memory Tagging Extension user-space support Catalin Marinas
                   ` (21 preceding siblings ...)
  2019-12-11 18:40 ` [PATCH 22/22] arm64: mte: Add Memory Tagging Extension documentation Catalin Marinas
@ 2019-12-13 18:05 ` Peter Collingbourne
  2020-02-13 11:23   ` Catalin Marinas
  22 siblings, 1 reply; 51+ messages in thread
From: Peter Collingbourne @ 2019-12-13 18:05 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linux ARM, linux-arch, Richard Earnshaw, Szabolcs Nagy,
	Marc Zyngier, Kevin Brodsky, linux-mm, Andrey Konovalov,
	Vincenzo Frascino, Will Deacon, Evgenii Stepanov,
	Kostya Kortchinsky, Kostya Serebryany

On Wed, Dec 11, 2019 at 10:40 AM Catalin Marinas
<catalin.marinas@arm.com> wrote:
> Hi,
>
> This series proposes the initial user-space support for the ARMv8.5
> Memory Tagging Extension [1].

Thanks for sending out this series. I have been testing it on Android
with the FVP model and my in-development scudo changes that add memory
tagging support [1], and have not noticed any problems so far.

> - Clarify whether mmap(tagged_addr, PROT_MTE) pre-tags the memory with
>   the tag given in the tagged_addr hint. Strong justification is
>   required for this as it would force arm64 to disable the zero page.

We would like to use this feature in scudo to tag large (>128KB on
Android) allocations, which are currently allocated via mmap rather
than from an allocation pool. Otherwise we would need to pay the cost
(perf and RSS) of faulting all of their pages at allocation time
instead of on demand, if we want to tag them.

If we could disable the zero page for tagged mappings only and let the
pages be faulted as they are read, that would work for us.
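
For concreteness, the usage we have in mind is roughly this (a sketch;
the tag-in-hint pre-tagging semantics are exactly the open question
above, not an existing ABI, and the PROT_MTE value is assumed from this
series):

#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef PROT_MTE
# define PROT_MTE 0x20	/* arm64 value assumed from this series */
#endif

/* Map a large allocation with every granule pre-tagged with 'tag',
 * without paying the cost of faulting all the pages up front. */
static void *map_tagged(size_t size, uint8_t tag)
{
	void *hint = (void *)((uintptr_t)tag << 56);
	return mmap(hint, size, PROT_READ | PROT_WRITE | PROT_MTE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}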

Peter

[1] https://reviews.llvm.org/D70762


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 13/22] arm64: mte: Handle synchronous and asynchronous tag check faults
  2019-12-11 18:40 ` [PATCH 13/22] arm64: mte: Handle synchronous and asynchronous tag check faults Catalin Marinas
@ 2019-12-14  1:43   ` Peter Collingbourne
  2019-12-17 18:01     ` Catalin Marinas
  0 siblings, 1 reply; 51+ messages in thread
From: Peter Collingbourne @ 2019-12-14  1:43 UTC (permalink / raw)
  To: Catalin Marinas, Evgenii Stepanov, Kostya Serebryany
  Cc: Linux ARM, linux-arch, Richard Earnshaw, Szabolcs Nagy,
	Marc Zyngier, Kevin Brodsky, linux-mm, Andrey Konovalov,
	Vincenzo Frascino, Will Deacon

On Wed, Dec 11, 2019 at 10:44 AM Catalin Marinas
<catalin.marinas@arm.com> wrote:
>
> From: Vincenzo Frascino <vincenzo.frascino@arm.com>
>
> The Memory Tagging Extension has two modes of notifying a tag check
> fault at EL0, configurable through the SCTLR_EL1.TCF0 field:
>
> 1. Synchronous raising of a Data Abort exception with DFSC 17.
> 2. Asynchronous setting of a cumulative bit in TFSRE0_EL1.
>
> Add the exception handler for the synchronous exception and handling of
> the asynchronous TFSRE0_EL1.TF0 bit setting via a new TIF flag in
> do_notify_resume().
>
> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
> Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/include/asm/thread_info.h |  4 +++-
>  arch/arm64/kernel/entry.S            | 17 +++++++++++++++++
>  arch/arm64/kernel/process.c          |  7 +++++++
>  arch/arm64/kernel/signal.c           |  8 ++++++++
>  arch/arm64/mm/fault.c                |  9 ++++++++-
>  5 files changed, 43 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
> index f0cec4160136..f759a0215a71 100644
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@ -63,6 +63,7 @@ void arch_release_task_struct(struct task_struct *tsk);
>  #define TIF_FOREIGN_FPSTATE    3       /* CPU's FP state is not current's */
>  #define TIF_UPROBE             4       /* uprobe breakpoint or singlestep */
>  #define TIF_FSCHECK            5       /* Check FS is USER_DS on return */
> +#define TIF_MTE_ASYNC_FAULT    6       /* MTE Asynchronous Tag Check Fault */
>  #define TIF_NOHZ               7
>  #define TIF_SYSCALL_TRACE      8       /* syscall trace active */
>  #define TIF_SYSCALL_AUDIT      9       /* syscall auditing */
> @@ -93,10 +94,11 @@ void arch_release_task_struct(struct task_struct *tsk);
>  #define _TIF_FSCHECK           (1 << TIF_FSCHECK)
>  #define _TIF_32BIT             (1 << TIF_32BIT)
>  #define _TIF_SVE               (1 << TIF_SVE)
> +#define _TIF_MTE_ASYNC_FAULT   (1 << TIF_MTE_ASYNC_FAULT)
>
>  #define _TIF_WORK_MASK         (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
>                                  _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
> -                                _TIF_UPROBE | _TIF_FSCHECK)
> +                                _TIF_UPROBE | _TIF_FSCHECK | _TIF_MTE_ASYNC_FAULT)
>
>  #define _TIF_SYSCALL_WORK      (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
>                                  _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 7c6a0a41676f..c221a539e61d 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -144,6 +144,22 @@ alternative_cb_end
>  #endif
>         .endm
>
> +       // Check for MTE asynchronous tag check faults
> +       .macro check_mte_async_tcf, flgs, tmp
> +#ifdef CONFIG_ARM64_MTE
> +alternative_if_not ARM64_MTE
> +       b       1f
> +alternative_else_nop_endif
> +       mrs_s   \tmp, SYS_TFSRE0_EL1
> +       tbz     \tmp, #SYS_TFSR_EL1_TF0_SHIFT, 1f
> +       // Asynchronous TCF occurred at EL0, set the TI flag
> +       orr     \flgs, \flgs, #_TIF_MTE_ASYNC_FAULT
> +       str     \flgs, [tsk, #TSK_TI_FLAGS]
> +       msr_s   SYS_TFSRE0_EL1, xzr
> +1:
> +#endif
> +       .endm
> +
>         .macro  kernel_entry, el, regsize = 64
>         .if     \regsize == 32
>         mov     w0, w0                          // zero upper 32 bits of x0
> @@ -171,6 +187,7 @@ alternative_cb_end
>         ldr     x19, [tsk, #TSK_TI_FLAGS]       // since we can unmask debug
>         disable_step_tsk x19, x20               // exceptions when scheduling.
>
> +       check_mte_async_tcf x19, x22
>         apply_ssbd 1, x22, x23
>
>         .else
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 71f788cd2b18..dd98d539894e 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -317,12 +317,19 @@ static void flush_tagged_addr_state(void)
>                 clear_thread_flag(TIF_TAGGED_ADDR);
>  }
>
> +static void flush_mte_state(void)
> +{
> +       if (system_supports_mte())
> +               clear_thread_flag(TIF_MTE_ASYNC_FAULT);
> +}
> +
>  void flush_thread(void)
>  {
>         fpsimd_flush_thread();
>         tls_thread_flush();
>         flush_ptrace_hw_breakpoint(current);
>         flush_tagged_addr_state();
> +       flush_mte_state();
>  }
>
>  void release_thread(struct task_struct *dead_task)
> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index dd2cdc0d5be2..41fae64af82a 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -730,6 +730,9 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
>         regs->regs[29] = (unsigned long)&user->next_frame->fp;
>         regs->pc = (unsigned long)ka->sa.sa_handler;
>
> +       /* TCO (Tag Check Override) always cleared for signal handlers */
> +       regs->pstate &= ~PSR_TCO_BIT;
> +
>         if (ka->sa.sa_flags & SA_RESTORER)
>                 sigtramp = ka->sa.sa_restorer;
>         else
> @@ -921,6 +924,11 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
>                         if (thread_flags & _TIF_UPROBE)
>                                 uprobe_notify_resume(regs);
>
> +                       if (thread_flags & _TIF_MTE_ASYNC_FAULT) {
> +                               clear_thread_flag(TIF_MTE_ASYNC_FAULT);
> +                               force_signal_inject(SIGSEGV, SEGV_MTEAERR, 0);

In the case where the kernel is entered due to a syscall, this will
inject a signal, but only after servicing the syscall. This means
that, for example, if the syscall is exit(), the async tag check
failure will be silently ignored. I can reproduce the problem with the
program below:

.arch_extension mte

.globl _start
_start:
mov x0, #0x37 // PR_SET_TAGGED_ADDR_CTRL
mov x1, #0xd // PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_ASYNC | (1 << PR_MTE_EXCL_SHIFT)
mov x2, #0
mov x3, #0
mov x4, #0
mov x8, #0xa7 // prctl
svc #0

mov x0, xzr
mov w1, #0x1000
mov w2, #0x23 // PROT_READ|PROT_WRITE|PROT_MTE
mov w3, #0x22 // MAP_PRIVATE|MAP_ANONYMOUS
mov w4, #0xffffffff
mov x5, xzr
mov x8, #0xde // mmap
svc #0

orr x0, x0, #(1 << 56)
str x0, [x0] // <- tag check fail here

// mov x0, #0
// mov x8, #0x17 // dup
// svc #0

mov x0, #0
mov x8, #0x5d // exit
svc #0

If I run this program, it terminates successfully (i.e. the exit
syscall succeeds). And if I uncomment the dup() syscall and run the
program under strace, I see that the program dies with SIGSEGV, but
not before servicing the dup().

This patch fixes the problem for me:

diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 9a9d98a443fc..d0c8918dee00 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -94,6 +94,8 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
                           const syscall_fn_t syscall_table[])
 {
        unsigned long flags = current_thread_info()->flags;
+       if (flags & _TIF_MTE_ASYNC_FAULT)
+               return;

        regs->orig_x0 = regs->regs[0];
        regs->syscallno = scno;

I am not sure whether this is the correct fix, though.

Peter

> +                       }
> +
>                         if (thread_flags & _TIF_SIGPENDING)
>                                 do_signal(regs);
>
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 077b02a2d4d3..ef3bfa2bf2b1 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -660,6 +660,13 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
>         return 0;
>  }
>
> +static int do_tag_check_fault(unsigned long addr, unsigned int esr,
> +                             struct pt_regs *regs)
> +{
> +       do_bad_area(addr, esr, regs);
> +       return 0;
> +}
> +
>  static const struct fault_info fault_info[] = {
>         { do_bad,               SIGKILL, SI_KERNEL,     "ttbr address size fault"       },
>         { do_bad,               SIGKILL, SI_KERNEL,     "level 1 address size fault"    },
> @@ -678,7 +685,7 @@ static const struct fault_info fault_info[] = {
>         { do_page_fault,        SIGSEGV, SEGV_ACCERR,   "level 2 permission fault"      },
>         { do_page_fault,        SIGSEGV, SEGV_ACCERR,   "level 3 permission fault"      },
>         { do_sea,               SIGBUS,  BUS_OBJERR,    "synchronous external abort"    },
> -       { do_bad,               SIGKILL, SI_KERNEL,     "unknown 17"                    },
> +       { do_tag_check_fault,   SIGSEGV, SEGV_MTESERR,  "synchronous tag check fault"   },
>         { do_bad,               SIGKILL, SI_KERNEL,     "unknown 18"                    },
>         { do_bad,               SIGKILL, SI_KERNEL,     "unknown 19"                    },
>         { do_sea,               SIGKILL, SI_KERNEL,     "level 0 (translation table walk)"      },
>


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 20/22] arm64: mte: Allow user control of the excluded tags via prctl()
  2019-12-11 18:40 ` [PATCH 20/22] arm64: mte: Allow user control of the excluded tags " Catalin Marinas
@ 2019-12-16 14:20   ` Kevin Brodsky
  2019-12-16 17:30     ` Peter Collingbourne
  0 siblings, 1 reply; 51+ messages in thread
From: Kevin Brodsky @ 2019-12-16 14:20 UTC (permalink / raw)
  To: Catalin Marinas, linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Andrey Konovalov, linux-mm, linux-arch,
	Branislav Rankov, Peter Collingbourne

+Branislav, Peter

In this patch, the default exclusion mask remains 0 (i.e. all tags can be generated). 
After some more discussions, Branislav and I think that it would be better to start 
with the reverse, i.e. all tags but 0 excluded (mask = 0xfe or 0xff).

This should simplify the MTE setup in the early C runtime quite a bit. Indeed, if all 
tags can be generated, doing any heap or stack tagging before the 
PR_SET_TAGGED_ADDR_CTRL prctl() is issued can cause problems, notably because tagged 
addresses could end up being passed to syscalls. Conversely, if IRG and ADDG never 
set the top byte by default, then tagging operations should be no-ops until the 
prctl() is issued. This would be particularly useful given that it may not be 
straightforward for the C runtime to issue the prctl() before doing anything else.

Additionally, since the default tag checking mode is PR_MTE_TCF_NONE, it would make 
perfect sense not to generate tags by default.
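
To make that concrete: with such a default, the early C runtime would only need one 
prctl() before any tagging becomes visible (a sketch using this series' constants; 
the all-excluded default is the proposal here, not what the patch currently implements):

#include <sys/prctl.h>

/* Opt in to the tagged address ABI, pick a tag check mode and re-allow
 * all tags; until this call, IRG/ADDG would only ever produce tag 0
 * under the proposed default. */
static int mte_enable(void)
{
	return prctl(PR_SET_TAGGED_ADDR_CTRL,
		     PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC |
		     (0UL << PR_MTE_EXCL_SHIFT),	/* exclude nothing */
		     0, 0, 0);
}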

Any thoughts?

Thanks,
Kevin

On 11/12/2019 18:40, Catalin Marinas wrote:
> The IRG, ADDG and SUBG instructions insert a random tag in the resulting
> address. Certain tags can be excluded via the GCR_EL1.Exclude bitmap
> when, for example, the user wants a certain colour for freed buffers.
> Since the GCR_EL1 register is not accessible at EL0, extend the
> prctl(PR_SET_TAGGED_ADDR_CTRL) interface to include a 16-bit field in
> the first argument for controlling the excluded tags. This setting is
> per-thread.
>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>   arch/arm64/include/asm/processor.h |  1 +
>   arch/arm64/include/asm/sysreg.h    |  7 +++++++
>   arch/arm64/kernel/process.c        | 27 +++++++++++++++++++++++----
>   include/uapi/linux/prctl.h         |  3 +++
>   4 files changed, 34 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 91aa270afc7d..5b6988035334 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -150,6 +150,7 @@ struct thread_struct {
>   #endif
>   #ifdef CONFIG_ARM64_MTE
>   	u64			sctlr_tcf0;
> +	u64			gcr_excl;
>   #endif
>   };
>   
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 9e5753272f4b..b6bb6d31f1cd 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -901,6 +901,13 @@
>   		write_sysreg(__scs_new, sysreg);			\
>   } while (0)
>   
> +#define sysreg_clear_set_s(sysreg, clear, set) do {			\
> +	u64 __scs_val = read_sysreg_s(sysreg);				\
> +	u64 __scs_new = (__scs_val & ~(u64)(clear)) | (set);		\
> +	if (__scs_new != __scs_val)					\
> +		write_sysreg_s(__scs_new, sysreg);			\
> +} while (0)
> +
>   #endif
>   
>   #endif	/* __ASM_SYSREG_H */
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 47ce98f47253..5ec6889795fc 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -502,6 +502,15 @@ static void update_sctlr_el1_tcf0(u64 tcf0)
>   	sysreg_clear_set(sctlr_el1, SCTLR_EL1_TCF0_MASK, tcf0);
>   }
>   
> +static void update_gcr_el1_excl(u64 excl)
> +{
> +	/*
> +	 * No need for ISB since this only affects EL0 currently, implicit
> +	 * with ERET.
> +	 */
> +	sysreg_clear_set_s(SYS_GCR_EL1, SYS_GCR_EL1_EXCL_MASK, excl);
> +}
> +
>   /* Handle MTE thread switch */
>   static void mte_thread_switch(struct task_struct *next)
>   {
> @@ -511,6 +520,7 @@ static void mte_thread_switch(struct task_struct *next)
>   	/* avoid expensive SCTLR_EL1 accesses if no change */
>   	if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
>   		update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
> +	update_gcr_el1_excl(next->thread.gcr_excl);
>   }
>   #else
>   static void mte_thread_switch(struct task_struct *next)
> @@ -641,22 +651,31 @@ static long set_mte_ctrl(unsigned long arg)
>   	update_sctlr_el1_tcf0(tcf0);
>   	preempt_enable();
>   
> +	current->thread.gcr_excl = (arg & PR_MTE_EXCL_MASK) >> PR_MTE_EXCL_SHIFT;
> +	update_gcr_el1_excl(current->thread.gcr_excl);
> +
>   	return 0;
>   }
>   
>   static long get_mte_ctrl(void)
>   {
> +	unsigned long ret;
> +
>   	if (!system_supports_mte())
>   		return 0;
>   
> +	ret = current->thread.gcr_excl << PR_MTE_EXCL_SHIFT;
> +
>   	switch (current->thread.sctlr_tcf0) {
>   	case SCTLR_EL1_TCF0_SYNC:
> -		return PR_MTE_TCF_SYNC;
> +		ret |= PR_MTE_TCF_SYNC;
> +		break;
>   	case SCTLR_EL1_TCF0_ASYNC:
> -		return PR_MTE_TCF_ASYNC;
> +		ret |= PR_MTE_TCF_ASYNC;
> +		break;
>   	}
>   
> -	return 0;
> +	return ret;
>   }
>   #else
>   static long set_mte_ctrl(unsigned long arg)
> @@ -684,7 +703,7 @@ long set_tagged_addr_ctrl(unsigned long arg)
>   		return -EINVAL;
>   
>   	if (system_supports_mte())
> -		valid_mask |= PR_MTE_TCF_MASK;
> +		valid_mask |= PR_MTE_TCF_MASK | PR_MTE_EXCL_MASK;
>   
>   	if (arg & ~valid_mask)
>   		return -EINVAL;
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index 5e9323e66a38..749de5ab4f9f 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -239,5 +239,8 @@ struct prctl_mm_map {
>   # define PR_MTE_TCF_SYNC		(1UL << PR_MTE_TCF_SHIFT)
>   # define PR_MTE_TCF_ASYNC		(2UL << PR_MTE_TCF_SHIFT)
>   # define PR_MTE_TCF_MASK		(3UL << PR_MTE_TCF_SHIFT)
> +/* MTE tag exclusion mask */
> +# define PR_MTE_EXCL_SHIFT		3
> +# define PR_MTE_EXCL_MASK		(0xffffUL << PR_MTE_EXCL_SHIFT)
>   
>   #endif /* _LINUX_PRCTL_H */



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 20/22] arm64: mte: Allow user control of the excluded tags via prctl()
  2019-12-16 14:20   ` Kevin Brodsky
@ 2019-12-16 17:30     ` Peter Collingbourne
  2019-12-17 17:56       ` Catalin Marinas
  2020-06-22 17:17       ` Catalin Marinas
  0 siblings, 2 replies; 51+ messages in thread
From: Peter Collingbourne @ 2019-12-16 17:30 UTC (permalink / raw)
  To: Kevin Brodsky
  Cc: Catalin Marinas, Linux ARM, Will Deacon, Marc Zyngier,
	Vincenzo Frascino, Szabolcs Nagy, Richard Earnshaw,
	Andrey Konovalov, linux-mm, linux-arch, Branislav Rankov

On Mon, Dec 16, 2019 at 6:20 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
>
> +Branislav, Peter
>
> In this patch, the default exclusion mask remains 0 (i.e. all tags can be generated).
> After some more discussions, Branislav and I think that it would be better to start
> with the reverse, i.e. all tags but 0 excluded (mask = 0xfe or 0xff).
>
> This should simplify the MTE setup in the early C runtime quite a bit. Indeed, if all
> tags can be generated, doing any heap or stack tagging before the
> PR_SET_TAGGED_ADDR_CTRL prctl() is issued can cause problems, notably because tagged
> addresses could end up being passed to syscalls. Conversely, if IRG and ADDG never
> set the top byte by default, then tagging operations should be no-ops until the
> prctl() is issued. This would be particularly useful given that it may not be
> straightforward for the C runtime to issue the prctl() before doing anything else.
>
> Additionally, since the default tag checking mode is PR_MTE_TCF_NONE, it would make
> perfect sense not to generate tags by default.
>
> Any thoughts?

This would indeed allow the early C runtime startup code to pass
tagged addresses to syscalls, but I don't think it would entirely free
the code from the burden of worrying about stack tagging. Either way,
any stack frames that are active at the point when the prctl() is
issued would need to be compiled without stack tagging, because
otherwise those stack frames may use ADDG to rematerialize a stack
object address, which may produce a different address post-prctl.
Setting the exclude mask to 0xffff would at least make it more likely
for this problem to be detected, though.

If we change the default in this way, maybe it would be worth
considering flipping the meaning of the tag mask and have it be a mask
of tags to allow. That would be consistent with the existing behaviour
where userspace sets bits in tagged_addr_ctrl in order to enable
tagging features.

Peter

>
> Thanks,
> Kevin
>
> On 11/12/2019 18:40, Catalin Marinas wrote:
> > The IRG, ADDG and SUBG instructions insert a random tag in the resulting
> > address. Certain tags can be excluded via the GCR_EL1.Exclude bitmap
> > when, for example, the user wants a certain colour for freed buffers.
> > Since the GCR_EL1 register is not accessible at EL0, extend the
> > prctl(PR_SET_TAGGED_ADDR_CTRL) interface to include a 16-bit field in
> > the first argument for controlling the excluded tags. This setting is
> > per-thread.
> >
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > ---
> >   arch/arm64/include/asm/processor.h |  1 +
> >   arch/arm64/include/asm/sysreg.h    |  7 +++++++
> >   arch/arm64/kernel/process.c        | 27 +++++++++++++++++++++++----
> >   include/uapi/linux/prctl.h         |  3 +++
> >   4 files changed, 34 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> > index 91aa270afc7d..5b6988035334 100644
> > --- a/arch/arm64/include/asm/processor.h
> > +++ b/arch/arm64/include/asm/processor.h
> > @@ -150,6 +150,7 @@ struct thread_struct {
> >   #endif
> >   #ifdef CONFIG_ARM64_MTE
> >       u64                     sctlr_tcf0;
> > +     u64                     gcr_excl;
> >   #endif
> >   };
> >
> > diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> > index 9e5753272f4b..b6bb6d31f1cd 100644
> > --- a/arch/arm64/include/asm/sysreg.h
> > +++ b/arch/arm64/include/asm/sysreg.h
> > @@ -901,6 +901,13 @@
> >               write_sysreg(__scs_new, sysreg);                        \
> >   } while (0)
> >
> > +#define sysreg_clear_set_s(sysreg, clear, set) do {                  \
> > +     u64 __scs_val = read_sysreg_s(sysreg);                          \
> > +     u64 __scs_new = (__scs_val & ~(u64)(clear)) | (set);            \
> > +     if (__scs_new != __scs_val)                                     \
> > +             write_sysreg_s(__scs_new, sysreg);                      \
> > +} while (0)
> > +
> >   #endif
> >
> >   #endif      /* __ASM_SYSREG_H */
> > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > index 47ce98f47253..5ec6889795fc 100644
> > --- a/arch/arm64/kernel/process.c
> > +++ b/arch/arm64/kernel/process.c
> > @@ -502,6 +502,15 @@ static void update_sctlr_el1_tcf0(u64 tcf0)
> >       sysreg_clear_set(sctlr_el1, SCTLR_EL1_TCF0_MASK, tcf0);
> >   }
> >
> > +static void update_gcr_el1_excl(u64 excl)
> > +{
> > +     /*
> > +      * No need for ISB since this only affects EL0 currently, implicit
> > +      * with ERET.
> > +      */
> > +     sysreg_clear_set_s(SYS_GCR_EL1, SYS_GCR_EL1_EXCL_MASK, excl);
> > +}
> > +
> >   /* Handle MTE thread switch */
> >   static void mte_thread_switch(struct task_struct *next)
> >   {
> > @@ -511,6 +520,7 @@ static void mte_thread_switch(struct task_struct *next)
> >       /* avoid expensive SCTLR_EL1 accesses if no change */
> >       if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
> >               update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
> > +     update_gcr_el1_excl(next->thread.gcr_excl);
> >   }
> >   #else
> >   static void mte_thread_switch(struct task_struct *next)
> > @@ -641,22 +651,31 @@ static long set_mte_ctrl(unsigned long arg)
> >       update_sctlr_el1_tcf0(tcf0);
> >       preempt_enable();
> >
> > +     current->thread.gcr_excl = (arg & PR_MTE_EXCL_MASK) >> PR_MTE_EXCL_SHIFT;
> > +     update_gcr_el1_excl(current->thread.gcr_excl);
> > +
> >       return 0;
> >   }
> >
> >   static long get_mte_ctrl(void)
> >   {
> > +     unsigned long ret;
> > +
> >       if (!system_supports_mte())
> >               return 0;
> >
> > +     ret = current->thread.gcr_excl << PR_MTE_EXCL_SHIFT;
> > +
> >       switch (current->thread.sctlr_tcf0) {
> >       case SCTLR_EL1_TCF0_SYNC:
> > -             return PR_MTE_TCF_SYNC;
> > +             ret |= PR_MTE_TCF_SYNC;
> > +             break;
> >       case SCTLR_EL1_TCF0_ASYNC:
> > -             return PR_MTE_TCF_ASYNC;
> > +             ret |= PR_MTE_TCF_ASYNC;
> > +             break;
> >       }
> >
> > -     return 0;
> > +     return ret;
> >   }
> >   #else
> >   static long set_mte_ctrl(unsigned long arg)
> > @@ -684,7 +703,7 @@ long set_tagged_addr_ctrl(unsigned long arg)
> >               return -EINVAL;
> >
> >       if (system_supports_mte())
> > -             valid_mask |= PR_MTE_TCF_MASK;
> > +             valid_mask |= PR_MTE_TCF_MASK | PR_MTE_EXCL_MASK;
> >
> >       if (arg & ~valid_mask)
> >               return -EINVAL;
> > diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> > index 5e9323e66a38..749de5ab4f9f 100644
> > --- a/include/uapi/linux/prctl.h
> > +++ b/include/uapi/linux/prctl.h
> > @@ -239,5 +239,8 @@ struct prctl_mm_map {
> >   # define PR_MTE_TCF_SYNC            (1UL << PR_MTE_TCF_SHIFT)
> >   # define PR_MTE_TCF_ASYNC           (2UL << PR_MTE_TCF_SHIFT)
> >   # define PR_MTE_TCF_MASK            (3UL << PR_MTE_TCF_SHIFT)
> > +/* MTE tag exclusion mask */
> > +# define PR_MTE_EXCL_SHIFT           3
> > +# define PR_MTE_EXCL_MASK            (0xffffUL << PR_MTE_EXCL_SHIFT)
> >
> >   #endif /* _LINUX_PRCTL_H */
>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 12/22] arm64: mte: Add specific SIGSEGV codes
  2019-12-12 18:26     ` Eric W. Biederman
@ 2019-12-17 17:48       ` Catalin Marinas
  2019-12-17 20:06         ` Eric W. Biederman
  0 siblings, 1 reply; 51+ messages in thread
From: Catalin Marinas @ 2019-12-17 17:48 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Arnd Bergmann, Linux ARM, Will Deacon, Marc Zyngier,
	Vincenzo Frascino, Szabolcs Nagy, Richard Earnshaw,
	Kevin Brodsky, Andrey Konovalov, Linux-MM, linux-arch, Al Viro

Hi Eric,

On Thu, Dec 12, 2019 at 12:26:41PM -0600, Eric W. Biederman wrote:
> Arnd Bergmann <arnd@arndb.de> writes:
> > On Wed, Dec 11, 2019 at 7:40 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> >>
> >> From: Vincenzo Frascino <vincenzo.frascino@arm.com>
> >>
> >> Add MTE-specific SIGSEGV codes to siginfo.h.
> >>
> >> Note that for MTE we are reusing the same SPARC ADI codes because
> >> the two functionalities are similar and they cannot coexist on the same
> >> system.
> 
> Please Please Please don't do that.
> 
> It is actively harmful to have architecture specific si_code values.
> As it makes maintenance much more difficult.
> 
> Especially as the si_codes are part of a union discriminator.
> 
> If your functionality is identical, reuse the numbers; otherwise please
> just select the next numbers not yet used.

It makes sense.

> We have at least 256 si_codes per signal (2**32 if we really need
> them), so there is no need to reuse numbers.
> 
> The practical problem is that architecture specific si_codes start
> turning kernel/signal.c into #ifdef soup, and we lose a lot of
> basic compile coverage because of that.  In turn not compiling the code
> leads to bit-rot in all kinds of weird places.

Fortunately for MTE we don't need to change kernel/signal.c. It's
sufficient to call force_sig_fault() from the arch code with the
corresponding signo, code and fault address.
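
E.g. for the synchronous case it boils down to roughly (a sketch,
eliding the arm64 wrappers around it):

	force_sig_fault(SIGSEGV, SEGV_MTESERR, (void __user *)addr);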

> p.s. As for coexistence there is always the possibility that one chip
>> in a cpu family does support one thing and another chip in a cpu
> family supports another.  So userspace may have to cope with the
> situation even if an individual chip doesn't.
> 
> I remember a similar case where sparc had several distinct page table
> formats and we had a single kernel that had to cope with them all.

We have such fun on ARM as well with the big.LITTLE systems where not
all CPUs support the same features. For example, MTE is only enabled
once all the secondary CPUs have booted and confirmed to have the
feature.

Thanks.

-- 
Catalin


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 20/22] arm64: mte: Allow user control of the excluded tags via prctl()
  2019-12-16 17:30     ` Peter Collingbourne
@ 2019-12-17 17:56       ` Catalin Marinas
  2020-06-22 17:17       ` Catalin Marinas
  1 sibling, 0 replies; 51+ messages in thread
From: Catalin Marinas @ 2019-12-17 17:56 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Kevin Brodsky, Linux ARM, Will Deacon, Marc Zyngier,
	Vincenzo Frascino, Szabolcs Nagy, Richard Earnshaw,
	Andrey Konovalov, linux-mm, linux-arch, Branislav Rankov

On Mon, Dec 16, 2019 at 09:30:36AM -0800, Peter Collingbourne wrote:
> On Mon, Dec 16, 2019 at 6:20 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
> > In this patch, the default exclusion mask remains 0 (i.e. all tags can be generated).
> > After some more discussions, Branislav and I think that it would be better to start
> > with the reverse, i.e. all tags but 0 excluded (mask = 0xfe or 0xff).

So with mask 0xff, IRG generates only tag 0? This seems to be the case
reading the pseudocode in the ARM ARM.

> > This should simplify the MTE setup in the early C runtime quite a bit. Indeed, if all
> > tags can be generated, doing any heap or stack tagging before the
> > PR_SET_TAGGED_ADDR_CTRL prctl() is issued can cause problems, notably because tagged
> > addresses could end up being passed to syscalls. Conversely, if IRG and ADDG never
> > set the top byte by default, then tagging operations should be no-ops until the
> > prctl() is issued. This would be particularly useful given that it may not be
> > straightforward for the C runtime to issue the prctl() before doing anything else.
> >
> > Additionally, since the default tag checking mode is PR_MTE_TCF_NONE, it would make
> > perfect sense not to generate tags by default.
> >
> > Any thoughts?
> 
> This would indeed allow the early C runtime startup code to pass
> tagged addresses to syscalls, but I don't think it would entirely free
> the code from the burden of worrying about stack tagging. Either way,
> any stack frames that are active at the point when the prctl() is
> issued would need to be compiled without stack tagging, because
> otherwise those stack frames may use ADDG to rematerialize a stack
> object address, which may produce a different address post-prctl.
> Setting the exclude mask to 0xffff would at least make it more likely
> for this problem to be detected, though.
> 
> If we change the default in this way, maybe it would be worth
> considering flipping the meaning of the tag mask and have it be a mask
> of tags to allow. That would be consistent with the existing behaviour
> where userspace sets bits in tagged_addr_ctrl in order to enable
> tagging features.

Either option works for me. It's really for the libc people to decide
what they need. I think an "include" rather than "exclude" mask makes
sense with the default 0 meaning only generate tag 0.
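
I.e. in set_mte_ctrl() something like this (a sketch;
PR_MTE_TAG_SHIFT/PR_MTE_TAG_MASK are made-up names for the flipped
interface):

	/* exclude semantics (this patch): set bit => tag never generated */
	current->thread.gcr_excl = (arg & PR_MTE_EXCL_MASK) >> PR_MTE_EXCL_SHIFT;

	/* include semantics: set bit => tag allowed; invert before it
	 * reaches GCR_EL1.Exclude, so arg == 0 excludes everything and,
	 * per the ARM ARM pseudocode, IRG then only generates tag 0 */
	current->thread.gcr_excl = ~((arg & PR_MTE_TAG_MASK) >> PR_MTE_TAG_SHIFT) &
				   SYS_GCR_EL1_EXCL_MASK;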

Thanks.

-- 
Catalin


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 13/22] arm64: mte: Handle synchronous and asynchronous tag check faults
  2019-12-14  1:43   ` Peter Collingbourne
@ 2019-12-17 18:01     ` Catalin Marinas
  2019-12-20  1:36       ` [PATCH] arm64: mte: Do not service syscalls after async tag fault Peter Collingbourne
  0 siblings, 1 reply; 51+ messages in thread
From: Catalin Marinas @ 2019-12-17 18:01 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Evgenii Stepanov, Kostya Serebryany, Linux ARM, linux-arch,
	Richard Earnshaw, Szabolcs Nagy, Marc Zyngier, Kevin Brodsky,
	linux-mm, Andrey Konovalov, Vincenzo Frascino, Will Deacon

On Fri, Dec 13, 2019 at 05:43:15PM -0800, Peter Collingbourne wrote:
> On Wed, Dec 11, 2019 at 10:44 AM Catalin Marinas
> <catalin.marinas@arm.com> wrote:
> > diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> > index dd2cdc0d5be2..41fae64af82a 100644
> > --- a/arch/arm64/kernel/signal.c
> > +++ b/arch/arm64/kernel/signal.c
> > @@ -730,6 +730,9 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
> >         regs->regs[29] = (unsigned long)&user->next_frame->fp;
> >         regs->pc = (unsigned long)ka->sa.sa_handler;
> >
> > +       /* TCO (Tag Check Override) always cleared for signal handlers */
> > +       regs->pstate &= ~PSR_TCO_BIT;
> > +
> >         if (ka->sa.sa_flags & SA_RESTORER)
> >                 sigtramp = ka->sa.sa_restorer;
> >         else
> > @@ -921,6 +924,11 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
> >                         if (thread_flags & _TIF_UPROBE)
> >                                 uprobe_notify_resume(regs);
> >
> > +                       if (thread_flags & _TIF_MTE_ASYNC_FAULT) {
> > +                               clear_thread_flag(TIF_MTE_ASYNC_FAULT);
> > +                               force_signal_inject(SIGSEGV, SEGV_MTEAERR, 0);
> 
> In the case where the kernel is entered due to a syscall, this will
> inject a signal, but only after servicing the syscall. This means
> that, for example, if the syscall is exit(), the async tag check
> failure will be silently ignored. I can reproduce the problem with the
> program below:
[...]
> This patch fixes the problem for me:
> 
> diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> index 9a9d98a443fc..d0c8918dee00 100644
> --- a/arch/arm64/kernel/syscall.c
> +++ b/arch/arm64/kernel/syscall.c
> @@ -94,6 +94,8 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
>                            const syscall_fn_t syscall_table[])
>  {
>         unsigned long flags = current_thread_info()->flags;
> +       if (flags & _TIF_MTE_ASYNC_FAULT)
> +               return;

It needs a bit of thinking. This one wouldn't work if you want to handle
the signal and resume since it would skip the SVC instruction. We'd need
at least to do a regs->pc -= 4 and probably move it further down in this
function.

-- 
Catalin


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 12/22] arm64: mte: Add specific SIGSEGV codes
  2019-12-17 17:48       ` Catalin Marinas
@ 2019-12-17 20:06         ` Eric W. Biederman
  0 siblings, 0 replies; 51+ messages in thread
From: Eric W. Biederman @ 2019-12-17 20:06 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Arnd Bergmann, Linux ARM, Will Deacon, Marc Zyngier,
	Vincenzo Frascino, Szabolcs Nagy, Richard Earnshaw,
	Kevin Brodsky, Andrey Konovalov, Linux-MM, linux-arch, Al Viro

Catalin Marinas <catalin.marinas@arm.com> writes:

> Hi Eric,
>
> On Thu, Dec 12, 2019 at 12:26:41PM -0600, Eric W. Biederman wrote:
>> Arnd Bergmann <arnd@arndb.de> writes:
>> > On Wed, Dec 11, 2019 at 7:40 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>> >>
>> >> From: Vincenzo Frascino <vincenzo.frascino@arm.com>
>> >>
>> >> Add MTE-specific SIGSEGV codes to siginfo.h.
>> >>
>> >> Note that for MTE we are reusing the same SPARC ADI codes because
>> >> the two functionalities are similar and they cannot coexist on the same
>> >> system.
>> 
>> Please Please Please don't do that.
>> 
>> It is actively harmful to have architecture specific si_code values.
>> As it makes maintenance much more difficult.
>> 
>> Especially as the si_codes are part of a union discriminator.
>> 
>> If your functionality is identical, reuse the numbers; otherwise please
>> just select the next numbers not yet used.
>
> It makes sense.
>
>> We have at least 256 si_codes per signal (2**32 if we really need
>> them), so there is no need to reuse numbers.
>> 
>> The practical problem is that architecture specific si_codes start
>> turning kernel/signal.c into #ifdef soup, and we lose a lot of
>> basic compile coverage because of that.  In turn not compiling the code
>> leads to bit-rot in all kinds of weird places.
>
> Fortunately for MTE we don't need to change kernel/signal.c. It's
> sufficient to call force_sig_fault() from the arch code with the
> corresponding signo, code and fault address.

Hooray for force_sig_fault at keeping people honest about which
parameters they are passing.

So far it looks like it is just BUS_MCEERR_AR, BUS_MCEERR_AO,
SEGV_BNDERR, and SEGV_PKUERR that are the really confusing ones,
as they go beyond the ordinary force_sig_fault layout.

But we really do need the knowledge of how all of the cases are encoded
or things can get very confusing.  Especially when mixing 32bit and
64bit code.

>> p.s. As for coexistence there is always the possibility that one chip
>> in a cpu family does support one thing and another chip in a cpu
>> family supports another.  So userspace may have to cope with the
>> situation even if an individual chip doesn't.
>> 
>> I remember a similar case where sparc had several distinct page table
>> formats and we had a single kernel that had to cope with them all.
>
> We have such fun on ARM as well with the big.LITTLE systems where not
> all CPUs support the same features. For example, MTE is only enabled
> once all the secondary CPUs have booted and confirmed to have the
> feature.

Which all makes it possible that the alternative to MTE, referred to as
ADI, might show up in some future ARM chip.  That really makes reusing
the numbers a bad idea.

Not that I actually recall what any of this functionality actually is,
but I can tell when people are setting themselves up for a challenge
unnecessarily.

Eric


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 19/22] arm64: mte: Allow user control of the tag check mode via prctl()
  2019-12-11 18:40 ` [PATCH 19/22] arm64: mte: Allow user control of the tag check mode via prctl() Catalin Marinas
@ 2019-12-19 20:32   ` Peter Collingbourne
  2019-12-20  1:48     ` [PATCH] arm64: mte: Clear SCTLR_EL1.TCF0 on exec Peter Collingbourne
  2019-12-27 14:34   ` [PATCH 19/22] arm64: mte: Allow user control of the tag check mode via prctl() Kevin Brodsky
  1 sibling, 1 reply; 51+ messages in thread
From: Peter Collingbourne @ 2019-12-19 20:32 UTC (permalink / raw)
  To: Catalin Marinas, Evgenii Stepanov, Kostya Serebryany
  Cc: Linux ARM, linux-arch, Richard Earnshaw, Szabolcs Nagy,
	Marc Zyngier, Kevin Brodsky, linux-mm, Andrey Konovalov,
	Vincenzo Frascino, Will Deacon

On Wed, Dec 11, 2019 at 10:45 AM Catalin Marinas
<catalin.marinas@arm.com> wrote:
>
> By default, even if PROT_MTE is set on a memory range, there is no tag
> check fault reporting (SIGSEGV). Introduce a set of options to the
> existing prctl(PR_SET_TAGGED_ADDR_CTRL) to allow user control of the tag
> check fault mode:
>
>   PR_MTE_TCF_NONE  - no reporting (default)
>   PR_MTE_TCF_SYNC  - synchronous tag check fault reporting
>   PR_MTE_TCF_ASYNC - asynchronous tag check fault reporting
>
> These options translate into the corresponding SCTLR_EL1.TCF0 bitfield,
> context-switched by the kernel. Note that uaccess done by the kernel is
> not checked and cannot be configured by the user.
>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/include/asm/processor.h |   3 +
>  arch/arm64/kernel/process.c        | 119 +++++++++++++++++++++++++++--
>  include/uapi/linux/prctl.h         |   6 ++
>  3 files changed, 123 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 5ba63204d078..91aa270afc7d 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -148,6 +148,9 @@ struct thread_struct {
>  #ifdef CONFIG_ARM64_PTR_AUTH
>         struct ptrauth_keys     keys_user;
>  #endif
> +#ifdef CONFIG_ARM64_MTE
> +       u64                     sctlr_tcf0;
> +#endif
>  };
>
>  static inline void arch_thread_struct_whitelist(unsigned long *offset,
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index dd98d539894e..47ce98f47253 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -317,11 +317,22 @@ static void flush_tagged_addr_state(void)
>                 clear_thread_flag(TIF_TAGGED_ADDR);
>  }
>
> +#ifdef CONFIG_ARM64_MTE
> +static void flush_mte_state(void)
> +{
> +       if (!system_supports_mte())
> +               return;
> +
> +       /* clear any pending asynchronous tag fault */
> +       clear_thread_flag(TIF_MTE_ASYNC_FAULT);
> +       /* disable tag checking */
> +       current->thread.sctlr_tcf0 = 0;
> +}
> +#else
>  static void flush_mte_state(void)
>  {
> -       if (system_supports_mte())
> -               clear_thread_flag(TIF_MTE_ASYNC_FAULT);
>  }
> +#endif
>
>  void flush_thread(void)
>  {
> @@ -484,6 +495,29 @@ static void ssbs_thread_switch(struct task_struct *next)
>                 set_ssbs_bit(regs);
>  }
>
> +#ifdef CONFIG_ARM64_MTE
> +static void update_sctlr_el1_tcf0(u64 tcf0)
> +{
> +       /* no need for ISB since this only affects EL0, implicit with ERET */
> +       sysreg_clear_set(sctlr_el1, SCTLR_EL1_TCF0_MASK, tcf0);
> +}
> +
> +/* Handle MTE thread switch */
> +static void mte_thread_switch(struct task_struct *next)
> +{
> +       if (!system_supports_mte())
> +               return;
> +
> +       /* avoid expensive SCTLR_EL1 accesses if no change */
> +       if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
> +               update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);

I don't entirely understand why yet, but I've found that this check is
insufficient for ensuring consistency between SCTLR_EL1.TCF0 and
sctlr_tcf0. In my Android test environment with some processes having
sctlr_tcf0=SCTLR_EL1_TCF0_SYNC and others having sctlr_tcf0=0, I am
seeing intermittent tag failures coming from the sctlr_tcf0=0
processes. With this patch:

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index ef3bfa2bf2b1..4e5d02520a51 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -663,6 +663,8 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
 static int do_tag_check_fault(unsigned long addr, unsigned int esr,
                              struct pt_regs *regs)
 {
+       printk(KERN_ERR "do_tag_check_fault %lx %lx\n",
+              current->thread.sctlr_tcf0, read_sysreg(sctlr_el1));
        do_bad_area(addr, esr, regs);
        return 0;
 }

I see dmesg output like this:

[   15.249216] do_tag_check_fault 0 c60fc64791d

showing that SCTLR_EL1.TCF0 became inconsistent with sctlr_tcf0. This
patch fixes the problem for me:

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index fba89c9f070b..fb012f0baa12 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -518,9 +518,7 @@ static void mte_thread_switch(struct task_struct *next)
        if (!system_supports_mte())
                return;

-       /* avoid expensive SCTLR_EL1 accesses if no change */
-       if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
-               update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
+       update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
        update_gcr_el1_excl(next->thread.gcr_excl);
 }
 #else
@@ -643,15 +641,8 @@ static long set_mte_ctrl(unsigned long arg)
                return -EINVAL;
        }

-       /*
-        * mte_thread_switch() checks current->thread.sctlr_tcf0 as an
-        * optimisation. Disable preemption so that it does not see
-        * the variable update before the SCTLR_EL1.TCF0 one.
-        */
-       preempt_disable();
        current->thread.sctlr_tcf0 = tcf0;
        update_sctlr_el1_tcf0(tcf0);
-       preempt_enable();

        current->thread.gcr_excl = (arg & PR_MTE_EXCL_MASK) >> PR_MTE_EXCL_SHIFT;
        update_gcr_el1_excl(current->thread.gcr_excl);

Since sysreg_clear_set only sets the sysreg if it ended up changing, I
wouldn't expect this to cause a significant performance hit unless
just reading SCTLR_EL1 is expensive. That being said, if the
inconsistency is indicative of a deeper problem, we should probably
address that.


Peter

> +}
> +#else
> +static void mte_thread_switch(struct task_struct *next)
> +{
> +}
> +#endif
> +
>  /*
>   * We store our current task in sp_el0, which is clobbered by userspace. Keep a
>   * shadow copy so that we can restore this upon entry from userspace.
> @@ -514,6 +548,7 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
>         uao_thread_switch(next);
>         ptrauth_thread_switch(next);
>         ssbs_thread_switch(next);
> +       mte_thread_switch(next);
>
>         /*
>          * Complete any pending TLB or cache maintenance on this CPU in case
> @@ -574,6 +609,67 @@ void arch_setup_new_exec(void)
>         ptrauth_thread_init_user(current);
>  }
>
> +#ifdef CONFIG_ARM64_MTE
> +static long set_mte_ctrl(unsigned long arg)
> +{
> +       u64 tcf0;
> +
> +       if (!system_supports_mte())
> +               return 0;
> +
> +       switch (arg & PR_MTE_TCF_MASK) {
> +       case PR_MTE_TCF_NONE:
> +               tcf0 = 0;
> +               break;
> +       case PR_MTE_TCF_SYNC:
> +               tcf0 = SCTLR_EL1_TCF0_SYNC;
> +               break;
> +       case PR_MTE_TCF_ASYNC:
> +               tcf0 = SCTLR_EL1_TCF0_ASYNC;
> +               break;
> +       default:
> +               return -EINVAL;
> +       }
> +
> +       /*
> +        * mte_thread_switch() checks current->thread.sctlr_tcf0 as an
> +        * optimisation. Disable preemption so that it does not see
> +        * the variable update before the SCTLR_EL1.TCF0 one.
> +        */
> +       preempt_disable();
> +       current->thread.sctlr_tcf0 = tcf0;
> +       update_sctlr_el1_tcf0(tcf0);
> +       preempt_enable();
> +
> +       return 0;
> +}
> +
> +static long get_mte_ctrl(void)
> +{
> +       if (!system_supports_mte())
> +               return 0;
> +
> +       switch (current->thread.sctlr_tcf0) {
> +       case SCTLR_EL1_TCF0_SYNC:
> +               return PR_MTE_TCF_SYNC;
> +       case SCTLR_EL1_TCF0_ASYNC:
> +               return PR_MTE_TCF_ASYNC;
> +       }
> +
> +       return 0;
> +}
> +#else
> +static long set_mte_ctrl(unsigned long arg)
> +{
> +       return 0;
> +}
> +
> +static long get_mte_ctrl(void)
> +{
> +       return 0;
> +}
> +#endif
> +
>  #ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
>  /*
>   * Control the relaxed ABI allowing tagged user addresses into the kernel.
> @@ -582,9 +678,15 @@ static unsigned int tagged_addr_disabled;
>
>  long set_tagged_addr_ctrl(unsigned long arg)
>  {
> +       unsigned long valid_mask = PR_TAGGED_ADDR_ENABLE;
> +
>         if (is_compat_task())
>                 return -EINVAL;
> -       if (arg & ~PR_TAGGED_ADDR_ENABLE)
> +
> +       if (system_supports_mte())
> +               valid_mask |= PR_MTE_TCF_MASK;
> +
> +       if (arg & ~valid_mask)
>                 return -EINVAL;
>
>         /*
> @@ -594,6 +696,9 @@ long set_tagged_addr_ctrl(unsigned long arg)
>         if (arg & PR_TAGGED_ADDR_ENABLE && tagged_addr_disabled)
>                 return -EINVAL;
>
> +       if (set_mte_ctrl(arg) != 0)
> +               return -EINVAL;
> +
>         update_thread_flag(TIF_TAGGED_ADDR, arg & PR_TAGGED_ADDR_ENABLE);
>
>         return 0;
> @@ -601,13 +706,17 @@ long set_tagged_addr_ctrl(unsigned long arg)
>
>  long get_tagged_addr_ctrl(void)
>  {
> +       long ret = 0;
> +
>         if (is_compat_task())
>                 return -EINVAL;
>
>         if (test_thread_flag(TIF_TAGGED_ADDR))
> -               return PR_TAGGED_ADDR_ENABLE;
> +               ret = PR_TAGGED_ADDR_ENABLE;
>
> -       return 0;
> +       ret |= get_mte_ctrl();
> +
> +       return ret;
>  }
>
>  /*
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index 7da1b37b27aa..5e9323e66a38 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -233,5 +233,11 @@ struct prctl_mm_map {
>  #define PR_SET_TAGGED_ADDR_CTRL                55
>  #define PR_GET_TAGGED_ADDR_CTRL                56
>  # define PR_TAGGED_ADDR_ENABLE         (1UL << 0)
> +/* MTE tag check fault modes */
> +# define PR_MTE_TCF_SHIFT              1
> +# define PR_MTE_TCF_NONE               (0UL << PR_MTE_TCF_SHIFT)
> +# define PR_MTE_TCF_SYNC               (1UL << PR_MTE_TCF_SHIFT)
> +# define PR_MTE_TCF_ASYNC              (2UL << PR_MTE_TCF_SHIFT)
> +# define PR_MTE_TCF_MASK               (3UL << PR_MTE_TCF_SHIFT)
>
>  #endif /* _LINUX_PRCTL_H */
>


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH] arm64: mte: Do not service syscalls after async tag fault
  2019-12-17 18:01     ` Catalin Marinas
@ 2019-12-20  1:36       ` Peter Collingbourne
  2020-02-12 11:09         ` Catalin Marinas
  0 siblings, 1 reply; 51+ messages in thread
From: Peter Collingbourne @ 2019-12-20  1:36 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Peter Collingbourne, Evgenii Stepanov, Kostya Serebryany,
	Linux ARM, linux-arch, Richard Earnshaw, Szabolcs Nagy,
	Marc Zyngier, Kevin Brodsky, linux-mm, Andrey Konovalov,
	Vincenzo Frascino, Will Deacon

When entering the kernel after an async tag fault due to a syscall, rather
than for another reason (e.g. preemption), we don't want to service the
syscall as it may mask the tag fault. Rewind the PC to the svc instruction
in order to give a userspace signal handler an opportunity to handle the
fault and resume, and skip all other syscall processing.

Signed-off-by: Peter Collingbourne <pcc@google.com>
---
On Tue, Dec 17, 2019 at 10:01 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Fri, Dec 13, 2019 at 05:43:15PM -0800, Peter Collingbourne wrote:
> > On Wed, Dec 11, 2019 at 10:44 AM Catalin Marinas
> > <catalin.marinas@arm.com> wrote:
> > > diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> > > index dd2cdc0d5be2..41fae64af82a 100644
> > > --- a/arch/arm64/kernel/signal.c
> > > +++ b/arch/arm64/kernel/signal.c
> > > @@ -730,6 +730,9 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
> > >         regs->regs[29] = (unsigned long)&user->next_frame->fp;
> > >         regs->pc = (unsigned long)ka->sa.sa_handler;
> > >
> > > +       /* TCO (Tag Check Override) always cleared for signal handlers */
> > > +       regs->pstate &= ~PSR_TCO_BIT;
> > > +
> > >         if (ka->sa.sa_flags & SA_RESTORER)
> > >                 sigtramp = ka->sa.sa_restorer;
> > >         else
> > > @@ -921,6 +924,11 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
> > >                         if (thread_flags & _TIF_UPROBE)
> > >                                 uprobe_notify_resume(regs);
> > >
> > > +                       if (thread_flags & _TIF_MTE_ASYNC_FAULT) {
> > > +                               clear_thread_flag(TIF_MTE_ASYNC_FAULT);
> > > +                               force_signal_inject(SIGSEGV, SEGV_MTEAERR, 0);
> >
> > In the case where the kernel is entered due to a syscall, this will
> > inject a signal, but only after servicing the syscall. This means
> > that, for example, if the syscall is exit(), the async tag check
> > failure will be silently ignored. I can reproduce the problem with the
> > program below:
> [...]
> > This patch fixes the problem for me:
> >
> > diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> > index 9a9d98a443fc..d0c8918dee00 100644
> > --- a/arch/arm64/kernel/syscall.c
> > +++ b/arch/arm64/kernel/syscall.c
> > @@ -94,6 +94,8 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> >                            const syscall_fn_t syscall_table[])
> >  {
> >         unsigned long flags = current_thread_info()->flags;
> > +       if (flags & _TIF_MTE_ASYNC_FAULT)
> > +               return;
>
> It needs a bit of thinking. This one wouldn't work if you want to handle
> the signal and resume since it would skip the SVC instruction. We'd need
> at least to do a regs->pc -= 4 and probably move it further down in this
> function.

Okay, how does this look?

Peter

 arch/arm64/kernel/syscall.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 9a9d98a443fc..49ea9bb47190 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -95,13 +95,29 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
 {
 	unsigned long flags = current_thread_info()->flags;
 
-	regs->orig_x0 = regs->regs[0];
-	regs->syscallno = scno;
-
 	cortex_a76_erratum_1463225_svc_handler();
 	local_daif_restore(DAIF_PROCCTX);
 	user_exit();
 
+#ifdef CONFIG_ARM64_MTE
+	if (flags & _TIF_MTE_ASYNC_FAULT) {
+		/*
+		 * We entered the kernel after an async tag fault due to a
+		 * syscall, rather than for another reason (e.g. preemption).
+		 * In this case, we don't want to service the syscall as it may
+		 * mask the tag fault. Rewind the PC to the svc instruction in
+		 * order to give a userspace signal handler an opportunity to
+		 * handle the fault and resume, and skip all other syscall
+		 * processing.
+		 */
+		regs->pc -= 4;
+		return;
+	}
+#endif
+
+	regs->orig_x0 = regs->regs[0];
+	regs->syscallno = scno;
+
 	if (has_syscall_work(flags)) {
 		/* set default errno for user-issued syscall(-1) */
 		if (scno == NO_SYSCALL)
-- 
2.24.1.735.g03f4e72817-goog




* [PATCH] arm64: mte: Clear SCTLR_EL1.TCF0 on exec
  2019-12-19 20:32   ` Peter Collingbourne
@ 2019-12-20  1:48     ` Peter Collingbourne
  2020-02-12 17:03       ` Catalin Marinas
  0 siblings, 1 reply; 51+ messages in thread
From: Peter Collingbourne @ 2019-12-20  1:48 UTC (permalink / raw)
  To: Catalin Marinas, Evgenii Stepanov, Kostya Serebryany
  Cc: Peter Collingbourne, Linux ARM, linux-arch, Richard Earnshaw,
	Szabolcs Nagy, Marc Zyngier, Kevin Brodsky, linux-mm,
	Andrey Konovalov, Vincenzo Frascino, Will Deacon

Signed-off-by: Peter Collingbourne <pcc@google.com>
---
On Thu, Dec 19, 2019 at 12:32 PM Peter Collingbourne <pcc@google.com> wrote:
>
> On Wed, Dec 11, 2019 at 10:45 AM Catalin Marinas
> <catalin.marinas@arm.com> wrote:
> > +       if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
> > +               update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
>
> I don't entirely understand why yet, but I've found that this check is
> insufficient for ensuring consistency between SCTLR_EL1.TCF0 and
> sctlr_tcf0. In my Android test environment with some processes having
> sctlr_tcf0=SCTLR_EL1_TCF0_SYNC and others having sctlr_tcf0=0, I am
> seeing intermittent tag failures coming from the sctlr_tcf0=0
> processes. With this patch:
>
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index ef3bfa2bf2b1..4e5d02520a51 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -663,6 +663,8 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
>  static int do_tag_check_fault(unsigned long addr, unsigned int esr,
>                               struct pt_regs *regs)
>  {
> +       printk(KERN_ERR "do_tag_check_fault %lx %lx\n",
> +              current->thread.sctlr_tcf0, read_sysreg(sctlr_el1));
>         do_bad_area(addr, esr, regs);
>         return 0;
>  }
>
> I see dmesg output like this:
>
> [   15.249216] do_tag_check_fault 0 c60fc64791d
>
> showing that SCTLR_EL1.TCF0 became inconsistent with sctlr_tcf0. This
> patch fixes the problem for me:
>
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index fba89c9f070b..fb012f0baa12 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -518,9 +518,7 @@ static void mte_thread_switch(struct task_struct *next)
>         if (!system_supports_mte())
>                 return;
>
> -       /* avoid expensive SCTLR_EL1 accesses if no change */
> -       if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
> -               update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
> +       update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
>         update_gcr_el1_excl(next->thread.gcr_excl);
>  }
>  #else
> @@ -643,15 +641,8 @@ static long set_mte_ctrl(unsigned long arg)
>                 return -EINVAL;
>         }
>
> -       /*
> -        * mte_thread_switch() checks current->thread.sctlr_tcf0 as an
> -        * optimisation. Disable preemption so that it does not see
> -        * the variable update before the SCTLR_EL1.TCF0 one.
> -        */
> -       preempt_disable();
>         current->thread.sctlr_tcf0 = tcf0;
>         update_sctlr_el1_tcf0(tcf0);
> -       preempt_enable();
>
>         current->thread.gcr_excl = (arg & PR_MTE_EXCL_MASK) >> PR_MTE_EXCL_SHIFT;
>         update_gcr_el1_excl(current->thread.gcr_excl);
>
> Since sysreg_clear_set only sets the sysreg if it ended up changing, I
> wouldn't expect this to cause a significant performance hit unless
> just reading SCTLR_EL1 is expensive. That being said, if the
> inconsistency is indicative of a deeper problem, we should probably
> address that.
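
For reference, sysreg_clear_set() behaves roughly as below — a sketch from
memory rather than the verbatim kernel macro — so only the read of the
register is unconditional:

	#define sysreg_clear_set(sysreg, clear, set) do {		\
		u64 __scs_val = read_sysreg(sysreg);			\
		u64 __scs_new = (__scs_val & ~(u64)(clear)) | (set);	\
		/* the write is skipped when nothing changed */		\
		if (__scs_new != __scs_val)				\
			write_sysreg(__scs_new, sysreg);		\
	} while (0)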

I tracked it down to the flush_mte_state() function setting sctlr_tcf0 but
failing to update SCTLR_EL1.TCF0. With this patch I am not seeing any more
inconsistencies.

Peter

 arch/arm64/kernel/process.c | 37 +++++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index fba89c9f070b..07e8e7bd3bec 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -319,6 +319,25 @@ static void flush_tagged_addr_state(void)
 }
 
 #ifdef CONFIG_ARM64_MTE
+static void update_sctlr_el1_tcf0(u64 tcf0)
+{
+	/* no need for ISB since this only affects EL0, implicit with ERET */
+	sysreg_clear_set(sctlr_el1, SCTLR_EL1_TCF0_MASK, tcf0);
+}
+
+static void set_sctlr_el1_tcf0(u64 tcf0)
+{
+	/*
+	 * mte_thread_switch() checks current->thread.sctlr_tcf0 as an
+	 * optimisation. Disable preemption so that it does not see
+	 * the variable update before the SCTLR_EL1.TCF0 one.
+	 */
+	preempt_disable();
+	current->thread.sctlr_tcf0 = tcf0;
+	update_sctlr_el1_tcf0(tcf0);
+	preempt_enable();
+}
+
 static void flush_mte_state(void)
 {
 	if (!system_supports_mte())
@@ -327,7 +346,7 @@ static void flush_mte_state(void)
 	/* clear any pending asynchronous tag fault */
 	clear_thread_flag(TIF_MTE_ASYNC_FAULT);
 	/* disable tag checking */
-	current->thread.sctlr_tcf0 = 0;
+	set_sctlr_el1_tcf0(0);
 }
 #else
 static void flush_mte_state(void)
@@ -497,12 +516,6 @@ static void ssbs_thread_switch(struct task_struct *next)
 }
 
 #ifdef CONFIG_ARM64_MTE
-static void update_sctlr_el1_tcf0(u64 tcf0)
-{
-	/* no need for ISB since this only affects EL0, implicit with ERET */
-	sysreg_clear_set(sctlr_el1, SCTLR_EL1_TCF0_MASK, tcf0);
-}
-
 static void update_gcr_el1_excl(u64 excl)
 {
 	/*
@@ -643,15 +656,7 @@ static long set_mte_ctrl(unsigned long arg)
 		return -EINVAL;
 	}
 
-	/*
-	 * mte_thread_switch() checks current->thread.sctlr_tcf0 as an
-	 * optimisation. Disable preemption so that it does not see
-	 * the variable update before the SCTLR_EL1.TCF0 one.
-	 */
-	preempt_disable();
-	current->thread.sctlr_tcf0 = tcf0;
-	update_sctlr_el1_tcf0(tcf0);
-	preempt_enable();
+	set_sctlr_el1_tcf0(tcf0);
 
 	current->thread.gcr_excl = (arg & PR_MTE_EXCL_MASK) >> PR_MTE_EXCL_SHIFT;
 	update_gcr_el1_excl(current->thread.gcr_excl);
-- 
2.24.1.735.g03f4e72817-goog




* Re: [PATCH 22/22] arm64: mte: Add Memory Tagging Extension documentation
  2019-12-11 18:40 ` [PATCH 22/22] arm64: mte: Add Memory Tagging Extension documentation Catalin Marinas
@ 2019-12-24 15:03   ` Kevin Brodsky
  0 siblings, 0 replies; 51+ messages in thread
From: Kevin Brodsky @ 2019-12-24 15:03 UTC (permalink / raw)
  To: Catalin Marinas, linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Andrey Konovalov, linux-mm, linux-arch

On 11/12/2019 18:40, Catalin Marinas wrote:
> From: Vincenzo Frascino <vincenzo.frascino@arm.com>
>
> Memory Tagging Extension (part of the ARMv8.5 Extensions) provides
> a mechanism to detect the sources of memory related errors which
> may be vulnerable to exploitation, including bounds violations,
> use-after-free, use-after-return, use-out-of-scope and use before
> initialization errors.
>
> Add Memory Tagging Extension documentation for the arm64 linux
> kernel support.
>
> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
> Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>   Documentation/arm64/cpu-feature-registers.rst |   4 +
>   Documentation/arm64/elf_hwcaps.rst            |   4 +
>   Documentation/arm64/index.rst                 |   1 +
>   .../arm64/memory-tagging-extension.rst        | 229 ++++++++++++++++++
>   4 files changed, 238 insertions(+)
>   create mode 100644 Documentation/arm64/memory-tagging-extension.rst
>
> diff --git a/Documentation/arm64/cpu-feature-registers.rst b/Documentation/arm64/cpu-feature-registers.rst
> index b6e44884e3ad..67305a5f613a 100644
> --- a/Documentation/arm64/cpu-feature-registers.rst
> +++ b/Documentation/arm64/cpu-feature-registers.rst
> @@ -172,8 +172,12 @@ infrastructure:
>        +------------------------------+---------+---------+
>        | Name                         |  bits   | visible |
>        +------------------------------+---------+---------+
> +     | MTE                          | [11-8]  |    y    |
> +     +------------------------------+---------+---------+
>        | SSBS                         | [7-4]   |    y    |
>        +------------------------------+---------+---------+
> +     | BT                           | [3-0]   |    n    |
> +     +------------------------------+---------+---------+

Not sure the BTI bits should be in this patch.

>     4) MIDR_EL1 - Main ID Register
> diff --git a/Documentation/arm64/elf_hwcaps.rst b/Documentation/arm64/elf_hwcaps.rst
> index 7fa3d215ae6a..0f52d22c28af 100644
> --- a/Documentation/arm64/elf_hwcaps.rst
> +++ b/Documentation/arm64/elf_hwcaps.rst
> @@ -204,6 +204,10 @@ HWCAP2_FRINT
>   
>       Functionality implied by ID_AA64ISAR1_EL1.FRINTTS == 0b0001.
>   
> +HWCAP2_MTE
> +
> +    Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0010.
> +    Documentation/arm64/memory-tagging-extension.rst.

Nit: to be consistent with the PAC bits, the text should be something like "implied 
by ..., as described by Documentation/..."

>   4. Unused AT_HWCAP bits
>   -----------------------
> diff --git a/Documentation/arm64/index.rst b/Documentation/arm64/index.rst
> index 5c0c69dc58aa..82970c6d384f 100644
> --- a/Documentation/arm64/index.rst
> +++ b/Documentation/arm64/index.rst
> @@ -13,6 +13,7 @@ ARM64 Architecture
>       hugetlbpage
>       legacy_instructions
>       memory
> +    memory-tagging-extension
>       pointer-authentication
>       silicon-errata
>       sve
> diff --git a/Documentation/arm64/memory-tagging-extension.rst b/Documentation/arm64/memory-tagging-extension.rst
> new file mode 100644
> index 000000000000..ae02f0771971
> --- /dev/null
> +++ b/Documentation/arm64/memory-tagging-extension.rst
> @@ -0,0 +1,229 @@
> +===============================================
> +Memory Tagging Extension (MTE) in AArch64 Linux
> +===============================================
> +
> +Authors: Vincenzo Frascino <vincenzo.frascino@arm.com>
> +         Catalin Marinas <catalin.marinas@arm.com>
> +
> +Date: 2019-11-29
> +
> +This document describes the provision of the Memory Tagging Extension
> +functionality in AArch64 Linux.
> +
> +Introduction
> +============
> +
> +ARMv8.5 based processors introduce the Memory Tagging Extension (MTE)
> +feature. MTE is built on top of the ARMv8.0 virtual address tagging TBI
> +(Top Byte Ignore) feature and allows software to access a 4-bit
> +allocation tag for each 16-byte granule in the physical address space.
> +Such a memory range must be mapped with the Normal-Tagged memory
> +attribute. A logical tag is derived from bits 59-56 of the virtual
> +address used for the memory access. A CPU with MTE enabled will compare
> +the logical tag against the allocation tag and potentially raise an
> +exception on mismatch, subject to system register configuration.
> +
> +Userspace Support
> +=================
> +
> +Memory Tagging Extension Linux support depends on AArch64 Tagged Address
> +ABI being enabled in the kernel.

This is not very clear. Does it mean that the thread must have PR_TAGGED_ADDR_ENABLE 
set? In fact, as per patch 19, it is perfectly possible to enable tag checking 
without enabling the tagged address ABI, and this can work as long as no tagged 
pointer is passed to a syscall. Therefore enabling the tagged address ABI should be a 
recommendation and not a requirement.

>   For more details on AArch64 Tagged
> +Address ABI refer to Documentation/arm64/tagged-address-abi.rst.
> +
> +When ``CONFIG_ARM64_MTE`` is selected and Memory Tagging Extension is
> +supported by the hardware, the kernel advertises the feature to
> +userspace via ``HWCAP2_MTE``.
> +
> +PROT_MTE
> +--------
> +
> +To access the allocation tags, a user process must enable the Tagged
> +memory attribute on an address range using a new ``prot`` flag for
> +``mmap()`` and ``mprotect()``:
> +
> +``PROT_MTE`` - Pages allow access to the MTE allocation tags.
> +
> +The allocation tag is set to 0 when such pages are first mapped in the
> +user address space and preserved on copy-on-write. ``MAP_SHARED`` is
> +supported and the allocation tags can be shared between processes.
> +
> +**Note**: ``PROT_MTE`` is only supported on ``MAP_ANONYMOUS`` and
> +RAM-based file mappings (``tmpfs``, ``memfd``). Passing it to other
> +types of mapping will result in ``-EINVAL`` returned by these system
> +calls.
> +
> +**Note**: The ``PROT_MTE`` flag (and corresponding memory type) cannot
> +be cleared by ``mprotect()``. If this is desirable, ``munmap()``
> +(followed by ``mmap()``) must be used.

Or mmap(addr, ..., MAP_FIXED) to "overwrite" the mapping. Maybe it's better to just 
say that a new mapping must be created.

> +
> +Tag Check Faults
> +----------------
> +
> +When ``PROT_MTE`` is enabled on an address range and a mismatch between
> +the logical and allocation tags occurs on access, there are three
> +configurable behaviours:
> +
> +- *Ignore* - This is the default mode. The CPU (and kernel) ignores the
> +  tag check fault.
> +
> +- *Synchronous* - The kernel raises a ``SIGSEGV`` synchronously, with
> +  ``.si_code = SEGV_MTESERR`` and ``.si_addr = <fault-address>``. The
> +  memory access is not performed.
> +
> +- *Asynchronous* - The kernel raises a ``SIGSEGV``, in the current
> +  thread, asynchronously following one or multiple tag check faults,
> +  with ``.si_code = SEGV_MTEAERR`` and ``.si_addr = 0``.
> +
> +**Note**: There are no *match-all* logical tags available for user
> +applications.
> +
> +The user can select the above modes, per thread, using the
> +``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where
> +``flags`` contain one of the following values in the ``PR_MTE_TCF_MASK``
> +bit-field:
> +
> +- ``PR_MTE_TCF_NONE``  - *Ignore* tag check faults
> +- ``PR_MTE_TCF_SYNC``  - *Synchronous* tag check fault mode
> +- ``PR_MTE_TCF_ASYNC`` - *Asynchronous* tag check fault mode
> +
> +Tag checking can also be disabled for a user thread by setting the
> +``PSTATE.TCO`` bit with ``MSR TCO, #1``.
> +
> +**Note**: Signal handlers are always invoked with ``PSTATE.TCO = 0``,
> +irrespective of the interrupted context.
> +
> +**Note**: Kernel accesses to user memory (e.g. ``read()`` system call)
> +do not generate a tag check fault.
> +
> +Excluding Tags in the ``IRG``, ``ADDG`` and ``SUBG`` instructions
> +-----------------------------------------------------------------
> +
> +The architecture allows certain tags to be excluded from random tag
> +generation via the ``GCR_EL1.Exclude`` register bit-field. This can be configured,
> +per thread, using the ``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)``
> +system call where ``flags`` contains the exclusion bitmap in the
> +``PR_MTE_EXCL_MASK`` bit-field.
> +
> +Example of correct usage
> +========================
> +
> +*MTE Example code*
> +
> +.. code-block:: c
> +
> +    /*
> +     * To be compiled with -march=armv8.5-a+memtag
> +     */
> +    #include <errno.h>
> +    #include <stdio.h>
> +    #include <stdlib.h>
> +    #include <unistd.h>
> +    #include <sys/auxv.h>
> +    #include <sys/mman.h>
> +    #include <sys/prctl.h>
> +
> +    /*
> +     * From arch/arm64/include/uapi/asm/hwcap.h
> +     */
> +    #define HWCAP2_MTE              (1 << 10)
> +
> +    /*
> +     * From arch/arm64/include/uapi/asm/mman.h
> +     */
> +    #define PROT_MTE                 0x20
> +
> +    /*
> +     * From include/uapi/linux/prctl.h
> +     */
> +    #define PR_SET_TAGGED_ADDR_CTRL 55
> +    #define PR_GET_TAGGED_ADDR_CTRL 56
> +    # define PR_TAGGED_ADDR_ENABLE  (1UL << 0)
> +    # define PR_MTE_TCF_SHIFT       1
> +    # define PR_MTE_TCF_NONE        (0UL << PR_MTE_TCF_SHIFT)
> +    # define PR_MTE_TCF_SYNC        (1UL << PR_MTE_TCF_SHIFT)
> +    # define PR_MTE_TCF_ASYNC       (2UL << PR_MTE_TCF_SHIFT)
> +    # define PR_MTE_TCF_MASK        (3UL << PR_MTE_TCF_SHIFT)
> +    # define PR_MTE_EXCL_SHIFT      3
> +    # define PR_MTE_EXCL_MASK       (0xffffUL << PR_MTE_EXCL_SHIFT)
> +
> +    /*
> +     * Insert a random logical tag into the given pointer.
> +     */
> +    #define insert_random_tag(ptr) ({                       \
> +            __u64 __val;                                    \
> +            asm("irg %0, %1" : "=r" (__val) : "r" (ptr));   \
> +            __val;                                          \
> +    })
> +
> +    /*
> +     * Set the allocation tag on the destination address.
> +     */
> +    #define set_tag(tag, addr) do {                                 \
> +            asm volatile("stg %0, [%1]" : : "r" (tag), "r" (addr)); \

Since STG modifies the memory, "memory" should be added to the clobber list. 
`volatile` makes no difference since there's no output operand.

To simplify the example for people less familiar with MTE, I would also suggest for 
this macro to only take one argument, and use it for both operands of STG. The 
two-operand form is a relatively recent addition to the architecture, and using 
different operands is very rarely needed (especially in userspace).
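
For instance — an untested sketch of the suggested single-operand form,
with the clobber added:

    /*
     * Set the allocation tag on the address, taking the tag from the
     * pointer itself.
     */
    #define set_tag(tagged_addr) do {                                      \
            asm volatile("stg %0, [%0]" : : "r" (tagged_addr) : "memory"); \
    } while (0)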

Otherwise the documentation and example look good to me.

Kevin

> +    } while (0)
> +
> +    int main()
> +    {
> +            unsigned long *a;
> +            unsigned long page_sz = getpagesize();
> +            unsigned long hwcap2 = getauxval(AT_HWCAP2);
> +
> +            /* check if MTE is present */
> +            if (!(hwcap2 & HWCAP2_MTE))
> +                    return -1;
> +
> +            /*
> +             * Enable the tagged address ABI, synchronous MTE tag check faults and
> +             * exclude tag 0 from the randomly generated set.
> +             */
> +            if (prctl(PR_SET_TAGGED_ADDR_CTRL,
> +                      PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC | (1 << PR_MTE_EXCL_SHIFT),
> +                      0, 0, 0)) {
> +                    perror("prctl() failed");
> +                    return -1;
> +            }
> +
> +            a = mmap(0, page_sz, PROT_READ | PROT_WRITE,
> +                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> +            if (a == MAP_FAILED) {
> +                    perror("mmap() failed");
> +                    return -1;
> +            }
> +
> +            /*
> +             * Enable MTE on the above anonymous mmap. The flag could be passed
> +             * directly to mmap() and skip this step.
> +             */
> +            if (mprotect(a, page_sz, PROT_READ | PROT_WRITE | PROT_MTE)) {
> +                    perror("mprotect() failed");
> +                    return -1;
> +            }
> +
> +            /* access with the default tag (0) */
> +            a[0] = 1;
> +            a[1] = 2;
> +
> +            printf("a[0] = %lu a[1] = %lu\n", a[0], a[1]);
> +
> +            /* set the logical and allocation tags */
> +            a = (unsigned long *)insert_random_tag(a);
> +            set_tag(a, a);
> +
> +            printf("%p\n", a);
> +
> +            /* non-zero tag access */
> +            a[0] = 3;
> +            printf("a[0] = %lu a[1] = %lu\n", a[0], a[1]);
> +
> +            /*
> +             * If MTE is enabled correctly the next instruction will generate an
> +             * exception.
> +             */
> +            printf("Expecting SIGSEGV...\n");
> +            a[2] = 0xdead;
> +
> +            /* this should not be printed in the PR_MTE_TCF_SYNC mode */
> +            printf("...done\n");
> +
> +            return 0;
> +    }




* Re: [PATCH 19/22] arm64: mte: Allow user control of the tag check mode via prctl()
  2019-12-11 18:40 ` [PATCH 19/22] arm64: mte: Allow user control of the tag check mode via prctl() Catalin Marinas
  2019-12-19 20:32   ` Peter Collingbourne
@ 2019-12-27 14:34   ` Kevin Brodsky
  2020-02-12 11:45     ` Catalin Marinas
  1 sibling, 1 reply; 51+ messages in thread
From: Kevin Brodsky @ 2019-12-27 14:34 UTC (permalink / raw)
  To: Catalin Marinas, linux-arm-kernel
  Cc: Will Deacon, Marc Zyngier, Vincenzo Frascino, Szabolcs Nagy,
	Richard Earnshaw, Andrey Konovalov, linux-mm, linux-arch

Not just related to this patch, but here goes. While trying to debug an MTE-enabled 
process, I realised that there's no way to tell the tagged addr / MTE thread 
configuration from outside of the thread. At this point I thought it'd be really nice 
if this were to be exposed in /proc/pid, maybe in /proc/pid/status. Unfortunately 
there seems to be no precedent for an arch-specific feature to be exposed there. I 
guess a ptrace call would work as well, although it wouldn't be as practical without 
using a debugger.

Any thoughts?

Kevin

On 11/12/2019 18:40, Catalin Marinas wrote:
> By default, even if PROT_MTE is set on a memory range, there is no tag
> check fault reporting (SIGSEGV). Introduce a set of options to the
> existing prctl(PR_SET_TAGGED_ADDR_CTRL) to allow user control of the tag
> check fault mode:
>
>    PR_MTE_TCF_NONE  - no reporting (default)
>    PR_MTE_TCF_SYNC  - synchronous tag check fault reporting
>    PR_MTE_TCF_ASYNC - asynchronous tag check fault reporting
>
> These options translate into the corresponding SCTLR_EL1.TCF0 bitfield,
> context-switched by the kernel. Note that uaccess done by the kernel is
> not checked and cannot be configured by the user.
>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>   arch/arm64/include/asm/processor.h |   3 +
>   arch/arm64/kernel/process.c        | 119 +++++++++++++++++++++++++++--
>   include/uapi/linux/prctl.h         |   6 ++
>   3 files changed, 123 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 5ba63204d078..91aa270afc7d 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -148,6 +148,9 @@ struct thread_struct {
>   #ifdef CONFIG_ARM64_PTR_AUTH
>   	struct ptrauth_keys	keys_user;
>   #endif
> +#ifdef CONFIG_ARM64_MTE
> +	u64			sctlr_tcf0;
> +#endif
>   };
>   
>   static inline void arch_thread_struct_whitelist(unsigned long *offset,
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index dd98d539894e..47ce98f47253 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -317,11 +317,22 @@ static void flush_tagged_addr_state(void)
>   		clear_thread_flag(TIF_TAGGED_ADDR);
>   }
>   
> +#ifdef CONFIG_ARM64_MTE
> +static void flush_mte_state(void)
> +{
> +	if (!system_supports_mte())
> +		return;
> +
> +	/* clear any pending asynchronous tag fault */
> +	clear_thread_flag(TIF_MTE_ASYNC_FAULT);
> +	/* disable tag checking */
> +	current->thread.sctlr_tcf0 = 0;
> +}
> +#else
>   static void flush_mte_state(void)
>   {
> -	if (system_supports_mte())
> -		clear_thread_flag(TIF_MTE_ASYNC_FAULT);
>   }
> +#endif
>   
>   void flush_thread(void)
>   {
> @@ -484,6 +495,29 @@ static void ssbs_thread_switch(struct task_struct *next)
>   		set_ssbs_bit(regs);
>   }
>   
> +#ifdef CONFIG_ARM64_MTE
> +static void update_sctlr_el1_tcf0(u64 tcf0)
> +{
> +	/* no need for ISB since this only affects EL0, implicit with ERET */
> +	sysreg_clear_set(sctlr_el1, SCTLR_EL1_TCF0_MASK, tcf0);
> +}
> +
> +/* Handle MTE thread switch */
> +static void mte_thread_switch(struct task_struct *next)
> +{
> +	if (!system_supports_mte())
> +		return;
> +
> +	/* avoid expensive SCTLR_EL1 accesses if no change */
> +	if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
> +		update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
> +}
> +#else
> +static void mte_thread_switch(struct task_struct *next)
> +{
> +}
> +#endif
> +
>   /*
>    * We store our current task in sp_el0, which is clobbered by userspace. Keep a
>    * shadow copy so that we can restore this upon entry from userspace.
> @@ -514,6 +548,7 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
>   	uao_thread_switch(next);
>   	ptrauth_thread_switch(next);
>   	ssbs_thread_switch(next);
> +	mte_thread_switch(next);
>   
>   	/*
>   	 * Complete any pending TLB or cache maintenance on this CPU in case
> @@ -574,6 +609,67 @@ void arch_setup_new_exec(void)
>   	ptrauth_thread_init_user(current);
>   }
>   
> +#ifdef CONFIG_ARM64_MTE
> +static long set_mte_ctrl(unsigned long arg)
> +{
> +	u64 tcf0;
> +
> +	if (!system_supports_mte())
> +		return 0;
> +
> +	switch (arg & PR_MTE_TCF_MASK) {
> +	case PR_MTE_TCF_NONE:
> +		tcf0 = 0;
> +		break;
> +	case PR_MTE_TCF_SYNC:
> +		tcf0 = SCTLR_EL1_TCF0_SYNC;
> +		break;
> +	case PR_MTE_TCF_ASYNC:
> +		tcf0 = SCTLR_EL1_TCF0_ASYNC;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * mte_thread_switch() checks current->thread.sctlr_tcf0 as an
> +	 * optimisation. Disable preemption so that it does not see
> +	 * the variable update before the SCTLR_EL1.TCF0 one.
> +	 */
> +	preempt_disable();
> +	current->thread.sctlr_tcf0 = tcf0;
> +	update_sctlr_el1_tcf0(tcf0);
> +	preempt_enable();
> +
> +	return 0;
> +}
> +
> +static long get_mte_ctrl(void)
> +{
> +	if (!system_supports_mte())
> +		return 0;
> +
> +	switch (current->thread.sctlr_tcf0) {
> +	case SCTLR_EL1_TCF0_SYNC:
> +		return PR_MTE_TCF_SYNC;
> +	case SCTLR_EL1_TCF0_ASYNC:
> +		return PR_MTE_TCF_ASYNC;
> +	}
> +
> +	return 0;
> +}
> +#else
> +static long set_mte_ctrl(unsigned long arg)
> +{
> +	return 0;
> +}
> +
> +static long get_mte_ctrl(void)
> +{
> +	return 0;
> +}
> +#endif
> +
>   #ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
>   /*
>    * Control the relaxed ABI allowing tagged user addresses into the kernel.
> @@ -582,9 +678,15 @@ static unsigned int tagged_addr_disabled;
>   
>   long set_tagged_addr_ctrl(unsigned long arg)
>   {
> +	unsigned long valid_mask = PR_TAGGED_ADDR_ENABLE;
> +
>   	if (is_compat_task())
>   		return -EINVAL;
> -	if (arg & ~PR_TAGGED_ADDR_ENABLE)
> +
> +	if (system_supports_mte())
> +		valid_mask |= PR_MTE_TCF_MASK;
> +
> +	if (arg & ~valid_mask)
>   		return -EINVAL;
>   
>   	/*
> @@ -594,6 +696,9 @@ long set_tagged_addr_ctrl(unsigned long arg)
>   	if (arg & PR_TAGGED_ADDR_ENABLE && tagged_addr_disabled)
>   		return -EINVAL;
>   
> +	if (set_mte_ctrl(arg) != 0)
> +		return -EINVAL;
> +
>   	update_thread_flag(TIF_TAGGED_ADDR, arg & PR_TAGGED_ADDR_ENABLE);
>   
>   	return 0;
> @@ -601,13 +706,17 @@ long set_tagged_addr_ctrl(unsigned long arg)
>   
>   long get_tagged_addr_ctrl(void)
>   {
> +	long ret = 0;
> +
>   	if (is_compat_task())
>   		return -EINVAL;
>   
>   	if (test_thread_flag(TIF_TAGGED_ADDR))
> -		return PR_TAGGED_ADDR_ENABLE;
> +		ret = PR_TAGGED_ADDR_ENABLE;
>   
> -	return 0;
> +	ret |= get_mte_ctrl();
> +
> +	return ret;
>   }
>   
>   /*
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index 7da1b37b27aa..5e9323e66a38 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -233,5 +233,11 @@ struct prctl_mm_map {
>   #define PR_SET_TAGGED_ADDR_CTRL		55
>   #define PR_GET_TAGGED_ADDR_CTRL		56
>   # define PR_TAGGED_ADDR_ENABLE		(1UL << 0)
> +/* MTE tag check fault modes */
> +# define PR_MTE_TCF_SHIFT		1
> +# define PR_MTE_TCF_NONE		(0UL << PR_MTE_TCF_SHIFT)
> +# define PR_MTE_TCF_SYNC		(1UL << PR_MTE_TCF_SHIFT)
> +# define PR_MTE_TCF_ASYNC		(2UL << PR_MTE_TCF_SHIFT)
> +# define PR_MTE_TCF_MASK		(3UL << PR_MTE_TCF_SHIFT)
>   
>   #endif /* _LINUX_PRCTL_H */




* Re: [PATCH 15/22] arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  2019-12-11 18:40 ` [PATCH 15/22] arm64: mte: Add PROT_MTE support to mmap() and mprotect() Catalin Marinas
@ 2020-01-21 22:06   ` Peter Collingbourne
  0 siblings, 0 replies; 51+ messages in thread
From: Peter Collingbourne @ 2020-01-21 22:06 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linux ARM, linux-arch, Richard Earnshaw, Szabolcs Nagy,
	Marc Zyngier, Kevin Brodsky, linux-mm, Andrey Konovalov,
	Vincenzo Frascino, Will Deacon

On Wed, Dec 11, 2019 at 10:44 AM Catalin Marinas
<catalin.marinas@arm.com> wrote:
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 9442631fd4af..34bc9e0b4896 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -677,6 +677,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
>                 [ilog2(VM_MERGEABLE)]   = "mg",
>                 [ilog2(VM_UFFD_MISSING)]= "um",
>                 [ilog2(VM_UFFD_WP)]     = "uw",
> +#ifdef CONFIG_ARM64_MTE
> +               [ilog2(VM_MTE)]         = "mt",

We should probably add an entry for VM_MTE_ALLOWED here as well.
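
Something along these lines, presumably — a sketch only; the two-letter
code for VM_MTE_ALLOWED below is made up, not taken from the patch:

               [ilog2(VM_MTE)]         = "mt",
               [ilog2(VM_MTE_ALLOWED)] = "ma",  /* hypothetical smaps code */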

Peter



* Re: [PATCH] arm64: mte: Do not service syscalls after async tag fault
  2019-12-20  1:36       ` [PATCH] arm64: mte: Do not service syscalls after async tag fault Peter Collingbourne
@ 2020-02-12 11:09         ` Catalin Marinas
  2020-02-18 21:59           ` Peter Collingbourne
  0 siblings, 1 reply; 51+ messages in thread
From: Catalin Marinas @ 2020-02-12 11:09 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Evgenii Stepanov, Kostya Serebryany, Linux ARM, linux-arch,
	Richard Earnshaw, Szabolcs Nagy, Marc Zyngier, Kevin Brodsky,
	linux-mm, Andrey Konovalov, Vincenzo Frascino, Will Deacon

On Thu, Dec 19, 2019 at 05:36:39PM -0800, Peter Collingbourne wrote:
> When entering the kernel after an async tag fault due to a syscall, rather
> than for another reason (e.g. preemption), we don't want to service the
> syscall as it may mask the tag fault. Rewind the PC to the svc instruction
> in order to give a userspace signal handler an opportunity to handle the
> fault and resume, and skip all other syscall processing.
> 
> Signed-off-by: Peter Collingbourne <pcc@google.com>
> ---
[...]
>  arch/arm64/kernel/syscall.c | 22 +++++++++++++++++++---
>  1 file changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> index 9a9d98a443fc..49ea9bb47190 100644
> --- a/arch/arm64/kernel/syscall.c
> +++ b/arch/arm64/kernel/syscall.c
> @@ -95,13 +95,29 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
>  {
>  	unsigned long flags = current_thread_info()->flags;
>  
> -	regs->orig_x0 = regs->regs[0];
> -	regs->syscallno = scno;
> -
>  	cortex_a76_erratum_1463225_svc_handler();
>  	local_daif_restore(DAIF_PROCCTX);
>  	user_exit();
>  
> +#ifdef CONFIG_ARM64_MTE
> +	if (flags & _TIF_MTE_ASYNC_FAULT) {
> +		/*
> +		 * We entered the kernel after an async tag fault due to a
> +		 * syscall, rather than for another reason (e.g. preemption).
> +		 * In this case, we don't want to service the syscall as it may
> +		 * mask the tag fault. Rewind the PC to the svc instruction in
> +		 * order to give a userspace signal handler an opportunity to
> +		 * handle the fault and resume, and skip all other syscall
> +		 * processing.
> +		 */
> +		regs->pc -= 4;
> +		return;
> +	}
> +#endif
> +
> +	regs->orig_x0 = regs->regs[0];
> +	regs->syscallno = scno;

I'm slightly worried about the interaction with single-step and other
signals. It might be better if we just use the existing syscall
restarting mechanism. Untested diff below:

-------------------8<-------------------------------
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index a12c0c88d345..db25f5d6a07c 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -102,6 +102,16 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
 	local_daif_restore(DAIF_PROCCTX);
 	user_exit();
 
+	if (system_supports_mte() && (flags & _TIF_MTE_ASYNC_FAULT)) {
+		/*
+		 * Process the asynchronous tag check fault before the actual
+		 * syscall. do_notify_resume() will send a signal to userspace
+		 * before the syscall is restarted.
+		 */
+		regs->regs[0] = -ERESTARTNOINTR;
+		return;
+	}
+
 	if (has_syscall_work(flags)) {
 		/* set default errno for user-issued syscall(-1) */
 		if (scno == NO_SYSCALL)



* Re: [PATCH 19/22] arm64: mte: Allow user control of the tag check mode via prctl()
  2019-12-27 14:34   ` [PATCH 19/22] arm64: mte: Allow user control of the tag check mode via prctl() Kevin Brodsky
@ 2020-02-12 11:45     ` Catalin Marinas
  0 siblings, 0 replies; 51+ messages in thread
From: Catalin Marinas @ 2020-02-12 11:45 UTC (permalink / raw)
  To: Kevin Brodsky
  Cc: linux-arm-kernel, Will Deacon, Marc Zyngier, Vincenzo Frascino,
	Szabolcs Nagy, Richard Earnshaw, Andrey Konovalov, linux-mm,
	linux-arch

On Fri, Dec 27, 2019 at 02:34:32PM +0000, Kevin Brodsky wrote:
> Not just related to this patch, but here goes. While trying to debug an
> MTE-enabled process, I realised that there's no way to tell the tagged addr
> / MTE thread configuration from outside of the thread. At this point I
> thought it'd be really nice if this were to be exposed in /proc/pid, maybe
> in /proc/pid/status. Unfortunately there seems to be no precedent for an
> arch-specific feature to be exposed there. I guess a ptrace call would work
> as well, although it wouldn't be as practical without using a debugger.

There is proc_pid_arch_status(), currently only used by x86 to report
the avx512 status. We could do the same on arm64 and provide information
on the MTE status, SVE configuration and ptrauth. I think this
can be a separate patch covering all these.
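
Something like the sketch below, perhaps. Everything here is invented for
illustration: it assumes the CONFIG_PROC_PID_ARCH_STATUS hook used by x86
and the sctlr_tcf0 field from this series, and the field name in the
output is made up:

	int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns,
				 struct pid *pid, struct task_struct *task)
	{
		const char *mode = "none";

		if (!system_supports_mte())
			return 0;

		if (task->thread.sctlr_tcf0 == SCTLR_EL1_TCF0_SYNC)
			mode = "sync";
		else if (task->thread.sctlr_tcf0 == SCTLR_EL1_TCF0_ASYNC)
			mode = "async";
		seq_printf(m, "MTE_tcf_mode:\t%s\n", mode);
		return 0;
	}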

-- 
Catalin



* Re: [PATCH] arm64: mte: Clear SCTLR_EL1.TCF0 on exec
  2019-12-20  1:48     ` [PATCH] arm64: mte: Clear SCTLR_EL1.TCF0 on exec Peter Collingbourne
@ 2020-02-12 17:03       ` Catalin Marinas
  0 siblings, 0 replies; 51+ messages in thread
From: Catalin Marinas @ 2020-02-12 17:03 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Evgenii Stepanov, Kostya Serebryany, Linux ARM, linux-arch,
	Richard Earnshaw, Szabolcs Nagy, Marc Zyngier, Kevin Brodsky,
	linux-mm, Andrey Konovalov, Vincenzo Frascino, Will Deacon

On Thu, Dec 19, 2019 at 05:48:53PM -0800, Peter Collingbourne wrote:
> On Thu, Dec 19, 2019 at 12:32 PM Peter Collingbourne <pcc@google.com> wrote:
> > On Wed, Dec 11, 2019 at 10:45 AM Catalin Marinas
> > <catalin.marinas@arm.com> wrote:
> > > +       if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
> > > +               update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
> >
> > I don't entirely understand why yet, but I've found that this check is
> > insufficient for ensuring consistency between SCTLR_EL1.TCF0 and
> > sctlr_tcf0. In my Android test environment with some processes having
> > sctlr_tcf0=SCTLR_EL1_TCF0_SYNC and others having sctlr_tcf0=0, I am
> > seeing intermittent tag failures coming from the sctlr_tcf0=0
> > processes. With this patch:
[...]
> > Since sysreg_clear_set only sets the sysreg if it ended up changing, I
> > wouldn't expect this to cause a significant performance hit unless
> > just reading SCTLR_EL1 is expensive. That being said, if the
> > inconsistency is indicative of a deeper problem, we should probably
> > address that.
> 
> I tracked it down to the flush_mte_state() function setting sctlr_tcf0 but
> failing to update SCTLR_EL1.TCF0. With this patch I am not seeing any more
> inconsistencies.

Thanks Peter. I folded in your fix.

-- 
Catalin



* Re: [PATCH 00/22] arm64: Memory Tagging Extension user-space support
  2019-12-13 18:05 ` [PATCH 00/22] arm64: Memory Tagging Extension user-space support Peter Collingbourne
@ 2020-02-13 11:23   ` Catalin Marinas
  0 siblings, 0 replies; 51+ messages in thread
From: Catalin Marinas @ 2020-02-13 11:23 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Linux ARM, linux-arch, Richard Earnshaw, Szabolcs Nagy,
	Marc Zyngier, Kevin Brodsky, linux-mm, Andrey Konovalov,
	Vincenzo Frascino, Will Deacon, Evgenii Stepanov,
	Kostya Kortchinsky, Kostya Serebryany

On Fri, Dec 13, 2019 at 10:05:10AM -0800, Peter Collingbourne wrote:
> On Wed, Dec 11, 2019 at 10:40 AM Catalin Marinas
> <catalin.marinas@arm.com> wrote:
> > This series proposes the initial user-space support for the ARMv8.5
> > Memory Tagging Extension [1].
> 
> Thanks for sending out this series. I have been testing it on Android
> with the FVP model and my in-development scudo changes that add memory
> tagging support [1], and have not noticed any problems so far.

Thanks for the comments so far and the testing. I'll post a v2 next
week.

> > - Clarify whether mmap(tagged_addr, PROT_MTE) pre-tags the memory with
> >   the tag given in the tagged_addr hint. Strong justification is
> >   required for this as it would force arm64 to disable the zero page.
> 
> We would like to use this feature in scudo to tag large (>128KB on
> Android) allocations, which are currently allocated via mmap rather
> than from an allocation pool. Otherwise we would need to pay the cost
> (perf and RSS) of faulting all of their pages at allocation time
> instead of on demand, if we want to tag them.

Would the default tag of 0 be sufficient here? We disable match-all for
user-space already, so 0 is not a wildcard.

-- 
Catalin



* Re: [PATCH] arm64: mte: Do not service syscalls after async tag fault
  2020-02-12 11:09         ` Catalin Marinas
@ 2020-02-18 21:59           ` Peter Collingbourne
  2020-02-19 16:16             ` Catalin Marinas
  0 siblings, 1 reply; 51+ messages in thread
From: Peter Collingbourne @ 2020-02-18 21:59 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Evgenii Stepanov, Kostya Serebryany, Linux ARM, linux-arch,
	Richard Earnshaw, Szabolcs Nagy, Marc Zyngier, Kevin Brodsky,
	linux-mm, Andrey Konovalov, Vincenzo Frascino, Will Deacon

On Wed, Feb 12, 2020 at 3:09 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Thu, Dec 19, 2019 at 05:36:39PM -0800, Peter Collingbourne wrote:
> > When entering the kernel after an async tag fault due to a syscall, rather
> > than for another reason (e.g. preemption), we don't want to service the
> > syscall as it may mask the tag fault. Rewind the PC to the svc instruction
> > in order to give a userspace signal handler an opportunity to handle the
> > fault and resume, and skip all other syscall processing.
> >
> > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > ---
> [...]
> >  arch/arm64/kernel/syscall.c | 22 +++++++++++++++++++---
> >  1 file changed, 19 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> > index 9a9d98a443fc..49ea9bb47190 100644
> > --- a/arch/arm64/kernel/syscall.c
> > +++ b/arch/arm64/kernel/syscall.c
> > @@ -95,13 +95,29 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> >  {
> >       unsigned long flags = current_thread_info()->flags;
> >
> > -     regs->orig_x0 = regs->regs[0];
> > -     regs->syscallno = scno;
> > -
> >       cortex_a76_erratum_1463225_svc_handler();
> >       local_daif_restore(DAIF_PROCCTX);
> >       user_exit();
> >
> > +#ifdef CONFIG_ARM64_MTE
> > +     if (flags & _TIF_MTE_ASYNC_FAULT) {
> > +             /*
> > +              * We entered the kernel after an async tag fault due to a
> > +              * syscall, rather than for another reason (e.g. preemption).
> > +              * In this case, we don't want to service the syscall as it may
> > +              * mask the tag fault. Rewind the PC to the svc instruction in
> > +              * order to give a userspace signal handler an opportunity to
> > +              * handle the fault and resume, and skip all other syscall
> > +              * processing.
> > +              */
> > +             regs->pc -= 4;
> > +             return;
> > +     }
> > +#endif
> > +
> > +     regs->orig_x0 = regs->regs[0];
> > +     regs->syscallno = scno;
>
> I'm slightly worried about the interaction with single-step, other
> signals. It might be better if we just use the existing syscall
> restarting mechanism. Untested diff below:
>
> -------------------8<-------------------------------
> diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> index a12c0c88d345..db25f5d6a07c 100644
> --- a/arch/arm64/kernel/syscall.c
> +++ b/arch/arm64/kernel/syscall.c
> @@ -102,6 +102,16 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
>         local_daif_restore(DAIF_PROCCTX);
>         user_exit();
>
> +       if (system_supports_mte() && (flags & _TIF_MTE_ASYNC_FAULT)) {
> +               /*
> +                * Process the asynchronous tag check fault before the actual
> +                * syscall. do_notify_resume() will send a signal to userspace
> +                * before the syscall is restarted.
> +                */
> +               regs->regs[0] = -ERESTARTNOINTR;
> +               return;
> +       }
> +
>         if (has_syscall_work(flags)) {
>                 /* set default errno for user-issued syscall(-1) */
>                 if (scno == NO_SYSCALL)

That works for me, and I verified that my small test program as well
as some larger unit tests behave as expected.

Tested-by: Peter Collingbourne <pcc@google.com>


Peter



* Re: [PATCH] arm64: mte: Do not service syscalls after async tag fault
  2020-02-18 21:59           ` Peter Collingbourne
@ 2020-02-19 16:16             ` Catalin Marinas
  0 siblings, 0 replies; 51+ messages in thread
From: Catalin Marinas @ 2020-02-19 16:16 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Evgenii Stepanov, Kostya Serebryany, Linux ARM, linux-arch,
	Richard Earnshaw, Szabolcs Nagy, Marc Zyngier, Kevin Brodsky,
	linux-mm, Andrey Konovalov, Vincenzo Frascino, Will Deacon

On Tue, Feb 18, 2020 at 01:59:34PM -0800, Peter Collingbourne wrote:
> On Wed, Feb 12, 2020 at 3:09 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Thu, Dec 19, 2019 at 05:36:39PM -0800, Peter Collingbourne wrote:
> > > When entering the kernel after an async tag fault due to a syscall, rather
> > > than for another reason (e.g. preemption), we don't want to service the
> > > syscall as it may mask the tag fault. Rewind the PC to the svc instruction
> > > in order to give a userspace signal handler an opportunity to handle the
> > > fault and resume, and skip all other syscall processing.
> > >
> > > Signed-off-by: Peter Collingbourne <pcc@google.com>
> > > ---
> > [...]
> > >  arch/arm64/kernel/syscall.c | 22 +++++++++++++++++++---
> > >  1 file changed, 19 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> > > index 9a9d98a443fc..49ea9bb47190 100644
> > > --- a/arch/arm64/kernel/syscall.c
> > > +++ b/arch/arm64/kernel/syscall.c
> > > @@ -95,13 +95,29 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> > >  {
> > >       unsigned long flags = current_thread_info()->flags;
> > >
> > > -     regs->orig_x0 = regs->regs[0];
> > > -     regs->syscallno = scno;
> > > -
> > >       cortex_a76_erratum_1463225_svc_handler();
> > >       local_daif_restore(DAIF_PROCCTX);
> > >       user_exit();
> > >
> > > +#ifdef CONFIG_ARM64_MTE
> > > +     if (flags & _TIF_MTE_ASYNC_FAULT) {
> > > +             /*
> > > +              * We entered the kernel after an async tag fault due to a
> > > +              * syscall, rather than for another reason (e.g. preemption).
> > > +              * In this case, we don't want to service the syscall as it may
> > > +              * mask the tag fault. Rewind the PC to the svc instruction in
> > > +              * order to give a userspace signal handler an opportunity to
> > > +              * handle the fault and resume, and skip all other syscall
> > > +              * processing.
> > > +              */
> > > +             regs->pc -= 4;
> > > +             return;
> > > +     }
> > > +#endif
> > > +
> > > +     regs->orig_x0 = regs->regs[0];
> > > +     regs->syscallno = scno;
> >
> > I'm slightly worried about the interaction with single-step, other
> > signals. It might be better if we just use the existing syscall
> > restarting mechanism. Untested diff below:
> >
> > -------------------8<-------------------------------
> > diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> > index a12c0c88d345..db25f5d6a07c 100644
> > --- a/arch/arm64/kernel/syscall.c
> > +++ b/arch/arm64/kernel/syscall.c
> > @@ -102,6 +102,16 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> >         local_daif_restore(DAIF_PROCCTX);
> >         user_exit();
> >
> > +       if (system_supports_mte() && (flags & _TIF_MTE_ASYNC_FAULT)) {
> > +               /*
> > +                * Process the asynchronous tag check fault before the actual
> > +                * syscall. do_notify_resume() will send a signal to userspace
> > +                * before the syscall is restarted.
> > +                */
> > +               regs->regs[0] = -ERESTARTNOINTR;
> > +               return;
> > +       }
> > +
> >         if (has_syscall_work(flags)) {
> >                 /* set default errno for user-issued syscall(-1) */
> >                 if (scno == NO_SYSCALL)
> 
> That works for me, and I verified that my small test program as well
> as some larger unit tests behave as expected.
> 
> Tested-by: Peter Collingbourne <pcc@google.com>

Thanks Peter.

-- 
Catalin



* Re: [PATCH 20/22] arm64: mte: Allow user control of the excluded tags via prctl()
  2019-12-16 17:30     ` Peter Collingbourne
  2019-12-17 17:56       ` Catalin Marinas
@ 2020-06-22 17:17       ` Catalin Marinas
  2020-06-22 19:00         ` Peter Collingbourne
  1 sibling, 1 reply; 51+ messages in thread
From: Catalin Marinas @ 2020-06-22 17:17 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Kevin Brodsky, Linux ARM, Will Deacon, Marc Zyngier,
	Vincenzo Frascino, Szabolcs Nagy, Richard Earnshaw,
	Andrey Konovalov, linux-mm, linux-arch, Branislav Rankov,
	Dave P Martin

Hi Peter,

Revisiting the gcr_excl vs gcr_incl decision, so reviving an old thread.

On Mon, Dec 16, 2019 at 09:30:36AM -0800, Peter Collingbourne wrote:
> On Mon, Dec 16, 2019 at 6:20 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
> > In this patch, the default exclusion mask remains 0 (i.e. all tags can be generated).
> > After some more discussions, Branislav and I think that it would be better to start
> > with the reverse, i.e. all tags but 0 excluded (mask = 0xfe or 0xff).
> >
> > This should simplify the MTE setup in the early C runtime quite a bit. Indeed, if all
> > tags can be generated, doing any heap or stack tagging before the
> > PR_SET_TAGGED_ADDR_CTRL prctl() is issued can cause problems, notably because tagged
> > addresses could end up being passed to syscalls. Conversely, if IRG and ADDG never
> > set the top byte by default, then tagging operations should be no-ops until the
> > prctl() is issued. This would be particularly useful given that it may not be
> > straightforward for the C runtime to issue the prctl() before doing anything else.
> >
> > Additionally, since the default tag checking mode is PR_MTE_TCF_NONE, it would make
> > perfect sense not to generate tags by default.
> 
> This would indeed allow the early C runtime startup code to pass
> tagged addresses to syscalls,

I guess you meant that early C runtime code won't get tagged stack
addresses, hence they can be passed to syscalls. Prior to the prctl(),
the kernel doesn't accept tagged addresses anyway.

> but I don't think it would entirely free
> the code from the burden of worrying about stack tagging. Either way,
> any stack frames that are active at the point when the prctl() is
> issued would need to be compiled without stack tagging, because
> otherwise those stack frames may use ADDG to rematerialize a stack
> object address, which may produce a different address post-prctl.

If you want to guarantee that ADDG always returns tag 0, I guess that's
only possible with a default exclude mask of 0xffff (or if you are
careful enough with the start tag and offset passed).

> Setting the exclude mask to 0xffff would at least make it more likely
> for this problem to be detected, though.

I thought it would be detected if we didn't have a 0xffff default
exclude mask. With only tag 0 generated, any such problem could be
hidden.

> If we change the default in this way, maybe it would be worth
> considering flipping the meaning of the tag mask and have it be a mask
> of tags to allow. That would be consistent with the existing behaviour
> where userspace sets bits in tagged_addr_ctrl in order to enable
> tagging features.

The first question is whether the C runtime requires a default
GCR_EL1.Excl mask of 0xffff (or 0xfffe) so that IRG, ADDG, SUBG always
generate tag 0. If the runtime is fine with a default exclude mask of 0,
I'm tempted to go back to an exclude mask for prctl().

(to me it feels more natural to use an exclude mask as it matches the
ARM ARM definition but maybe I stare too much at the hardware specs ;))
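
As a sketch of how mechanical the difference is (the PR_MTE_TAG_* names
below are hypothetical, not part of this series):

	/* include-mask flavour: user passes the tags that may be generated */
	u64 incl = (arg & PR_MTE_TAG_MASK) >> PR_MTE_TAG_SHIFT;	/* hypothetical */
	/* the value programmed into GCR_EL1.Excl is simply the complement */
	u64 excl = ~incl & 0xffff;

Either polarity carries the same 16 bits of information; the real question
is which default is friendlier to the C runtime.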

-- 
Catalin



* Re: [PATCH 20/22] arm64: mte: Allow user control of the excluded tags via prctl()
  2020-06-22 17:17       ` Catalin Marinas
@ 2020-06-22 19:00         ` Peter Collingbourne
  2020-06-23 16:42           ` Catalin Marinas
  0 siblings, 1 reply; 51+ messages in thread
From: Peter Collingbourne @ 2020-06-22 19:00 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Kevin Brodsky, Linux ARM, Will Deacon, Marc Zyngier,
	Vincenzo Frascino, Szabolcs Nagy, Richard Earnshaw,
	Andrey Konovalov, linux-mm, linux-arch, Branislav Rankov,
	Dave P Martin

On Mon, Jun 22, 2020 at 10:17 AM Catalin Marinas
<catalin.marinas@arm.com> wrote:
>
> Hi Peter,
>
> Revisiting the gcr_excl vs gcr_incl decision, so reviving an old thread.
>
> On Mon, Dec 16, 2019 at 09:30:36AM -0800, Peter Collingbourne wrote:
> > On Mon, Dec 16, 2019 at 6:20 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
> > > In this patch, the default exclusion mask remains 0 (i.e. all tags can be generated).
> > > After some more discussions, Branislav and I think that it would be better to start
> > > with the reverse, i.e. all tags but 0 excluded (mask = 0xfe or 0xff).
> > >
> > > This should simplify the MTE setup in the early C runtime quite a bit. Indeed, if all
> > > tags can be generated, doing any heap or stack tagging before the
> > > PR_SET_TAGGED_ADDR_CTRL prctl() is issued can cause problems, notably because tagged
> > > addresses could end up being passed to syscalls. Conversely, if IRG and ADDG never
> > > set the top byte by default, then tagging operations should be no-ops until the
> > > prctl() is issued. This would be particularly useful given that it may not be
> > > straightforward for the C runtime to issue the prctl() before doing anything else.
> > >
> > > Additionally, since the default tag checking mode is PR_MTE_TCF_NONE, it would make
> > > perfect sense not to generate tags by default.
> >
> > This would indeed allow the early C runtime startup code to pass
> > tagged addresses to syscalls,
>
> I guess you meant that early C runtime code won't get tagged stack
> addresses, hence they can be passed to syscalls. Prior to the prctl(),
> the kernel doesn't accept tagged addresses anyway.

Right.

> > but I don't think it would entirely free
> > the code from the burden of worrying about stack tagging. Either way,
> > any stack frames that are active at the point when the prctl() is
> > issued would need to be compiled without stack tagging, because
> > otherwise those stack frames may use ADDG to rematerialize a stack
> > object address, which may produce a different address post-prctl.
>
> If you want to guarantee that ADDG always returns tag 0, I guess that's
> only possible with a default exclude mask of 0xffff (or if you are
> careful enough with the start tag and offset passed).
>
> > Setting the exclude mask to 0xffff would at least make it more likely
> > for this problem to be detected, though.
>
> I thought it would be detected if we didn't have a 0xffff default
> exclude mask. With only tag 0 generated, any such problem could be
> hidden.

I don't think that's the case, as long as you aren't using 0 as a
catch-all tag. Imagine that you have some hypothetical startup code
that looks like this:

void init() {
  bool called_prctl = false;
  // effect is to change GCR_EL1.Excl from 0xffff to 1
  prctl(PR_SET_TAGGED_ADDR_CTRL, ...);
  called_prctl = true;
}

This may be compiled as something like (well, a real compiler wouldn't
compile it like this but rather use sp-relative stores or eliminate
the dead stores entirely, but imagine that the stores to called_prctl
are obfuscated somehow, e.g. in another translation unit):

sub x19, sp, #16
irg x19, x19 // compute a tag base for the function
addg x0, x19, #0, #1 // add tag offset for "called_prctl"
stzg x0, [x0]
bl prctl
addg x0, x19, #0, #1 // rematerialize "called_prctl" address
mov w1, #1
strb w1, [x0]
ret

The first addg will materialize a tag of 0 due to the default Excl
value, so the stzg will set the memory tag to 0. However, the second
addg will materialize a tag of 1 because of the new Excl value, which
will result in a tag fault in the strb instruction.

This problem is less likely to be detected if we transition Excl from
0 to 1. It will only be detected in the case where the irg instruction
produces a tag of 0xf, which would be incremented to 0 by the first
addg but to 1 by the second one.

> > If we change the default in this way, maybe it would be worth
> > considering flipping the meaning of the tag mask and have it be a mask
> > of tags to allow. That would be consistent with the existing behaviour
> > where userspace sets bits in tagged_addr_ctrl in order to enable
> > tagging features.
>
> The first question is whether the C runtime requires a default
> GCR_EL1.Excl mask of 0xffff (or 0xfffe) so that IRG, ADDG, SUBG always
> generate tag 0. If the runtime is fine with a default exclude mask of 0,
> I'm tempted to go back to an exclude mask for prctl().
>
> (to me it feels more natural to use an exclude mask as it matches the
> ARM ARM definition but maybe I stare too much at the hardware specs ;))

I think that would be fine with me. With the transition from 0 to 1
the above problem would still be detected, but only 1/16 of the time.
But if the problem exists in early startup code, which is executed
many times during a typical system boot, the problem is likely to be
detected eventually.
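As a back-of-the-envelope figure (assuming IRG generates tags
uniformly at random): a single run only detects the mismatch when IRG
happens to produce 0xf, so across n independent runs the detection
probability is 1 - (15/16)^n, i.e. roughly 48% after 10 runs and 99.8%
after 100.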

Peter


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 20/22] arm64: mte: Allow user control of the excluded tags via prctl()
  2020-06-22 19:00         ` Peter Collingbourne
@ 2020-06-23 16:42           ` Catalin Marinas
  0 siblings, 0 replies; 51+ messages in thread
From: Catalin Marinas @ 2020-06-23 16:42 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Kevin Brodsky, Linux ARM, Will Deacon, Marc Zyngier,
	Vincenzo Frascino, Szabolcs Nagy, Richard Earnshaw,
	Andrey Konovalov, linux-mm, linux-arch, Branislav Rankov,
	Dave P Martin

On Mon, Jun 22, 2020 at 12:00:48PM -0700, Peter Collingbourne wrote:
> On Mon, Jun 22, 2020 at 10:17 AM Catalin Marinas
> <catalin.marinas@arm.com> wrote:
> > On Mon, Dec 16, 2019 at 09:30:36AM -0800, Peter Collingbourne wrote:
> > > On Mon, Dec 16, 2019 at 6:20 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
> > > > In this patch, the default exclusion mask remains 0 (i.e. all tags can be generated).
> > > > After some more discussions, Branislav and I think that it would be better to start
> > > > with the reverse, i.e. all tags but 0 excluded (mask = 0xfffe or 0xffff).
> > > >
> > > > This should simplify the MTE setup in the early C runtime quite a bit. Indeed, if all
> > > > tags can be generated, doing any heap or stack tagging before the
> > > > PR_SET_TAGGED_ADDR_CTRL prctl() is issued can cause problems, notably because tagged
> > > > addresses could end up being passed to syscalls. Conversely, if IRG and ADDG never
> > > > set the top byte by default, then tagging operations should be no-ops until the
> > > > prctl() is issued. This would be particularly useful given that it may not be
> > > > straightforward for the C runtime to issue the prctl() before doing anything else.
> > > >
> > > > Additionally, since the default tag checking mode is PR_MTE_TCF_NONE, it would make
> > > > perfect sense not to generate tags by default.
> > >
> > > This would indeed allow the early C runtime startup code to pass
> > > tagged addresses to syscalls,
[...]
> > > but I don't think it would entirely free
> > > the code from the burden of worrying about stack tagging. Either way,
> > > any stack frames that are active at the point when the prctl() is
> > > issued would need to be compiled without stack tagging, because
> > > otherwise those stack frames may use ADDG to rematerialize a stack
> > > object address, which may produce a different address post-prctl.
[...]
> > > Setting the exclude mask to 0xffff would at least make it more likely
> > > for this problem to be detected, though.
> >
> > I thought it would be detected if we didn't have a 0xffff default
> > exclude mask. With only tag 0 generated, any such problem could be
> > hidden.
> 
> I don't think that's the case, as long as you aren't using 0 as a
> catch-all tag. Imagine that you have some hypothetical startup code
> that looks like this:
> 
> void init() {
>   bool called_prctl = false;
>   // The effect of this call is to change GCR_EL1.Excl from 0xffff to 1.
>   prctl(PR_SET_TAGGED_ADDR_CTRL, ...);
>   called_prctl = true;
> }
> 
> This may be compiled to something like the following (a real compiler
> wouldn't compile it quite like this, but would instead use sp-relative
> stores or eliminate the dead stores entirely; imagine that the stores
> to called_prctl are obfuscated somehow, e.g. hidden in another
> translation unit):
> 
> sub x19, sp, #16     // address of a 16-byte stack slot (one tag granule)
> irg x19, x19         // compute a tag base for the function
> addg x0, x19, #0, #1 // add tag offset for "called_prctl"
> stzg x0, [x0]        // store the allocation tag and zero the granule
> bl prctl
> addg x0, x19, #0, #1 // rematerialize "called_prctl" address
> mov w1, #1
> strb w1, [x0]        // called_prctl = true; faults if the tag changed
> ret
> 
> The first addg will materialize a tag of 0 due to the default Excl
> value, so the stzg will set the memory tag to 0. However, the second
> addg will materialize a tag of 1 because of the new Excl value, which
> will result in a tag fault in the strb instruction.
> 
> This problem is less likely to be detected if we transition Excl from
> 0 to 1. It will only be detected in the case where the irg instruction
> produces a tag of 0xf, which would be incremented to 0 by the first
> addg but to 1 by the second one.

Thanks for the explanation. For some reason I thought ADDG would only be
used once (per variable or frame). I now agree that a default exclude
mask of 0xffff would catch such issues early.

> > > If we change the default in this way, maybe it would be worth
> > > considering flipping the meaning of the tag mask and have it be a mask
> > > of tags to allow. That would be consistent with the existing behaviour
> > > where userspace sets bits in tagged_addr_ctrl in order to enable
> > > tagging features.
> >
> > The first question is whether the C runtime requires a default
> > GCR_EL1.Excl mask of 0xffff (or 0xfffe) so that IRG, ADDG, SUBG always
> > generate tag 0. If the runtime is fine with a default exclude mask of 0,
> > I'm tempted to go back to an exclude mask for prctl().
> >
> > (to me it feels more natural to use an exclude mask as it matches the
> > ARM ARM definition but maybe I stare too much at the hardware specs ;))
> 
> I think that would be fine with me. With the transition from 0 to 1
> the above problem would still be detected, but only 1/16 of the time.
> But if the problem exists in early startup code, which is executed
> many times during a typical system boot, the problem is likely to be
> detected eventually.

I'm not a big fan of hitting a problem 1/16 of the time; it makes
debugging harder. So I'll stick to a default exclude mask of 0xffff, in
which case it makes sense to invert the polarity for prctl() and make
it an include mask (as in v4 of the patchset).
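For concreteness, enabling MTE under this include-mask ABI could look
like the sketch below. The PR_* names and values are taken from the v4
patches and should be treated as provisional until the ABI is final:

#include <sys/prctl.h>

/* Provisional constants from the v4 patches, guarded in case the libc
 * headers already provide them. */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE   (1UL << 0)
#endif
#ifndef PR_MTE_TCF_SYNC
#define PR_MTE_TCF_SYNC         (1UL << 1)
#define PR_MTE_TAG_SHIFT        3
#endif

/* Enable the tagged address ABI, synchronous tag check faults, and an
 * include mask allowing IRG/ADDG to generate tags 1-15, keeping tag 0
 * as a catch-all. */
static int enable_mte(void)
{
        unsigned long ctrl = PR_TAGGED_ADDR_ENABLE
                           | PR_MTE_TCF_SYNC
                           | (0xfffeUL << PR_MTE_TAG_SHIFT);

        return prctl(PR_SET_TAGGED_ADDR_CTRL, ctrl, 0, 0, 0);
}

An include mask of 0 would keep the "all tags excluded" default, so
IRG/ADDG keep returning tag 0 until the process explicitly opts in.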

Thanks.

-- 
Catalin


^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2020-06-23 16:42 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-11 18:40 [PATCH 00/22] arm64: Memory Tagging Extension user-space support Catalin Marinas
2019-12-11 18:40 ` [PATCH 01/22] mm: Reserve asm-generic prot flags 0x10 and 0x20 for arch use Catalin Marinas
2019-12-11 19:26   ` Arnd Bergmann
2019-12-11 18:40 ` [PATCH 02/22] kbuild: Add support for 'as-instr' to be used in Kconfig files Catalin Marinas
2019-12-12  5:03   ` Masahiro Yamada
2019-12-11 18:40 ` [PATCH 03/22] arm64: alternative: Allow alternative_insn to always issue the first instruction Catalin Marinas
2019-12-11 18:40 ` [PATCH 04/22] arm64: Use macros instead of hard-coded constants for MAIR_EL1 Catalin Marinas
2019-12-11 18:40 ` [PATCH 05/22] arm64: mte: system register definitions Catalin Marinas
2019-12-11 18:40 ` [PATCH 06/22] arm64: mte: CPU feature detection and initial sysreg configuration Catalin Marinas
2019-12-11 18:40 ` [PATCH 07/22] arm64: mte: Use Normal Tagged attributes for the linear map Catalin Marinas
2019-12-11 18:40 ` [PATCH 08/22] arm64: mte: Assembler macros and default architecture for .S files Catalin Marinas
2019-12-11 18:40 ` [PATCH 09/22] arm64: mte: Tags-aware clear_page() implementation Catalin Marinas
2019-12-11 18:40 ` [PATCH 10/22] arm64: mte: Tags-aware copy_page() implementation Catalin Marinas
2019-12-11 18:40 ` [PATCH 11/22] arm64: Tags-aware memcmp_pages() implementation Catalin Marinas
2019-12-11 18:40 ` [PATCH 12/22] arm64: mte: Add specific SIGSEGV codes Catalin Marinas
2019-12-11 19:31   ` Arnd Bergmann
2019-12-12  9:34     ` Catalin Marinas
2019-12-12 18:26     ` Eric W. Biederman
2019-12-17 17:48       ` Catalin Marinas
2019-12-17 20:06         ` Eric W. Biederman
2019-12-11 18:40 ` [PATCH 13/22] arm64: mte: Handle synchronous and asynchronous tag check faults Catalin Marinas
2019-12-14  1:43   ` Peter Collingbourne
2019-12-17 18:01     ` Catalin Marinas
2019-12-20  1:36       ` [PATCH] arm64: mte: Do not service syscalls after async tag fault Peter Collingbourne
2020-02-12 11:09         ` Catalin Marinas
2020-02-18 21:59           ` Peter Collingbourne
2020-02-19 16:16             ` Catalin Marinas
2019-12-11 18:40 ` [PATCH 14/22] mm: Introduce arch_calc_vm_flag_bits() Catalin Marinas
2019-12-11 18:40 ` [PATCH 15/22] arm64: mte: Add PROT_MTE support to mmap() and mprotect() Catalin Marinas
2020-01-21 22:06   ` Peter Collingbourne
2019-12-11 18:40 ` [PATCH 16/22] mm: Introduce arch_validate_flags() Catalin Marinas
2019-12-11 18:40 ` [PATCH 17/22] arm64: mte: Validate the PROT_MTE request via arch_validate_flags() Catalin Marinas
2019-12-11 18:40 ` [PATCH 18/22] mm: Allow arm64 mmap(PROT_MTE) on RAM-based files Catalin Marinas
2019-12-11 18:40 ` [PATCH 19/22] arm64: mte: Allow user control of the tag check mode via prctl() Catalin Marinas
2019-12-19 20:32   ` Peter Collingbourne
2019-12-20  1:48     ` [PATCH] arm64: mte: Clear SCTLR_EL1.TCF0 on exec Peter Collingbourne
2020-02-12 17:03       ` Catalin Marinas
2019-12-27 14:34   ` [PATCH 19/22] arm64: mte: Allow user control of the tag check mode via prctl() Kevin Brodsky
2020-02-12 11:45     ` Catalin Marinas
2019-12-11 18:40 ` [PATCH 20/22] arm64: mte: Allow user control of the excluded tags " Catalin Marinas
2019-12-16 14:20   ` Kevin Brodsky
2019-12-16 17:30     ` Peter Collingbourne
2019-12-17 17:56       ` Catalin Marinas
2020-06-22 17:17       ` Catalin Marinas
2020-06-22 19:00         ` Peter Collingbourne
2020-06-23 16:42           ` Catalin Marinas
2019-12-11 18:40 ` [PATCH 21/22] arm64: mte: Kconfig entry Catalin Marinas
2019-12-11 18:40 ` [PATCH 22/22] arm64: mte: Add Memory Tagging Extension documentation Catalin Marinas
2019-12-24 15:03   ` Kevin Brodsky
2019-12-13 18:05 ` [PATCH 00/22] arm64: Memory Tagging Extension user-space support Peter Collingbourne
2020-02-13 11:23   ` Catalin Marinas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).