* [PATCH v4 00/26] arm64: Memory Tagging Extension user-space support
From: Catalin Marinas @ 2020-05-15 17:15 UTC
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne

This is the fourth version (third version here [1]) of the series adding
user-space support for the ARMv8.5 Memory Tagging Extension ([2], [3]).
The patches are also available on this branch:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux devel/mte-v4

Changes in this version:

- Swap and suspend-to-disk support for saving/restoring tags.

- Deferred the tag zeroing from clear_page() to set_pte_at() to cope
  with RAM filesystems that do not start from a zeroed page. The
  copy_page() function was also optimised to only copy tags if the page
  has been previously mapped as tagged (PROT_MTE). This mechanism
  requires a new PG_arch_2 flag.

- memcmp_pages() rewritten to always report a mismatch if at least one
  of the pages is tagged (the tag comparison is avoided).

- ptrace() updated to prevent accessing tags in a page that has not been
  mapped with PROT_MTE (PG_arch_2 flag not set).

- copy_mount_options() fix re-implemented to avoid
  arch_has_exact_copy_from_user().

- The CPUID handling has been reworked to ensure that, when the feature
  is not backed by the DT, the HWCAP is also hidden from the user. Note
  that whether the DT description is needed at all is still under
  internal discussion.

- A new early param, arm64.mte_disable, was introduced to facilitate
  testing with and without MTE on platforms that support it.

- Asynchronous TCF SIGSEGV is no longer forced, so it can be ignored by
  the user thread.

- CONFIG_ARM64_MTE is now default y since swap is supported.

To do or discuss:

- prctl() accepting an include vs exclude mask for the GCR_EL1.Excl
  field. There is an ongoing discussion on v3, which accepts an include
  mask with 0 being a special case equivalent to 1. If this is not
  desirable, we can change this to an exclude mask (see the usage sketch
  after this list).

- mmap(tagged_addr, PROT_MTE) pre-tagging the memory with the tag given
  in the tagged_addr hint.

- ptrace() to expose the prctl() configuration for the user thread (or
  the TCF and GCR_EL1.Excl fields).

- coredump (user) to also dump the tags.

- Kselftest patches will be made available.
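
As a rough illustration of the user-space interface proposed by this
series, a minimal sketch follows. The PROT_MTE and PR_MTE_* names and
values mirror the uapi additions in this series and may still change
during review; error handling and the HWCAP probe are omitted, and the
GCR_EL1.Excl mask control is left out since its include-vs-exclude
semantics are the subject of the first item above:

	#include <stdio.h>
	#include <sys/mman.h>
	#include <sys/prctl.h>

	/* fallbacks in case the libc headers do not carry these yet */
	#ifndef PR_SET_TAGGED_ADDR_CTRL
	#define PR_SET_TAGGED_ADDR_CTRL	55
	#define PR_TAGGED_ADDR_ENABLE	(1UL << 0)
	#endif
	#ifndef PR_MTE_TCF_SYNC
	#define PR_MTE_TCF_SYNC		(1UL << 1)	/* synchronous tag check faults */
	#endif
	#ifndef PROT_MTE
	#define PROT_MTE		0x20
	#endif

	int main(void)
	{
		char *p;

		/* enable the tagged address ABI with synchronous tag checking */
		if (prctl(PR_SET_TAGGED_ADDR_CTRL,
			  PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC, 0, 0, 0))
			return 1;

		/* anonymous mapping with MTE enabled; its tags start out zeroed */
		p = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_MTE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED)
			return 1;

		/* an untagged (tag 0) pointer matches the zeroed allocation tags */
		p[0] = 42;
		printf("%d\n", p[0]);
		return 0;
	}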

[1] https://lore.kernel.org/linux-arm-kernel/20200421142603.3894-1-catalin.marinas@arm.com/
[2] https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/enhancing-memory-safety
[3] https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/Arm_Memory_Tagging_Extension_Whitepaper.pdf

Catalin Marinas (14):
  arm64: mte: Use Normal Tagged attributes for the linear map
  arm64: mte: Clear the tags when a page is mapped in user-space with
    PROT_MTE
  arm64: mte: Tags-aware memcmp_pages() implementation
  arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  mm: Introduce arch_validate_flags()
  arm64: mte: Validate the PROT_MTE request via arch_validate_flags()
  mm: Allow arm64 mmap(PROT_MTE) on RAM-based files
  arm64: mte: Allow user control of the tag check mode via prctl()
  arm64: mte: Allow user control of the generated random tags via
    prctl()
  arm64: mte: Restore the GCR_EL1 register after a suspend
  arm64: mte: Add PTRACE_{PEEK,POKE}MTETAGS support
  fs: Handle intra-page faults in copy_mount_options()
  arm64: mte: Check the DT memory nodes for MTE support
  arm64: mte: Introduce early param to disable MTE support

Kevin Brodsky (1):
  mm: Introduce arch_calc_vm_flag_bits()

Steven Price (4):
  mm: Add PG_ARCH_2 page flag
  mm: Add arch hooks for saving/restoring tags
  arm64: mte: Enable swap of tagged pages
  arm64: mte: Save tags when hibernating

Vincenzo Frascino (7):
  arm64: mte: system register definitions
  arm64: mte: CPU feature detection and initial sysreg configuration
  arm64: mte: Add specific SIGSEGV codes
  arm64: mte: Handle synchronous and asynchronous tag check faults
  arm64: mte: Tags-aware copy_page() implementation
  arm64: mte: Kconfig entry
  arm64: mte: Add Memory Tagging Extension documentation

 .../admin-guide/kernel-parameters.txt         |   4 +
 Documentation/arm64/cpu-feature-registers.rst |   2 +
 Documentation/arm64/elf_hwcaps.rst            |   5 +
 Documentation/arm64/index.rst                 |   1 +
 .../arm64/memory-tagging-extension.rst        | 297 ++++++++++++++++
 arch/arm64/Kconfig                            |  33 ++
 arch/arm64/boot/dts/arm/fvp-base-revc.dts     |   1 +
 arch/arm64/include/asm/assembler.h            |  12 +
 arch/arm64/include/asm/cpucaps.h              |   4 +-
 arch/arm64/include/asm/cpufeature.h           |  12 +-
 arch/arm64/include/asm/hwcap.h                |   1 +
 arch/arm64/include/asm/kvm_arm.h              |   3 +-
 arch/arm64/include/asm/memory.h               |  17 +-
 arch/arm64/include/asm/mman.h                 |  78 +++++
 arch/arm64/include/asm/mte.h                  |  86 +++++
 arch/arm64/include/asm/page.h                 |   2 +-
 arch/arm64/include/asm/pgtable-prot.h         |   2 +
 arch/arm64/include/asm/pgtable.h              |  45 ++-
 arch/arm64/include/asm/processor.h            |   4 +
 arch/arm64/include/asm/sysreg.h               |  62 ++++
 arch/arm64/include/asm/thread_info.h          |   4 +-
 arch/arm64/include/uapi/asm/hwcap.h           |   2 +
 arch/arm64/include/uapi/asm/mman.h            |  14 +
 arch/arm64/include/uapi/asm/ptrace.h          |   4 +
 arch/arm64/kernel/Makefile                    |   1 +
 arch/arm64/kernel/cpufeature.c                | 126 ++++++-
 arch/arm64/kernel/cpuinfo.c                   |   2 +
 arch/arm64/kernel/entry.S                     |  37 ++
 arch/arm64/kernel/hibernate.c                 | 118 +++++++
 arch/arm64/kernel/mte.c                       | 324 ++++++++++++++++++
 arch/arm64/kernel/process.c                   |  31 +-
 arch/arm64/kernel/ptrace.c                    |   9 +-
 arch/arm64/kernel/signal.c                    |   8 +
 arch/arm64/kernel/suspend.c                   |   4 +
 arch/arm64/kernel/syscall.c                   |  10 +
 arch/arm64/lib/Makefile                       |   2 +
 arch/arm64/lib/mte.S                          | 140 ++++++++
 arch/arm64/mm/Makefile                        |   1 +
 arch/arm64/mm/copypage.c                      |  14 +-
 arch/arm64/mm/dump.c                          |   4 +
 arch/arm64/mm/fault.c                         |   9 +-
 arch/arm64/mm/mmu.c                           |  22 +-
 arch/arm64/mm/mteswap.c                       |  82 +++++
 arch/arm64/mm/proc.S                          |   8 +-
 arch/x86/kernel/signal_compat.c               |   2 +-
 fs/namespace.c                                |  24 +-
 fs/proc/page.c                                |   3 +
 fs/proc/task_mmu.c                            |   4 +
 include/asm-generic/pgtable.h                 |  23 ++
 include/linux/kernel-page-flags.h             |   1 +
 include/linux/mm.h                            |   8 +
 include/linux/mman.h                          |  22 +-
 include/linux/page-flags.h                    |   3 +
 include/trace/events/mmflags.h                |   9 +-
 include/uapi/asm-generic/siginfo.h            |   4 +-
 include/uapi/linux/prctl.h                    |   9 +
 mm/Kconfig                                    |   3 +
 mm/mmap.c                                     |   9 +
 mm/mprotect.c                                 |   6 +
 mm/page_io.c                                  |  10 +
 mm/shmem.c                                    |   9 +
 mm/swapfile.c                                 |   2 +
 mm/util.c                                     |   2 +-
 tools/vm/page-types.c                         |   2 +
 64 files changed, 1765 insertions(+), 37 deletions(-)
 create mode 100644 Documentation/arm64/memory-tagging-extension.rst
 create mode 100644 arch/arm64/include/asm/mman.h
 create mode 100644 arch/arm64/include/asm/mte.h
 create mode 100644 arch/arm64/include/uapi/asm/mman.h
 create mode 100644 arch/arm64/kernel/mte.c
 create mode 100644 arch/arm64/lib/mte.S
 create mode 100644 arch/arm64/mm/mteswap.c

* [PATCH v4 01/26] arm64: mte: system register definitions
From: Catalin Marinas @ 2020-05-15 17:15 UTC
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

Add Memory Tagging Extension system register definitions together with
the relevant bitfields.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---

Notes:
    v2:
    - Added SET_PSTATE_TCO() macro.
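
    As context for the new macro: like SET_PSTATE_PAN() and
    SET_PSTATE_UAO(), SET_PSTATE_TCO() emits an MSR-immediate
    instruction, so PSTATE.TCO can be toggled from inline assembly. An
    illustrative sketch only; the actual users are added by later
    patches in this series:

	/* TCO = 1: tag check faults are suppressed in this window */
	asm volatile(SET_PSTATE_TCO(1));
	/* ...accesses that must not fault on mismatched tags... */
	asm volatile(SET_PSTATE_TCO(0));	/* re-enable tag checking */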

 arch/arm64/include/asm/kvm_arm.h     |  1 +
 arch/arm64/include/asm/sysreg.h      | 54 ++++++++++++++++++++++++++++
 arch/arm64/include/uapi/asm/ptrace.h |  1 +
 arch/arm64/kernel/ptrace.c           |  2 +-
 4 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 51c1d9918999..8a1cbfd544d6 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -12,6 +12,7 @@
 #include <asm/types.h>
 
 /* Hyp Configuration Register (HCR) bits */
+#define HCR_ATA		(UL(1) << 56)
 #define HCR_FWB		(UL(1) << 46)
 #define HCR_API		(UL(1) << 41)
 #define HCR_APK		(UL(1) << 40)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index c4ac0ac25a00..e823e93b7429 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -91,10 +91,12 @@
 #define PSTATE_PAN			pstate_field(0, 4)
 #define PSTATE_UAO			pstate_field(0, 3)
 #define PSTATE_SSBS			pstate_field(3, 1)
+#define PSTATE_TCO			pstate_field(3, 4)
 
 #define SET_PSTATE_PAN(x)		__emit_inst(0xd500401f | PSTATE_PAN | ((!!x) << PSTATE_Imm_shift))
 #define SET_PSTATE_UAO(x)		__emit_inst(0xd500401f | PSTATE_UAO | ((!!x) << PSTATE_Imm_shift))
 #define SET_PSTATE_SSBS(x)		__emit_inst(0xd500401f | PSTATE_SSBS | ((!!x) << PSTATE_Imm_shift))
+#define SET_PSTATE_TCO(x)		__emit_inst(0xd500401f | PSTATE_TCO | ((!!x) << PSTATE_Imm_shift))
 
 #define __SYS_BARRIER_INSN(CRm, op2, Rt) \
 	__emit_inst(0xd5000000 | sys_insn(0, 3, 3, (CRm), (op2)) | ((Rt) & 0x1f))
@@ -174,6 +176,8 @@
 #define SYS_SCTLR_EL1			sys_reg(3, 0, 1, 0, 0)
 #define SYS_ACTLR_EL1			sys_reg(3, 0, 1, 0, 1)
 #define SYS_CPACR_EL1			sys_reg(3, 0, 1, 0, 2)
+#define SYS_RGSR_EL1			sys_reg(3, 0, 1, 0, 5)
+#define SYS_GCR_EL1			sys_reg(3, 0, 1, 0, 6)
 
 #define SYS_ZCR_EL1			sys_reg(3, 0, 1, 2, 0)
 
@@ -211,6 +215,8 @@
 #define SYS_ERXADDR_EL1			sys_reg(3, 0, 5, 4, 3)
 #define SYS_ERXMISC0_EL1		sys_reg(3, 0, 5, 5, 0)
 #define SYS_ERXMISC1_EL1		sys_reg(3, 0, 5, 5, 1)
+#define SYS_TFSR_EL1			sys_reg(3, 0, 5, 6, 0)
+#define SYS_TFSRE0_EL1			sys_reg(3, 0, 5, 6, 1)
 
 #define SYS_FAR_EL1			sys_reg(3, 0, 6, 0, 0)
 #define SYS_PAR_EL1			sys_reg(3, 0, 7, 4, 0)
@@ -361,6 +367,7 @@
 
 #define SYS_CCSIDR_EL1			sys_reg(3, 1, 0, 0, 0)
 #define SYS_CLIDR_EL1			sys_reg(3, 1, 0, 0, 1)
+#define SYS_GMID_EL1			sys_reg(3, 1, 0, 0, 4)
 #define SYS_AIDR_EL1			sys_reg(3, 1, 0, 0, 7)
 
 #define SYS_CSSELR_EL1			sys_reg(3, 2, 0, 0, 0)
@@ -453,6 +460,7 @@
 #define SYS_ESR_EL2			sys_reg(3, 4, 5, 2, 0)
 #define SYS_VSESR_EL2			sys_reg(3, 4, 5, 2, 3)
 #define SYS_FPEXC32_EL2			sys_reg(3, 4, 5, 3, 0)
+#define SYS_TFSR_EL2			sys_reg(3, 4, 5, 6, 0)
 #define SYS_FAR_EL2			sys_reg(3, 4, 6, 0, 0)
 
 #define SYS_VDISR_EL2			sys_reg(3, 4, 12, 1,  1)
@@ -509,6 +517,7 @@
 #define SYS_AFSR0_EL12			sys_reg(3, 5, 5, 1, 0)
 #define SYS_AFSR1_EL12			sys_reg(3, 5, 5, 1, 1)
 #define SYS_ESR_EL12			sys_reg(3, 5, 5, 2, 0)
+#define SYS_TFSR_EL12			sys_reg(3, 5, 5, 6, 0)
 #define SYS_FAR_EL12			sys_reg(3, 5, 6, 0, 0)
 #define SYS_MAIR_EL12			sys_reg(3, 5, 10, 2, 0)
 #define SYS_AMAIR_EL12			sys_reg(3, 5, 10, 3, 0)
@@ -524,6 +533,15 @@
 
 /* Common SCTLR_ELx flags. */
 #define SCTLR_ELx_DSSBS	(BIT(44))
+#define SCTLR_ELx_ATA	(BIT(43))
+
+#define SCTLR_ELx_TCF_SHIFT	40
+#define SCTLR_ELx_TCF_NONE	(UL(0x0) << SCTLR_ELx_TCF_SHIFT)
+#define SCTLR_ELx_TCF_SYNC	(UL(0x1) << SCTLR_ELx_TCF_SHIFT)
+#define SCTLR_ELx_TCF_ASYNC	(UL(0x2) << SCTLR_ELx_TCF_SHIFT)
+#define SCTLR_ELx_TCF_MASK	(UL(0x3) << SCTLR_ELx_TCF_SHIFT)
+
+#define SCTLR_ELx_ITFSB	(BIT(37))
 #define SCTLR_ELx_ENIA	(BIT(31))
 #define SCTLR_ELx_ENIB	(BIT(30))
 #define SCTLR_ELx_ENDA	(BIT(27))
@@ -552,6 +570,14 @@
 #endif
 
 /* SCTLR_EL1 specific flags. */
+#define SCTLR_EL1_ATA0		(BIT(42))
+
+#define SCTLR_EL1_TCF0_SHIFT	38
+#define SCTLR_EL1_TCF0_NONE	(UL(0x0) << SCTLR_EL1_TCF0_SHIFT)
+#define SCTLR_EL1_TCF0_SYNC	(UL(0x1) << SCTLR_EL1_TCF0_SHIFT)
+#define SCTLR_EL1_TCF0_ASYNC	(UL(0x2) << SCTLR_EL1_TCF0_SHIFT)
+#define SCTLR_EL1_TCF0_MASK	(UL(0x3) << SCTLR_EL1_TCF0_SHIFT)
+
 #define SCTLR_EL1_UCI		(BIT(26))
 #define SCTLR_EL1_E0E		(BIT(24))
 #define SCTLR_EL1_SPAN		(BIT(23))
@@ -586,6 +612,7 @@
 #define MAIR_ATTR_DEVICE_GRE		UL(0x0c)
 #define MAIR_ATTR_NORMAL_NC		UL(0x44)
 #define MAIR_ATTR_NORMAL_WT		UL(0xbb)
+#define MAIR_ATTR_NORMAL_TAGGED		UL(0xf0)
 #define MAIR_ATTR_NORMAL		UL(0xff)
 #define MAIR_ATTR_MASK			UL(0xff)
 
@@ -660,11 +687,16 @@
 
 /* id_aa64pfr1 */
 #define ID_AA64PFR1_SSBS_SHIFT		4
+#define ID_AA64PFR1_MTE_SHIFT		8
 
 #define ID_AA64PFR1_SSBS_PSTATE_NI	0
 #define ID_AA64PFR1_SSBS_PSTATE_ONLY	1
 #define ID_AA64PFR1_SSBS_PSTATE_INSNS	2
 
+#define ID_AA64PFR1_MTE_NI		0x0
+#define ID_AA64PFR1_MTE_EL0		0x1
+#define ID_AA64PFR1_MTE			0x2
+
 /* id_aa64zfr0 */
 #define ID_AA64ZFR0_F64MM_SHIFT		56
 #define ID_AA64ZFR0_F32MM_SHIFT		52
@@ -822,6 +854,28 @@
 #define CPACR_EL1_ZEN_EL0EN	(BIT(17)) /* enable EL0 access, if EL1EN set */
 #define CPACR_EL1_ZEN		(CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN)
 
+/* TCR EL1 Bit Definitions */
+#define SYS_TCR_EL1_TCMA1	(BIT(58))
+#define SYS_TCR_EL1_TCMA0	(BIT(57))
+
+/* GCR_EL1 Definitions */
+#define SYS_GCR_EL1_RRND	(BIT(16))
+#define SYS_GCR_EL1_EXCL_MASK	0xffffUL
+
+/* RGSR_EL1 Definitions */
+#define SYS_RGSR_EL1_TAG_MASK	0xfUL
+#define SYS_RGSR_EL1_SEED_SHIFT	8
+#define SYS_RGSR_EL1_SEED_MASK	0xffffUL
+
+/* GMID_EL1 field definitions */
+#define SYS_GMID_EL1_BS_SHIFT	0
+#define SYS_GMID_EL1_BS_SIZE	4
+
+/* TFSR{,E0}_EL1 bit definitions */
+#define SYS_TFSR_EL1_TF0_SHIFT	0
+#define SYS_TFSR_EL1_TF1_SHIFT	1
+#define SYS_TFSR_EL1_TF0	(UL(1) << SYS_TFSR_EL1_TF0_SHIFT)
+#define SYS_TFSR_EL1_TF1	(UL(1) << SYS_TFSR_EL1_TF1_SHIFT)
 
 /* Safe value for MPIDR_EL1: Bit31:RES1, Bit30:U:0, Bit24:MT:0 */
 #define SYS_MPIDR_SAFE_VAL	(BIT(31))
diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h
index d1bb5b69f1ce..1daf6dda8af0 100644
--- a/arch/arm64/include/uapi/asm/ptrace.h
+++ b/arch/arm64/include/uapi/asm/ptrace.h
@@ -50,6 +50,7 @@
 #define PSR_PAN_BIT	0x00400000
 #define PSR_UAO_BIT	0x00800000
 #define PSR_DIT_BIT	0x01000000
+#define PSR_TCO_BIT	0x02000000
 #define PSR_V_BIT	0x10000000
 #define PSR_C_BIT	0x20000000
 #define PSR_Z_BIT	0x40000000
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index b3d3005d9515..077e352495eb 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1873,7 +1873,7 @@ void syscall_trace_exit(struct pt_regs *regs)
  * We also reserve IL for the kernel; SS is handled dynamically.
  */
 #define SPSR_EL1_AARCH64_RES0_BITS \
-	(GENMASK_ULL(63, 32) | GENMASK_ULL(27, 25) | GENMASK_ULL(23, 22) | \
+	(GENMASK_ULL(63, 32) | GENMASK_ULL(27, 26) | GENMASK_ULL(23, 22) | \
 	 GENMASK_ULL(20, 13) | GENMASK_ULL(11, 10) | GENMASK_ULL(5, 5))
 #define SPSR_EL1_AARCH32_RES0_BITS \
 	(GENMASK_ULL(63, 32) | GENMASK_ULL(22, 22) | GENMASK_ULL(20, 20))

* [PATCH v4 02/26] arm64: mte: CPU feature detection and initial sysreg configuration
From: Catalin Marinas @ 2020-05-15 17:15 UTC
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Suzuki K Poulose

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

Add the cpufeature and hwcap entries to detect the presence of MTE on
the boot CPUs (primary and secondary). If the feature was detected at
boot, any late secondary CPU that does not support it will be parked.

In addition, add the minimum SCTLR_EL1 and HCR_EL2 bits for enabling
MTE. Without the subsequent setting of MAIR, these bits have no effect
on tag checking.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
---
 arch/arm64/include/asm/cpucaps.h    |  4 +++-
 arch/arm64/include/asm/cpufeature.h |  6 ++++++
 arch/arm64/include/asm/hwcap.h      |  1 +
 arch/arm64/include/asm/kvm_arm.h    |  2 +-
 arch/arm64/include/asm/sysreg.h     |  1 +
 arch/arm64/include/uapi/asm/hwcap.h |  2 ++
 arch/arm64/kernel/cpufeature.c      | 30 +++++++++++++++++++++++++++++
 arch/arm64/kernel/cpuinfo.c         |  2 ++
 8 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 8eb5a088ae65..4731ebacff54 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -61,7 +61,9 @@
 #define ARM64_HAS_AMU_EXTN			51
 #define ARM64_HAS_ADDRESS_AUTH			52
 #define ARM64_HAS_GENERIC_AUTH			53
+/* 54 reserved for ARM64_BTI */
+#define ARM64_MTE				55
 
-#define ARM64_NCAPS				54
+#define ARM64_NCAPS				56
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index afe08251ff95..afc315814563 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -674,6 +674,12 @@ static inline bool system_uses_irq_prio_masking(void)
 	       cpus_have_const_cap(ARM64_HAS_IRQ_PRIO_MASKING);
 }
 
+static inline bool system_supports_mte(void)
+{
+	return IS_ENABLED(CONFIG_ARM64_MTE) &&
+		cpus_have_const_cap(ARM64_MTE);
+}
+
 static inline bool system_has_prio_mask_debugging(void)
 {
 	return IS_ENABLED(CONFIG_ARM64_DEBUG_PRIORITY_MASKING) &&
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 0f00265248b5..8b302c88cfeb 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -94,6 +94,7 @@
 #define KERNEL_HWCAP_BF16		__khwcap2_feature(BF16)
 #define KERNEL_HWCAP_DGH		__khwcap2_feature(DGH)
 #define KERNEL_HWCAP_RNG		__khwcap2_feature(RNG)
+#define KERNEL_HWCAP_MTE		__khwcap2_feature(MTE)
 
 /*
  * This yields a mask that user programs can use to figure out what
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 8a1cbfd544d6..6c3b2fc922bb 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -78,7 +78,7 @@
 			 HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | HCR_TLOR | \
 			 HCR_FMO | HCR_IMO)
 #define HCR_VIRT_EXCP_MASK (HCR_VSE | HCR_VI | HCR_VF)
-#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK)
+#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK | HCR_ATA)
 #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
 
 /* TCR_EL2 Registers bits */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index e823e93b7429..86236ae6c4e7 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -604,6 +604,7 @@
 			 SCTLR_EL1_SA0  | SCTLR_EL1_SED  | SCTLR_ELx_I    |\
 			 SCTLR_EL1_DZE  | SCTLR_EL1_UCT                   |\
 			 SCTLR_EL1_NTWE | SCTLR_ELx_IESB | SCTLR_EL1_SPAN |\
+			 SCTLR_ELx_ITFSB| SCTLR_ELx_ATA  | SCTLR_EL1_ATA0 |\
 			 ENDIAN_SET_EL1 | SCTLR_EL1_UCI  | SCTLR_EL1_RES1)
 
 /* MAIR_ELx memory attributes (used by Linux) */
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index 7752d93bb50f..73ac5aede18c 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -73,5 +73,7 @@
 #define HWCAP2_BF16		(1 << 14)
 #define HWCAP2_DGH		(1 << 15)
 #define HWCAP2_RNG		(1 << 16)
+/* bit 17 reserved for HWCAP2_BTI */
+#define HWCAP2_MTE		(1 << 18)
 
 #endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9fac745aa7bb..512a8b24c5df 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -182,6 +182,8 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
 
 static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = {
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_SSBS_SHIFT, 4, ID_AA64PFR1_SSBS_PSTATE_NI),
+	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_MTE),
+		       FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_MTE_SHIFT, 4, ID_AA64PFR1_MTE_NI),
 	ARM64_FTR_END,
 };
 
@@ -1409,6 +1411,18 @@ static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry,
 }
 #endif
 
+#ifdef CONFIG_ARM64_MTE
+static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
+{
+	/* all non-zero tags excluded by default */
+	write_sysreg_s(SYS_GCR_EL1_RRND | SYS_GCR_EL1_EXCL_MASK, SYS_GCR_EL1);
+	write_sysreg_s(0, SYS_TFSR_EL1);
+	write_sysreg_s(0, SYS_TFSRE0_EL1);
+
+	isb();
+}
+#endif /* CONFIG_ARM64_MTE */
+
 /* Internal helper functions to match cpu capability type */
 static bool
 cpucap_late_cpu_optional(const struct arm64_cpu_capabilities *cap)
@@ -1779,6 +1793,19 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.min_field_value = 1,
 	},
 #endif
+#ifdef CONFIG_ARM64_MTE
+	{
+		.desc = "Memory Tagging Extension",
+		.capability = ARM64_MTE,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.matches = has_cpuid_feature,
+		.sys_reg = SYS_ID_AA64PFR1_EL1,
+		.field_pos = ID_AA64PFR1_MTE_SHIFT,
+		.min_field_value = ID_AA64PFR1_MTE,
+		.sign = FTR_UNSIGNED,
+		.cpu_enable = cpu_enable_mte,
+	},
+#endif /* CONFIG_ARM64_MTE */
 	{},
 };
 
@@ -1892,6 +1919,9 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
 	HWCAP_MULTI_CAP(ptr_auth_hwcap_addr_matches, CAP_HWCAP, KERNEL_HWCAP_PACA),
 	HWCAP_MULTI_CAP(ptr_auth_hwcap_gen_matches, CAP_HWCAP, KERNEL_HWCAP_PACG),
 #endif
+#ifdef CONFIG_ARM64_MTE
+	HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_MTE_SHIFT, FTR_UNSIGNED, ID_AA64PFR1_MTE, CAP_HWCAP, KERNEL_HWCAP_MTE),
+#endif /* CONFIG_ARM64_MTE */
 	{},
 };
 
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 86136075ae41..d14b29de2c73 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -92,6 +92,8 @@ static const char *const hwcap_str[] = {
 	"bf16",
 	"dgh",
 	"rng",
+	"",		/* reserved for BTI */
+	"mte",
 	NULL
 };
 

* [PATCH v4 03/26] arm64: mte: Use Normal Tagged attributes for the linear map
From: Catalin Marinas @ 2020-05-15 17:15 UTC
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Suzuki K Poulose

Once user space is given access to tagged memory, the kernel must be
able to clear/save/restore the tags visible to the user. This is done
via the linear map, so map it with the new Tagged attribute. The new
MT_NORMAL_TAGGED index for MAIR_EL1 is initially mapped as Normal
memory and later changed to Normal Tagged via the cpufeature
infrastructure. From the perspective of mismatched attribute aliases,
the Tagged attribute is considered a permission, so the mismatch does
not lead to undefined behaviour.

The empty_zero_page is cleared so that the tags it contains are also
zeroed. The actual tags-aware clear_page() implementation is part of a
subsequent patch.
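
Concretely, MAIR_EL1 holds eight 8-bit attribute fields selected by the
PTE AttrIndx bits; MT_NORMAL_TAGGED is index 6, so the runtime switch
only rewrites bits 55:48. A sketch of the arithmetic, mirroring the
cpu_enable_mte() hunk below (MAIR_ATTRIDX(attr, idx) expands to
(attr) << ((idx) * 8)):

	u64 mair = read_sysreg_s(SYS_MAIR_EL1);

	/* clear the old attribute byte: ~(0xff << 48) */
	mair &= ~MAIR_ATTRIDX(MAIR_ATTR_MASK, MT_NORMAL_TAGGED);
	/* install Normal Tagged: 0xf0 << 48 */
	mair |= MAIR_ATTRIDX(MAIR_ATTR_NORMAL_TAGGED, MT_NORMAL_TAGGED);
	write_sysreg_s(mair, SYS_MAIR_EL1);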

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
---

Notes:
    v3:
    - Restrict the safe attribute change in pgattr_change_is_safe() only to
      Normal to/from Normal-Tagged (the old version allowed any other type as
      long as either the old or the new attribute was Normal(-Tagged)).

 arch/arm64/include/asm/memory.h       |  1 +
 arch/arm64/include/asm/pgtable-prot.h |  2 ++
 arch/arm64/kernel/cpufeature.c        | 30 +++++++++++++++++++++++++++
 arch/arm64/mm/dump.c                  |  4 ++++
 arch/arm64/mm/mmu.c                   | 22 ++++++++++++++++++--
 arch/arm64/mm/proc.S                  |  8 +++++--
 6 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index a1871bb32bb1..472c77a68225 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -136,6 +136,7 @@
 #define MT_NORMAL_NC		3
 #define MT_NORMAL		4
 #define MT_NORMAL_WT		5
+#define MT_NORMAL_TAGGED	6
 
 /*
  * Memory types for Stage-2 translation
diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index 1305e28225fc..9c924b09d5c8 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -39,6 +39,7 @@ extern bool arm64_use_ng_mappings;
 #define PROT_NORMAL_NC		(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_NC))
 #define PROT_NORMAL_WT		(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_WT))
 #define PROT_NORMAL		(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL))
+#define PROT_NORMAL_TAGGED	(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_NORMAL_TAGGED))
 
 #define PROT_SECT_DEVICE_nGnRE	(PROT_SECT_DEFAULT | PMD_SECT_PXN | PMD_SECT_UXN | PMD_ATTRINDX(MT_DEVICE_nGnRE))
 #define PROT_SECT_NORMAL	(PROT_SECT_DEFAULT | PMD_SECT_PXN | PMD_SECT_UXN | PMD_ATTRINDX(MT_NORMAL))
@@ -48,6 +49,7 @@ extern bool arm64_use_ng_mappings;
 #define _HYP_PAGE_DEFAULT	_PAGE_DEFAULT
 
 #define PAGE_KERNEL		__pgprot(PROT_NORMAL)
+#define PAGE_KERNEL_TAGGED	__pgprot(PROT_NORMAL_TAGGED)
 #define PAGE_KERNEL_RO		__pgprot((PROT_NORMAL & ~PTE_WRITE) | PTE_RDONLY)
 #define PAGE_KERNEL_ROX		__pgprot((PROT_NORMAL & ~(PTE_WRITE | PTE_PXN)) | PTE_RDONLY)
 #define PAGE_KERNEL_EXEC	__pgprot(PROT_NORMAL & ~PTE_PXN)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 512a8b24c5df..d2fe8ff72324 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1414,13 +1414,43 @@ static bool can_use_gic_priorities(const struct arm64_cpu_capabilities *entry,
 #ifdef CONFIG_ARM64_MTE
 static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap)
 {
+	u64 mair;
+
 	/* all non-zero tags excluded by default */
 	write_sysreg_s(SYS_GCR_EL1_RRND | SYS_GCR_EL1_EXCL_MASK, SYS_GCR_EL1);
 	write_sysreg_s(0, SYS_TFSR_EL1);
 	write_sysreg_s(0, SYS_TFSRE0_EL1);
 
+	/*
+	 * Update the MT_NORMAL_TAGGED index in MAIR_EL1. Tag checking is
+	 * disabled for the kernel, so there won't be any observable effect
+	 * other than allowing the kernel to read and write tags.
+	 */
+	mair = read_sysreg_s(SYS_MAIR_EL1);
+	mair &= ~MAIR_ATTRIDX(MAIR_ATTR_MASK, MT_NORMAL_TAGGED);
+	mair |= MAIR_ATTRIDX(MAIR_ATTR_NORMAL_TAGGED, MT_NORMAL_TAGGED);
+	write_sysreg_s(mair, SYS_MAIR_EL1);
+
 	isb();
 }
+
+static int __init system_enable_mte(void)
+{
+	if (!system_supports_mte())
+		return 0;
+
+	/* Ensure the TLB does not have stale MAIR attributes */
+	flush_tlb_all();
+
+	/*
+	 * Clear the zero page (again) so that tags are reset. This needs to
+	 * be done via the linear map which has the Tagged attribute.
+	 */
+	clear_page(lm_alias(empty_zero_page));
+
+	return 0;
+}
+core_initcall(system_enable_mte);
 #endif /* CONFIG_ARM64_MTE */
 
 /* Internal helper functions to match cpu capability type */
diff --git a/arch/arm64/mm/dump.c b/arch/arm64/mm/dump.c
index 860c00ec8bd3..416a2404ac83 100644
--- a/arch/arm64/mm/dump.c
+++ b/arch/arm64/mm/dump.c
@@ -165,6 +165,10 @@ static const struct prot_bits pte_bits[] = {
 		.mask	= PTE_ATTRINDX_MASK,
 		.val	= PTE_ATTRINDX(MT_NORMAL),
 		.set	= "MEM/NORMAL",
+	}, {
+		.mask	= PTE_ATTRINDX_MASK,
+		.val	= PTE_ATTRINDX(MT_NORMAL_TAGGED),
+		.set	= "MEM/NORMAL-TAGGED",
 	}
 };
 
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index a374e4f51a62..37bb5b19bdf4 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -121,7 +121,7 @@ static bool pgattr_change_is_safe(u64 old, u64 new)
 	 * The following mapping attributes may be updated in live
 	 * kernel mappings without the need for break-before-make.
 	 */
-	static const pteval_t mask = PTE_PXN | PTE_RDONLY | PTE_WRITE | PTE_NG;
+	pteval_t mask = PTE_PXN | PTE_RDONLY | PTE_WRITE | PTE_NG;
 
 	/* creating or taking down mappings is always safe */
 	if (old == 0 || new == 0)
@@ -135,6 +135,19 @@ static bool pgattr_change_is_safe(u64 old, u64 new)
 	if (old & ~new & PTE_NG)
 		return false;
 
+	if (system_supports_mte()) {
+		/*
+		 * Changing the memory type between Normal and Normal-Tagged
+		 * is safe since Tagged is considered a permission attribute
+		 * from the mismatched attribute aliases perspective.
+		 */
+		if (((old & PTE_ATTRINDX_MASK) == PTE_ATTRINDX(MT_NORMAL) ||
+		     (old & PTE_ATTRINDX_MASK) == PTE_ATTRINDX(MT_NORMAL_TAGGED)) &&
+		    ((new & PTE_ATTRINDX_MASK) == PTE_ATTRINDX(MT_NORMAL) ||
+		     (new & PTE_ATTRINDX_MASK) == PTE_ATTRINDX(MT_NORMAL_TAGGED)))
+			mask |= PTE_ATTRINDX_MASK;
+	}
+
 	return ((old ^ new) & ~mask) == 0;
 }
 
@@ -489,7 +502,12 @@ static void __init map_mem(pgd_t *pgdp)
 		if (memblock_is_nomap(reg))
 			continue;
 
-		__map_memblock(pgdp, start, end, PAGE_KERNEL, flags);
+		/*
+		 * The linear map must allow allocation tags reading/writing
+		 * if MTE is present. Otherwise, it has the same attributes as
+		 * PAGE_KERNEL.
+		 */
+		__map_memblock(pgdp, start, end, PAGE_KERNEL_TAGGED, flags);
 	}
 
 	/*
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 197a9ba2d5ea..48891095eaed 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -44,14 +44,18 @@
 #define TCR_KASAN_FLAGS 0
 #endif
 
-/* Default MAIR_EL1 */
+/*
+ * Default MAIR_EL1. MT_NORMAL_TAGGED is initially mapped as Normal memory and
+ * changed later to Normal Tagged if the system supports MTE.
+ */
 #define MAIR_EL1_SET							\
 	(MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRnE, MT_DEVICE_nGnRnE) |	\
 	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_nGnRE, MT_DEVICE_nGnRE) |	\
 	 MAIR_ATTRIDX(MAIR_ATTR_DEVICE_GRE, MT_DEVICE_GRE) |		\
 	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_NC, MT_NORMAL_NC) |		\
 	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL, MT_NORMAL) |			\
-	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_WT, MT_NORMAL_WT))
+	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL_WT, MT_NORMAL_WT) |		\
+	 MAIR_ATTRIDX(MAIR_ATTR_NORMAL, MT_NORMAL_TAGGED))
 
 #ifdef CONFIG_CPU_PM
 /**

* [PATCH v4 04/26] arm64: mte: Add specific SIGSEGV codes
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:15   ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Arnd Bergmann

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

Add MTE-specific SIGSEGV codes to siginfo.h and update the x86
BUILD_BUG_ON(NSIGSEGV != 7) compile check.

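For user-space code built against older kernel headers, the two new
si_code values can be given fallback definitions; the numbers below
match this patch, but treat them as illustrative until the uapi header
update lands:

  #include <signal.h>

  #ifndef SEGV_MTEAERR
  #define SEGV_MTEAERR	8	/* asynchronous ARM MTE error */
  #define SEGV_MTESERR	9	/* synchronous ARM MTE exception */
  #endif
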
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
[catalin.marinas@arm.com: renamed precise/imprecise to sync/async]
[catalin.marinas@arm.com: dropped #ifdef __aarch64__, renumbered]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Will Deacon <will@kernel.org>
---

Notes:
    v3:
    - Fixed the BUILD_BUG_ON(NSIGSEGV != 7) on x86
    - Updated the commit log
    
    v2:
    - Dropped the #ifdef __aarch64__.
    - Renumbered the SEGV_MTE* values to avoid clash with ADI.

 arch/x86/kernel/signal_compat.c    | 2 +-
 include/uapi/asm-generic/siginfo.h | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/signal_compat.c b/arch/x86/kernel/signal_compat.c
index 9ccbf0576cd0..a7f3e12cfbdb 100644
--- a/arch/x86/kernel/signal_compat.c
+++ b/arch/x86/kernel/signal_compat.c
@@ -27,7 +27,7 @@ static inline void signal_compat_build_tests(void)
 	 */
 	BUILD_BUG_ON(NSIGILL  != 11);
 	BUILD_BUG_ON(NSIGFPE  != 15);
-	BUILD_BUG_ON(NSIGSEGV != 7);
+	BUILD_BUG_ON(NSIGSEGV != 9);
 	BUILD_BUG_ON(NSIGBUS  != 5);
 	BUILD_BUG_ON(NSIGTRAP != 5);
 	BUILD_BUG_ON(NSIGCHLD != 6);
diff --git a/include/uapi/asm-generic/siginfo.h b/include/uapi/asm-generic/siginfo.h
index cb3d6c267181..7aacf9389010 100644
--- a/include/uapi/asm-generic/siginfo.h
+++ b/include/uapi/asm-generic/siginfo.h
@@ -229,7 +229,9 @@ typedef struct siginfo {
 #define SEGV_ACCADI	5	/* ADI not enabled for mapped object */
 #define SEGV_ADIDERR	6	/* Disrupting MCD error */
 #define SEGV_ADIPERR	7	/* Precise MCD exception */
-#define NSIGSEGV	7
+#define SEGV_MTEAERR	8	/* Asynchronous ARM MTE error */
+#define SEGV_MTESERR	9	/* Synchronous ARM MTE exception */
+#define NSIGSEGV	9
 
 /*
  * SIGBUS si_codes

* [PATCH v4 05/26] arm64: mte: Handle synchronous and asynchronous tag check faults
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:15   ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

The Memory Tagging Extension has two modes of notifying a tag check
fault at EL0, configurable through the SCTLR_EL1.TCF0 field:

1. Synchronous raising of a Data Abort exception with DFSC 17.
2. Asynchronous setting of a cumulative bit in TFSRE0_EL1.

Add the exception handler for the synchronous exception, and handle the
asynchronous TFSRE0_EL1.TF0 bit setting via a new TIF flag checked in
do_notify_resume().

On a tag check failure in user-space, whether synchronous or
asynchronous, a SIGSEGV will be raised on the faulting thread.

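As a rough sketch (not part of this patch), a user-space handler can
tell the two cases apart via si_code, using the values from the
previous patch: SEGV_MTESERR reports the faulting address in si_addr,
while the asynchronous SEGV_MTEAERR is cumulative and carries no
address (the kernel passes 0):

  #include <signal.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>

  static void segv_handler(int sig, siginfo_t *si, void *ucontext)
  {
  	if (si->si_code == SEGV_MTESERR)
  		fprintf(stderr, "sync tag check fault at %p\n", si->si_addr);
  	else if (si->si_code == SEGV_MTEAERR)
  		fprintf(stderr, "async tag check fault (no address)\n");
  	_exit(1);
  }

  int main(void)
  {
  	struct sigaction sa;

  	memset(&sa, 0, sizeof(sa));
  	sa.sa_sigaction = segv_handler;
  	sa.sa_flags = SA_SIGINFO;
  	sigaction(SIGSEGV, &sa, NULL);
  	/* ... run code that may trigger tag check faults ... */
  	return 0;
  }
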
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---

Notes:
    v4:
    - Use send_signal_fault() instead of fault_signal_inject() for
      asynchronous tag check faults as execution can continue even if this
      signal is masked.
    - Add DSB ISH prior to writing TFSRE0_EL1 in the clear_mte_async_tcf
      macro.
    - Move clear_mte_async_tcf just after returning to user since
      do_notify_resume() may still cause async tag faults via do_signal().
    
    v3:
    - Asynchronous tag check faults during the uaccess routines in the
      kernel are ignored.
    - Fix check_mte_async_tcf calling site as it expects the first argument
      to be the thread flags.
    - Move the mte_thread_switch() definition and call to a later patch as
      this became empty with the removal of async uaccess checking.
    - Add dsb() and clearing of TFSRE0_EL1 in flush_mte_state(), in case
      execve() triggered an asynchronous tag check fault.
    - Clear TIF_MTE_ASYNC_FAULT in arch_dup_task_struct() so that the child
      does not inherit any pending tag fault in the parent.
    
    v2:
    - Clear PSTATE.TCO on exception entry (automatically set by the hardware).
    - On syscall entry, for asynchronous tag check faults from user space,
      generate the signal early via syscall restarting.
    - Before context switch, save any potential async tag check fault
      generated by the kernel to the TIF flag (this follows an architecture
      update where the uaccess routines use the TCF0 mode).
    - Moved the flush_mte_state() and mte_thread_switch() function to a new
      mte.c file.

 arch/arm64/include/asm/mte.h         | 23 +++++++++++++++++
 arch/arm64/include/asm/thread_info.h |  4 ++-
 arch/arm64/kernel/Makefile           |  1 +
 arch/arm64/kernel/entry.S            | 37 ++++++++++++++++++++++++++++
 arch/arm64/kernel/mte.c              | 21 ++++++++++++++++
 arch/arm64/kernel/process.c          |  5 ++++
 arch/arm64/kernel/signal.c           |  8 ++++++
 arch/arm64/kernel/syscall.c          | 10 ++++++++
 arch/arm64/mm/fault.c                |  9 ++++++-
 9 files changed, 116 insertions(+), 2 deletions(-)
 create mode 100644 arch/arm64/include/asm/mte.h
 create mode 100644 arch/arm64/kernel/mte.c

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
new file mode 100644
index 000000000000..a0bf310da74b
--- /dev/null
+++ b/arch/arm64/include/asm/mte.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020 ARM Ltd.
+ */
+#ifndef __ASM_MTE_H
+#define __ASM_MTE_H
+
+#ifndef __ASSEMBLY__
+
+#ifdef CONFIG_ARM64_MTE
+
+void flush_mte_state(void);
+
+#else
+
+static inline void flush_mte_state(void)
+{
+}
+
+#endif
+
+#endif /* __ASSEMBLY__ */
+#endif /* __ASM_MTE_H  */
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 512174a8e789..0c6e5523b932 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -63,6 +63,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
 #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
 #define TIF_FSCHECK		5	/* Check FS is USER_DS on return */
+#define TIF_MTE_ASYNC_FAULT	6	/* MTE Asynchronous Tag Check Fault */
 #define TIF_SYSCALL_TRACE	8	/* syscall trace active */
 #define TIF_SYSCALL_AUDIT	9	/* syscall auditing */
 #define TIF_SYSCALL_TRACEPOINT	10	/* syscall tracepoint for ftrace */
@@ -91,10 +92,11 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define _TIF_FSCHECK		(1 << TIF_FSCHECK)
 #define _TIF_32BIT		(1 << TIF_32BIT)
 #define _TIF_SVE		(1 << TIF_SVE)
+#define _TIF_MTE_ASYNC_FAULT	(1 << TIF_MTE_ASYNC_FAULT)
 
 #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
 				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
-				 _TIF_UPROBE | _TIF_FSCHECK)
+				 _TIF_UPROBE | _TIF_FSCHECK | _TIF_MTE_ASYNC_FAULT)
 
 #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
 				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 4e5b8ee31442..dbede7a4c5fb 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -63,6 +63,7 @@ obj-$(CONFIG_CRASH_CORE)		+= crash_core.o
 obj-$(CONFIG_ARM_SDE_INTERFACE)		+= sdei.o
 obj-$(CONFIG_ARM64_SSBD)		+= ssbd.o
 obj-$(CONFIG_ARM64_PTR_AUTH)		+= pointer_auth.o
+obj-$(CONFIG_ARM64_MTE)			+= mte.o
 
 obj-y					+= vdso/ probes/
 obj-$(CONFIG_COMPAT_VDSO)		+= vdso32/
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index ddcde093c433..cbb3cacdf79f 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -145,6 +145,32 @@ alternative_cb_end
 #endif
 	.endm
 
+	/* Check for MTE asynchronous tag check faults */
+	.macro check_mte_async_tcf, flgs, tmp
+#ifdef CONFIG_ARM64_MTE
+alternative_if_not ARM64_MTE
+	b	1f
+alternative_else_nop_endif
+	mrs_s	\tmp, SYS_TFSRE0_EL1
+	tbz	\tmp, #SYS_TFSR_EL1_TF0_SHIFT, 1f
+	/* Asynchronous TCF occurred for TTBR0 access, set the TI flag */
+	orr	\flgs, \flgs, #_TIF_MTE_ASYNC_FAULT
+	str	\flgs, [tsk, #TSK_TI_FLAGS]
+	msr_s	SYS_TFSRE0_EL1, xzr
+1:
+#endif
+	.endm
+
+	/* Clear the MTE asynchronous tag check faults */
+	.macro clear_mte_async_tcf
+#ifdef CONFIG_ARM64_MTE
+alternative_if ARM64_MTE
+	dsb	ish
+	msr_s	SYS_TFSRE0_EL1, xzr
+alternative_else_nop_endif
+#endif
+	.endm
+
 	.macro	kernel_entry, el, regsize = 64
 	.if	\regsize == 32
 	mov	w0, w0				// zero upper 32 bits of x0
@@ -176,6 +202,8 @@ alternative_cb_end
 	ldr	x19, [tsk, #TSK_TI_FLAGS]
 	disable_step_tsk x19, x20
 
+	/* Check for asynchronous tag check faults in user space */
+	check_mte_async_tcf x19, x22
 	apply_ssbd 1, x22, x23
 
 	ptrauth_keys_install_kernel tsk, 1, x20, x22, x23
@@ -244,6 +272,13 @@ alternative_if ARM64_HAS_IRQ_PRIO_MASKING
 	str	x20, [sp, #S_PMR_SAVE]
 alternative_else_nop_endif
 
+	/* Re-enable tag checking (TCO set on exception entry) */
+#ifdef CONFIG_ARM64_MTE
+alternative_if ARM64_MTE
+	SET_PSTATE_TCO(0)
+alternative_else_nop_endif
+#endif
+
 	/*
 	 * Registers that may be useful after this macro is invoked:
 	 *
@@ -748,6 +783,8 @@ ret_to_user:
 	and	x2, x1, #_TIF_WORK_MASK
 	cbnz	x2, work_pending
 finish_ret_to_user:
+	/* Ignore asynchronous tag check faults in the uaccess routines */
+	clear_mte_async_tcf
 	enable_step_tsk x1, x2
 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK
 	bl	stackleak_erase
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
new file mode 100644
index 000000000000..032016823957
--- /dev/null
+++ b/arch/arm64/kernel/mte.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 ARM Ltd.
+ */
+
+#include <linux/thread_info.h>
+
+#include <asm/cpufeature.h>
+#include <asm/mte.h>
+#include <asm/sysreg.h>
+
+void flush_mte_state(void)
+{
+	if (!system_supports_mte())
+		return;
+
+	/* clear any pending asynchronous tag fault */
+	dsb(ish);
+	write_sysreg_s(0, SYS_TFSRE0_EL1);
+	clear_thread_flag(TIF_MTE_ASYNC_FAULT);
+}
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 56be4cbf771f..740047c9cd13 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -50,6 +50,7 @@
 #include <asm/exec.h>
 #include <asm/fpsimd.h>
 #include <asm/mmu_context.h>
+#include <asm/mte.h>
 #include <asm/processor.h>
 #include <asm/pointer_auth.h>
 #include <asm/stacktrace.h>
@@ -323,6 +324,7 @@ void flush_thread(void)
 	tls_thread_flush();
 	flush_ptrace_hw_breakpoint(current);
 	flush_tagged_addr_state();
+	flush_mte_state();
 }
 
 void release_thread(struct task_struct *dead_task)
@@ -355,6 +357,9 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	dst->thread.sve_state = NULL;
 	clear_tsk_thread_flag(dst, TIF_SVE);
 
+	/* clear any pending asynchronous tag fault raised by the parent */
+	clear_tsk_thread_flag(dst, TIF_MTE_ASYNC_FAULT);
+
 	return 0;
 }
 
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 339882db5a91..149334d5df02 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -732,6 +732,9 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
 	regs->regs[29] = (unsigned long)&user->next_frame->fp;
 	regs->pc = (unsigned long)ka->sa.sa_handler;
 
+	/* TCO (Tag Check Override) always cleared for signal handlers */
+	regs->pstate &= ~PSR_TCO_BIT;
+
 	if (ka->sa.sa_flags & SA_RESTORER)
 		sigtramp = ka->sa.sa_restorer;
 	else
@@ -923,6 +926,11 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 			if (thread_flags & _TIF_UPROBE)
 				uprobe_notify_resume(regs);
 
+			if (thread_flags & _TIF_MTE_ASYNC_FAULT) {
+				clear_thread_flag(TIF_MTE_ASYNC_FAULT);
+				send_sig_fault(SIGSEGV, SEGV_MTEAERR, 0, current);
+			}
+
 			if (thread_flags & _TIF_SIGPENDING)
 				do_signal(regs);
 
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index a12c0c88d345..db25f5d6a07c 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -102,6 +102,16 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
 	local_daif_restore(DAIF_PROCCTX);
 	user_exit();
 
+	if (system_supports_mte() && (flags & _TIF_MTE_ASYNC_FAULT)) {
+		/*
+		 * Process the asynchronous tag check fault before the actual
+		 * syscall. do_notify_resume() will send a signal to userspace
+		 * before the syscall is restarted.
+		 */
+		regs->regs[0] = -ERESTARTNOINTR;
+		return;
+	}
+
 	if (has_syscall_work(flags)) {
 		/* set default errno for user-issued syscall(-1) */
 		if (scno == NO_SYSCALL)
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index c9cedc0432d2..38b59cace3e3 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -650,6 +650,13 @@ static int do_sea(unsigned long addr, unsigned int esr, struct pt_regs *regs)
 	return 0;
 }
 
+static int do_tag_check_fault(unsigned long addr, unsigned int esr,
+			      struct pt_regs *regs)
+{
+	do_bad_area(addr, esr, regs);
+	return 0;
+}
+
 static const struct fault_info fault_info[] = {
 	{ do_bad,		SIGKILL, SI_KERNEL,	"ttbr address size fault"	},
 	{ do_bad,		SIGKILL, SI_KERNEL,	"level 1 address size fault"	},
@@ -668,7 +675,7 @@ static const struct fault_info fault_info[] = {
 	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 2 permission fault"	},
 	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 3 permission fault"	},
 	{ do_sea,		SIGBUS,  BUS_OBJERR,	"synchronous external abort"	},
-	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 17"			},
+	{ do_tag_check_fault,	SIGSEGV, SEGV_MTESERR,	"synchronous tag check fault"	},
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 18"			},
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 19"			},
 	{ do_sea,		SIGKILL, SI_KERNEL,	"level 0 (translation table walk)"	},

* [PATCH v4 06/26] mm: Add PG_ARCH_2 page flag
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:15   ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Steven Price,
	Andrew Morton

From: Steven Price <steven.price@arm.com>

For arm64 MTE support it is necessary to be able to mark pages that
contain user-space-visible tags which will need to be saved/restored,
e.g. when swapped out.

To support this, add a new arch-specific page flag (PG_arch_2) that
architectures can opt into by selecting ARCH_USES_PG_ARCH_2.

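Opting in is a one-line "select ARCH_USES_PG_ARCH_2" in the
architecture's Kconfig; the flag itself is then managed with the
generic page-flag bit operations. A minimal sketch (the helper name is
hypothetical; the MTE-specific use appears in a later patch):

  /* mark a page as carrying arch metadata; initialise it only once */
  if (!test_and_set_bit(PG_arch_2, &page->flags))
  	arch_init_page_metadata(page);	/* hypothetical arch hook */
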
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---

Notes:
    New in v4.

 fs/proc/page.c                    | 3 +++
 include/linux/kernel-page-flags.h | 1 +
 include/linux/page-flags.h        | 3 +++
 include/trace/events/mmflags.h    | 9 ++++++++-
 mm/Kconfig                        | 3 +++
 tools/vm/page-types.c             | 2 ++
 6 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/fs/proc/page.c b/fs/proc/page.c
index f909243d4a66..1b6cbe0849a8 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -217,6 +217,9 @@ u64 stable_page_flags(struct page *page)
 	u |= kpf_copy_bit(k, KPF_PRIVATE_2,	PG_private_2);
 	u |= kpf_copy_bit(k, KPF_OWNER_PRIVATE,	PG_owner_priv_1);
 	u |= kpf_copy_bit(k, KPF_ARCH,		PG_arch_1);
+#ifdef CONFIG_ARCH_USES_PG_ARCH_2
+	u |= kpf_copy_bit(k, KPF_ARCH_2,	PG_arch_2);
+#endif
 
 	return u;
 };
diff --git a/include/linux/kernel-page-flags.h b/include/linux/kernel-page-flags.h
index abd20ef93c98..eee1877a354e 100644
--- a/include/linux/kernel-page-flags.h
+++ b/include/linux/kernel-page-flags.h
@@ -17,5 +17,6 @@
 #define KPF_ARCH		38
 #define KPF_UNCACHED		39
 #define KPF_SOFTDIRTY		40
+#define KPF_ARCH_2		41
 
 #endif /* LINUX_KERNEL_PAGE_FLAGS_H */
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 222f6f7b2bb3..1d4971fe4fee 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -135,6 +135,9 @@ enum pageflags {
 #if defined(CONFIG_IDLE_PAGE_TRACKING) && defined(CONFIG_64BIT)
 	PG_young,
 	PG_idle,
+#endif
+#ifdef CONFIG_ARCH_USES_PG_ARCH_2
+	PG_arch_2,
 #endif
 	__NR_PAGEFLAGS,
 
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index 5fb752034386..5d098029a2d8 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -79,6 +79,12 @@
 #define IF_HAVE_PG_IDLE(flag,string)
 #endif
 
+#ifdef CONFIG_ARCH_USES_PG_ARCH_2
+#define IF_HAVE_PG_ARCH_2(flag,string) ,{1UL << flag, string}
+#else
+#define IF_HAVE_PG_ARCH_2(flag,string)
+#endif
+
 #define __def_pageflag_names						\
 	{1UL << PG_locked,		"locked"	},		\
 	{1UL << PG_waiters,		"waiters"	},		\
@@ -105,7 +111,8 @@ IF_HAVE_PG_MLOCK(PG_mlocked,		"mlocked"	)		\
 IF_HAVE_PG_UNCACHED(PG_uncached,	"uncached"	)		\
 IF_HAVE_PG_HWPOISON(PG_hwpoison,	"hwpoison"	)		\
 IF_HAVE_PG_IDLE(PG_young,		"young"		)		\
-IF_HAVE_PG_IDLE(PG_idle,		"idle"		)
+IF_HAVE_PG_IDLE(PG_idle,		"idle"		)		\
+IF_HAVE_PG_ARCH_2(PG_arch_2,		"arch_2"	)
 
 #define show_page_flags(flags)						\
 	(flags) ? __print_flags(flags, "|",				\
diff --git a/mm/Kconfig b/mm/Kconfig
index c1acc34c1c35..60427ccc3cb8 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -867,4 +867,7 @@ config ARCH_HAS_HUGEPD
 config MAPPING_DIRTY_HELPERS
         bool
 
+config ARCH_USES_PG_ARCH_2
+	bool
+
 endmenu
diff --git a/tools/vm/page-types.c b/tools/vm/page-types.c
index 58c0eab71bca..0517c744b04e 100644
--- a/tools/vm/page-types.c
+++ b/tools/vm/page-types.c
@@ -78,6 +78,7 @@
 #define KPF_ARCH		38
 #define KPF_UNCACHED		39
 #define KPF_SOFTDIRTY		40
+#define KPF_ARCH_2		41
 
 /* [48-] take some arbitrary free slots for expanding overloaded flags
  * not part of kernel API
@@ -135,6 +136,7 @@ static const char * const page_flag_names[] = {
 	[KPF_ARCH]		= "h:arch",
 	[KPF_UNCACHED]		= "c:uncached",
 	[KPF_SOFTDIRTY]		= "f:softdirty",
+	[KPF_ARCH_2]		= "H:arch_2",
 
 	[KPF_READAHEAD]		= "I:readahead",
 	[KPF_SLOB_FREE]		= "P:slob_free",

* [PATCH v4 07/26] arm64: mte: Clear the tags when a page is mapped in user-space with PROT_MTE
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:15   ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Steven Price

Pages allocated by the kernel are not guaranteed to have their tags
zeroed, especially as the kernel does not yet use MTE itself. To ensure
that user space can still access such pages when they are mapped into
its address space, clear the tags via set_pte_at(). A new page flag -
PG_mte_tagged (PG_arch_2) - is used to track pages with valid
allocation tags.

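From the user's point of view, this guarantees that a freshly mapped
PROT_MTE page is accessible through a pointer with a zero logical tag.
A sketch, assuming the PROT_MTE mmap() protection flag added later in
this series (the 0x20 value is an assumption here):

  #include <sys/mman.h>

  #ifndef PROT_MTE
  #define PROT_MTE	0x20	/* arm64, as proposed by this series */
  #endif

  int main(void)
  {
  	/* set_pte_at() zeroes the allocation tags on first mapping, so
  	 * this access via an untagged pointer must not fault */
  	char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_MTE,
  		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

  	if (p == MAP_FAILED)
  		return 1;
  	p[0] = 1;
  	return 0;
  }
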
Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---

Notes:
    New in v4. Replacing a previous patch zeroing the tags in clear_page().

 arch/arm64/include/asm/assembler.h | 12 ++++++++++++
 arch/arm64/include/asm/mte.h       | 16 ++++++++++++++++
 arch/arm64/include/asm/pgtable.h   |  6 ++++++
 arch/arm64/kernel/mte.c            | 14 ++++++++++++++
 arch/arm64/lib/Makefile            |  2 ++
 arch/arm64/lib/mte.S               | 23 +++++++++++++++++++++++
 6 files changed, 73 insertions(+)
 create mode 100644 arch/arm64/lib/mte.S

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 0bff325117b4..dacd66df6c05 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -21,6 +21,7 @@
 #include <asm/page.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
+#include <asm/sysreg.h>
 #include <asm/thread_info.h>
 
 	.macro save_and_disable_daif, flags
@@ -736,4 +737,15 @@ USER(\label, ic	ivau, \tmp2)			// invalidate I line PoU
 .Lyield_out_\@ :
 	.endm
 
+/*
+ * multitag_transfer_size - set \reg to the block size that is accessed by the
+ * LDGM/STGM instructions.
+ */
+	.macro	multitag_transfer_size, reg, tmp
+	mrs_s	\reg, SYS_GMID_EL1
+	ubfx	\reg, \reg, #SYS_GMID_EL1_BS_SHIFT, #SYS_GMID_EL1_BS_SIZE
+	mov	\tmp, #4
+	lsl	\reg, \tmp, \reg
+	.endm
+
 #endif	/* __ASM_ASSEMBLER_H */
diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index a0bf310da74b..4310a7ff10c0 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -7,12 +7,28 @@
 
 #ifndef __ASSEMBLY__
 
+#include <linux/page-flags.h>
+
+#include <asm/pgtable-types.h>
+
+void mte_clear_page_tags(void *addr, size_t size);
+
 #ifdef CONFIG_ARM64_MTE
 
+/* track which pages have valid allocation tags */
+#define PG_mte_tagged	PG_arch_2
+
+void mte_sync_tags(pte_t *ptep, pte_t pte);
 void flush_mte_state(void);
 
 #else
 
+/* unused if !CONFIG_ARM64_MTE, silence the compiler */
+#define PG_mte_tagged	0
+
+static inline void mte_sync_tags(pte_t *ptep, pte_t pte)
+{
+}
 static inline void flush_mte_state(void)
 {
 }
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 538c85e62f86..647a3f0c7874 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -9,6 +9,7 @@
 #include <asm/proc-fns.h>
 
 #include <asm/memory.h>
+#include <asm/mte.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/pgtable-prot.h>
 #include <asm/tlbflush.h>
@@ -80,6 +81,8 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
 #define pte_user_exec(pte)	(!(pte_val(pte) & PTE_UXN))
 #define pte_cont(pte)		(!!(pte_val(pte) & PTE_CONT))
 #define pte_devmap(pte)		(!!(pte_val(pte) & PTE_DEVMAP))
+#define pte_tagged(pte)		((pte_val(pte) & PTE_ATTRINDX_MASK) == \
+				 PTE_ATTRINDX(MT_NORMAL_TAGGED))
 
 #define pte_cont_addr_end(addr, end)						\
 ({	unsigned long __boundary = ((addr) + CONT_PTE_SIZE) & CONT_PTE_MASK;	\
@@ -274,6 +277,9 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
 	if (pte_present(pte) && pte_user_exec(pte) && !pte_special(pte))
 		__sync_icache_dcache(pte);
 
+	if (system_supports_mte() && pte_present(pte) && pte_tagged(pte))
+		mte_sync_tags(ptep, pte);
+
 	__check_racy_pte_update(mm, ptep, pte);
 
 	set_pte(ptep, pte);
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 032016823957..65a2f8490d18 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -3,12 +3,26 @@
  * Copyright (C) 2020 ARM Ltd.
  */
 
+#include <linux/bitops.h>
+#include <linux/mm.h>
 #include <linux/thread_info.h>
 
 #include <asm/cpufeature.h>
 #include <asm/mte.h>
 #include <asm/sysreg.h>
 
+void mte_sync_tags(pte_t *ptep, pte_t pte)
+{
+	struct page *page = pte_page(pte);
+
+	/* if PG_mte_tagged is set, tags have already been initialised */
+	if (test_and_set_bit(PG_mte_tagged, &page->flags))
+		return;
+
+	/* tags of the zero page will be cleared on first mapping */
+	mte_clear_page_tags(page_address(page), page_size(page));
+}
+
 void flush_mte_state(void)
 {
 	if (!system_supports_mte())
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index 2fc253466dbf..d31e1169d9b8 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -16,3 +16,5 @@ lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
 obj-$(CONFIG_CRC32) += crc32.o
 
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
+
+obj-$(CONFIG_ARM64_MTE) += mte.o
diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
new file mode 100644
index 000000000000..130fb7047e17
--- /dev/null
+++ b/arch/arm64/lib/mte.S
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 ARM Ltd.
+ */
+#include <linux/linkage.h>
+
+#include <asm/assembler.h>
+
+	.arch	armv8.5-a+memtag
+
+/*
+ * Clear the tags in a (potentially huge) page
+ *   x0 - page address
+ *   x1 - page size
+ */
+SYM_FUNC_START(mte_clear_page_tags)
+	multitag_transfer_size x2, x3
+1:	stgm	xzr, [x0]
+	add	x0, x0, x2
+	sub	x1, x1, x2
+	cbnz	x1, 1b
+	ret
+SYM_FUNC_END(mte_clear_page_tags)

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* [PATCH v4 07/26] arm64: mte: Clear the tags when a page is mapped in user-space with PROT_MTE
@ 2020-05-15 17:15   ` Catalin Marinas
  0 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-arch, Szabolcs Nagy, Andrey Konovalov, Kevin Brodsky,
	Steven Price, Peter Collingbourne, linux-mm, Vincenzo Frascino,
	Will Deacon, Dave P Martin

Pages allocated by the kernel are not guaranteed to have the tags
zeroed, especially as the kernel does not yet use MTE itself. To ensure
the user can still access such pages when mapped into its address space,
clear the tags via set_pte_at(). A new page flag - PG_mte_tagged
(PG_arch_2) - is used to track pages with valid allocation tags.

Co-developed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---

Notes:
    New in v4. Replacing a previous page zeroing the tags in clear_page().

 arch/arm64/include/asm/assembler.h | 12 ++++++++++++
 arch/arm64/include/asm/mte.h       | 16 ++++++++++++++++
 arch/arm64/include/asm/pgtable.h   |  6 ++++++
 arch/arm64/kernel/mte.c            | 14 ++++++++++++++
 arch/arm64/lib/Makefile            |  2 ++
 arch/arm64/lib/mte.S               | 23 +++++++++++++++++++++++
 6 files changed, 73 insertions(+)
 create mode 100644 arch/arm64/lib/mte.S

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 0bff325117b4..dacd66df6c05 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -21,6 +21,7 @@
 #include <asm/page.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
+#include <asm/sysreg.h>
 #include <asm/thread_info.h>
 
 	.macro save_and_disable_daif, flags
@@ -736,4 +737,15 @@ USER(\label, ic	ivau, \tmp2)			// invalidate I line PoU
 .Lyield_out_\@ :
 	.endm
 
+/*
+ * multitag_transfer_size - set \reg to the block size that is accessed by the
+ * LDGM/STGM instructions.
+ */
+	.macro	multitag_transfer_size, reg, tmp
+	mrs_s	\reg, SYS_GMID_EL1
+	ubfx	\reg, \reg, #SYS_GMID_EL1_BS_SHIFT, #SYS_GMID_EL1_BS_SIZE
+	mov	\tmp, #4
+	lsl	\reg, \tmp, \reg
+	.endm
+
 #endif	/* __ASM_ASSEMBLER_H */
diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index a0bf310da74b..4310a7ff10c0 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -7,12 +7,28 @@
 
 #ifndef __ASSEMBLY__
 
+#include <linux/page-flags.h>
+
+#include <asm/pgtable-types.h>
+
+void mte_clear_page_tags(void *addr, size_t size);
+
 #ifdef CONFIG_ARM64_MTE
 
+/* track which pages have valid allocation tags */
+#define PG_mte_tagged	PG_arch_2
+
+void mte_sync_tags(pte_t *ptep, pte_t pte);
 void flush_mte_state(void);
 
 #else
 
+/* unused if !CONFIG_ARM64_MTE, silence the compiler */
+#define PG_mte_tagged	0
+
+static inline void mte_sync_tags(pte_t *ptep, pte_t pte)
+{
+}
 static inline void flush_mte_state(void)
 {
 }
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 538c85e62f86..647a3f0c7874 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -9,6 +9,7 @@
 #include <asm/proc-fns.h>
 
 #include <asm/memory.h>
+#include <asm/mte.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/pgtable-prot.h>
 #include <asm/tlbflush.h>
@@ -80,6 +81,8 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
 #define pte_user_exec(pte)	(!(pte_val(pte) & PTE_UXN))
 #define pte_cont(pte)		(!!(pte_val(pte) & PTE_CONT))
 #define pte_devmap(pte)		(!!(pte_val(pte) & PTE_DEVMAP))
+#define pte_tagged(pte)		((pte_val(pte) & PTE_ATTRINDX_MASK) == \
+				 PTE_ATTRINDX(MT_NORMAL_TAGGED))
 
 #define pte_cont_addr_end(addr, end)						\
 ({	unsigned long __boundary = ((addr) + CONT_PTE_SIZE) & CONT_PTE_MASK;	\
@@ -274,6 +277,9 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
 	if (pte_present(pte) && pte_user_exec(pte) && !pte_special(pte))
 		__sync_icache_dcache(pte);
 
+	if (system_supports_mte() && pte_present(pte) && pte_tagged(pte))
+		mte_sync_tags(ptep, pte);
+
 	__check_racy_pte_update(mm, ptep, pte);
 
 	set_pte(ptep, pte);
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 032016823957..65a2f8490d18 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -3,12 +3,26 @@
  * Copyright (C) 2020 ARM Ltd.
  */
 
+#include <linux/bitops.h>
+#include <linux/mm.h>
 #include <linux/thread_info.h>
 
 #include <asm/cpufeature.h>
 #include <asm/mte.h>
 #include <asm/sysreg.h>
 
+void mte_sync_tags(pte_t *ptep, pte_t pte)
+{
+	struct page *page = pte_page(pte);
+
+	/* if PG_mte_tagged is set, tags have already been initialised */
+	if (test_and_set_bit(PG_mte_tagged, &page->flags))
+		return;
+
+	/* tags of the zero page will be cleared on first mapping */
+	mte_clear_page_tags(page_address(page), page_size(page));
+}
+
 void flush_mte_state(void)
 {
 	if (!system_supports_mte())
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
index 2fc253466dbf..d31e1169d9b8 100644
--- a/arch/arm64/lib/Makefile
+++ b/arch/arm64/lib/Makefile
@@ -16,3 +16,5 @@ lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
 obj-$(CONFIG_CRC32) += crc32.o
 
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
+
+obj-$(CONFIG_ARM64_MTE) += mte.o
diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
new file mode 100644
index 000000000000..130fb7047e17
--- /dev/null
+++ b/arch/arm64/lib/mte.S
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 ARM Ltd.
+ */
+#include <linux/linkage.h>
+
+#include <asm/assembler.h>
+
+	.arch	armv8.5-a+memtag
+
+/*
+ * Clear the tags in a (potentially huge) page
+ *   x0 - page address
+ *   x1 - page size
+ */
+SYM_FUNC_START(mte_clear_page_tags)
+	multitag_transfer_size x2, x3
+1:	stgm	xzr, [x0]
+	add	x0, x0, x2
+	sub	x1, x1, x2
+	cbnz	x1, 1b
+	ret
+SYM_FUNC_END(mte_clear_page_tags)


* [PATCH v4 08/26] arm64: mte: Tags-aware copy_page() implementation
@ 2020-05-15 17:15   ` Catalin Marinas
  0 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

When the Memory Tagging Extension is enabled, the tags need to be
preserved across page copy (e.g. for copy-on-write).

Introduce an MTE-aware copy_page() which preserves the tags across the
copy.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---

Notes:
    v4:
    - Moved the tag copying to a separate function in mte.S and only called
      if the source page has the PG_mte_tagged flag set.

 arch/arm64/include/asm/mte.h |  4 ++++
 arch/arm64/lib/mte.S         | 19 +++++++++++++++++++
 arch/arm64/mm/copypage.c     | 14 ++++++++++++--
 3 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index 4310a7ff10c0..c1a09499c678 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -19,6 +19,7 @@ void mte_clear_page_tags(void *addr, size_t size);
 #define PG_mte_tagged	PG_arch_2
 
 void mte_sync_tags(pte_t *ptep, pte_t pte);
+void mte_copy_page_tags(void *kto, const void *kfrom);
 void flush_mte_state(void);
 
 #else
@@ -29,6 +30,9 @@ void flush_mte_state(void);
 static inline void mte_sync_tags(pte_t *ptep, pte_t pte)
 {
 }
+static inline void mte_copy_page_tags(void *kto, const void *kfrom)
+{
+}
 static inline void flush_mte_state(void)
 {
 }
diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
index 130fb7047e17..a531b52fa5ba 100644
--- a/arch/arm64/lib/mte.S
+++ b/arch/arm64/lib/mte.S
@@ -5,6 +5,7 @@
 #include <linux/linkage.h>
 
 #include <asm/assembler.h>
+#include <asm/page.h>
 
 	.arch	armv8.5-a+memtag
 
@@ -21,3 +22,21 @@ SYM_FUNC_START(mte_clear_page_tags)
 	cbnz	x1, 1b
 	ret
 SYM_FUNC_END(mte_clear_page_tags)
+
+/*
+ * Copy the tags from the source page to the destination one
+ *   x0 - address of the destination page
+ *   x1 - address of the source page
+ */
+SYM_FUNC_START(mte_copy_page_tags)
+	mov	x2, x0
+	mov	x3, x1
+	multitag_transfer_size x5, x6
+1:	ldgm	x4, [x3]
+	stgm	x4, [x2]
+	add	x2, x2, x5
+	add	x3, x3, x5
+	tst	x2, #(PAGE_SIZE - 1)
+	b.ne	1b
+	ret
+SYM_FUNC_END(mte_copy_page_tags)
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index 2ee7b73433a5..2560ddc479ac 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -6,16 +6,26 @@
  * Copyright (C) 2012 ARM Ltd.
  */
 
+#include <linux/bitops.h>
 #include <linux/mm.h>
 
 #include <asm/page.h>
 #include <asm/cacheflush.h>
+#include <asm/cpufeature.h>
+#include <asm/mte.h>
 
 void __cpu_copy_user_page(void *kto, const void *kfrom, unsigned long vaddr)
 {
-	struct page *page = virt_to_page(kto);
+	struct page *to_page = virt_to_page(kto);
+	struct page *from_page = virt_to_page(kfrom);
+
 	copy_page(kto, kfrom);
-	flush_dcache_page(page);
+	if (system_supports_mte() &&
+	    test_bit(PG_mte_tagged, &from_page->flags)) {
+		mte_copy_page_tags(kto, kfrom);
+		set_bit(PG_mte_tagged, &to_page->flags);
+	}
+	flush_dcache_page(to_page);
 }
 EXPORT_SYMBOL_GPL(__cpu_copy_user_page);
 


* [PATCH v4 09/26] arm64: mte: Tags-aware memcmp_pages() implementation
@ 2020-05-15 17:15   ` Catalin Marinas
  0 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne

When the Memory Tagging Extension is enabled, two pages are identical
only if both their data and tags are identical.

Make the generic memcmp_pages() a __weak function and add an
arm64-specific implementation which returns non-zero if either of the two
pages contains valid MTE tags (PG_mte_tagged set). There isn't much
benefit in comparing the tags of two pages since these are normally used
for heap allocations and likely to differ anyway.

Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---

Notes:
    v4:
    - Remove page tag comparison. This is not very useful to detect
      identical pages as long as set_pte_at() can zero the tags on a page
      without copy-on-write if mapped with PROT_MTE. This can be improved
      if a real case appears but it's unlikely for heap pages to be
      identical across multiple processes.
    - Move the memcmp_pages() function to mte.c.

 arch/arm64/kernel/mte.c | 26 ++++++++++++++++++++++++++
 mm/util.c               |  2 +-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 65a2f8490d18..da2d70178a4b 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -5,6 +5,7 @@
 
 #include <linux/bitops.h>
 #include <linux/mm.h>
+#include <linux/string.h>
 #include <linux/thread_info.h>
 
 #include <asm/cpufeature.h>
@@ -23,6 +24,31 @@ void mte_sync_tags(pte_t *ptep, pte_t pte)
 	mte_clear_page_tags(page_address(page), page_size(page));
 }
 
+int memcmp_pages(struct page *page1, struct page *page2)
+{
+	char *addr1, *addr2;
+	int ret;
+
+	addr1 = page_address(page1);
+	addr2 = page_address(page2);
+	ret = memcmp(addr1, addr2, PAGE_SIZE);
+
+	if (!system_supports_mte() || ret)
+		return ret;
+
+	/*
+	 * If the page content is identical but at least one of the pages is
+	 * tagged, return non-zero to avoid KSM merging. If only one of the
+	 * pages is tagged, set_pte_at() may zero or change the tags of the
+	 * other page via mte_sync_tags().
+	 */
+	if (test_bit(PG_mte_tagged, &page1->flags) ||
+	    test_bit(PG_mte_tagged, &page2->flags))
+		return addr1 != addr2;
+
+	return ret;
+}
+
 void flush_mte_state(void)
 {
 	if (!system_supports_mte())
diff --git a/mm/util.c b/mm/util.c
index 988d11e6c17c..662fb3da6d01 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -899,7 +899,7 @@ int get_cmdline(struct task_struct *task, char *buffer, int buflen)
 	return res;
 }
 
-int memcmp_pages(struct page *page1, struct page *page2)
+int __weak memcmp_pages(struct page *page1, struct page *page2)
 {
 	char *addr1, *addr2;
 	int ret;


* [PATCH v4 10/26] mm: Introduce arch_calc_vm_flag_bits()
@ 2020-05-15 17:15   ` Catalin Marinas
  0 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Kevin Brodsky,
	Andrew Morton

From: Kevin Brodsky <Kevin.Brodsky@arm.com>

Similarly to arch_calc_vm_prot_bits(), introduce a dummy
arch_calc_vm_flag_bits() invoked from calc_vm_flag_bits(). This macro
can be overridden by architectures to insert specific VM_* flags derived
from the mmap() MAP_* flags.
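
For illustration only, a minimal sketch of what an override in an
architecture's <asm/mman.h> could look like (VM_ARCH_1 stands in here
for a real arch-specific flag; the actual arm64 override is added in
patch 11/26):

  /* sketch: map an mmap() MAP_* flag to an arch-chosen VM_* flag */
  #define arch_calc_vm_flag_bits(flags) \
  	((flags) & MAP_ANONYMOUS ? VM_ARCH_1 : 0)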

Signed-off-by: Kevin Brodsky <Kevin.Brodsky@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---

Notes:
    v2:
    - Updated the comment above arch_calc_vm_prot_bits().
    - Changed author since this patch had already been posted (internally).

 include/linux/mman.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/mman.h b/include/linux/mman.h
index 4b08e9c9c538..15c1162b9d65 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -74,13 +74,17 @@ static inline void vm_unacct_memory(long pages)
 }
 
 /*
- * Allow architectures to handle additional protection bits
+ * Allow architectures to handle additional protection and flag bits
  */
 
 #ifndef arch_calc_vm_prot_bits
 #define arch_calc_vm_prot_bits(prot, pkey) 0
 #endif
 
+#ifndef arch_calc_vm_flag_bits
+#define arch_calc_vm_flag_bits(flags) 0
+#endif
+
 #ifndef arch_vm_get_page_prot
 #define arch_vm_get_page_prot(vm_flags) __pgprot(0)
 #endif
@@ -131,7 +135,8 @@ calc_vm_flag_bits(unsigned long flags)
 	return _calc_vm_trans(flags, MAP_GROWSDOWN,  VM_GROWSDOWN ) |
 	       _calc_vm_trans(flags, MAP_DENYWRITE,  VM_DENYWRITE ) |
 	       _calc_vm_trans(flags, MAP_LOCKED,     VM_LOCKED    ) |
-	       _calc_vm_trans(flags, MAP_SYNC,	     VM_SYNC      );
+	       _calc_vm_trans(flags, MAP_SYNC,	     VM_SYNC      ) |
+	       arch_calc_vm_flag_bits(flags);
 }
 
 unsigned long vm_commit_limit(void);


* [PATCH v4 11/26] arm64: mte: Add PROT_MTE support to mmap() and mprotect()
@ 2020-05-15 17:15   ` Catalin Marinas
  0 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne

To enable tagging on a memory range, the user must explicitly opt in via
a new PROT_MTE flag passed to mmap() or mprotect(). Since this is a new
memory type in the AttrIndx field of a pte, simplify the or'ing of these
bits over the protection_map[] attributes by making MT_NORMAL index 0.

There are two conditions for arch_vm_get_page_prot() to return the
MT_NORMAL_TAGGED memory type: (1) the user requested it via PROT_MTE,
registered as VM_MTE in the vm_flags, and (2) the vma supports MTE,
decided during the mmap() call (only) and registered as VM_MTE_ALLOWED.

arch_calc_vm_prot_bits() is responsible for registering the user request
as VM_MTE. The newly introduced arch_calc_vm_flag_bits() sets
VM_MTE_ALLOWED if the mapping is MAP_ANONYMOUS. An MTE-capable
filesystem (RAM-based) may be able to set VM_MTE_ALLOWED during its
mmap() file ops call.

In addition, update VM_DATA_DEFAULT_FLAGS to allow mprotect(PROT_MTE) on
stack or brk area.

The Linux mmap() syscall currently ignores unknown PROT_* flags. In the
presence of MTE, an mmap(PROT_MTE) on a file which does not support MTE
will not report an error and the memory will not be mapped as Normal
Tagged. For consistency, mprotect(PROT_MTE) will not report an error
either if the memory range does not support MTE. Two subsequent patches
in the series will propose tightening of this behaviour.
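
For illustration, a minimal user-space sketch of opting in to tagging
(assuming PROT_MTE is not yet in the libc headers, hence the fallback
definition; error handling omitted):

  #include <sys/mman.h>

  #ifndef PROT_MTE
  #define PROT_MTE	0x20	/* arm64 Normal Tagged mapping */
  #endif

  /* anonymous mappings get VM_MTE_ALLOWED, so PROT_MTE is honoured */
  char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_MTE,
  	       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

  /* opting in later also works on anonymous/stack/brk memory */
  mprotect(p, 4096, PROT_READ | PROT_WRITE | PROT_MTE);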

Co-developed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---

Notes:
    v2:
    - Add VM_MTE_ALLOWED to show_smap_vma_flags().

 arch/arm64/include/asm/memory.h    | 18 +++++----
 arch/arm64/include/asm/mman.h      | 64 ++++++++++++++++++++++++++++++
 arch/arm64/include/asm/page.h      |  2 +-
 arch/arm64/include/asm/pgtable.h   |  7 +++-
 arch/arm64/include/uapi/asm/mman.h | 14 +++++++
 fs/proc/task_mmu.c                 |  4 ++
 include/linux/mm.h                 |  8 ++++
 7 files changed, 108 insertions(+), 9 deletions(-)
 create mode 100644 arch/arm64/include/asm/mman.h
 create mode 100644 arch/arm64/include/uapi/asm/mman.h

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 472c77a68225..770535b7ca35 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -129,14 +129,18 @@
 
 /*
  * Memory types available.
+ *
+ * IMPORTANT: MT_NORMAL must be index 0 since vm_get_page_prot() may 'or' in
+ *	      the MT_NORMAL_TAGGED memory type for PROT_MTE mappings. Note
+ *	      that protection_map[] only contains MT_NORMAL attributes.
  */
-#define MT_DEVICE_nGnRnE	0
-#define MT_DEVICE_nGnRE		1
-#define MT_DEVICE_GRE		2
-#define MT_NORMAL_NC		3
-#define MT_NORMAL		4
-#define MT_NORMAL_WT		5
-#define MT_NORMAL_TAGGED	6
+#define MT_NORMAL		0
+#define MT_NORMAL_TAGGED	1
+#define MT_NORMAL_NC		2
+#define MT_NORMAL_WT		3
+#define MT_DEVICE_nGnRnE	4
+#define MT_DEVICE_nGnRE		5
+#define MT_DEVICE_GRE		6
 
 /*
  * Memory types for Stage-2 translation
diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
new file mode 100644
index 000000000000..c77a23869223
--- /dev/null
+++ b/arch/arm64/include/asm/mman.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_MMAN_H__
+#define __ASM_MMAN_H__
+
+#include <uapi/asm/mman.h>
+
+/*
+ * There are two conditions required for returning a Normal Tagged memory type
+ * in arch_vm_get_page_prot(): (1) the user requested it via PROT_MTE passed
+ * to mmap() or mprotect() and (2) the corresponding vma supports MTE. We
+ * register (1) as VM_MTE in the vma->vm_flags and (2) as VM_MTE_ALLOWED. Note
+ * that the latter can only be set during the mmap() call since mprotect()
+ * does not accept MAP_* flags.
+ */
+static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
+						   unsigned long pkey)
+{
+	if (!system_supports_mte())
+		return 0;
+
+	if (prot & PROT_MTE)
+		return VM_MTE;
+
+	return 0;
+}
+#define arch_calc_vm_prot_bits arch_calc_vm_prot_bits
+
+static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
+{
+	if (!system_supports_mte())
+		return 0;
+
+	/*
+	 * Only allow MTE on anonymous mappings as these are guaranteed to be
+	 * backed by tags-capable memory. The vm_flags may be overridden by a
+	 * filesystem supporting MTE (RAM-based).
+	 */
+	if (flags & MAP_ANONYMOUS)
+		return VM_MTE_ALLOWED;
+
+	return 0;
+}
+#define arch_calc_vm_flag_bits arch_calc_vm_flag_bits
+
+static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
+{
+	return (vm_flags & VM_MTE) && (vm_flags & VM_MTE_ALLOWED) ?
+		__pgprot(PTE_ATTRINDX(MT_NORMAL_TAGGED)) :
+		__pgprot(0);
+}
+#define arch_vm_get_page_prot arch_vm_get_page_prot
+
+static inline bool arch_validate_prot(unsigned long prot, unsigned long addr)
+{
+	unsigned long supported = PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM;
+
+	if (system_supports_mte())
+		supported |= PROT_MTE;
+
+	return (prot & ~supported) == 0;
+}
+#define arch_validate_prot arch_validate_prot
+
+#endif /* !__ASM_MMAN_H__ */
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index c01b52add377..673033e0393b 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -36,7 +36,7 @@ extern int pfn_valid(unsigned long);
 
 #endif /* !__ASSEMBLY__ */
 
-#define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_TSK_EXEC
+#define VM_DATA_DEFAULT_FLAGS	(VM_DATA_FLAGS_TSK_EXEC | VM_MTE_ALLOWED)
 
 #include <asm-generic/getorder.h>
 
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 647a3f0c7874..f2cd59b01b27 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -665,8 +665,13 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
 
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
+	/*
+	 * Normal and Normal-Tagged are two different memory types and indices
+	 * in MAIR_EL1. The mask below has to include PTE_ATTRINDX_MASK.
+	 */
 	const pteval_t mask = PTE_USER | PTE_PXN | PTE_UXN | PTE_RDONLY |
-			      PTE_PROT_NONE | PTE_VALID | PTE_WRITE;
+			      PTE_PROT_NONE | PTE_VALID | PTE_WRITE |
+			      PTE_ATTRINDX_MASK;
 	/* preserve the hardware dirty information */
 	if (pte_hw_dirty(pte))
 		pte = pte_mkdirty(pte);
diff --git a/arch/arm64/include/uapi/asm/mman.h b/arch/arm64/include/uapi/asm/mman.h
new file mode 100644
index 000000000000..d7677ee84878
--- /dev/null
+++ b/arch/arm64/include/uapi/asm/mman.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI__ASM_MMAN_H
+#define _UAPI__ASM_MMAN_H
+
+#include <asm-generic/mman.h>
+
+/*
+ * The generic mman.h file reserves 0x10 and 0x20 for arch-specific PROT_*
+ * flags.
+ */
+/* 0x10 reserved for PROT_BTI */
+#define PROT_MTE	 0x20		/* Normal Tagged mapping */
+
+#endif /* !_UAPI__ASM_MMAN_H */
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 8d382d4ec067..2f26112ebb77 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -647,6 +647,10 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 		[ilog2(VM_MERGEABLE)]	= "mg",
 		[ilog2(VM_UFFD_MISSING)]= "um",
 		[ilog2(VM_UFFD_WP)]	= "uw",
+#ifdef CONFIG_ARM64_MTE
+		[ilog2(VM_MTE)]		= "mt",
+		[ilog2(VM_MTE_ALLOWED)]	= "",
+#endif
 #ifdef CONFIG_ARCH_HAS_PKEYS
 		/* These come out via ProtectionKey: */
 		[ilog2(VM_PKEY_BIT0)]	= "",
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5a323422d783..132ca88e407d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -336,6 +336,14 @@ extern unsigned int kobjsize(const void *objp);
 # define VM_MPX		VM_NONE
 #endif
 
+#if defined(CONFIG_ARM64_MTE)
+# define VM_MTE		VM_HIGH_ARCH_0	/* Use Tagged memory for access control */
+# define VM_MTE_ALLOWED	VM_HIGH_ARCH_1	/* Tagged memory permitted */
+#else
+# define VM_MTE		VM_NONE
+# define VM_MTE_ALLOWED	VM_NONE
+#endif
+
 #ifndef VM_GROWSUP
 # define VM_GROWSUP	VM_NONE
 #endif


* [PATCH v4 12/26] mm: Introduce arch_validate_flags()
@ 2020-05-15 17:15   ` Catalin Marinas
  0 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Andrew Morton

Similarly to arch_validate_prot() called from do_mprotect_pkey(), an
architecture may need to sanity-check the new vm_flags.

Define a dummy function always returning true. In addition to
do_mprotect_pkey(), also invoke it from mmap_region() prior to updating
vma->vm_page_prot to allow the architecture code to veto potentially
inconsistent vm_flags.
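
As a sketch of the intended use (the real arm64 implementation follows
in the next patch), an architecture would provide its own definition in
its <asm/mman.h>; VM_MY_FLAG and VM_MY_FLAG_ALLOWED below are
placeholder names, not existing flags:

  /* sketch: permit VM_MY_FLAG only where VM_MY_FLAG_ALLOWED is set */
  static inline bool arch_validate_flags(unsigned long vm_flags)
  {
  	return !(vm_flags & VM_MY_FLAG) || (vm_flags & VM_MY_FLAG_ALLOWED);
  }
  #define arch_validate_flags arch_validate_flags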

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---

Notes:
    v2:
    - Some comments updated.

 include/linux/mman.h | 13 +++++++++++++
 mm/mmap.c            |  9 +++++++++
 mm/mprotect.c        |  6 ++++++
 3 files changed, 28 insertions(+)

diff --git a/include/linux/mman.h b/include/linux/mman.h
index 15c1162b9d65..09dd414b81b6 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -103,6 +103,19 @@ static inline bool arch_validate_prot(unsigned long prot, unsigned long addr)
 #define arch_validate_prot arch_validate_prot
 #endif
 
+#ifndef arch_validate_flags
+/*
+ * This is called from mmap() and mprotect() with the updated vma->vm_flags.
+ *
+ * Returns true if the VM_* flags are valid.
+ */
+static inline bool arch_validate_flags(unsigned long flags)
+{
+	return true;
+}
+#define arch_validate_flags arch_validate_flags
+#endif
+
 /*
  * Optimisation macro.  It is equivalent to:
  *      (x & bit1) ? bit2 : 0
diff --git a/mm/mmap.c b/mm/mmap.c
index f609e9ec4a25..d5fc93c2072e 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1792,6 +1792,15 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		vma_set_anonymous(vma);
 	}
 
+	/* Allow architectures to sanity-check the vm_flags */
+	if (!arch_validate_flags(vma->vm_flags)) {
+		error = -EINVAL;
+		if (file)
+			goto unmap_and_free_vma;
+		else
+			goto free_vma;
+	}
+
 	vma_link(mm, vma, prev, rb_link, rb_parent);
 	/* Once vma denies write, undo our temporary denial count */
 	if (file) {
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 494192ca954b..04b1d2cf0e74 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -603,6 +603,12 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 			goto out;
 		}
 
+		/* Allow architectures to sanity-check the new flags */
+		if (!arch_validate_flags(newflags)) {
+			error = -EINVAL;
+			goto out;
+		}
+
 		error = security_file_mprotect(vma, reqprot, prot);
 		if (error)
 			goto out;


* [PATCH v4 13/26] arm64: mte: Validate the PROT_MTE request via arch_validate_flags()
@ 2020-05-15 17:15   ` Catalin Marinas
  0 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne

Make use of the newly introduced arch_validate_flags() hook to
sanity-check the PROT_MTE request passed to mmap() and mprotect(). If
the mapping does not support MTE, these syscalls will return -EINVAL.
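
For illustration, the user-visible effect, as a sketch (fd is assumed
to refer to a file on a filesystem without MTE support; PROT_MTE as
defined in the uapi header):

  /* no VM_MTE_ALLOWED on this mapping, so VM_MTE is rejected */
  void *p = mmap(NULL, 4096, PROT_READ | PROT_MTE, MAP_PRIVATE, fd, 0);
  /* p == MAP_FAILED, errno == EINVAL */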

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/mman.h | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/mman.h b/arch/arm64/include/asm/mman.h
index c77a23869223..5c356d1ca266 100644
--- a/arch/arm64/include/asm/mman.h
+++ b/arch/arm64/include/asm/mman.h
@@ -44,7 +44,11 @@ static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
 
 static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
 {
-	return (vm_flags & VM_MTE) && (vm_flags & VM_MTE_ALLOWED) ?
+	/*
+	 * Checking for VM_MTE only is sufficient since arch_validate_flags()
+	 * does not permit (VM_MTE & !VM_MTE_ALLOWED).
+	 */
+	return (vm_flags & VM_MTE) ?
 		__pgprot(PTE_ATTRINDX(MT_NORMAL_TAGGED)) :
 		__pgprot(0);
 }
@@ -61,4 +65,14 @@ static inline bool arch_validate_prot(unsigned long prot, unsigned long addr)
 }
 #define arch_validate_prot arch_validate_prot
 
+static inline bool arch_validate_flags(unsigned long flags)
+{
+	if (!system_supports_mte())
+		return true;
+
+	/* only allow VM_MTE if VM_MTE_ALLOWED has been set previously */
+	return !(flags & VM_MTE) || (flags & VM_MTE_ALLOWED);
+}
+#define arch_validate_flags arch_validate_flags
+
 #endif /* !__ASM_MMAN_H__ */


* [PATCH v4 14/26] mm: Allow arm64 mmap(PROT_MTE) on RAM-based files
@ 2020-05-15 17:16   ` Catalin Marinas
  0 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:16 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Andrew Morton

Since arm64 memory (allocation) tags can only be stored in RAM, mapping
files with PROT_MTE is not allowed by default. RAM-based files like
those in a tmpfs mount or memfd_create() can support memory tagging, so
update the vm_flags accordingly in shmem_mmap().
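
For illustration, a user-space sketch (assuming a libc exposing
memfd_create() under _GNU_SOURCE and a PROT_MTE fallback definition as
in earlier patches; error handling omitted):

  #define _GNU_SOURCE
  #include <sys/mman.h>
  #include <unistd.h>

  int fd = memfd_create("tagged", 0);
  ftruncate(fd, 4096);
  /* shmem now sets VM_MTE_ALLOWED, so PROT_MTE is accepted here */
  char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_MTE,
  	       MAP_SHARED, fd, 0);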

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 mm/shmem.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/shmem.c b/mm/shmem.c
index d722eb830317..73754ed7af69 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2221,6 +2221,9 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
 			vma->vm_flags &= ~(VM_MAYWRITE);
 	}
 
+	/* arm64 - allow memory tagging on RAM-based files */
+	vma->vm_flags |= VM_MTE_ALLOWED;
+
 	file_accessed(file);
 	vma->vm_ops = &shmem_vm_ops;
 	if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&


* [PATCH v4 15/26] arm64: mte: Allow user control of the tag check mode via prctl()
@ 2020-05-15 17:16   ` Catalin Marinas
  0 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:16 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne

By default, even if PROT_MTE is set on a memory range, there is no tag
check fault reporting (SIGSEGV). Introduce a set of options to the
existing prctl(PR_SET_TAGGED_ADDR_CTRL) to allow user control of the tag
check fault mode:

  PR_MTE_TCF_NONE  - no reporting (default)
  PR_MTE_TCF_SYNC  - synchronous tag check fault reporting
  PR_MTE_TCF_ASYNC - asynchronous tag check fault reporting

These options translate into the corresponding SCTLR_EL1.TCF0 bitfield,
context-switched by the kernel. Note that uaccess done by the kernel is
not checked and cannot be configured by the user.
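
For illustration, a user-space sketch enabling synchronous tag check
faults for the current thread (the fallback definitions mirror the
prctl.h hunk below, for libcs that predate them):

  #include <sys/prctl.h>

  #ifndef PR_MTE_TCF_SYNC
  #define PR_SET_TAGGED_ADDR_CTRL	55
  #define PR_TAGGED_ADDR_ENABLE	(1UL << 0)
  #define PR_MTE_TCF_SYNC		(1UL << 1)
  #endif

  /* enable the tagged address ABI and synchronous tag check faults */
  prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC,
        0, 0, 0);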

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---

Notes:
    v3:
    - Use SCTLR_EL1_TCF0_NONE instead of 0 for consistency.
    - Move mte_thread_switch() in this patch from an earlier one. In
      addition, it is called after the dsb() in __switch_to() so that any
      asynchronous tag check faults have been registered in the TFSR_EL1
      registers (to be added with the in-kernel MTE support).
    
    v2:
    - Handle SCTLR_EL1_TCF0_NONE explicitly for consistency with PR_MTE_TCF_NONE.
    - Fix SCTLR_EL1 register setting in flush_mte_state() (thanks to Peter
      Collingbourne).
    - Added ISB to update_sctlr_el1_tcf0() since, with the latest
      architecture update/fix, the TCF0 field is used by the uaccess
      routines.

 arch/arm64/include/asm/mte.h       | 14 ++++++
 arch/arm64/include/asm/processor.h |  3 ++
 arch/arm64/kernel/mte.c            | 77 ++++++++++++++++++++++++++++++
 arch/arm64/kernel/process.c        | 26 ++++++++--
 include/uapi/linux/prctl.h         |  6 +++
 5 files changed, 123 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index c1a09499c678..c79dbffd30c4 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -21,6 +21,9 @@ void mte_clear_page_tags(void *addr, size_t size);
 void mte_sync_tags(pte_t *ptep, pte_t pte);
 void mte_copy_page_tags(void *kto, const void *kfrom);
 void flush_mte_state(void);
+void mte_thread_switch(struct task_struct *next);
+long set_mte_ctrl(unsigned long arg);
+long get_mte_ctrl(void);
 
 #else
 
@@ -36,6 +39,17 @@ static inline void mte_copy_page_tags(void *kto, const void *kfrom)
 static inline void flush_mte_state(void)
 {
 }
+static inline void mte_thread_switch(struct task_struct *next)
+{
+}
+static inline long set_mte_ctrl(unsigned long arg)
+{
+	return 0;
+}
+static inline long get_mte_ctrl(void)
+{
+	return 0;
+}
 
 #endif
 
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 240fe5e5b720..80e7f0573309 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -151,6 +151,9 @@ struct thread_struct {
 	struct ptrauth_keys_user	keys_user;
 	struct ptrauth_keys_kernel	keys_kernel;
 #endif
+#ifdef CONFIG_ARM64_MTE
+	u64			sctlr_tcf0;
+#endif
 };
 
 static inline void arch_thread_struct_whitelist(unsigned long *offset,
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index da2d70178a4b..3e425f124360 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -5,6 +5,8 @@
 
 #include <linux/bitops.h>
 #include <linux/mm.h>
+#include <linux/prctl.h>
+#include <linux/sched.h>
 #include <linux/string.h>
 #include <linux/thread_info.h>
 
@@ -49,6 +51,26 @@ int memcmp_pages(struct page *page1, struct page *page2)
 	return ret;
 }
 
+static void update_sctlr_el1_tcf0(u64 tcf0)
+{
+	/* ISB required for the kernel uaccess routines */
+	sysreg_clear_set(sctlr_el1, SCTLR_EL1_TCF0_MASK, tcf0);
+	isb();
+}
+
+static void set_sctlr_el1_tcf0(u64 tcf0)
+{
+	/*
+	 * mte_thread_switch() checks current->thread.sctlr_tcf0 as an
+	 * optimisation. Disable preemption so that it does not see
+	 * the variable update before the SCTLR_EL1.TCF0 one.
+	 */
+	preempt_disable();
+	current->thread.sctlr_tcf0 = tcf0;
+	update_sctlr_el1_tcf0(tcf0);
+	preempt_enable();
+}
+
 void flush_mte_state(void)
 {
 	if (!system_supports_mte())
@@ -58,4 +80,59 @@ void flush_mte_state(void)
 	dsb(ish);
 	write_sysreg_s(0, SYS_TFSRE0_EL1);
 	clear_thread_flag(TIF_MTE_ASYNC_FAULT);
+	/* disable tag checking */
+	set_sctlr_el1_tcf0(SCTLR_EL1_TCF0_NONE);
+}
+
+void mte_thread_switch(struct task_struct *next)
+{
+	if (!system_supports_mte())
+		return;
+
+	/* avoid expensive SCTLR_EL1 accesses if no change */
+	if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
+		update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
+}
+
+long set_mte_ctrl(unsigned long arg)
+{
+	u64 tcf0;
+
+	if (!system_supports_mte())
+		return 0;
+
+	switch (arg & PR_MTE_TCF_MASK) {
+	case PR_MTE_TCF_NONE:
+		tcf0 = SCTLR_EL1_TCF0_NONE;
+		break;
+	case PR_MTE_TCF_SYNC:
+		tcf0 = SCTLR_EL1_TCF0_SYNC;
+		break;
+	case PR_MTE_TCF_ASYNC:
+		tcf0 = SCTLR_EL1_TCF0_ASYNC;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	set_sctlr_el1_tcf0(tcf0);
+
+	return 0;
+}
+
+long get_mte_ctrl(void)
+{
+	if (!system_supports_mte())
+		return 0;
+
+	switch (current->thread.sctlr_tcf0) {
+	case SCTLR_EL1_TCF0_NONE:
+		return PR_MTE_TCF_NONE;
+	case SCTLR_EL1_TCF0_SYNC:
+		return PR_MTE_TCF_SYNC;
+	case SCTLR_EL1_TCF0_ASYNC:
+		return PR_MTE_TCF_ASYNC;
+	}
+
+	return 0;
 }
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 740047c9cd13..ff6031a398d0 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -529,6 +529,13 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
 	 */
 	dsb(ish);
 
+	/*
+	 * MTE thread switching must happen after the DSB above to ensure that
+	 * any asynchronous tag check faults have been logged in the TFSR*_EL1
+	 * registers.
+	 */
+	mte_thread_switch(next);
+
 	/* the actual thread switch */
 	last = cpu_switch_to(prev, next);
 
@@ -588,9 +595,15 @@ static unsigned int tagged_addr_disabled;
 
 long set_tagged_addr_ctrl(unsigned long arg)
 {
+	unsigned long valid_mask = PR_TAGGED_ADDR_ENABLE;
+
 	if (is_compat_task())
 		return -EINVAL;
-	if (arg & ~PR_TAGGED_ADDR_ENABLE)
+
+	if (system_supports_mte())
+		valid_mask |= PR_MTE_TCF_MASK;
+
+	if (arg & ~valid_mask)
 		return -EINVAL;
 
 	/*
@@ -600,6 +613,9 @@ long set_tagged_addr_ctrl(unsigned long arg)
 	if (arg & PR_TAGGED_ADDR_ENABLE && tagged_addr_disabled)
 		return -EINVAL;
 
+	if (set_mte_ctrl(arg) != 0)
+		return -EINVAL;
+
 	update_thread_flag(TIF_TAGGED_ADDR, arg & PR_TAGGED_ADDR_ENABLE);
 
 	return 0;
@@ -607,13 +623,17 @@ long set_tagged_addr_ctrl(unsigned long arg)
 
 long get_tagged_addr_ctrl(void)
 {
+	long ret = 0;
+
 	if (is_compat_task())
 		return -EINVAL;
 
 	if (test_thread_flag(TIF_TAGGED_ADDR))
-		return PR_TAGGED_ADDR_ENABLE;
+		ret = PR_TAGGED_ADDR_ENABLE;
 
-	return 0;
+	ret |= get_mte_ctrl();
+
+	return ret;
 }
 
 /*
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 07b4f8131e36..2390ab324afa 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -233,6 +233,12 @@ struct prctl_mm_map {
 #define PR_SET_TAGGED_ADDR_CTRL		55
 #define PR_GET_TAGGED_ADDR_CTRL		56
 # define PR_TAGGED_ADDR_ENABLE		(1UL << 0)
+/* MTE tag check fault modes */
+# define PR_MTE_TCF_SHIFT		1
+# define PR_MTE_TCF_NONE		(0UL << PR_MTE_TCF_SHIFT)
+# define PR_MTE_TCF_SYNC		(1UL << PR_MTE_TCF_SHIFT)
+# define PR_MTE_TCF_ASYNC		(2UL << PR_MTE_TCF_SHIFT)
+# define PR_MTE_TCF_MASK		(3UL << PR_MTE_TCF_SHIFT)
 
 /* Control reclaim behavior when allocating memory */
 #define PR_SET_IO_FLUSHER		57
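
For illustration only (not part of the patch): a minimal user-space
sketch exercising the new tag check fault modes, assuming an
MTE-capable kernel with this series applied; the PR_* values match the
uapi hunk above:

  #include <stdio.h>
  #include <sys/prctl.h>

  #ifndef PR_SET_TAGGED_ADDR_CTRL
  #define PR_SET_TAGGED_ADDR_CTRL	55
  #define PR_GET_TAGGED_ADDR_CTRL	56
  #define PR_TAGGED_ADDR_ENABLE		(1UL << 0)
  #endif
  #ifndef PR_MTE_TCF_SHIFT
  #define PR_MTE_TCF_SHIFT		1
  #define PR_MTE_TCF_SYNC			(1UL << PR_MTE_TCF_SHIFT)
  #endif

  int main(void)
  {
  	/* enable tagged addresses with synchronous tag check faults */
  	if (prctl(PR_SET_TAGGED_ADDR_CTRL,
  		  PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC, 0, 0, 0)) {
  		perror("PR_SET_TAGGED_ADDR_CTRL");
  		return 1;
  	}

  	/* read back the combined control word */
  	printf("tagged addr ctrl: 0x%x\n",
  	       prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0));
  	return 0;
  }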

* [PATCH v4 16/26] arm64: mte: Allow user control of the generated random tags via prctl()
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:16   ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:16 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne

The IRG, ADDG and SUBG instructions insert a random tag in the resulting
address. Certain tags can be excluded via the GCR_EL1.Exclude bitmap
when, for example, the user wants a certain colour for freed buffers.
Since the GCR_EL1 register is not accessible at EL0, extend the
prctl(PR_SET_TAGGED_ADDR_CTRL) interface to include a 16-bit field in
the first argument for controlling which tags can be generated by the
above instructions (an include rather than an exclude mask). Note that by
default all non-zero tags are excluded. This setting is per-thread.
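
For illustration only (not part of the patch): a hedged user-space
sketch restricting the generated tags to 1-3, using the PR_MTE_TAG_*
constants added by the uapi hunk below:

  #include <sys/prctl.h>

  #ifndef PR_SET_TAGGED_ADDR_CTRL
  #define PR_SET_TAGGED_ADDR_CTRL	55
  #define PR_TAGGED_ADDR_ENABLE		(1UL << 0)
  #endif
  #ifndef PR_MTE_TAG_SHIFT
  #define PR_MTE_TCF_SHIFT		1
  #define PR_MTE_TCF_SYNC			(1UL << PR_MTE_TCF_SHIFT)
  #define PR_MTE_TAG_SHIFT		3
  #endif

  /* allow IRG/ADDG/SUBG to generate only tags 1-3; the rest stay excluded */
  static int enable_tags_1_to_3(void)
  {
  	/* include mask: bit n set means tag n may be generated */
  	unsigned long incl = (1UL << 1) | (1UL << 2) | (1UL << 3);

  	return prctl(PR_SET_TAGGED_ADDR_CTRL,
  		     PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC |
  		     (incl << PR_MTE_TAG_SHIFT),
  		     0, 0, 0);
  }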

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---

Notes:
    v2:
    - Switch from an exclude mask to an include one for the prctl()
      interface.
    - Reset the allowed tags mask during flush_thread().

 arch/arm64/include/asm/processor.h |  1 +
 arch/arm64/include/asm/sysreg.h    |  7 ++++++
 arch/arm64/kernel/mte.c            | 35 +++++++++++++++++++++++++++---
 arch/arm64/kernel/process.c        |  2 +-
 include/uapi/linux/prctl.h         |  3 +++
 5 files changed, 44 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 80e7f0573309..996b882a32d9 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -153,6 +153,7 @@ struct thread_struct {
 #endif
 #ifdef CONFIG_ARM64_MTE
 	u64			sctlr_tcf0;
+	u64			gcr_incl;
 #endif
 };
 
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 86236ae6c4e7..cb247f2f75ca 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -981,6 +981,13 @@
 		write_sysreg(__scs_new, sysreg);			\
 } while (0)
 
+#define sysreg_clear_set_s(sysreg, clear, set) do {			\
+	u64 __scs_val = read_sysreg_s(sysreg);				\
+	u64 __scs_new = (__scs_val & ~(u64)(clear)) | (set);		\
+	if (__scs_new != __scs_val)					\
+		write_sysreg_s(__scs_new, sysreg);			\
+} while (0)
+
 #endif
 
 #endif	/* __ASM_SYSREG_H */
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 3e425f124360..999300e2110e 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -71,6 +71,25 @@ static void set_sctlr_el1_tcf0(u64 tcf0)
 	preempt_enable();
 }
 
+static void update_gcr_el1_excl(u64 incl)
+{
+	u64 excl = ~incl & SYS_GCR_EL1_EXCL_MASK;
+
+	/*
+	 * Note that 'incl' is an include mask (controlled by the user via
+	 * prctl()) while GCR_EL1 accepts an exclude mask.
+	 * No need for ISB since this only affects EL0 currently, implicit
+	 * with ERET.
+	 */
+	sysreg_clear_set_s(SYS_GCR_EL1, SYS_GCR_EL1_EXCL_MASK, excl);
+}
+
+static void set_gcr_el1_excl(u64 incl)
+{
+	current->thread.gcr_incl = incl;
+	update_gcr_el1_excl(incl);
+}
+
 void flush_mte_state(void)
 {
 	if (!system_supports_mte())
@@ -82,6 +101,8 @@ void flush_mte_state(void)
 	clear_thread_flag(TIF_MTE_ASYNC_FAULT);
 	/* disable tag checking */
 	set_sctlr_el1_tcf0(SCTLR_EL1_TCF0_NONE);
+	/* reset tag generation mask */
+	set_gcr_el1_excl(0);
 }
 
 void mte_thread_switch(struct task_struct *next)
@@ -92,6 +113,7 @@ void mte_thread_switch(struct task_struct *next)
 	/* avoid expensive SCTLR_EL1 accesses if no change */
 	if (current->thread.sctlr_tcf0 != next->thread.sctlr_tcf0)
 		update_sctlr_el1_tcf0(next->thread.sctlr_tcf0);
+	update_gcr_el1_excl(next->thread.gcr_incl);
 }
 
 long set_mte_ctrl(unsigned long arg)
@@ -116,23 +138,30 @@ long set_mte_ctrl(unsigned long arg)
 	}
 
 	set_sctlr_el1_tcf0(tcf0);
+	set_gcr_el1_excl((arg & PR_MTE_TAG_MASK) >> PR_MTE_TAG_SHIFT);
 
 	return 0;
 }
 
 long get_mte_ctrl(void)
 {
+	unsigned long ret;
+
 	if (!system_supports_mte())
 		return 0;
 
+	ret = current->thread.gcr_incl << PR_MTE_TAG_SHIFT;
+
 	switch (current->thread.sctlr_tcf0) {
 	case SCTLR_EL1_TCF0_NONE:
 		return PR_MTE_TCF_NONE;
 	case SCTLR_EL1_TCF0_SYNC:
-		return PR_MTE_TCF_SYNC;
+		ret |= PR_MTE_TCF_SYNC;
+		break;
 	case SCTLR_EL1_TCF0_ASYNC:
-		return PR_MTE_TCF_ASYNC;
+		ret |= PR_MTE_TCF_ASYNC;
+		break;
 	}
 
-	return 0;
+	return ret;
 }
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index ff6031a398d0..697571be259b 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -601,7 +601,7 @@ long set_tagged_addr_ctrl(unsigned long arg)
 		return -EINVAL;
 
 	if (system_supports_mte())
-		valid_mask |= PR_MTE_TCF_MASK;
+		valid_mask |= PR_MTE_TCF_MASK | PR_MTE_TAG_MASK;
 
 	if (arg & ~valid_mask)
 		return -EINVAL;
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 2390ab324afa..7f0827705c9a 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -239,6 +239,9 @@ struct prctl_mm_map {
 # define PR_MTE_TCF_SYNC		(1UL << PR_MTE_TCF_SHIFT)
 # define PR_MTE_TCF_ASYNC		(2UL << PR_MTE_TCF_SHIFT)
 # define PR_MTE_TCF_MASK		(3UL << PR_MTE_TCF_SHIFT)
+/* MTE tag inclusion mask */
+# define PR_MTE_TAG_SHIFT		3
+# define PR_MTE_TAG_MASK		(0xffffUL << PR_MTE_TAG_SHIFT)
 
 /* Control reclaim behavior when allocating memory */
 #define PR_SET_IO_FLUSHER		57

* [PATCH v4 17/26] arm64: mte: Restore the GCR_EL1 register after a suspend
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:16   ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:16 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne

The CPU resume/suspend routines only take care of the common system
registers. Restore GCR_EL1 in addition via the __cpu_suspend_exit()
function.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---

Notes:
    New in v3.

 arch/arm64/include/asm/mte.h | 4 ++++
 arch/arm64/kernel/mte.c      | 8 ++++++++
 arch/arm64/kernel/suspend.c  | 4 ++++
 3 files changed, 16 insertions(+)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index c79dbffd30c4..7435f6619bf1 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -22,6 +22,7 @@ void mte_sync_tags(pte_t *ptep, pte_t pte);
 void mte_copy_page_tags(void *kto, const void *kfrom);
 void flush_mte_state(void);
 void mte_thread_switch(struct task_struct *next);
+void mte_suspend_exit(void);
 long set_mte_ctrl(unsigned long arg);
 long get_mte_ctrl(void);
 
@@ -42,6 +43,9 @@ static inline void flush_mte_state(void)
 static inline void mte_thread_switch(struct task_struct *next)
 {
 }
+static inline void mte_suspend_exit(void)
+{
+}
 static inline long set_mte_ctrl(unsigned long arg)
 {
 	return 0;
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 999300e2110e..cda09bc8caf4 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -116,6 +116,14 @@ void mte_thread_switch(struct task_struct *next)
 	update_gcr_el1_excl(next->thread.gcr_incl);
 }
 
+void mte_suspend_exit(void)
+{
+	if (!system_supports_mte())
+		return;
+
+	update_gcr_el1_excl(current->thread.gcr_incl);
+}
+
 long set_mte_ctrl(unsigned long arg)
 {
 	u64 tcf0;
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index 9405d1b7f4b0..1d405b73d009 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -9,6 +9,7 @@
 #include <asm/daifflags.h>
 #include <asm/debug-monitors.h>
 #include <asm/exec.h>
+#include <asm/mte.h>
 #include <asm/pgtable.h>
 #include <asm/memory.h>
 #include <asm/mmu_context.h>
@@ -74,6 +75,9 @@ void notrace __cpu_suspend_exit(void)
 	 */
 	if (arm64_get_ssbd_state() == ARM64_SSBD_FORCE_DISABLE)
 		arm64_set_ssbd_mitigation(false);
+
+	/* Restore additional MTE-specific configuration */
+	mte_suspend_exit();
 }
 
 /*

* [PATCH v4 18/26] arm64: mte: Add PTRACE_{PEEK,POKE}MTETAGS support
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:16   ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:16 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Alan Hayward,
	Luis Machado, Omair Javaid

Add support for bulk setting/getting of the MTE tags in a tracee's
address space at 'addr' in the ptrace() syscall prototype. 'data' points
to a struct iovec in the tracer's address space with iov_base
representing the address of a tracer's buffer of length iov_len. The
tags to be copied to/from the tracer's buffer are stored as one tag per
byte.

On successfully copying at least one tag, ptrace() returns 0 and updates
the tracer's iov_len with the number of tags copied. In case of error,
either -EIO or -EFAULT is returned, following the ptrace() man page as
closely as possible.

Note that the tag copying functions are not performance critical,
therefore they lack optimisations found in typical memory copy routines.
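
For illustration only (not part of the patch): a sketch of a tracer
reading tags with the new request, assuming the tracee is already
stopped under ptrace:

  #include <errno.h>
  #include <sys/ptrace.h>
  #include <sys/types.h>
  #include <sys/uio.h>

  #ifndef PTRACE_PEEKMTETAGS
  #define PTRACE_PEEKMTETAGS	33
  #endif

  /* read up to 'len' tags (one tag per byte) starting at 'addr' in the tracee */
  static ssize_t peek_tags(pid_t pid, void *addr, unsigned char *tags,
  			 size_t len)
  {
  	struct iovec iov = { .iov_base = tags, .iov_len = len };

  	if (ptrace(PTRACE_PEEKMTETAGS, pid, addr, &iov))
  		return -errno;

  	/* on success the kernel updates iov_len to the number of tags copied */
  	return iov.iov_len;
  }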

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Alan Hayward <Alan.Hayward@arm.com>
Cc: Luis Machado <luis.machado@linaro.org>
Cc: Omair Javaid <omair.javaid@linaro.org>
---

Notes:
    v4:
    - Following the change to only clear the tags in a page if it is mapped
      to user with PROT_MTE, ptrace() now will refuse to access tags in
      pages not previously mapped with PROT_MTE (PG_mte_tagged set). This is
      primarily to avoid leaking uninitialised tags to user via ptrace().
    - Fix SYM_FUNC_END argument typo.
    - Rename MTE_ALLOC_* to MTE_GRANULE_*.
    - Use uao_user_alternative for the user access in case we ever want to
      call mte_copy_tags_* with a kernel buffer. It also matches the other
      uaccess routines in the kernel.
    - Simplify arch_ptrace() slightly.
    - Reorder down_write_killable() with access_ok() in
      __access_remote_tags().
    - Handle copy length 0 in mte_copy_tags_{to,from}_user().
    - Use put_user() instead of __put_user().
    
    New in v3.

 arch/arm64/include/asm/mte.h         |  17 ++++
 arch/arm64/include/uapi/asm/ptrace.h |   3 +
 arch/arm64/kernel/mte.c              | 139 +++++++++++++++++++++++++++
 arch/arm64/kernel/ptrace.c           |   7 ++
 arch/arm64/lib/mte.S                 |  53 ++++++++++
 5 files changed, 219 insertions(+)

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index 7435f6619bf1..9d4b1390d07d 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -5,6 +5,11 @@
 #ifndef __ASM_MTE_H
 #define __ASM_MTE_H
 
+#define MTE_GRANULE_SIZE	UL(16)
+#define MTE_GRANULE_MASK	(~(MTE_GRANULE_SIZE - 1))
+#define MTE_TAG_SHIFT		56
+#define MTE_TAG_SIZE		4
+
 #ifndef __ASSEMBLY__
 
 #include <linux/page-flags.h>
@@ -12,6 +17,10 @@
 #include <asm/pgtable-types.h>
 
 void mte_clear_page_tags(void *addr, size_t size);
+unsigned long mte_copy_tags_from_user(void *to, const void __user *from,
+				      unsigned long n);
+unsigned long mte_copy_tags_to_user(void __user *to, void *from,
+				    unsigned long n);
 
 #ifdef CONFIG_ARM64_MTE
 
@@ -25,6 +34,8 @@ void mte_thread_switch(struct task_struct *next);
 void mte_suspend_exit(void);
 long set_mte_ctrl(unsigned long arg);
 long get_mte_ctrl(void);
+int mte_ptrace_copy_tags(struct task_struct *child, long request,
+			 unsigned long addr, unsigned long data);
 
 #else
 
@@ -54,6 +65,12 @@ static inline long get_mte_ctrl(void)
 {
 	return 0;
 }
+static inline int mte_ptrace_copy_tags(struct task_struct *child,
+				       long request, unsigned long addr,
+				       unsigned long data)
+{
+	return -EIO;
+}
 
 #endif
 
diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h
index 1daf6dda8af0..cd2a4a164de3 100644
--- a/arch/arm64/include/uapi/asm/ptrace.h
+++ b/arch/arm64/include/uapi/asm/ptrace.h
@@ -67,6 +67,9 @@
 /* syscall emulation path in ptrace */
 #define PTRACE_SYSEMU		  31
 #define PTRACE_SYSEMU_SINGLESTEP  32
+/* MTE allocation tag access */
+#define PTRACE_PEEKMTETAGS	  33
+#define PTRACE_POKEMTETAGS	  34
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index cda09bc8caf4..6abd6a16a145 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -4,14 +4,18 @@
  */
 
 #include <linux/bitops.h>
+#include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/prctl.h>
 #include <linux/sched.h>
+#include <linux/sched/mm.h>
 #include <linux/string.h>
 #include <linux/thread_info.h>
+#include <linux/uio.h>
 
 #include <asm/cpufeature.h>
 #include <asm/mte.h>
+#include <asm/ptrace.h>
 #include <asm/sysreg.h>
 
 void mte_sync_tags(pte_t *ptep, pte_t pte)
@@ -173,3 +177,138 @@ long get_mte_ctrl(void)
 
 	return ret;
 }
+
+/*
+ * Access MTE tags in another process' address space as given in mm. Update
+ * the number of tags copied. Return 0 if any tags copied, error otherwise.
+ * Inspired by __access_remote_vm().
+ */
+static int __access_remote_tags(struct task_struct *tsk, struct mm_struct *mm,
+				unsigned long addr, struct iovec *kiov,
+				unsigned int gup_flags)
+{
+	struct vm_area_struct *vma;
+	void __user *buf = kiov->iov_base;
+	size_t len = kiov->iov_len;
+	int ret;
+	int write = gup_flags & FOLL_WRITE;
+
+	if (!access_ok(buf, len))
+		return -EFAULT;
+
+	if (down_read_killable(&mm->mmap_sem))
+		return -EIO;
+
+	while (len) {
+		unsigned long tags, offset;
+		void *maddr;
+		struct page *page = NULL;
+
+		ret = get_user_pages_remote(tsk, mm, addr, 1, gup_flags,
+					    &page, &vma, NULL);
+		if (ret <= 0)
+			break;
+
+		/*
+		 * Only copy tags if the page has been mapped as PROT_MTE
+		 * (PG_mte_tagged set). Otherwise the tags are not valid and
+		 * not accessible to user. Moreover, an mprotect(PROT_MTE)
+		 * would cause the existing tags to be cleared if the page
+		 * was never mapped with PROT_MTE.
+		 */
+		if (!test_bit(PG_mte_tagged, &page->flags)) {
+			ret = -EOPNOTSUPP;
+			put_page(page);
+			break;
+		}
+
+		/* limit access to the end of the page */
+		offset = offset_in_page(addr);
+		tags = min(len, (PAGE_SIZE - offset) / MTE_GRANULE_SIZE);
+
+		maddr = page_address(page);
+		if (write) {
+			tags = mte_copy_tags_from_user(maddr + offset, buf, tags);
+			set_page_dirty_lock(page);
+		} else {
+			tags = mte_copy_tags_to_user(buf, maddr + offset, tags);
+		}
+		put_page(page);
+
+		/* error accessing the tracer's buffer */
+		if (!tags)
+			break;
+
+		len -= tags;
+		buf += tags;
+		addr += tags * MTE_GRANULE_SIZE;
+	}
+	up_read(&mm->mmap_sem);
+
+	/* return an error if no tags copied */
+	kiov->iov_len = buf - kiov->iov_base;
+	if (!kiov->iov_len) {
+		/* check for error accessing the tracee's address space */
+		if (ret <= 0)
+			return -EIO;
+		else
+			return -EFAULT;
+	}
+
+	return 0;
+}
+
+/*
+ * Copy MTE tags in another process' address space at 'addr' to/from tracer's
+ * iovec buffer. Return 0 on success. Inspired by ptrace_access_vm().
+ */
+static int access_remote_tags(struct task_struct *tsk, unsigned long addr,
+			      struct iovec *kiov, unsigned int gup_flags)
+{
+	struct mm_struct *mm;
+	int ret;
+
+	mm = get_task_mm(tsk);
+	if (!mm)
+		return -EPERM;
+
+	if (!tsk->ptrace || (current != tsk->parent) ||
+	    ((get_dumpable(mm) != SUID_DUMP_USER) &&
+	     !ptracer_capable(tsk, mm->user_ns))) {
+		mmput(mm);
+		return -EPERM;
+	}
+
+	ret = __access_remote_tags(tsk, mm, addr, kiov, gup_flags);
+	mmput(mm);
+
+	return ret;
+}
+
+int mte_ptrace_copy_tags(struct task_struct *child, long request,
+			 unsigned long addr, unsigned long data)
+{
+	int ret;
+	struct iovec kiov;
+	struct iovec __user *uiov = (void __user *)data;
+	unsigned int gup_flags = FOLL_FORCE;
+
+	if (!system_supports_mte())
+		return -EIO;
+
+	if (get_user(kiov.iov_base, &uiov->iov_base) ||
+	    get_user(kiov.iov_len, &uiov->iov_len))
+		return -EFAULT;
+
+	if (request == PTRACE_POKEMTETAGS)
+		gup_flags |= FOLL_WRITE;
+
+	/* align addr to the MTE tag granule */
+	addr &= MTE_GRANULE_MASK;
+
+	ret = access_remote_tags(child, addr, &kiov, gup_flags);
+	if (!ret)
+		ret = put_user(kiov.iov_len, &uiov->iov_len);
+
+	return ret;
+}
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 077e352495eb..c8bda5f5b321 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -34,6 +34,7 @@
 #include <asm/cpufeature.h>
 #include <asm/debug-monitors.h>
 #include <asm/fpsimd.h>
+#include <asm/mte.h>
 #include <asm/pgtable.h>
 #include <asm/pointer_auth.h>
 #include <asm/stacktrace.h>
@@ -1797,6 +1798,12 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task)
 long arch_ptrace(struct task_struct *child, long request,
 		 unsigned long addr, unsigned long data)
 {
+	switch (request) {
+	case PTRACE_PEEKMTETAGS:
+	case PTRACE_POKEMTETAGS:
+		return mte_ptrace_copy_tags(child, request, addr, data);
+	}
+
 	return ptrace_request(child, request, addr, data);
 }
 
diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
index a531b52fa5ba..862655a36013 100644
--- a/arch/arm64/lib/mte.S
+++ b/arch/arm64/lib/mte.S
@@ -4,7 +4,9 @@
  */
 #include <linux/linkage.h>
 
+#include <asm/alternative.h>
 #include <asm/assembler.h>
+#include <asm/mte.h>
 #include <asm/page.h>
 
 	.arch	armv8.5-a+memtag
@@ -40,3 +42,54 @@ SYM_FUNC_START(mte_copy_page_tags)
 	b.ne	1b
 2:
 SYM_FUNC_END(mte_copy_page_tags)
+
+/*
+ * Read tags from a user buffer (one tag per byte) and set the corresponding
+ * tags at the given kernel address. Used by PTRACE_POKEMTETAGS.
+ *   x0 - kernel address (to)
+ *   x1 - user buffer (from)
+ *   x2 - number of tags/bytes (n)
+ * Returns:
+ *   x0 - number of tags read/set
+ */
+SYM_FUNC_START(mte_copy_tags_from_user)
+	mov	x3, x1
+	cbz	x2, 2f
+1:
+	uao_user_alternative 2f, ldrb, ldtrb, w4, x1, 0
+	lsl	x4, x4, #MTE_TAG_SHIFT
+	stg	x4, [x0], #MTE_GRANULE_SIZE
+	add	x1, x1, #1
+	subs	x2, x2, #1
+	b.ne	1b
+
+	// exception handling and function return
+2:	sub	x0, x1, x3		// update the number of tags set
+	ret
+SYM_FUNC_END(mte_copy_tags_from_user)
+
+/*
+ * Get the tags from a kernel address range and write the tag values to the
+ * given user buffer (one tag per byte). Used by PTRACE_PEEKMTETAGS.
+ *   x0 - user buffer (to)
+ *   x1 - kernel address (from)
+ *   x2 - number of tags/bytes (n)
+ * Returns:
+ *   x0 - number of tags read/set
+ */
+SYM_FUNC_START(mte_copy_tags_to_user)
+	mov	x3, x0
+	cbz	x2, 2f
+1:
+	ldg	x4, [x1]
+	ubfx	x4, x4, #MTE_TAG_SHIFT, #MTE_TAG_SIZE
+	uao_user_alternative 2f, strb, sttrb, w4, x0, 0
+	add	x0, x0, #1
+	add	x1, x1, #MTE_GRANULE_SIZE
+	subs	x2, x2, #1
+	b.ne	1b
+
+	// exception handling and function return
+2:	sub	x0, x0, x3		// update the number of tags copied
+	ret
+SYM_FUNC_END(mte_copy_tags_to_user)

* [PATCH v4 19/26] fs: Handle intra-page faults in copy_mount_options()
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:16   ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:16 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Alexander Viro

The copy_mount_options() function takes a user pointer argument but no
size. It tries to read up to PAGE_SIZE bytes. However, copy_from_user() is
not guaranteed to return all the accessible bytes if, for example, the
access crosses a page boundary and gets a fault on the second page. To
work around this, the current copy_mount_options() implementation
performs two copy_from_user() passes, first to the end of the current
page and the second to what's left in the subsequent page.

On arm64 with MTE enabled, access to a user page may trigger a fault
after part of the buffer has been copied (when the user pointer tag,
bits 56-59, no longer matches the allocation tag stored in memory).
Allow copy_mount_options() to handle such intra-page faults by returning
-EFAULT only if the first copy_from_user() has not copied any bytes.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
---

Notes:
    v4:
    - Rewrite to avoid arch_has_exact_copy_from_user()
    
    New in v3.

 fs/namespace.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index a28e4db075ed..70aa60ccce57 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3016,7 +3016,7 @@ static void shrink_submounts(struct mount *mnt)
 void *copy_mount_options(const void __user * data)
 {
 	char *copy;
-	unsigned size;
+	unsigned size, left;
 
 	if (!data)
 		return NULL;
@@ -3027,12 +3027,30 @@ void *copy_mount_options(const void __user * data)
 
 	size = PAGE_SIZE - offset_in_page(data);
 
-	if (copy_from_user(copy, data, size)) {
+	/*
+	 * Attempt to copy to the end of the first user page. On success,
+	 * left == 0, copy the rest from the second user page (if it is
+	 * accessible). copy_from_user() will zero the part of the kernel
+	 * buffer not copied into.
+	 *
+	 * On architectures with intra-page faults (arm64 with MTE), the read
+	 * from the first page may fail after copying part of the user data
+	 * (left > 0 && left < size). Do not attempt the second copy in this
+	 * case as the end of the valid user buffer has already been reached.
+	 * Ensure, however, that the second part of the kernel buffer is
+	 * zeroed.
+	 */
+	left = copy_from_user(copy, data, size);
+	if (left == size) {
 		kfree(copy);
 		return ERR_PTR(-EFAULT);
 	}
 	if (size != PAGE_SIZE) {
-		if (copy_from_user(copy + size, data + size, PAGE_SIZE - size))
+		if (left == 0)
+			/* return not relevant, just silence the compiler */
+			left = copy_from_user(copy + size, data + size,
+					      PAGE_SIZE - size);
+		else
 			memset(copy + size, 0, PAGE_SIZE - size);
 	}
 	return copy;

* [PATCH v4 20/26] mm: Add arch hooks for saving/restoring tags
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:16   ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:16 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Steven Price,
	Andrew Morton

From: Steven Price <steven.price@arm.com>

Arm's Memory Tagging Extension (MTE) adds some metadata (tags) to
every physical page. When swapping pages out to disk it is necessary to
save these tags, and later restore them when reading the pages back.

Add some hooks along with dummy implementations to enable the
arch code to handle this.

Three new hooks are added to the swap code:
 * arch_prepare_to_swap() and
 * arch_swap_invalidate_page() / arch_swap_invalidate_area().
One new hook is added to shmem:
 * arch_swap_restore_tags()
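
For illustration only (not part of this patch): one shape an arch
implementation of these hooks might take, sketched below. The XArray
store and the mte_save_page_tags()/mte_restore_page_tags() helpers are
hypothetical, used only to show where the hooks plug in:

  #include <linux/mm.h>
  #include <linux/swap.h>
  #include <linux/xarray.h>

  static DEFINE_XARRAY(mte_tag_store);	/* keyed by the swap entry value */

  int arch_prepare_to_swap(struct page *page)
  {
  	void *tags;

  	if (!test_bit(PG_mte_tagged, &page->flags))
  		return 0;

  	tags = mte_save_page_tags(page);	/* hypothetical helper */
  	if (!tags)
  		return -ENOMEM;

  	/* for a page under swap writeback, page_private() is the swap entry */
  	return xa_err(xa_store(&mte_tag_store, page_private(page), tags,
  			       GFP_KERNEL));
  }

  void arch_swap_restore_tags(swp_entry_t entry, struct page *page)
  {
  	void *tags = xa_load(&mte_tag_store, entry.val);

  	if (tags) {
  		mte_restore_page_tags(page, tags);	/* hypothetical helper */
  		set_bit(PG_mte_tagged, &page->flags);
  	}
  }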

Signed-off-by: Steven Price <steven.price@arm.com>
[catalin.marinas@arm.com: add unlock_page() on the error path]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 include/asm-generic/pgtable.h | 23 +++++++++++++++++++++++
 mm/page_io.c                  | 10 ++++++++++
 mm/shmem.c                    |  6 ++++++
 mm/swapfile.c                 |  2 ++
 4 files changed, 41 insertions(+)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 329b8c8ca703..306cee75b9ec 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -475,6 +475,29 @@ static inline int arch_unmap_one(struct mm_struct *mm,
 }
 #endif
 
+#ifndef __HAVE_ARCH_PREPARE_TO_SWAP
+static inline int arch_prepare_to_swap(struct page *page)
+{
+	return 0;
+}
+#endif
+
+#ifndef __HAVE_ARCH_SWAP_INVALIDATE
+static inline void arch_swap_invalidate_page(int type, pgoff_t offset)
+{
+}
+
+static inline void arch_swap_invalidate_area(int type)
+{
+}
+#endif
+
+#ifndef __HAVE_ARCH_SWAP_RESTORE_TAGS
+static inline void arch_swap_restore_tags(swp_entry_t entry, struct page *page)
+{
+}
+#endif
+
 #ifndef __HAVE_ARCH_PGD_OFFSET_GATE
 #define pgd_offset_gate(mm, addr)	pgd_offset(mm, addr)
 #endif
diff --git a/mm/page_io.c b/mm/page_io.c
index 76965be1d40e..dae85d56e45b 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -253,6 +253,16 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
 		unlock_page(page);
 		goto out;
 	}
+	/*
+	 * Arch code may have to preserve more data than just the page
+	 * contents, e.g. memory tags.
+	 */
+	ret = arch_prepare_to_swap(page);
+	if (ret) {
+		set_page_dirty(page);
+		unlock_page(page);
+		goto out;
+	}
 	if (frontswap_store(page) == 0) {
 		set_page_writeback(page);
 		unlock_page(page);
diff --git a/mm/shmem.c b/mm/shmem.c
index 73754ed7af69..0e87925a7cb4 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1658,6 +1658,12 @@ static int shmem_swapin_page(struct inode *inode, pgoff_t index,
 	}
 	wait_on_page_writeback(page);
 
+	/*
+	 * Some architectures may have to restore extra metadata to the
+	 * physical page after reading from swap.
+	 */
+	arch_swap_restore_tags(swap, page);
+
 	if (shmem_should_replace_page(page, gfp)) {
 		error = shmem_replace_page(&page, gfp, info, index);
 		if (error)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 5871a2aa86a5..b39c6520b0cf 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -722,6 +722,7 @@ static void swap_range_free(struct swap_info_struct *si, unsigned long offset,
 	else
 		swap_slot_free_notify = NULL;
 	while (offset <= end) {
+		arch_swap_invalidate_page(si->type, offset);
 		frontswap_invalidate_page(si->type, offset);
 		if (swap_slot_free_notify)
 			swap_slot_free_notify(si->bdev, offset);
@@ -2645,6 +2646,7 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
 	frontswap_map = frontswap_map_get(p);
 	spin_unlock(&p->lock);
 	spin_unlock(&swap_lock);
+	arch_swap_invalidate_area(p->type);
 	frontswap_invalidate_area(p->type);
 	frontswap_map_set(p, NULL);
 	mutex_unlock(&swapon_mutex);


* [PATCH v4 21/26] arm64: mte: Enable swap of tagged pages
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:16   ` Catalin Marinas
  0 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:16 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Steven Price,
	Andrew Morton

From: Steven Price <steven.price@arm.com>

When swapping pages out to disk it is necessary to save any tags that
have been set, and to restore them when swapping back in. Make use of
the new page flag (PG_ARCH_2, locally named PG_mte_tagged) to identify
pages with tags. When swapping out these pages, the tags are stored in
memory and later restored when the pages are brought back in. Because
shmem can swap pages back in without restoring the userspace PTE, it is
also necessary to add a hook for shmem.
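
For scale (an illustration, consistent with mte_allocate_tag_storage()
in the diff below): MTE keeps one 4-bit tag per 16-byte granule, so the
out-of-line copy of a page's tags is small:

	/* Tag storage needed per page: one 4-bit tag per 16-byte
	 * granule, i.e. 2 tags stored per byte:
	 *
	 *   4096 / 16 = 256 tags per 4K page
	 *   256 / 2   = 128 bytes of storage (~3.1% of the page)
	 */
	size_t tag_bytes = PAGE_SIZE / 16 / 2;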

Signed-off-by: Steven Price <steven.price@arm.com>
[catalin.marinas@arm.com: move function prototypes to mte.h]
[catalin.marinas@arm.com: drop '_tags' from arch_swap_restore_tags()]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/mte.h     |  8 ++++
 arch/arm64/include/asm/pgtable.h | 32 +++++++++++++
 arch/arm64/kernel/mte.c          | 10 ++++
 arch/arm64/lib/mte.S             | 45 ++++++++++++++++++
 arch/arm64/mm/Makefile           |  1 +
 arch/arm64/mm/mteswap.c          | 82 ++++++++++++++++++++++++++++++++
 include/asm-generic/pgtable.h    |  4 +-
 mm/shmem.c                       |  2 +-
 8 files changed, 181 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm64/mm/mteswap.c

diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
index 9d4b1390d07d..ad43cfe88d5f 100644
--- a/arch/arm64/include/asm/mte.h
+++ b/arch/arm64/include/asm/mte.h
@@ -21,6 +21,14 @@ unsigned long mte_copy_tags_from_user(void *to, const void __user *from,
 				      unsigned long n);
 unsigned long mte_copy_tags_to_user(void __user *to, void *from,
 				    unsigned long n);
+int mte_save_tags(struct page *page);
+void mte_save_page_tags(const void *page_addr, void *tag_storage);
+bool mte_restore_tags(swp_entry_t entry, struct page *page);
+void mte_restore_page_tags(void *page_addr, const void *tag_storage);
+void mte_invalidate_tags(int type, pgoff_t offset);
+void mte_invalidate_tags_area(int type);
+void *mte_allocate_tag_storage(void);
+void mte_free_tag_storage(char *storage);
 
 #ifdef CONFIG_ARM64_MTE
 
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f2cd59b01b27..281d0b0f4fe1 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -851,6 +851,38 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
 
 extern int kern_addr_valid(unsigned long addr);
 
+#ifdef CONFIG_ARM64_MTE
+
+#define __HAVE_ARCH_PREPARE_TO_SWAP
+static inline int arch_prepare_to_swap(struct page *page)
+{
+	if (system_supports_mte())
+		return mte_save_tags(page);
+	return 0;
+}
+
+#define __HAVE_ARCH_SWAP_INVALIDATE
+static inline void arch_swap_invalidate_page(int type, pgoff_t offset)
+{
+	if (system_supports_mte())
+		mte_invalidate_tags(type, offset);
+}
+
+static inline void arch_swap_invalidate_area(int type)
+{
+	if (system_supports_mte())
+		mte_invalidate_tags_area(type);
+}
+
+#define __HAVE_ARCH_SWAP_RESTORE
+static inline void arch_swap_restore(swp_entry_t entry, struct page *page)
+{
+	if (system_supports_mte() && mte_restore_tags(entry, page))
+		set_bit(PG_mte_tagged, &page->flags);
+}
+
+#endif /* CONFIG_ARM64_MTE */
+
 #include <asm-generic/pgtable.h>
 
 /*
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
index 6abd6a16a145..7b3041b0f909 100644
--- a/arch/arm64/kernel/mte.c
+++ b/arch/arm64/kernel/mte.c
@@ -10,6 +10,8 @@
 #include <linux/sched.h>
 #include <linux/sched/mm.h>
 #include <linux/string.h>
+#include <linux/swap.h>
+#include <linux/swapops.h>
 #include <linux/thread_info.h>
 #include <linux/uio.h>
 
@@ -21,11 +23,19 @@
 void mte_sync_tags(pte_t *ptep, pte_t pte)
 {
 	struct page *page = pte_page(pte);
+	pte_t old_pte = READ_ONCE(*ptep);
 
 	/* if PG_mte_tagged is set, tags have already been initialised */
 	if (test_and_set_bit(PG_mte_tagged, &page->flags))
 		return;
 
+	if (is_swap_pte(old_pte)) {
+		swp_entry_t entry = pte_to_swp_entry(old_pte);
+
+		if (!non_swap_entry(entry) && mte_restore_tags(entry, page))
+			return;
+	}
+
 	/* tags of the zero page will be cleared on first mapping */
 	mte_clear_page_tags(page_address(page), page_size(page));
 }
diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
index 862655a36013..dd4593693e63 100644
--- a/arch/arm64/lib/mte.S
+++ b/arch/arm64/lib/mte.S
@@ -93,3 +93,48 @@ SYM_FUNC_START(mte_copy_tags_to_user)
 2:	sub	x0, x0, x3		// update the number of tags copied
 	ret
 SYM_FUNC_END(mte_copy_tags_to_user)
+
+/*
+ * Save the tags in a page
+ *   x0 - page address
+ *   x1 - tag storage
+ */
+SYM_FUNC_START(mte_save_page_tags)
+	multitag_transfer_size x7, x5
+1:
+	mov	x2, #0
+2:
+	ldgm	x5, [x0]
+	orr	x2, x2, x5
+	add	x0, x0, x7
+	tst	x0, #0xFF		// 16 tag values fit in a register,
+	b.ne	2b			// which is 16*16=256 bytes
+
+	str	x2, [x1], #8
+
+	tst	x0, #(PAGE_SIZE - 1)
+	b.ne	1b
+
+	ret
+SYM_FUNC_END(mte_save_page_tags)
+
+/*
+ * Restore the tags in a page
+ *   x0 - page address
+ *   x1 - tag storage
+ */
+SYM_FUNC_START(mte_restore_page_tags)
+	multitag_transfer_size x7, x5
+1:
+	ldr	x2, [x1], #8
+2:
+	stgm	x2, [x0]
+	add	x0, x0, x7
+	tst	x0, #0xFF
+	b.ne	2b
+
+	tst	x0, #(PAGE_SIZE - 1)
+	b.ne	1b
+
+	ret
+SYM_FUNC_END(mte_restore_page_tags)
diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
index d91030f0ffee..5bcc9e0aa259 100644
--- a/arch/arm64/mm/Makefile
+++ b/arch/arm64/mm/Makefile
@@ -8,6 +8,7 @@ obj-$(CONFIG_PTDUMP_CORE)	+= dump.o
 obj-$(CONFIG_PTDUMP_DEBUGFS)	+= ptdump_debugfs.o
 obj-$(CONFIG_NUMA)		+= numa.o
 obj-$(CONFIG_DEBUG_VIRTUAL)	+= physaddr.o
+obj-$(CONFIG_ARM64_MTE)		+= mteswap.o
 KASAN_SANITIZE_physaddr.o	+= n
 
 obj-$(CONFIG_KASAN)		+= kasan_init.o
diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c
new file mode 100644
index 000000000000..847d99814d03
--- /dev/null
+++ b/arch/arm64/mm/mteswap.c
@@ -0,0 +1,82 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/pagemap.h>
+#include <linux/xarray.h>
+#include <linux/swap.h>
+#include <linux/swapops.h>
+#include <asm/mte.h>
+
+static DEFINE_XARRAY(mte_pages);
+
+void *mte_allocate_tag_storage(void)
+{
+	/* tag granule is 16 bytes, 2 tags stored per byte */
+	return kmalloc(PAGE_SIZE / 16 / 2, GFP_KERNEL);
+}
+
+void mte_free_tag_storage(char *storage)
+{
+	kfree(storage);
+}
+
+int mte_save_tags(struct page *page)
+{
+	void *tag_storage, *ret;
+
+	if (!test_bit(PG_mte_tagged, &page->flags))
+		return 0;
+
+	tag_storage = mte_allocate_tag_storage();
+	if (!tag_storage)
+		return -ENOMEM;
+
+	mte_save_page_tags(page_address(page), tag_storage);
+
+	/* page_private contains the swap entry.val set in do_swap_page */
+	ret = xa_store(&mte_pages, page_private(page), tag_storage, GFP_KERNEL);
+	if (WARN(xa_is_err(ret), "Failed to store MTE tags")) {
+		mte_free_tag_storage(tag_storage);
+		return xa_err(ret);
+	} else if (ret) {
+		/* Entry is being replaced, free the old entry */
+		mte_free_tag_storage(ret);
+	}
+
+	return 0;
+}
+
+bool mte_restore_tags(swp_entry_t entry, struct page *page)
+{
+	void *tags = xa_load(&mte_pages, entry.val);
+
+	if (!tags)
+		return false;
+
+	mte_restore_page_tags(page_address(page), tags);
+
+	return true;
+}
+
+void mte_invalidate_tags(int type, pgoff_t offset)
+{
+	swp_entry_t entry = swp_entry(type, offset);
+	void *tags = xa_erase(&mte_pages, entry.val);
+
+	mte_free_tag_storage(tags);
+}
+
+void mte_invalidate_tags_area(int type)
+{
+	swp_entry_t entry = swp_entry(type, 0);
+	swp_entry_t last_entry = swp_entry(type + 1, 0);
+	void *tags;
+
+	XA_STATE(xa_state, &mte_pages, entry.val);
+
+	xa_lock(&mte_pages);
+	xas_for_each(&xa_state, tags, last_entry.val - 1) {
+		__xa_erase(&mte_pages, xa_state.xa_index);
+		mte_free_tag_storage(tags);
+	}
+	xa_unlock(&mte_pages);
+}
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 306cee75b9ec..09eb80160920 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -492,8 +492,8 @@ static inline void arch_swap_invalidate_area(int type)
 }
 #endif
 
-#ifndef __HAVE_ARCH_SWAP_RESTORE_TAGS
-static inline void arch_swap_restore_tags(swp_entry_t entry, struct page *page)
+#ifndef __HAVE_ARCH_SWAP_RESTORE
+static inline void arch_swap_restore(swp_entry_t entry, struct page *page)
 {
 }
 #endif
diff --git a/mm/shmem.c b/mm/shmem.c
index 0e87925a7cb4..003b0e96c746 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1662,7 +1662,7 @@ static int shmem_swapin_page(struct inode *inode, pgoff_t index,
 	 * Some architectures may have to restore extra metadata to the
 	 * physical page after reading from swap.
 	 */
-	arch_swap_restore_tags(swap, page);
+	arch_swap_restore(swap, page);
 
 	if (shmem_should_replace_page(page, gfp)) {
 		error = shmem_replace_page(&page, gfp, info, index);


* [PATCH v4 22/26] arm64: mte: Save tags when hibernating
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:16   ` Catalin Marinas
  0 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:16 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Steven Price, James Morse

From: Steven Price <steven.price@arm.com>

When hibernating, the contents of all pages in the system are written
to disk; however, the MTE tags are not visible to the generic
hibernation code. So, just before the hibernation image is created,
copy the tags out of the physical tag storage into standard memory so
that they are included in the hibernation image. After hibernation,
apply the tags back into the physical tag storage.
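
A condensed view of where the new calls sit in the suspend/resume path
(a sketch of the diff below, not a new mechanism):

	/* Suspend:
	 *   crash_prepare_suspend();
	 *   swsusp_mte_save_tags();     // tags -> kmalloc'd buffers keyed
	 *                               // by pfn; ordinary memory, so
	 *                               // swsusp_save() captures them in
	 *                               // the hibernation image
	 *   swsusp_save();
	 *
	 * Resume:
	 *   swsusp_mte_restore_tags();  // write tags back, free buffers
	 *   crash_post_resume();
	 */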

Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/kernel/hibernate.c | 118 ++++++++++++++++++++++++++++++++++
 1 file changed, 118 insertions(+)

diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index 5b73e92c99e3..d042ea325417 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -31,6 +31,7 @@
 #include <asm/kexec.h>
 #include <asm/memory.h>
 #include <asm/mmu_context.h>
+#include <asm/mte.h>
 #include <asm/pgalloc.h>
 #include <asm/pgtable.h>
 #include <asm/pgtable-hwdef.h>
@@ -277,6 +278,117 @@ static int create_safe_exec_page(void *src_start, size_t length,
 
 #define dcache_clean_range(start, end)	__flush_dcache_area(start, (end - start))
 
+#ifdef CONFIG_ARM64_MTE
+
+static DEFINE_XARRAY(mte_pages);
+
+static int save_tags(struct page *page, unsigned long pfn)
+{
+	void *tag_storage, *ret;
+
+	tag_storage = mte_allocate_tag_storage();
+	if (!tag_storage)
+		return -ENOMEM;
+
+	mte_save_page_tags(page_address(page), tag_storage);
+
+	ret = xa_store(&mte_pages, pfn, tag_storage, GFP_KERNEL);
+	if (WARN(xa_is_err(ret), "Failed to store MTE tags")) {
+		mte_free_tag_storage(tag_storage);
+		return xa_err(ret);
+	} else if (WARN(ret, "swsusp: %s: Duplicate entry", __func__)) {
+		mte_free_tag_storage(ret);
+	}
+
+	return 0;
+}
+
+static void swsusp_mte_free_storage(void)
+{
+	XA_STATE(xa_state, &mte_pages, 0);
+	void *tags;
+
+	xa_lock(&mte_pages);
+	xas_for_each(&xa_state, tags, ULONG_MAX) {
+		mte_free_tag_storage(tags);
+	}
+	xa_unlock(&mte_pages);
+
+	xa_destroy(&mte_pages);
+}
+
+static int swsusp_mte_save_tags(void)
+{
+	struct zone *zone;
+	unsigned long pfn, max_zone_pfn;
+	int ret = 0;
+	int n = 0;
+
+	if (!system_supports_mte())
+		return 0;
+
+	for_each_populated_zone(zone) {
+		max_zone_pfn = zone_end_pfn(zone);
+		for (pfn = zone->zone_start_pfn; pfn < max_zone_pfn; pfn++) {
+			struct page *page = pfn_to_online_page(pfn);
+
+			if (!page)
+				continue;
+
+			if (!test_bit(PG_mte_tagged, &page->flags))
+				continue;
+
+			ret = save_tags(page, pfn);
+			if (ret) {
+				swsusp_mte_free_storage();
+				goto out;
+			}
+
+			n++;
+		}
+	}
+	pr_info("Saved %d MTE pages\n", n);
+
+out:
+	return ret;
+}
+
+static void swsusp_mte_restore_tags(void)
+{
+	XA_STATE(xa_state, &mte_pages, 0);
+	int n = 0;
+	void *tags;
+
+	xa_lock(&mte_pages);
+	xas_for_each(&xa_state, tags, ULONG_MAX) {
+		unsigned long pfn = xa_state.xa_index;
+		struct page *page = pfn_to_online_page(pfn);
+
+		mte_restore_page_tags(page_address(page), tags);
+
+		mte_free_tag_storage(tags);
+		n++;
+	}
+	xa_unlock(&mte_pages);
+
+	pr_info("Restored %d MTE pages\n", n);
+
+	xa_destroy(&mte_pages);
+}
+
+#else	/* CONFIG_ARM64_MTE */
+
+static int swsusp_mte_save_tags(void)
+{
+	return 0;
+}
+
+static void swsusp_mte_restore_tags(void)
+{
+}
+
+#endif	/* CONFIG_ARM64_MTE */
+
 int swsusp_arch_suspend(void)
 {
 	int ret = 0;
@@ -294,6 +406,10 @@ int swsusp_arch_suspend(void)
 		/* make the crash dump kernel image visible/saveable */
 		crash_prepare_suspend();
 
+		ret = swsusp_mte_save_tags();
+		if (ret)
+			return ret;
+
 		sleep_cpu = smp_processor_id();
 		ret = swsusp_save();
 	} else {
@@ -307,6 +423,8 @@ int swsusp_arch_suspend(void)
 			dcache_clean_range(__hyp_text_start, __hyp_text_end);
 		}
 
+		swsusp_mte_restore_tags();
+
 		/* make the crash dump kernel image protected again */
 		crash_post_resume();
 


* [PATCH v4 23/26] arm64: mte: Check the DT memory nodes for MTE support
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:16   ` Catalin Marinas
  0 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:16 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Rob Herring, Mark Rutland

Even if the ID_AA64PFR1_EL1 register advertises the presence of MTE, it
is not guaranteed that the memory system on the SoC supports the
feature. In the absence of system-wide MTE support, the behaviour is
undefined and the kernel should not enable the MTE memory type in
MAIR_EL1.

For FDT, add an 'arm,armv8.5-memtag' property to the /memory nodes and
check for its presence during MTE probing. For example:

	memory@80000000 {
		device_type = "memory";
		arm,armv8.5-memtag;
		reg = <0x00000000 0x80000000 0 0x80000000>,
		      <0x00000008 0x80000000 0 0x80000000>;
	};

If the /memory nodes are not present in DT or if at least one node does
not support MTE, the feature will be disabled. On EFI systems, it is
assumed that the memory description matches the EFI memory map (if not,
it is considered a firmware bug).

MTE is not currently supported on ACPI systems.
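
From userspace, the net effect is whether the MTE HWCAP becomes
visible. A hedged detection example (HWCAP2_MTE is defined elsewhere in
this series; the fallback value below is an assumption for
illustration):

	#include <stdio.h>
	#include <sys/auxv.h>

	#ifndef HWCAP2_MTE
	#define HWCAP2_MTE	(1UL << 18)	/* assumed, see the hwcaps patch */
	#endif

	int main(void)
	{
		/* set only when CPU, memory system (DT) and kernel all agree */
		if (getauxval(AT_HWCAP2) & HWCAP2_MTE)
			printf("MTE available\n");
		return 0;
	}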

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Rob Herring <Rob.Herring@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---

Notes:
    Ongoing (internal) discussions on whether this is the right approach.
    The issue may need to be solved similarly for ACPI systems.
    
    v4:
    - Reworked the CPUID field handling to ensure that it is not exposed to
      the user (via MRS emulation) when the feature is not described in the
      DT. Previously, only the HWCAP was hidden.
    
    New in v3.

 arch/arm64/boot/dts/arm/fvp-base-revc.dts |  1 +
 arch/arm64/include/asm/cpufeature.h       |  6 ++-
 arch/arm64/kernel/cpufeature.c            | 59 ++++++++++++++++++++---
 3 files changed, 58 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/boot/dts/arm/fvp-base-revc.dts b/arch/arm64/boot/dts/arm/fvp-base-revc.dts
index 66381d89c1ce..c620a289f15e 100644
--- a/arch/arm64/boot/dts/arm/fvp-base-revc.dts
+++ b/arch/arm64/boot/dts/arm/fvp-base-revc.dts
@@ -94,6 +94,7 @@
 
 	memory@80000000 {
 		device_type = "memory";
+		arm,armv8.5-memtag;
 		reg = <0x00000000 0x80000000 0 0x80000000>,
 		      <0x00000008 0x80000000 0 0x80000000>;
 	};
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index afc315814563..0a24d36bf231 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -61,6 +61,7 @@ struct arm64_ftr_bits {
 	u8		shift;
 	u8		width;
 	s64		safe_val; /* safe value for FTR_EXACT features */
+	s64		(*filter)(const struct arm64_ftr_bits *, s64);
 };
 
 /*
@@ -542,7 +543,10 @@ cpuid_feature_extract_field(u64 features, int field, bool sign)
 
 static inline s64 arm64_ftr_value(const struct arm64_ftr_bits *ftrp, u64 val)
 {
-	return (s64)cpuid_feature_extract_field_width(val, ftrp->shift, ftrp->width, ftrp->sign);
+	s64 fval = (s64)cpuid_feature_extract_field_width(val, ftrp->shift, ftrp->width, ftrp->sign);
+	if (ftrp->filter)
+		fval = ftrp->filter(ftrp, fval);
+	return fval;
 }
 
 static inline bool id_aa64mmfr0_mixed_endian_el0(u64 mmfr0)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index d2fe8ff72324..aaadc1cbc006 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -7,6 +7,7 @@
 
 #define pr_fmt(fmt) "CPU features: " fmt
 
+#include <linux/acpi.h>
 #include <linux/bsearch.h>
 #include <linux/cpumask.h>
 #include <linux/crash_dump.h>
@@ -14,6 +15,7 @@
 #include <linux/stop_machine.h>
 #include <linux/types.h>
 #include <linux/mm.h>
+#include <linux/of.h>
 #include <linux/cpu.h>
 #include <asm/cpu.h>
 #include <asm/cpufeature.h>
@@ -87,23 +89,28 @@ DEFINE_STATIC_KEY_ARRAY_FALSE(cpu_hwcap_keys, ARM64_NCAPS);
 EXPORT_SYMBOL(cpu_hwcap_keys);
 
 #define __ARM64_FTR_BITS(SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
-	{						\
 		.sign = SIGNED,				\
 		.visible = VISIBLE,			\
 		.strict = STRICT,			\
 		.type = TYPE,				\
 		.shift = SHIFT,				\
 		.width = WIDTH,				\
-		.safe_val = SAFE_VAL,			\
-	}
+		.safe_val = SAFE_VAL
 
 /* Define a feature with unsigned values */
 #define ARM64_FTR_BITS(VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
-	__ARM64_FTR_BITS(FTR_UNSIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL)
+	{ __ARM64_FTR_BITS(FTR_UNSIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL), }
 
 /* Define a feature with a signed value */
 #define S_ARM64_FTR_BITS(VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
-	__ARM64_FTR_BITS(FTR_SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL)
+	{ __ARM64_FTR_BITS(FTR_SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL), }
+
+/* Define a feature with a filter function to process the field value */
+#define FILTERED_ARM64_FTR_BITS(SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL, filter_fn) \
+	{											\
+		__ARM64_FTR_BITS(SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL),	\
+		.filter = filter_fn,								\
+	}
 
 #define ARM64_FTR_END					\
 	{						\
@@ -118,6 +125,42 @@ static void cpu_enable_cnp(struct arm64_cpu_capabilities const *cap);
 
 static bool __system_matches_cap(unsigned int n);
 
+#ifdef CONFIG_ARM64_MTE
+s64 mte_ftr_filter(const struct arm64_ftr_bits *ftrp, s64 val)
+{
+	struct device_node *np;
+	static bool memory_checked = false;
+	static bool mte_capable = true;
+
+	/* EL0-only MTE is not supported by Linux, don't expose it */
+	if (val < ID_AA64PFR1_MTE)
+		return ID_AA64PFR1_MTE_NI;
+
+	if (memory_checked)
+		return mte_capable ? val : ID_AA64PFR1_MTE_NI;
+
+	if (!acpi_disabled) {
+		pr_warn("MTE not supported on ACPI systems\n");
+		return ID_AA64PFR1_MTE_NI;
+	}
+
+	/* check the DT "memory" nodes for MTE support */
+	for_each_node_by_type(np, "memory") {
+		memory_checked = true;
+		mte_capable &= of_property_read_bool(np, "arm,armv8.5-memtag");
+	}
+
+	if (!memory_checked || !mte_capable) {
+		pr_warn("System memory is not MTE-capable\n");
+		memory_checked = true;
+		mte_capable = false;
+		return ID_AA64PFR1_MTE_NI;
+	}
+
+	return val;
+}
+#endif
+
 /*
  * NOTE: Any changes to the visibility of features should be kept in
  * sync with the documentation of the CPU feature register ABI.
@@ -182,8 +225,10 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
 
 static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = {
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_SSBS_SHIFT, 4, ID_AA64PFR1_SSBS_PSTATE_NI),
-	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_MTE),
-		       FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_MTE_SHIFT, 4, ID_AA64PFR1_MTE_NI),
+#ifdef CONFIG_ARM64_MTE
+	FILTERED_ARM64_FTR_BITS(FTR_UNSIGNED, FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE,
+				ID_AA64PFR1_MTE_SHIFT, 4, ID_AA64PFR1_MTE_NI, mte_ftr_filter),
+#endif
 	ARM64_FTR_END,
 };
 


* [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:16   ` Catalin Marinas
  0 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:16 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne

For performance analysis it may be desirable to disable MTE altogether
via an early param. Introduce arm64.mte_disable and, if true, filter out
the sanitised ID_AA64PFR1_EL1.MTE field to avoid exposing the HWCAP to
the user.
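
For example, booting with the following kernel parameter hides the MTE
CPUID field and HWCAP even on MTE-capable hardware (the parameter is
parsed with strtobool(), so y/n forms also work):

	arm64.mte_disable=1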

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---

Notes:
    New in v4.

 Documentation/admin-guide/kernel-parameters.txt |  4 ++++
 arch/arm64/kernel/cpufeature.c                  | 11 +++++++++++
 2 files changed, 15 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index f2a93c8679e8..7436e7462b85 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -373,6 +373,10 @@
 	arcrimi=	[HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
 			Format: <io>,<irq>,<nodeID>
 
+	arm64.mte_disable=
+			[ARM64] Disable Linux support for the Memory
+			Tagging Extension (both user and in-kernel).
+
 	ataflop=	[HW,M68k]
 
 	atarimouse=	[HW,MOUSE] Atari Mouse
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index aaadc1cbc006..f7596830694f 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -126,12 +126,23 @@ static void cpu_enable_cnp(struct arm64_cpu_capabilities const *cap);
 static bool __system_matches_cap(unsigned int n);
 
 #ifdef CONFIG_ARM64_MTE
+static bool mte_disable;
+
+static int __init arm64_mte_disable(char *buf)
+{
+	return strtobool(buf, &mte_disable);
+}
+early_param("arm64.mte_disable", arm64_mte_disable);
+
 s64 mte_ftr_filter(const struct arm64_ftr_bits *ftrp, s64 val)
 {
 	struct device_node *np;
 	static bool memory_checked = false;
 	static bool mte_capable = true;
 
+	if (mte_disable)
+		return ID_AA64PFR1_MTE_NI;
+
 	/* EL0-only MTE is not supported by Linux, don't expose it */
 	if (val < ID_AA64PFR1_MTE)
 		return ID_AA64PFR1_MTE_NI;


* [PATCH v4 25/26] arm64: mte: Kconfig entry
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:16   ` Catalin Marinas
  0 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:16 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

Add Memory Tagging Extension support to the arm64 kbuild.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---

Notes:
    v4:
    - select ARCH_USES_PG_ARCH_2
    - remove ARCH_NO_SWAP
    - default y

 arch/arm64/Kconfig | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 40fb05d96c60..70deb14d78e0 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1606,6 +1606,39 @@ config ARCH_RANDOM
 
 endmenu
 
+menu "ARMv8.5 architectural features"
+
+config ARM64_AS_HAS_MTE
+	def_bool $(as-instr,.arch armv8.5-a+memtag)
+
+config ARM64_MTE
+	bool "Memory Tagging Extension support"
+	default y
+	depends on ARM64_AS_HAS_MTE && ARM64_TAGGED_ADDR_ABI
+	select ARCH_USES_HIGH_VMA_FLAGS
+	select ARCH_USES_PG_ARCH_2
+	help
+	  Memory Tagging (part of the ARMv8.5 Extensions) provides
+	  architectural support for run-time, always-on detection of
+	  various classes of memory error, to aid with software debugging
+	  and to eliminate vulnerabilities arising from memory-unsafe
+	  languages.
+
+	  This option enables the support for the Memory Tagging
+	  Extension at EL0 (i.e. for userspace).
+
+	  Selecting this option allows the feature to be detected at
+	  runtime. Any secondary CPU not implementing this feature will
+	  not be allowed a late bring-up.
+
+	  Userspace binaries that want to use this feature must
+	  explicitly opt in. The mechanism for userspace is
+	  described in:
+
+	  Documentation/arm64/memory-tagging-extension.rst.
+
+endmenu
+
 config ARM64_SVE
 	bool "ARM Scalable Vector Extension support"
 	default y

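To illustrate the explicit opt-in mentioned in the help text: a sketch
using the prctl() and mmap() interfaces added earlier in this series
(the fallback constant values are assumptions for illustration only):

	#include <sys/mman.h>
	#include <sys/prctl.h>

	#ifndef PR_MTE_TCF_SYNC
	#define PR_MTE_TCF_SYNC	(1UL << 1)	/* assumed, see prctl() patch */
	#endif
	#ifndef PROT_MTE
	#define PROT_MTE	0x20		/* assumed, see mmap() patch */
	#endif

	/* enable synchronous tag checking for the current thread */
	prctl(PR_SET_TAGGED_ADDR_CTRL,
	      PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC, 0, 0, 0);

	/* map an anonymous region with tag checking enabled */
	void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_MTE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
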

* [PATCH v4 26/26] arm64: mte: Add Memory Tagging Extension documentation
  2020-05-15 17:15 ` Catalin Marinas
@ 2020-05-15 17:16   ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-15 17:16 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne

From: Vincenzo Frascino <vincenzo.frascino@arm.com>

Memory Tagging Extension (part of the ARMv8.5 Extensions) provides
a mechanism to detect the sources of memory-related errors which
may be vulnerable to exploitation, including bounds violations,
use-after-free, use-after-return, use-out-of-scope and
use-before-initialization errors.

Add Memory Tagging Extension documentation for the arm64 Linux
kernel support.

Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Co-developed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---

Notes:
    v4:
    - Document behaviour of madvise(MADV_DONTNEED/MADV_FREE).
    - Document the initial process state on fork/execve.
    - Clarify when the kernel uaccess checks the tags.
    - Minor updates to the example code.
    - A few other minor clean-ups following review.
    
    v3:
    - Modify the uaccess checking conditions: only when the sync mode is
      selected by the user. In async mode, the kernel uaccesses are not
      checked.
    - Clarify that an include mask of 0 (exclude mask 0xffff) results in
      always generating tag 0.
    - Document the ptrace() interface.
    
    v2:
    - Documented the uaccess kernel tag checking mode.
    - Removed the BTI definitions from cpu-feature-registers.rst.
    - Removed the paragraph stating that MTE depends on the tagged address
      ABI (while the Kconfig entry does, there is no requirement for the
      user to enable both).
    - Changed the GCR_EL1.Exclude handling description following the change
      in the prctl() interface (include vs exclude mask).
    - Updated the example code.

 Documentation/arm64/cpu-feature-registers.rst |   2 +
 Documentation/arm64/elf_hwcaps.rst            |   5 +
 Documentation/arm64/index.rst                 |   1 +
 .../arm64/memory-tagging-extension.rst        | 297 ++++++++++++++++++
 4 files changed, 305 insertions(+)
 create mode 100644 Documentation/arm64/memory-tagging-extension.rst

diff --git a/Documentation/arm64/cpu-feature-registers.rst b/Documentation/arm64/cpu-feature-registers.rst
index 41937a8091aa..b5679fa85ad9 100644
--- a/Documentation/arm64/cpu-feature-registers.rst
+++ b/Documentation/arm64/cpu-feature-registers.rst
@@ -174,6 +174,8 @@ infrastructure:
      +------------------------------+---------+---------+
      | Name                         |  bits   | visible |
      +------------------------------+---------+---------+
+     | MTE                          | [11-8]  |    y    |
+     +------------------------------+---------+---------+
      | SSBS                         | [7-4]   |    y    |
      +------------------------------+---------+---------+
 
diff --git a/Documentation/arm64/elf_hwcaps.rst b/Documentation/arm64/elf_hwcaps.rst
index 7dfb97dfe416..ca7f90e99e3a 100644
--- a/Documentation/arm64/elf_hwcaps.rst
+++ b/Documentation/arm64/elf_hwcaps.rst
@@ -236,6 +236,11 @@ HWCAP2_RNG
 
     Functionality implied by ID_AA64ISAR0_EL1.RNDR == 0b0001.
 
+HWCAP2_MTE
+
+    Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0010, as described
+    by Documentation/arm64/memory-tagging-extension.rst.
+
 4. Unused AT_HWCAP bits
 -----------------------
 
diff --git a/Documentation/arm64/index.rst b/Documentation/arm64/index.rst
index 09cbb4ed2237..4cd0e696f064 100644
--- a/Documentation/arm64/index.rst
+++ b/Documentation/arm64/index.rst
@@ -14,6 +14,7 @@ ARM64 Architecture
     hugetlbpage
     legacy_instructions
     memory
+    memory-tagging-extension
     pointer-authentication
     silicon-errata
     sve
diff --git a/Documentation/arm64/memory-tagging-extension.rst b/Documentation/arm64/memory-tagging-extension.rst
new file mode 100644
index 000000000000..e7cdcecb656a
--- /dev/null
+++ b/Documentation/arm64/memory-tagging-extension.rst
@@ -0,0 +1,297 @@
+===============================================
+Memory Tagging Extension (MTE) in AArch64 Linux
+===============================================
+
+Authors: Vincenzo Frascino <vincenzo.frascino@arm.com>
+         Catalin Marinas <catalin.marinas@arm.com>
+
+Date: 2020-02-25
+
+This document describes the provision of the Memory Tagging Extension
+functionality in AArch64 Linux.
+
+Introduction
+============
+
+ARMv8.5 based processors introduce the Memory Tagging Extension (MTE)
+feature. MTE is built on top of the ARMv8.0 virtual address tagging TBI
+(Top Byte Ignore) feature and allows software to access a 4-bit
+allocation tag for each 16-byte granule in the physical address space.
+Such a memory range must be mapped with the Normal-Tagged memory
+attribute. A logical tag is derived from bits 59-56 of the virtual
+address used for the memory access. A CPU with MTE enabled will compare
+the logical tag against the allocation tag and potentially raise an
+exception on mismatch, subject to system register configuration.
+
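+For illustration only, the logical tag of a pointer value can be
+recovered in C as follows (a sketch, not part of any kernel interface):
+
+.. code-block:: c
+
+    /* Logical tag: bits 59-56 of the virtual address */
+    static inline unsigned int logical_tag(const void *ptr)
+    {
+            return ((unsigned long)ptr >> 56) & 0xf;
+    }
+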
+Userspace Support
+=================
+
+When ``CONFIG_ARM64_MTE`` is selected and Memory Tagging Extension is
+supported by the hardware, the kernel advertises the feature to
+userspace via ``HWCAP2_MTE``.
+
+PROT_MTE
+--------
+
+To access the allocation tags, a user process must enable the Tagged
+memory attribute on an address range using a new ``prot`` flag for
+``mmap()`` and ``mprotect()``:
+
+``PROT_MTE`` - Pages allow access to the MTE allocation tags.
+
+The allocation tag is set to 0 when such pages are first mapped in the
+user address space and preserved on copy-on-write. ``MAP_SHARED`` is
+supported and the allocation tags can be shared between processes.
+
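+For example, a minimal sketch of mapping one page of tagged anonymous
+memory (``page_sz`` stands for the system page size and ``PROT_MTE`` is
+defined as in the example program at the end of this document):
+
+.. code-block:: c
+
+    unsigned char *p = mmap(NULL, page_sz,
+                            PROT_READ | PROT_WRITE | PROT_MTE,
+                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+    if (p == MAP_FAILED)
+            perror("mmap() failed");
+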
+**Note**: ``PROT_MTE`` is only supported on ``MAP_ANONYMOUS`` and
+RAM-based file mappings (``tmpfs``, ``memfd``). Passing it to other
+types of mapping will result in ``-EINVAL`` returned by these system
+calls.
+
+**Note**: The ``PROT_MTE`` flag (and corresponding memory type) cannot
+be cleared by ``mprotect()``.
+
+**Note**: ``madvise()`` memory ranges with ``MADV_DONTNEED`` and
+``MADV_FREE`` may have the allocation tags cleared (set to 0) at any
+point after the system call.
+
+Tag Check Faults
+----------------
+
+When ``PROT_MTE`` is enabled on an address range and a mismatch between
+the logical and allocation tags occurs on access, there are three
+configurable behaviours:
+
+- *Ignore* - This is the default mode. The CPU (and kernel) ignores the
+  tag check fault.
+
+- *Synchronous* - The kernel raises a ``SIGSEGV`` synchronously, with
+  ``.si_code = SEGV_MTESERR`` and ``.si_addr = <fault-address>``. The
+  memory access is not performed. If ``SIGSEGV`` is ignored or blocked
+  by the offending thread, the containing process is terminated with a
+  ``coredump``.
+
+- *Asynchronous* - The kernel raises a ``SIGSEGV``, in the offending
+  thread, asynchronously following one or multiple tag check faults,
+  with ``.si_code = SEGV_MTEAERR`` and ``.si_addr = 0`` (the faulting
+  address is unknown).
+
+The user can select the above modes, per thread, using the
+``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where
+``flags`` contains one of the following values in the ``PR_MTE_TCF_MASK``
+bit-field:
+
+- ``PR_MTE_TCF_NONE``  - *Ignore* tag check faults
+- ``PR_MTE_TCF_SYNC``  - *Synchronous* tag check fault mode
+- ``PR_MTE_TCF_ASYNC`` - *Asynchronous* tag check fault mode
+
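+For example, to enable synchronous tag check faults for the current
+thread (a sketch reusing the prctl() definitions from the example
+program at the end of this document):
+
+.. code-block:: c
+
+    if (prctl(PR_SET_TAGGED_ADDR_CTRL,
+              PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC, 0, 0, 0))
+            perror("prctl() failed");
+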
+The current tag check fault mode can be read using the
+``prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0)`` system call.
+
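+For example (again a sketch based on the definitions from the example
+program below):
+
+.. code-block:: c
+
+    int ctrl = prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0);
+    if (ctrl >= 0 && (ctrl & PR_MTE_TCF_MASK) == PR_MTE_TCF_SYNC)
+            printf("synchronous tag check fault mode\n");
+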
+Tag checking can also be disabled for a user thread by setting the
+``PSTATE.TCO`` bit with ``MSR TCO, #1``.
+
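+A sketch of toggling ``PSTATE.TCO`` from C with inline assembly (the
+macro names are illustrative; the code must be built with MTE support,
+e.g. ``-march=armv8.5-a+memtag``):
+
+.. code-block:: c
+
+    /* PSTATE.TCO = 1 suspends tag checking, 0 re-enables it */
+    #define tag_checks_off()	asm volatile("msr tco, #1")
+    #define tag_checks_on()	asm volatile("msr tco, #0")
+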
+**Note**: Signal handlers are always invoked with ``PSTATE.TCO = 0``,
+irrespective of the interrupted context. ``PSTATE.TCO`` is restored on
+``sigreturn()``.
+
+**Note**: There are no *match-all* logical tags available for user
+applications.
+
+**Note**: Kernel accesses to the user address space (e.g. ``read()``
+system call) are not checked if the user thread tag checking mode is
+``PR_MTE_TCF_NONE`` or ``PR_MTE_TCF_ASYNC``. If the tag checking mode is
+``PR_MTE_TCF_SYNC``, the kernel makes a best effort to check its user
+address accesses; however, it cannot guarantee that all of them are
+checked.
+
+Excluding Tags in the ``IRG``, ``ADDG`` and ``SUBG`` instructions
+-----------------------------------------------------------------
+
+The architecture allows certain tags to be excluded from the randomly
+generated set via the ``GCR_EL1.Exclude`` register bit-field. By
+default, Linux excludes all tags other than 0. A user thread can enable
+specific tags in the randomly generated set using the
+``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where
+``flags`` contains the tags bitmap in the ``PR_MTE_TAG_MASK`` bit-field.
+
+**Note**: The hardware uses an exclude mask but the ``prctl()``
+interface provides an include mask. An include mask of ``0`` (exclusion
+mask ``0xffff``) results in the CPU always generating tag ``0``.
+
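+For instance, to allow only tags 1 to 3 in the randomly generated set
+(a sketch using the prctl() definitions from the example program at the
+end of this document):
+
+.. code-block:: c
+
+    /* Include mask 0b1110: random tags 1, 2 and 3 */
+    unsigned long incl = 0xeUL << PR_MTE_TAG_SHIFT;
+
+    prctl(PR_SET_TAGGED_ADDR_CTRL,
+          PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC | incl, 0, 0, 0);
+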
+Initial process state
+---------------------
+
+On ``execve()``, the new process has the following configuration:
+
+- ``PR_TAGGED_ADDR_ENABLE`` set to 0 (disabled)
+- Tag checking mode set to ``PR_MTE_TCF_NONE``
+- ``PR_MTE_TAG_MASK`` set to 0 (all tags excluded)
+- ``PSTATE.TCO`` set to 0
+- ``PROT_MTE`` not set on any of the initial memory maps
+
+On ``fork()``, the new process inherits the parent's configuration and
+memory map attributes with the exception of the ``madvise()`` ranges
+with ``MADV_WIPEONFORK`` which will have the data and tags cleared (set
+to 0).
+
+The ``ptrace()`` interface
+--------------------------
+
+``PTRACE_PEEKMTETAGS`` and ``PTRACE_POKEMTETAGS`` allow a tracer to read
+the tags from, or set the tags in, a tracee's address space. The
+``ptrace()`` system call is invoked as ``ptrace(request, pid, addr,
+data)`` where:
+
+- ``request`` - one of ``PTRACE_PEEKMTETAGS`` or ``PTRACE_POKEMTETAGS``.
+- ``pid`` - the tracee's PID.
+- ``addr`` - address in the tracee's address space.
+- ``data`` - pointer to a ``struct iovec`` where ``iov_base`` points to
+  a buffer of ``iov_len`` length in the tracer's address space.
+
+The tags in the tracer's ``iov_base`` buffer are represented as one
+4-bit tag per byte, each corresponding to a 16-byte MTE tag granule in
+the tracee's address space.
+
+**Note**: If ``addr`` is not aligned to a 16-byte granule, the kernel
+will use the corresponding aligned address.
+
+``ptrace()`` return value:
+
+- 0 - tags were copied, the tracer's ``iov_len`` was updated to the
+  number of tags transferred. This may be smaller than the requested
+  ``iov_len`` if the requested address range in the tracee's or the
+  tracer's space cannot be accessed or does not have valid tags.
+- ``-EPERM`` - the specified process cannot be traced.
+- ``-EIO`` - the tracee's address range cannot be accessed (e.g. invalid
+  address) and no tags copied. ``iov_len`` not updated.
+- ``-EFAULT`` - fault on accessing the tracer's memory (``struct iovec``
+  or ``iov_base`` buffer) and no tags copied. ``iov_len`` not updated.
+- ``-EOPNOTSUPP`` - the tracee's address does not have valid tags (never
+  mapped with the ``PROT_MTE`` flag). ``iov_len`` not updated.
+
+**Note**: There are no transient errors for the requests above, so user
+programs should not retry in case of a non-zero system call return.
+
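+A sketch of reading the tags covering one page of a stopped tracee
+(error handling and the usual ``<sys/ptrace.h>``/``<sys/uio.h>``
+includes elided; ``PAGE_SZ`` stands for the page size):
+
+.. code-block:: c
+
+    unsigned char tags[PAGE_SZ / 16];	/* one tag byte per granule */
+    struct iovec iov = { .iov_base = tags, .iov_len = sizeof(tags) };
+
+    if (ptrace(PTRACE_PEEKMTETAGS, pid, addr, &iov) == 0)
+            printf("%zu tags copied\n", iov.iov_len);
+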
+Example of correct usage
+========================
+
+*MTE Example code*
+
+.. code-block:: c
+
+    /*
+     * To be compiled with -march=armv8.5-a+memtag
+     */
+    #include <errno.h>
+    #include <linux/types.h>	/* for the __u64 used below */
+    #include <stdio.h>
+    #include <stdlib.h>
+    #include <unistd.h>
+    #include <sys/auxv.h>
+    #include <sys/mman.h>
+    #include <sys/prctl.h>
+
+    /*
+     * From arch/arm64/include/uapi/asm/hwcap.h
+     */
+    #define HWCAP2_MTE              (1 << 18)
+
+    /*
+     * From arch/arm64/include/uapi/asm/mman.h
+     */
+    #define PROT_MTE                 0x20
+
+    /*
+     * From include/uapi/linux/prctl.h
+     */
+    #define PR_SET_TAGGED_ADDR_CTRL 55
+    #define PR_GET_TAGGED_ADDR_CTRL 56
+    # define PR_TAGGED_ADDR_ENABLE  (1UL << 0)
+    # define PR_MTE_TCF_SHIFT       1
+    # define PR_MTE_TCF_NONE        (0UL << PR_MTE_TCF_SHIFT)
+    # define PR_MTE_TCF_SYNC        (1UL << PR_MTE_TCF_SHIFT)
+    # define PR_MTE_TCF_ASYNC       (2UL << PR_MTE_TCF_SHIFT)
+    # define PR_MTE_TCF_MASK        (3UL << PR_MTE_TCF_SHIFT)
+    # define PR_MTE_TAG_SHIFT       3
+    # define PR_MTE_TAG_MASK        (0xffffUL << PR_MTE_TAG_SHIFT)
+
+    /*
+     * Insert a random logical tag into the given pointer.
+     */
+    #define insert_random_tag(ptr) ({                       \
+            __u64 __val;                                    \
+            asm("irg %0, %1" : "=r" (__val) : "r" (ptr));   \
+            __val;                                          \
+    })
+
+    /*
+     * Set the allocation tag on the destination address.
+     */
+    #define set_tag(tagged_addr) do {                                      \
+            asm volatile("stg %0, [%0]" : : "r" (tagged_addr) : "memory"); \
+    } while (0)
+
+    int main()
+    {
+            unsigned char *a;
+            unsigned long page_sz = sysconf(_SC_PAGESIZE);
+            unsigned long hwcap2 = getauxval(AT_HWCAP2);
+
+            /* check if MTE is present */
+            if (!(hwcap2 & HWCAP2_MTE))
+                    return EXIT_FAILURE;
+
+            /*
+             * Enable the tagged address ABI, synchronous MTE tag check faults and
+             * allow all non-zero tags in the randomly generated set.
+             */
+            if (prctl(PR_SET_TAGGED_ADDR_CTRL,
+                      PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC | (0xfffe << PR_MTE_TAG_SHIFT),
+                      0, 0, 0)) {
+                    perror("prctl() failed");
+                    return EXIT_FAILURE;
+            }
+
+            a = mmap(0, page_sz, PROT_READ | PROT_WRITE,
+                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+            if (a == MAP_FAILED) {
+                    perror("mmap() failed");
+                    return EXIT_FAILURE;
+            }
+
+            /*
+             * Enable MTE on the above anonymous mmap. The flag could also
+             * be passed directly to mmap(), making this step unnecessary.
+             */
+            if (mprotect(a, page_sz, PROT_READ | PROT_WRITE | PROT_MTE)) {
+                    perror("mprotect() failed");
+                    return EXIT_FAILURE;
+            }
+
+            /* access with the default tag (0) */
+            a[0] = 1;
+            a[1] = 2;
+
+            printf("a[0] = %hhu a[1] = %hhu\n", a[0], a[1]);
+
+            /* set the logical and allocation tags */
+            a = (unsigned char *)insert_random_tag(a);
+            set_tag(a);
+
+            printf("%p\n", a);
+
+            /* non-zero tag access */
+            a[0] = 3;
+            printf("a[0] = %hhu a[1] = %hhu\n", a[0], a[1]);
+
+            /*
+             * If MTE is enabled correctly the next instruction will generate an
+             * exception.
+             */
+            printf("Expecting SIGSEGV...\n");
+            a[16] = 0xdd;
+
+            /* this should not be printed in the PR_MTE_TCF_SYNC mode */
+            printf("...haven't got one\n");
+
+            return EXIT_FAILURE;
+    }

^ permalink raw reply related	[flat|nested] 123+ messages in thread

* Re: [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
  2020-05-15 17:16   ` Catalin Marinas
@ 2020-05-18 11:26     ` Vladimir Murzin
  -1 siblings, 0 replies; 123+ messages in thread
From: Vladimir Murzin @ 2020-05-18 11:26 UTC (permalink / raw)
  To: Catalin Marinas, linux-arm-kernel
  Cc: linux-arch, Szabolcs Nagy, Andrey Konovalov, Kevin Brodsky,
	Peter Collingbourne, linux-mm, Vincenzo Frascino, Will Deacon,
	Dave P Martin

On 5/15/20 6:16 PM, Catalin Marinas wrote:
> For performance analysis it may be desirable to disable MTE altogether
> via an early param. Introduce arm64.mte_disable and, if true, filter out
> the sanitised ID_AA64PFR1_EL1.MTE field to avoid exposing the HWCAP to
> user.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
> 
> Notes:
>     New in v4.
> 
>  Documentation/admin-guide/kernel-parameters.txt |  4 ++++
>  arch/arm64/kernel/cpufeature.c                  | 11 +++++++++++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index f2a93c8679e8..7436e7462b85 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -373,6 +373,10 @@
>  	arcrimi=	[HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
>  			Format: <io>,<irq>,<nodeID>
>  
> +	arm64.mte_disable=
> +			[ARM64] Disable Linux support for the Memory
> +			Tagging Extension (both user and in-kernel).
> +

Should it really take a parameter (on/off/true/false)? It may lead to the expectation
that arm64.mte_disable=false should enable MTE and, yes, double negatives make it
look ugly, so if we do need a parameter, can it be arm64.mte=on/off/true/false?

Cheers
Vladimir

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
  2020-05-18 11:26     ` Vladimir Murzin
@ 2020-05-18 11:31       ` Will Deacon
  -1 siblings, 0 replies; 123+ messages in thread
From: Will Deacon @ 2020-05-18 11:31 UTC (permalink / raw)
  To: Vladimir Murzin
  Cc: Catalin Marinas, linux-arm-kernel, linux-arch, Szabolcs Nagy,
	Andrey Konovalov, Kevin Brodsky, Peter Collingbourne, linux-mm,
	Vincenzo Frascino, Dave P Martin

On Mon, May 18, 2020 at 12:26:30PM +0100, Vladimir Murzin wrote:
> On 5/15/20 6:16 PM, Catalin Marinas wrote:
> > For performance analysis it may be desirable to disable MTE altogether
> > via an early param. Introduce arm64.mte_disable and, if true, filter out
> > the sanitised ID_AA64PFR1_EL1.MTE field to avoid exposing the HWCAP to
> > user.
> > 
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> > 
> > Notes:
> >     New in v4.
> > 
> >  Documentation/admin-guide/kernel-parameters.txt |  4 ++++
> >  arch/arm64/kernel/cpufeature.c                  | 11 +++++++++++
> >  2 files changed, 15 insertions(+)
> > 
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index f2a93c8679e8..7436e7462b85 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -373,6 +373,10 @@
> >  	arcrimi=	[HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
> >  			Format: <io>,<irq>,<nodeID>
> >  
> > +	arm64.mte_disable=
> > +			[ARM64] Disable Linux support for the Memory
> > +			Tagging Extension (both user and in-kernel).
> > +
> 
> Should it really take a parameter (on/off/true/false)? It may lead to the expectation
> that arm64.mte_disable=false should enable MTE and, yes, double negatives make it
> look ugly, so if we do need a parameter, can it be arm64.mte=on/off/true/false?

I don't think "performance analysis" is a good justification for this
parameter tbh. We don't tend to add these options for other architectural
features, and I don't see why MTE is any different in this regard.

Will

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
  2020-05-18 11:31       ` Will Deacon
@ 2020-05-18 17:20         ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-18 17:20 UTC (permalink / raw)
  To: Will Deacon
  Cc: Vladimir Murzin, linux-arch, Szabolcs Nagy, Kevin Brodsky,
	linux-mm, Andrey Konovalov, Vincenzo Frascino,
	Peter Collingbourne, Dave P Martin, linux-arm-kernel

On Mon, May 18, 2020 at 12:31:03PM +0100, Will Deacon wrote:
> On Mon, May 18, 2020 at 12:26:30PM +0100, Vladimir Murzin wrote:
> > On 5/15/20 6:16 PM, Catalin Marinas wrote:
> > > For performance analysis it may be desirable to disable MTE altogether
> > > via an early param. Introduce arm64.mte_disable and, if true, filter out
> > > the sanitised ID_AA64PFR1_EL1.MTE field to avoid exposing the HWCAP to
> > > user.
> > > 
> > > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Will Deacon <will@kernel.org>
> > > ---
> > > 
> > > Notes:
> > >     New in v4.
> > > 
> > >  Documentation/admin-guide/kernel-parameters.txt |  4 ++++
> > >  arch/arm64/kernel/cpufeature.c                  | 11 +++++++++++
> > >  2 files changed, 15 insertions(+)
> > > 
> > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > > index f2a93c8679e8..7436e7462b85 100644
> > > --- a/Documentation/admin-guide/kernel-parameters.txt
> > > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > > @@ -373,6 +373,10 @@
> > >  	arcrimi=	[HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
> > >  			Format: <io>,<irq>,<nodeID>
> > >  
> > > +	arm64.mte_disable=
> > > +			[ARM64] Disable Linux support for the Memory
> > > +			Tagging Extension (both user and in-kernel).
> > > +
> > 
> > Should it really take a parameter (on/off/true/false)? It may lead to the expectation
> > that arm64.mte_disable=false should enable MTE and, yes, double negatives make it
> > look ugly, so if we do need a parameter, can it be arm64.mte=on/off/true/false?
> 
> I don't think "performance analysis" is a good justification for this
> parameter tbh. We don't tend to add these options for other architectural
> features, and I don't see why MTE is any different in this regard.

There is an expectation of performance impact with MTE enabled,
especially if it's running in synchronous mode. For the in-kernel MTE,
we could add a parameter which sets sync vs async at boot time rather
than a big disable knob. It won't affect user space however.

The other 'justification' is if your hardware has weird unexpected
behaviour but I'd like this handled via errata workarounds.

I'll let the people who asked for this chip in ;). I agree with you
that we rarely add these (and I rejected a similar option a few weeks
ago on the AMU patchset).

-- 
Catalin

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
  2020-05-18 11:26     ` Vladimir Murzin
@ 2020-05-19 16:14       ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-19 16:14 UTC (permalink / raw)
  To: Vladimir Murzin
  Cc: linux-arm-kernel, linux-arch, Szabolcs Nagy, Andrey Konovalov,
	Kevin Brodsky, Peter Collingbourne, linux-mm, Vincenzo Frascino,
	Will Deacon, Dave P Martin

On Mon, May 18, 2020 at 12:26:30PM +0100, Vladimir Murzin wrote:
> On 5/15/20 6:16 PM, Catalin Marinas wrote:
> > For performance analysis it may be desirable to disable MTE altogether
> > via an early param. Introduce arm64.mte_disable and, if true, filter out
> > the sanitised ID_AA64PFR1_EL1.MTE field to avoid exposing the HWCAP to
> > user.
> > 
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> > 
> > Notes:
> >     New in v4.
> > 
> >  Documentation/admin-guide/kernel-parameters.txt |  4 ++++
> >  arch/arm64/kernel/cpufeature.c                  | 11 +++++++++++
> >  2 files changed, 15 insertions(+)
> > 
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index f2a93c8679e8..7436e7462b85 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -373,6 +373,10 @@
> >  	arcrimi=	[HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
> >  			Format: <io>,<irq>,<nodeID>
> >  
> > +	arm64.mte_disable=
> > +			[ARM64] Disable Linux support for the Memory
> > +			Tagging Extension (both user and in-kernel).
> > +
> 
> Should it really take a parameter (on/off/true/false)? It may lead to the expectation
> that arm64.mte_disable=false should enable MTE and, yes, double negatives make it
> look ugly, so if we do need a parameter, can it be arm64.mte=on/off/true/false?

My reasoning about arm64.mte= was that 'on' may lead people to think it
does something even when MTE isn't available on the SoC. So I ended up
with an explicit 'disable' in the name. Happy to change it if we don't
drop this parameter altogether (in the absence of valid use-cases).

-- 
Catalin

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
  2020-05-18 17:20         ` Catalin Marinas
@ 2020-05-22  5:57           ` Patrick Daly
  -1 siblings, 0 replies; 123+ messages in thread
From: Patrick Daly @ 2020-05-22  5:57 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, linux-arch, Vladimir Murzin, Szabolcs Nagy,
	Andrey Konovalov, Kevin Brodsky, linux-mm, Vincenzo Frascino,
	Peter Collingbourne, Dave P Martin, linux-arm-kernel

On Mon, May 18, 2020 at 06:20:55PM +0100, Catalin Marinas wrote:
> On Mon, May 18, 2020 at 12:31:03PM +0100, Will Deacon wrote:
> > On Mon, May 18, 2020 at 12:26:30PM +0100, Vladimir Murzin wrote:
> > > On 5/15/20 6:16 PM, Catalin Marinas wrote:
> > > > For performance analysis it may be desirable to disable MTE altogether
> > > > via an early param. Introduce arm64.mte_disable and, if true, filter out
> > > > the sanitised ID_AA64PFR1_EL1.MTE field to avoid exposing the HWCAP to
> > > > user.
> > > > 
> > > > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > > > Cc: Will Deacon <will@kernel.org>
> > > > ---
> > > > 
> > > > Notes:
> > > >     New in v4.
> > > > 
> > > >  Documentation/admin-guide/kernel-parameters.txt |  4 ++++
> > > >  arch/arm64/kernel/cpufeature.c                  | 11 +++++++++++
> > > >  2 files changed, 15 insertions(+)
> > > > 
> > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > > > index f2a93c8679e8..7436e7462b85 100644
> > > > --- a/Documentation/admin-guide/kernel-parameters.txt
> > > > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > > > @@ -373,6 +373,10 @@
> > > >  	arcrimi=	[HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
> > > >  			Format: <io>,<irq>,<nodeID>
> > > >  
> > > > +	arm64.mte_disable=
> > > > +			[ARM64] Disable Linux support for the Memory
> > > > +			Tagging Extension (both user and in-kernel).
> > > > +
> > > 
> > > Should it really take a parameter (on/off/true/false)? It may lead to the expectation
> > > that arm64.mte_disable=false should enable MTE and, yes, double negatives make it
> > > look ugly, so if we do need a parameter, can it be arm64.mte=on/off/true/false?
> > 
> > I don't think "performance analysis" is a good justification for this
> > parameter tbh. We don't tend to add these options for other architectural
> > features, and I don't see why MTE is any different in this regard.
> 
> There is an expectation of performance impact with MTE enabled,
> especially if it's running in synchronous mode. For the in-kernel MTE,
> we could add a parameter which sets sync vs async at boot time rather
> than a big disable knob. It won't affect user space however.
> 
> The other 'justification' is if your hardware has weird unexpected
> behaviour but I'd like this handled via errata workarounds.
> 
> I'll let the people who asked for this to chip in ;). I agree with you
> that we rarely add these (and I rejected a similar option a few weeks
> ago on the AMU patchset).

We've been looking into other ways this on/off behavior could be achieved.
The "arm,armv8.5-memtag" DT flag already provides what we want - meaning
that this flag could be removed if the system did not support MTE.

I did see your remark on "arm64: mte: Check the DT memory nodes for MTE support"
questioning whether it was the right approach - is this still the case?
--Patrick

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
  2020-05-22  5:57           ` Patrick Daly
@ 2020-05-22 10:37             ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-22 10:37 UTC (permalink / raw)
  To: Patrick Daly
  Cc: Will Deacon, linux-arch, Vladimir Murzin, Szabolcs Nagy,
	Andrey Konovalov, Kevin Brodsky, linux-mm, Vincenzo Frascino,
	Peter Collingbourne, Dave P Martin, linux-arm-kernel

Hi Patrick,

On Thu, May 21, 2020 at 10:57:10PM -0700, Patrick Daly wrote:
> On Mon, May 18, 2020 at 06:20:55PM +0100, Catalin Marinas wrote:
> > On Mon, May 18, 2020 at 12:31:03PM +0100, Will Deacon wrote:
> > > On Mon, May 18, 2020 at 12:26:30PM +0100, Vladimir Murzin wrote:
> > > > On 5/15/20 6:16 PM, Catalin Marinas wrote:
> > > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > > > > index f2a93c8679e8..7436e7462b85 100644
> > > > > --- a/Documentation/admin-guide/kernel-parameters.txt
> > > > > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > > > > @@ -373,6 +373,10 @@
> > > > >  	arcrimi=	[HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
> > > > >  			Format: <io>,<irq>,<nodeID>
> > > > >  
> > > > > +	arm64.mte_disable=
> > > > > +			[ARM64] Disable Linux support for the Memory
> > > > > +			Tagging Extension (both user and in-kernel).
> > > > > +
> > > > 
> > > > Should it really take a parameter (on/off/true/false)? It may lead to the
> > > > expectation that arm64.mte_disable=false should enable MTE and, yes, double
> > > > negatives make it look ugly, so if we do need a parameter, can it be arm64.mte=on/off/true/false?
> > > 
> > > I don't think "performance analysis" is a good justification for this
> > > parameter tbh. We don't tend to add these options for other architectural
> > > features, and I don't see why MTE is any different in this regard.
> > 
> > There is an expectation of performance impact with MTE enabled,
> > especially if it's running in synchronous mode. For the in-kernel MTE,
> > we could add a parameter which sets sync vs async at boot time rather
> > than a big disable knob. It won't affect user space however.
> > 
> > The other 'justification' is if your hardware has weird unexpected
> > behaviour but I'd like this handled via errata workarounds.
> > 
> > I'll let the people who asked for this chip in ;). I agree with you
> > that we rarely add these (and I rejected a similar option a few weeks
> > ago on the AMU patchset).
> 
> We've been looking into other ways this on/off behavior could be achieved.

The actual question here is what the on/off behaviour is needed for. We
can figure out the best mechanism for this once we know what we want to
achieve. My wild guess above was performance analysis but that can be
toggled by either kernel boot parameter or run-time sysctl (or just the
Kconfig option).

If it is about forcing user space not to use MTE, we may look into some
other sysctl controls (we already have one for the tagged address ABI).

If it is for working around hardware not supporting MTE (i.e. no
allocation tag storage), this should be handled differently, not by
kernel parameter.

> The "arm,armv8.5-memtag" DT flag already provides what we want - meaning
> that this flag could be removed if the system did not support MTE.
> 
> I did see your remark on "arm64: mte: Check the DT memory nodes for MTE support"
> questioning whether it was the right approach - is this still the case?

My plan is to remove the DT patch altogether _if_ I get confirmation
from the CPU designers. The idea is that if ID_AA64PFR1_EL1.MTE > 1,
Linux can assume system-wide MTE support. If an MTE-capable CPU is
deployed in an SoC without tag storage, a tie-off should change the ID
field to 1 (or 0). If we do find hardware with an ID field > 1 and no
tag storage, it will be handled as an SoC erratum in the kernel,
probably tied to the new SoC Id advertised by firmware (Sudeep had some
patches recently).

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 123+ messages in thread
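
Catalin's ID-register policy is easy to sketch. Roughly (illustrative
only; the real series routes this through the cpufeature framework, and
ID_AA64PFR1_MTE_SHIFT is itself introduced by these patches):

    /* ID_AA64PFR1_EL1.MTE: 0b0001 = EL0-visible instructions only,
     * 0b0010 or higher = full MTE (allocation tags and tag checking). */
    static bool system_assumes_mte(void)
    {
    	u64 pfr1 = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
    	unsigned int mte = cpuid_feature_extract_unsigned_field(pfr1,
    					ID_AA64PFR1_MTE_SHIFT);

    	return mte >= 2;	/* the "MTE > 1" rule described above */
    }

A tie-off that forces the field to 1 or 0 then makes such an SoC look
MTE-less to Linux without any DT description at all.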

* Re: [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
@ 2020-05-27  2:11               ` Patrick Daly
  0 siblings, 0 replies; 123+ messages in thread
From: Patrick Daly @ 2020-05-27  2:11 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, Vladimir Murzin, Will Deacon, Szabolcs Nagy,
	Andrey Konovalov, Kevin Brodsky, linux-mm, Vincenzo Frascino,
	Peter Collingbourne, Dave P Martin, linux-arm-kernel

On Fri, May 22, 2020 at 11:37:15AM +0100, Catalin Marinas wrote:
> Hi Patrick,
> 
> On Thu, May 21, 2020 at 10:57:10PM -0700, Patrick Daly wrote:
> > On Mon, May 18, 2020 at 06:20:55PM +0100, Catalin Marinas wrote:
> > > On Mon, May 18, 2020 at 12:31:03PM +0100, Will Deacon wrote:
> > > > On Mon, May 18, 2020 at 12:26:30PM +0100, Vladimir Murzin wrote:
> > > > > On 5/15/20 6:16 PM, Catalin Marinas wrote:
> > > > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > > > > > index f2a93c8679e8..7436e7462b85 100644
> > > > > > --- a/Documentation/admin-guide/kernel-parameters.txt
> > > > > > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > > > > > @@ -373,6 +373,10 @@
> > > > > >  	arcrimi=	[HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
> > > > > >  			Format: <io>,<irq>,<nodeID>
> > > > > >  
> > > > > > +	arm64.mte_disable=
> > > > > > +			[ARM64] Disable Linux support for the Memory
> > > > > > +			Tagging Extension (both user and in-kernel).
> > > > > > +
> > > > > 
> > > > > Should it really take a parameter (on/off/true/false)? It may lead to the
> > > > > expectation that arm64.mte_disable=false should enable MTE and, yes, double
> > > > > negatives make it look ugly, so if we do need a parameter, can it be arm64.mte=on/off/true/false?
> > > > 
> > > > I don't think "performance analysis" is a good justification for this
> > > > parameter tbh. We don't tend to add these options for other architectural
> > > > features, and I don't see why MTE is any different in this regard.
> > > 
> > > There is an expectation of performance impact with MTE enabled,
> > > especially if it's running in synchronous mode. For the in-kernel MTE,
> > > we could add a parameter which sets sync vs async at boot time rather
> > > than a big disable knob. It won't affect user space however.
> > > 
> > > The other 'justification' is if your hardware has weird unexpected
> > > behaviour but I'd like this handled via errata workarounds.
> > > 
> > > I'll let the people who asked for this chip in ;). I agree with you
> > > that we rarely add these (and I rejected a similar option a few weeks
> > > ago on the AMU patchset).
> > 
> > We've been looking into other ways this on/off behavior could be achieved.
> 
> The actual question here is what the on/off behaviour is needed for. We
> can figure out the best mechanism for this once we know what we want to
> achieve. My wild guess above was performance analysis but that can be
> toggled by either kernel boot parameter or run-time sysctl (or just the
> Kconfig option).
> 
> If it is about forcing user space not to use MTE, we may look into some
> other sysctl controls (we already have one for the tagged address ABI).

We want to allow the end user to easily "opt out" of MTE in favour of
better power, perf and battery life.

In terms of deciding policy, a sysctl is much more accessible than
recompiling with CONFIG_MTE=n, or replacing userspace libraries with
equivalents which don't use PROT_MTE.

--Patrick

> 
> If it is for working around hardware not supporting MTE (i.e. no
> allocation tag storage), this should be handled differently, not by
> kernel parameter.
> 
> > The "arm,armv8.5-memtag" DT flag already provides what we want - meaning
> > that this flag could be removed if the system did not support MTE.
> > 
> > I did see your remark on "arm64: mte: Check the DT memory nodes for MTE support"
> > questioning whether it was the right approach - is this still the case?
> 
> My plan is to remove the DT patch altogether _if_ I get confirmation
> from the CPU designers. The idea is that if ID_AA64PFR1_EL1.MTE > 1,
> Linux can assume system-wide MTE support. If an MTE-capable CPU is
> deployed in an SoC without tag storage, a tie-off should change the ID
> field to 1 (or 0). If we do find hardware with an ID field > 1 and no
> tag storage, it will be handled as an SoC erratum in the kernel,
> probably tied to the new SoC Id advertised by firmware (Sudeep had some
> patches recently).

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 123+ messages in thread
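
For illustration, the kind of runtime knob Patrick is describing could be
modelled on the existing abi.tagged_addr_disabled control. A hypothetical
sketch only; no such control exists in this series, and how it would gate
PROT_MTE or the prctl() is exactly the open policy question here:

    static int sysctl_mte_disabled;	/* hypothetical /proc/sys/abi/mte_disabled */

    static struct ctl_table mte_sysctl_table[] = {
    	{
    		.procname	= "mte_disabled",
    		.data		= &sysctl_mte_disabled,
    		.maxlen		= sizeof(int),
    		.mode		= 0644,
    		.proc_handler	= proc_dointvec_minmax,
    		.extra1		= SYSCTL_ZERO,
    		.extra2		= SYSCTL_ONE,
    	},
    	{ }
    };

    static int __init mte_sysctl_init(void)
    {
    	return register_sysctl("abi", mte_sysctl_table) ? 0 : -ENOMEM;
    }
    core_initcall(mte_sysctl_init);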

* Re: [PATCH v4 15/26] arm64: mte: Allow user control of the tag check mode via prctl()
  2020-05-15 17:16   ` Catalin Marinas
@ 2020-05-27  7:46     ` Will Deacon
  -1 siblings, 0 replies; 123+ messages in thread
From: Will Deacon @ 2020-05-27  7:46 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, linux-mm, linux-arch, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne

On Fri, May 15, 2020 at 06:16:01PM +0100, Catalin Marinas wrote:
> By default, even if PROT_MTE is set on a memory range, there is no tag
> check fault reporting (SIGSEGV). Introduce a set of options to the
> existing prctl(PR_SET_TAGGED_ADDR_CTRL) to allow user control of the tag
> check fault mode:
> 
>   PR_MTE_TCF_NONE  - no reporting (default)
>   PR_MTE_TCF_SYNC  - synchronous tag check fault reporting
>   PR_MTE_TCF_ASYNC - asynchronous tag check fault reporting
> 
> These options translate into the corresponding SCTLR_EL1.TCF0 bitfield,
> context-switched by the kernel. Note that uaccess done by the kernel is
> not checked and cannot be configured by the user.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
> 
> Notes:
>     v3:
>     - Use SCTLR_EL1_TCF0_NONE instead of 0 for consistency.
>     - Move mte_thread_switch() in this patch from an earlier one. In
>       addition, it is called after the dsb() in __switch_to() so that any
>       asynchronous tag check faults have been registered in the TFSR_EL1
> registers (to be added with the in-kernel MTE support).
>     
>     v2:
>     - Handle SCTLR_EL1_TCF0_NONE explicitly for consistency with PR_MTE_TCF_NONE.
>     - Fix SCTLR_EL1 register setting in flush_mte_state() (thanks to Peter
>       Collingbourne).
>     - Added ISB to update_sctlr_el1_tcf0() since, with the latest
>       architecture update/fix, the TCF0 field is used by the uaccess
>       routines.
> 
>  arch/arm64/include/asm/mte.h       | 14 ++++++
>  arch/arm64/include/asm/processor.h |  3 ++
>  arch/arm64/kernel/mte.c            | 77 ++++++++++++++++++++++++++++++
>  arch/arm64/kernel/process.c        | 26 ++++++++--
>  include/uapi/linux/prctl.h         |  6 +++
>  5 files changed, 123 insertions(+), 3 deletions(-)

Dave is working on man pages for prctl() (and I think also ptrace). I think
it would be /very/ useful for us to have some RFC patches on top of his work
adding documentation for the MTE interactions, as we found some other minor
issues/inconsistencies as a direct result of writing and reviewing the man
page for our existing interfaces.

Cheers,

Will

^ permalink raw reply	[flat|nested] 123+ messages in thread
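
To make the interface under discussion concrete, here is a minimal
userspace sketch of enabling synchronous tag check faults. The
PR_MTE_TCF_* values are taken from this series' UAPI proposal and may
change before merging; the fallback defines are only for headers that
predate it:

    #include <sys/prctl.h>

    #ifndef PR_SET_TAGGED_ADDR_CTRL
    #define PR_SET_TAGGED_ADDR_CTRL	55
    #endif
    #ifndef PR_TAGGED_ADDR_ENABLE
    #define PR_TAGGED_ADDR_ENABLE	(1UL << 0)
    #endif
    #ifndef PR_MTE_TCF_SYNC
    #define PR_MTE_TCF_SYNC		(1UL << 1)	/* proposed in this series */
    #endif

    /* Tagged address ABI plus synchronous tag check fault reporting. */
    int enable_sync_tag_checks(void)
    {
    	return prctl(PR_SET_TAGGED_ADDR_CTRL,
    		     PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC, 0, 0, 0);
    }

This is precisely the sort of usage the man pages discussed in this
subthread would pin down.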

* Re: [PATCH v4 15/26] arm64: mte: Allow user control of the tag check mode via prctl()
  2020-05-27  7:46     ` Will Deacon
@ 2020-05-27  8:32       ` Dave Martin
  -1 siblings, 0 replies; 123+ messages in thread
From: Dave Martin @ 2020-05-27  8:32 UTC (permalink / raw)
  To: Will Deacon
  Cc: Catalin Marinas, linux-arch, Szabolcs Nagy, Andrey Konovalov,
	Kevin Brodsky, linux-mm, Vincenzo Frascino, Peter Collingbourne,
	linux-arm-kernel

On Wed, May 27, 2020 at 08:46:59AM +0100, Will Deacon wrote:
> On Fri, May 15, 2020 at 06:16:01PM +0100, Catalin Marinas wrote:
> > By default, even if PROT_MTE is set on a memory range, there is no tag
> > check fault reporting (SIGSEGV). Introduce a set of options to the
> > existing prctl(PR_SET_TAGGED_ADDR_CTRL) to allow user control of the tag
> > check fault mode:
> > 
> >   PR_MTE_TCF_NONE  - no reporting (default)
> >   PR_MTE_TCF_SYNC  - synchronous tag check fault reporting
> >   PR_MTE_TCF_ASYNC - asynchronous tag check fault reporting
> > 
> > These options translate into the corresponding SCTLR_EL1.TCF0 bitfield,
> > context-switched by the kernel. Note that uaccess done by the kernel is
> > not checked and cannot be configured by the user.
> > 
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> > 
> > Notes:
> >     v3:
> >     - Use SCTLR_EL1_TCF0_NONE instead of 0 for consistency.
> >     - Move mte_thread_switch() in this patch from an earlier one. In
> >       addition, it is called after the dsb() in __switch_to() so that any
> >       asynchronous tag check faults have been registered in the TFSR_EL1
> > registers (to be added with the in-kernel MTE support).
> >     
> >     v2:
> >     - Handle SCTLR_EL1_TCF0_NONE explicitly for consistency with PR_MTE_TCF_NONE.
> >     - Fix SCTLR_EL1 register setting in flush_mte_state() (thanks to Peter
> >       Collingbourne).
> >     - Added ISB to update_sctlr_el1_tcf0() since, with the latest
> >       architecture update/fix, the TCF0 field is used by the uaccess
> >       routines.
> > 
> >  arch/arm64/include/asm/mte.h       | 14 ++++++
> >  arch/arm64/include/asm/processor.h |  3 ++
> >  arch/arm64/kernel/mte.c            | 77 ++++++++++++++++++++++++++++++
> >  arch/arm64/kernel/process.c        | 26 ++++++++--
> >  include/uapi/linux/prctl.h         |  6 +++
> >  5 files changed, 123 insertions(+), 3 deletions(-)
> 
> Dave is working on man pages for prctl() (and I think also ptrace). I think
> it would be /very/ useful for us to have some RFC patches on top of his work
> adding documentation for the MTE interactions, as we found some other minor
> issues/inconsistencies as a direct result of writing and reviewing the man
> page for our existing interfaces.

I have a local draft for the address tagging and MTE prctls already btw.
I hadn't posted them yet so as to focus on nailing the "easy" stuff down
;)

If I have time I'll try and get them posted today so that people can
take a look before next week.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v4 15/26] arm64: mte: Allow user control of the tag check mode via prctl()
  2020-05-27  8:32       ` Dave Martin
@ 2020-05-27  8:48         ` Will Deacon
  -1 siblings, 0 replies; 123+ messages in thread
From: Will Deacon @ 2020-05-27  8:48 UTC (permalink / raw)
  To: Dave Martin
  Cc: Catalin Marinas, linux-arch, Szabolcs Nagy, Andrey Konovalov,
	Kevin Brodsky, linux-mm, Vincenzo Frascino, Peter Collingbourne,
	linux-arm-kernel

On Wed, May 27, 2020 at 09:32:20AM +0100, Dave Martin wrote:
> On Wed, May 27, 2020 at 08:46:59AM +0100, Will Deacon wrote:
> > On Fri, May 15, 2020 at 06:16:01PM +0100, Catalin Marinas wrote:
> > > By default, even if PROT_MTE is set on a memory range, there is no tag
> > > check fault reporting (SIGSEGV). Introduce a set of options to the
> > > existing prctl(PR_SET_TAGGED_ADDR_CTRL) to allow user control of the tag
> > > check fault mode:
> > > 
> > >   PR_MTE_TCF_NONE  - no reporting (default)
> > >   PR_MTE_TCF_SYNC  - synchronous tag check fault reporting
> > >   PR_MTE_TCF_ASYNC - asynchronous tag check fault reporting
> > > 
> > > These options translate into the corresponding SCTLR_EL1.TCF0 bitfield,
> > > context-switched by the kernel. Note that uaccess done by the kernel is
> > > not checked and cannot be configured by the user.
> > > 
> > > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Will Deacon <will@kernel.org>
> > > ---
> > > 
> > > Notes:
> > >     v3:
> > >     - Use SCTLR_EL1_TCF0_NONE instead of 0 for consistency.
> > >     - Move mte_thread_switch() in this patch from an earlier one. In
> > >       addition, it is called after the dsb() in __switch_to() so that any
> > >       asynchronous tag check faults have been registered in the TFSR_EL1
> > > registers (to be added with the in-kernel MTE support).
> > >     
> > >     v2:
> > >     - Handle SCTLR_EL1_TCF0_NONE explicitly for consistency with PR_MTE_TCF_NONE.
> > >     - Fix SCTLR_EL1 register setting in flush_mte_state() (thanks to Peter
> > >       Collingbourne).
> > >     - Added ISB to update_sctlr_el1_tcf0() since, with the latest
> > >       architecture update/fix, the TCF0 field is used by the uaccess
> > >       routines.
> > > 
> > >  arch/arm64/include/asm/mte.h       | 14 ++++++
> > >  arch/arm64/include/asm/processor.h |  3 ++
> > >  arch/arm64/kernel/mte.c            | 77 ++++++++++++++++++++++++++++++
> > >  arch/arm64/kernel/process.c        | 26 ++++++++--
> > >  include/uapi/linux/prctl.h         |  6 +++
> > >  5 files changed, 123 insertions(+), 3 deletions(-)
> > 
> > Dave is working on man pages for prctl() (and I think also ptrace). I think
> > it would be /very/ useful for us to have some RFC patches on top of his work
> > adding documentation for the MTE interactions, as we found some other minor
> > issues/inconsistencies as a direct result of writing and reviewing the man
> > page for our existing interfaces.
> 
> I have a local draft for the address tagging and MTE prctls already btw.
> I hadn't posted them yet so as to focus on nailing the "easy" stuff down
> ;)
> 
> If I have time I'll try and get them posted today so that people can
> take a look before next week.

Oh, great! I wasn't meaning that you should be the one doing it, but if
you've already drafted them, that's really good. Might make sense for them to
appear as RFC patches at the end of this series, to be honest, so the next
posting (v5) can all be reviewed together.

But I'll leave that up to you and Catalin to figure out.

Will

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
  2020-05-27  2:11               ` Patrick Daly
@ 2020-05-27  9:55                 ` Will Deacon
  -1 siblings, 0 replies; 123+ messages in thread
From: Will Deacon @ 2020-05-27  9:55 UTC (permalink / raw)
  To: Patrick Daly
  Cc: Catalin Marinas, linux-arch, Vladimir Murzin, Szabolcs Nagy,
	Andrey Konovalov, Kevin Brodsky, Peter Collingbourne, linux-mm,
	Vincenzo Frascino, Dave P Martin, linux-arm-kernel

On Tue, May 26, 2020 at 07:11:53PM -0700, Patrick Daly wrote:
> On Fri, May 22, 2020 at 11:37:15AM +0100, Catalin Marinas wrote:
> > On Thu, May 21, 2020 at 10:57:10PM -0700, Patrick Daly wrote:
> > > On Mon, May 18, 2020 at 06:20:55PM +0100, Catalin Marinas wrote:
> > > > On Mon, May 18, 2020 at 12:31:03PM +0100, Will Deacon wrote:
> > > > > On Mon, May 18, 2020 at 12:26:30PM +0100, Vladimir Murzin wrote:
> > > > > > On 5/15/20 6:16 PM, Catalin Marinas wrote:
> > > > > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > > > > > > index f2a93c8679e8..7436e7462b85 100644
> > > > > > > --- a/Documentation/admin-guide/kernel-parameters.txt
> > > > > > > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > > > > > > @@ -373,6 +373,10 @@
> > > > > > >  	arcrimi=	[HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
> > > > > > >  			Format: <io>,<irq>,<nodeID>
> > > > > > >  
> > > > > > > +	arm64.mte_disable=
> > > > > > > +			[ARM64] Disable Linux support for the Memory
> > > > > > > +			Tagging Extension (both user and in-kernel).
> > > > > > > +
> > > > > > 
> > > > > > Should it really take a parameter (on/off/true/false)? It may lead to the
> > > > > > expectation that arm64.mte_disable=false should enable MTE and, yes, double
> > > > > > negatives make it look ugly, so if we do need a parameter, can it be arm64.mte=on/off/true/false?
> > > > > 
> > > > > I don't think "performance analysis" is a good justification for this
> > > > > parameter tbh. We don't tend to add these options for other architectural
> > > > > features, and I don't see why MTE is any different in this regard.
> > > > 
> > > > There is an expectation of performance impact with MTE enabled,
> > > > especially if it's running in synchronous mode. For the in-kernel MTE,
> > > > we could add a parameter which sets sync vs async at boot time rather
> > > > than a big disable knob. It won't affect user space however.
> > > > 
> > > > The other 'justification' is if your hardware has weird unexpected
> > > > behaviour but I'd like this handled via errata workarounds.
> > > > 
> > > > I'll let the people who asked for this chip in ;). I agree with you
> > > > that we rarely add these (and I rejected a similar option a few weeks
> > > > ago on the AMU patchset).
> > > 
> > > We've been looking into other ways this on/off behavior could be achieved.
> > 
> > The actual question here is what the on/off behaviour is needed for. We
> > can figure out the best mechanism for this once we know what we want to
> > achieve. My wild guess above was performance analysis but that can be
> > toggled by either kernel boot parameter or run-time sysctl (or just the
> > Kconfig option).
> > 
> > If it is about forcing user space not to use MTE, we may look into some
> > other sysctl controls (we already have one for the tagged address ABI).
> 
> We want to allow the end user to easily "opt out" of MTE in favour of
> better power, perf and battery life.

Who is "the end user" in this case?

If MTE is bad enough for power, performance and battery life that we need a
kill switch, then perhaps we shouldn't enable it by default and the few
people that want to use it can build a kernel with it enabled. However, then
I don't really see what MTE buys you over the existing KASAN implementations.

I thought the general idea was that you could run in the (cheap) "async"
mode, and then re-run in the more expensive "sync" mode to further diagnose
any failures. That model seems to work well with these patches, since
reporting is disabled by default. Are you saying that there is a
significant penalty incurred even when reporting is not enabled?

Anyway, we don't offer global runtime/cmdline switches for the vast majority
of other architectural features -- instead, we choose a sensible default,
and I think we should do the same here.

Will

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
  2020-05-27  9:55                 ` Will Deacon
@ 2020-05-27 10:37                   ` Szabolcs Nagy
  -1 siblings, 0 replies; 123+ messages in thread
From: Szabolcs Nagy @ 2020-05-27 10:37 UTC (permalink / raw)
  To: Will Deacon
  Cc: Patrick Daly, Catalin Marinas, linux-arch, Vladimir Murzin,
	Andrey Konovalov, Kevin Brodsky, Peter Collingbourne, linux-mm,
	Vincenzo Frascino, Dave P Martin, linux-arm-kernel, nd

The 05/27/2020 10:55, Will Deacon wrote:
> On Tue, May 26, 2020 at 07:11:53PM -0700, Patrick Daly wrote:
> > On Fri, May 22, 2020 at 11:37:15AM +0100, Catalin Marinas wrote:
> > > The actual question here is what the on/off behaviour is needed for. We
> > > can figure out the best mechanism for this once we know what we want to
> > > achieve. My wild guess above was performance analysis but that can be
> > > toggled by either kernel boot parameter or run-time sysctl (or just the
> > > Kconfig option).
> > > 
> > > If it is about forcing user space not to use MTE, we may look into some
> > > other sysctl controls (we already have one for the tagged address ABI).
> > 
> > We want to allow the end user to easily "opt out" of MTE in favour of
> > better power, perf and battery life.
> 
> Who is "the end user" in this case?
> 
> If MTE is bad enough for power, performance and battery life that we need a
> kill switch, then perhaps we shouldn't enable it by default and the few
> people that want to use it can build a kernel with it enabled. However, then
> I don't really see what MTE buys you over the existing KASAN implementations.
> 
> I thought the general idea was that you could run in the (cheap) "async"
> mode, and then re-run in the more expensive "sync" mode to further diagnose
> any failures. That model seems to work well with these patches, since
> reporting is disabled by default. Are you saying that there is a
> significant penalty incurred even when reporting is not enabled?
> 
> Anyway, we don't offer global runtime/cmdline switches for the vast majority
> of other architectural features -- instead, we choose a sensible default,
> and I think we should do the same here.

I would not expect MTE overhead if userspace processes
don't map anything with PROT_MTE and don't enable tag
checking with prctl() (i.e. userspace runtimes can "opt
out" of MTE for better power, perf and battery).

Is that not the case?

^ permalink raw reply	[flat|nested] 123+ messages in thread
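
Szabolcs's opt-out model is straightforward to picture: nothing is
tag-checked until a process both maps memory with the proposed PROT_MTE
and enables reporting via prctl(). A minimal sketch (PROT_MTE's value is
the arm64-specific one proposed by this series, not yet in libc headers):

    #include <stddef.h>
    #include <sys/mman.h>

    #ifndef PROT_MTE
    #define PROT_MTE	0x20	/* arm64-specific, proposed in this series */
    #endif

    /* Anonymous mapping carrying allocation tags; accesses are only
     * checked once tag check faults are enabled with prctl(). */
    static void *alloc_tagged(size_t len)
    {
    	return mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_MTE,
    		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    }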

* Re: [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
  2020-05-27  9:55                 ` Will Deacon
@ 2020-05-27 11:12                   ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-27 11:12 UTC (permalink / raw)
  To: Will Deacon
  Cc: Patrick Daly, linux-arch, Vladimir Murzin, Szabolcs Nagy,
	Andrey Konovalov, Kevin Brodsky, Peter Collingbourne, linux-mm,
	Vincenzo Frascino, Dave P Martin, linux-arm-kernel

On Wed, May 27, 2020 at 10:55:05AM +0100, Will Deacon wrote:
> On Tue, May 26, 2020 at 07:11:53PM -0700, Patrick Daly wrote:
> > On Fri, May 22, 2020 at 11:37:15AM +0100, Catalin Marinas wrote:
> > > On Thu, May 21, 2020 at 10:57:10PM -0700, Patrick Daly wrote:
> > > > On Mon, May 18, 2020 at 06:20:55PM +0100, Catalin Marinas wrote:
> > > > > On Mon, May 18, 2020 at 12:31:03PM +0100, Will Deacon wrote:
> > > > > > On Mon, May 18, 2020 at 12:26:30PM +0100, Vladimir Murzin wrote:
> > > > > > > On 5/15/20 6:16 PM, Catalin Marinas wrote:
> > > > > > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > > > > > > > index f2a93c8679e8..7436e7462b85 100644
> > > > > > > > --- a/Documentation/admin-guide/kernel-parameters.txt
> > > > > > > > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > > > > > > > @@ -373,6 +373,10 @@
> > > > > > > >  	arcrimi=	[HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
> > > > > > > >  			Format: <io>,<irq>,<nodeID>
> > > > > > > >  
> > > > > > > > +	arm64.mte_disable=
> > > > > > > > +			[ARM64] Disable Linux support for the Memory
> > > > > > > > +			Tagging Extension (both user and in-kernel).
> > > > > > > > +
> > > > > > > 
> > > > > > > Should it really take a parameter (on/off/true/false)? It may lead to the
> > > > > > > expectation that arm64.mte_disable=false should enable MTE and, yes, double
> > > > > > > negatives make it look ugly, so if we do need a parameter, can it be arm64.mte=on/off/true/false?
> > > > > > 
> > > > > > I don't think "performance analysis" is a good justification for this
> > > > > > parameter tbh. We don't tend to add these options for other architectural
> > > > > > features, and I don't see why MTE is any different in this regard.
> > > > > 
> > > > > There is an expectation of performance impact with MTE enabled,
> > > > > especially if it's running in synchronous mode. For the in-kernel MTE,
> > > > > we could add a parameter which sets sync vs async at boot time rather
> > > > > than a big disable knob. It won't affect user space however.
> > > > > 
> > > > > The other 'justification' is if your hardware has weird unexpected
> > > > > behaviour but I'd like this handled via errata workarounds.
> > > > > 
> > > > > I'll let the people who asked for this to chip in ;). I agree with you
> > > > > that we rarely add these (and I rejected a similar option a few weeks
> > > > > ago on the AMU patchset).
> > > > 
> > > > We've been looking into other ways this on/off behavior could be achieved.
> > > 
> > > The actual question here is what the on/off behaviour is needed for. We
> > > can figure out the best mechanism for this once we know what we want to
> > > achieve. My wild guess above was performance analysis but that can be
> > > toggled by either kernel boot parameter or run-time sysctl (or just the
> > > Kconfig option).
> > > 
> > > If it is about forcing user space not to use MTE, we may look into some
> > > other sysctl controls (we already have one for the tagged address ABI).
> > 
> > We want to allow the end user to be able to easily "opt out" of MTE in favour
> > of better power, perf and battery life.
> 
> Who is "the end user" in this case?

Good question. I have a suspicion it's still the (kernel) developer ;).

> If MTE is bad enough for power, performance and battery life that we need a
> kill switch, then perhaps we shouldn't enable it by default and the few
> people that want to use it can build a kernel with it enabled. However, then
> I don't really see what MTE buys you over the existing KASAN implementations.

MTE is faster than KASan (async mode being the fastest); however, I'd
expect it to still be noticeable compared to no MTE. It's a trade-off
worth making if you want to find security bugs in your code on a large
scale.

> I thought the general idea was that you could run in the (cheap) "async"
> mode, and then re-run in the more expensive "sync" mode to further diagnose
> any failures. That model seems to work well with these patches, since
> reporting is disabled by default. Are you saying that there is a
> significant penalty incurred even when reporting is not enabled?

The tag checking mode is controlled by the user on a per-process basis.
The modes and their hardware perf/power expectations are (a usage sketch
follows the list):

1. no tag checking - no expected performance penalty from the hardware
   perspective (tags not fetched from memory).

2. async tag checking - tags fetched from memory but checked
   asynchronously, so it allows the hardware to perform as well as it
   can (I don't have numbers yet). Probably a small degradation vs (1).

3. sync tag checking - there is an expectation of further perf/power
   degradation vs (2).
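
These map onto the prctl() added by this series; a minimal, untested
sketch of a thread opting in, using the PR_MTE_TCF_* constants from
patch 15:

	#include <sys/prctl.h>

	/* enable the tagged addr ABI and pick the async tag check mode */
	prctl(PR_SET_TAGGED_ADDR_CTRL,
	      PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_ASYNC, 0, 0, 0);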

In addition to the hardware aspects above, you have the software cost
of colouring memory both on allocation and on free. By default,
malloc()/free() wouldn't touch the memory (maybe some red zones) but
with MTE the libc will have to set the colour. That's faster than a
memset, since it only needs to store 4 bits for every 16 bytes of
address space, but slower than not doing it at all. For a calloc(), the
memset + tag setting can be combined in a single DC instruction.
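
To illustrate the colouring step (a rough, untested sketch: stzg sets
the allocation tag and zeroes one 16-byte granule in a single
instruction, so a calloc-style path can colour and clear together; dc
gzva does the same at cache line granularity):

	#include <stddef.h>

	/* assumes ptr and size are 16-byte (tag granule) aligned and the
	 * desired tag is already in the top byte of ptr (e.g. via irg)
	 */
	static void tag_and_zero(void *ptr, size_t size)
	{
		for (size_t i = 0; i < size; i += 16)
			asm volatile("stzg %0, [%0]"
				     : : "r"((char *)ptr + i) : "memory");
	}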

So, it really depends on what the user is doing. If we want a knob where
the user doesn't even attempt to colour pages (not even (1) above),
maybe a user space env variable parsed by the libc is a better option.

While MTE and the tagged addr ABI are complementary (one can still set
PROT_MTE without enabling the tagged addr ABI), most likely a libc
implementation would try to enable the latter before using MTE. We
already have a sysctl to force the tagged addr ABI off. The side-effect
is that MTE will be disabled in the C library, so we can assume no
run-time cost (the libc people to confirm).

The tagged addr sysctl doesn't cover the in-kernel MTE but we can leave
the discussion for when we have the patches.

> Anyway, we don't offer global runtime/cmdline switches for the vast majority
> of other architectural features -- instead, we choose a sensible default,
> and I think we should do the same here.

The sensible default is currently "off" with a user opt-in. I think the
question is whether we need a "safety" knob at the kernel level, as we
did with the sysctl abi.tagged_addr_disabled, or whether we leave it to
users as they see fit (e.g. env variables) since it doesn't affect the
kernel (unlike the tagged addr ABI).

-- 
Catalin

* Re: [PATCH v4 15/26] arm64: mte: Allow user control of the tag check mode via prctl()
  2020-05-27  8:32       ` Dave Martin
@ 2020-05-27 11:16         ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-27 11:16 UTC (permalink / raw)
  To: Dave Martin
  Cc: Will Deacon, linux-arch, Szabolcs Nagy, Andrey Konovalov,
	Kevin Brodsky, linux-mm, Vincenzo Frascino, Peter Collingbourne,
	linux-arm-kernel

On Wed, May 27, 2020 at 09:32:20AM +0100, Dave P Martin wrote:
> On Wed, May 27, 2020 at 08:46:59AM +0100, Will Deacon wrote:
> > On Fri, May 15, 2020 at 06:16:01PM +0100, Catalin Marinas wrote:
> > > By default, even if PROT_MTE is set on a memory range, there is no tag
> > > check fault reporting (SIGSEGV). Introduce a set of options to the
> > > existing prctl(PR_SET_TAGGED_ADDR_CTRL) to allow user control of the tag
> > > check fault mode:
> > > 
> > >   PR_MTE_TCF_NONE  - no reporting (default)
> > >   PR_MTE_TCF_SYNC  - synchronous tag check fault reporting
> > >   PR_MTE_TCF_ASYNC - asynchronous tag check fault reporting
> > > 
> > > These options translate into the corresponding SCTLR_EL1.TCF0 bitfield,
> > > context-switched by the kernel. Note that uaccess done by the kernel is
> > > not checked and cannot be configured by the user.
> > > 
> > > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Will Deacon <will@kernel.org>
> > > ---
> > > 
> > > Notes:
> > >     v3:
> > >     - Use SCTLR_EL1_TCF0_NONE instead of 0 for consistency.
> > >     - Move mte_thread_switch() in this patch from an earlier one. In
> > >       addition, it is called after the dsb() in __switch_to() so that any
> > >       asynchronous tag check faults have been registered in the TFSR_EL1
> > >       registers (to be added with the in-kernel MTE support).
> > >     
> > >     v2:
> > >     - Handle SCTLR_EL1_TCF0_NONE explicitly for consistency with PR_MTE_TCF_NONE.
> > >     - Fix SCTLR_EL1 register setting in flush_mte_state() (thanks to Peter
> > >       Collingbourne).
> > >     - Added ISB to update_sctlr_el1_tcf0() since, with the latest
> > >       architecture update/fix, the TCF0 field is used by the uaccess
> > >       routines.
> > > 
> > >  arch/arm64/include/asm/mte.h       | 14 ++++++
> > >  arch/arm64/include/asm/processor.h |  3 ++
> > >  arch/arm64/kernel/mte.c            | 77 ++++++++++++++++++++++++++++++
> > >  arch/arm64/kernel/process.c        | 26 ++++++++--
> > >  include/uapi/linux/prctl.h         |  6 +++
> > >  5 files changed, 123 insertions(+), 3 deletions(-)
> > 
> > Dave is working on man pages for prctl() (and I think also ptrace). I think
> > it would be /very/ useful for us to have some RFC patches on top of his work
> > adding documentation for the MTE interactions, as we found some other minor
> > issues/inconsistencies as a direct result of writing and reviewing the man
> > page for our existing interfaces.
> 
> I have a local draft for the address tagging and MTE prctls already btw.
> I hadn't posted them yet so as to focus on nailing the "easy" stuff down
> ;)

That's great Dave. Thanks!

> If I have time I'll try and get them posted today so that people can
> take a look before next week.

Feel free to post them whenever you can. I'll include them in v5 (likely
to be posted after the merge window).

-- 
Catalin

* Re: [PATCH v4 11/26] arm64: mte: Add PROT_MTE support to mmap() and mprotect()
@ 2020-05-27 18:57     ` Peter Collingbourne
  0 siblings, 0 replies; 123+ messages in thread
From: Peter Collingbourne @ 2020-05-27 18:57 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, Szabolcs Nagy, Andrey Konovalov, Kevin Brodsky,
	linux-mm, Evgenii Stepanov, Vincenzo Frascino, Will Deacon,
	Dave P Martin, Linux ARM

On Fri, May 15, 2020 at 10:16 AM Catalin Marinas
<catalin.marinas@arm.com> wrote:
>
> To enable tagging on a memory range, the user must explicitly opt in via
> a new PROT_MTE flag passed to mmap() or mprotect(). Since this is a new
> memory type in the AttrIndx field of a pte, simplify the or'ing of these
> bits over the protection_map[] attributes by making MT_NORMAL index 0.

Should the userspace stack always be mapped as if with PROT_MTE if the
hardware supports it? Such a change would be invisible to non-MTE
aware userspace since it would already need to opt in to tag checking
via prctl. This would let userspace avoid a complex stack
initialization sequence when running with stack tagging enabled on the
main thread.

Peter

* Re: [PATCH v4 11/26] arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  2020-05-27 18:57     ` Peter Collingbourne
@ 2020-05-28  9:14       ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-28  9:14 UTC (permalink / raw)
  To: Peter Collingbourne
  Cc: Linux ARM, linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Evgenii Stepanov

On Wed, May 27, 2020 at 11:57:39AM -0700, Peter Collingbourne wrote:
> On Fri, May 15, 2020 at 10:16 AM Catalin Marinas
> <catalin.marinas@arm.com> wrote:
> > To enable tagging on a memory range, the user must explicitly opt in via
> > a new PROT_MTE flag passed to mmap() or mprotect(). Since this is a new
> > memory type in the AttrIndx field of a pte, simplify the or'ing of these
> > bits over the protection_map[] attributes by making MT_NORMAL index 0.
> 
> Should the userspace stack always be mapped as if with PROT_MTE if the
> hardware supports it? Such a change would be invisible to non-MTE
> aware userspace since it would already need to opt in to tag checking
> via prctl. This would let userspace avoid a complex stack
> initialization sequence when running with stack tagging enabled on the
> main thread.

I don't think the stack initialisation is that difficult. On program
startup (this can be done by the dynamic loader), something like (untested):

	/* the page containing the current stack pointer */
	register unsigned long stack asm ("sp");
	unsigned long page_sz = sysconf(_SC_PAGESIZE);

	/* PROT_GROWSDOWN extends the change down the stack vma */
	mprotect((void *)(stack & ~(page_sz - 1)), page_sz,
		 PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN);

(the essential part is PROT_GROWSDOWN so that you don't have to specify
a stack lower limit)

I don't like enabling this by default since it will have a small cost
even if the application doesn't enable tag checking. The kernel would
still have to zero the tags when mapping the stack and preserve them
when swapping out.

Another case where this could go wrong is if we want to enable some
quiet monitoring of user programs: the libc enables PROT_MTE on heap
allocations but keeps tag checking disabled as it doesn't want any
SIGSEGV; the kernel could enable async TCF and log any faults
(rate-limited). A default PROT_MTE stack would get in the way. Anyway,
this use-case is something for the future; so far these patches rely on
the user solely driving the tag checking mode.

I'm fine, however, with enabling PROT_MTE on the main stack based on
some ELF note.

-- 
Catalin

* Re: [PATCH v4 11/26] arm64: mte: Add PROT_MTE support to mmap() and mprotect()
@ 2020-05-28 11:05         ` Szabolcs Nagy
  0 siblings, 0 replies; 123+ messages in thread
From: Szabolcs Nagy @ 2020-05-28 11:05 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, nd, Peter Collingbourne, Kevin Brodsky, linux-mm,
	Evgenii Stepanov, Andrey Konovalov, Vincenzo Frascino,
	Will Deacon, Dave P Martin, Linux ARM

The 05/28/2020 10:14, Catalin Marinas wrote:
> On Wed, May 27, 2020 at 11:57:39AM -0700, Peter Collingbourne wrote:
> > On Fri, May 15, 2020 at 10:16 AM Catalin Marinas
> > <catalin.marinas@arm.com> wrote:
> > > To enable tagging on a memory range, the user must explicitly opt in via
> > > a new PROT_MTE flag passed to mmap() or mprotect(). Since this is a new
> > > memory type in the AttrIndx field of a pte, simplify the or'ing of these
> > > bits over the protection_map[] attributes by making MT_NORMAL index 0.
> > 
> > Should the userspace stack always be mapped as if with PROT_MTE if the
> > hardware supports it? Such a change would be invisible to non-MTE
> > aware userspace since it would already need to opt in to tag checking
> > via prctl. This would let userspace avoid a complex stack
> > initialization sequence when running with stack tagging enabled on the
> > main thread.
> 
> I don't think the stack initialisation is that difficult. On program
> startup (this can be done by the dynamic loader), something like (untested):
> 
> 	register unsigned long stack asm ("sp");
> 	unsigned long page_sz = sysconf(_SC_PAGESIZE);
> 
> 	mprotect((void *)(stack & ~(page_sz - 1)), page_sz,
> 		 PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN);
> 
> (the essential part is PROT_GROWSDOWN so that you don't have to specify
> a stack lower limit)

does this work even if the currently mapped stack is more than page_sz?
determining the mapped main stack area is i think non-trivial to do in
userspace (requires parsing /proc/self/maps or similar).

...
> I'm fine, however, with enabling PROT_MTE on the main stack based on
> some ELF note.

note that would likely mean an elf note on the dynamic linker
(because a dynamic linked executable may not be loaded by the
kernel and ctors in loaded libs run before the executable entry
code anyway, so the executable alone cannot be in charge of this
decision) i.e. one global switch for all dynamic linked binaries.

i think a dynamic linker can map a new stack and switch to it
if it needs to control the properties of the stack at runtime
(it's wasteful though).

and i think there should be a runtime mechanism for the brk area:
it should be possible to request that future brk expansions are
mapped as PROT_MTE so an mte aware malloc implementation can use
brk. i think this is not important in the initial design, but if
a prctl flag can do it that may be useful to add (may be at a
later time).
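
e.g. a hypothetical flag (made-up name and bit, not part of this
series) could look like:

	/* hypothetical: ask that future brk expansions are mapped PROT_MTE */
	#define PR_MTE_TAG_BRK	(1UL << 19)	/* made-up bit */

	prctl(PR_SET_TAGGED_ADDR_CTRL,
	      PR_TAGGED_ADDR_ENABLE | PR_MTE_TAG_BRK, 0, 0, 0);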

(and eventually there should be a way to use PROT_MTE on
writable global data and appropriate code generation that
takes colors into account when globals are accessed, but
that requires significant ELF, ld.so and compiler changes,
that need not be part of the initial mte design).

* Re: [PATCH v4 11/26] arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  2020-05-28 11:05         ` Szabolcs Nagy
@ 2020-05-28 16:34           ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-28 16:34 UTC (permalink / raw)
  To: Szabolcs Nagy
  Cc: Peter Collingbourne, Linux ARM, linux-mm, linux-arch,
	Will Deacon, Dave P Martin, Vincenzo Frascino, Kevin Brodsky,
	Andrey Konovalov, Evgenii Stepanov, nd

On Thu, May 28, 2020 at 12:05:09PM +0100, Szabolcs Nagy wrote:
> The 05/28/2020 10:14, Catalin Marinas wrote:
> > On Wed, May 27, 2020 at 11:57:39AM -0700, Peter Collingbourne wrote:
> > > On Fri, May 15, 2020 at 10:16 AM Catalin Marinas
> > > <catalin.marinas@arm.com> wrote:
> > > > To enable tagging on a memory range, the user must explicitly opt in via
> > > > a new PROT_MTE flag passed to mmap() or mprotect(). Since this is a new
> > > > memory type in the AttrIndx field of a pte, simplify the or'ing of these
> > > > bits over the protection_map[] attributes by making MT_NORMAL index 0.
> > > 
> > > Should the userspace stack always be mapped as if with PROT_MTE if the
> > > hardware supports it? Such a change would be invisible to non-MTE
> > > aware userspace since it would already need to opt in to tag checking
> > > via prctl. This would let userspace avoid a complex stack
> > > initialization sequence when running with stack tagging enabled on the
> > > main thread.
> > 
> > I don't think the stack initialisation is that difficult. On program
> > startup (this can be done by the dynamic loader), something like (untested):
> > 
> > 	register unsigned long stack asm ("sp");
> > 	unsigned long page_sz = sysconf(_SC_PAGESIZE);
> > 
> > 	mprotect((void *)(stack & ~(page_sz - 1)), page_sz,
> > 		 PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN);
> > 
> > (the essential part is PROT_GROWSDOWN so that you don't have to specify
> > a stack lower limit)
> 
> does this work even if the currently mapped stack is more than page_sz?
> determining the mapped main stack area is i think non-trivial to do in
> userspace (requires parsing /proc/self/maps or similar).

Because of PROT_GROWSDOWN, the kernel adjusts the start of the range
down automatically. It is potentially problematic if the top of the
stack is more than a page away and you want the whole stack coloured. I
haven't run a test but my reading of the kernel code is that the stack
vma would be split in this scenario, so the range beyond sp+page_sz
won't have PROT_MTE set.

My assumption is that if you do this during program start, the stack is
smaller than a page. Alternatively, could we use argv or envp to
determine the top of the user stack (the bottom is taken care of by the
kernel)?
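
Something along those lines, perhaps (untested sketch; assumes the
environment strings live near the top of the initial stack; needs
<unistd.h> and <sys/mman.h>):

	extern char **environ;

	register unsigned long stack asm ("sp");
	unsigned long page_sz = sysconf(_SC_PAGESIZE);
	unsigned long bottom = stack & ~(page_sz - 1);
	/* round the address of the first env string up to a page */
	unsigned long top = ((unsigned long)environ[0] + page_sz - 1)
			    & ~(page_sz - 1);

	mprotect((void *)bottom, top - bottom,
		 PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN);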

> > I'm fine, however, with enabling PROT_MTE on the main stack based on
> > some ELF note.
> 
> note that would likely mean an elf note on the dynamic linker
> (because a dynamic linked executable may not be loaded by the
> kernel and ctors in loaded libs run before the executable entry
> code anyway, so the executable alone cannot be in charge of this
> decision) i.e. one global switch for all dynamic linked binaries.

I guess parsing such a note in the kernel is only useful for static
binaries.

> i think a dynamic linker can map a new stack and switch to it
> if it needs to control the properties of the stack at runtime
> (it's wasteful though).

There is already user code to check for HWCAP2_MTE and the prctl(), so
adding an mprotect() doesn't look like a significant overhead.
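
(i.e. something like this, untested, with HWCAP2_MTE from this series:)

	#include <sys/auxv.h>

	if (getauxval(AT_HWCAP2) & HWCAP2_MTE) {
		/* enable tag checks via prctl(), then mprotect(PROT_MTE) */
	}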

> and i think there should be a runtime mechanism for the brk area:
> it should be possible to request that future brk expansions are
> mapped as PROT_MTE so an mte aware malloc implementation can use
> brk. i think this is not important in the initial design, but if
> a prctl flag can do it that may be useful to add (may be at a
> later time).

Looking at the kernel code briefly, I think this would work. We do end
up with two vmas for the brk, only the expansion having PROT_MTE, and
I'd have to find a way to store the extra flag.

From a coding perspective, it's easier to just set PROT_MTE by default
on both brk and initial stack ;) (VM_DATA_DEFAULT_FLAGS).

> (and eventually there should be a way to use PROT_MTE on
> writable global data and appropriate code generation that
> takes colors into account when globals are accessed, but
> that requires significant ELF, ld.so and compiler changes,
> that need not be part of the initial mte design).

The .data section needs to be driven by the ELF information. It's also a
file mapping and we don't support PROT_MTE on them even if MAP_PRIVATE.
There are complications like DAX where the file you mmap for CoW may be
hosted on memory that does not support MTE (copied to RAM on write).

Is there a use-case for global data to be tagged?

-- 
Catalin

* Re: [PATCH v4 11/26] arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  2020-05-28 16:34           ` Catalin Marinas
@ 2020-05-28 18:35             ` Evgenii Stepanov
  -1 siblings, 0 replies; 123+ messages in thread
From: Evgenii Stepanov @ 2020-05-28 18:35 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Szabolcs Nagy, Peter Collingbourne, Linux ARM,
	Linux Memory Management List, linux-arch, Will Deacon,
	Dave P Martin, Vincenzo Frascino, Kevin Brodsky,
	Andrey Konovalov, nd

On Thu, May 28, 2020 at 9:34 AM Catalin Marinas <catalin.marinas@arm.com>
wrote:

> On Thu, May 28, 2020 at 12:05:09PM +0100, Szabolcs Nagy wrote:
> > The 05/28/2020 10:14, Catalin Marinas wrote:
> > > On Wed, May 27, 2020 at 11:57:39AM -0700, Peter Collingbourne wrote:
> > > > On Fri, May 15, 2020 at 10:16 AM Catalin Marinas
> > > > <catalin.marinas@arm.com> wrote:
> > > > > To enable tagging on a memory range, the user must explicitly opt in via
> > > > > a new PROT_MTE flag passed to mmap() or mprotect(). Since this is a new
> > > > > memory type in the AttrIndx field of a pte, simplify the or'ing of these
> > > > > bits over the protection_map[] attributes by making MT_NORMAL index 0.
> > > >
> > > > Should the userspace stack always be mapped as if with PROT_MTE if the
> > > > hardware supports it? Such a change would be invisible to non-MTE
> > > > aware userspace since it would already need to opt in to tag checking
> > > > via prctl. This would let userspace avoid a complex stack
> > > > initialization sequence when running with stack tagging enabled on the
> > > > main thread.
> > >
> > > I don't think the stack initialisation is that difficult. On program
> > > startup (this can be done by the dynamic loader), something like (untested):
> > >
> > >     register unsigned long stack asm ("sp");
> > >     unsigned long page_sz = sysconf(_SC_PAGESIZE);
> > >
> > >     mprotect((void *)(stack & ~(page_sz - 1)), page_sz,
> > >              PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN);
> > >
> > > (the essential part is PROT_GROWSDOWN so that you don't have to specify
> > > a stack lower limit)
> >
> > does this work even if the currently mapped stack is more than page_sz?
> > determining the mapped main stack area is i think non-trivial to do in
> > userspace (requires parsing /proc/self/maps or similar).
>
> Because of PROT_GROWSDOWN, the kernel adjusts the start of the range
> down automatically. It is potentially problematic if the top of the
> stack is more than a page away and you want the whole stack coloured. I
> haven't run a test but my reading of the kernel code is that the stack
> vma would be split in this scenario, so the range beyond sp+page_sz
> won't have PROT_MTE set.
>
> My assumption is that if you do this during program start, the stack is
> smaller than a page. Alternatively, could we use argv or envp to
> determine the top of the user stack (the bottom is taken care of by the
> kernel)?
>

PROT_GROWSDOWN seems to work fine in our case, and the extra tag
maintenance overhead sounds like a valid argument against setting PROT_MTE
unconditionally.

On the other hand, we may end up doing this in userspace in every
process. The reason is that PROT_MTE cannot be set on a page that
contains a live frame with stack tagging, because of mismatching tags
(IRG is not affected by PROT_MTE but STG is). So ideally, this should be
done at (or near) the program entry point, while the stack is mostly
empty.


>
> > > I'm fine, however, with enabling PROT_MTE on the main stack based on
> > > some ELF note.
> >
> > note that would likely mean an elf note on the dynamic linker
> > (because a dynamic linked executable may not be loaded by the
> > kernel and ctors in loaded libs run before the executable entry
> > code anyway, so the executable alone cannot be in charge of this
> > decision) i.e. one global switch for all dynamic linked binaries.
>
> I guess parsing such note in the kernel is only useful for static
> binaries.
>
> > i think a dynamic linker can map a new stack and switch to it
> > if it needs to control the properties of the stack at runtime
> > (it's wasteful though).
>
> There is already user code to check for HWCAP2_MTE and the prctl(), so
> adding an mprotect() doesn't look like a significant overhead.
>
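
For reference, a minimal sketch (untested) of that full opt-in sequence
(HWCAP check, prctl(), mprotect()), assuming the HWCAP2_MTE,
PR_MTE_TCF_SYNC and PROT_MTE definitions from this series on top of the
existing PR_SET_TAGGED_ADDR_CTRL prctl():

    #define _GNU_SOURCE
    #include <sys/auxv.h>
    #include <sys/mman.h>
    #include <sys/prctl.h>
    #include <unistd.h>

    static void mte_init(void)
    {
        register unsigned long sp asm ("sp");
        unsigned long page_sz = sysconf(_SC_PAGESIZE);

        /* no MTE hardware: run with tag checking disabled */
        if (!(getauxval(AT_HWCAP2) & HWCAP2_MTE))
            return;

        /* enable the tagged address ABI and synchronous tag checking */
        prctl(PR_SET_TAGGED_ADDR_CTRL,
              PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC, 0, 0, 0);

        /* colour the (still mostly empty) initial stack */
        mprotect((void *)(sp & ~(page_sz - 1)), page_sz,
                 PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN);
    }
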
> > and i think there should be a runtime mechanism for the brk area:
> > it should be possible to request that future brk expansions are
> > mapped as PROT_MTE so an mte aware malloc implementation can use
> > brk. i think this is not important in the initial design, but if
> > a prctl flag can do it that may be useful to add (may be at a
> > later time).
>
> Looking at the kernel code briefly, I think this would work. We do end
> up with two vmas for the brk, only the expansion having PROT_MTE, and
> I'd have to find a way to store the extra flag.
>
> From a coding perspective, it's easier to just set PROT_MTE by default
> on both brk and initial stack ;) (VM_DATA_DEFAULT_FLAGS).
>
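
As an illustration of what such an MTE-aware allocator would do with a
PROT_MTE heap, a rough sketch (untested; compile with
-march=armv8.5-a+memtag; size is assumed to be a multiple of the
16-byte granule):

    #include <stddef.h>
    #include <stdint.h>

    #define MTE_GRANULE_SIZE    16UL

    /* colour a chunk carved out of a PROT_MTE region and return the
       tagged pointer the application should use */
    static void *mte_colour_chunk(void *chunk, size_t size)
    {
        uint64_t tagged;
        size_t off;

        /* IRG: insert a random logical tag into the pointer */
        asm ("irg %0, %1" : "=r" (tagged) : "r" (chunk));

        /* STG: set the matching allocation tag on each granule */
        for (off = 0; off < size; off += MTE_GRANULE_SIZE)
            asm volatile ("stg %0, [%0]"
                          : : "r" (tagged + off) : "memory");

        return (void *)tagged;
    }
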
> > (and eventually there should be a way to use PROT_MTE on
> > writable global data and appropriate code generation that
> > takes colors into account when globals are accessed, but
> > that requires significant ELF, ld.so and compiler changes,
> > that need not be part of the initial mte design).
>
> The .data section needs to be driven by the ELF information. It's also a
> file mapping and we don't support PROT_MTE on them even if MAP_PRIVATE.
> There are complications like DAX where the file you mmap for CoW may be
> hosted on memory that does not support MTE (copied to RAM on write).
>
> Is there a use-case for global data to be tagged?
>

Yes, catching global buffer overflow bugs. They are not nearly as common as
heap-based issues though.


>
> --
> Catalin
>

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v4 11/26] arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  2020-05-28 18:35             ` Evgenii Stepanov
@ 2020-05-29 11:19               ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-05-29 11:19 UTC (permalink / raw)
  To: Evgenii Stepanov
  Cc: Szabolcs Nagy, Peter Collingbourne, Linux ARM,
	Linux Memory Management List, linux-arch, Will Deacon,
	Dave P Martin, Vincenzo Frascino, Kevin Brodsky,
	Andrey Konovalov, nd

On Thu, May 28, 2020 at 11:35:50AM -0700, Evgenii Stepanov wrote:
> On Thu, May 28, 2020 at 9:34 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Thu, May 28, 2020 at 12:05:09PM +0100, Szabolcs Nagy wrote:
> > > On 05/28/2020 10:14, Catalin Marinas wrote:
> > > > I don't think the stack initialisation is that difficult. On program
> > > > startup (this can be done by the dynamic loader), something like (untested):
> > > >
> > > >     register unsigned long stack asm ("sp");
> > > >     unsigned long page_sz = sysconf(_SC_PAGESIZE);
> > > >
> > > >     mprotect((void *)(stack & ~(page_sz - 1)), page_sz,
> > > >              PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN);
> > > >
> > > > (the essential part is PROT_GROWSDOWN so that you don't have to specify
> > > > a stack lower limit)
> > >
> > > does this work even if the currently mapped stack is more than page_sz?
> > > determining the mapped main stack area is i think non-trivial to do in
> > > userspace (requires parsing /proc/self/maps or similar).
> >
> > Because of PROT_GROWSDOWN, the kernel adjusts the start of the range
> > down automatically. It is potentially problematic if the top of the
> > stack is more than a page away and you want the whole stack coloured. I
> > haven't run a test but my reading of the kernel code is that the stack
> > vma would be split in this scenario, so the range beyond sp+page_sz
> > won't have PROT_MTE set.
> >
> > My assumption is that if you do this during program start, the stack is
> > smaller than a page. Alternatively, could we use argv or envp to
> > determine the top of the user stack (the bottom is taken care of by the
> > kernel)?
> 
> PROT_GROWSDOWN seems to work fine in our case, and the extra tag
> maintenance overhead sounds like a valid argument against setting PROT_MTE
> unconditionally.
> 
> On the other hand, we may end up doing this in the userspace in every
> process. The reason is, PROT_MTE can not be set on a page that contains a
> live frame with stack tagging because of mismatching tags (IRG is not
> affected by PROT_MTE but STG is). So ideally, this should be done at (or
> near) the program entry point, while the stack is mostly empty.

Since stack tagging cannot use instructions in the NOP space anyway, I
think we need an ELF note to check for the presence of STG etc. and, in
addition, we can turn PROT_MTE on by default for the initial stack. Maybe on
such binaries we could just set PROT_MTE on all anonymous and ramfs
mappings (i.e. VM_MTE_ALLOWED implies VM_MTE).

For dynamically linked binaries, we base this decision on the main ELF,
not the interpreter, and it would be up to the dynamic loader to reject
libraries that have such a note when HWCAP2_MTE is not present.

> > > (and eventually there should be a way to use PROT_MTE on
> > > writable global data and appropriate code generation that
> > > takes colors into account when globals are accessed, but
> > > that requires significant ELF, ld.so and compiler changes,
> > > that need not be part of the initial mte design).
> >
> > The .data section needs to be driven by the ELF information. It's also a
> > file mapping and we don't support PROT_MTE on them even if MAP_PRIVATE.
> > There are complications like DAX where the file you mmap for CoW may be
> > hosted on memory that does not support MTE (copied to RAM on write).
> >
> > Is there a use-case for global data to be tagged?
> 
> Yes, catching global buffer overflow bugs. They are not nearly as
> common as heap-based issues though.

OK, so these would be tagged red-zones around global data. IIUC, having
different colours for global variables was not considered because of the
relocations and relative accesses.

If such red-zone colouring is done during load (the dynamic linker?), we
could set PROT_MTE only when MAP_PRIVATE and copied on write to make
sure it is in RAM. As above, I think this should be driven by some ELF
information.

There's also the option of scrapping PROT_MTE altogether and enabling
MTE (default tag 0) on all anonymous and private+copied pages (i.e.
those stored in RAM). At this point, I can't really tell whether there
will be a performance impact.

-- 
Catalin

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v4 18/26] arm64: mte: Add PTRACE_{PEEK,POKE}MTETAGS support
  2020-05-15 17:16   ` Catalin Marinas
@ 2020-05-29 21:25     ` Luis Machado
  -1 siblings, 0 replies; 123+ messages in thread
From: Luis Machado @ 2020-05-29 21:25 UTC (permalink / raw)
  To: Catalin Marinas, linux-arm-kernel
  Cc: linux-mm, linux-arch, Will Deacon, Dave P Martin,
	Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Alan Hayward,
	Omair Javaid

Hi Catalin,

I have a question about siginfo MTE information. I suppose SEGV_MTESERR 
will be the most useful setting for debugging, right? Does si_addr 
contain the tagged pointer with the logical tag, a zero-tagged memory 
address or a tagged pointer with the allocation tag?

From the debugger user's perspective, one would want to see both the
logical tag and the allocation tag. And it would be handy to have both 
available in siginfo. Does that make sense?

Also, when would we see SEGV_MTEAERR, for example? That would provide no 
additional information about a particular memory address, which is not 
that useful for the debugger.

Thanks,
Luis

On 5/15/20 2:16 PM, Catalin Marinas wrote:
> Add support for bulk setting/getting of the MTE tags in a tracee's
> address space at 'addr' in the ptrace() syscall prototype. 'data' points
> to a struct iovec in the tracer's address space with iov_base
> representing the address of a tracer's buffer of length iov_len. The
> tags to be copied to/from the tracer's buffer are stored as one tag per
> byte.
> 
> On successfully copying at least one tag, ptrace() returns 0 and updates
> the tracer's iov_len with the number of tags copied. In case of error,
> either -EIO or -EFAULT is returned, trying to follow the ptrace() man
> page.
> 
> Note that the tag copying functions are not performance critical,
> therefore they lack optimisations found in typical memory copy routines.
> 
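
To make the interface above concrete, a tracer-side sketch (untested)
of the PEEK direction, using the request number from the patch below:

    #include <stdint.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/uio.h>

    #ifndef PTRACE_PEEKMTETAGS
    #define PTRACE_PEEKMTETAGS    33
    #endif

    /* read up to n tags (one tag per byte) starting at addr in the
       stopped tracee; returns the number of tags copied, or -1 with
       errno set on failure */
    static long peek_tags(pid_t child, void *addr, uint8_t *tags, size_t n)
    {
        struct iovec iov = { .iov_base = tags, .iov_len = n };

        if (ptrace(PTRACE_PEEKMTETAGS, child, addr, &iov))
            return -1;
        return iov.iov_len;    /* updated by the kernel */
    }
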
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Alan Hayward <Alan.Hayward@arm.com>
> Cc: Luis Machado <luis.machado@linaro.org>
> Cc: Omair Javaid <omair.javaid@linaro.org>
> ---
> 
> Notes:
>      v4:
>      - Following the change to only clear the tags in a page if it is mapped
>        to user with PROT_MTE, ptrace() now will refuse to access tags in
>        pages not previously mapped with PROT_MTE (PG_mte_tagged set). This is
>        primarily to avoid leaking uninitialised tags to user via ptrace().
>      - Fix SYM_FUNC_END argument typo.
>      - Rename MTE_ALLOC_* to MTE_GRANULE_*.
>      - Use uao_user_alternative for the user access in case we ever want to
>        call mte_copy_tags_* with a kernel buffer. It also matches the other
>        uaccess routines in the kernel.
>      - Simplify arch_ptrace() slightly.
>      - Reorder down_write_killable() with access_ok() in
>        __access_remote_tags().
>      - Handle copy length 0 in mte_copy_tags_{to,from}_user().
>      - Use put_user() instead of __put_user().
>      
>      New in v3.
> 
>   arch/arm64/include/asm/mte.h         |  17 ++++
>   arch/arm64/include/uapi/asm/ptrace.h |   3 +
>   arch/arm64/kernel/mte.c              | 139 +++++++++++++++++++++++++++
>   arch/arm64/kernel/ptrace.c           |   7 ++
>   arch/arm64/lib/mte.S                 |  53 ++++++++++
>   5 files changed, 219 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index 7435f6619bf1..9d4b1390d07d 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -5,6 +5,11 @@
>   #ifndef __ASM_MTE_H
>   #define __ASM_MTE_H
>   
> +#define MTE_GRANULE_SIZE	UL(16)
> +#define MTE_GRANULE_MASK	(~(MTE_GRANULE_SIZE - 1))
> +#define MTE_TAG_SHIFT		56
> +#define MTE_TAG_SIZE		4
> +
>   #ifndef __ASSEMBLY__
>   
>   #include <linux/page-flags.h>
> @@ -12,6 +17,10 @@
>   #include <asm/pgtable-types.h>
>   
>   void mte_clear_page_tags(void *addr, size_t size);
> +unsigned long mte_copy_tags_from_user(void *to, const void __user *from,
> +				      unsigned long n);
> +unsigned long mte_copy_tags_to_user(void __user *to, void *from,
> +				    unsigned long n);
>   
>   #ifdef CONFIG_ARM64_MTE
>   
> @@ -25,6 +34,8 @@ void mte_thread_switch(struct task_struct *next);
>   void mte_suspend_exit(void);
>   long set_mte_ctrl(unsigned long arg);
>   long get_mte_ctrl(void);
> +int mte_ptrace_copy_tags(struct task_struct *child, long request,
> +			 unsigned long addr, unsigned long data);
>   
>   #else
>   
> @@ -54,6 +65,12 @@ static inline long get_mte_ctrl(void)
>   {
>   	return 0;
>   }
> +static inline int mte_ptrace_copy_tags(struct task_struct *child,
> +				       long request, unsigned long addr,
> +				       unsigned long data)
> +{
> +	return -EIO;
> +}
>   
>   #endif
>   
> diff --git a/arch/arm64/include/uapi/asm/ptrace.h b/arch/arm64/include/uapi/asm/ptrace.h
> index 1daf6dda8af0..cd2a4a164de3 100644
> --- a/arch/arm64/include/uapi/asm/ptrace.h
> +++ b/arch/arm64/include/uapi/asm/ptrace.h
> @@ -67,6 +67,9 @@
>   /* syscall emulation path in ptrace */
>   #define PTRACE_SYSEMU		  31
>   #define PTRACE_SYSEMU_SINGLESTEP  32
> +/* MTE allocation tag access */
> +#define PTRACE_PEEKMTETAGS	  33
> +#define PTRACE_POKEMTETAGS	  34
>   
>   #ifndef __ASSEMBLY__
>   
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index cda09bc8caf4..6abd6a16a145 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -4,14 +4,18 @@
>    */
>   
>   #include <linux/bitops.h>
> +#include <linux/kernel.h>
>   #include <linux/mm.h>
>   #include <linux/prctl.h>
>   #include <linux/sched.h>
> +#include <linux/sched/mm.h>
>   #include <linux/string.h>
>   #include <linux/thread_info.h>
> +#include <linux/uio.h>
>   
>   #include <asm/cpufeature.h>
>   #include <asm/mte.h>
> +#include <asm/ptrace.h>
>   #include <asm/sysreg.h>
>   
>   void mte_sync_tags(pte_t *ptep, pte_t pte)
> @@ -173,3 +177,138 @@ long get_mte_ctrl(void)
>   
>   	return ret;
>   }
> +
> +/*
> + * Access MTE tags in another process' address space as given in mm. Update
> + * the number of tags copied. Return 0 if any tags copied, error otherwise.
> + * Inspired by __access_remote_vm().
> + */
> +static int __access_remote_tags(struct task_struct *tsk, struct mm_struct *mm,
> +				unsigned long addr, struct iovec *kiov,
> +				unsigned int gup_flags)
> +{
> +	struct vm_area_struct *vma;
> +	void __user *buf = kiov->iov_base;
> +	size_t len = kiov->iov_len;
> +	int ret;
> +	int write = gup_flags & FOLL_WRITE;
> +
> +	if (!access_ok(buf, len))
> +		return -EFAULT;
> +
> +	if (down_read_killable(&mm->mmap_sem))
> +		return -EIO;
> +
> +	while (len) {
> +		unsigned long tags, offset;
> +		void *maddr;
> +		struct page *page = NULL;
> +
> +		ret = get_user_pages_remote(tsk, mm, addr, 1, gup_flags,
> +					    &page, &vma, NULL);
> +		if (ret <= 0)
> +			break;
> +
> +		/*
> +		 * Only copy tags if the page has been mapped as PROT_MTE
> +		 * (PG_mte_tagged set). Otherwise the tags are not valid and
> +		 * not accessible to user. Moreover, an mprotect(PROT_MTE)
> +		 * would cause the existing tags to be cleared if the page
> +		 * was never mapped with PROT_MTE.
> +		 */
> +		if (!test_bit(PG_mte_tagged, &page->flags)) {
> +			ret = -EOPNOTSUPP;
> +			put_page(page);
> +			break;
> +		}
> +
> +		/* limit access to the end of the page */
> +		offset = offset_in_page(addr);
> +		tags = min(len, (PAGE_SIZE - offset) / MTE_GRANULE_SIZE);
> +
> +		maddr = page_address(page);
> +		if (write) {
> +			tags = mte_copy_tags_from_user(maddr + offset, buf, tags);
> +			set_page_dirty_lock(page);
> +		} else {
> +			tags = mte_copy_tags_to_user(buf, maddr + offset, tags);
> +		}
> +		put_page(page);
> +
> +		/* error accessing the tracer's buffer */
> +		if (!tags)
> +			break;
> +
> +		len -= tags;
> +		buf += tags;
> +		addr += tags * MTE_GRANULE_SIZE;
> +	}
> +	up_read(&mm->mmap_sem);
> +
> +	/* return an error if no tags copied */
> +	kiov->iov_len = buf - kiov->iov_base;
> +	if (!kiov->iov_len) {
> +		/* check for error accessing the tracee's address space */
> +		if (ret <= 0)
> +			return -EIO;
> +		else
> +			return -EFAULT;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * Copy MTE tags in another process' address space at 'addr' to/from tracer's
> + * iovec buffer. Return 0 on success. Inspired by ptrace_access_vm().
> + */
> +static int access_remote_tags(struct task_struct *tsk, unsigned long addr,
> +			      struct iovec *kiov, unsigned int gup_flags)
> +{
> +	struct mm_struct *mm;
> +	int ret;
> +
> +	mm = get_task_mm(tsk);
> +	if (!mm)
> +		return -EPERM;
> +
> +	if (!tsk->ptrace || (current != tsk->parent) ||
> +	    ((get_dumpable(mm) != SUID_DUMP_USER) &&
> +	     !ptracer_capable(tsk, mm->user_ns))) {
> +		mmput(mm);
> +		return -EPERM;
> +	}
> +
> +	ret = __access_remote_tags(tsk, mm, addr, kiov, gup_flags);
> +	mmput(mm);
> +
> +	return ret;
> +}
> +
> +int mte_ptrace_copy_tags(struct task_struct *child, long request,
> +			 unsigned long addr, unsigned long data)
> +{
> +	int ret;
> +	struct iovec kiov;
> +	struct iovec __user *uiov = (void __user *)data;
> +	unsigned int gup_flags = FOLL_FORCE;
> +
> +	if (!system_supports_mte())
> +		return -EIO;
> +
> +	if (get_user(kiov.iov_base, &uiov->iov_base) ||
> +	    get_user(kiov.iov_len, &uiov->iov_len))
> +		return -EFAULT;
> +
> +	if (request == PTRACE_POKEMTETAGS)
> +		gup_flags |= FOLL_WRITE;
> +
> +	/* align addr to the MTE tag granule */
> +	addr &= MTE_GRANULE_MASK;
> +
> +	ret = access_remote_tags(child, addr, &kiov, gup_flags);
> +	if (!ret)
> +		ret = put_user(kiov.iov_len, &uiov->iov_len);
> +
> +	return ret;
> +}
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index 077e352495eb..c8bda5f5b321 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -34,6 +34,7 @@
>   #include <asm/cpufeature.h>
>   #include <asm/debug-monitors.h>
>   #include <asm/fpsimd.h>
> +#include <asm/mte.h>
>   #include <asm/pgtable.h>
>   #include <asm/pointer_auth.h>
>   #include <asm/stacktrace.h>
> @@ -1797,6 +1798,12 @@ const struct user_regset_view *task_user_regset_view(struct task_struct *task)
>   long arch_ptrace(struct task_struct *child, long request,
>   		 unsigned long addr, unsigned long data)
>   {
> +	switch (request) {
> +	case PTRACE_PEEKMTETAGS:
> +	case PTRACE_POKEMTETAGS:
> +		return mte_ptrace_copy_tags(child, request, addr, data);
> +	}
> +
>   	return ptrace_request(child, request, addr, data);
>   }
>   
> diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
> index a531b52fa5ba..862655a36013 100644
> --- a/arch/arm64/lib/mte.S
> +++ b/arch/arm64/lib/mte.S
> @@ -4,7 +4,9 @@
>    */
>   #include <linux/linkage.h>
>   
> +#include <asm/alternative.h>
>   #include <asm/assembler.h>
> +#include <asm/mte.h>
>   #include <asm/page.h>
>   
>   	.arch	armv8.5-a+memtag
> @@ -40,3 +42,54 @@ SYM_FUNC_START(mte_copy_page_tags)
>   	b.ne	1b
>   2:
>   SYM_FUNC_END(mte_copy_page_tags)
> +
> +/*
> + * Read tags from a user buffer (one tag per byte) and set the corresponding
> + * tags at the given kernel address. Used by PTRACE_POKEMTETAGS.
> + *   x0 - kernel address (to)
> + *   x1 - user buffer (from)
> + *   x2 - number of tags/bytes (n)
> + * Returns:
> + *   x0 - number of tags read/set
> + */
> +SYM_FUNC_START(mte_copy_tags_from_user)
> +	mov	x3, x1
> +	cbz	x2, 2f
> +1:
> +	uao_user_alternative 2f, ldrb, ldtrb, w4, x1, 0
> +	lsl	x4, x4, #MTE_TAG_SHIFT
> +	stg	x4, [x0], #MTE_GRANULE_SIZE
> +	add	x1, x1, #1
> +	subs	x2, x2, #1
> +	b.ne	1b
> +
> +	// exception handling and function return
> +2:	sub	x0, x1, x3		// update the number of tags set
> +	ret
> +SYM_FUNC_END(mte_copy_tags_from_user)
> +
> +/*
> + * Get the tags from a kernel address range and write the tag values to the
> + * given user buffer (one tag per byte). Used by PTRACE_PEEKMTETAGS.
> + *   x0 - user buffer (to)
> + *   x1 - kernel address (from)
> + *   x2 - number of tags/bytes (n)
> + * Returns:
> + *   x0 - number of tags read/set
> + */
> +SYM_FUNC_START(mte_copy_tags_to_user)
> +	mov	x3, x0
> +	cbz	x2, 2f
> +1:
> +	ldg	x4, [x1]
> +	ubfx	x4, x4, #MTE_TAG_SHIFT, #MTE_TAG_SIZE
> +	uao_user_alternative 2f, strb, sttrb, w4, x0, 0
> +	add	x0, x0, #1
> +	add	x1, x1, #MTE_GRANULE_SIZE
> +	subs	x2, x2, #1
> +	b.ne	1b
> +
> +	// exception handling and function return
> +2:	sub	x0, x0, x3		// update the number of tags copied
> +	ret
> +SYM_FUNC_END(mte_copy_tags_to_user)
> 

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v4 11/26] arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  2020-05-28 16:34           ` Catalin Marinas
@ 2020-06-01  8:55             ` Dave Martin
  -1 siblings, 0 replies; 123+ messages in thread
From: Dave Martin @ 2020-06-01  8:55 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Szabolcs Nagy, linux-arch, nd, Peter Collingbourne,
	Kevin Brodsky, linux-mm, Evgenii Stepanov, Andrey Konovalov,
	Vincenzo Frascino, Will Deacon, Linux ARM

On Thu, May 28, 2020 at 05:34:13PM +0100, Catalin Marinas wrote:
> On Thu, May 28, 2020 at 12:05:09PM +0100, Szabolcs Nagy wrote:
> > The 05/28/2020 10:14, Catalin Marinas wrote:
> > > On Wed, May 27, 2020 at 11:57:39AM -0700, Peter Collingbourne wrote:

[...]

Just jumping in on this point:

> > > > Should the userspace stack always be mapped as if with PROT_MTE if the
> > > > hardware supports it? Such a change would be invisible to non-MTE
> > > > aware userspace since it would already need to opt in to tag checking
> > > > via prctl. This would let userspace avoid a complex stack
> > > > initialization sequence when running with stack tagging enabled on the
> > > > main thread.
> > > 
> > > I don't think the stack initialisation is that difficult. On program
> > > startup (this can be done by the dynamic loader), something like (untested):
> > > 
> > > 	register unsigned long stack asm ("sp");
> > > 	unsigned long page_sz = sysconf(_SC_PAGESIZE);
> > > 
> > > 	mprotect((void *)(stack & ~(page_sz - 1)), page_sz,
> > > 		 PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN);
> > > 
> > > (the essential part is PROT_GROWSDOWN so that you don't have to specify
> > > a stack lower limit)
> > 
> > does this work even if the currently mapped stack is more than page_sz?
> > determining the mapped main stack area is i think non-trivial to do in
> > userspace (requires parsing /proc/self/maps or similar).
> 
> Because of PROT_GROWSDOWN, the kernel adjusts the start of the range
> down automatically. It is potentially problematic if the top of the
> stack is more than a page away and you want the whole stack coloured. I
> haven't run a test but my reading of the kernel code is that the stack
> vma would be split in this scenario, so the range beyond sp+page_sz
> won't have PROT_MTE set.
> 
> My assumption is that if you do this during program start, the stack is
> smaller than a page. Alternatively, could we use argv or envp to
> determine the top of the user stack (the bottom is taken care of by the
> kernel)?

I don't think you can easily know when the stack ends, but perhaps it
doesn't matter.

From memory, the initial stack looks like:

	argv/env strings
	AT_NULL
	auxv
	NULL
	env
	NULL
	argv
	argc	<--- sp

If we don't care about tagging the strings correctly, we could step to
the end of auxv and tag down from there.
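
For the first option, a rough sketch (untested) of stepping over argv,
env and auxv, given argc/argv as seen at the entry point:

    #include <elf.h>    /* Elf64_auxv_t, AT_NULL */

    /* return the first address past auxv's AT_NULL terminator; the
       argv/env strings live above this point */
    static void *stack_past_auxv(int argc, char **argv)
    {
        char **p = argv + argc + 1;    /* skip argv[] and its NULL */
        Elf64_auxv_t *auxv;

        while (*p)                     /* skip env[] */
            p++;
        p++;                           /* skip env's NULL */

        for (auxv = (Elf64_auxv_t *)p; auxv->a_type != AT_NULL; auxv++)
            ;
        return auxv + 1;
    }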

If we do care about tagging the strings, there's probably no good way
to find the end of the string area, other than looking up sp in
/proc/self/maps.  I'm not sure we should trust all past and future
kernels to spit out the strings in a predictable order.

Assuming that the last env string has the highest address does not
sound like a good idea to me.  It would be easy for someone to break
that assumption later without realising.


If we're concerned about this, and reading /proc/self/auxv is deemed
unacceptable (likely: some binaries need to work before /proc is
mounted) then we could perhaps add a new auxv entry to report the stack
base address to the user startup code.


I don't think it matters if all this is "hard" for userspace: only the
C library / runtime should be doing this.  After libc startup, it's
generally too late to do this kind of thing safely.

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v4 18/26] arm64: mte: Add PTRACE_{PEEK,POKE}MTETAGS support
  2020-05-29 21:25     ` Luis Machado
@ 2020-06-01 12:07       ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-06-01 12:07 UTC (permalink / raw)
  To: Luis Machado
  Cc: linux-arm-kernel, linux-mm, linux-arch, Will Deacon,
	Dave P Martin, Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Alan Hayward,
	Omair Javaid

On Fri, May 29, 2020 at 06:25:14PM -0300, Luis Machado wrote:
> I have a question about siginfo MTE information. I suppose SEGV_MTESERR will
> be the most useful setting for debugging, right? Does si_addr contain the
> tagged pointer with the logical tag, a zero-tagged memory address or a
> tagged pointer with the allocation tag?

The si_addr is zero-tagged currently. We were planning to expose the tag
in FAR_EL1 as a separate siginfo field. See these discussions:

https://lore.kernel.org/linux-arm-kernel/20200513180914.50892-1-pcc@google.com/
https://lore.kernel.org/linux-arm-kernel/20200521022943.195898-1-pcc@google.com/

In theory, we could add the tag to si_addr for SEGV_MTESERR; it
shouldn't break the existing ABI (well, it depends on how you look at
it).

> From the debugger user's perspective, one would want to see both the logical
> tag and the allocation tag. And it would be handy to have both available in
> siginfo. Does that make sense?

The debugger can access the allocation tag via PTRACE_PEEKMTETAGS. I
don't think the kernel should provide this in siginfo. Also, the signal
handler can do an LDG and read the allocation tag directly, no need for
it to be in siginfo.
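
For instance, a sketch (untested, compile with -march=armv8.5-a+memtag)
of reading the allocation tag in the handler; LDG writes the granule's
allocation tag into bits 59:56 of the destination register, so the
handler can compare it with the logical tag of the faulting pointer:

    #include <stdint.h>

    /* return the allocation tag (0-15) of the granule containing addr */
    static inline unsigned int mte_alloc_tag(const void *addr)
    {
        uint64_t t = (uint64_t)addr;

        asm ("ldg %0, [%1]" : "+r" (t) : "r" (addr));
        return (t >> 56) & 0xf;
    }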

> Also, when would we see SEGV_MTEAERR, for example? That would provide no
> additional information about a particular memory address, which is not that
> useful for the debugger.

Yeah, we can't really do much here since the hardware doesn't provide us
such information. The async mode is only useful as a general test to see
if your program has MTE faults but for actual debugging you'd have to
switch to synchronous. For glibc at least, I think the mode can be
driven by an environment variable.

-- 
Catalin

^ permalink raw reply	[flat|nested] 123+ messages in thread

* Re: [PATCH v4 11/26] arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  2020-06-01  8:55             ` Dave Martin
@ 2020-06-01 14:45               ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-06-01 14:45 UTC (permalink / raw)
  To: Dave Martin
  Cc: Szabolcs Nagy, linux-arch, nd, Peter Collingbourne,
	Kevin Brodsky, linux-mm, Evgenii Stepanov, Andrey Konovalov,
	Vincenzo Frascino, Will Deacon, Linux ARM

On Mon, Jun 01, 2020 at 09:55:38AM +0100, Dave P Martin wrote:
> On Thu, May 28, 2020 at 05:34:13PM +0100, Catalin Marinas wrote:
> > On Thu, May 28, 2020 at 12:05:09PM +0100, Szabolcs Nagy wrote:
> > > The 05/28/2020 10:14, Catalin Marinas wrote:
> > > > On Wed, May 27, 2020 at 11:57:39AM -0700, Peter Collingbourne wrote:
> > > > > Should the userspace stack always be mapped as if with PROT_MTE if the
> > > > > hardware supports it? Such a change would be invisible to non-MTE
> > > > > aware userspace since it would already need to opt in to tag checking
> > > > > via prctl. This would let userspace avoid a complex stack
> > > > > initialization sequence when running with stack tagging enabled on the
> > > > > main thread.
> > > > 
> > > > I don't think the stack initialisation is that difficult. On program
> > > > startup (this can be done by the dynamic loader), something like (untested):
> > > > 
> > > > 	register unsigned long stack asm ("sp");
> > > > 	unsigned long page_sz = sysconf(_SC_PAGESIZE);
> > > > 
> > > > 	mprotect((void *)(stack & ~(page_sz - 1)), page_sz,
> > > > 		 PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN);
> > > > 
> > > > (the essential part is PROT_GROWSDOWN so that you don't have to specify
> > > > a stack lower limit)
> > > 
> > > does this work even if the currently mapped stack is more than page_sz?
> > > determining the mapped main stack area is i think non-trivial to do in
> > > userspace (requires parsing /proc/self/maps or similar).
> > 
> > Because of PROT_GROWSDOWN, the kernel adjusts the start of the range
> > down automatically. It is potentially problematic if the top of the
> > stack is more than a page away and you want the whole stack coloured. I
> > haven't run a test but my reading of the kernel code is that the stack
> > vma would be split in this scenario, so the range beyond sp+page_sz
> > won't have PROT_MTE set.
> > 
> > My assumption is that if you do this during program start, the stack is
> > smaller than a page. Alternatively, could we use argv or envp to
> > determine the top of the user stack (the bottom is taken care of by the
> > kernel)?
> 
> I don't think you can easily know when the stack ends, but perhaps it
> doesn't matter.
> 
> From memory, the initial stack looks like:
> 
> 	argv/env strings
> 	AT_NULL
> 	auxv
> 	NULL
> 	env
> 	NULL
> 	argv
> 	argc	<--- sp
> 
> If we don't care about tagging the strings correctly, we could step to
> the end of auxv and tag down from there.
> 
> If we do care about tagging the strings, there's probably no good way
> to find the end of the string area, other than looking up sp in
> /proc/self/maps.  I'm not sure we should trust all past and future
> kernels to spit out the strings in a predictable order.

I don't think we care about tagging whatever the kernel places on the
stack since the argv/envp pointers are untagged. An mprotect(PROT_MTE)
may or may not cover the environment but it shouldn't matter as the
kernel clears the tags on the corresponding pages anyway.

AFAIK stack tagging works by colouring a stack frame on function entry
and clearing the tags on return. We would only hit a problem if the
function issuing mprotect(sp, PROT_MTE) and its callers already
assumed a PROT_MTE stack. Without PROT_MTE, an STG would be
write-ignored, so subsequently turning PROT_MTE on would lead to a
mismatch between the pointer and the allocation tags.

So turning PROT_MTE on should happen very early in the user process
startup code, before any code with stack tagging enabled runs. Whether
you reach the top of the stack with such mprotect() doesn't really
matter since up to that point there should not be any use of stack
tagging. If that's not possible, for example if the glibc code setting
up the stack was itself compiled with stack tagging, the kernel would
have to enable it when the user process starts. However, I'd only do
this based on some ELF note.
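
For reference, a self-contained version of the snippet above (still
untested; PROT_MTE is the mmap()/mprotect() flag introduced by this
series, with its arm64 value assumed in the fallback define):

	#include <sys/mman.h>
	#include <unistd.h>

	#ifndef PROT_MTE
	#define PROT_MTE	0x20	/* arm64 value, assumed */
	#endif

	static int mte_tag_initial_stack(void)
	{
		register unsigned long stack asm ("sp");
		unsigned long page_sz = sysconf(_SC_PAGESIZE);

		/* PROT_GROWSDOWN extends the range down to the vma start */
		return mprotect((void *)(stack & ~(page_sz - 1)), page_sz,
				PROT_READ | PROT_WRITE | PROT_MTE |
				PROT_GROWSDOWN);
	}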

-- 
Catalin

* Re: [PATCH v4 11/26] arm64: mte: Add PROT_MTE support to mmap() and mprotect()
  2020-06-01 14:45               ` Catalin Marinas
@ 2020-06-01 15:04                 ` Dave Martin
  -1 siblings, 0 replies; 123+ messages in thread
From: Dave Martin @ 2020-06-01 15:04 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, Vincenzo Frascino, Will Deacon, Szabolcs Nagy,
	Andrey Konovalov, Kevin Brodsky, linux-mm, Linux ARM, nd,
	Peter Collingbourne, Evgenii Stepanov

On Mon, Jun 01, 2020 at 03:45:45PM +0100, Catalin Marinas wrote:
> On Mon, Jun 01, 2020 at 09:55:38AM +0100, Dave P Martin wrote:
> > On Thu, May 28, 2020 at 05:34:13PM +0100, Catalin Marinas wrote:
> > > On Thu, May 28, 2020 at 12:05:09PM +0100, Szabolcs Nagy wrote:
> > > > The 05/28/2020 10:14, Catalin Marinas wrote:
> > > > > On Wed, May 27, 2020 at 11:57:39AM -0700, Peter Collingbourne wrote:
> > > > > > Should the userspace stack always be mapped as if with PROT_MTE if the
> > > > > > hardware supports it? Such a change would be invisible to non-MTE
> > > > > > aware userspace since it would already need to opt in to tag checking
> > > > > > via prctl. This would let userspace avoid a complex stack
> > > > > > initialization sequence when running with stack tagging enabled on the
> > > > > > main thread.
> > > > > 
> > > > > I don't think the stack initialisation is that difficult. On program
> > > > > startup (can be the dynamic loader). Something like (untested):
> > > > > 
> > > > > 	register unsigned long stack asm ("sp");
> > > > > 	unsigned long page_sz = sysconf(_SC_PAGESIZE);
> > > > > 
> > > > > 	mprotect((void *)(stack & ~(page_sz - 1)), page_sz,
> > > > > 		 PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN);
> > > > > 
> > > > > (the essential part is PROT_GROWSDOWN so that you don't have to specify
> > > > > a stack lower limit)
> > > > 
> > > > does this work even if the currently mapped stack is more than page_sz?
> > > > determining the mapped main stack area is i think non-trivial to do in
> > > > userspace (requires parsing /proc/self/maps or similar).
> > > 
> > > Because of PROT_GROWSDOWN, the kernel adjusts the start of the range
> > > down automatically. It is potentially problematic if the top of the
> > > stack is more than a page away and you want the whole stack coloured. I
> > > haven't run a test but my reading of the kernel code is that the stack
> > > vma would be split in this scenario, so the range beyond sp+page_sz
> > > won't have PROT_MTE set.
> > > 
> > > My assumption is that if you do this during program start, the stack is
> > > smaller than a page. Alternatively, could we use argv or envp to
> > > determine the top of the user stack (the bottom is taken care of by the
> > > kernel)?
> > 
> > I don't think you can easily know when the stack ends, but perhaps it
> > doesn't matter.
> > 
> > From memory, the initial stack looks like:
> > 
> > 	argv/env strings
> > 	AT_NULL
> > 	auxv
> > 	NULL
> > 	env
> > 	NULL
> > 	argv
> > 	argc	<--- sp
> > 
> > If we don't care about tagging the strings correctly, we could step to
> > the end of auxv and tag down from there.
> > 
> > If we do care about tagging the strings, there's probably no good way
> > to find the end of the string area, other than looking up sp in
> > /proc/self/maps.  I'm not sure we should trust all past and future
> > kernels to spit out the strings in a predictable order.
> 
> I don't think we care about tagging whatever the kernel places on the
> stack since the argv/envp pointers are untagged. An mprotect(PROT_MTE)
> may or may not cover the environment but it shouldn't matter as the
> kernel clears the tags on the corresponding pages anyway.

We have no match-all tag, right?  So we do rely on the tags being
cleared for the initial stack contents so that using untagged pointers
to access it works.

> AFAIK stack tagging works by colouring a stack frame on function entry
> and clearing the tags on return. We would only hit a problem if the
> function issuing mprotect(sp, PROT_MTE) and its callers already
> assumed a PROT_MTE stack. Without PROT_MTE, an STG would be
> write-ignored, so subsequently turning PROT_MTE on would lead to a
> mismatch between the pointer and the allocation tags.
> 
> So turning PROT_MTE on should happen very early in the user process
> startup code, before any code with stack tagging enabled runs. Whether
> you reach the top of the stack with such mprotect() doesn't really
> matter since up to that point there should not be any use of stack
> tagging. If that's not possible, for example if the glibc code setting
> up the stack was itself compiled with stack tagging, the kernel would
> have to enable it when the user process starts. However, I'd only do
> this based on some ELF note.

Sounds fair.

This early on, the process shouldn't be exposed to arbitrary, untrusted
data.  So it's probably not a problem that tagging isn't turned on right
from the start.

Cheers
---Dave

* Re: [PATCH v4 18/26] arm64: mte: Add PTRACE_{PEEK,POKE}MTETAGS support
  2020-06-01 12:07       ` Catalin Marinas
@ 2020-06-01 15:17         ` Luis Machado
  -1 siblings, 0 replies; 123+ messages in thread
From: Luis Machado @ 2020-06-01 15:17 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, linux-mm, linux-arch, Will Deacon,
	Dave P Martin, Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Alan Hayward,
	Omair Javaid

On 6/1/20 9:07 AM, Catalin Marinas wrote:
> On Fri, May 29, 2020 at 06:25:14PM -0300, Luis Machado wrote:
>> I have a question about siginfo MTE information. I suppose SEGV_MTESERR will
>> be the most useful setting for debugging, right? Does si_addr contain the
>> tagged pointer with the logical tag, a zero-tagged memory address or a
>> tagged pointer with the allocation tag?
> 
> The si_addr is zero-tagged currently. We were planning to expose the tag
> in FAR_EL1 as a separate siginfo field. See these discussions:
> 
> https://lore.kernel.org/linux-arm-kernel/20200513180914.50892-1-pcc@google.com/
> https://lore.kernel.org/linux-arm-kernel/20200521022943.195898-1-pcc@google.com/
> 
> In theory, we could add the tag to si_addr for SEGV_MTESERR, it
> shouldn't break the existing ABI (well, it depends on how you look at
> it).
> 

Having additional fields in siginfo that hold useful information is 
probably best for debuggers. See my comment below about Intel MPX.

>>  From the debugger user's perspective, one would want to see both the logical
>> tag and the allocation tag. And it would be handy to have both available in
>> siginfo. Does that make sense?
> 
> The debugger can access the allocation tag via PTRACE_PEEKMTETAGS. I
> don't think the kernel should provide this in siginfo. Also, the signal
> handler can do an LDG and read the allocation tag directly, no need for
> it to be in siginfo.
> 

While the debugger can request this information from the kernel, the
debugger has already received a SIGSEGV signal and will have to fetch
siginfo for si_code. Having to do another PTRACE_PEEKMTETAGS call just
to fetch the allocation tag doesn't sound great. Remember this request
may travel over TCP to a gdbserver, which issues the ptrace() call at
the remote end. It would be best to avoid the extra round trip.
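
For reference, the extra round trip is roughly the following on the
debugger side (untested sketch of the interface added by this series;
one tag byte is returned per 16-byte granule, and pid/fault_addr stand
in for the tracee and the reported si_addr):

	#include <sys/ptrace.h>
	#include <sys/types.h>
	#include <sys/uio.h>

	#ifndef PTRACE_PEEKMTETAGS
	#define PTRACE_PEEKMTETAGS	33	/* arm64 request number */
	#endif

	static long peek_alloc_tag(pid_t pid, unsigned long fault_addr,
				   unsigned char *tag)
	{
		struct iovec iov = {
			.iov_base = tag,	/* receives the tag byte */
			.iov_len  = 1,		/* number of tags requested */
		};

		/* read the allocation tag of the faulting granule */
		return ptrace(PTRACE_PEEKMTETAGS, pid,
			      (void *)(fault_addr & ~0xfUL), &iov);
	}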

Also, there seems to be precedent for including more information in
siginfo. For example, Intel MPX includes upper/lower bounds violation
data in there.

Regarding using LDG, are you suggesting force-running this particular 
instruction in the traced process? If so, that isn't the way GDB (in 
particular, not sure about LLDB) does things.

>> Also, when would we see SEGV_MTEAERR, for example? That would provide no
>> additional information about a particular memory address, which is not that
>> useful for the debugger.
> 
> Yeah, we can't really do much here since the hardware doesn't provide us
> such information. The async mode is only useful as a general test to see
> if your program has MTE faults but for actual debugging you'd have to
> switch to synchronous. For glibc at least, I think the mode can be
> driven by an environment variable.
> 

I suspect SEGV_MTESERR would be a reasonable default then, for whoever 
is responsible for setting the default settings.

I'm assuming it is not the debugger, as it doesn't know how to toggle 
prctl settings.

* Re: [PATCH v4 18/26] arm64: mte: Add PTRACE_{PEEK,POKE}MTETAGS support
  2020-06-01 15:17         ` Luis Machado
@ 2020-06-01 16:33           ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2020-06-01 16:33 UTC (permalink / raw)
  To: Luis Machado
  Cc: linux-arm-kernel, linux-mm, linux-arch, Will Deacon,
	Dave P Martin, Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Andrey Konovalov, Peter Collingbourne, Alan Hayward,
	Omair Javaid

On Mon, Jun 01, 2020 at 12:17:27PM -0300, Luis Machado wrote:
> On 6/1/20 9:07 AM, Catalin Marinas wrote:
> > On Fri, May 29, 2020 at 06:25:14PM -0300, Luis Machado wrote:
> > > I have a question about siginfo MTE information. I suppose SEGV_MTESERR will
> > > be the most useful setting for debugging, right? Does si_addr contain the
> > > tagged pointer with the logical tag, a zero-tagged memory address or a
> > > tagged pointer with the allocation tag?
> > 
> > The si_addr is zero-tagged currently. We were planning to expose the tag
> > in FAR_EL1 as a separate siginfo field. See these discussions:
> >
> > https://lore.kernel.org/linux-arm-kernel/20200513180914.50892-1-pcc@google.com/
> > https://lore.kernel.org/linux-arm-kernel/20200521022943.195898-1-pcc@google.com/
> > 
> > In theory, we could add the tag to si_addr for SEGV_MTESERR, it
> > shouldn't break the existing ABI (well, it depends on how you look at
> > it).
> 
> Having additional fields in siginfo that hold useful information is probably
> best for debuggers. See my comment below about Intel MPX.
> 
> > >  From the debugger user's perspective, one would want to see both the logical
> > > tag and the allocation tag. And it would be handy to have both available in
> > > siginfo. Does that make sense?
> > 
> > The debugger can access the allocation tag via PTRACE_PEEKMTETAGS. I
> > don't think the kernel should provide this in siginfo. Also, the signal
> > handler can do an LDG and read the allocation tag directly, no need for
> > it to be in siginfo.
> 
> While the debugger can request this information from the kernel, the
> debugger has already received a SIGSEGV signal and will have to fetch
> siginfo for si_code. Having to do another PTRACE_PEEKMTETAGS call just
> to fetch the allocation tag doesn't sound great. Remember this request
> may travel over TCP to a gdbserver, which issues the ptrace() call at
> the remote end. It would be best to avoid the extra round trip.

But given that this is supposed to be a rare event, does another round
trip to read some memory matter much?

> Also, there seems to be past precedent to include more information in
> siginfo. For example, Intel MPX includes upper/lower bounds violation data
> in there.

There is a possible race here with getting the allocation tag. For
example, another thread of the same process could unmap the memory, so
reading the tag from the user memory is no longer possible. If we are
to add such information, we can't guarantee it's always possible (which
may differ from the MPX case).

In general, I don't like providing information that's already accessible
by other means (unless there's a performance issue but I don't think
that's the case here).

> Regarding using LDG, are you suggesting force-running this particular
> instruction in the traced process? If so, that isn't the way GDB (in
> particular, not sure about LLDB) does things.

No. What I meant is that if the signal handler itself needs the
information, it can execute an LDG. For gdb, the equivalent currently is
PTRACE_PEEKMTETAGS.
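
Something like this in the handler, for illustration (untested sketch;
SEGV_MTESERR and the zero-tagged si_addr are as defined by this series,
and the LDG is subject to the unmap race mentioned above):

	#include <signal.h>

	static void segv_handler(int sig, siginfo_t *si, void *ucontext)
	{
		if (si->si_code == SEGV_MTESERR) {
			unsigned long addr = (unsigned long)si->si_addr;
			unsigned long alloc_tag;

			/* LDG merges the allocation tag into bits 59:56 */
			asm("ldg %0, [%0]" : "+r" (addr));
			alloc_tag = (addr >> 56) & 0xf;
			/* ... report alloc_tag in a signal-safe way ... */
		}
	}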

> > > Also, when would we see SEGV_MTEAERR, for example? That would provide no
> > > additional information about a particular memory address, which is not that
> > > useful for the debugger.
> > 
> > Yeah, we can't really do much here since the hardware doesn't provide us
> > such information. The async mode is only useful as a general test to see
> > if your program has MTE faults but for actual debugging you'd have to
> > switch to synchronous. For glibc at least, I think the mode can be
> > driven by an environment variable.
> 
> I suspect SEGV_MTESERR would be a reasonable default then, for whoever is
> responsible for setting the default settings.
> 
> I'm assuming it is not the debugger, as it doesn't know how to toggle prctl
> settings.

The debugger could set the environment before starting the debugged
process. But yes, that would be the C library.

-- 
Catalin

* Re: [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
  2020-05-15 17:16   ` Catalin Marinas
@ 2021-01-21 19:37     ` Andrey Konovalov
  -1 siblings, 0 replies; 123+ messages in thread
From: Andrey Konovalov @ 2021-01-21 19:37 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linux ARM, Linux Memory Management List, linux-arch, Will Deacon,
	Dave P Martin, Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Peter Collingbourne

On Fri, May 15, 2020 at 7:17 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> For performance analysis it may be desirable to disable MTE altogether
> via an early param. Introduce arm64.mte_disable and, if true, filter out
> the sanitised ID_AA64PFR1_EL1.MTE field to avoid exposing the HWCAP to
> user.
>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> ---
>
> Notes:
>     New in v4.
>
>  Documentation/admin-guide/kernel-parameters.txt |  4 ++++
>  arch/arm64/kernel/cpufeature.c                  | 11 +++++++++++
>  2 files changed, 15 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index f2a93c8679e8..7436e7462b85 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -373,6 +373,10 @@
>         arcrimi=        [HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
>                         Format: <io>,<irq>,<nodeID>
>
> +       arm64.mte_disable=
> +                       [ARM64] Disable Linux support for the Memory
> +                       Tagging Extension (both user and in-kernel).
> +
>         ataflop=        [HW,M68k]
>
>         atarimouse=     [HW,MOUSE] Atari Mouse
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index aaadc1cbc006..f7596830694f 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -126,12 +126,23 @@ static void cpu_enable_cnp(struct arm64_cpu_capabilities const *cap);
>  static bool __system_matches_cap(unsigned int n);
>
>  #ifdef CONFIG_ARM64_MTE
> +static bool mte_disable;
> +
> +static int __init arm64_mte_disable(char *buf)
> +{
> +       return strtobool(buf, &mte_disable);
> +}
> +early_param("arm64.mte_disable", arm64_mte_disable);
> +
>  s64 mte_ftr_filter(const struct arm64_ftr_bits *ftrp, s64 val)
>  {
>         struct device_node *np;
>         static bool memory_checked = false;
>         static bool mte_capable = true;
>
> +       if (mte_disable)
> +               return ID_AA64PFR1_MTE_NI;
> +
>         /* EL0-only MTE is not supported by Linux, don't expose it */
>         if (val < ID_AA64PFR1_MTE)
>                 return ID_AA64PFR1_MTE_NI;

Hi Catalin,

While this patch didn't land upstream, we need an MTE kill-switch for
Android GKI. Is this patch OK to take as is? Is it still valid?

Thanks!

* Re: [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
  2021-01-21 19:37     ` Andrey Konovalov
@ 2021-01-22  2:03       ` Andrey Konovalov
  -1 siblings, 0 replies; 123+ messages in thread
From: Andrey Konovalov @ 2021-01-22  2:03 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Linux ARM, Linux Memory Management List, linux-arch, Will Deacon,
	Dave P Martin, Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Peter Collingbourne

On Thu, Jan 21, 2021 at 8:37 PM Andrey Konovalov <andreyknvl@google.com> wrote:
>
> On Fri, May 15, 2020 at 7:17 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> >
> > For performance analysis it may be desirable to disable MTE altogether
> > via an early param. Introduce arm64.mte_disable and, if true, filter out
> > the sanitised ID_AA64PFR1_EL1.MTE field to avoid exposing the HWCAP to
> > user.
> >
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> >
> > Notes:
> >     New in v4.
> >
> >  Documentation/admin-guide/kernel-parameters.txt |  4 ++++
> >  arch/arm64/kernel/cpufeature.c                  | 11 +++++++++++
> >  2 files changed, 15 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index f2a93c8679e8..7436e7462b85 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -373,6 +373,10 @@
> >         arcrimi=        [HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
> >                         Format: <io>,<irq>,<nodeID>
> >
> > +       arm64.mte_disable=
> > +                       [ARM64] Disable Linux support for the Memory
> > +                       Tagging Extension (both user and in-kernel).
> > +
> >         ataflop=        [HW,M68k]
> >
> >         atarimouse=     [HW,MOUSE] Atari Mouse
> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index aaadc1cbc006..f7596830694f 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -126,12 +126,23 @@ static void cpu_enable_cnp(struct arm64_cpu_capabilities const *cap);
> >  static bool __system_matches_cap(unsigned int n);
> >
> >  #ifdef CONFIG_ARM64_MTE
> > +static bool mte_disable;
> > +
> > +static int __init arm64_mte_disable(char *buf)
> > +{
> > +       return strtobool(buf, &mte_disable);
> > +}
> > +early_param("arm64.mte_disable", arm64_mte_disable);
> > +
> >  s64 mte_ftr_filter(const struct arm64_ftr_bits *ftrp, s64 val)
> >  {
> >         struct device_node *np;
> >         static bool memory_checked = false;
> >         static bool mte_capable = true;
> >
> > +       if (mte_disable)
> > +               return ID_AA64PFR1_MTE_NI;
> > +
> >         /* EL0-only MTE is not supported by Linux, don't expose it */
> >         if (val < ID_AA64PFR1_MTE)
> >                 return ID_AA64PFR1_MTE_NI;
>
> Hi Catalin,
>
> While this patch didn't land upstream, we need an MTE kill-switch for
> Android GKI. Is this patch OK to take as is? Is it still valid?

Looking at this more closely, it looks like this code no longer exists.
What would be the approach to adding this kind of switch now?

Thanks!

* Re: [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
  2021-01-21 19:37     ` Andrey Konovalov
@ 2021-01-22 14:41       ` Catalin Marinas
  -1 siblings, 0 replies; 123+ messages in thread
From: Catalin Marinas @ 2021-01-22 14:41 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: Linux ARM, Linux Memory Management List, linux-arch, Will Deacon,
	Dave P Martin, Vincenzo Frascino, Szabolcs Nagy, Kevin Brodsky,
	Peter Collingbourne

On Thu, Jan 21, 2021 at 08:37:18PM +0100, Andrey Konovalov wrote:
> On Fri, May 15, 2020 at 7:17 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> > For performance analysis it may be desirable to disable MTE altogether
> > via an early param. Introduce arm64.mte_disable and, if true, filter out
> > the sanitised ID_AA64PFR1_EL1.MTE field to avoid exposing the HWCAP to
> > user.
> >
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> >
> > Notes:
> >     New in v4.
> >
> >  Documentation/admin-guide/kernel-parameters.txt |  4 ++++
> >  arch/arm64/kernel/cpufeature.c                  | 11 +++++++++++
> >  2 files changed, 15 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index f2a93c8679e8..7436e7462b85 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -373,6 +373,10 @@
> >         arcrimi=        [HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
> >                         Format: <io>,<irq>,<nodeID>
> >
> > +       arm64.mte_disable=
> > +                       [ARM64] Disable Linux support for the Memory
> > +                       Tagging Extension (both user and in-kernel).
> > +
> >         ataflop=        [HW,M68k]
> >
> >         atarimouse=     [HW,MOUSE] Atari Mouse
> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index aaadc1cbc006..f7596830694f 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -126,12 +126,23 @@ static void cpu_enable_cnp(struct arm64_cpu_capabilities const *cap);
> >  static bool __system_matches_cap(unsigned int n);
> >
> >  #ifdef CONFIG_ARM64_MTE
> > +static bool mte_disable;
> > +
> > +static int __init arm64_mte_disable(char *buf)
> > +{
> > +       return strtobool(buf, &mte_disable);
> > +}
> > +early_param("arm64.mte_disable", arm64_mte_disable);
> > +
> >  s64 mte_ftr_filter(const struct arm64_ftr_bits *ftrp, s64 val)
> >  {
> >         struct device_node *np;
> >         static bool memory_checked = false;
> >         static bool mte_capable = true;
> >
> > +       if (mte_disable)
> > +               return ID_AA64PFR1_MTE_NI;
> > +
> >         /* EL0-only MTE is not supported by Linux, don't expose it */
> >         if (val < ID_AA64PFR1_MTE)
> >                 return ID_AA64PFR1_MTE_NI;
> 
> While this patch didn't land upstream, we need an MTE kill-switch for
> Android GKI. Is this patch OK to take as is? Is it still valid?

As you noticed, this code no longer exists. The CPUID is checked early
during boot in proc.S, before the MMU is enabled, as you need to set up
the MAIR register.

Now, what do you mean by kill switch? There are multiple levels at which
one can disable MTE or some of its effects: memory type (MAIR) level,
tag allocation (TCR_EL1.ATA), tag checking (SCTLR_EL1.TCF). Apart from
the latter, all the other bits are cached in the TLB, which makes them
more problematic to toggle at run-time.

For the kernel, we can currently disable tag checking via the kasan
command line options. For user-space, we don't have a kill switch
specific to MTE; however, one can disable the tagged addr ABI and
presumably the C library will then avoid generating tagged heap
pointers.
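
For illustration (option names as documented upstream at the time of
writing): the in-kernel tag checks can be disabled at boot with
kasan=off, and writing 1 to /proc/sys/abi/tagged_addr_disabled forbids
enabling the tagged addr ABI system-wide. A process can also opt out
itself (untested sketch):

	#include <sys/prctl.h>

	static void mte_opt_out(void)
	{
		/* clear PR_TAGGED_ADDR_ENABLE (and the PR_MTE_TCF_* bits)
		 * so this task no longer accepts tagged pointers */
		prctl(PR_SET_TAGGED_ADDR_CTRL, 0, 0, 0, 0);
	}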

-- 
Catalin

* Re: [PATCH v4 24/26] arm64: mte: Introduce early param to disable MTE support
  2021-01-22 14:41       ` Catalin Marinas
@ 2021-01-22 17:28         ` Andrey Konovalov
  0 siblings, 0 replies; 123+ messages in thread
From: Andrey Konovalov @ 2021-01-22 17:28 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: Linux ARM, Linux Memory Management List, linux-arch, Will Deacon,
	Dave P Martin, Vincenzo Frascino, Szabolcs Nagy, Catalin Marinas,
	Kevin Brodsky, Peter Collingbourne

On Fri, Jan 22, 2021 at 3:41 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> > While this patch didn't land upstream, we need an MTE kill-switch for
> > Android GKI. Is this patch OK to take as is? Is it still valid?
>
> As you noticed, this code no longer exists. The CPUID is now checked
> early during boot in proc.S, before the MMU is enabled, since the MAIR
> register has to be programmed at that stage.
>
> Now, what do you mean by kill switch? There are multiple levels at which
> one can disable MTE or some of its effects: memory type (MAIR) level,
> tag allocation (TCR_EL1.ATA) and tag checking (SCTLR_EL1.TCF). Apart from
> the latter, all the other bits are cached in the TLB, which makes them
> more problematic to toggle at run-time.
>
> For the kernel, we can currently disable tag checking via the kasan
> command line options. For user-space, we don't have a kill switch
> specific to MTE; however, one can disable the tagged address ABI, and
> the C library will then presumably avoid generating tagged heap
> pointers.

Just FTR: as discussed off-list, there won't be any need for a
kill switch for userspace MTE.

Thanks!

^ permalink raw reply	[flat|nested] 123+ messages in thread

end of thread, other threads:[~2021-01-22 17:51 UTC | newest]

Thread overview: 123+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-15 17:15 [PATCH v4 00/26] arm64: Memory Tagging Extension user-space support Catalin Marinas
2020-05-15 17:15 ` Catalin Marinas
2020-05-15 17:15 ` [PATCH v4 01/26] arm64: mte: system register definitions Catalin Marinas
2020-05-15 17:15   ` Catalin Marinas
2020-05-15 17:15 ` [PATCH v4 02/26] arm64: mte: CPU feature detection and initial sysreg configuration Catalin Marinas
2020-05-15 17:15   ` Catalin Marinas
2020-05-15 17:15 ` [PATCH v4 03/26] arm64: mte: Use Normal Tagged attributes for the linear map Catalin Marinas
2020-05-15 17:15   ` Catalin Marinas
2020-05-15 17:15 ` [PATCH v4 04/26] arm64: mte: Add specific SIGSEGV codes Catalin Marinas
2020-05-15 17:15   ` Catalin Marinas
2020-05-15 17:15 ` [PATCH v4 05/26] arm64: mte: Handle synchronous and asynchronous tag check faults Catalin Marinas
2020-05-15 17:15   ` Catalin Marinas
2020-05-15 17:15 ` [PATCH v4 06/26] mm: Add PG_ARCH_2 page flag Catalin Marinas
2020-05-15 17:15   ` Catalin Marinas
2020-05-15 17:15 ` [PATCH v4 07/26] arm64: mte: Clear the tags when a page is mapped in user-space with PROT_MTE Catalin Marinas
2020-05-15 17:15   ` Catalin Marinas
2020-05-15 17:15 ` [PATCH v4 08/26] arm64: mte: Tags-aware copy_page() implementation Catalin Marinas
2020-05-15 17:15   ` Catalin Marinas
2020-05-15 17:15 ` [PATCH v4 09/26] arm64: mte: Tags-aware aware memcmp_pages() implementation Catalin Marinas
2020-05-15 17:15   ` Catalin Marinas
2020-05-15 17:15 ` [PATCH v4 10/26] mm: Introduce arch_calc_vm_flag_bits() Catalin Marinas
2020-05-15 17:15   ` Catalin Marinas
2020-05-15 17:15 ` [PATCH v4 11/26] arm64: mte: Add PROT_MTE support to mmap() and mprotect() Catalin Marinas
2020-05-15 17:15   ` Catalin Marinas
2020-05-27 18:57   ` Peter Collingbourne
2020-05-27 18:57     ` Peter Collingbourne
2020-05-27 18:57     ` Peter Collingbourne
2020-05-27 18:57     ` Peter Collingbourne
2020-05-28  9:14     ` Catalin Marinas
2020-05-28  9:14       ` Catalin Marinas
2020-05-28 11:05       ` Szabolcs Nagy
2020-05-28 11:05         ` Szabolcs Nagy
2020-05-28 11:05         ` Szabolcs Nagy
2020-05-28 16:34         ` Catalin Marinas
2020-05-28 16:34           ` Catalin Marinas
2020-05-28 18:35           ` Evgenii Stepanov
2020-05-28 18:35             ` Evgenii Stepanov
2020-05-29 11:19             ` Catalin Marinas
2020-05-29 11:19               ` Catalin Marinas
2020-06-01  8:55           ` Dave Martin
2020-06-01  8:55             ` Dave Martin
2020-06-01 14:45             ` Catalin Marinas
2020-06-01 14:45               ` Catalin Marinas
2020-06-01 15:04               ` Dave Martin
2020-06-01 15:04                 ` Dave Martin
2020-05-15 17:15 ` [PATCH v4 12/26] mm: Introduce arch_validate_flags() Catalin Marinas
2020-05-15 17:15   ` Catalin Marinas
2020-05-15 17:15 ` [PATCH v4 13/26] arm64: mte: Validate the PROT_MTE request via arch_validate_flags() Catalin Marinas
2020-05-15 17:15   ` Catalin Marinas
2020-05-15 17:16 ` [PATCH v4 14/26] mm: Allow arm64 mmap(PROT_MTE) on RAM-based files Catalin Marinas
2020-05-15 17:16   ` Catalin Marinas
2020-05-15 17:16 ` [PATCH v4 15/26] arm64: mte: Allow user control of the tag check mode via prctl() Catalin Marinas
2020-05-15 17:16   ` Catalin Marinas
2020-05-27  7:46   ` Will Deacon
2020-05-27  7:46     ` Will Deacon
2020-05-27  8:32     ` Dave Martin
2020-05-27  8:32       ` Dave Martin
2020-05-27  8:48       ` Will Deacon
2020-05-27  8:48         ` Will Deacon
2020-05-27 11:16       ` Catalin Marinas
2020-05-27 11:16         ` Catalin Marinas
2020-05-15 17:16 ` [PATCH v4 16/26] arm64: mte: Allow user control of the generated random tags " Catalin Marinas
2020-05-15 17:16   ` Catalin Marinas
2020-05-15 17:16 ` [PATCH v4 17/26] arm64: mte: Restore the GCR_EL1 register after a suspend Catalin Marinas
2020-05-15 17:16   ` Catalin Marinas
2020-05-15 17:16 ` [PATCH v4 18/26] arm64: mte: Add PTRACE_{PEEK,POKE}MTETAGS support Catalin Marinas
2020-05-15 17:16   ` Catalin Marinas
2020-05-29 21:25   ` Luis Machado
2020-05-29 21:25     ` Luis Machado
2020-06-01 12:07     ` Catalin Marinas
2020-06-01 12:07       ` Catalin Marinas
2020-06-01 15:17       ` Luis Machado
2020-06-01 15:17         ` Luis Machado
2020-06-01 16:33         ` Catalin Marinas
2020-06-01 16:33           ` Catalin Marinas
2020-05-15 17:16 ` [PATCH v4 19/26] fs: Handle intra-page faults in copy_mount_options() Catalin Marinas
2020-05-15 17:16   ` Catalin Marinas
2020-05-15 17:16 ` [PATCH v4 20/26] mm: Add arch hooks for saving/restoring tags Catalin Marinas
2020-05-15 17:16   ` Catalin Marinas
2020-05-15 17:16 ` [PATCH v4 21/26] arm64: mte: Enable swap of tagged pages Catalin Marinas
2020-05-15 17:16   ` Catalin Marinas
2020-05-15 17:16 ` [PATCH v4 22/26] arm64: mte: Save tags when hibernating Catalin Marinas
2020-05-15 17:16   ` Catalin Marinas
2020-05-15 17:16 ` [PATCH v4 23/26] arm64: mte: Check the DT memory nodes for MTE support Catalin Marinas
2020-05-15 17:16   ` Catalin Marinas
2020-05-15 17:16 ` [PATCH v4 24/26] arm64: mte: Introduce early param to disable " Catalin Marinas
2020-05-15 17:16   ` Catalin Marinas
2020-05-18 11:26   ` Vladimir Murzin
2020-05-18 11:26     ` Vladimir Murzin
2020-05-18 11:31     ` Will Deacon
2020-05-18 11:31       ` Will Deacon
2020-05-18 17:20       ` Catalin Marinas
2020-05-18 17:20         ` Catalin Marinas
2020-05-22  5:57         ` Patrick Daly
2020-05-22  5:57           ` Patrick Daly
2020-05-22 10:37           ` Catalin Marinas
2020-05-22 10:37             ` Catalin Marinas
2020-05-27  2:11             ` Patrick Daly
2020-05-27  2:11               ` Patrick Daly
2020-05-27  2:11               ` Patrick Daly
2020-05-27  9:55               ` Will Deacon
2020-05-27  9:55                 ` Will Deacon
2020-05-27 10:37                 ` Szabolcs Nagy
2020-05-27 10:37                   ` Szabolcs Nagy
2020-05-27 11:12                 ` Catalin Marinas
2020-05-27 11:12                   ` Catalin Marinas
2020-05-19 16:14     ` Catalin Marinas
2020-05-19 16:14       ` Catalin Marinas
2021-01-21 19:37   ` Andrey Konovalov
2021-01-21 19:37     ` Andrey Konovalov
2021-01-21 19:37     ` Andrey Konovalov
2021-01-22  2:03     ` Andrey Konovalov
2021-01-22  2:03       ` Andrey Konovalov
2021-01-22  2:03       ` Andrey Konovalov
2021-01-22 14:41     ` Catalin Marinas
2021-01-22 14:41       ` Catalin Marinas
2021-01-22 17:28       ` Andrey Konovalov
2021-01-22 17:28         ` Andrey Konovalov
2021-01-22 17:28         ` Andrey Konovalov
2020-05-15 17:16 ` [PATCH v4 25/26] arm64: mte: Kconfig entry Catalin Marinas
2020-05-15 17:16   ` Catalin Marinas
2020-05-15 17:16 ` [PATCH v4 26/26] arm64: mte: Add Memory Tagging Extension documentation Catalin Marinas
2020-05-15 17:16   ` Catalin Marinas
