Linux-arch Archive on lore.kernel.org
* [RFC PATCH 0/3] Add support for Asymmetric AArch32 systems
@ 2020-10-08 18:16 Qais Yousef
  2020-10-08 18:16 ` [RFC PATCH 1/3] arm64: kvm: Handle " Qais Yousef
                   ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Qais Yousef @ 2020-10-08 18:16 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Marc Zyngier, Peter Zijlstra (Intel)
  Cc: Morten Rasmussen, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel, linux-arch, Qais Yousef

This RFC series enables AArch32 EL0 support on systems where only a subset of
CPUs implement it. AArch32 feature asymmetry comes with downsides, but it is
likely that some vendors are willing to accept those to maintain AArch32 EL0
support on systems where some cores are AArch64 only.

Enabling AArch32 when it isn't supported on all CPUs inevitably requires
careful affinity management of AArch32 tasks. The bare minimum kernel
support is offered by the second patch, which puts the burden of managing
task affinity entirely on user space. AArch32 tasks receive SIGKILL if
they attempt to run on a non-AArch32 CPU. The third patch is optional and
overrides task affinity in some cases to prevent AArch32 tasks from being
SIGKILLed.

We don't expose the asymmetry to user space. If we want to delegate
affinity management to user space, we need to introduce a way for it to
discover the asymmetry.
/sys/devices/system/cpu/cpu*/regs/identification/midr_el1 already contains
the per-CPU ID. This could be extended to expose the other ID_* registers
from which the AArch32 feature can be detected.
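For illustration, the existing per-CPU interface can be probed like this
(a sketch only; on systems that don't expose the regs hierarchy the glob
matches nothing and the loop body never runs):

```shell
# Hypothetical probe: list each CPU's MIDR_EL1 where sysfs exposes it.
for f in /sys/devices/system/cpu/cpu*/regs/identification/midr_el1; do
    if [ -r "$f" ]; then
        printf '%s: %s\n' "$f" "$(cat "$f")"
    fi
done
```

Extending this directory with the ID_* registers would let a launcher
distinguish AArch32-capable CPUs without any new ABI.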

If the user hotplugs out all 32-bit capable CPUs, then all running 32-bit
tasks will be SIGKILLed the next time they are scheduled.

Patch 1 ensures KVM handles such systems properly, especially when a
misbehaving guest tries to force-run AArch32 regardless of what the ID
registers advertise.

Patch 2 introduces basic asymmetric AArch32 support. It will SIGKILL any
task as soon as it is scheduled on a non-capable CPU, even if its affinity
would allow it to migrate to a capable one.

Patch 3 suggests how the affinity problem could be handled in the kernel.
It's not a generic solution, rather a demonstration of what could
potentially be done.

Qais Yousef (3):
  arm64: kvm: Handle Asymmetric AArch32 systems
  arm64: Add support for asymmetric AArch32 EL0 configurations
  arm64: Handle AArch32 tasks running on non AArch32 cpu

 arch/arm64/Kconfig                   | 14 +++++
 arch/arm64/include/asm/cpu.h         |  2 +
 arch/arm64/include/asm/cpucaps.h     |  3 +-
 arch/arm64/include/asm/cpufeature.h  | 22 +++++++-
 arch/arm64/include/asm/thread_info.h |  5 +-
 arch/arm64/kernel/cpufeature.c       | 77 ++++++++++++++++++----------
 arch/arm64/kernel/cpuinfo.c          | 71 +++++++++++++++----------
 arch/arm64/kernel/process.c          | 17 ++++++
 arch/arm64/kernel/signal.c           | 33 ++++++++++++
 arch/arm64/kvm/arm.c                 | 17 ++++++
 arch/arm64/kvm/guest.c               |  2 +-
 arch/arm64/kvm/sys_regs.c            | 14 ++++-
 12 files changed, 218 insertions(+), 59 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [RFC PATCH 1/3] arm64: kvm: Handle Asymmetric AArch32 systems
  2020-10-08 18:16 [RFC PATCH 0/3] Add support for Asymmetric AArch32 systems Qais Yousef
@ 2020-10-08 18:16 ` Qais Yousef
  2020-10-09  8:12   ` Marc Zyngier
  2020-10-08 18:16 ` [RFC PATCH 2/3] arm64: Add support for asymmetric AArch32 EL0 configurations Qais Yousef
  2020-10-08 18:16 ` [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu Qais Yousef
  2 siblings, 1 reply; 33+ messages in thread
From: Qais Yousef @ 2020-10-08 18:16 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Marc Zyngier, Peter Zijlstra (Intel)
  Cc: Morten Rasmussen, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel, linux-arch, Qais Yousef

On a system without uniform support for AArch32 at EL0, it is possible
for the guest to force run AArch32 at EL0 and potentially cause an
illegal exception if running on the wrong core.

Add an extra check to catch if the guest ever does that and prevent it
from running again, treating it as ARM_EXCEPTION_IL.

We try to catch this misbehavior as early as possible, rather than relying
on an Illegal Execution state (PSTATE.IL) exception being taken.

Tested on Juno by instrumenting the host to:

	* Fake asym aarch32.
	* Comment out hiding of ID registers from the guest.

Any attempt to run a 32-bit app in the guest then produces the following
error under qemu:

	# ./test
	error: kvm run failed Invalid argument
	R00=ffff0fff R01=ffffffff R02=00000000 R03=00087968
	R04=000874b8 R05=ffd70b24 R06=ffd70b2c R07=00000055
	R08=00000000 R09=00000000 R10=00000000 R11=00000000
	R12=0000001c R13=ffd6f974 R14=0001ff64 R15=ffff0fe0
	PSR=a0000010 N-C- A usr32

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
---
 arch/arm64/kvm/arm.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index b588c3b5c2f0..22ff3373d855 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -644,6 +644,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 	struct kvm_run *run = vcpu->run;
 	int ret;
 
+	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
+		kvm_err("Illegal AArch32 mode at EL0, can't run.");
+		return -ENOEXEC;
+	}
+
 	if (unlikely(!kvm_vcpu_initialized(vcpu)))
 		return -ENOEXEC;
 
@@ -804,6 +809,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 
 		preempt_enable();
 
+		/*
+		 * For asym aarch32 systems we present a 64bit only system to
+		 * the guest. But in case it managed somehow to escape that and
+		 * enter 32bit mode, catch that and prevent it from running
+		 * again.
+		 */
+		if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
+			kvm_err("Detected illegal AArch32 mode at EL0, exiting.");
+			ret = ARM_EXCEPTION_IL;
+		}
+
 		ret = handle_exit(vcpu, ret);
 	}
 
-- 
2.17.1



* [RFC PATCH 2/3] arm64: Add support for asymmetric AArch32 EL0 configurations
  2020-10-08 18:16 [RFC PATCH 0/3] Add support for Asymmetric AArch32 systems Qais Yousef
  2020-10-08 18:16 ` [RFC PATCH 1/3] arm64: kvm: Handle " Qais Yousef
@ 2020-10-08 18:16 ` Qais Yousef
  2020-10-08 18:22   ` Randy Dunlap
                     ` (2 more replies)
  2020-10-08 18:16 ` [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu Qais Yousef
  2 siblings, 3 replies; 33+ messages in thread
From: Qais Yousef @ 2020-10-08 18:16 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Marc Zyngier, Peter Zijlstra (Intel)
  Cc: Morten Rasmussen, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel, linux-arch, Qais Yousef

When the CONFIG_ASYMMETRIC_AARCH32 option is enabled (EXPERT), the type
of the ARM64_HAS_32BIT_EL0 capability becomes WEAK_LOCAL_CPU_FEATURE.
The kernel will now return true from system_supports_32bit_el0() and
check in do_notify_resume() that 32-bit tasks are affined to AArch32
capable CPUs only. If the affinity contains a non-AArch32-capable CPU,
the task gets SIGKILLed. If the last CPU supporting 32-bit is offlined,
the kernel will SIGKILL any scheduled 32-bit tasks (the alternative is
to prevent offlining through a new .cpu_disable feature entry).

In addition to the relaxation of the ARM64_HAS_32BIT_EL0 capability,
this patch factors out the 32-bit cpuinfo and features setting into
separate functions: __cpuinfo_store_cpu_32bit(),
init_cpu_32bit_features(). The cpuinfo of the booting CPU
(boot_cpu_data) is now updated on the first 32-bit capable CPU even if
it is a secondary one. The ID_AA64PFR0_EL0_64BIT_ONLY feature is relaxed
to FTR_NONSTRICT and FTR_HIGHER_SAFE when the asymmetric AArch32 support
is enabled. The compat_elf_hwcaps are only verified for the
AArch32-capable CPUs to still allow hotplugging AArch64-only CPUs.

Make sure that KVM never sees the asymmetric 32-bit system. A guest can
still ignore the ID registers and force-run 32-bit at EL0.

Co-developed-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
---
 arch/arm64/Kconfig                   | 14 ++++++
 arch/arm64/include/asm/cpu.h         |  2 +
 arch/arm64/include/asm/cpucaps.h     |  3 +-
 arch/arm64/include/asm/cpufeature.h  | 20 +++++++-
 arch/arm64/include/asm/thread_info.h |  5 +-
 arch/arm64/kernel/cpufeature.c       | 66 +++++++++++++++-----------
 arch/arm64/kernel/cpuinfo.c          | 71 ++++++++++++++++++----------
 arch/arm64/kernel/process.c          | 17 +++++++
 arch/arm64/kernel/signal.c           | 18 +++++++
 arch/arm64/kvm/arm.c                 |  5 +-
 arch/arm64/kvm/guest.c               |  2 +-
 arch/arm64/kvm/sys_regs.c            | 14 +++++-
 12 files changed, 176 insertions(+), 61 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6d232837cbee..591853504dc4 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1868,6 +1868,20 @@ config DMI
 
 endmenu
 
+config ASYMMETRIC_AARCH32
+	bool "Allow support for asymmetric AArch32 support"
+	depends on COMPAT && EXPERT
+	help
+	  Enable this option to allow support for asymmetric AArch32 EL0
+	  CPU configurations. Once the AArch32 EL0 support is detected
+	  on a CPU, the feature is made available to user space to allow
+	  the execution of 32-bit (compat) applications. If the affinity
+	  of the 32-bit application contains a non-AArch32 capable CPU
+	  or the last AArch32 capable CPU is offlined, the application
+	  will be killed.
+
+	  If unsure say N.
+
 config SYSVIPC_COMPAT
 	def_bool y
 	depends on COMPAT && SYSVIPC
diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index 7faae6ff3ab4..c920fa45e502 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -15,6 +15,7 @@
 struct cpuinfo_arm64 {
 	struct cpu	cpu;
 	struct kobject	kobj;
+	bool		aarch32_valid;
 	u32		reg_ctr;
 	u32		reg_cntfrq;
 	u32		reg_dczid;
@@ -65,6 +66,7 @@ void cpuinfo_store_cpu(void);
 void __init cpuinfo_store_boot_cpu(void);
 
 void __init init_cpu_features(struct cpuinfo_arm64 *info);
+void init_cpu_32bit_features(struct cpuinfo_arm64 *info);
 void update_cpu_features(int cpu, struct cpuinfo_arm64 *info,
 				 struct cpuinfo_arm64 *boot);
 
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 07b643a70710..d3b6a5dce456 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -64,7 +64,8 @@
 #define ARM64_BTI				54
 #define ARM64_HAS_ARMv8_4_TTL			55
 #define ARM64_HAS_TLB_RANGE			56
+#define ARM64_HAS_ASYM_32BIT_EL0		57
 
-#define ARM64_NCAPS				57
+#define ARM64_NCAPS				58
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 89b4f0142c28..fa2413715041 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -17,6 +17,7 @@
 #ifndef __ASSEMBLY__
 
 #include <linux/bug.h>
+#include <linux/cpumask.h>
 #include <linux/jump_label.h>
 #include <linux/kernel.h>
 
@@ -582,9 +583,26 @@ static inline bool cpu_supports_mixed_endian_el0(void)
 	return id_aa64mmfr0_mixed_endian_el0(read_cpuid(ID_AA64MMFR0_EL1));
 }
 
-static inline bool system_supports_32bit_el0(void)
+static inline bool system_supports_sym_32bit_el0(void)
 {
 	return cpus_have_const_cap(ARM64_HAS_32BIT_EL0);
+
+}
+
+static inline bool system_supports_asym_32bit_el0(void)
+{
+#ifdef CONFIG_ASYMMETRIC_AARCH32
+	return !cpus_have_const_cap(ARM64_HAS_32BIT_EL0) &&
+	       cpus_have_const_cap(ARM64_HAS_ASYM_32BIT_EL0);
+#else
+	return false;
+#endif
+}
+
+static inline bool system_supports_32bit_el0(void)
+{
+	return system_supports_sym_32bit_el0() ||
+	       system_supports_asym_32bit_el0();
 }
 
 static inline bool system_supports_4kb_granule(void)
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 5e784e16ee89..312974ab2c85 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -67,6 +67,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define TIF_FOREIGN_FPSTATE	3	/* CPU's FP state is not current's */
 #define TIF_UPROBE		4	/* uprobe breakpoint or singlestep */
 #define TIF_FSCHECK		5	/* Check FS is USER_DS on return */
+#define TIF_CHECK_32BIT_AFFINITY 6	/* Check thread affinity for asymmetric AArch32 */
 #define TIF_SYSCALL_TRACE	8	/* syscall trace active */
 #define TIF_SYSCALL_AUDIT	9	/* syscall auditing */
 #define TIF_SYSCALL_TRACEPOINT	10	/* syscall tracepoint for ftrace */
@@ -95,11 +96,13 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define _TIF_FSCHECK		(1 << TIF_FSCHECK)
 #define _TIF_SINGLESTEP		(1 << TIF_SINGLESTEP)
 #define _TIF_32BIT		(1 << TIF_32BIT)
+#define _TIF_CHECK_32BIT_AFFINITY (1 << TIF_CHECK_32BIT_AFFINITY)
 #define _TIF_SVE		(1 << TIF_SVE)
 
 #define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
 				 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
-				 _TIF_UPROBE | _TIF_FSCHECK)
+				 _TIF_UPROBE | _TIF_FSCHECK | \
+				 _TIF_CHECK_32BIT_AFFINITY)
 
 #define _TIF_SYSCALL_WORK	(_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
 				 _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 6424584be01e..d46732113305 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -63,7 +63,6 @@
 #define pr_fmt(fmt) "CPU features: " fmt
 
 #include <linux/bsearch.h>
-#include <linux/cpumask.h>
 #include <linux/crash_dump.h>
 #include <linux/sort.h>
 #include <linux/stop_machine.h>
@@ -753,7 +752,7 @@ static void __init sort_ftr_regs(void)
  * Any bits that are not covered by an arm64_ftr_bits entry are considered
  * RES0 for the system-wide value, and must strictly match.
  */
-static void __init init_cpu_ftr_reg(u32 sys_reg, u64 new)
+static void init_cpu_ftr_reg(u32 sys_reg, u64 new)
 {
 	u64 val = 0;
 	u64 strict_mask = ~0x0ULL;
@@ -835,30 +834,6 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
 	init_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1);
 	init_cpu_ftr_reg(SYS_ID_AA64ZFR0_EL1, info->reg_id_aa64zfr0);
 
-	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) {
-		init_cpu_ftr_reg(SYS_ID_DFR0_EL1, info->reg_id_dfr0);
-		init_cpu_ftr_reg(SYS_ID_DFR1_EL1, info->reg_id_dfr1);
-		init_cpu_ftr_reg(SYS_ID_ISAR0_EL1, info->reg_id_isar0);
-		init_cpu_ftr_reg(SYS_ID_ISAR1_EL1, info->reg_id_isar1);
-		init_cpu_ftr_reg(SYS_ID_ISAR2_EL1, info->reg_id_isar2);
-		init_cpu_ftr_reg(SYS_ID_ISAR3_EL1, info->reg_id_isar3);
-		init_cpu_ftr_reg(SYS_ID_ISAR4_EL1, info->reg_id_isar4);
-		init_cpu_ftr_reg(SYS_ID_ISAR5_EL1, info->reg_id_isar5);
-		init_cpu_ftr_reg(SYS_ID_ISAR6_EL1, info->reg_id_isar6);
-		init_cpu_ftr_reg(SYS_ID_MMFR0_EL1, info->reg_id_mmfr0);
-		init_cpu_ftr_reg(SYS_ID_MMFR1_EL1, info->reg_id_mmfr1);
-		init_cpu_ftr_reg(SYS_ID_MMFR2_EL1, info->reg_id_mmfr2);
-		init_cpu_ftr_reg(SYS_ID_MMFR3_EL1, info->reg_id_mmfr3);
-		init_cpu_ftr_reg(SYS_ID_MMFR4_EL1, info->reg_id_mmfr4);
-		init_cpu_ftr_reg(SYS_ID_MMFR5_EL1, info->reg_id_mmfr5);
-		init_cpu_ftr_reg(SYS_ID_PFR0_EL1, info->reg_id_pfr0);
-		init_cpu_ftr_reg(SYS_ID_PFR1_EL1, info->reg_id_pfr1);
-		init_cpu_ftr_reg(SYS_ID_PFR2_EL1, info->reg_id_pfr2);
-		init_cpu_ftr_reg(SYS_MVFR0_EL1, info->reg_mvfr0);
-		init_cpu_ftr_reg(SYS_MVFR1_EL1, info->reg_mvfr1);
-		init_cpu_ftr_reg(SYS_MVFR2_EL1, info->reg_mvfr2);
-	}
-
 	if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) {
 		init_cpu_ftr_reg(SYS_ZCR_EL1, info->reg_zcr);
 		sve_init_vq_map();
@@ -877,6 +852,31 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
 	setup_boot_cpu_capabilities();
 }
 
+void init_cpu_32bit_features(struct cpuinfo_arm64 *info)
+{
+	init_cpu_ftr_reg(SYS_ID_DFR0_EL1, info->reg_id_dfr0);
+	init_cpu_ftr_reg(SYS_ID_DFR1_EL1, info->reg_id_dfr1);
+	init_cpu_ftr_reg(SYS_ID_ISAR0_EL1, info->reg_id_isar0);
+	init_cpu_ftr_reg(SYS_ID_ISAR1_EL1, info->reg_id_isar1);
+	init_cpu_ftr_reg(SYS_ID_ISAR2_EL1, info->reg_id_isar2);
+	init_cpu_ftr_reg(SYS_ID_ISAR3_EL1, info->reg_id_isar3);
+	init_cpu_ftr_reg(SYS_ID_ISAR4_EL1, info->reg_id_isar4);
+	init_cpu_ftr_reg(SYS_ID_ISAR5_EL1, info->reg_id_isar5);
+	init_cpu_ftr_reg(SYS_ID_ISAR6_EL1, info->reg_id_isar6);
+	init_cpu_ftr_reg(SYS_ID_MMFR0_EL1, info->reg_id_mmfr0);
+	init_cpu_ftr_reg(SYS_ID_MMFR1_EL1, info->reg_id_mmfr1);
+	init_cpu_ftr_reg(SYS_ID_MMFR2_EL1, info->reg_id_mmfr2);
+	init_cpu_ftr_reg(SYS_ID_MMFR3_EL1, info->reg_id_mmfr3);
+	init_cpu_ftr_reg(SYS_ID_MMFR4_EL1, info->reg_id_mmfr4);
+	init_cpu_ftr_reg(SYS_ID_MMFR5_EL1, info->reg_id_mmfr5);
+	init_cpu_ftr_reg(SYS_ID_PFR0_EL1, info->reg_id_pfr0);
+	init_cpu_ftr_reg(SYS_ID_PFR1_EL1, info->reg_id_pfr1);
+	init_cpu_ftr_reg(SYS_ID_PFR2_EL1, info->reg_id_pfr2);
+	init_cpu_ftr_reg(SYS_MVFR0_EL1, info->reg_mvfr0);
+	init_cpu_ftr_reg(SYS_MVFR1_EL1, info->reg_mvfr1);
+	init_cpu_ftr_reg(SYS_MVFR2_EL1, info->reg_mvfr2);
+}
+
 static void update_cpu_ftr_reg(struct arm64_ftr_reg *reg, u64 new)
 {
 	const struct arm64_ftr_bits *ftrp;
@@ -1804,6 +1804,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.field_pos = ID_AA64PFR0_EL0_SHIFT,
 		.min_field_value = ID_AA64PFR0_EL0_32BIT_64BIT,
 	},
+#ifdef CONFIG_ASYMMETRIC_AARCH32
+	{
+		.capability = ARM64_HAS_ASYM_32BIT_EL0,
+		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
+		.matches = has_cpuid_feature,
+		.sys_reg = SYS_ID_AA64PFR0_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64PFR0_EL0_SHIFT,
+		.min_field_value = ID_AA64PFR0_EL0_32BIT_64BIT,
+	},
+#endif
 #ifdef CONFIG_KVM
 	{
 		.desc = "32-bit EL1 Support",
@@ -2576,7 +2587,8 @@ static void verify_local_cpu_capabilities(void)
 
 	verify_local_elf_hwcaps(arm64_elf_hwcaps);
 
-	if (system_supports_32bit_el0())
+	if (system_supports_32bit_el0() &&
+	    this_cpu_has_cap(ARM64_HAS_32BIT_EL0))
 		verify_local_elf_hwcaps(compat_elf_hwcaps);
 
 	if (system_supports_sve())
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index d0076c2159e6..b7f69cbbc088 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -362,32 +362,6 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
 	info->reg_id_aa64pfr1 = read_cpuid(ID_AA64PFR1_EL1);
 	info->reg_id_aa64zfr0 = read_cpuid(ID_AA64ZFR0_EL1);
 
-	/* Update the 32bit ID registers only if AArch32 is implemented */
-	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) {
-		info->reg_id_dfr0 = read_cpuid(ID_DFR0_EL1);
-		info->reg_id_dfr1 = read_cpuid(ID_DFR1_EL1);
-		info->reg_id_isar0 = read_cpuid(ID_ISAR0_EL1);
-		info->reg_id_isar1 = read_cpuid(ID_ISAR1_EL1);
-		info->reg_id_isar2 = read_cpuid(ID_ISAR2_EL1);
-		info->reg_id_isar3 = read_cpuid(ID_ISAR3_EL1);
-		info->reg_id_isar4 = read_cpuid(ID_ISAR4_EL1);
-		info->reg_id_isar5 = read_cpuid(ID_ISAR5_EL1);
-		info->reg_id_isar6 = read_cpuid(ID_ISAR6_EL1);
-		info->reg_id_mmfr0 = read_cpuid(ID_MMFR0_EL1);
-		info->reg_id_mmfr1 = read_cpuid(ID_MMFR1_EL1);
-		info->reg_id_mmfr2 = read_cpuid(ID_MMFR2_EL1);
-		info->reg_id_mmfr3 = read_cpuid(ID_MMFR3_EL1);
-		info->reg_id_mmfr4 = read_cpuid(ID_MMFR4_EL1);
-		info->reg_id_mmfr5 = read_cpuid(ID_MMFR5_EL1);
-		info->reg_id_pfr0 = read_cpuid(ID_PFR0_EL1);
-		info->reg_id_pfr1 = read_cpuid(ID_PFR1_EL1);
-		info->reg_id_pfr2 = read_cpuid(ID_PFR2_EL1);
-
-		info->reg_mvfr0 = read_cpuid(MVFR0_EL1);
-		info->reg_mvfr1 = read_cpuid(MVFR1_EL1);
-		info->reg_mvfr2 = read_cpuid(MVFR2_EL1);
-	}
-
 	if (IS_ENABLED(CONFIG_ARM64_SVE) &&
 	    id_aa64pfr0_sve(info->reg_id_aa64pfr0))
 		info->reg_zcr = read_zcr_features();
@@ -395,10 +369,51 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
 	cpuinfo_detect_icache_policy(info);
 }
 
+static void __cpuinfo_store_cpu_32bit(struct cpuinfo_arm64 *info)
+{
+	info->aarch32_valid = true;
+
+	info->reg_id_dfr0 = read_cpuid(ID_DFR0_EL1);
+	info->reg_id_dfr1 = read_cpuid(ID_DFR1_EL1);
+	info->reg_id_isar0 = read_cpuid(ID_ISAR0_EL1);
+	info->reg_id_isar1 = read_cpuid(ID_ISAR1_EL1);
+	info->reg_id_isar2 = read_cpuid(ID_ISAR2_EL1);
+	info->reg_id_isar3 = read_cpuid(ID_ISAR3_EL1);
+	info->reg_id_isar4 = read_cpuid(ID_ISAR4_EL1);
+	info->reg_id_isar5 = read_cpuid(ID_ISAR5_EL1);
+	info->reg_id_isar6 = read_cpuid(ID_ISAR6_EL1);
+	info->reg_id_mmfr0 = read_cpuid(ID_MMFR0_EL1);
+	info->reg_id_mmfr1 = read_cpuid(ID_MMFR1_EL1);
+	info->reg_id_mmfr2 = read_cpuid(ID_MMFR2_EL1);
+	info->reg_id_mmfr3 = read_cpuid(ID_MMFR3_EL1);
+	info->reg_id_mmfr4 = read_cpuid(ID_MMFR4_EL1);
+	info->reg_id_mmfr5 = read_cpuid(ID_MMFR5_EL1);
+	info->reg_id_pfr0 = read_cpuid(ID_PFR0_EL1);
+	info->reg_id_pfr1 = read_cpuid(ID_PFR1_EL1);
+	info->reg_id_pfr2 = read_cpuid(ID_PFR2_EL1);
+
+	info->reg_mvfr0 = read_cpuid(MVFR0_EL1);
+	info->reg_mvfr1 = read_cpuid(MVFR1_EL1);
+	info->reg_mvfr2 = read_cpuid(MVFR2_EL1);
+}
+
 void cpuinfo_store_cpu(void)
 {
 	struct cpuinfo_arm64 *info = this_cpu_ptr(&cpu_data);
 	__cpuinfo_store_cpu(info);
+	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0))
+		__cpuinfo_store_cpu_32bit(info);
+	/*
+	 * With asymmetric AArch32 support, populate the boot CPU information
+	 * on the first 32-bit capable secondary CPU if the primary one
+	 * skipped this step.
+	 */
+	if (IS_ENABLED(CONFIG_ASYMMETRIC_AARCH32) &&
+	    !boot_cpu_data.aarch32_valid &&
+	    id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) {
+		__cpuinfo_store_cpu_32bit(&boot_cpu_data);
+		init_cpu_32bit_features(&boot_cpu_data);
+	}
 	update_cpu_features(smp_processor_id(), info, &boot_cpu_data);
 }
 
@@ -406,7 +421,11 @@ void __init cpuinfo_store_boot_cpu(void)
 {
 	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, 0);
 	__cpuinfo_store_cpu(info);
+	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0))
+		__cpuinfo_store_cpu_32bit(info);
 
 	boot_cpu_data = *info;
 	init_cpu_features(&boot_cpu_data);
+	if (id_aa64pfr0_32bit_el0(boot_cpu_data.reg_id_aa64pfr0))
+		init_cpu_32bit_features(&boot_cpu_data);
 }
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index f1804496b935..a2f9ffb2b173 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -513,6 +513,15 @@ static void entry_task_switch(struct task_struct *next)
 	__this_cpu_write(__entry_task, next);
 }
 
+static void aarch32_thread_switch(struct task_struct *next)
+{
+	struct thread_info *ti = task_thread_info(next);
+
+	if (IS_ENABLED(CONFIG_ASYMMETRIC_AARCH32) && is_compat_thread(ti) &&
+	    !this_cpu_has_cap(ARM64_HAS_32BIT_EL0))
+		set_ti_thread_flag(ti, TIF_CHECK_32BIT_AFFINITY);
+}
+
 /*
  * ARM erratum 1418040 handling, affecting the 32bit view of CNTVCT.
  * Assuming the virtual counter is enabled at the beginning of times:
@@ -562,6 +571,7 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
 	uao_thread_switch(next);
 	ssbs_thread_switch(next);
 	erratum_1418040_thread_switch(prev, next);
+	aarch32_thread_switch(next);
 
 	/*
 	 * Complete any pending TLB or cache maintenance on this CPU in case
@@ -620,6 +630,13 @@ void arch_setup_new_exec(void)
 	current->mm->context.flags = is_compat_task() ? MMCF_AARCH32 : 0;
 
 	ptrauth_thread_init_user(current);
+
+	/*
+	 * If exec'ing a 32-bit task, force the asymmetric 32-bit feature
+	 * check as the task may not go through a switch_to() call.
+	 */
+	if (IS_ENABLED(CONFIG_ASYMMETRIC_AARCH32) && is_compat_task())
+		set_thread_flag(TIF_CHECK_32BIT_AFFINITY);
 }
 
 #ifdef CONFIG_ARM64_TAGGED_ADDR_ABI
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 3b4f31f35e45..cf94cc248fbe 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -8,6 +8,7 @@
 
 #include <linux/cache.h>
 #include <linux/compat.h>
+#include <linux/cpumask.h>
 #include <linux/errno.h>
 #include <linux/kernel.h>
 #include <linux/signal.h>
@@ -907,6 +908,17 @@ static void do_signal(struct pt_regs *regs)
 	restore_saved_sigmask();
 }
 
+static void check_aarch32_cpumask(void)
+{
+	/*
+	 * If the task moved to uncapable CPU, SIGKILL it.
+	 */
+	if (!this_cpu_has_cap(ARM64_HAS_32BIT_EL0)) {
+		pr_warn_once("CPU affinity contains CPUs that are not capable of running 32-bit tasks\n");
+		force_sig(SIGKILL);
+	}
+}
+
 asmlinkage void do_notify_resume(struct pt_regs *regs,
 				 unsigned long thread_flags)
 {
@@ -929,6 +941,12 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 		} else {
 			local_daif_restore(DAIF_PROCCTX);
 
+			if (IS_ENABLED(CONFIG_ASYMMETRIC_AARCH32) &&
+			    thread_flags & _TIF_CHECK_32BIT_AFFINITY) {
+				clear_thread_flag(TIF_CHECK_32BIT_AFFINITY);
+				check_aarch32_cpumask();
+			}
+
 			if (thread_flags & _TIF_UPROBE)
 				uprobe_notify_resume(regs);
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 22ff3373d855..17c6674a7fcd 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -644,7 +644,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 	struct kvm_run *run = vcpu->run;
 	int ret;
 
-	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
+	if (!system_supports_sym_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
 		kvm_err("Illegal AArch32 mode at EL0, can't run.");
 		return -ENOEXEC;
 	}
@@ -815,7 +815,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 		 * enter 32bit mode, catch that and prevent it from running
 		 * again.
 		 */
-		if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
+		if (!system_supports_sym_32bit_el0() &&
+		    vcpu_mode_is_32bit(vcpu)) {
 			kvm_err("Detected illegal AArch32 mode at EL0, exiting.");
 			ret = ARM_EXCEPTION_IL;
 		}
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index dfb5218137ca..0f67b53eaf17 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -226,7 +226,7 @@ static int set_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		u64 mode = (*(u64 *)valp) & PSR_AA32_MODE_MASK;
 		switch (mode) {
 		case PSR_AA32_MODE_USR:
-			if (!system_supports_32bit_el0())
+			if (!system_supports_sym_32bit_el0())
 				return -EINVAL;
 			break;
 		case PSR_AA32_MODE_FIQ:
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 077293b5115f..0b9aaee1df59 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -670,7 +670,7 @@ static void reset_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 	 */
 	val = ((pmcr & ~ARMV8_PMU_PMCR_MASK)
 	       | (ARMV8_PMU_PMCR_MASK & 0xdecafbad)) & (~ARMV8_PMU_PMCR_E);
-	if (!system_supports_32bit_el0())
+	if (!system_supports_sym_32bit_el0())
 		val |= ARMV8_PMU_PMCR_LC;
 	__vcpu_sys_reg(vcpu, r->reg) = val;
 }
@@ -722,7 +722,7 @@ static bool access_pmcr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
 		val = __vcpu_sys_reg(vcpu, PMCR_EL0);
 		val &= ~ARMV8_PMU_PMCR_MASK;
 		val |= p->regval & ARMV8_PMU_PMCR_MASK;
-		if (!system_supports_32bit_el0())
+		if (!system_supports_sym_32bit_el0())
 			val |= ARMV8_PMU_PMCR_LC;
 		__vcpu_sys_reg(vcpu, PMCR_EL0) = val;
 		kvm_pmu_handle_pmcr(vcpu, val);
@@ -1131,6 +1131,16 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
 		if (!vcpu_has_sve(vcpu))
 			val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);
 		val &= ~(0xfUL << ID_AA64PFR0_AMU_SHIFT);
+
+		if (!system_supports_sym_32bit_el0()) {
+			/*
+			 * We could be running on asym aarch32 system.
+			 * Override to present a aarch64 only system.
+			 */
+			val &= ~(0xfUL << ID_AA64PFR0_EL0_SHIFT);
+			val |= (ID_AA64PFR0_EL0_64BIT_ONLY << ID_AA64PFR0_EL0_SHIFT);
+		}
+
 	} else if (id == SYS_ID_AA64ISAR1_EL1 && !vcpu_has_ptrauth(vcpu)) {
 		val &= ~((0xfUL << ID_AA64ISAR1_APA_SHIFT) |
 			 (0xfUL << ID_AA64ISAR1_API_SHIFT) |
-- 
2.17.1



* [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu
  2020-10-08 18:16 [RFC PATCH 0/3] Add support for Asymmetric AArch32 systems Qais Yousef
  2020-10-08 18:16 ` [RFC PATCH 1/3] arm64: kvm: Handle " Qais Yousef
  2020-10-08 18:16 ` [RFC PATCH 2/3] arm64: Add support for asymmetric AArch32 EL0 configurations Qais Yousef
@ 2020-10-08 18:16 ` Qais Yousef
  2020-10-09  7:29   ` Peter Zijlstra
  2 siblings, 1 reply; 33+ messages in thread
From: Qais Yousef @ 2020-10-08 18:16 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Marc Zyngier, Peter Zijlstra (Intel)
  Cc: Morten Rasmussen, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel, linux-arch, Qais Yousef

On an asymmetric AArch32 system, if a task has invalid CPUs in its
affinity, we try to fix up the cpumask silently and let it continue to
run. If the cpumask doesn't contain any valid CPU, we have no choice but
to send SIGKILL.

This patch can be omitted if user-space can guarantee the cpumask of
all AArch32 apps only contains AArch32 capable CPUs.

The biggest challenge in managing the affinity from user space is handling
apps that try to modify their own CPU affinity via sched_setaffinity().
Without this change they could trigger a SIGKILL if they unknowingly
affine themselves to the wrong set of CPUs. Only by confining all 32-bit
apps to a cpuset can user space guarantee that they cannot escape this
affinity.

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
---
 arch/arm64/Kconfig                  |  8 ++++----
 arch/arm64/include/asm/cpufeature.h |  2 ++
 arch/arm64/kernel/cpufeature.c      | 11 +++++++++++
 arch/arm64/kernel/signal.c          | 25 ++++++++++++++++++++-----
 4 files changed, 37 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 591853504dc4..ad6d52dd8ac0 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1875,10 +1875,10 @@ config ASYMMETRIC_AARCH32
 	  Enable this option to allow support for asymmetric AArch32 EL0
 	  CPU configurations. Once the AArch32 EL0 support is detected
 	  on a CPU, the feature is made available to user space to allow
-	  the execution of 32-bit (compat) applications. If the affinity
-	  of the 32-bit application contains a non-AArch32 capable CPU
-	  or the last AArch32 capable CPU is offlined, the application
-	  will be killed.
+	  the execution of 32-bit (compat) applications by migrating
+	  them to the capable CPUs. Offlining such CPUs leads to 32-bit
+	  applications being killed. Similarly if the affinity contains
+	  no 32-bit capable CPU.
 
 	  If unsure say N.
 
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index fa2413715041..57275e47cd3d 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -376,6 +376,8 @@ cpucap_multi_entry_cap_matches(const struct arm64_cpu_capabilities *entry,
 	return false;
 }
 
+extern cpumask_t aarch32_el0_mask;
+
 extern DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
 extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
 extern struct static_key_false arm64_const_caps_ready;
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index d46732113305..4c0858c13e6d 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1721,6 +1721,16 @@ cpucap_panic_on_conflict(const struct arm64_cpu_capabilities *cap)
 	return !!(cap->type & ARM64_CPUCAP_PANIC_ON_CONFLICT);
 }
 
+#ifdef CONFIG_ASYMMETRIC_AARCH32
+cpumask_t aarch32_el0_mask;
+
+static void cpu_enable_aarch32_el0(struct arm64_cpu_capabilities const *cap)
+{
+	if (has_cpuid_feature(cap, SCOPE_LOCAL_CPU))
+		cpumask_set_cpu(smp_processor_id(), &aarch32_el0_mask);
+}
+#endif
+
 static const struct arm64_cpu_capabilities arm64_features[] = {
 	{
 		.desc = "GIC system register CPU interface",
@@ -1799,6 +1809,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.capability = ARM64_HAS_32BIT_EL0,
 		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
 		.matches = has_cpuid_feature,
+		.cpu_enable = cpu_enable_aarch32_el0,
 		.sys_reg = SYS_ID_AA64PFR0_EL1,
 		.sign = FTR_UNSIGNED,
 		.field_pos = ID_AA64PFR0_EL0_SHIFT,
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index cf94cc248fbe..7e97f1589f33 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -908,13 +908,28 @@ static void do_signal(struct pt_regs *regs)
 	restore_saved_sigmask();
 }
 
-static void check_aarch32_cpumask(void)
+static void set_32bit_cpus_allowed(void)
 {
+	cpumask_var_t cpus_allowed;
+	int ret = 0;
+
+	if (cpumask_subset(current->cpus_ptr, &aarch32_el0_mask))
+		return;
+
 	/*
-	 * If the task moved to uncapable CPU, SIGKILL it.
+	 * On asym aarch32 systems, if the task has invalid cpus in its mask,
+	 * we try to fix it by removing the invalid ones.
 	 */
-	if (!this_cpu_has_cap(ARM64_HAS_32BIT_EL0)) {
-		pr_warn_once("CPU affinity contains CPUs that are not capable of running 32-bit tasks\n");
+	if (!alloc_cpumask_var(&cpus_allowed, GFP_ATOMIC)) {
+		ret = -ENOMEM;
+	} else {
+		cpumask_and(cpus_allowed, current->cpus_ptr, &aarch32_el0_mask);
+		ret = set_cpus_allowed_ptr(current, cpus_allowed);
+		free_cpumask_var(cpus_allowed);
+	}
+
+	if (ret) {
+		pr_warn_once("Failed to fixup affinity of running 32-bit task\n");
 		force_sig(SIGKILL);
 	}
 }
@@ -944,7 +959,7 @@ asmlinkage void do_notify_resume(struct pt_regs *regs,
 			if (IS_ENABLED(CONFIG_ASYMMETRIC_AARCH32) &&
 			    thread_flags & _TIF_CHECK_32BIT_AFFINITY) {
 				clear_thread_flag(TIF_CHECK_32BIT_AFFINITY);
-				check_aarch32_cpumask();
+				set_32bit_cpus_allowed();
 			}
 
 			if (thread_flags & _TIF_UPROBE)
-- 
2.17.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH 2/3] arm64: Add support for asymmetric AArch32 EL0 configurations
  2020-10-08 18:16 ` [RFC PATCH 2/3] arm64: Add support for asymmetric AArch32 EL0 configurations Qais Yousef
@ 2020-10-08 18:22   ` Randy Dunlap
  2020-10-12 10:22     ` Qais Yousef
  2020-10-09  6:13   ` Greg Kroah-Hartman
  2020-10-09  9:39   ` Catalin Marinas
  2 siblings, 1 reply; 33+ messages in thread
From: Randy Dunlap @ 2020-10-08 18:22 UTC (permalink / raw)
  To: Qais Yousef, Catalin Marinas, Will Deacon, Marc Zyngier,
	Peter Zijlstra (Intel)
  Cc: Morten Rasmussen, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel, linux-arch

On 10/8/20 11:16 AM, Qais Yousef wrote:
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 6d232837cbee..591853504dc4 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1868,6 +1868,20 @@ config DMI
>  
>  endmenu
>  
> +config ASYMMETRIC_AARCH32
> +	bool "Allow support for asymmetric AArch32 support"

Please drop one "support" or reword the prompt string.

> +	depends on COMPAT && EXPERT
> +	help
> +	  Enable this option to allow support for asymmetric AArch32 EL0
> +	  CPU configurations. Once the AArch32 EL0 support is detected
> +	  on a CPU, the feature is made available to user space to allow
> +	  the execution of 32-bit (compat) applications. If the affinity
> +	  of the 32-bit application contains a non-AArch32 capable CPU
> +	  or the last AArch32 capable CPU is offlined, the application
> +	  will be killed.
> +
> +	  If unsure say N.


-- 
~Randy



* Re: [RFC PATCH 2/3] arm64: Add support for asymmetric AArch32 EL0 configurations
  2020-10-08 18:16 ` [RFC PATCH 2/3] arm64: Add support for asymmetric AArch32 EL0 configurations Qais Yousef
  2020-10-08 18:22   ` Randy Dunlap
@ 2020-10-09  6:13   ` Greg Kroah-Hartman
  2020-10-09  8:40     ` Will Deacon
  2020-10-09  8:50     ` Catalin Marinas
  2020-10-09  9:39   ` Catalin Marinas
  2 siblings, 2 replies; 33+ messages in thread
From: Greg Kroah-Hartman @ 2020-10-09  6:13 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Catalin Marinas, Will Deacon, Marc Zyngier,
	Peter Zijlstra (Intel),
	Morten Rasmussen, Linus Torvalds, linux-arm-kernel, linux-arch

On Thu, Oct 08, 2020 at 07:16:40PM +0100, Qais Yousef wrote:
> When the CONFIG_ASYMMETRIC_AARCH32 option is enabled (EXPERT), the type
> of the ARM64_HAS_32BIT_EL0 capability becomes WEAK_LOCAL_CPU_FEATURE.
> The kernel will now return true for system_supports_32bit_el0() and
> checks 32-bit tasks are affined to AArch32 capable CPUs only in
> do_notify_resume(). If the affinity contains a non-capable AArch32 CPU,
> the tasks will get SIGKILLed. If the last CPU supporting 32-bit is
> offlined, the kernel will SIGKILL any scheduled 32-bit tasks (the
> alternative is to prevent offlining through a new .cpu_disable feature
> entry).
> 
> In addition to the relaxation of the ARM64_HAS_32BIT_EL0 capability,
> this patch factors out the 32-bit cpuinfo and features setting into
> separate functions: __cpuinfo_store_cpu_32bit(),
> init_cpu_32bit_features(). The cpuinfo of the booting CPU
> (boot_cpu_data) is now updated on the first 32-bit capable CPU even if
> it is a secondary one. The ID_AA64PFR0_EL0_64BIT_ONLY feature is relaxed
> to FTR_NONSTRICT and FTR_HIGHER_SAFE when the asymmetric AArch32 support
> is enabled. The compat_elf_hwcaps are only verified for the
> AArch32-capable CPUs to still allow hotplugging AArch64-only CPUs.
> 
> Make sure that KVM never sees the asymmetric 32bit system. Guest can
> still ignore ID registers and force run 32bit at EL0.
> 
> Co-developed-by: Qais Yousef <qais.yousef@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Qais Yousef <qais.yousef@arm.com>
> ---
>  arch/arm64/Kconfig                   | 14 ++++++
>  arch/arm64/include/asm/cpu.h         |  2 +
>  arch/arm64/include/asm/cpucaps.h     |  3 +-
>  arch/arm64/include/asm/cpufeature.h  | 20 +++++++-
>  arch/arm64/include/asm/thread_info.h |  5 +-
>  arch/arm64/kernel/cpufeature.c       | 66 +++++++++++++++-----------
>  arch/arm64/kernel/cpuinfo.c          | 71 ++++++++++++++++++----------
>  arch/arm64/kernel/process.c          | 17 +++++++
>  arch/arm64/kernel/signal.c           | 18 +++++++
>  arch/arm64/kvm/arm.c                 |  5 +-
>  arch/arm64/kvm/guest.c               |  2 +-
>  arch/arm64/kvm/sys_regs.c            | 14 +++++-
>  12 files changed, 176 insertions(+), 61 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 6d232837cbee..591853504dc4 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1868,6 +1868,20 @@ config DMI
>  
>  endmenu
>  
> +config ASYMMETRIC_AARCH32
> +	bool "Allow support for asymmetric AArch32 support"
> +	depends on COMPAT && EXPERT

Why EXPERT?  You don't want this able to be enabled by anyone?

> +	help
> +	  Enable this option to allow support for asymmetric AArch32 EL0
> +	  CPU configurations. Once the AArch32 EL0 support is detected
> +	  on a CPU, the feature is made available to user space to allow
> +	  the execution of 32-bit (compat) applications. If the affinity
> +	  of the 32-bit application contains a non-AArch32 capable CPU
> +	  or the last AArch32 capable CPU is offlined, the application
> +	  will be killed.
> +
> +	  If unsure say N.
> +
>  config SYSVIPC_COMPAT
>  	def_bool y
>  	depends on COMPAT && SYSVIPC
> diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
> index 7faae6ff3ab4..c920fa45e502 100644
> --- a/arch/arm64/include/asm/cpu.h
> +++ b/arch/arm64/include/asm/cpu.h
> @@ -15,6 +15,7 @@
>  struct cpuinfo_arm64 {
>  	struct cpu	cpu;
>  	struct kobject	kobj;
> +	bool		aarch32_valid;

Do you mean to cause holes in this structure?  :)

Isn't "valid" the common thing?  Do you now have to explicitly enable
this everywhere instead of just dealing with the uncommon case of this
cpu variant?

I don't see this information being exported to userspace anywhere.  I
know Intel has submitted a patch to export this "type" of thing to the
cpu sysfs directories, can you do the same thing here?

Otherwise, how is userspace supposed to know where to place programs
that are 32bit?

thanks,

greg k-h


* Re: [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu
  2020-10-08 18:16 ` [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu Qais Yousef
@ 2020-10-09  7:29   ` Peter Zijlstra
  2020-10-09  8:13     ` Morten Rasmussen
  0 siblings, 1 reply; 33+ messages in thread
From: Peter Zijlstra @ 2020-10-09  7:29 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Catalin Marinas, Will Deacon, Marc Zyngier, Morten Rasmussen,
	Greg Kroah-Hartman, Linus Torvalds, linux-arm-kernel, linux-arch

On Thu, Oct 08, 2020 at 07:16:41PM +0100, Qais Yousef wrote:
> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index cf94cc248fbe..7e97f1589f33 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -908,13 +908,28 @@ static void do_signal(struct pt_regs *regs)
>  	restore_saved_sigmask();
>  }
>  
> +static void set_32bit_cpus_allowed(void)
>  {
> +	cpumask_var_t cpus_allowed;
> +	int ret = 0;
> +
> +	if (cpumask_subset(current->cpus_ptr, &aarch32_el0_mask))
> +		return;
> +
>  	/*
> +	 * On asym aarch32 systems, if the task has invalid cpus in its mask,
> +	 * we try to fix it by removing the invalid ones.
>  	 */
> +	if (!alloc_cpumask_var(&cpus_allowed, GFP_ATOMIC)) {
> +		ret = -ENOMEM;
> +	} else {
> +		cpumask_and(cpus_allowed, current->cpus_ptr, &aarch32_el0_mask);
> +		ret = set_cpus_allowed_ptr(current, cpus_allowed);
> +		free_cpumask_var(cpus_allowed);
> +	}
> +
> +	if (ret) {
> +		pr_warn_once("Failed to fixup affinity of running 32-bit task\n");
>  		force_sig(SIGKILL);
>  	}
>  }

Yeah, no. Not going to happen.

Fundamentally, you're not supposed to change the userspace provided
affinity mask. If we want to do something like this, we'll have to teach
the scheduler about this second mask such that it can compute an
effective mask as the intersection between the 'feature' and user mask.

Also, practically, using ->cpus_ptr here to construct the mask will be
really dodgy vs the proposed migrate_disable() patches.



* Re: [RFC PATCH 1/3] arm64: kvm: Handle Asymmetric AArch32 systems
  2020-10-08 18:16 ` [RFC PATCH 1/3] arm64: kvm: Handle " Qais Yousef
@ 2020-10-09  8:12   ` Marc Zyngier
  2020-10-09  9:58     ` Qais Yousef
  0 siblings, 1 reply; 33+ messages in thread
From: Marc Zyngier @ 2020-10-09  8:12 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Catalin Marinas, Will Deacon, Peter Zijlstra (Intel),
	Morten Rasmussen, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel, linux-arch

On 2020-10-08 19:16, Qais Yousef wrote:
> On a system without uniform support for AArch32 at EL0, it is possible
> for the guest to force run AArch32 at EL0 and potentially cause an
> illegal exception if running on the wrong core.
> 
> Add an extra check to catch if the guest ever does that and prevent it
> from running again, treating it as ARM_EXCEPTION_IL.
> 
> We try to catch this misbehavior as early as possible and not rely on
> PSTATE.IL to occur.

nit: PSTATE.IL doesn't "occur". It is an "Illegal Exception Return" that
can happen.

> 
> Tested on Juno by instrumenting the host to:
> 
> 	* Fake asym aarch32.
> 	* Comment out hiding of ID registers from the guest.
> 
> Any attempt to run 32bit app in the guest will produce such error on
> qemu:
> 
> 	# ./test
> 	error: kvm run failed Invalid argument
> 	R00=ffff0fff R01=ffffffff R02=00000000 R03=00087968
> 	R04=000874b8 R05=ffd70b24 R06=ffd70b2c R07=00000055
> 	R08=00000000 R09=00000000 R10=00000000 R11=00000000
> 	R12=0000001c R13=ffd6f974 R14=0001ff64 R15=ffff0fe0
> 	PSR=a0000010 N-C- A usr32
> 
> Signed-off-by: Qais Yousef <qais.yousef@arm.com>
> ---
>  arch/arm64/kvm/arm.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index b588c3b5c2f0..22ff3373d855 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -644,6 +644,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>  	struct kvm_run *run = vcpu->run;
>  	int ret;
> 
> +	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
> +		kvm_err("Illegal AArch32 mode at EL0, can't run.");

No, we don't scream on the console in an uncontrolled way based on
illegal user input (yes, the VM *is* userspace).

Furthermore, you seem to deal with the same problem *twice*. See below.

> +		return -ENOEXEC;
> +	}
> +
>  	if (unlikely(!kvm_vcpu_initialized(vcpu)))
>  		return -ENOEXEC;
> 
> @@ -804,6 +809,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> 
>  		preempt_enable();
> 
> +		/*
> +		 * For asym aarch32 systems we present a 64bit only system to
> +		 * the guest. But in case it managed somehow to escape that and
> +		 * enter 32bit mode, catch that and prevent it from running
> +		 * again.

The guest didn't *escape* anything. It merely used the CPU as designed.
The fact that the hypervisor cannot prevent the guest from using AArch32
is an architectural defect.

> +		 */
> +		if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
> +			kvm_err("Detected illegal AArch32 mode at EL0, exiting.");

Same remark as above. Userspace has access to PSTATE and can work out
the issue by itself.

> +			ret = ARM_EXCEPTION_IL;

This will cause the thread to return to userspace after having done a
vcpu_put(). So why don't you just mark the vcpu as uninitialized before
returning to userspace? It already is in an illegal state, and the only
reasonable thing userspace can do is to reset it.

This way, we keep the horror in a single place, and force userspace to
take action if it really wants to recover the guest.

         M.
-- 
Jazz is not dead. It just smells funny...


* Re: [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu
  2020-10-09  7:29   ` Peter Zijlstra
@ 2020-10-09  8:13     ` Morten Rasmussen
  2020-10-09  8:31       ` Will Deacon
  2020-10-09  9:25       ` Peter Zijlstra
  0 siblings, 2 replies; 33+ messages in thread
From: Morten Rasmussen @ 2020-10-09  8:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Qais Yousef, linux-arch, Marc Zyngier, Will Deacon,
	Catalin Marinas, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel

On Fri, Oct 09, 2020 at 09:29:43AM +0200, Peter Zijlstra wrote:
> On Thu, Oct 08, 2020 at 07:16:41PM +0100, Qais Yousef wrote:
> > diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> > index cf94cc248fbe..7e97f1589f33 100644
> > --- a/arch/arm64/kernel/signal.c
> > +++ b/arch/arm64/kernel/signal.c
> > @@ -908,13 +908,28 @@ static void do_signal(struct pt_regs *regs)
> >  	restore_saved_sigmask();
> >  }
> >  
> > +static void set_32bit_cpus_allowed(void)
> >  {
> > +	cpumask_var_t cpus_allowed;
> > +	int ret = 0;
> > +
> > +	if (cpumask_subset(current->cpus_ptr, &aarch32_el0_mask))
> > +		return;
> > +
> >  	/*
> > +	 * On asym aarch32 systems, if the task has invalid cpus in its mask,
> > +	 * we try to fix it by removing the invalid ones.
> >  	 */
> > +	if (!alloc_cpumask_var(&cpus_allowed, GFP_ATOMIC)) {
> > +		ret = -ENOMEM;
> > +	} else {
> > +		cpumask_and(cpus_allowed, current->cpus_ptr, &aarch32_el0_mask);
> > +		ret = set_cpus_allowed_ptr(current, cpus_allowed);
> > +		free_cpumask_var(cpus_allowed);
> > +	}
> > +
> > +	if (ret) {
> > +		pr_warn_once("Failed to fixup affinity of running 32-bit task\n");
> >  		force_sig(SIGKILL);
> >  	}
> >  }
> 
> Yeah, no. Not going to happen.
> 
> Fundamentally, you're not supposed to change the userspace provided
> affinity mask. If we want to do something like this, we'll have to teach
> the scheduler about this second mask such that it can compute an
> effective mask as the intersection between the 'feature' and user mask.

I agree that we shouldn't mess with the user-space mask directly. Would it
be unthinkable to go down the route of maintaining a new mask which is
the intersection of the feature mask (controlled and updated by arch
code) and the user-space mask?

It shouldn't add overhead in the scheduler as it would use the
intersection mask instead of the user-space mask, the main complexity
would be around making sure the intersection mask is updated correctly
(cpusets, hotplug, ...).

Like the above tweak, this won't help if the intersection mask is empty,
the task will still get killed, but it will allow tasks to survive
user-space masks that include some non-compatible CPUs. If we want to
prevent task killing in all cases (ignoring hotplug) it gets more ugly
as we would have to ignore the user-space mask in some cases.

Morten


* Re: [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu
  2020-10-09  8:13     ` Morten Rasmussen
@ 2020-10-09  8:31       ` Will Deacon
  2020-10-09  8:50         ` Morten Rasmussen
  2020-10-09  9:33         ` Catalin Marinas
  2020-10-09  9:25       ` Peter Zijlstra
  1 sibling, 2 replies; 33+ messages in thread
From: Will Deacon @ 2020-10-09  8:31 UTC (permalink / raw)
  To: Morten Rasmussen
  Cc: Peter Zijlstra, Qais Yousef, linux-arch, Marc Zyngier,
	Catalin Marinas, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel

On Fri, Oct 09, 2020 at 10:13:12AM +0200, Morten Rasmussen wrote:
> On Fri, Oct 09, 2020 at 09:29:43AM +0200, Peter Zijlstra wrote:
> > On Thu, Oct 08, 2020 at 07:16:41PM +0100, Qais Yousef wrote:
> > > diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> > > index cf94cc248fbe..7e97f1589f33 100644
> > > --- a/arch/arm64/kernel/signal.c
> > > +++ b/arch/arm64/kernel/signal.c
> > > @@ -908,13 +908,28 @@ static void do_signal(struct pt_regs *regs)
> > >  	restore_saved_sigmask();
> > >  }
> > >  
> > > +static void set_32bit_cpus_allowed(void)
> > >  {
> > > +	cpumask_var_t cpus_allowed;
> > > +	int ret = 0;
> > > +
> > > +	if (cpumask_subset(current->cpus_ptr, &aarch32_el0_mask))
> > > +		return;
> > > +
> > >  	/*
> > > +	 * On asym aarch32 systems, if the task has invalid cpus in its mask,
> > > +	 * we try to fix it by removing the invalid ones.
> > >  	 */
> > > +	if (!alloc_cpumask_var(&cpus_allowed, GFP_ATOMIC)) {
> > > +		ret = -ENOMEM;
> > > +	} else {
> > > +		cpumask_and(cpus_allowed, current->cpus_ptr, &aarch32_el0_mask);
> > > +		ret = set_cpus_allowed_ptr(current, cpus_allowed);
> > > +		free_cpumask_var(cpus_allowed);
> > > +	}
> > > +
> > > +	if (ret) {
> > > +		pr_warn_once("Failed to fixup affinity of running 32-bit task\n");
> > >  		force_sig(SIGKILL);
> > >  	}
> > >  }
> > 
> > Yeah, no. Not going to happen.
> > 
> > Fundamentally, you're not supposed to change the userspace provided
> > affinity mask. If we want to do something like this, we'll have to teach
> > the scheduler about this second mask such that it can compute an
> > effective mask as the intersection between the 'feature' and user mask.
> 
> > I agree that we shouldn't mess with the user-space mask directly. Would it
> be unthinkable to go down the route of maintaining a new mask which is
> the intersection of the feature mask (controlled and updated by arch
> code) and the user-space mask?
> 
> It shouldn't add overhead in the scheduler as it would use the
> intersection mask instead of the user-space mask, the main complexity
> would be around making sure the intersection mask is updated correctly
> (cpusets, hotplug, ...).
> 
> Like the above tweak, this won't help if the intersection mask is empty,
> task will still get killed but it will allow tasks to survive
> user-space masks including some non-compatible CPUs. If we want to
> prevent task killing in all cases (ignoring hotplug) it gets more ugly
> as we would have to ignore the user-space mask in some cases.

Honestly, I don't understand why we're trying to hide this asymmetry from
userspace by playing games with affinity masks in the kernel. Userspace
is likely to want to move things about _anyway_ because even amongst the
32-bit capable cores, you may well have different clock frequencies to
contend with.

So I'd be *much* happier to let the scheduler do its thing, and if one
of these 32-bit tasks ends up on a core that can't deal with it, then
tough, it gets killed. Give userspace the information it needs to avoid
that happening in the first place, rather than implicitly limit the mask.

That way, the kernel support really boils down to two parts:

  1. Remove the sanity checks we have to prevent 32-bit applications running
     on asymmetric systems

  2. Tell userspace about the problem

Will


* Re: [RFC PATCH 2/3] arm64: Add support for asymmetric AArch32 EL0 configurations
  2020-10-09  6:13   ` Greg Kroah-Hartman
@ 2020-10-09  8:40     ` Will Deacon
  2020-10-09  8:50     ` Catalin Marinas
  1 sibling, 0 replies; 33+ messages in thread
From: Will Deacon @ 2020-10-09  8:40 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Qais Yousef, Catalin Marinas, Marc Zyngier,
	Peter Zijlstra (Intel),
	Morten Rasmussen, Linus Torvalds, linux-arm-kernel, linux-arch

On Fri, Oct 09, 2020 at 08:13:56AM +0200, Greg Kroah-Hartman wrote:
> On Thu, Oct 08, 2020 at 07:16:40PM +0100, Qais Yousef wrote:
> > When the CONFIG_ASYMMETRIC_AARCH32 option is enabled (EXPERT), the type
> > of the ARM64_HAS_32BIT_EL0 capability becomes WEAK_LOCAL_CPU_FEATURE.
> > The kernel will now return true for system_supports_32bit_el0() and
> > checks 32-bit tasks are affined to AArch32 capable CPUs only in
> > do_notify_resume(). If the affinity contains a non-capable AArch32 CPU,
> > the tasks will get SIGKILLed. If the last CPU supporting 32-bit is
> > offlined, the kernel will SIGKILL any scheduled 32-bit tasks (the
> > alternative is to prevent offlining through a new .cpu_disable feature
> > entry).
> > 
> > In addition to the relaxation of the ARM64_HAS_32BIT_EL0 capability,
> > this patch factors out the 32-bit cpuinfo and features setting into
> > separate functions: __cpuinfo_store_cpu_32bit(),
> > init_cpu_32bit_features(). The cpuinfo of the booting CPU
> > (boot_cpu_data) is now updated on the first 32-bit capable CPU even if
> > it is a secondary one. The ID_AA64PFR0_EL0_64BIT_ONLY feature is relaxed
> > to FTR_NONSTRICT and FTR_HIGHER_SAFE when the asymmetric AArch32 support
> > is enabled. The compat_elf_hwcaps are only verified for the
> > AArch32-capable CPUs to still allow hotplugging AArch64-only CPUs.
> > 
> > Make sure that KVM never sees the asymmetric 32bit system. Guest can
> > still ignore ID registers and force run 32bit at EL0.
> > 
> > Co-developed-by: Qais Yousef <qais.yousef@arm.com>
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > Signed-off-by: Qais Yousef <qais.yousef@arm.com>
> > ---
> >  arch/arm64/Kconfig                   | 14 ++++++
> >  arch/arm64/include/asm/cpu.h         |  2 +
> >  arch/arm64/include/asm/cpucaps.h     |  3 +-
> >  arch/arm64/include/asm/cpufeature.h  | 20 +++++++-
> >  arch/arm64/include/asm/thread_info.h |  5 +-
> >  arch/arm64/kernel/cpufeature.c       | 66 +++++++++++++++-----------
> >  arch/arm64/kernel/cpuinfo.c          | 71 ++++++++++++++++++----------
> >  arch/arm64/kernel/process.c          | 17 +++++++
> >  arch/arm64/kernel/signal.c           | 18 +++++++
> >  arch/arm64/kvm/arm.c                 |  5 +-
> >  arch/arm64/kvm/guest.c               |  2 +-
> >  arch/arm64/kvm/sys_regs.c            | 14 +++++-
> >  12 files changed, 176 insertions(+), 61 deletions(-)
> > 
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 6d232837cbee..591853504dc4 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -1868,6 +1868,20 @@ config DMI
> >  
> >  endmenu
> >  
> > +config ASYMMETRIC_AARCH32
> > +	bool "Allow support for asymmetric AArch32 support"
> > +	depends on COMPAT && EXPERT
> 
> Why EXPERT?  You don't want this able to be enabled by anyone?

TBH, I'd be inclined to drop the Kconfig option altogether. We're not
looking at a lot of code here, so all it does is further fragment the
build testing we get from CI (or it just ends up being enabled all of the
time).

A cmdline option, on the other hand, makes a tonne of sense to me, as it
acts as an "opt-in" that the distribution is ready to handle the madness
(because userspace will need to care about this even with the scheduler
hacks proposed here).

Will


* Re: [RFC PATCH 2/3] arm64: Add support for asymmetric AArch32 EL0 configurations
  2020-10-09  6:13   ` Greg Kroah-Hartman
  2020-10-09  8:40     ` Will Deacon
@ 2020-10-09  8:50     ` Catalin Marinas
  1 sibling, 0 replies; 33+ messages in thread
From: Catalin Marinas @ 2020-10-09  8:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Qais Yousef, Will Deacon, Marc Zyngier, Peter Zijlstra (Intel),
	Morten Rasmussen, Linus Torvalds, linux-arm-kernel, linux-arch

On Fri, Oct 09, 2020 at 08:13:56AM +0200, Greg Kroah-Hartman wrote:
> On Thu, Oct 08, 2020 at 07:16:40PM +0100, Qais Yousef wrote:
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 6d232837cbee..591853504dc4 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -1868,6 +1868,20 @@ config DMI
> >  
> >  endmenu
> >  
> > +config ASYMMETRIC_AARCH32
> > +	bool "Allow support for asymmetric AArch32 support"
> > +	depends on COMPAT && EXPERT
> 
> Why EXPERT?  You don't want this able to be enabled by anyone?

Exactly ;). Anyway, depending on how user-friendly the feature becomes
(like the kernel transparently handling task placement), we may drop
this condition or replace it with a command line option. By default
current kernels block ELF32 processes on such platforms.

> > +	help
> > +	  Enable this option to allow support for asymmetric AArch32 EL0
> > +	  CPU configurations. Once the AArch32 EL0 support is detected
> > +	  on a CPU, the feature is made available to user space to allow
> > +	  the execution of 32-bit (compat) applications. If the affinity
> > +	  of the 32-bit application contains a non-AArch32 capable CPU
> > +	  or the last AArch32 capable CPU is offlined, the application
> > +	  will be killed.
> > +
> > +	  If unsure say N.
> > +
> >  config SYSVIPC_COMPAT
> >  	def_bool y
> >  	depends on COMPAT && SYSVIPC
> > diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
> > index 7faae6ff3ab4..c920fa45e502 100644
> > --- a/arch/arm64/include/asm/cpu.h
> > +++ b/arch/arm64/include/asm/cpu.h
> > @@ -15,6 +15,7 @@
> >  struct cpuinfo_arm64 {
> >  	struct cpu	cpu;
> >  	struct kobject	kobj;
> > +	bool		aarch32_valid;
> 
> Do you mean to cause holes in this structure?  :)

No matter where you put it, there's still a hole (could move the hole to
the end though), unless we make it a 32-bit value.

Thinking about this, since we only check it on the boot_cpu_data
structure, we could probably make it a stand-alone variable or drop it
altogether (see below).

> Isn't "valid" the common thing?  Do you now have to explicitly enable
> this everywhere instead of just dealing with the uncommon case of this
> cpu variant?

We have a whole infrastructure for dealing with asymmetric features and
in most cases we only want the common functionality. The CPUID register
fields across all CPUs are normally sanitised to the lowest common
value. For some secondary CPU feature not matching the boot CPU we taint
the kernel.

In the original patch (that ended up on some Google gerrit), we relaxed
the 32-bit support checking so that the sanitised register contains the
highest value. However, if booting on a (big) CPU that did not have
32-bit, the kernel would end up tainted. The reason for aarch32_valid
was to delay populating the boot CPU information until a secondary CPU
comes up with 32-bit support and subsequently avoid the tainting.

With a last minute change (yesterday), we reverted the sanitised 32-bit
support field to the lowest value and introduced a new feature check
that's enabled when at least one of the CPUs has 32-bit support (we do
something similar for errata detection). With this in place, I think the
aarch32_valid setting/checking and delayed boot_cpu_data update can go.

> I don't see this information being exported to userspace anywhere.  I
> know Intel has submitted a patch to export this "type" of thing to the
> cpu sysfs directories, can you do the same thing here?
> 
> Otherwise, how is userspace supposed to know where to place programs
> that are 32bit?

In this series, we tried to avoid this by introducing an automatic
affinity setting/hacking (patch 3). So it's entirely transparent to the
user; it doesn't need to explicitly ask for specific task placement.

Given that Peter is against overriding the user cpumask and that an empty
intersection between the 32-bit cpumask and the user one would lead to
SIGKILL, we probably have to expose the information somewhere. Currently
we have the midr_el1 register exposed in sysfs and this tells the
specific CPU implementation. It doesn't, however, state whether 32-bit
is supported unless one checks the specifications. I'd prefer to extend
the current arm64 sysfs cpu interface to include the rest of the id_*
registers, unless we find a way to create a common interface with what
the x86 guys are doing. But I'm slightly doubtful that we can find a
common interface. While 32-bit is common across other architectures, we
may want this for other features which don't have any equivalent.

-- 
Catalin


* Re: [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu
  2020-10-09  8:31       ` Will Deacon
@ 2020-10-09  8:50         ` Morten Rasmussen
  2020-10-09  9:33         ` Catalin Marinas
  1 sibling, 0 replies; 33+ messages in thread
From: Morten Rasmussen @ 2020-10-09  8:50 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arch, Marc Zyngier, Peter Zijlstra, Greg Kroah-Hartman,
	Catalin Marinas, Linus Torvalds, Qais Yousef, linux-arm-kernel

On Fri, Oct 09, 2020 at 09:31:47AM +0100, Will Deacon wrote:
> On Fri, Oct 09, 2020 at 10:13:12AM +0200, Morten Rasmussen wrote:
> > On Fri, Oct 09, 2020 at 09:29:43AM +0200, Peter Zijlstra wrote:
> > > On Thu, Oct 08, 2020 at 07:16:41PM +0100, Qais Yousef wrote:
> > > > diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> > > > index cf94cc248fbe..7e97f1589f33 100644
> > > > --- a/arch/arm64/kernel/signal.c
> > > > +++ b/arch/arm64/kernel/signal.c
> > > > @@ -908,13 +908,28 @@ static void do_signal(struct pt_regs *regs)
> > > >  	restore_saved_sigmask();
> > > >  }
> > > >  
> > > > +static void set_32bit_cpus_allowed(void)
> > > >  {
> > > > +	cpumask_var_t cpus_allowed;
> > > > +	int ret = 0;
> > > > +
> > > > +	if (cpumask_subset(current->cpus_ptr, &aarch32_el0_mask))
> > > > +		return;
> > > > +
> > > >  	/*
> > > > +	 * On asym aarch32 systems, if the task has invalid cpus in its mask,
> > > > +	 * we try to fix it by removing the invalid ones.
> > > >  	 */
> > > > +	if (!alloc_cpumask_var(&cpus_allowed, GFP_ATOMIC)) {
> > > > +		ret = -ENOMEM;
> > > > +	} else {
> > > > +		cpumask_and(cpus_allowed, current->cpus_ptr, &aarch32_el0_mask);
> > > > +		ret = set_cpus_allowed_ptr(current, cpus_allowed);
> > > > +		free_cpumask_var(cpus_allowed);
> > > > +	}
> > > > +
> > > > +	if (ret) {
> > > > +		pr_warn_once("Failed to fixup affinity of running 32-bit task\n");
> > > >  		force_sig(SIGKILL);
> > > >  	}
> > > >  }
> > > 
> > > Yeah, no. Not going to happen.
> > > 
> > > Fundamentally, you're not supposed to change the userspace provided
> > > affinity mask. If we want to do something like this, we'll have to teach
> > > the scheduler about this second mask such that it can compute an
> > > effective mask as the intersection between the 'feature' and user mask.
> > 
> > I agree that we shouldn't mess with the user-space mask directly. Would it
> > be unthinkable to go down the route of maintaining a new mask which is
> > the intersection of the feature mask (controlled and updated by arch
> > code) and the user-space mask?
> > 
> > It shouldn't add overhead in the scheduler as it would use the
> > intersection mask instead of the user-space mask, the main complexity
> > would be around making sure the intersection mask is updated correctly
> > (cpusets, hotplug, ...).
> > 
> > Like the above tweak, this won't help if the intersection mask is empty,
> > task will still get killed but it will allow tasks to survive
> > user-space masks including some non-compatible CPUs. If we want to
> > prevent task killing in all cases (ignoring hotplug) it gets more ugly
> > as we would have to ignore the user-space mask in some cases.
> 
> Honestly, I don't understand why we're trying to hide this asymmetry from
> userspace by playing games with affinity masks in the kernel. Userspace
> is likely to want to move things about _anyway_ because even amongst the
> 32-bit capable cores, you may well have different clock frequencies to
> contend with.

I agree it doesn't make sense to hide the asymmetry. The only argument I
see for tweaking the affinity is to be more friendly in case user-space
is unaware.

> So I'd be *much* happier to let the scheduler do its thing, and if one
> of these 32-bit tasks ends up on a core that can't deal with it, then
> tough, it gets killed. Give userspace the information it needs to avoid
> that happening in the first place, rather than implicitly limit the mask.
> 
> That way, the kernel support really boils down to two parts:
> 
>   1. Remove the sanity checks we have to prevent 32-bit applications running
>      on asymmetric systems
> 
>   2. Tell userspace about the problem

I'm fine with that. We just have to accept that existing unaware
user-space(s) may see tasks getting killed if they use task affinity.

Morten

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu
  2020-10-09  8:13     ` Morten Rasmussen
  2020-10-09  8:31       ` Will Deacon
@ 2020-10-09  9:25       ` Peter Zijlstra
  2020-10-09  9:39         ` Qais Yousef
  2020-10-09  9:51         ` Catalin Marinas
  1 sibling, 2 replies; 33+ messages in thread
From: Peter Zijlstra @ 2020-10-09  9:25 UTC (permalink / raw)
  To: Morten Rasmussen
  Cc: Qais Yousef, linux-arch, Marc Zyngier, Will Deacon,
	Catalin Marinas, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel

On Fri, Oct 09, 2020 at 10:13:12AM +0200, Morten Rasmussen wrote:
> On Fri, Oct 09, 2020 at 09:29:43AM +0200, Peter Zijlstra wrote:

> > Fundamentally, you're not supposed to change the userspace provided
> > affinity mask. If we want to do something like this, we'll have to teach
> > the scheduler about this second mask such that it can compute an
> > effective mask as the intersection between the 'feature' and user mask.
> 
> I agree that we shouldn't mess with the user-space mask directly. Would it
> be unthinkable to go down the route of maintaining a new mask which is
> the intersection of the feature mask (controlled and updated by arch
> code) and the user-space mask?
> 
> It shouldn't add overhead in the scheduler as it would use the
> intersection mask instead of the user-space mask, the main complexity
> would be around making sure the intersection mask is updated correctly
> (cpusets, hotplug, ...).

IFF we _need_ to go there, then yes that was the plan, compose the
intersection when either the (arch) feature(set) mask or the userspace
mask changes.



* Re: [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu
  2020-10-09  8:31       ` Will Deacon
  2020-10-09  8:50         ` Morten Rasmussen
@ 2020-10-09  9:33         ` Catalin Marinas
  2020-10-09  9:42           ` Greg Kroah-Hartman
  2020-10-09 11:31           ` Qais Yousef
  1 sibling, 2 replies; 33+ messages in thread
From: Catalin Marinas @ 2020-10-09  9:33 UTC (permalink / raw)
  To: Will Deacon
  Cc: Morten Rasmussen, Peter Zijlstra, Qais Yousef, linux-arch,
	Marc Zyngier, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel

On Fri, Oct 09, 2020 at 09:31:47AM +0100, Will Deacon wrote:
> On Fri, Oct 09, 2020 at 10:13:12AM +0200, Morten Rasmussen wrote:
> > On Fri, Oct 09, 2020 at 09:29:43AM +0200, Peter Zijlstra wrote:
> > > On Thu, Oct 08, 2020 at 07:16:41PM +0100, Qais Yousef wrote:
> > > > diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> > > > index cf94cc248fbe..7e97f1589f33 100644
> > > > --- a/arch/arm64/kernel/signal.c
> > > > +++ b/arch/arm64/kernel/signal.c
> > > > @@ -908,13 +908,28 @@ static void do_signal(struct pt_regs *regs)
> > > >  	restore_saved_sigmask();
> > > >  }
> > > >  
> > > > +static void set_32bit_cpus_allowed(void)
> > > >  {
> > > > +	cpumask_var_t cpus_allowed;
> > > > +	int ret = 0;
> > > > +
> > > > +	if (cpumask_subset(current->cpus_ptr, &aarch32_el0_mask))
> > > > +		return;
> > > > +
> > > >  	/*
> > > > +	 * On asym aarch32 systems, if the task has invalid cpus in its mask,
> > > > +	 * we try to fix it by removing the invalid ones.
> > > >  	 */
> > > > +	if (!alloc_cpumask_var(&cpus_allowed, GFP_ATOMIC)) {
> > > > +		ret = -ENOMEM;
> > > > +	} else {
> > > > +		cpumask_and(cpus_allowed, current->cpus_ptr, &aarch32_el0_mask);
> > > > +		ret = set_cpus_allowed_ptr(current, cpus_allowed);
> > > > +		free_cpumask_var(cpus_allowed);
> > > > +	}
> > > > +
> > > > +	if (ret) {
> > > > +		pr_warn_once("Failed to fixup affinity of running 32-bit task\n");
> > > >  		force_sig(SIGKILL);
> > > >  	}
> > > >  }
> > > 
> > > Yeah, no. Not going to happen.
> > > 
> > > Fundamentally, you're not supposed to change the userspace provided
> > > affinity mask. If we want to do something like this, we'll have to teach
> > > the scheduler about this second mask such that it can compute an
> > > effective mask as the intersection between the 'feature' and user mask.
> > 
> > I agree that we shouldn't mess with the user-space mask directly. Would it
> > be unthinkable to go down the route of maintaining a new mask which is
> > the intersection of the feature mask (controlled and updated by arch
> > code) and the user-space mask?
> > 
> > It shouldn't add overhead in the scheduler as it would use the
> > intersection mask instead of the user-space mask, the main complexity
> > would be around making sure the intersection mask is updated correctly
> > (cpusets, hotplug, ...).
> > 
> > Like the above tweak, this won't help if the intersection mask is empty,
> > task will still get killed but it will allow tasks to survive
> > user-space masks including some non-compatible CPUs. If we want to
> > prevent task killing in all cases (ignoring hotplug) it gets more ugly
> > as we would have to ignore the user-space mask in some cases.
> 
> Honestly, I don't understand why we're trying to hide this asymmetry from
> userspace by playing games with affinity masks in the kernel. Userspace
> is likely to want to move things about _anyway_ because even amongst the
> 32-bit capable cores, you may well have different clock frequencies to
> contend with.
> 
> So I'd be *much* happier to let the scheduler do its thing, and if one
> of these 32-bit tasks ends up on a core that can't deal with it, then
> tough, it gets killed. Give userspace the information it needs to avoid
> that happening in the first place, rather than implicitly limit the mask.
> 
> That way, the kernel support really boils down to two parts:
> 
>   1. Remove the sanity checks we have to prevent 32-bit applications running
>      on asymmetric systems
> 
>   2. Tell userspace about the problem

This works for me as well as long as it is default off with a knob to
turn it on. I'd prefer a sysctl (which can be driven from the command
line in recent kernels IIRC) so that one can play with it at run-time.
This way it's also a userspace choice and not an admin or whoever
controls the cmdline (well, that's rather theoretical since the target
is Android).

-- 
Catalin


* Re: [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu
  2020-10-09  9:25       ` Peter Zijlstra
@ 2020-10-09  9:39         ` Qais Yousef
  2020-10-09  9:51         ` Catalin Marinas
  1 sibling, 0 replies; 33+ messages in thread
From: Qais Yousef @ 2020-10-09  9:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Morten Rasmussen, linux-arch, Marc Zyngier, Will Deacon,
	Catalin Marinas, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel

On 10/09/20 11:25, Peter Zijlstra wrote:
> On Fri, Oct 09, 2020 at 10:13:12AM +0200, Morten Rasmussen wrote:
> > On Fri, Oct 09, 2020 at 09:29:43AM +0200, Peter Zijlstra wrote:
> 
> > > Fundamentally, you're not supposed to change the userspace provided
> > > affinity mask. If we want to do something like this, we'll have to teach
> > > the scheduler about this second mask such that it can compute an
> > > effective mask as the intersection between the 'feature' and user mask.
> > 
> > I agree that we shouldn't mess with the user-space mask directly. Would it
> > be unthinkable to go down the route of maintaining a new mask which is
> > the intersection of the feature mask (controlled and updated by arch
> > code) and the user-space mask?
> > 
> > It shouldn't add overhead in the scheduler as it would use the
> > intersection mask instead of the user-space mask, the main complexity
> > would be around making sure the intersection mask is updated correctly
> > (cpusets, hotplug, ...).
> 
> IFF we _need_ to go there, then yes that was the plan, compose the
> intersection when either the (arch) feature(set) mask or the userspace
> mask changes.

On such systems these tasks are only valid to run on a subset of cpus. If we
want to go down that route, it makes a lot of sense to me to fix up the
affinity when a task is spawned and make sure sched_setaffinity() never allows
it to go outside this range. The tasks can't physically run on the excluded
cpus, so I don't see us breaking user-space affinity here. Just reflecting the
reality.

Only if it moved to a cpuset with no intersection would it be killed, which
I think is the behavior today anyway.

Thanks

--
Qais Yousef


* Re: [RFC PATCH 2/3] arm64: Add support for asymmetric AArch32 EL0 configurations
  2020-10-08 18:16 ` [RFC PATCH 2/3] arm64: Add support for asymmetric AArch32 EL0 configurations Qais Yousef
  2020-10-08 18:22   ` Randy Dunlap
  2020-10-09  6:13   ` Greg Kroah-Hartman
@ 2020-10-09  9:39   ` Catalin Marinas
  2020-10-12 12:46     ` Qais Yousef
  2 siblings, 1 reply; 33+ messages in thread
From: Catalin Marinas @ 2020-10-09  9:39 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Will Deacon, Marc Zyngier, Peter Zijlstra (Intel),
	Morten Rasmussen, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel, linux-arch

On Thu, Oct 08, 2020 at 07:16:40PM +0100, Qais Yousef wrote:
> diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
> index 7faae6ff3ab4..c920fa45e502 100644
> --- a/arch/arm64/include/asm/cpu.h
> +++ b/arch/arm64/include/asm/cpu.h
> @@ -15,6 +15,7 @@
>  struct cpuinfo_arm64 {
>  	struct cpu	cpu;
>  	struct kobject	kobj;
> +	bool		aarch32_valid;

As I replied to Greg, I think we can drop this field entirely. But, of
course, please check that the kernel doesn't get tainted if booting on a
non-32-bit capable CPU.

>  void cpuinfo_store_cpu(void)
>  {
>  	struct cpuinfo_arm64 *info = this_cpu_ptr(&cpu_data);
>  	__cpuinfo_store_cpu(info);
> +	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0))
> +		__cpuinfo_store_cpu_32bit(info);
> +	/*
> +	 * With asymmetric AArch32 support, populate the boot CPU information
> +	 * on the first 32-bit capable secondary CPU if the primary one
> +	 * skipped this step.
> +	 */
> +	if (IS_ENABLED(CONFIG_ASYMMETRIC_AARCH32) &&
> +	    !boot_cpu_data.aarch32_valid &&
> +	    id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) {
> +		__cpuinfo_store_cpu_32bit(&boot_cpu_data);
> +		init_cpu_32bit_features(&boot_cpu_data);
> +	}

Same here, we can probably drop the boot_cpu_data update here.

> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 077293b5115f..0b9aaee1df59 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1131,6 +1131,16 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
>  		if (!vcpu_has_sve(vcpu))
>  			val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);
>  		val &= ~(0xfUL << ID_AA64PFR0_AMU_SHIFT);
> +
> +		if (!system_supports_sym_32bit_el0()) {
> +			/*
> +			 * We could be running on asym aarch32 system.
> +			 * Override to present a aarch64 only system.
> +			 */
> +			val &= ~(0xfUL << ID_AA64PFR0_EL0_SHIFT);
> +			val |= (ID_AA64PFR0_EL0_64BIT_ONLY << ID_AA64PFR0_EL0_SHIFT);
> +		}

With the sanitised registers using the lowest value of this field, I
think we no longer need this explicit masking.

-- 
Catalin


* Re: [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu
  2020-10-09  9:33         ` Catalin Marinas
@ 2020-10-09  9:42           ` Greg Kroah-Hartman
  2020-10-09 11:31           ` Qais Yousef
  1 sibling, 0 replies; 33+ messages in thread
From: Greg Kroah-Hartman @ 2020-10-09  9:42 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Morten Rasmussen, Peter Zijlstra, Qais Yousef,
	linux-arch, Marc Zyngier, Linus Torvalds, linux-arm-kernel

On Fri, Oct 09, 2020 at 10:33:41AM +0100, Catalin Marinas wrote:
> On Fri, Oct 09, 2020 at 09:31:47AM +0100, Will Deacon wrote:
> > On Fri, Oct 09, 2020 at 10:13:12AM +0200, Morten Rasmussen wrote:
> > > On Fri, Oct 09, 2020 at 09:29:43AM +0200, Peter Zijlstra wrote:
> > > > On Thu, Oct 08, 2020 at 07:16:41PM +0100, Qais Yousef wrote:
> > > > > diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> > > > > index cf94cc248fbe..7e97f1589f33 100644
> > > > > --- a/arch/arm64/kernel/signal.c
> > > > > +++ b/arch/arm64/kernel/signal.c
> > > > > @@ -908,13 +908,28 @@ static void do_signal(struct pt_regs *regs)
> > > > >  	restore_saved_sigmask();
> > > > >  }
> > > > >  
> > > > > +static void set_32bit_cpus_allowed(void)
> > > > >  {
> > > > > +	cpumask_var_t cpus_allowed;
> > > > > +	int ret = 0;
> > > > > +
> > > > > +	if (cpumask_subset(current->cpus_ptr, &aarch32_el0_mask))
> > > > > +		return;
> > > > > +
> > > > >  	/*
> > > > > +	 * On asym aarch32 systems, if the task has invalid cpus in its mask,
> > > > > +	 * we try to fix it by removing the invalid ones.
> > > > >  	 */
> > > > > +	if (!alloc_cpumask_var(&cpus_allowed, GFP_ATOMIC)) {
> > > > > +		ret = -ENOMEM;
> > > > > +	} else {
> > > > > +		cpumask_and(cpus_allowed, current->cpus_ptr, &aarch32_el0_mask);
> > > > > +		ret = set_cpus_allowed_ptr(current, cpus_allowed);
> > > > > +		free_cpumask_var(cpus_allowed);
> > > > > +	}
> > > > > +
> > > > > +	if (ret) {
> > > > > +		pr_warn_once("Failed to fixup affinity of running 32-bit task\n");
> > > > >  		force_sig(SIGKILL);
> > > > >  	}
> > > > >  }
> > > > 
> > > > Yeah, no. Not going to happen.
> > > > 
> > > > Fundamentally, you're not supposed to change the userspace provided
> > > > affinity mask. If we want to do something like this, we'll have to teach
> > > > the scheduler about this second mask such that it can compute an
> > > > effective mask as the intersection between the 'feature' and user mask.
> > > 
> > > I agree that we shouldn't mess with the user-space mask directly. Would it
> > > be unthinkable to go down the route of maintaining a new mask which is
> > > the intersection of the feature mask (controlled and updated by arch
> > > code) and the user-space mask?
> > > 
> > > It shouldn't add overhead in the scheduler as it would use the
> > > intersection mask instead of the user-space mask, the main complexity
> > > would be around making sure the intersection mask is updated correctly
> > > (cpusets, hotplug, ...).
> > > 
> > > Like the above tweak, this won't help if the intersection mask is empty,
> > > task will still get killed but it will allow tasks to survive
> > > user-space masks including some non-compatible CPUs. If we want to
> > > prevent task killing in all cases (ignoring hotplug) it gets more ugly
> > > as we would have to ignore the user-space mask in some cases.
> > 
> > Honestly, I don't understand why we're trying to hide this asymmetry from
> > userspace by playing games with affinity masks in the kernel. Userspace
> > is likely to want to move things about _anyway_ because even amongst the
> > 32-bit capable cores, you may well have different clock frequencies to
> > contend with.
> > 
> > So I'd be *much* happier to let the scheduler do its thing, and if one
> > of these 32-bit tasks ends up on a core that can't deal with it, then
> > tough, it gets killed. Give userspace the information it needs to avoid
> > that happening in the first place, rather than implicitly limit the mask.
> > 
> > That way, the kernel support really boils down to two parts:
> > 
> >   1. Remove the sanity checks we have to prevent 32-bit applications running
> >      on asymmetric systems
> > 
> >   2. Tell userspace about the problem
> 
> This works for me as well as long as it is default off with a knob to
> turn it on. I'd prefer a sysctl (which can be driven from the command
> line in recent kernels IIRC) so that one can play with it at run-time.
> This way it's also a userspace choice and not an admin or whoever
> controls the cmdline (well, that's rather theoretical since the target
> is Android).

The target _today_ is Android.  Chips and cores have a life of their own
and end up getting used all over the place over time, as you know :)

thanks,

greg k-h


* Re: [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu
  2020-10-09  9:25       ` Peter Zijlstra
  2020-10-09  9:39         ` Qais Yousef
@ 2020-10-09  9:51         ` Catalin Marinas
  1 sibling, 0 replies; 33+ messages in thread
From: Catalin Marinas @ 2020-10-09  9:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Morten Rasmussen, Qais Yousef, linux-arch, Marc Zyngier,
	Will Deacon, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel

On Fri, Oct 09, 2020 at 11:25:59AM +0200, Peter Zijlstra wrote:
> On Fri, Oct 09, 2020 at 10:13:12AM +0200, Morten Rasmussen wrote:
> > On Fri, Oct 09, 2020 at 09:29:43AM +0200, Peter Zijlstra wrote:
> 
> > > Fundamentally, you're not supposed to change the userspace provided
> > > affinity mask. If we want to do something like this, we'll have to teach
> > > the scheduler about this second mask such that it can compute an
> > > effective mask as the intersection between the 'feature' and user mask.
> > 
> > I agree that we shouldn't mess with the user-space mask directly. Would it
> > be unthinkable to go down the route of maintaining a new mask which is
> > the intersection of the feature mask (controlled and updated by arch
> > code) and the user-space mask?
> > 
> > It shouldn't add overhead in the scheduler as it would use the
> > intersection mask instead of the user-space mask, the main complexity
> > would be around making sure the intersection mask is updated correctly
> > (cpusets, hotplug, ...).
> 
> IFF we _need_ to go there, then yes that was the plan, compose the
> intersection when either the (arch) feature(set) mask or the userspace
> mask changes.

Another thought: should the arm64 compat_elf_check_arch() and
sys_arm64_personality(PER_LINUX32) validate the user cpumask before
returning success/failure? It won't prevent the user from changing the
affinity at run-time but at least the task won't randomly get killed just
because the kernel advertises 32-bit support.

-- 
Catalin


* Re: [RFC PATCH 1/3] arm64: kvm: Handle Asymmetric AArch32 systems
  2020-10-09  8:12   ` Marc Zyngier
@ 2020-10-09  9:58     ` Qais Yousef
  2020-10-09 12:34       ` Marc Zyngier
  0 siblings, 1 reply; 33+ messages in thread
From: Qais Yousef @ 2020-10-09  9:58 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Catalin Marinas, Will Deacon, Peter Zijlstra (Intel),
	Morten Rasmussen, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel, linux-arch

On 10/09/20 09:12, Marc Zyngier wrote:
> On 2020-10-08 19:16, Qais Yousef wrote:
> > On a system without uniform support for AArch32 at EL0, it is possible
> > for the guest to force run AArch32 at EL0 and potentially cause an
> > illegal exception if running on the wrong core.
> > 
> > Add an extra check to catch if the guest ever does that and prevent it
> > from running again, treating it as ARM_EXCEPTION_IL.
> > 
> > We try to catch this misbehavior as early as possible and not rely on
> > PSTATE.IL to occur.
> 
> nit: PSTATE.IL doesn't "occur". It is an "Illegal Exception Return" that
> can happen.

+1

> 
> > 
> > Tested on Juno by instrumenting the host to:
> > 
> > 	* Fake asym aarch32.
> > 	* Comment out hiding of ID registers from the guest.
> > 
> > Any attempt to run 32bit app in the guest will produce such error on
> > qemu:
> > 
> > 	# ./test
> > 	error: kvm run failed Invalid argument
> > 	R00=ffff0fff R01=ffffffff R02=00000000 R03=00087968
> > 	R04=000874b8 R05=ffd70b24 R06=ffd70b2c R07=00000055
> > 	R08=00000000 R09=00000000 R10=00000000 R11=00000000
> > 	R12=0000001c R13=ffd6f974 R14=0001ff64 R15=ffff0fe0
> > 	PSR=a0000010 N-C- A usr32
> > 
> > Signed-off-by: Qais Yousef <qais.yousef@arm.com>
> > ---
> >  arch/arm64/kvm/arm.c | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> > 
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index b588c3b5c2f0..22ff3373d855 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -644,6 +644,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> >  	struct kvm_run *run = vcpu->run;
> >  	int ret;
> > 
> > +	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
> > +		kvm_err("Illegal AArch32 mode at EL0, can't run.");
> 
> No, we don't scream on the console in an uncontrolled way based on
> illegal user input (yes, the VM *is* userspace).

It seemed kind to print a clear reason for what just happened.

> 
> Furthermore, you seem to deal with the same problem *twice*. See below.

It's done below because we could loop back into the guest again, so we force an
exit there. The check here makes sure that if the VMM ignores the error value we
returned earlier, it can't force its way back in again.

> 
> > +		return -ENOEXEC;
> > +	}
> > +
> >  	if (unlikely(!kvm_vcpu_initialized(vcpu)))
> >  		return -ENOEXEC;
> > 
> > @@ -804,6 +809,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > 
> >  		preempt_enable();
> > 
> > +		/*
> > +		 * For asym aarch32 systems we present a 64bit only system to
> > +		 * the guest. But in case it managed somehow to escape that and
> > +		 * enter 32bit mode, catch that and prevent it from running
> > +		 * again.
> 
> The guest didn't *escape* anything. It merely used the CPU as designed.
> The fact that the hypervisor cannot prevent the guest from using AArch32
> is an architectural defect.

Happy to change the wording if you tell me what you prefer :-)

> 
> > +		 */
> > +		if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
> > +			kvm_err("Detected illegal AArch32 mode at EL0, exiting.");
> 
> Same remark as above. Userspace has access to PSTATE and can work out
> the issue by itself.

Okay.

> 
> > +			ret = ARM_EXCEPTION_IL;
> 
> This will cause the thread to return to userspace after having done a
> vcpu_put(). So why don't you just mark the vcpu as uninitialized before
> returning to userspace? It already is in an illegal state, and the only
> reasonable thing userspace can do is to reset it.

Because I probably didn't navigate my way correctly around the code. Mind
expanding on how to mark the vcpu as uninitialized? I have tried two ways
to that effect but they were really horrible, so I will abstain from sharing :-)

> 
> This way, we keep the horror in a single place, and force userspace to
> take action if it really wants to recover the guest.

Happy to try it out.

Thanks

--
Qais Yousef


* Re: [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu
  2020-10-09  9:33         ` Catalin Marinas
  2020-10-09  9:42           ` Greg Kroah-Hartman
@ 2020-10-09 11:31           ` Qais Yousef
  2020-10-09 12:40             ` Catalin Marinas
  1 sibling, 1 reply; 33+ messages in thread
From: Qais Yousef @ 2020-10-09 11:31 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Morten Rasmussen, Peter Zijlstra, linux-arch,
	Marc Zyngier, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel

On 10/09/20 10:33, Catalin Marinas wrote:
> On Fri, Oct 09, 2020 at 09:31:47AM +0100, Will Deacon wrote:
> > Honestly, I don't understand why we're trying to hide this asymmetry from
> > userspace by playing games with affinity masks in the kernel. Userspace
> > is likely to want to move things about _anyway_ because even amongst the
> > 32-bit capable cores, you may well have different clock frequencies to
> > contend with.
> > 
> > So I'd be *much* happier to let the scheduler do its thing, and if one
> > of these 32-bit tasks ends up on a core that can't deal with it, then
> > tough, it gets killed. Give userspace the information it needs to avoid
> > that happening in the first place, rather than implicitly limit the mask.
> > 
> > That way, the kernel support really boils down to two parts:
> > 
> >   1. Remove the sanity checks we have to prevent 32-bit applications running
> >      on asymmetric systems
> > 
> >   2. Tell userspace about the problem
> 
> This works for me as well as long as it is default off with a knob to
> turn it on. I'd prefer a sysctl (which can be driven from the command
> > line in recent kernels IIRC) so that one can play with it at run-time.
> This way it's also a userspace choice and not an admin or whoever
> controls the cmdline (well, that's rather theoretical since the target
> is Android).

I like the cmdline option more. It implies a custom bootloader and user space
are required to enable this, which in turn implies they can write their own
custom driver to manage exporting this info to user-space, relieving us from
maintaining any ABI in the mainline kernel.

Thanks

--
Qais Yousef


* Re: [RFC PATCH 1/3] arm64: kvm: Handle Asymmetric AArch32 systems
  2020-10-09  9:58     ` Qais Yousef
@ 2020-10-09 12:34       ` Marc Zyngier
  2020-10-09 12:48         ` Qais Yousef
  0 siblings, 1 reply; 33+ messages in thread
From: Marc Zyngier @ 2020-10-09 12:34 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Catalin Marinas, Will Deacon, Peter Zijlstra (Intel),
	Morten Rasmussen, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel, linux-arch

On 2020-10-09 10:58, Qais Yousef wrote:

[...]

>> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> > index b588c3b5c2f0..22ff3373d855 100644
>> > --- a/arch/arm64/kvm/arm.c
>> > +++ b/arch/arm64/kvm/arm.c
>> > @@ -644,6 +644,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>> >  	struct kvm_run *run = vcpu->run;
>> >  	int ret;
>> >
>> > +	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
>> > +		kvm_err("Illegal AArch32 mode at EL0, can't run.");
>> 
>> No, we don't scream on the console in an uncontrolled way based on
>> illegal user input (yes, the VM *is* userspace).
> 
> It seemed kind to print a good reason of what just happened.

I'm afraid it only serves as an instrument to spam the console. Userspace
gave you an illegal state, you respond with an error. The error is, on
its own, descriptive enough. In general, we only print on the console
when KVM is faced with an internal error of some sort. That's not the
case here.

> 
>> 
>> Furthermore, you seem to deal with the same problem *twice*. See below.
> 
> It's done below because we could loop back into the guest again, so we
> force an exit then. Here to make sure if the VMM ignores the error
> value we returned earlier it can't force its way back in again.

Which we already handle if you do what I hinted at below.

>> 
>> > +		return -ENOEXEC;
>> > +	}
>> > +
>> >  	if (unlikely(!kvm_vcpu_initialized(vcpu)))
>> >  		return -ENOEXEC;
>> >
>> > @@ -804,6 +809,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>> >
>> >  		preempt_enable();
>> >
>> > +		/*
>> > +		 * For asym aarch32 systems we present a 64bit only system to
>> > +		 * the guest. But in case it managed somehow to escape that and
>> > +		 * enter 32bit mode, catch that and prevent it from running
>> > +		 * again.
>> 
>> The guest didn't *escape* anything. It merely used the CPU as
>> designed. The fact that the hypervisor cannot prevent the guest from
>> using AArch32 is an architectural defect.
> 
> Happy to change the wording if you tell me what you prefer :-)

"The ARMv8 architecture doesn't give the hypervisor a mechanism to
 prevent a guest from dropping to AArch32 EL0 if implemented by the CPU.
 If we spot the guest in such a state and decide it wasn't supposed to
 do so (like with the asymmetric AArch32 case), return to userspace
 with a fatal error."

> 
>> 
>> > +		 */
>> > +		if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
>> > +			kvm_err("Detected illegal AArch32 mode at EL0, exiting.");
>> 
>> Same remark as above. Userspace has access to PSTATE and can work out
>> the issue by itself.
> 
> Okay.
> 
>> 
>> > +			ret = ARM_EXCEPTION_IL;
>> 
>> This will cause the thread to return to userspace after having done a
>> vcpu_put(). So why don't you just mark the vcpu as uninitialized
>> before returning to userspace? It already is in an illegal state, and
>> the only reasonable thing userspace can do is to reset it.
> 
> Because I probably didn't navigate my way correctly around the code.
> Mind expanding how to mark the vcpu as uninitialized? I have tried
> 2 ways in that effect but they were really horrible, so will abstain
> from sharing :-)

You can try setting vcpu->arch.target to -1, which is already caught by
kvm_vcpu_initialized() right at the top of this function. This will
prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.

         M.
-- 
Jazz is not dead. It just smells funny...


* Re: [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu
  2020-10-09 11:31           ` Qais Yousef
@ 2020-10-09 12:40             ` Catalin Marinas
  2020-10-13 14:23               ` Qais Yousef
  0 siblings, 1 reply; 33+ messages in thread
From: Catalin Marinas @ 2020-10-09 12:40 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Will Deacon, Morten Rasmussen, Peter Zijlstra, linux-arch,
	Marc Zyngier, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel

On Fri, Oct 09, 2020 at 12:31:56PM +0100, Qais Yousef wrote:
> On 10/09/20 10:33, Catalin Marinas wrote:
> > On Fri, Oct 09, 2020 at 09:31:47AM +0100, Will Deacon wrote:
> > > Honestly, I don't understand why we're trying to hide this asymmetry from
> > > userspace by playing games with affinity masks in the kernel. Userspace
> > > is likely to want to move things about _anyway_ because even amongst the
> > > 32-bit capable cores, you may well have different clock frequencies to
> > > contend with.
> > > 
> > > So I'd be *much* happier to let the schesduler do its thing, and if one
> > > of these 32-bit tasks ends up on a core that can't deal with it, then
> > > tough, it gets killed. Give userspace the information it needs to avoid
> > > that happening in the first place, rather than implicitly limit the mask.
> > > 
> > > That way, the kernel support really boils down to two parts:
> > > 
> > >   1. Remove the sanity checks we have to prevent 32-bit applications running
> > >      on asymmetric systems
> > > 
> > >   2. Tell userspace about the problem
> > 
> > This works for me as well as long as it is default off with a knob to
> > turn it on. I'd prefer a sysctl (which can be driven from the command
> > line in recent kernels IIRC) so that one can play with it at run-time.
> > This way it's also a userspace choice and not an admin or whoever
> > controls the cmdline (well, that's rather theoretical since the target
> > is Android).
> 
> I like the cmdline option more. It implies a custom bootloader and user space
> are required to enable this, which in turn implies they can write their own
> custom driver to manage exporting this info to user-space, relieving us from
> maintaining any ABI in the mainline kernel.

Regardless of whether it's cmdline or sysctl, I'm strongly opposed to
custom drivers for exposing this information to userspace. It leads to
custom, incompatible ABIs scattered around.

Note that userspace can already check the MIDR_EL1 value if it knows which
CPU type and revision has 32-bit support.

-- 
Catalin


* Re: [RFC PATCH 1/3] arm64: kvm: Handle Asymmetric AArch32 systems
  2020-10-09 12:34       ` Marc Zyngier
@ 2020-10-09 12:48         ` Qais Yousef
  2020-10-12 15:32           ` James Morse
  0 siblings, 1 reply; 33+ messages in thread
From: Qais Yousef @ 2020-10-09 12:48 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Catalin Marinas, Will Deacon, Peter Zijlstra (Intel),
	Morten Rasmussen, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel, linux-arch

On 10/09/20 13:34, Marc Zyngier wrote:
> On 2020-10-09 10:58, Qais Yousef wrote:
> 
> [...]
> 
> > > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > > > index b588c3b5c2f0..22ff3373d855 100644
> > > > --- a/arch/arm64/kvm/arm.c
> > > > +++ b/arch/arm64/kvm/arm.c
> > > > @@ -644,6 +644,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > > >  	struct kvm_run *run = vcpu->run;
> > > >  	int ret;
> > > >
> > > > +	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
> > > > +		kvm_err("Illegal AArch32 mode at EL0, can't run.");
> > > 
> > > No, we don't scream on the console in an uncontrolled way based on
> > > illegal user input (yes, the VM *is* userspace).
> > 
> > It seemed kind to print a good reason for what just happened.
> 
> I'm afraid it only serves as an instrument to spam the console. Userspace
> gave you an illegal state, you respond with an error. The error is, on
> its own, descriptive enough. In general, we only print on the console
> when KVM is faced with an internal error of some sort. That's not the
> case here.

Fair enough.

> 
> > 
> > > 
> > > Furthermore, you seem to deal with the same problem *twice*. See
> > > below.
> > 
> > It's done below because we could loop back into the guest again, so we
> > force an exit then. The check here makes sure that if the VMM ignores
> > the error value we returned earlier, it can't force its way back in
> > again.
> 
> Which we already handle if you do what I hinted at below.
> 
> > > 
> > > > +		return -ENOEXEC;
> > > > +	}
> > > > +
> > > >  	if (unlikely(!kvm_vcpu_initialized(vcpu)))
> > > >  		return -ENOEXEC;
> > > >
> > > > @@ -804,6 +809,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > > >
> > > >  		preempt_enable();
> > > >
> > > > +		/*
> > > > +		 * For asym aarch32 systems we present a 64bit only system to
> > > > +		 * the guest. But in case it managed somehow to escape that and
> > > > +		 * enter 32bit mode, catch that and prevent it from running
> > > > +		 * again.
> > > 
> > > The guest didn't *escape* anything. It merely used the CPU as
> > > designed.
> > > The fact that the hypervisor cannot prevent the guest from using
> > > AArch32
> > > is an architectural defect.
> > 
> > Happy to change the wording if you tell me what you prefer :-)
> 
> "The ARMv8 architecture doesn't give the hypervisor a mechanism to prevent
>  a guest from dropping to AArch32 EL0 if implemented by the CPU. If we spot
>  the guest in such a state and decide it wasn't supposed to do so
>  (like with the asymmetric AArch32 case), return to userspace with a fatal
>  error."

Thanks.

> > Because I probably didn't navigate my way correctly around the code.
> > Mind expanding on how to mark the vcpu as uninitialized? I have tried
> > two ways to that effect but they were really horrible, so I will
> > abstain from sharing :-)
> 
> You can try setting vcpu->arch.target to -1, which is already caught by
> kvm_vcpu_initialized() right at the top of this function. This will
> prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.

That's actually one of the things I tried before ending up with this patch.
I thought it was really aggressive and unfriendly.

It does indeed cause qemu to abort out completely.

> +                       /* Reset target so it won't run again */
> +                       vcpu->arch.target = -1;
>
>       # ./test
>       error: kvm run failed Invalid argument
>        PC=ffff80001093ad64 X00=ffff80001686d014 X01=ffff80001093ad40
>       X02=3fa71c3de1990200 X03=0000000000000030 X04=0000000000000000
>       X05=0000000000000000 X06=ffff0000767d3c06 X07=74246e6873716875
>       X08=7f7f7f7f7f7f7f7f X09=0400000000000001 X10=0101010101010101
>       X11=0000000000000001 X12=ffff800016887000 X13=ffff0000767d3c07
>       X14=ffff0000767d3c08 X15=ffff80001618a988 X16=000000000000000e
>       X17=0000000000000000 X18=ffffffffffffffff X19=ffff00007ad99800
>       X20=ffff8000163d20b8 X21=0000000000000000 X22=ffff00007ad99810
>       X23=ffff8000163d2270 X24=ffff8000163d22d0 X25=0000000000000000
>       X26=ffff800012bb5b80 X27=ffff800012cc1068 X28=0000000000000007
>       X29=ffff80001668bab0 X30=ffff8000109367c8  SP=ffff80001668bab0
>       PSTATE=80000005 N--- EL1h
>       qemu-system-aarch64: Failed to get KVM_REG_ARM_TIMER_CNT
>       Aborted

Thanks

--
Qais Yousef


* Re: [RFC PATCH 2/3] arm64: Add support for asymmetric AArch32 EL0 configurations
  2020-10-08 18:22   ` Randy Dunlap
@ 2020-10-12 10:22     ` Qais Yousef
  0 siblings, 0 replies; 33+ messages in thread
From: Qais Yousef @ 2020-10-12 10:22 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Catalin Marinas, Will Deacon, Marc Zyngier,
	Peter Zijlstra (Intel),
	Morten Rasmussen, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel, linux-arch

On 10/08/20 11:22, Randy Dunlap wrote:
> On 10/8/20 11:16 AM, Qais Yousef wrote:
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 6d232837cbee..591853504dc4 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -1868,6 +1868,20 @@ config DMI
> >  
> >  endmenu
> >  
> > +config ASYMMETRIC_AARCH32
> > +	bool "Allow support for asymmetric AArch32 support"
> 
> Please drop one "support" or reword the prompt string.

Thanks Randy. It now reads

	"Allow support for asymmetric AArch32 systems"

Cheers

--
Qais Yousef

> 
> > +	depends on COMPAT && EXPERT
> > +	help
> > +	  Enable this option to allow support for asymmetric AArch32 EL0
> > +	  CPU configurations. Once the AArch32 EL0 support is detected
> > +	  on a CPU, the feature is made available to user space to allow
> > +	  the execution of 32-bit (compat) applications. If the affinity
> > +	  of the 32-bit application contains a non-AArch32 capable CPU
> > +	  or the last AArch32 capable CPU is offlined, the application
> > +	  will be killed.
> > +
> > +	  If unsure say N.
> 
> 
> -- 
> ~Randy
> 


* Re: [RFC PATCH 2/3] arm64: Add support for asymmetric AArch32 EL0 configurations
  2020-10-09  9:39   ` Catalin Marinas
@ 2020-10-12 12:46     ` Qais Yousef
  0 siblings, 0 replies; 33+ messages in thread
From: Qais Yousef @ 2020-10-12 12:46 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Marc Zyngier, Peter Zijlstra (Intel),
	Morten Rasmussen, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel, linux-arch

On 10/09/20 10:39, Catalin Marinas wrote:
> On Thu, Oct 08, 2020 at 07:16:40PM +0100, Qais Yousef wrote:
> > diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
> > index 7faae6ff3ab4..c920fa45e502 100644
> > --- a/arch/arm64/include/asm/cpu.h
> > +++ b/arch/arm64/include/asm/cpu.h
> > @@ -15,6 +15,7 @@
> >  struct cpuinfo_arm64 {
> >  	struct cpu	cpu;
> >  	struct kobject	kobj;
> > +	bool		aarch32_valid;
> 
> As I replied to Greg, I think we can drop this field entirely. But, of
> course, please check that the kernel doesn't get tainted if booting on a
> non-32-bit capable CPU.

Faking asymmetry on Juno where CPU0 (boot CPU) is not 32bit capable

	dmesg | grep -i taint

returns nothing.

> 
> >  void cpuinfo_store_cpu(void)
> >  {
> >  	struct cpuinfo_arm64 *info = this_cpu_ptr(&cpu_data);
> >  	__cpuinfo_store_cpu(info);
> > +	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0))
> > +		__cpuinfo_store_cpu_32bit(info);

 >>>>>>>

> > +	/*
> > +	 * With asymmetric AArch32 support, populate the boot CPU information
> > +	 * on the first 32-bit capable secondary CPU if the primary one
> > +	 * skipped this step.
> > +	 */
> > +	if (IS_ENABLED(CONFIG_ASYMMETRIC_AARCH32) &&
> > +	    !boot_cpu_data.aarch32_valid &&
> > +	    id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) {
> > +		__cpuinfo_store_cpu_32bit(&boot_cpu_data);
> > +		init_cpu_32bit_features(&boot_cpu_data);
> > +	}

<<<<<<<

> 
> Same here, we can probably drop the boot_cpu_data update here.

Removed the block above.

> 
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index 077293b5115f..0b9aaee1df59 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -1131,6 +1131,16 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
> >  		if (!vcpu_has_sve(vcpu))
> >  			val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);
> >  		val &= ~(0xfUL << ID_AA64PFR0_AMU_SHIFT);
> > +
> > +		if (!system_supports_sym_32bit_el0()) {
> > +			/*
> > +			 * We could be running on asym aarch32 system.
> > +			 * Override to present a aarch64 only system.
> > +			 */
> > +			val &= ~(0xfUL << ID_AA64PFR0_EL0_SHIFT);
> > +			val |= (ID_AA64PFR0_EL0_64BIT_ONLY << ID_AA64PFR0_EL0_SHIFT);
> > +		}
> 
> With the sanitised registers using the lowest value of this field, I
> think we no longer need this explicit masking.

Indeed. Removed.

Thanks

--
Qais Yousef


* Re: [RFC PATCH 1/3] arm64: kvm: Handle Asymmetric AArch32 systems
  2020-10-09 12:48         ` Qais Yousef
@ 2020-10-12 15:32           ` James Morse
  2020-10-13 10:32             ` Marc Zyngier
  0 siblings, 1 reply; 33+ messages in thread
From: James Morse @ 2020-10-12 15:32 UTC (permalink / raw)
  To: Qais Yousef, Marc Zyngier
  Cc: linux-arch, Will Deacon, Peter Zijlstra (Intel),
	Catalin Marinas, Greg Kroah-Hartman, Linus Torvalds,
	Morten Rasmussen, linux-arm-kernel

Hi Marc, Qais,

On 09/10/2020 13:48, Qais Yousef wrote:
> On 10/09/20 13:34, Marc Zyngier wrote:
>> On 2020-10-09 10:58, Qais Yousef wrote:

>>>>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>>>>> index b588c3b5c2f0..22ff3373d855 100644
>>>>> --- a/arch/arm64/kvm/arm.c
>>>>> +++ b/arch/arm64/kvm/arm.c
>>>>> @@ -644,6 +644,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>>>>>  	struct kvm_run *run = vcpu->run;
>>>>>  	int ret;
>>>>>
>>>>> +	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
>>>>> +		kvm_err("Illegal AArch32 mode at EL0, can't run.");
>>>>
>>>> No, we don't scream on the console in an uncontrolled way based on
>>>> illegal user input (yes, the VM *is* userspace).
>>>
>>> It seemed kind to print a good reason for what just happened.

>>>> Furthermore, you seem to deal with the same problem *twice*. See
>>>> below.
>>>
>>> It's done below because we could loop back into the guest again, so we
>>> force an exit then. The check here makes sure that if the VMM ignores
>>> the error value we returned earlier, it can't force its way back in
>>> again.

>> Which we already handle if you do what I hinted at below.

Do we trust the VMM not to try and get out of this?

We sanity-check the SPSR values the VMM writes via set_one_reg() to prevent aarch32 on
systems that don't support it. It seems strange that if you can get the bad value out of
hardware, you can keep it.
Returning to aarch32 from EL2 on a CPU that doesn't support it is terrifying.

To avoid always testing on entry from user-space, we could add a
'vmm-fixed-bad-value-from-hardware' request type, and refactor check_vcpu_requests() to
allow it to force a guest exit. Any KVM_EXIT_FAIL_ENTRY can set this to ensure the VMM
doesn't have its fingers in its ears.

This means the VMM can fix this by set_one_reg()ing an exception to aarch64 if it really
wants to, but it can't restart the guest with the bad SPSR value.


>>>>> +		return -ENOEXEC;
>>>>> +	}
>>>>> +
>>>>>  	if (unlikely(!kvm_vcpu_initialized(vcpu)))
>>>>>  		return -ENOEXEC;
>>>>>
>>>>> @@ -804,6 +809,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>>>>>
>>>>>  		preempt_enable();
>>>>>
>>>>> +		/*
>>>>> +		 * For asym aarch32 systems we present a 64bit only system to
>>>>> +		 * the guest. But in case it managed somehow to escape that and
>>>>> +		 * enter 32bit mode, catch that and prevent it from running
>>>>> +		 * again.
>>>>
>>>> The guest didn't *escape* anything. It merely used the CPU as
>>>> designed.
>>>> The fact that the hypervisor cannot prevent the guest from using
>>>> AArch32
>>>> is an architectural defect.

>>> Because I probably didn't navigate my way correctly around the code.
>>> Mind expanding on how to mark the vcpu as uninitialized? I have tried
>>> two ways to that effect but they were really horrible, so I will
>>> abstain from sharing :-)
>>
>> You can try setting vcpu->arch.target to -1, which is already caught by
>> kvm_vcpu_initialized() right at the top of this function. This will

>> prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.

This doesn't reset SPSR, so this lets the VMM restart the guest with a bad value.

I think we should make it impossible to return to aarch32 from EL2 on these systems.


Thanks,

James


> That's actually one of the things I tried before ending up with this patch.
> I thought it was really aggressive and unfriendly.
> 
> It does indeed cause qemu to abort out completely.
> 
>> +                       /* Reset target so it won't run again */
>> +                       vcpu->arch.target = -1;
>>


* Re: [RFC PATCH 1/3] arm64: kvm: Handle Asymmetric AArch32 systems
  2020-10-12 15:32           ` James Morse
@ 2020-10-13 10:32             ` Marc Zyngier
  2020-10-13 11:51               ` James Morse
  0 siblings, 1 reply; 33+ messages in thread
From: Marc Zyngier @ 2020-10-13 10:32 UTC (permalink / raw)
  To: James Morse
  Cc: Qais Yousef, linux-arch, Will Deacon, Peter Zijlstra (Intel),
	Catalin Marinas, Greg Kroah-Hartman, Linus Torvalds,
	Morten Rasmussen, linux-arm-kernel

Hi James,

On 2020-10-12 16:32, James Morse wrote:
> Hi Marc, Qais,
> 
> On 09/10/2020 13:48, Qais Yousef wrote:
>> On 10/09/20 13:34, Marc Zyngier wrote:
>>> On 2020-10-09 10:58, Qais Yousef wrote:
> 
>>>>>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>>>>>> index b588c3b5c2f0..22ff3373d855 100644
>>>>>> --- a/arch/arm64/kvm/arm.c
>>>>>> +++ b/arch/arm64/kvm/arm.c
>>>>>> @@ -644,6 +644,11 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu 
>>>>>> *vcpu)
>>>>>>  	struct kvm_run *run = vcpu->run;
>>>>>>  	int ret;
>>>>>> 
>>>>>> +	if (!system_supports_32bit_el0() && vcpu_mode_is_32bit(vcpu)) {
>>>>>> +		kvm_err("Illegal AArch32 mode at EL0, can't run.");
>>>>> 
>>>>> No, we don't scream on the console in an uncontrolled way based on
>>>>> illegal user input (yes, the VM *is* userspace).
>>>> 
>>>> It seemed kind to print a good reason for what just happened.
> 
>>>>> Furthermore, you seem to deal with the same problem *twice*. See
>>>>> below.
>>>> 
>>>> It's done below because we could loop back into the guest again, so we
>>>> force an exit then. The check here makes sure that if the VMM ignores
>>>> the error value we returned earlier, it can't force its way back in
>>>> again.
> 
>>> Which we already handle if you do what I hinted at below.
> 
> Do we trust the VMM not to try and get out of this?

I usually don't put the words trust and VMM in the same sentence...

> We sanity-check the SPSR values the VMM writes via set_one_reg() to
> prevent aarch32 on systems that don't support it. It seems strange
> that if you can get the bad value out of hardware, you can keep it.

The guest went into an illegal state, and we return that illegal state
to userspace. Garbage in, garbage out. I'm not too bothered about that,
but we can always sanity-check it.

> Returning to aarch32 from EL2 on a CPU that doesn't support it is 
> terrifying.

And I'm not proposing that we even try.

> 
> To avoid always testing on entry from user-space, we could add a
> 'vmm-fixed-bad-value-from-hardware' request type, and refactor
> check_vcpu_requests() to
> allow it to force a guest exit. Any KVM_EXIT_FAIL_ENTRY can set this
> to ensure the VMM
> doesn't have its fingers in its ears.
> 
> This means the VMM can fix this by set_one_reg()ing an exception to
> aarch64 if it really
> wants to, but it can't restart the guest with the bad SPSR value.
> 
> 
>>>>>> +		return -ENOEXEC;
>>>>>> +	}
>>>>>> +
>>>>>>  	if (unlikely(!kvm_vcpu_initialized(vcpu)))
>>>>>>  		return -ENOEXEC;
>>>>>> 
>>>>>> @@ -804,6 +809,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu 
>>>>>> *vcpu)
>>>>>> 
>>>>>>  		preempt_enable();
>>>>>> 
>>>>>> +		/*
>>>>>> +		 * For asym aarch32 systems we present a 64bit only system to
>>>>>> +		 * the guest. But in case it managed somehow to escape that and
>>>>>> +		 * enter 32bit mode, catch that and prevent it from running
>>>>>> +		 * again.
>>>>> 
>>>>> The guest didn't *escape* anything. It merely used the CPU as
>>>>> designed.
>>>>> The fact that the hypervisor cannot prevent the guest from using
>>>>> AArch32
>>>>> is an architectural defect.
> 
>>>> Because I probably didn't navigate my way correctly around the code.
>>>> Mind
>>>> expanding how to mark the vcpu as uninitialized? I have tried 2 ways
>>>> in that effect but they were really horrible, so will abstain from
>>>> sharing :-)
>>> 
>>> You can try setting vcpu->arch.target to -1, which is already caught 
>>> by
>>> kvm_vcpu_initialized() right at the top of this function. This will
> 
>>> prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.
> 
> This doesn't reset SPSR, so this lets the VMM restart the guest with a
> bad value.

That's not my reading of the code. Without a valid target, you cannot
enter the guest (kvm_vcpu_initialized() prevents you from doing so). To
set the target, you need to issue a KVM_ARM_VCPU_INIT ioctl, which
eventually calls kvm_reset_vcpu(), which does set PSTATE to something
legal.

Or do you mean the guest's SPSR_EL1? Why would that matter?

> I think we should make it impossible to return to aarch32 from EL2 on
> these systems.

And I'm saying that the above fulfills that requirement. Am I missing
something obvious?

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...


* Re: [RFC PATCH 1/3] arm64: kvm: Handle Asymmetric AArch32 systems
  2020-10-13 10:32             ` Marc Zyngier
@ 2020-10-13 11:51               ` James Morse
  2020-10-13 11:59                 ` Qais Yousef
  0 siblings, 1 reply; 33+ messages in thread
From: James Morse @ 2020-10-13 11:51 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Qais Yousef, linux-arch, Will Deacon, Peter Zijlstra (Intel),
	Catalin Marinas, Greg Kroah-Hartman, Linus Torvalds,
	Morten Rasmussen, linux-arm-kernel

Hi Marc,

On 13/10/2020 11:32, Marc Zyngier wrote:
> On 2020-10-12 16:32, James Morse wrote:
>> On 09/10/2020 13:48, Qais Yousef wrote:
>>> On 10/09/20 13:34, Marc Zyngier wrote:
>>>> You can try setting vcpu->arch.target to -1, which is already caught by
>>>> kvm_vcpu_initialized() right at the top of this function. This will
>>
>>>> prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.
>>
>> This doesn't reset SPSR, so this lets the VMM restart the guest with a
>> bad value.
> 
> That's not my reading of the code. Without a valid target, you cannot enter
> the guest (kvm_vcpu_initialized() prevents you from doing so). To set the target,
> you need to issue a KVM_ARM_VCPU_INIT ioctl, which eventually calls

> kvm_reset_vcpu(), which does set PSTATE to something legal.

aha! So it does, this is what I was missing.


> Or do you mean the guest's SPSR_EL1? Why would that matter?
> 
>> I think we should make it impossible to return to aarch32 from EL2 on
>> these systems.
> 
> And I'm saying that the above fulfills that requirement. Am I missing
> something obvious?

Nope, I was.

I agree the check on entry from user-space isn't needed.


Thanks,

James


* Re: [RFC PATCH 1/3] arm64: kvm: Handle Asymmetric AArch32 systems
  2020-10-13 11:51               ` James Morse
@ 2020-10-13 11:59                 ` Qais Yousef
  2020-10-13 12:09                   ` Marc Zyngier
  0 siblings, 1 reply; 33+ messages in thread
From: Qais Yousef @ 2020-10-13 11:59 UTC (permalink / raw)
  To: James Morse
  Cc: Marc Zyngier, linux-arch, Will Deacon, Peter Zijlstra (Intel),
	Catalin Marinas, Greg Kroah-Hartman, Linus Torvalds,
	Morten Rasmussen, linux-arm-kernel

On 10/13/20 12:51, James Morse wrote:
> Hi Marc,
> 
> On 13/10/2020 11:32, Marc Zyngier wrote:
> > On 2020-10-12 16:32, James Morse wrote:
> >> On 09/10/2020 13:48, Qais Yousef wrote:
> >>> On 10/09/20 13:34, Marc Zyngier wrote:
> >>>> You can try setting vcpu->arch.target to -1, which is already caught by
> >>>> kvm_vcpu_initialized() right at the top of this function. This will
> >>
> >>>> prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.
> >>
> >> This doesn't reset SPSR, so this lets the VMM restart the guest with a
> >> bad value.
> > 
> > That's not my reading of the code. Without a valid target, you cannot enter
> > the guest (kvm_vcpu_initialized() prevents you from doing so). To set the target,
> > you need to issue a KVM_ARM_VCPU_INIT ioctl, which eventually calls
> 
> > kvm_reset_vcpu(), which does set PSTATE to something legal.
> 
> aha! So it does, this is what I was missing.
> 
> 
> > Or do you mean the guest's SPSR_EL1? Why would that matter?
> > 
> >> I think we should make it impossible to return to aarch32 from EL2 on
> >> these systems.
> > 
> > And I'm saying that the above fulfills that requirement. Am I missing
> > something obvious?
> 
> Nope, I was.
> 
> I agree the check on entry from user-space isn't needed.

Thanks both.

So using vcpu->arch.target = -1 is the best way forward. In my experiments,
when I did that I considered calling kvm_reset_vcpu() too. Does it make sense
to force the reset here as well, or is that too heavy-handed?

Thanks

--
Qais Yousef


* Re: [RFC PATCH 1/3] arm64: kvm: Handle Asymmetric AArch32 systems
  2020-10-13 11:59                 ` Qais Yousef
@ 2020-10-13 12:09                   ` Marc Zyngier
  2020-10-13 12:16                     ` Qais Yousef
  0 siblings, 1 reply; 33+ messages in thread
From: Marc Zyngier @ 2020-10-13 12:09 UTC (permalink / raw)
  To: Qais Yousef
  Cc: James Morse, linux-arch, Will Deacon, Peter Zijlstra (Intel),
	Catalin Marinas, Greg Kroah-Hartman, Linus Torvalds,
	Morten Rasmussen, linux-arm-kernel

On 2020-10-13 12:59, Qais Yousef wrote:
> On 10/13/20 12:51, James Morse wrote:
>> Hi Marc,
>> 
>> On 13/10/2020 11:32, Marc Zyngier wrote:
>> > On 2020-10-12 16:32, James Morse wrote:
>> >> On 09/10/2020 13:48, Qais Yousef wrote:
>> >>> On 10/09/20 13:34, Marc Zyngier wrote:
>> >>>> You can try setting vcpu->arch.target to -1, which is already caught by
>> >>>> kvm_vcpu_initialized() right at the top of this function. This will
>> >>
>> >>>> prevent any reentry unless the VMM issues a KVM_ARM_VCPU_INIT ioctl.
>> >>
>> >> This doesn't reset SPSR, so this lets the VMM restart the guest with a
>> >> bad value.
>> >
>> > That's not my reading of the code. Without a valid target, you cannot enter
>> > the guest (kvm_vcpu_initialized() prevents you from doing so). To set the target,
>> > you need to issue a KVM_ARM_VCPU_INIT ioctl, which eventually calls
>> 
>> > kvm_reset_vcpu(), which does set PSTATE to something legal.
>> 
>> aha! So it does, this is what I was missing.
>> 
>> 
>> > Or do you mean the guest's SPSR_EL1? Why would that matter?
>> >
>> >> I think we should make it impossible to return to aarch32 from EL2 on
>> >> these systems.
>> >
>> > And I'm saying that the above fulfills that requirement. Am I missing
>> > something obvious?
>> 
>> Nope, I was.
>> 
>> I agree the check on entry from user-space isn't needed.
> 
> Thanks both.
> 
> So using vcpu->arch.target = -1 is the best way forward. In my experiments,
> when I did that I considered calling kvm_reset_vcpu() too. Does it make
> sense to force the reset here as well, or is that too heavy-handed?

No, userspace should know about it and take action if it wants to.
Trying to fix it behind the scenes is setting expectations, which
I'd really like to avoid.

         M.
-- 
Jazz is not dead. It just smells funny...


* Re: [RFC PATCH 1/3] arm64: kvm: Handle Asymmetric AArch32 systems
  2020-10-13 12:09                   ` Marc Zyngier
@ 2020-10-13 12:16                     ` Qais Yousef
  0 siblings, 0 replies; 33+ messages in thread
From: Qais Yousef @ 2020-10-13 12:16 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: James Morse, linux-arch, Will Deacon, Peter Zijlstra (Intel),
	Catalin Marinas, Greg Kroah-Hartman, Linus Torvalds,
	Morten Rasmussen, linux-arm-kernel

On 10/13/20 13:09, Marc Zyngier wrote:
> On 2020-10-13 12:59, Qais Yousef wrote:
> > Thanks both.
> > 
> > So using vcpu->arch.target = -1 is the best way forward. In my experiments,
> > when I did that I considered calling kvm_reset_vcpu() too. Does it make
> > sense to force the reset here as well, or is that too heavy-handed?
> 
> No, userspace should know about it and take action if it wants to.
> Trying to fix it behind the scenes is setting expectations, which
> I'd really like to avoid.

Cool I thought so but I wanted to hear it directly :-)

Thanks!

--
Qais Yousef


* Re: [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu
  2020-10-09 12:40             ` Catalin Marinas
@ 2020-10-13 14:23               ` Qais Yousef
  0 siblings, 0 replies; 33+ messages in thread
From: Qais Yousef @ 2020-10-13 14:23 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Morten Rasmussen, Peter Zijlstra, linux-arch,
	Marc Zyngier, Greg Kroah-Hartman, Linus Torvalds,
	linux-arm-kernel

On 10/09/20 13:40, Catalin Marinas wrote:
> On Fri, Oct 09, 2020 at 12:31:56PM +0100, Qais Yousef wrote:
> > On 10/09/20 10:33, Catalin Marinas wrote:
> > > On Fri, Oct 09, 2020 at 09:31:47AM +0100, Will Deacon wrote:
> > > > Honestly, I don't understand why we're trying to hide this asymmetry from
> > > > userspace by playing games with affinity masks in the kernel. Userspace
> > > > is likely to want to move things about _anyway_ because even amongst the
> > > > 32-bit capable cores, you may well have different clock frequencies to
> > > > contend with.
> > > > 
> > > > So I'd be *much* happier to let the scheduler do its thing, and if one
> > > > of these 32-bit tasks ends up on a core that can't deal with it, then
> > > > tough, it gets killed. Give userspace the information it needs to avoid
> > > > that happening in the first place, rather than implicitly limit the mask.
> > > > 
> > > > That way, the kernel support really boils down to two parts:
> > > > 
> > > >   1. Remove the sanity checks we have to prevent 32-bit applications running
> > > >      on asymmetric systems
> > > > 
> > > >   2. Tell userspace about the problem
> > > 
> > > This works for me as well as long as it is default off with a knob to
> > > turn it on. I'd prefer a sysctl (which can be driven from the command
> > > line in recent kernels IIRC) so that one can play with it at run-time.
> > > This way it's also a userspace choice and not an admin or whoever
> > > controls the cmdline (well, that's rather theoretical since the target
> > > is Android).
> > 
> > I like the cmdline option more. It implies a custom bootloader and user space
> > are required to enable this, which in turn implies they can write their own
> > custom driver to manage exporting this info to user-space, relieving us from
> > maintaining any ABI in the mainline kernel.
> 
> Regardless of whether it's cmdline or sysctl, I'm strongly opposed to
> custom drivers for exposing this information to userspace. It leads to
> custom, incompatible ABIs scattered around.

Yeah I see what you mean.

So for now I'll remove the Kconfig dependency and replace it with a sysctl
instead such that system_supports_asym_32bit_el0() only returns true if the
system is asym AND the sysctl is true.

Thanks

--
Qais Yousef

> Note that userspace can already check the MIDR_EL1 value if it knows which
> CPU type and revision has 32-bit support.
> 
> -- 
> Catalin


end of thread

Thread overview: 33+ messages
2020-10-08 18:16 [RFC PATCH 0/3] Add support for Asymmetric AArch32 systems Qais Yousef
2020-10-08 18:16 ` [RFC PATCH 1/3] arm64: kvm: Handle " Qais Yousef
2020-10-09  8:12   ` Marc Zyngier
2020-10-09  9:58     ` Qais Yousef
2020-10-09 12:34       ` Marc Zyngier
2020-10-09 12:48         ` Qais Yousef
2020-10-12 15:32           ` James Morse
2020-10-13 10:32             ` Marc Zyngier
2020-10-13 11:51               ` James Morse
2020-10-13 11:59                 ` Qais Yousef
2020-10-13 12:09                   ` Marc Zyngier
2020-10-13 12:16                     ` Qais Yousef
2020-10-08 18:16 ` [RFC PATCH 2/3] arm64: Add support for asymmetric AArch32 EL0 configurations Qais Yousef
2020-10-08 18:22   ` Randy Dunlap
2020-10-12 10:22     ` Qais Yousef
2020-10-09  6:13   ` Greg Kroah-Hartman
2020-10-09  8:40     ` Will Deacon
2020-10-09  8:50     ` Catalin Marinas
2020-10-09  9:39   ` Catalin Marinas
2020-10-12 12:46     ` Qais Yousef
2020-10-08 18:16 ` [RFC PATCH 3/3] arm64: Handle AArch32 tasks running on non AArch32 cpu Qais Yousef
2020-10-09  7:29   ` Peter Zijlstra
2020-10-09  8:13     ` Morten Rasmussen
2020-10-09  8:31       ` Will Deacon
2020-10-09  8:50         ` Morten Rasmussen
2020-10-09  9:33         ` Catalin Marinas
2020-10-09  9:42           ` Greg Kroah-Hartman
2020-10-09 11:31           ` Qais Yousef
2020-10-09 12:40             ` Catalin Marinas
2020-10-13 14:23               ` Qais Yousef
2020-10-09  9:25       ` Peter Zijlstra
2020-10-09  9:39         ` Qais Yousef
2020-10-09  9:51         ` Catalin Marinas
