linux-arm-kernel.lists.infradead.org archive mirror
* [RFC 0/7] arm64: provide pseudo NMI with GICv3
@ 2017-10-11 13:00 Julien Thierry
  2017-10-11 13:00 ` [RFC 1/7] arm64: irqflags: Reorder the fiq & async macros Julien Thierry
                   ` (8 more replies)
  0 siblings, 9 replies; 19+ messages in thread
From: Julien Thierry @ 2017-10-11 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

This series is a continuation of the work started by Daniel [1]. The goal is
to use GICv3 interrupt priorities to simulate an NMI.

To achieve this, we set two priorities: one for standard interrupts and
another, higher one, for NMIs. Whenever we want to disable interrupts, we
mask the standard priority instead, so NMIs can still be raised. Some
corner cases, however, still require masking all interrupts, effectively
disabling the NMI.

Of course, using priority masking instead of PSR.I comes at some cost. On
hackbench, I notice a ~6% performance drop. However, on a kernel build,
I do not see significant differences.

Currently, only PPIs and SPIs can be set as NMIs. Since IPIs are currently
hardcoded IRQ numbers, there is no generic interface to set SGIs as NMIs
for now. I don't think there is any reason to allow LPIs to be set as
NMIs, as they do not have an active state.
When an NMI is active on a CPU, no other NMI can be triggered on the CPU.

Requirements to use this:
- Have GICv3
- SCR_EL3.FIQ is set to 1 when linux runs
- Select Kernel Feature -> Use ICC system registers for IRQ masking

From my testing with the perf interrupt, there doesn't seem to be any major
issue left (but if you find some, do tell!).

* Patches 1 and 5 are minor utility changes.
* Patches 2 and 3 allow detecting and enabling the use of GICv3 system
  registers during boot time.
* Patch 4 introduces the masking of IRQs using priorities, replacing IRQ
  disabling.
* Patch 6 adds detection of the view Linux has of GICv3 priorities; without
  this we cannot easily mask specific priorities in an accurate manner.
* Patch 7 adds the support for NMIs


[1] http://www.spinics.net/lists/arm-kernel/msg525077.html

Thanks,

Julien Thierry


Daniel Thompson (4):
  arm64: irqflags: Reorder the fiq & async macros
  arm64: cpufeature: Allow early detect of specific features
  arm64: alternative: Apply alternatives early in boot process
  arm64: irqflags: Use ICC sysregs to implement IRQ masking

Julien Thierry (3):
  irqchip/gic: Add functions to access irq priorities
  arm64: Detect current view of GIC priorities
  arm64: Add support for pseudo-NMIs

 Documentation/arm64/booting.txt        |   5 +
 arch/arm64/Kconfig                     |  15 ++
 arch/arm64/include/asm/alternative.h   |   1 +
 arch/arm64/include/asm/arch_gicv3.h    |  24 +++
 arch/arm64/include/asm/assembler.h     |  33 +++-
 arch/arm64/include/asm/efi.h           |   5 +
 arch/arm64/include/asm/irqflags.h      | 143 +++++++++++++++-
 arch/arm64/include/asm/kvm_host.h      |  40 +++++
 arch/arm64/include/asm/processor.h     |   4 +
 arch/arm64/include/asm/ptrace.h        |  14 +-
 arch/arm64/include/asm/sysreg.h        |   1 +
 arch/arm64/kernel/alternative.c        |  39 ++++-
 arch/arm64/kernel/asm-offsets.c        |   1 +
 arch/arm64/kernel/cpufeature.c         |  67 +++++---
 arch/arm64/kernel/entry.S              | 132 ++++++++++++--
 arch/arm64/kernel/head.S               |  38 +++++
 arch/arm64/kernel/process.c            |   6 +
 arch/arm64/kernel/smp.c                |  14 ++
 arch/arm64/mm/proc.S                   |  23 +++
 drivers/irqchip/irq-gic-common.c       |  10 ++
 drivers/irqchip/irq-gic-common.h       |   2 +
 drivers/irqchip/irq-gic-v3-its.c       |   2 +-
 drivers/irqchip/irq-gic-v3.c           | 304 +++++++++++++++++++++++++++++----
 include/linux/interrupt.h              |   1 +
 include/linux/irqchip/arm-gic-common.h |   6 +
 include/linux/irqchip/arm-gic.h        |   5 -
 26 files changed, 845 insertions(+), 90 deletions(-)

--
1.9.1

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC 1/7] arm64: irqflags: Reorder the fiq & async macros
  2017-10-11 13:00 [RFC 0/7] arm64: provide pseudo NMI with GICv3 Julien Thierry
@ 2017-10-11 13:00 ` Julien Thierry
  2017-10-11 13:00 ` [RFC 2/7] arm64: cpufeature: Allow early detect of specific features Julien Thierry
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Julien Thierry @ 2017-10-11 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

From: Daniel Thompson <daniel.thompson@linaro.org>

Separate out the local fiq & async macros from the various arch inlines.
This makes it easier for us (in a later patch) to provide an alternative
implementation of these inlines.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/irqflags.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 8c58128..f367cbd 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -53,12 +53,6 @@ static inline void arch_local_irq_disable(void)
 		: "memory");
 }

-#define local_fiq_enable()	asm("msr	daifclr, #1" : : : "memory")
-#define local_fiq_disable()	asm("msr	daifset, #1" : : : "memory")
-
-#define local_async_enable()	asm("msr	daifclr, #4" : : : "memory")
-#define local_async_disable()	asm("msr	daifset, #4" : : : "memory")
-
 /*
  * Save the current interrupt enable state.
  */
@@ -90,6 +84,12 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
 	return flags & PSR_I_BIT;
 }

+#define local_fiq_enable()	asm("msr	daifclr, #1" : : : "memory")
+#define local_fiq_disable()	asm("msr	daifset, #1" : : : "memory")
+
+#define local_async_enable()	asm("msr	daifclr, #4" : : : "memory")
+#define local_async_disable()	asm("msr	daifset, #4" : : : "memory")
+
 /*
  * save and restore debug state
  */
--
1.9.1


* [RFC 2/7] arm64: cpufeature: Allow early detect of specific features
  2017-10-11 13:00 [RFC 0/7] arm64: provide pseudo NMI with GICv3 Julien Thierry
  2017-10-11 13:00 ` [RFC 1/7] arm64: irqflags: Reorder the fiq & async macros Julien Thierry
@ 2017-10-11 13:00 ` Julien Thierry
  2017-10-11 13:00 ` [RFC 3/7] arm64: alternative: Apply alternatives early in boot process Julien Thierry
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Julien Thierry @ 2017-10-11 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

From: Daniel Thompson <daniel.thompson@linaro.org>

Currently it is not possible to detect features of the boot CPU
until the other CPUs have been brought up.

This prevents us from reacting to features of the boot CPU until
fairly late in the boot process. To solve this we allow a subset
of features (that are likely to be common to all clusters) to be
detected based on the boot CPU alone.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
[julien.thierry@arm.com: check non-boot cpu missing early features, avoid
			 duplicates between early features and normal
			 features, do not move init_cpu_features]
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/kernel/cpufeature.c | 67 ++++++++++++++++++++++++++++--------------
 1 file changed, 45 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 21e2c95..0e58cdd 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -51,6 +51,8 @@
 DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
 EXPORT_SYMBOL(cpu_hwcaps);

+static void __init setup_early_feature_capabilities(void);
+
 static int dump_cpu_hwcaps(struct notifier_block *self, unsigned long v, void *p)
 {
 	/* file-wide pr_fmt adds "CPU features: " prefix */
@@ -505,6 +507,7 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
 		init_cpu_ftr_reg(SYS_MVFR2_EL1, info->reg_mvfr2);
 	}

+	setup_early_feature_capabilities();
 }

 static void update_cpu_ftr_reg(struct arm64_ftr_reg *reg, u64 new)
@@ -796,7 +799,7 @@ static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unus
 					ID_AA64PFR0_FP_SHIFT) < 0;
 }

-static const struct arm64_cpu_capabilities arm64_features[] = {
+static const struct arm64_cpu_capabilities arm64_early_features[] = {
 	{
 		.desc = "GIC system register CPU interface",
 		.capability = ARM64_HAS_SYSREG_GIC_CPUIF,
@@ -807,6 +810,10 @@ static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unus
 		.sign = FTR_UNSIGNED,
 		.min_field_value = 1,
 	},
+	{}
+};
+
+static const struct arm64_cpu_capabilities arm64_features[] = {
 #ifdef CONFIG_ARM64_PAN
 	{
 		.desc = "Privileged Access Never",
@@ -1055,6 +1062,29 @@ static inline void set_sys_caps_initialised(void)
 	sys_caps_initialised = true;
 }

+/* Returns false on a capability mismatch */
+static bool
+verify_local_cpu_features(const struct arm64_cpu_capabilities *caps)
+{
+	for (; caps->matches; caps++) {
+		if (!cpus_have_cap(caps->capability))
+			continue;
+		/*
+		 * If the new CPU misses an advertised feature, we cannot
+		 * proceed further, park the cpu.
+		 */
+		if (!caps->matches(caps, SCOPE_LOCAL_CPU)) {
+			pr_crit("CPU%d: missing feature: %s\n",
+					smp_processor_id(), caps->desc);
+			return false;
+		}
+		if (caps->enable)
+			caps->enable(NULL);
+	}
+
+	return true;
+}
+
 /*
  * Check for CPU features that are used in early boot
  * based on the Boot CPU value.
@@ -1063,6 +1093,9 @@ static void check_early_cpu_features(void)
 {
 	verify_cpu_run_el();
 	verify_cpu_asid_bits();
+
+	if (!verify_local_cpu_features(arm64_early_features))
+		cpu_panic_kernel();
 }

 static void
@@ -1077,26 +1110,6 @@ static void check_early_cpu_features(void)
 		}
 }

-static void
-verify_local_cpu_features(const struct arm64_cpu_capabilities *caps)
-{
-	for (; caps->matches; caps++) {
-		if (!cpus_have_cap(caps->capability))
-			continue;
-		/*
-		 * If the new CPU misses an advertised feature, we cannot proceed
-		 * further, park the cpu.
-		 */
-		if (!caps->matches(caps, SCOPE_LOCAL_CPU)) {
-			pr_crit("CPU%d: missing feature: %s\n",
-					smp_processor_id(), caps->desc);
-			cpu_die_early();
-		}
-		if (caps->enable)
-			caps->enable(NULL);
-	}
-}
-
 /*
  * Run through the enabled system capabilities and enable() it on this CPU.
  * The capabilities were decided based on the available CPUs at the boot time.
@@ -1108,7 +1121,10 @@ static void check_early_cpu_features(void)
 static void verify_local_cpu_capabilities(void)
 {
 	verify_local_cpu_errata_workarounds();
-	verify_local_cpu_features(arm64_features);
+
+	if (!verify_local_cpu_features(arm64_features))
+		cpu_die_early();
+
 	verify_local_elf_hwcaps(arm64_elf_hwcaps);
 	if (system_supports_32bit_el0())
 		verify_local_elf_hwcaps(compat_elf_hwcaps);
@@ -1134,6 +1150,13 @@ void check_local_cpu_capabilities(void)
 		verify_local_cpu_capabilities();
 }

+static void __init setup_early_feature_capabilities(void)
+{
+	update_cpu_capabilities(arm64_early_features,
+				"early detected feature:");
+	enable_cpu_capabilities(arm64_early_features);
+}
+
 static void __init setup_feature_capabilities(void)
 {
 	update_cpu_capabilities(arm64_features, "detected feature:");
--
1.9.1


* [RFC 3/7] arm64: alternative: Apply alternatives early in boot process
  2017-10-11 13:00 [RFC 0/7] arm64: provide pseudo NMI with GICv3 Julien Thierry
  2017-10-11 13:00 ` [RFC 1/7] arm64: irqflags: Reorder the fiq & async macros Julien Thierry
  2017-10-11 13:00 ` [RFC 2/7] arm64: cpufeature: Allow early detect of specific features Julien Thierry
@ 2017-10-11 13:00 ` Julien Thierry
  2017-11-13 17:34   ` Suzuki K Poulose
  2017-10-11 13:00 ` [RFC 4/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking Julien Thierry
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Julien Thierry @ 2017-10-11 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

From: Daniel Thompson <daniel.thompson@linaro.org>

Currently alternatives are applied very late in the boot process (and
a long time after we enable scheduling). Some alternative sequences,
such as those that alter the way CPU context is stored, must be applied
much earlier in the boot sequence.

Introduce apply_alternatives_early() to allow some alternatives to be
applied immediately after we detect the CPU features of the boot CPU.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
---
 arch/arm64/include/asm/alternative.h |  1 +
 arch/arm64/kernel/alternative.c      | 39 +++++++++++++++++++++++++++++++++---
 arch/arm64/kernel/smp.c              |  6 ++++++
 3 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
index 6e1cb8c..01792d5 100644
--- a/arch/arm64/include/asm/alternative.h
+++ b/arch/arm64/include/asm/alternative.h
@@ -19,6 +19,7 @@ struct alt_instr {
 	u8  alt_len;		/* size of new instruction(s), <= orig_len */
 };

+void __init apply_alternatives_early(void);
 void __init apply_alternatives_all(void);
 void apply_alternatives(void *start, size_t length);

diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index 6dd0a3a3..78051d4 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -28,6 +28,18 @@
 #include <asm/sections.h>
 #include <linux/stop_machine.h>

+/*
+ * Early-apply features are detected using only the boot CPU and are
+ * checked again when secondary CPUs start up.
+ * These early-apply features should only include features where we must
+ * patch the kernel very early in the boot process.
+ *
+ * Note that the cpufeature logic *must* be made aware of early-apply
+ * features to ensure they are reported as enabled without waiting
+ * for other CPUs to boot.
+ */
+#define EARLY_APPLY_FEATURE_MASK BIT(ARM64_HAS_SYSREG_GIC_CPUIF)
+
 #define __ALT_PTR(a,f)		((void *)&(a)->f + (a)->f)
 #define ALT_ORIG_PTR(a)		__ALT_PTR(a, orig_offset)
 #define ALT_REPL_PTR(a)		__ALT_PTR(a, alt_offset)
@@ -105,7 +117,8 @@ static u32 get_alt_insn(struct alt_instr *alt, __le32 *insnptr, __le32 *altinsnp
 	return insn;
 }

-static void __apply_alternatives(void *alt_region, bool use_linear_alias)
+static void __apply_alternatives(void *alt_region, bool use_linear_alias,
+				 unsigned long feature_mask)
 {
 	struct alt_instr *alt;
 	struct alt_region *region = alt_region;
@@ -115,6 +128,9 @@ static void __apply_alternatives(void *alt_region, bool use_linear_alias)
 		u32 insn;
 		int i, nr_inst;

+		if ((BIT(alt->cpufeature) & feature_mask) == 0)
+			continue;
+
 		if (!cpus_have_cap(alt->cpufeature))
 			continue;

@@ -138,6 +154,21 @@ static void __apply_alternatives(void *alt_region, bool use_linear_alias)
 }

 /*
+ * This is called very early in the boot process (directly after we run
+ * a feature detect on the boot CPU). No need to worry about other CPUs
+ * here.
+ */
+void apply_alternatives_early(void)
+{
+	struct alt_region region = {
+		.begin	= (struct alt_instr *)__alt_instructions,
+		.end	= (struct alt_instr *)__alt_instructions_end,
+	};
+
+	__apply_alternatives(&region, true, EARLY_APPLY_FEATURE_MASK);
+}
+
+/*
  * We might be patching the stop_machine state machine, so implement a
  * really simple polling protocol here.
  */
@@ -156,7 +187,9 @@ static int __apply_alternatives_multi_stop(void *unused)
 		isb();
 	} else {
 		BUG_ON(patched);
-		__apply_alternatives(&region, true);
+
+		__apply_alternatives(&region, true, ~EARLY_APPLY_FEATURE_MASK);
+
 		/* Barriers provided by the cache flushing */
 		WRITE_ONCE(patched, 1);
 	}
@@ -177,5 +210,5 @@ void apply_alternatives(void *start, size_t length)
 		.end	= start + length,
 	};

-	__apply_alternatives(&region, false);
+	__apply_alternatives(&region, false, -1);
 }
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 9f7195a..84c6983 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -455,6 +455,12 @@ void __init smp_prepare_boot_cpu(void)
 	 * cpuinfo_store_boot_cpu() above.
 	 */
 	update_cpu_errata_workarounds();
+	/*
+	 * We now know enough about the boot CPU to apply the
+	 * alternatives that cannot wait until interrupt handling
+	 * and/or scheduling is enabled.
+	 */
+	apply_alternatives_early();
 }

 static u64 __init of_get_cpu_mpidr(struct device_node *dn)
--
1.9.1


* [RFC 4/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking
  2017-10-11 13:00 [RFC 0/7] arm64: provide pseudo NMI with GICv3 Julien Thierry
                   ` (2 preceding siblings ...)
  2017-10-11 13:00 ` [RFC 3/7] arm64: alternative: Apply alternatives early in boot process Julien Thierry
@ 2017-10-11 13:00 ` Julien Thierry
  2017-10-16 21:18   ` Christoffer Dall
  2017-10-11 13:01 ` [RFC 5/7] irqchip/gic: Add functions to access irq priorities Julien Thierry
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Julien Thierry @ 2017-10-11 13:00 UTC (permalink / raw)
  To: linux-arm-kernel

From: Daniel Thompson <daniel.thompson@linaro.org>

Currently irqflags is implemented using the PSR's I bit. It is possible
to implement irqflags by using the co-processor interface to the GIC.
Using the co-processor interface makes it feasible to simulate NMIs
using GIC interrupt prioritization.

This patch changes the irqflags macros to modify, save and restore
ICC_PMR_EL1. This has a substantial knock-on effect for the rest of
the kernel. There are four reasons for this:

1. The state of the PMR becomes part of the interrupt context and must be
   saved and restored during exceptions. It is saved on the stack as part
   of the saved context when an interrupt/exception is taken.

2. The hardware automatically masks the I bit (at boot, during traps, etc.).
   When the I bit is set by hardware, we must add code to switch from I
   bit masking to PMR masking:
       - For IRQs, this is done after the interrupt has been acknowledged,
         avoiding the need to unmask.
       - For other exceptions, this is done right after saving the context.

3. Some instructions, such as wfi, require that the PMR not be used
   for interrupt masking. Before calling these instructions we must
   switch from PMR masking to I bit masking.
   This is also the case when KVM runs a guest: if the CPU receives
   an interrupt from the host, interrupts must not be masked in PMR,
   otherwise the GIC will not signal it to the CPU.

4. We use the alternatives system to allow a single kernel to boot and
   be switched to the alternative masking approach at runtime.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
[julien.thierry@arm.com: changes reflected in commit,
			 message, fixes, renaming]
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
---
 arch/arm64/Kconfig                     |  15 ++++
 arch/arm64/include/asm/arch_gicv3.h    |  19 +++++
 arch/arm64/include/asm/assembler.h     |  33 ++++++++-
 arch/arm64/include/asm/efi.h           |   5 ++
 arch/arm64/include/asm/irqflags.h      | 125 +++++++++++++++++++++++++++++++++
 arch/arm64/include/asm/kvm_host.h      |  40 +++++++++++
 arch/arm64/include/asm/processor.h     |   4 ++
 arch/arm64/include/asm/ptrace.h        |  14 +++-
 arch/arm64/kernel/asm-offsets.c        |   1 +
 arch/arm64/kernel/entry.S              |  78 ++++++++++++++++----
 arch/arm64/kernel/head.S               |  38 ++++++++++
 arch/arm64/kernel/process.c            |   6 ++
 arch/arm64/kernel/smp.c                |   8 +++
 arch/arm64/mm/proc.S                   |  23 ++++++
 drivers/irqchip/irq-gic-v3-its.c       |   2 +-
 drivers/irqchip/irq-gic-v3.c           |  82 +++++++++++----------
 include/linux/irqchip/arm-gic-common.h |   6 ++
 include/linux/irqchip/arm-gic.h        |   5 --
 18 files changed, 444 insertions(+), 60 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0df64a6..269e506 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -803,6 +803,21 @@ config FORCE_MAX_ZONEORDER
 	  However for 4K, we choose a higher default value, 11 as opposed to 10, giving us
 	  4M allocations matching the default size used by generic code.

+config USE_ICC_SYSREGS_FOR_IRQFLAGS
+	bool "Use ICC system registers for IRQ masking"
+	select ARM_GIC_V3
+	help
+	  Using the ICC system registers for IRQ masking makes it possible
+	  to simulate NMI on ARM64 systems. This allows several interesting
+	  features (especially debug features) to be used on these systems.
+
+	  Say Y here to implement IRQ masking using ICC system
+	  registers when the GIC System Registers are available. The changes
+	  are applied dynamically using the alternatives system so it is safe
+	  to enable this option on systems with older interrupt controllers.
+
+	  If unsure, say N
+
 menuconfig ARMV8_DEPRECATED
 	bool "Emulate deprecated/obsolete ARMv8 instructions"
 	depends on COMPAT
diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
index b7e3f74..b6b430c 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -140,5 +140,24 @@ static inline void gic_write_bpr1(u32 val)
 #define gits_write_vpendbaser(v, c)	writeq_relaxed(v, c)
 #define gits_read_vpendbaser(c)		readq_relaxed(c)

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+static inline void gic_start_pmr_masking(void)
+{
+	if (static_branch_likely(&cpu_hwcap_keys[ARM64_HAS_SYSREG_GIC_CPUIF])) {
+		gic_write_pmr(ICC_PMR_EL1_MASKED);
+		asm volatile ("msr daifclr, #2" : : : "memory");
+	}
+}
+
+static inline void gic_end_pmr_masking(void)
+{
+	if (static_branch_likely(&cpu_hwcap_keys[ARM64_HAS_SYSREG_GIC_CPUIF])) {
+		asm volatile ("msr	daifset, #2" : : : "memory");
+		gic_write_pmr(ICC_PMR_EL1_UNMASKED);
+		dsb(sy);
+	}
+}
+#endif
+
 #endif /* __ASSEMBLY__ */
 #endif /* __ASM_ARCH_GICV3_H */
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index d58a625..33ebe21 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -23,6 +23,8 @@
 #ifndef __ASM_ASSEMBLER_H
 #define __ASM_ASSEMBLER_H

+#include <asm/alternative.h>
+#include <asm/arch_gicv3.h>
 #include <asm/asm-offsets.h>
 #include <asm/cpufeature.h>
 #include <asm/mmu_context.h>
@@ -34,12 +36,32 @@
 /*
  * Enable and disable interrupts.
  */
-	.macro	disable_irq
+	.macro	disable_irq, tmp
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	mov	\tmp, #ICC_PMR_EL1_MASKED
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
 	msr	daifset, #2
+alternative_else
+	msr_s	SYS_ICC_PMR_EL1, \tmp
+alternative_endif
+#else
+	msr	daifset, #2
+#endif
 	.endm

-	.macro	enable_irq
+	.macro	enable_irq, tmp
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	mov     \tmp, #ICC_PMR_EL1_UNMASKED
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
 	msr	daifclr, #2
+	nop
+alternative_else
+	msr_s	SYS_ICC_PMR_EL1, \tmp
+	dsb	sy
+alternative_endif
+#else
+	msr	daifclr, #2
+#endif
 	.endm

 	.macro	save_and_disable_irq, flags
@@ -85,8 +107,13 @@
  * faster than two daifclr operations, since writes to this register
  * are self-synchronising.
  */
-	.macro	enable_dbg_and_irq
+	.macro	enable_dbg_and_irq, tmp
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	enable_dbg
+	enable_irq \tmp
+#else
 	msr	daifclr, #(8 | 2)
+#endif
 	.endm

 /*
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index b93904b..166dd2f 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -39,7 +39,12 @@
 	efi_virtmap_unload();						\
 })

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+#define ARCH_EFI_IRQ_FLAGS_MASK \
+	(PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT | ARCH_FLAG_PMR_EN)
+#else
 #define ARCH_EFI_IRQ_FLAGS_MASK (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)
+#endif

 /* arch specific definitions used by the stub code */

diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index f367cbd..1c69ebb 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -18,7 +18,12 @@

 #ifdef __KERNEL__

+#include <asm/alternative.h>
+#include <asm/cpufeature.h>
 #include <asm/ptrace.h>
+#include <asm/sysreg.h>
+
+#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS

 /*
  * CPU interrupt mask handling.
@@ -84,6 +89,126 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
 	return flags & PSR_I_BIT;
 }

+static inline void maybe_switch_to_sysreg_gic_cpuif(void) {}
+
+#else /* CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS */
+
+#define ARCH_FLAG_PMR_EN 0x1
+
+#define MAKE_ARCH_FLAGS(daif, pmr)					\
+	((daif) | (((pmr) >> ICC_PMR_EL1_EN_SHIFT) & ARCH_FLAG_PMR_EN))
+
+#define ARCH_FLAGS_GET_PMR(flags)				\
+	((((flags) & ARCH_FLAG_PMR_EN) << ICC_PMR_EL1_EN_SHIFT) \
+		| ICC_PMR_EL1_MASKED)
+
+#define ARCH_FLAGS_GET_DAIF(flags) ((flags) & ~ARCH_FLAG_PMR_EN)
+
+/*
+ * CPU interrupt mask handling.
+ */
+static inline unsigned long arch_local_irq_save(void)
+{
+	unsigned long flags, masked = ICC_PMR_EL1_MASKED;
+	unsigned long pmr = 0;
+
+	asm volatile(ALTERNATIVE(
+		"mrs	%0, daif		// arch_local_irq_save\n"
+		"msr	daifset, #2\n"
+		"mov	%1, #" __stringify(ICC_PMR_EL1_UNMASKED),
+		/* --- */
+		"mrs	%0, daif\n"
+		"mrs_s  %1, " __stringify(SYS_ICC_PMR_EL1) "\n"
+		"msr_s	" __stringify(SYS_ICC_PMR_EL1) ", %2",
+		ARM64_HAS_SYSREG_GIC_CPUIF)
+		: "=&r" (flags), "=&r" (pmr)
+		: "r" (masked)
+		: "memory");
+
+	return MAKE_ARCH_FLAGS(flags, pmr);
+}
+
+static inline void arch_local_irq_enable(void)
+{
+	unsigned long unmasked = ICC_PMR_EL1_UNMASKED;
+
+	asm volatile(ALTERNATIVE(
+		"msr	daifclr, #2		// arch_local_irq_enable\n"
+		"nop",
+		"msr_s  " __stringify(SYS_ICC_PMR_EL1) ",%0\n"
+		"dsb	sy",
+		ARM64_HAS_SYSREG_GIC_CPUIF)
+		:
+		: "r" (unmasked)
+		: "memory");
+}
+
+static inline void arch_local_irq_disable(void)
+{
+	unsigned long masked = ICC_PMR_EL1_MASKED;
+
+	asm volatile(ALTERNATIVE(
+		"msr	daifset, #2		// arch_local_irq_disable",
+		"msr_s  " __stringify(SYS_ICC_PMR_EL1) ",%0",
+		ARM64_HAS_SYSREG_GIC_CPUIF)
+		:
+		: "r" (masked)
+		: "memory");
+}
+
+/*
+ * Save the current interrupt enable state.
+ */
+static inline unsigned long arch_local_save_flags(void)
+{
+	unsigned long flags;
+	unsigned long pmr = 0;
+
+	asm volatile(ALTERNATIVE(
+		"mrs	%0, daif		// arch_local_save_flags\n"
+		"mov	%1, #" __stringify(ICC_PMR_EL1_UNMASKED),
+		"mrs	%0, daif\n"
+		"mrs_s  %1, " __stringify(SYS_ICC_PMR_EL1),
+		ARM64_HAS_SYSREG_GIC_CPUIF)
+		: "=r" (flags), "=r" (pmr)
+		:
+		: "memory");
+
+	return MAKE_ARCH_FLAGS(flags, pmr);
+}
+
+/*
+ * restore saved IRQ state
+ */
+static inline void arch_local_irq_restore(unsigned long flags)
+{
+	unsigned long pmr = ARCH_FLAGS_GET_PMR(flags);
+
+	flags = ARCH_FLAGS_GET_DAIF(flags);
+
+	asm volatile(ALTERNATIVE(
+		"msr	daif, %0		// arch_local_irq_restore\n"
+		"nop\n"
+		"nop",
+		"msr	daif, %0\n"
+		"msr_s  " __stringify(SYS_ICC_PMR_EL1) ",%1\n"
+		"dsb	sy",
+		ARM64_HAS_SYSREG_GIC_CPUIF)
+	:
+	: "r" (flags), "r" (pmr)
+	: "memory");
+}
+
+static inline int arch_irqs_disabled_flags(unsigned long flags)
+{
+	return (ARCH_FLAGS_GET_DAIF(flags) & (PSR_I_BIT)) |
+		!(ARCH_FLAGS_GET_PMR(flags) & ICC_PMR_EL1_EN_BIT);
+}
+
+void maybe_switch_to_sysreg_gic_cpuif(void);
+
+#endif /* CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS */
+
 #define local_fiq_enable()	asm("msr	daifclr, #1" : : : "memory")
 #define local_fiq_disable()	asm("msr	daifset, #1" : : : "memory")

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e923b58..070e8a5 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -38,6 +38,10 @@
 #include <kvm/arm_arch_timer.h>
 #include <kvm/arm_pmu.h>

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+#include <asm/arch_gicv3.h>
+#endif
+
 #define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS

 #define KVM_VCPU_MAX_FEATURES 4
@@ -332,7 +336,30 @@ int kvm_unmap_hva_range(struct kvm *kvm,
 void kvm_arm_resume_guest(struct kvm *kvm);

 u64 __kvm_call_hyp(void *hypfn, ...);
+
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+/*
+ * If we use PMR to disable interrupts, interrupts happening while the
+ * guest is executing will not be signaled to the host by the GIC (and
+ * any call to a waiting kvm_kick_many_cpus will likely cause a
+ * deadlock).
+ * We need to switch back to using PSR.I.
+ */
+#define kvm_call_hyp(f, ...)						\
+	({								\
+		u64 res;						\
+		unsigned long flags;					\
+									\
+		flags = arch_local_irq_save();				\
+		gic_end_pmr_masking();					\
+		res = __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__);	\
+		gic_start_pmr_masking();				\
+		arch_local_irq_restore(flags);				\
+		res;							\
+	})
+#else
 #define kvm_call_hyp(f, ...) __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__)
+#endif

 void force_vm_exit(const cpumask_t *mask);
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
@@ -349,6 +376,9 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
 				       unsigned long hyp_stack_ptr,
 				       unsigned long vector_ptr)
 {
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	unsigned long flags;
+#endif
 	/*
 	 * Call initialization code, and switch to the full blown HYP code.
 	 * If the cpucaps haven't been finalized yet, something has gone very
@@ -356,7 +386,17 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
 	 * cpus_have_const_cap() wrapper.
 	 */
 	BUG_ON(!static_branch_likely(&arm64_const_caps_ready));
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	flags = arch_local_irq_save();
+	gic_end_pmr_masking();
+#endif
+
 	__kvm_call_hyp((void *)pgd_ptr, hyp_stack_ptr, vector_ptr);
+
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	gic_start_pmr_masking();
+	arch_local_irq_restore(flags);
+#endif
 }

 static inline void kvm_arch_hardware_unsetup(void) {}
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 29adab8..2a871f6 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -114,6 +114,10 @@ static inline void start_thread_common(struct pt_regs *regs, unsigned long pc)
 	memset(regs, 0, sizeof(*regs));
 	forget_syscall(regs);
 	regs->pc = pc;
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	/* Have IRQs enabled by default */
+	regs->pmr_save = ICC_PMR_EL1_UNMASKED;
+#endif
 }

 static inline void start_thread(struct pt_regs *regs, unsigned long pc,
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 6069d66..aa1e948 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -25,6 +25,12 @@
 #define CurrentEL_EL1		(1 << 2)
 #define CurrentEL_EL2		(2 << 2)

+/* PMR values used to mask/unmask interrupts */
+#define ICC_PMR_EL1_EN_SHIFT	6
+#define ICC_PMR_EL1_EN_BIT	(1 << ICC_PMR_EL1_EN_SHIFT) /* PMR IRQ enable */
+#define ICC_PMR_EL1_UNMASKED    0xf0
+#define ICC_PMR_EL1_MASKED      (ICC_PMR_EL1_UNMASKED ^ ICC_PMR_EL1_EN_BIT)
+
 /* AArch32-specific ptrace requests */
 #define COMPAT_PTRACE_GETREGS		12
 #define COMPAT_PTRACE_SETREGS		13
@@ -136,7 +142,7 @@ struct pt_regs {
 #endif

 	u64 orig_addr_limit;
-	u64 unused;	// maintain 16 byte alignment
+	u64 pmr_save;
 	u64 stackframe[2];
 };

@@ -171,8 +177,14 @@ static inline void forget_syscall(struct pt_regs *regs)
 #define processor_mode(regs) \
 	((regs)->pstate & PSR_MODE_MASK)

+#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
 #define interrupts_enabled(regs) \
 	(!((regs)->pstate & PSR_I_BIT))
+#else
+#define interrupts_enabled(regs)	    \
+	((!((regs)->pstate & PSR_I_BIT)) && \
+	 ((regs)->pmr_save & ICC_PMR_EL1_EN_BIT))
+#endif

 #define fast_interrupts_enabled(regs) \
 	(!((regs)->pstate & PSR_F_BIT))
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 71bf088..6b00b0d 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -75,6 +75,7 @@ int main(void)
   DEFINE(S_ORIG_X0,		offsetof(struct pt_regs, orig_x0));
   DEFINE(S_SYSCALLNO,		offsetof(struct pt_regs, syscallno));
   DEFINE(S_ORIG_ADDR_LIMIT,	offsetof(struct pt_regs, orig_addr_limit));
+  DEFINE(S_PMR_SAVE,		offsetof(struct pt_regs, pmr_save));
   DEFINE(S_STACKFRAME,		offsetof(struct pt_regs, stackframe));
   DEFINE(S_FRAME_SIZE,		sizeof(struct pt_regs));
   BLANK();
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index e1c59d4..1933fea 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -20,6 +20,7 @@

 #include <linux/init.h>
 #include <linux/linkage.h>
+#include <linux/irqchip/arm-gic-v3.h>

 #include <asm/alternative.h>
 #include <asm/assembler.h>
@@ -210,6 +211,16 @@ alternative_else_nop_endif
 	msr	sp_el0, tsk
 	.endif

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	/* Save pmr */
+alternative_if ARM64_HAS_SYSREG_GIC_CPUIF
+	mrs_s	x20, SYS_ICC_PMR_EL1
+alternative_else
+	mov	x20, #ICC_PMR_EL1_UNMASKED
+alternative_endif
+	str	x20, [sp, #S_PMR_SAVE]
+#endif
+
 	/*
 	 * Registers that may be useful after this macro is invoked:
 	 *
@@ -220,6 +231,16 @@ alternative_else_nop_endif
 	.endm

 	.macro	kernel_exit, el
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	/* Restore pmr, ensuring IRQs are off before restoring context. */
+alternative_if ARM64_HAS_SYSREG_GIC_CPUIF
+	msr	daifset, #2
+	ldr	x20, [sp, #S_PMR_SAVE]
+	msr_s	SYS_ICC_PMR_EL1, x20
+	dsb	sy
+alternative_else_nop_endif
+#endif
+
 	.if	\el != 0
 	/* Restore the task's original addr_limit. */
 	ldr	x20, [sp, #S_ORIG_ADDR_LIMIT]
@@ -333,6 +354,19 @@ alternative_else_nop_endif
 	mov	sp, x19
 	.endm

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	/*
+	 * pmr_irq_mask_start - mask IRQs at the PMR, then clear PSR.I
+	 */
+	.macro	pmr_irq_mask_start, tmp
+	alternative_if ARM64_HAS_SYSREG_GIC_CPUIF
+	mov	\tmp, #ICC_PMR_EL1_MASKED
+	msr_s	SYS_ICC_PMR_EL1, \tmp
+	msr	daifclr, #2
+	alternative_else_nop_endif
+	.endm
+#endif
+
 /*
  * These are the registers used in the syscall handler, and allow us to
  * have in theory up to 7 arguments to a function - x0 to x6.
@@ -426,6 +460,9 @@ __bad_stack:
  */
 	.macro	inv_entry, el, reason, regsize = 64
 	kernel_entry \el, \regsize
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	pmr_irq_mask_start x9
+#endif
 	mov	x0, sp
 	mov	x1, #\reason
 	mrs	x2, esr_el1
@@ -481,6 +518,9 @@ ENDPROC(el1_error_invalid)
 	.align	6
 el1_sync:
 	kernel_entry 1
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	pmr_irq_mask_start x9
+#endif
 	mrs	x1, esr_el1			// read the syndrome register
 	lsr	x24, x1, #ESR_ELx_EC_SHIFT	// exception class
 	cmp	x24, #ESR_ELx_EC_DABT_CUR	// data abort in EL1
@@ -510,15 +550,23 @@ el1_da:
 	mrs	x3, far_el1
 	enable_dbg
 	// re-enable interrupts if they were enabled in the aborted context
-	tbnz	x23, #7, 1f			// PSR_I_BIT
-	enable_irq
+	tbnz	x23, #7, el1_keep_irq_off	// PSR_I_BIT
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+alternative_if ARM64_HAS_SYSREG_GIC_CPUIF
+	ldr	x2, [sp, #S_PMR_SAVE]
+	tbnz	x2, #ICC_PMR_EL1_EN_SHIFT, 1f
+	b	el1_keep_irq_off
 1:
+alternative_else_nop_endif
+#endif
+	enable_irq x2
+el1_keep_irq_off:
 	clear_address_tag x0, x3
 	mov	x2, sp				// struct pt_regs
 	bl	do_mem_abort

 	// disable interrupts before pulling preserved data off the stack
-	disable_irq
+	disable_irq x21
 	kernel_exit 1
 el1_sp_pc:
 	/*
@@ -597,6 +645,9 @@ el1_preempt:
 	.align	6
 el0_sync:
 	kernel_entry 0
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	pmr_irq_mask_start x9
+#endif
 	mrs	x25, esr_el1			// read the syndrome register
 	lsr	x24, x25, #ESR_ELx_EC_SHIFT	// exception class
 	cmp	x24, #ESR_ELx_EC_SVC64		// SVC in 64-bit state
@@ -625,6 +676,9 @@ el0_sync:
 	.align	6
 el0_sync_compat:
 	kernel_entry 0, 32
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	pmr_irq_mask_start x9
+#endif
 	mrs	x25, esr_el1			// read the syndrome register
 	lsr	x24, x25, #ESR_ELx_EC_SHIFT	// exception class
 	cmp	x24, #ESR_ELx_EC_SVC32		// SVC in 32-bit state
@@ -675,7 +729,7 @@ el0_da:
 	 */
 	mrs	x26, far_el1
 	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
+	enable_dbg_and_irq x0
 	ct_user_exit
 	clear_address_tag x0, x26
 	mov	x1, x25
@@ -688,7 +742,7 @@ el0_ia:
 	 */
 	mrs	x26, far_el1
 	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
+	enable_dbg_and_irq x0
 	ct_user_exit
 	mov	x0, x26
 	mov	x1, x25
@@ -721,7 +775,7 @@ el0_sp_pc:
 	 */
 	mrs	x26, far_el1
 	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
+	enable_dbg_and_irq x0
 	ct_user_exit
 	mov	x0, x26
 	mov	x1, x25
@@ -733,7 +787,7 @@ el0_undef:
 	 * Undefined instruction
 	 */
 	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
+	enable_dbg_and_irq x0
 	ct_user_exit
 	mov	x0, sp
 	bl	do_undefinstr
@@ -742,7 +796,7 @@ el0_sys:
 	/*
 	 * System instructions, for trapped cache maintenance instructions
 	 */
-	enable_dbg_and_irq
+	enable_dbg_and_irq x0
 	ct_user_exit
 	mov	x0, x25
 	mov	x1, sp
@@ -793,7 +847,7 @@ ENDPROC(el0_irq)
  * and this includes saving x0 back into the kernel stack.
  */
 ret_fast_syscall:
-	disable_irq				// disable interrupts
+	disable_irq x21				// disable interrupts
 	str	x0, [sp, #S_X0]			// returned x0
 	ldr	x1, [tsk, #TSK_TI_FLAGS]	// re-check for syscall tracing
 	and	x2, x1, #_TIF_SYSCALL_WORK
@@ -803,7 +857,7 @@ ret_fast_syscall:
 	enable_step_tsk x1, x2
 	kernel_exit 0
 ret_fast_syscall_trace:
-	enable_irq				// enable interrupts
+	enable_irq x0				// enable interrupts
 	b	__sys_trace_return_skipped	// we already saved x0

 /*
@@ -821,7 +875,7 @@ work_pending:
  * "slow" syscall return path.
  */
 ret_to_user:
-	disable_irq				// disable interrupts
+	disable_irq x21				// disable interrupts
 	ldr	x1, [tsk, #TSK_TI_FLAGS]
 	and	x2, x1, #_TIF_WORK_MASK
 	cbnz	x2, work_pending
@@ -840,7 +894,7 @@ el0_svc:
 	mov	wsc_nr, #__NR_syscalls
 el0_svc_naked:					// compat entry point
 	stp	x0, xscno, [sp, #S_ORIG_X0]	// save the original x0 and syscall number
-	enable_dbg_and_irq
+	enable_dbg_and_irq x16
 	ct_user_exit 1

 	ldr	x16, [tsk, #TSK_TI_FLAGS]	// check for syscall hooks
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 0b243ec..2ee06dd 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -545,6 +545,44 @@ set_cpu_boot_mode_flag:
 	ret
 ENDPROC(set_cpu_boot_mode_flag)

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+/*
+ * void maybe_switch_to_sysreg_gic_cpuif(void)
+ *
+ * Enable interrupt controller system register access if this feature
+ * has been detected by the alternatives system.
+ *
+ * Before we jump into generic code we must enable interrupt controller system
+ * register access because this is required by the irqflags macros.  We must
+ * also mask interrupts at the PMR and unmask them within the PSR. That leaves
+ * us set up and ready for the kernel to make its first call to
+ * arch_local_irq_enable().
+ *
+ *
+ */
+ENTRY(maybe_switch_to_sysreg_gic_cpuif)
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
+	b	1f
+alternative_else
+	mrs_s	x0, SYS_ICC_SRE_EL1
+alternative_endif
+	orr	x0, x0, #1
+	msr_s	SYS_ICC_SRE_EL1, x0	// Set ICC_SRE_EL1.SRE==1
+	isb				// Make sure SRE is now set
+	mrs	x0, daif
+	tbz	x0, #7, no_mask_pmr	// Are interrupts on?
+	mov	x0, ICC_PMR_EL1_MASKED
+	msr_s	SYS_ICC_PMR_EL1, x0	// Prepare for unmask of I bit
+	msr	daifclr, #2		// Clear the I bit
+	b	1f
+no_mask_pmr:
+	mov	x0, ICC_PMR_EL1_UNMASKED
+	msr_s	SYS_ICC_PMR_EL1, x0
+1:
+	ret
+ENDPROC(maybe_switch_to_sysreg_gic_cpuif)
+#endif
+
 /*
  * These values are written with the MMU off, but read with the MMU on.
  * Writers will invalidate the corresponding address, discarding up to a
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 2dc0f84..6e1f4e8 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -65,6 +65,8 @@
 EXPORT_SYMBOL(__stack_chk_guard);
 #endif

+#include <asm/arch_gicv3.h>
+
 /*
  * Function pointers to optional machine specific functions
  */
@@ -191,6 +193,7 @@ void __show_regs(struct pt_regs *regs)
 	printk("pc : [<%016llx>] lr : [<%016llx>] pstate: %08llx\n",
 	       regs->pc, lr, regs->pstate);
 	printk("sp : %016llx\n", sp);
+	printk("pmr_save: %08llx\n", regs->pmr_save);

 	i = top_reg;

@@ -284,6 +287,9 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
 	} else {
 		memset(childregs, 0, sizeof(struct pt_regs));
 		childregs->pstate = PSR_MODE_EL1h;
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+		childregs->pmr_save = ICC_PMR_EL1_UNMASKED;
+#endif
 		if (IS_ENABLED(CONFIG_ARM64_UAO) &&
 		    cpus_have_const_cap(ARM64_HAS_UAO))
 			childregs->pstate |= PSR_UAO_BIT;
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 84c6983..401fc8e 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -219,6 +219,8 @@ asmlinkage void secondary_start_kernel(void)
 	struct mm_struct *mm = &init_mm;
 	unsigned int cpu;

+	maybe_switch_to_sysreg_gic_cpuif();
+
 	cpu = task_cpu(current);
 	set_my_cpu_offset(per_cpu_offset(cpu));

@@ -461,6 +463,12 @@ void __init smp_prepare_boot_cpu(void)
 	 * and/or scheduling is enabled.
 	 */
 	apply_alternatives_early();
+
+	/*
+	 * Conditionally switch to GIC PMR for interrupt masking (this
+	 * will be a nop if we are using normal interrupt masking)
+	 */
+	maybe_switch_to_sysreg_gic_cpuif();
 }

 static u64 __init of_get_cpu_mpidr(struct device_node *dn)
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 877d42f..111d6f4 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -20,6 +20,7 @@

 #include <linux/init.h>
 #include <linux/linkage.h>
+#include <linux/irqchip/arm-gic-v3.h>
 #include <asm/assembler.h>
 #include <asm/asm-offsets.h>
 #include <asm/hwcap.h>
@@ -47,11 +48,33 @@
  *	cpu_do_idle()
  *
  *	Idle the processor (wait for interrupt).
+ *
+ *	If CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS is set we must do additional
+ *	work to ensure that interrupts are not masked at the PMR (because the
+ *	core will not wake up if we block the wake up signal in the interrupt
+ *	controller).
  */
 ENTRY(cpu_do_idle)
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+alternative_if_not ARM64_HAS_SYSREG_GIC_CPUIF
+#endif
+	dsb	sy				// WFI may enter a low-power mode
+	wfi
+	ret
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+alternative_else
+	mrs	x0, daif			// save I bit
+	msr	daifset, #2			// set I bit
+	mrs_s	x1, SYS_ICC_PMR_EL1		// save PMR
+alternative_endif
+	mov	x2, #ICC_PMR_EL1_UNMASKED
+	msr_s	SYS_ICC_PMR_EL1, x2		// unmask at PMR
 	dsb	sy				// WFI may enter a low-power mode
 	wfi
+	msr_s	SYS_ICC_PMR_EL1, x1		// restore PMR
+	msr	daif, x0			// restore I bit
 	ret
+#endif
 ENDPROC(cpu_do_idle)

 #ifdef CONFIG_CPU_PM
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index e8d8934..a72f7c4 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -60,7 +60,7 @@
 #define LPI_PROPBASE_SZ		ALIGN(BIT(LPI_NRBITS), SZ_64K)
 #define LPI_PENDBASE_SZ		ALIGN(BIT(LPI_NRBITS) / 8, SZ_64K)

-#define LPI_PROP_DEFAULT_PRIO	0xa0
+#define LPI_PROP_DEFAULT_PRIO	GICD_INT_DEF_PRI

 /*
  * Collection structure - just an ID, and a redistributor address to
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index b5df99c..1cdf495 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -68,9 +68,6 @@ struct gic_chip_data {
 #define gic_data_rdist_rd_base()	(gic_data_rdist()->rd_base)
 #define gic_data_rdist_sgi_base()	(gic_data_rdist_rd_base() + SZ_64K)

-/* Our default, arbitrary priority value. Linux only uses one anyway. */
-#define DEFAULT_PMR_VALUE	0xf0
-
 static inline unsigned int gic_irq(struct irq_data *d)
 {
 	return d->hwirq;
@@ -345,48 +342,55 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs
 {
 	u32 irqnr;

-	do {
-		irqnr = gic_read_iar();
+	irqnr = gic_read_iar();
+
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	isb();
+	/* Masking IRQs earlier would prevent acking the current interrupt */
+	gic_start_pmr_masking();
+#endif
+
+	if (likely(irqnr > 15 && irqnr < 1020) || irqnr >= 8192) {
+		int err;

-		if (likely(irqnr > 15 && irqnr < 1020) || irqnr >= 8192) {
-			int err;
+		if (static_key_true(&supports_deactivate))
+			gic_write_eoir(irqnr);
+		else {
+#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+			isb();
+#endif
+		}

-			if (static_key_true(&supports_deactivate))
+		err = handle_domain_irq(gic_data.domain, irqnr, regs);
+		if (err) {
+			WARN_ONCE(true, "Unexpected interrupt received!\n");
+			if (static_key_true(&supports_deactivate)) {
+				if (irqnr < 8192)
+					gic_write_dir(irqnr);
+			} else {
 				gic_write_eoir(irqnr);
-			else
-				isb();
-
-			err = handle_domain_irq(gic_data.domain, irqnr, regs);
-			if (err) {
-				WARN_ONCE(true, "Unexpected interrupt received!\n");
-				if (static_key_true(&supports_deactivate)) {
-					if (irqnr < 8192)
-						gic_write_dir(irqnr);
-				} else {
-					gic_write_eoir(irqnr);
-				}
 			}
-			continue;
 		}
-		if (irqnr < 16) {
-			gic_write_eoir(irqnr);
-			if (static_key_true(&supports_deactivate))
-				gic_write_dir(irqnr);
+		return;
+	}
+	if (irqnr < 16) {
+		gic_write_eoir(irqnr);
+		if (static_key_true(&supports_deactivate))
+			gic_write_dir(irqnr);
 #ifdef CONFIG_SMP
-			/*
-			 * Unlike GICv2, we don't need an smp_rmb() here.
-			 * The control dependency from gic_read_iar to
-			 * the ISB in gic_write_eoir is enough to ensure
-			 * that any shared data read by handle_IPI will
-			 * be read after the ACK.
-			 */
-			handle_IPI(irqnr, regs);
+		/*
+		 * Unlike GICv2, we don't need an smp_rmb() here.
+		 * The control dependency from gic_read_iar to
+		 * the ISB in gic_write_eoir is enough to ensure
+		 * that any shared data read by handle_IPI will
+		 * be read after the ACK.
+		 */
+		handle_IPI(irqnr, regs);
 #else
-			WARN_ONCE(true, "Unexpected SGI received!\n");
+		WARN_ONCE(true, "Unexpected SGI received!\n");
 #endif
-			continue;
-		}
-	} while (irqnr != ICC_IAR1_EL1_SPURIOUS);
+		return;
+	}
 }

 static void __init gic_dist_init(void)
@@ -536,8 +540,10 @@ static void gic_cpu_sys_reg_init(void)
 	if (!gic_enable_sre())
 		pr_err("GIC: unable to set SRE (disabled@EL2), panic ahead\n");

+#ifndef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
 	/* Set priority mask register */
-	gic_write_pmr(DEFAULT_PMR_VALUE);
+	gic_write_pmr(ICC_PMR_EL1_UNMASKED);
+#endif

 	/*
 	 * Some firmwares hand over to the kernel with the BPR changed from
diff --git a/include/linux/irqchip/arm-gic-common.h b/include/linux/irqchip/arm-gic-common.h
index 0a83b43..2c9a4b3 100644
--- a/include/linux/irqchip/arm-gic-common.h
+++ b/include/linux/irqchip/arm-gic-common.h
@@ -13,6 +13,12 @@
 #include <linux/types.h>
 #include <linux/ioport.h>

+#define GICD_INT_DEF_PRI		0xc0
+#define GICD_INT_DEF_PRI_X4		((GICD_INT_DEF_PRI << 24) |\
+					(GICD_INT_DEF_PRI << 16) |\
+					(GICD_INT_DEF_PRI << 8) |\
+					GICD_INT_DEF_PRI)
+
 enum gic_type {
 	GIC_V2,
 	GIC_V3,
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index d3453ee..47f5a8c 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -65,11 +65,6 @@
 #define GICD_INT_EN_CLR_X32		0xffffffff
 #define GICD_INT_EN_SET_SGI		0x0000ffff
 #define GICD_INT_EN_CLR_PPI		0xffff0000
-#define GICD_INT_DEF_PRI		0xa0
-#define GICD_INT_DEF_PRI_X4		((GICD_INT_DEF_PRI << 24) |\
-					(GICD_INT_DEF_PRI << 16) |\
-					(GICD_INT_DEF_PRI << 8) |\
-					GICD_INT_DEF_PRI)

 #define GICH_HCR			0x0
 #define GICH_VTR			0x4
--
1.9.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC 5/7] irqchip/gic: Add functions to access irq priorities
  2017-10-11 13:00 [RFC 0/7] arm64: provide pseudo NMI with GICv3 Julien Thierry
                   ` (3 preceding siblings ...)
  2017-10-11 13:00 ` [RFC 4/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking Julien Thierry
@ 2017-10-11 13:01 ` Julien Thierry
  2017-10-11 13:01 ` [RFC 6/7] arm64: Detect current view of GIC priorities Julien Thierry
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Julien Thierry @ 2017-10-11 13:01 UTC (permalink / raw)
  To: linux-arm-kernel

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 drivers/irqchip/irq-gic-common.c | 10 ++++++++++
 drivers/irqchip/irq-gic-common.h |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/drivers/irqchip/irq-gic-common.c b/drivers/irqchip/irq-gic-common.c
index 9ae7180..06ec1f2 100644
--- a/drivers/irqchip/irq-gic-common.c
+++ b/drivers/irqchip/irq-gic-common.c
@@ -90,6 +90,16 @@ int gic_configure_irq(unsigned int irq, unsigned int type,
 	return ret;
 }

+void gic_set_irq_prio(unsigned int irq, void __iomem *base, u8 prio)
+{
+	writeb_relaxed(prio, base + GIC_DIST_PRI + irq);
+}
+
+u8 gic_get_irq_prio(unsigned int irq, void __iomem *base)
+{
+	return readb_relaxed(base + GIC_DIST_PRI + irq);
+}
+
 void gic_dist_config(void __iomem *base, int gic_irqs,
 		     void (*sync_access)(void))
 {
diff --git a/drivers/irqchip/irq-gic-common.h b/drivers/irqchip/irq-gic-common.h
index 205e5fd..1c02518 100644
--- a/drivers/irqchip/irq-gic-common.h
+++ b/drivers/irqchip/irq-gic-common.h
@@ -35,6 +35,8 @@ void gic_dist_config(void __iomem *base, int gic_irqs,
 void gic_cpu_config(void __iomem *base, void (*sync_access)(void));
 void gic_enable_quirks(u32 iidr, const struct gic_quirk *quirks,
 		void *data);
+void gic_set_irq_prio(unsigned int irq, void __iomem *base, u8 prio);
+u8 gic_get_irq_prio(unsigned int irq, void __iomem *base);

 void gic_set_kvm_info(const struct gic_kvm_info *info);

--
1.9.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC 6/7] arm64: Detect current view of GIC priorities
  2017-10-11 13:00 [RFC 0/7] arm64: provide pseudo NMI with GICv3 Julien Thierry
                   ` (4 preceding siblings ...)
  2017-10-11 13:01 ` [RFC 5/7] irqchip/gic: Add functions to access irq priorities Julien Thierry
@ 2017-10-11 13:01 ` Julien Thierry
  2017-11-03 14:02   ` Marc Zyngier
  2017-10-11 13:01 ` [RFC 7/7] arm64: Add support for pseudo-NMIs Julien Thierry
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Julien Thierry @ 2017-10-11 13:01 UTC (permalink / raw)
  To: linux-arm-kernel

The values non-secure EL1 needs to use for priority registers depend on
the value of SCR_EL3.FIQ.

Since we don't have access to SCR_EL3, we fake an interrupt and compare the
GIC priority with the one present in the [re]distributor.

Also, add firmware requirements related to SCR_EL3.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 Documentation/arm64/booting.txt     |  5 +++
 arch/arm64/include/asm/arch_gicv3.h |  5 +++
 arch/arm64/include/asm/irqflags.h   |  6 +++
 arch/arm64/include/asm/sysreg.h     |  1 +
 drivers/irqchip/irq-gic-v3.c        | 83 +++++++++++++++++++++++++++++++++++++
 5 files changed, 100 insertions(+)

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
index 8d0df62..e387938 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.txt
@@ -188,6 +188,11 @@ Before jumping into the kernel, the following conditions must be met:
   the kernel image will be entered must be initialised by software at a
   higher exception level to prevent execution in an UNKNOWN state.

+  - SCR_EL3.FIQ must have the same value across all CPUs the kernel is
+    executing on.
+  - Whenever the kernel is executing, SCR_EL3.FIQ must retain the value
+    it had at boot time.
+
   For systems with a GICv3 interrupt controller to be used in v3 mode:
   - If EL3 is present:
     ICC_SRE_EL3.Enable (bit 3) must be initialiased to 0b1.
diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
index b6b430c..90a3dc3 100644
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -114,6 +114,11 @@ static inline void gic_write_bpr1(u32 val)
 	write_sysreg_s(val, SYS_ICC_BPR1_EL1);
 }

+static inline u32 gic_read_rpr(void)
+{
+	return read_sysreg_s(SYS_ICC_RPR_EL1);
+}
+
 #define gic_read_typer(c)		readq_relaxed(c)
 #define gic_write_irouter(v, c)		writeq_relaxed(v, c)
 #define gic_read_lpir(c)		readq_relaxed(c)
diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 1c69ebb..4718ca3f 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -205,6 +205,12 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
 		!(ARCH_FLAGS_GET_PMR(flags) & ICC_PMR_EL1_EN_BIT);
 }

+/* Mask IRQs at CPU level instead of GIC level */
+static inline void arch_irqs_daif_disable(void)
+{
+	asm volatile ("msr daifset, #2" : : : "memory");
+}
+
 void maybe_switch_to_sysreg_gic_cpuif(void);

 #endif /* CONFIG_IRQFLAGS_GIC_MASKING */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index f707fed..e71bc81 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -205,6 +205,7 @@
 #define SYS_ICC_SRE_EL1			sys_reg(3, 0, 12, 12, 5)
 #define SYS_ICC_IGRPEN0_EL1		sys_reg(3, 0, 12, 12, 6)
 #define SYS_ICC_IGRPEN1_EL1		sys_reg(3, 0, 12, 12, 7)
+#define SYS_ICC_RPR_EL1			sys_reg(3, 0, 12, 11, 3)

 #define SYS_CONTEXTIDR_EL1		sys_reg(3, 0, 13, 0, 1)
 #define SYS_TPIDR_EL1			sys_reg(3, 0, 13, 0, 4)
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 1cdf495..e0b235f 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -62,6 +62,10 @@ struct gic_chip_data {
 static struct gic_chip_data gic_data __read_mostly;
 static struct static_key supports_deactivate = STATIC_KEY_INIT_TRUE;

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+DEFINE_STATIC_KEY_FALSE(have_non_secure_prio_view);
+#endif
+
 static struct gic_kvm_info gic_v3_kvm_info;

 #define gic_data_rdist()		(this_cpu_ptr(gic_data.rdists.rdist))
@@ -969,6 +973,81 @@ static int partition_domain_translate(struct irq_domain *d,
 	.select = gic_irq_domain_select,
 };

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+/*
+ * The behaviours of RPR and PMR registers differ depending on the value of
+ * SCR_EL3.FIQ, while the behaviour of priority registers of the distributor
+ * and redistributors is always the same.
+ *
+ * If SCR_EL3.FIQ == 1, the values used for RPR and PMR are the same as the ones
+ * programmed in the distributor and redistributors registers.
+ *
+ * Otherwise, the value presented by RPR as well as the value which will be
+ * compared against PMR is: (GIC_(R)DIST_PRI[irq] >> 1) | 0x80;
+ *
+ * see GICv3/GICv4 Architecture Specification (IHI0069D):
+ * - section 4.8.1 Non-secure accesses to register fields for Secure interrupt
+ *   priorities.
+ * - Figure 4-7 Secure read of the priority field for a Non-secure Group 1
+ *   interrupt.
+ */
+static void __init gic_detect_prio_view(void)
+{
+	/*
+	 * Randomly picked SGI; it must be < 8, as the higher SGIs might
+	 * be used by the firmware.
+	 */
+	const u32 fake_irqnr = 7;
+	const u32 fake_irqmask = BIT(fake_irqnr);
+	void __iomem * const rdist_base = gic_data_rdist_sgi_base();
+	unsigned long irq_flags;
+	u32 acked_irqnr;
+	bool was_enabled;
+
+	irq_flags = arch_local_save_flags();
+
+	arch_irqs_daif_disable();
+
+	was_enabled = !!(readl_relaxed(rdist_base + GICD_ISENABLER) &
+			 fake_irqmask);
+
+	if (!was_enabled)
+		writel_relaxed(fake_irqmask, rdist_base + GICD_ISENABLER);
+
+	/* Need to unmask to acknowledge the IRQ */
+	gic_write_pmr(ICC_PMR_EL1_UNMASKED);
+	dsb(sy);
+
+	/* Fake a pending SGI */
+	writel_relaxed(fake_irqmask, rdist_base + GICD_ISPENDR);
+	dsb(sy);
+
+	acked_irqnr = gic_read_iar();
+
+	if (acked_irqnr == fake_irqnr) {
+		if (gic_read_rpr() == gic_get_irq_prio(acked_irqnr, rdist_base))
+			static_branch_enable(&have_non_secure_prio_view);
+	} else {
+		pr_warn("Unexpected IRQ for priority detection: %u\n",
+			acked_irqnr);
+	}
+
+	if (acked_irqnr < 1020) {
+		gic_write_eoir(acked_irqnr);
+		if (static_key_true(&supports_deactivate))
+			gic_write_dir(acked_irqnr);
+	}
+
+	/* Restore enabled state */
+	if (!was_enabled) {
+		writel_relaxed(fake_irqmask, rdist_base + GICD_ICENABLER);
+		gic_redist_wait_for_rwp();
+	}
+
+	arch_local_irq_restore(irq_flags);
+}
+#endif
+
 static int __init gic_init_bases(void __iomem *dist_base,
 				 struct redist_region *rdist_regs,
 				 u32 nr_redist_regions,
@@ -1025,6 +1104,10 @@ static int __init gic_init_bases(void __iomem *dist_base,
 	gic_cpu_init();
 	gic_cpu_pm_init();

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	gic_detect_prio_view();
+#endif
+
 	return 0;

 out_free:
--
1.9.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC 7/7] arm64: Add support for pseudo-NMIs
  2017-10-11 13:00 [RFC 0/7] arm64: provide pseudo NMI with GICv3 Julien Thierry
                   ` (5 preceding siblings ...)
  2017-10-11 13:01 ` [RFC 6/7] arm64: Detect current view of GIC priorities Julien Thierry
@ 2017-10-11 13:01 ` Julien Thierry
  2017-10-11 15:55 ` [RFC 0/7] arm64: provide pseudo NMI with GICv3 Daniel Thompson
  2017-11-03 11:50 ` Julien Thierry
  8 siblings, 0 replies; 19+ messages in thread
From: Julien Thierry @ 2017-10-11 13:01 UTC (permalink / raw)
  To: linux-arm-kernel

arm64 does not provide native NMIs. Emulate the NMI behaviour using GIC
priorities.

Add the possibility to set an IRQ as an NMI and the handling of the NMI.

If the view of GIC priorities is the secure one (i.e. SCR_EL3.FIQ == 0), do
not allow the use of NMIs. Emit a warning when attempting to set an IRQ as
NMI under this scenario.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kernel/entry.S    |  54 +++++++++++++++++
 drivers/irqchip/irq-gic-v3.c | 141 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/interrupt.h    |   1 +
 3 files changed, 196 insertions(+)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 1933fea..4fbe2fc 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -365,6 +365,16 @@ alternative_else_nop_endif
 	msr	daifclr, #2
 	alternative_else_nop_endif
 	.endm
+
+	/* Should be checked on return from irq handlers */
+	.macro	branch_if_was_nmi, tmp, target
+	alternative_if ARM64_HAS_SYSREG_GIC_CPUIF
+	mrs	\tmp, daif
+	alternative_else
+	mov	\tmp, #0
+	alternative_endif
+	tbnz	\tmp, #7, \target // Exiting an NMI
+	.endm
 #endif

 /*
@@ -610,12 +620,30 @@ ENDPROC(el1_sync)
 el1_irq:
 	kernel_entry 1
 	enable_dbg
+
 #ifdef CONFIG_TRACE_IRQFLAGS
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	ldr	x20, [sp, #S_PMR_SAVE]
+	/* Irqs were disabled, don't trace */
+	tbz	x20, ICC_PMR_EL1_EN_SHIFT, 1f
+#endif
 	bl	trace_hardirqs_off
+1:
 #endif

 	irq_handler

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	/*
+	 * If IRQs were disabled, we are handling an NMI.
+	 * We might have interrupted a context with interrupts disabled
+	 * that set the NEED_RESCHED flag.
+	 * Skip preemption and IRQ tracing if needed.
+	 */
+	tbz	x20, ICC_PMR_EL1_EN_SHIFT, untraced_irq_exit
+	branch_if_was_nmi x0, skip_preempt
+#endif
+
 #ifdef CONFIG_PREEMPT
 	ldr	w24, [tsk, #TSK_TI_PREEMPT]	// get preempt count
 	cbnz	w24, 1f				// preempt count != 0
@@ -624,9 +652,13 @@ el1_irq:
 	bl	el1_preempt
 1:
 #endif
+
+skip_preempt:
 #ifdef CONFIG_TRACE_IRQFLAGS
 	bl	trace_hardirqs_on
 #endif
+
+untraced_irq_exit:
 	kernel_exit 1
 ENDPROC(el1_irq)

@@ -839,6 +871,11 @@ el0_irq_naked:
 #ifdef CONFIG_TRACE_IRQFLAGS
 	bl	trace_hardirqs_on
 #endif
+
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	branch_if_was_nmi x2, nmi_ret_to_user
+#endif
+
 	b	ret_to_user
 ENDPROC(el0_irq)

@@ -984,8 +1021,15 @@ ENTRY(cpu_switch_to)
 	ldp	x27, x28, [x8], #16
 	ldp	x29, x9, [x8], #16
 	ldr	lr, [x8]
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	mrs	x10, daif
+	msr	daifset, #2
+#endif
 	mov	sp, x9
 	msr	sp_el0, x1
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	msr	daif, x10
+#endif
 	ret
 ENDPROC(cpu_switch_to)
 NOKPROBE(cpu_switch_to)
@@ -1002,3 +1046,13 @@ ENTRY(ret_from_fork)
 	b	ret_to_user
 ENDPROC(ret_from_fork)
 NOKPROBE(ret_from_fork)
+
+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+/*
+ * NMI return path to EL0
+ */
+nmi_ret_to_user:
+	ldr	x1, [tsk, #TSK_TI_FLAGS]
+	b	finish_ret_to_user
+ENDPROC(nmi_ret_to_user)
+#endif
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index e0b235f..990a7e3 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -34,6 +34,8 @@
 #include <linux/irqchip/arm-gic-v3.h>
 #include <linux/irqchip/irq-partition-percpu.h>

+#include <trace/events/irq.h>
+
 #include <asm/cputype.h>
 #include <asm/exception.h>
 #include <asm/smp_plat.h>
@@ -41,6 +43,8 @@

 #include "irq-gic-common.h"

+#define GICD_INT_NMI_PRI		0xa0
+
 struct redist_region {
 	void __iomem		*redist_base;
 	phys_addr_t		phys_base;
@@ -224,6 +228,87 @@ static void gic_unmask_irq(struct irq_data *d)
 	gic_poke_irq(d, GICD_ISENABLER);
 }

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+/*
+ * Chip flow handler for SPIs set as NMI
+ */
+static void handle_fasteoi_nmi(struct irq_desc *desc)
+{
+	struct irq_chip *chip = irq_desc_get_chip(desc);
+	struct irqaction *action = desc->action;
+	unsigned int irq = irq_desc_get_irq(desc);
+	irqreturn_t res;
+
+	if (chip->irq_ack)
+		chip->irq_ack(&desc->irq_data);
+
+	trace_irq_handler_entry(irq, action);
+	res = action->handler(irq, action->dev_id);
+	trace_irq_handler_exit(irq, action, res);
+
+	if (chip->irq_eoi)
+		chip->irq_eoi(&desc->irq_data);
+}
+
+/*
+ * Chip flow handler for PPIs set as NMI
+ */
+static void handle_percpu_devid_nmi(struct irq_desc *desc)
+{
+	struct irq_chip *chip = irq_desc_get_chip(desc);
+	struct irqaction *action = desc->action;
+	unsigned int irq = irq_desc_get_irq(desc);
+	irqreturn_t res;
+
+	if (chip->irq_ack)
+		chip->irq_ack(&desc->irq_data);
+
+	trace_irq_handler_entry(irq, action);
+	res = action->handler(irq, raw_cpu_ptr(action->percpu_dev_id));
+	trace_irq_handler_exit(irq, action, res);
+
+	if (chip->irq_eoi)
+		chip->irq_eoi(&desc->irq_data);
+}
+
+static int gic_irq_set_irqchip_prio(struct irq_data *d, bool val)
+{
+	u8 prio;
+	irq_flow_handler_t handler;
+
+	if (gic_peek_irq(d, GICD_ISENABLER)) {
+		pr_err("Cannot set NMI property of enabled IRQ %u\n", d->irq);
+		return -EPERM;
+	}
+
+	if (val) {
+		prio = GICD_INT_NMI_PRI;
+
+		if (gic_irq(d) < 32)
+			handler = handle_percpu_devid_nmi;
+		else
+			handler = handle_fasteoi_nmi;
+	} else {
+		prio = GICD_INT_DEF_PRI;
+
+		if (gic_irq(d) < 32)
+			handler = handle_percpu_devid_irq;
+		else
+			handler = handle_fasteoi_irq;
+	}
+
+	/*
+	 * We are already in a locked context for the desc, from the call
+	 * to irq_set_irqchip_state().
+	 * It should be safe to simply modify the handler.
+	 */
+	irq_to_desc(d->irq)->handle_irq = handler;
+	gic_set_irq_prio(gic_irq(d), gic_dist_base(d), prio);
+
+	return 0;
+}
+#endif
+
 static int gic_irq_set_irqchip_state(struct irq_data *d,
 				     enum irqchip_irq_state which, bool val)
 {
@@ -245,6 +330,18 @@ static int gic_irq_set_irqchip_state(struct irq_data *d,
 		reg = val ? GICD_ICENABLER : GICD_ISENABLER;
 		break;

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	case IRQCHIP_STATE_NMI:
+		if (static_branch_likely(&have_non_secure_prio_view)) {
+			return gic_irq_set_irqchip_prio(d, val);
+		} else if (val) {
+			pr_warn("Failed to set IRQ %u as NMI, NMIs are unsupported\n",
+				gic_irq(d));
+			return -EINVAL;
+		}
+		return 0;
+#endif
+
 	default:
 		return -EINVAL;
 	}
@@ -272,6 +369,13 @@ static int gic_irq_get_irqchip_state(struct irq_data *d,
 		*val = !gic_peek_irq(d, GICD_ISENABLER);
 		break;

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+	case IRQCHIP_STATE_NMI:
+		*val = (gic_get_irq_prio(gic_irq(d), gic_dist_base(d)) ==
+			GICD_INT_NMI_PRI);
+		break;
+#endif
+
 	default:
 		return -EINVAL;
 	}
@@ -342,6 +446,22 @@ static u64 gic_mpidr_to_affinity(unsigned long mpidr)
 	return aff;
 }

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+static void do_handle_nmi(unsigned int hwirq, struct pt_regs *regs)
+{
+	struct pt_regs *old_regs = set_irq_regs(regs);
+	unsigned int irq;
+
+	nmi_enter();
+
+	irq = irq_find_mapping(gic_data.domain, hwirq);
+	generic_handle_irq(irq);
+
+	nmi_exit();
+	set_irq_regs(old_regs);
+}
+#endif
+
 static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
 {
 	u32 irqnr;
@@ -357,6 +477,25 @@ static asmlinkage void __exception_irq_entry gic_handle_irq(struct pt_regs *regs
 	if (likely(irqnr > 15 && irqnr < 1020) || irqnr >= 8192) {
 		int err;

+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
+		if (static_branch_likely(&have_non_secure_prio_view)
+		    && unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) {
+			/*
+			 * We need to prevent other NMIs from occurring even
+			 * after a priority drop.
+			 * We keep the I flag set until the CPSR is restored
+			 * by kernel_exit.
+			 */
+			arch_irqs_daif_disable();
+
+			if (static_key_true(&supports_deactivate))
+				gic_write_eoir(irqnr);
+
+			do_handle_nmi(irqnr, regs);
+			return;
+		}
+#endif
+
 		if (static_key_true(&supports_deactivate))
 			gic_write_eoir(irqnr);
 		else {
@@ -1027,6 +1166,8 @@ static void __init gic_detect_prio_view(void)
 	if (acked_irqnr == fake_irqnr) {
 		if (gic_read_rpr() == gic_get_irq_prio(acked_irqnr, rdist_base))
 			static_branch_enable(&have_non_secure_prio_view);
+		else
+			pr_warn("Cannot enable use of pseudo-NMIs\n");
 	} else {
 		pr_warn("Unexpected IRQ for priority detection: %u\n",
 			acked_irqnr);
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 59ba116..bb637d8 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -420,6 +420,7 @@ enum irqchip_irq_state {
 	IRQCHIP_STATE_ACTIVE,		/* Is interrupt in progress? */
 	IRQCHIP_STATE_MASKED,		/* Is interrupt masked? */
 	IRQCHIP_STATE_LINE_LEVEL,	/* Is IRQ line high? */
+	IRQCHIP_STATE_NMI,		/* Is IRQ an NMI? */
 };

 extern int irq_get_irqchip_state(unsigned int irq, enum irqchip_irq_state which,
--
1.9.1

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC 0/7] arm64: provide pseudo NMI with GICv3
  2017-10-11 13:00 [RFC 0/7] arm64: provide pseudo NMI with GICv3 Julien Thierry
                   ` (6 preceding siblings ...)
  2017-10-11 13:01 ` [RFC 7/7] arm64: Add support for pseudo-NMIs Julien Thierry
@ 2017-10-11 15:55 ` Daniel Thompson
  2017-10-11 16:10   ` Julien Thierry
  2017-11-03 11:50 ` Julien Thierry
  8 siblings, 1 reply; 19+ messages in thread
From: Daniel Thompson @ 2017-10-11 15:55 UTC (permalink / raw)
  To: linux-arm-kernel

On 11 October 2017 at 14:00, Julien Thierry <julien.thierry@arm.com> wrote:
> Hi,
>
> This series is a continuation of the work started by Daniel [1]. The goal is
> to use GICv3 interrupt priorities to simulate an NMI.

Cool. By coincidence, and after ignoring the code for three kernel
releases, I'm halfway through rebasing this to v4.14... and *very*
happy to be able to grab yours instead. There's a machine I want to
play with it on!

Have you just brought it up to date or have you acted on Marc Z.'s
pending feedback from the last circulation?


> To achieve this, set two priorities, one for standard interrupts and
> another, higher priority, for NMIs. Whenever we want to disable interrupts,
> we mask the standard priority instead so NMIs can still be raised. Some
> corner cases, though, still require actually masking all interrupts,
> effectively disabling NMIs.
>
> Of course, using priority masking instead of PSR.I comes at some cost. On
> hackbench, I notice a ~6% drop of performance. However, on a kernel build,
> I do not see significant differences.

Which micro-architecture?

In micro-benchmarks (with hot caches) I've seen the lock sequence
speed up on some designs, slow down a bit on some and slow down a
*lot* on others. In the worst cases the code impacts Android macro
benchmarks like scrolling... on the other hand the code was stable
enough to run the whole Android stack reliably.


Daniel.


* [RFC 0/7] arm64: provide pseudo NMI with GICv3
  2017-10-11 15:55 ` [RFC 0/7] arm64: provide pseudo NMI with GICv3 Daniel Thompson
@ 2017-10-11 16:10   ` Julien Thierry
  0 siblings, 0 replies; 19+ messages in thread
From: Julien Thierry @ 2017-10-11 16:10 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Daniel,

On 11/10/17 16:55, Daniel Thompson wrote:
> On 11 October 2017 at 14:00, Julien Thierry <julien.thierry@arm.com> wrote:
>> Hi,
>>
>> This series is a continuation of the work started by Daniel [1]. The goal is
>> to use GICv3 interrupt priorities to simulate an NMI.
> 
> Cool. By coincidence, and after ignoring the code for three kernel
> releases, I'm halfway through rebasing this to v4.14... and *very*
> happy to be able to grab yours instead. There's a machine I want to
> play with it on!
> 

Good to know someone will be testing this then ;) , thanks.

> Have you just brought it up to date or have you acted on Marc Z.'s
> pending feedback from the last circulation?
> 

The first 3 patches should be pretty much identical. I've done a bit of 
rework on patch 4, acting on some of Marc's comments, fixing some issues 
I found along the way, and doing some refactoring.

Then I added code to the GICv3 irqchip so you can turn an IRQ into an 
NMI like so:
irq_set_irqchip_state(virq, IRQCHIP_STATE_NMI, true);

So this means you need to do it through a virtual irq number (i.e. no 
IPIs). I started working on some patches to make it work with IPIs as 
well, but I wanted to get some feedback on this first before finalizing them.

> 
>> To achieve this, set two priorities, one for standard interrupts and
>> another, higher priority, for NMIs. Whenever we want to disable interrupts,
>> we mask the standard priority instead so NMIs can still be raised. Some
>> corner cases, though, still require actually masking all interrupts,
>> effectively disabling NMIs.
>>
>> Of course, using priority masking instead of PSR.I comes at some cost. On
>> hackbench, I notice a ~6% drop of performance. However, on a kernel build,
>> I do not see significant differences.
> 
> Which micro-architecture?
> 

8 cores A57.

> In micro-benchmarks (with hot caches) I've seen the lock sequence
> speed up on some designs, slow down a bit on some and slow down a
> *lot* on others. In the worst cases the code impacts Android macro
> benchmarks like scrolling... on the other hand the code was stable
> enough to run the whole Android stack reliably.
> 

So far I only tested with classic Linux distros, so if you find issues 
with Android (or other) do let me know.

Same if you have any question about the changes.

Thanks,

-- 
Julien Thierry


* [RFC 4/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking
  2017-10-11 13:00 ` [RFC 4/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking Julien Thierry
@ 2017-10-16 21:18   ` Christoffer Dall
  2017-10-17  8:36     ` Julien Thierry
  0 siblings, 1 reply; 19+ messages in thread
From: Christoffer Dall @ 2017-10-16 21:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Oct 11, 2017 at 02:00:59PM +0100, Julien Thierry wrote:
> From: Daniel Thompson <daniel.thompson@linaro.org>
> 
> Currently irqflags is implemented using the PSR's I bit. It is possible
> to implement irqflags by using the co-processor interface to the GIC.
> Using the co-processor interface makes it feasible to simulate NMIs
> using GIC interrupt prioritization.
> 
> This patch changes the irqflags macros to modify, save and restore
> ICC_PMR_EL1. This has a substantial knock on effect for the rest of
> the kernel. There are four reasons for this:
> 
> 1. The state of the PMR becomes part of the interrupt context and must be
>    saved and restored during exceptions. It is saved on the stack as part
>    of the saved context when an interrupt/exception is taken.
> 
> 2. The hardware automatically sets the I bit (at boot, during traps, etc).
>    When the I bit is set by hardware we must add code to switch from I
>    bit masking to PMR masking:
>        - For IRQs, this is done after the interrupt has been acknowledged,
>          avoiding the need to unmask.
>        - For other exceptions, this is done right after saving the context.
> 
> 3. Some instructions, such as wfi, require that the PMR not be used
>    for interrupt masking. Before calling these instructions we must
>    switch from PMR masking to I bit masking.
>    This is also the case when KVM runs a guest, if the CPU receives
>    an interrupt from the host, interrupts must not be masked in PMR
>    otherwise the GIC will not signal it to the CPU.
> 
> 4. We use the alternatives system to allow a single kernel to boot and
>    be switched to the alternative masking approach at runtime.
> 
> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> [julien.thierry at arm.com: changes reflected in commit,
> 			 message, fixes, renaming]
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Christoffer Dall <christoffer.dall@linaro.org>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Jason Cooper <jason@lakedaemon.net>
---

I just gave a quick look to the KVM part here but didn't get into
whether this rather invasive change is warranted or not (it's definitely
entertaining though).

[...]


> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index e923b58..070e8a5 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -38,6 +38,10 @@
>  #include <kvm/arm_arch_timer.h>
>  #include <kvm/arm_pmu.h>
> 
> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
> +#include <asm/arch_gicv3.h>
> +#endif
> +
>  #define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
> 
>  #define KVM_VCPU_MAX_FEATURES 4
> @@ -332,7 +336,30 @@ int kvm_unmap_hva_range(struct kvm *kvm,
>  void kvm_arm_resume_guest(struct kvm *kvm);
> 
>  u64 __kvm_call_hyp(void *hypfn, ...);
> +
> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
> +/*
> + * If we use PMR to disable interrupts, interrupts happening while the

I would describe this as 'masking' interrupts, as opposed to 'disabling'
interrupts, in general, and probably we can be more precise by
categorizing interrupts as 'normal' vs. 'priority (NMIs)' or something
like that.

> + * guest is executing will not be signaled to the host by the GIC  (and

signaled to the CPU.  (Whether this then causes an exception to EL2
depends on the HCR_EL2.IMO flag and what KVM does with that exception).

> + * any call to a waiting kvm_kick_many_cpus will likely cause a
> + * deadlock).

This kick thing is an unnecessarily specific example to bring out here; I
don't think it adds to the general understanding.

> + * We need to switch back to using PSR.I.
> + */
> +#define kvm_call_hyp(f, ...)						\
> +	({								\
> +		u64 res;						\
> +		unsigned long flags;					\
> +									\
> +		flags = arch_local_irq_save();				\
> +		gic_end_pmr_masking();					\
> +		res = __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__);	\
> +		gic_start_pmr_masking();				\
> +		arch_local_irq_restore(flags);				\
> +		res;							\

This is in the critical path, so ideally this could instead be done
inside the function, because the trap will set PSTATE.I already, so it
should be possible to reduce this to a set of save/clear/restore
operations after __kvm_call_hyp.

> +	})
> +#else
>  #define kvm_call_hyp(f, ...) __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__)
> +#endif
> 
>  void force_vm_exit(const cpumask_t *mask);
>  void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
> @@ -349,6 +376,9 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
>  				       unsigned long hyp_stack_ptr,
>  				       unsigned long vector_ptr)
>  {
> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
> +	unsigned long flags;
> +#endif
>  	/*
>  	 * Call initialization code, and switch to the full blown HYP code.
>  	 * If the cpucaps haven't been finalized yet, something has gone very
> @@ -356,7 +386,17 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
>  	 * cpus_have_const_cap() wrapper.
>  	 */
>  	BUG_ON(!static_branch_likely(&arm64_const_caps_ready));
> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
> +	flags = arch_local_irq_save();
> +	gic_end_pmr_masking();
> +#endif
> +
>  	__kvm_call_hyp((void *)pgd_ptr, hyp_stack_ptr, vector_ptr);
> +
> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
> +	gic_start_pmr_masking();
> +	arch_local_irq_restore(flags);
> +#endif

I don't think you need any of this, because I don't think you'd ever
need to handle interrupts while initializing hyp mode.

>  }
> 
Thanks,
-Christoffer


* [RFC 4/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking
  2017-10-16 21:18   ` Christoffer Dall
@ 2017-10-17  8:36     ` Julien Thierry
  2017-10-17  9:12       ` Christoffer Dall
  0 siblings, 1 reply; 19+ messages in thread
From: Julien Thierry @ 2017-10-17  8:36 UTC (permalink / raw)
  To: linux-arm-kernel



On 16/10/17 22:18, Christoffer Dall wrote:
> On Wed, Oct 11, 2017 at 02:00:59PM +0100, Julien Thierry wrote:
>> From: Daniel Thompson <daniel.thompson@linaro.org>
>>
>> Currently irqflags is implemented using the PSR's I bit. It is possible
>> to implement irqflags by using the co-processor interface to the GIC.
>> Using the co-processor interface makes it feasible to simulate NMIs
>> using GIC interrupt prioritization.
>>
>> This patch changes the irqflags macros to modify, save and restore
>> ICC_PMR_EL1. This has a substantial knock on effect for the rest of
>> the kernel. There are four reasons for this:
>>
>> 1. The state of the PMR becomes part of the interrupt context and must be
>>     saved and restored during exceptions. It is saved on the stack as part
>>     of the saved context when an interrupt/exception is taken.
>>
>> 2. The hardware automatically sets the I bit (at boot, during traps, etc).
>>     When the I bit is set by hardware we must add code to switch from I
>>     bit masking to PMR masking:
>>         - For IRQs, this is done after the interrupt has been acknowledged,
>>           avoiding the need to unmask.
>>         - For other exceptions, this is done right after saving the context.
>>
>> 3. Some instructions, such as wfi, require that the PMR not be used
>>     for interrupt masking. Before calling these instructions we must
>>     switch from PMR masking to I bit masking.
>>     This is also the case when KVM runs a guest, if the CPU receives
>>     an interrupt from the host, interrupts must not be masked in PMR
>>     otherwise the GIC will not signal it to the CPU.
>>
>> 4. We use the alternatives system to allow a single kernel to boot and
>>     be switched to the alternative masking approach at runtime.
>>
>> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
>> [julien.thierry at arm.com: changes reflected in commit,
>> 			 message, fixes, renaming]
>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Jason Cooper <jason@lakedaemon.net>
> ---
> 
> I just gave a quick look to the KVM part here but didn't get into
> whether this rather invasive change is warranted or not (it's definitely
> entertaining though).
> 

I appreciate you looking into this, it wasn't very clear to me how to 
deal with that for KVM.

> [...]
> 
> 
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index e923b58..070e8a5 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -38,6 +38,10 @@
>>   #include <kvm/arm_arch_timer.h>
>>   #include <kvm/arm_pmu.h>
>>
>> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>> +#include <asm/arch_gicv3.h>
>> +#endif
>> +
>>   #define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
>>
>>   #define KVM_VCPU_MAX_FEATURES 4
>> @@ -332,7 +336,30 @@ int kvm_unmap_hva_range(struct kvm *kvm,
>>   void kvm_arm_resume_guest(struct kvm *kvm);
>>
>>   u64 __kvm_call_hyp(void *hypfn, ...);
>> +
>> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>> +/*
>> + * If we use PMR to disable interrupts, interrupts happening while the
> 
> I would describe this as 'masking' interrupts, as opposed to 'disabling'
> interrupts, in general, and probably we can be more precise by
> categorizing interrupts as 'normal' vs. 'priority (NMIs)' or something
> like that.
> 

Right, I'll do that.

>> + * guest is executing will not be signaled to the host by the GIC  (and
> 
> signaled to the CPU.  (Whether this then causes an exception to EL2
> depends on the HCR_EL2.IMO flag and what KVM does with that exception).
> 

True, I'll fix that.

>> + * any call to a waiting kvm_kick_many_cpus will likely cause a
>> + * deadlock).
> 
> >This kick thing is an unnecessarily specific example to bring out here; I
> >don't think it adds to the general understanding.
> 

Yes, makes sense.

>> + * We need to switch back to using PSR.I.
>> + */
>> +#define kvm_call_hyp(f, ...)						\
>> +	({								\
>> +		u64 res;						\
>> +		unsigned long flags;					\
>> +									\
>> +		flags = arch_local_irq_save();				\
>> +		gic_end_pmr_masking();					\
>> +		res = __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__);	\
>> +		gic_start_pmr_masking();				\
>> +		arch_local_irq_restore(flags);				\
>> +		res;							\
> 
> This is in the critical path, so ideally this could instead be done
> inside the function, because the trap will set PSTATE.I already, so it
> should be possible to reduce this to a set of save/clear/restore
> operations after __kvm_call_hyp.

Hmmm, you are talking about the function called by __kvm_call_hyp, right?

But this means we'll need to add the save/clear/restore to each function 
that can be called by __kvm_call_hyp, no?

Also in the VHE case we don't have the trap setting PSTATE.I (the 
code calling kvm_call_hyp disables interrupts before entering the guest, 
but now the function disabling interrupts is just masking them with PMR).

> 
>> +	})
>> +#else
>>   #define kvm_call_hyp(f, ...) __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__)
>> +#endif
>>
>>   void force_vm_exit(const cpumask_t *mask);
>>   void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
>> @@ -349,6 +376,9 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
>>   				       unsigned long hyp_stack_ptr,
>>   				       unsigned long vector_ptr)
>>   {
>> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>> +	unsigned long flags;
>> +#endif
>>   	/*
>>   	 * Call initialization code, and switch to the full blown HYP code.
>>   	 * If the cpucaps haven't been finalized yet, something has gone very
>> @@ -356,7 +386,17 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
>>   	 * cpus_have_const_cap() wrapper.
>>   	 */
>>   	BUG_ON(!static_branch_likely(&arm64_const_caps_ready));
>> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>> +	flags = arch_local_irq_save();
>> +	gic_end_pmr_masking();
>> +#endif
>> +
>>   	__kvm_call_hyp((void *)pgd_ptr, hyp_stack_ptr, vector_ptr);
>> +
>> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>> +	gic_start_pmr_masking();
>> +	arch_local_irq_restore(flags);
>> +#endif
> 
> I don't think you need any of this, because I don't think you'd ever
> need to handle interrupts while initializing hyp mode.
> 

I was more worried about getting an "NMI" during the EL2 initialisation 
rather than missing the masked interrupts. Do you know whether this 
might be an issue here or not?

Thanks,

-- 
Julien Thierry


* [RFC 4/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking
  2017-10-17  8:36     ` Julien Thierry
@ 2017-10-17  9:12       ` Christoffer Dall
  2017-11-03 11:57         ` Julien Thierry
  0 siblings, 1 reply; 19+ messages in thread
From: Christoffer Dall @ 2017-10-17  9:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Oct 17, 2017 at 09:36:51AM +0100, Julien Thierry wrote:
> 
> 
> On 16/10/17 22:18, Christoffer Dall wrote:
> >On Wed, Oct 11, 2017 at 02:00:59PM +0100, Julien Thierry wrote:
> >>From: Daniel Thompson <daniel.thompson@linaro.org>
> >>
> >>Currently irqflags is implemented using the PSR's I bit. It is possible
> >>to implement irqflags by using the co-processor interface to the GIC.
> >>Using the co-processor interface makes it feasible to simulate NMIs
> >>using GIC interrupt prioritization.
> >>
> >>This patch changes the irqflags macros to modify, save and restore
> >>ICC_PMR_EL1. This has a substantial knock on effect for the rest of
> >>the kernel. There are four reasons for this:
> >>
> >>1. The state of the PMR becomes part of the interrupt context and must be
> >>    saved and restored during exceptions. It is saved on the stack as part
> >>    of the saved context when an interrupt/exception is taken.
> >>
> >>2. The hardware automatically sets the I bit (at boot, during traps, etc).
> >>    When the I bit is set by hardware we must add code to switch from I
> >>    bit masking to PMR masking:
> >>        - For IRQs, this is done after the interrupt has been acknowledged,
> >>          avoiding the need to unmask.
> >>        - For other exceptions, this is done right after saving the context.
> >>
> >>3. Some instructions, such as wfi, require that the PMR not be used
> >>    for interrupt masking. Before calling these instructions we must
> >>    switch from PMR masking to I bit masking.
> >>    This is also the case when KVM runs a guest, if the CPU receives
> >>    an interrupt from the host, interrupts must not be masked in PMR
> >>    otherwise the GIC will not signal it to the CPU.
> >>
> >>4. We use the alternatives system to allow a single kernel to boot and
> >>    be switched to the alternative masking approach at runtime.
> >>
> >>Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> >>[julien.thierry at arm.com: changes reflected in commit,
> >>			 message, fixes, renaming]
> >>Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> >>Cc: Catalin Marinas <catalin.marinas@arm.com>
> >>Cc: Will Deacon <will.deacon@arm.com>
> >>Cc: Christoffer Dall <christoffer.dall@linaro.org>
> >>Cc: Marc Zyngier <marc.zyngier@arm.com>
> >>Cc: Thomas Gleixner <tglx@linutronix.de>
> >>Cc: Jason Cooper <jason@lakedaemon.net>
> >---
> >
> >I just gave a quick look to the KVM part here but didn't get into
> >whether this rather invasive change is warranted or not (it's definitely
> >entertaining though).
> >
> 
> I appreciate you looking into this, it wasn't very clear to me how to deal
> with that for KVM.
> 
> >[...]
> >
> >
> >>diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> >>index e923b58..070e8a5 100644
> >>--- a/arch/arm64/include/asm/kvm_host.h
> >>+++ b/arch/arm64/include/asm/kvm_host.h
> >>@@ -38,6 +38,10 @@
> >>  #include <kvm/arm_arch_timer.h>
> >>  #include <kvm/arm_pmu.h>
> >>
> >>+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
> >>+#include <asm/arch_gicv3.h>
> >>+#endif
> >>+
> >>  #define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
> >>
> >>  #define KVM_VCPU_MAX_FEATURES 4
> >>@@ -332,7 +336,30 @@ int kvm_unmap_hva_range(struct kvm *kvm,
> >>  void kvm_arm_resume_guest(struct kvm *kvm);
> >>
> >>  u64 __kvm_call_hyp(void *hypfn, ...);
> >>+
> >>+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
> >>+/*
> >>+ * If we use PMR to disable interrupts, interrupts happening while the
> >
> >I would describe this as 'masking' interrupts, as opposed to 'disabling'
> >interrupts, in general, and probably we can be more precise by
> >categorizing interrupts as 'normal' vs. 'priority (NMIs)' or something
> >like that.
> >
> 
> Right, I'll do that.
> 
> >>+ * guest is executing will not be signaled to the host by the GIC  (and
> >
> >signaled to the CPU.  (Whether this then causes an exception to EL2
> >depends on the HCR_EL2.IMO flag and what KVM does with that exception).
> >
> 
> True, I'll fix that.
> 
> >>+ * any call to a waiting kvm_kick_many_cpus will likely cause a
> >>+ * deadlock).
> >
> >This kick thing is an unnecessarily specific example to bring out here; I
> >don't think it adds to the general understanding.
> >
> 
> Yes, makes sense.
> 
> >>+ * We need to switch back to using PSR.I.
> >>+ */
> >>+#define kvm_call_hyp(f, ...)						\
> >>+	({								\
> >>+		u64 res;						\
> >>+		unsigned long flags;					\
> >>+									\
> >>+		flags = arch_local_irq_save();				\
> >>+		gic_end_pmr_masking();					\
> >>+		res = __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__);	\
> >>+		gic_start_pmr_masking();				\
> >>+		arch_local_irq_restore(flags);				\
> >>+		res;							\
> >
> >This is in the critical path, so ideally this could instead be done
> >inside the function, because the trap will set PSTATE.I already, so it
> >should be possible to reduce this to a set of save/clear/restore
> >operations after __kvm_call_hyp.
> 
> Hmmm, you are talking about the function called by __kvm_call_hyp, right?
> 
> But this means we'll need to add the save/clear/restore to each function
> that can be called by __kvm_call_hyp, no?

No, you only need to do this in the case where you run the guest,
__kvm_vcpu_run, because all the other callers don't need to be able to
take interrupts as they are under control of the host.

> 
> Also in the VHE case we don't have the trap setting the PSTATE.I (the code
> calling kvm_call_hyp disables interrupts before entering the guest, but now
> the function disabling interrupts is just masking them with PMR).
> 

Yes, for VHE you'd have to do something more.  One option would be to
disable interrupts using PSTATE.I in kvm_arch_vcpu_ioctl_run() for VHE,
another option is to simply mask interrupts in the PSTATE only for VHE
in __kvm_vcpu_run.

> >
> >>+	})
> >>+#else
> >>  #define kvm_call_hyp(f, ...) __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__)
> >>+#endif
> >>
> >>  void force_vm_exit(const cpumask_t *mask);
> >>  void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
> >>@@ -349,6 +376,9 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
> >>  				       unsigned long hyp_stack_ptr,
> >>  				       unsigned long vector_ptr)
> >>  {
> >>+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
> >>+	unsigned long flags;
> >>+#endif
> >>  	/*
> >>  	 * Call initialization code, and switch to the full blown HYP code.
> >>  	 * If the cpucaps haven't been finalized yet, something has gone very
> >>@@ -356,7 +386,17 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
> >>  	 * cpus_have_const_cap() wrapper.
> >>  	 */
> >>  	BUG_ON(!static_branch_likely(&arm64_const_caps_ready));
> >>+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
> >>+	flags = arch_local_irq_save();
> >>+	gic_end_pmr_masking();
> >>+#endif
> >>+
> >>  	__kvm_call_hyp((void *)pgd_ptr, hyp_stack_ptr, vector_ptr);
> >>+
> >>+#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
> >>+	gic_start_pmr_masking();
> >>+	arch_local_irq_restore(flags);
> >>+#endif
> >
> >I don't think you need any of this, because I don't think you'd ever
> >need to handle interrupts while initializing hyp mode.
> >
> 
> I was more worried about getting an "NMI" during the EL2 initialisation
> rather than missing the masked interrupts. Do you know whether this might be
> an issue here or not?
> 

For non-VHE it won't be an issue, because making a hyp call will mask
interrupts using PSTATE.I, and for VHE we don't do any work here, so
it's not an issue there either.


Thanks,
-Christoffer


* [RFC 0/7] arm64: provide pseudo NMI with GICv3
  2017-10-11 13:00 [RFC 0/7] arm64: provide pseudo NMI with GICv3 Julien Thierry
                   ` (7 preceding siblings ...)
  2017-10-11 15:55 ` [RFC 0/7] arm64: provide pseudo NMI with GICv3 Daniel Thompson
@ 2017-11-03 11:50 ` Julien Thierry
  8 siblings, 0 replies; 19+ messages in thread
From: Julien Thierry @ 2017-11-03 11:50 UTC (permalink / raw)
  To: linux-arm-kernel

Gentle ping on this series.

On 11/10/17 14:00, Julien Thierry wrote:
> Hi,
> 
> This series is a continuation of the work started by Daniel [1]. The goal is
> to use GICv3 interrupt priorities to simulate an NMI.
> 
> To achieve this, set two priorities, one for standard interrupts and
> another, higher priority, for NMIs. Whenever we want to disable interrupts,
> we mask the standard priority instead so NMIs can still be raised. Some
> corner cases, though, still require actually masking all interrupts,
> effectively disabling NMIs.
> 
> Of course, using priority masking instead of PSR.I comes at some cost. On
> hackbench, I notice a ~6% drop of performance. However, on a kernel build,
> I do not see significant differences.
> 
> Currently, only PPIs and SPIs can be set as NMIs. Since IPIs are currently
> hardcoded IRQ numbers, there isn't a generic interface to set SGIs as NMIs
> for now. I don't think there is any reason LPIs should be allowed to be set
> as NMIs, as they do not have an active state.
> When an NMI is active on a CPU, no other NMI can be triggered on the CPU.
> 
> Requirements to use this:
> - Have GICv3
> - SCR_EL3.FIQ is set to 1 when linux runs
> - Select Kernel Feature -> Use ICC system registers for IRQ masking
> 
>  From my testing with the perf interrupt, there doesn't seem to be any major
> issue left (but if you find some, do tell!).
> 
> * Patches 1 and 5 are minor utility changes.
> * Patches 2 and 3 allow detecting and enabling the use of GICv3 system
>    registers during boot.
> * Patch 4 introduces the masking of IRQs using priorities, replacing IRQ
>    disabling.
> * Patch 6 adds detection of the view Linux has of GICv3 priorities; without
>    this we cannot easily mask specific priorities in an accurate manner.
> * Patch 7 adds support for NMIs.
> 
> 
> [1] http://www.spinics.net/lists/arm-kernel/msg525077.html
> 
> Thanks,
> 
> Julien Thierry
> 
> 
> Daniel Thompson (4):
>    arm64: irqflags: Reorder the fiq & async macros
>    arm64: cpufeature: Allow early detect of specific features
>    arm64: alternative: Apply alternatives early in boot process
>    arm64: irqflags: Use ICC sysregs to implement IRQ masking
> 
> Julien Thierry (3):
>    irqchip/gic: Add functions to access irq priorities
>    arm64: Detect current view of GIC priorities
>    arm64: Add support for pseudo-NMIs
> 
>   Documentation/arm64/booting.txt        |   5 +
>   arch/arm64/Kconfig                     |  15 ++
>   arch/arm64/include/asm/alternative.h   |   1 +
>   arch/arm64/include/asm/arch_gicv3.h    |  24 +++
>   arch/arm64/include/asm/assembler.h     |  33 +++-
>   arch/arm64/include/asm/efi.h           |   5 +
>   arch/arm64/include/asm/irqflags.h      | 143 +++++++++++++++-
>   arch/arm64/include/asm/kvm_host.h      |  40 +++++
>   arch/arm64/include/asm/processor.h     |   4 +
>   arch/arm64/include/asm/ptrace.h        |  14 +-
>   arch/arm64/include/asm/sysreg.h        |   1 +
>   arch/arm64/kernel/alternative.c        |  39 ++++-
>   arch/arm64/kernel/asm-offsets.c        |   1 +
>   arch/arm64/kernel/cpufeature.c         |  67 +++++---
>   arch/arm64/kernel/entry.S              | 132 ++++++++++++--
>   arch/arm64/kernel/head.S               |  38 +++++
>   arch/arm64/kernel/process.c            |   6 +
>   arch/arm64/kernel/smp.c                |  14 ++
>   arch/arm64/mm/proc.S                   |  23 +++
>   drivers/irqchip/irq-gic-common.c       |  10 ++
>   drivers/irqchip/irq-gic-common.h       |   2 +
>   drivers/irqchip/irq-gic-v3-its.c       |   2 +-
>   drivers/irqchip/irq-gic-v3.c           | 304 +++++++++++++++++++++++++++++----
>   include/linux/interrupt.h              |   1 +
>   include/linux/irqchip/arm-gic-common.h |   6 +
>   include/linux/irqchip/arm-gic.h        |   5 -
>   26 files changed, 845 insertions(+), 90 deletions(-)
> 
> --
> 1.9.1
> 

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC 4/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking
  2017-10-17  9:12       ` Christoffer Dall
@ 2017-11-03 11:57         ` Julien Thierry
  2017-11-03 14:30           ` Marc Zyngier
  0 siblings, 1 reply; 19+ messages in thread
From: Julien Thierry @ 2017-11-03 11:57 UTC (permalink / raw)
  To: linux-arm-kernel



On 17/10/17 10:12, Christoffer Dall wrote:
> On Tue, Oct 17, 2017 at 09:36:51AM +0100, Julien Thierry wrote:
>>
>>
>> On 16/10/17 22:18, Christoffer Dall wrote:
>>> On Wed, Oct 11, 2017 at 02:00:59PM +0100, Julien Thierry wrote:
>>>> From: Daniel Thompson <daniel.thompson@linaro.org>
>>>>
>>>> Currently irqflags is implemented using the PSR's I bit. It is possible
>>>> to implement irqflags by using the co-processor interface to the GIC.
>>>> Using the co-processor interface makes it feasible to simulate NMIs
>>>> using GIC interrupt prioritization.
>>>>
>>>> This patch changes the irqflags macros to modify, save and restore
>>>> ICC_PMR_EL1. This has a substantial knock on effect for the rest of
>>>> the kernel. There are four reasons for this:
>>>>
>>>> 1. The state of the PMR becomes part of the interrupt context and must be
>>>>     saved and restored during exceptions. It is saved on the stack as part
>>>>     of the saved context when an interrupt/exception is taken.
>>>>
>>>> 2. The hardware automatically masks the I bit (at boot, during traps, etc).
>>>>     When the I bit is set by hardware we must add code to switch from I
>>>>     bit masking to PMR masking:
>>>>         - For IRQs, this is done after the interrupt has been acknowledged
>>>>         	 avoiding the need to unmask.
>>>>         - For other exceptions, this is done right after saving the context.
>>>>
>>>> 3. Some instructions, such as wfi, require that the PMR not be used
>>>>     for interrupt masking. Before calling these instructions we must
>>>>     switch from PMR masking to I bit masking.
>>>>     This is also the case when KVM runs a guest, if the CPU receives
>>>>     an interrupt from the host, interrupts must not be masked in PMR
>>>>     otherwise the GIC will not signal it to the CPU.
>>>>
>>>> 4. We use the alternatives system to allow a single kernel to boot and
>>>>     be switched to the alternative masking approach at runtime.
>>>>
>>>> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
>>>> [julien.thierry@arm.com: changes reflected in commit,
>>>> 			 message, fixes, renaming]
>>>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>> Cc: Will Deacon <will.deacon@arm.com>
>>>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>>>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>>> Cc: Jason Cooper <jason@lakedaemon.net>
>>> ---
>>>
>>> I just gave a quick look to the KVM part here but didn't get into
>>> whether this rather invasive change is warranted or not (it's definitely
>>> entertaining though).
>>>
>>
>> I appreciate you looking into this, it wasn't very clear to me how to deal
>> with that for KVM.
>>
>>> [...]
>>>
>>>
>>>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>>>> index e923b58..070e8a5 100644
>>>> --- a/arch/arm64/include/asm/kvm_host.h
>>>> +++ b/arch/arm64/include/asm/kvm_host.h
>>>> @@ -38,6 +38,10 @@
>>>>   #include <kvm/arm_arch_timer.h>
>>>>   #include <kvm/arm_pmu.h>
>>>>
>>>> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>>>> +#include <asm/arch_gicv3.h>
>>>> +#endif
>>>> +
>>>>   #define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
>>>>
>>>>   #define KVM_VCPU_MAX_FEATURES 4
>>>> @@ -332,7 +336,30 @@ int kvm_unmap_hva_range(struct kvm *kvm,
>>>>   void kvm_arm_resume_guest(struct kvm *kvm);
>>>>
>>>>   u64 __kvm_call_hyp(void *hypfn, ...);
>>>> +
>>>> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>>>> +/*
>>>> + * If we use PMR to disable interrupts, interrupts happening while the
>>>
>>> I would describe this as 'masking' interrupts, as opposed to 'disabling'
>>> interrupts, in general, and probably we can be more precise by
>>> categorizing interrupts as 'normal' vs. 'priority (NMIs)' or something
>>> like that.
>>>
>>
>> Right, I'll do that.
>>
>>>> + * guest is executing will not be signaled to the host by the GIC  (and
>>>
>>> signaled to the CPU.  (Whether this then causes an exception to EL2
>>> depends on the HCR_EL2.IMO flag and what KVM does with that exception).
>>>
>>
>> True, I'll fix that.
>>
>>>> + * any call to a waiting kvm_kick_many_cpus will likely cause a
>>>> + * deadlock).
>>>
>>> This kick thing is an unnecessarily specific example to bring out here; I
>>> don't think it adds to the general understanding.
>>>
>>
>> Yes, makes sense.
>>
>>>> + * We need to switch back to using PSR.I.
>>>> + */
>>>> +#define kvm_call_hyp(f, ...)						\
>>>> +	({								\
>>>> +		u64 res;						\
>>>> +		unsigned long flags;					\
>>>> +									\
>>>> +		flags = arch_local_irq_save();				\
>>>> +		gic_end_pmr_masking();					\
>>>> +		res = __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__);	\
>>>> +		gic_start_pmr_masking();				\
>>>> +		arch_local_irq_restore(flags);				\
>>>> +		res;							\
>>>
>>> This is in the critical path, so ideally this could instead be done
>>> inside the function, because the trap will set PSTATE.I already, so it
>>> should be possible to reduce this to a set of save/clear/restore
>>> operations after __kvm_call_hyp.
>>
>> Hmmm, you are talking about the function called by __kvm_call_hyp, right?
>>
>> But this means we'll need to add the save/clear/restore to each function
>> that can be called by __kvm_call_hyp, no?
> 
> No, you only need to do this in the case where you run the guest,
> __kvm_vcpu_run, because all the other callers don't need to be able to
> take interrupts as they are under control of the host.
> 
>>
>> Also in the VHE case we don't have the trap setting the PSTATE.I (the code
>> calling kvm_call_hyp disables interrupts before entering the guest, but now
>> the function disabling interrupts is just masking them with PMR).
>>
> 
> Yes, for VHE you'd have to do something more.  One option would be to
> disable interrupts using PSTATE.I in kvm_arch_vcpu_ioctl_run() for VHE,
> another option is to simply mask interrupts in the PSTATE only for VHE
> in __kvm_vcpu_run.
> 

I agree this should work, but I wonder whether it is worth splitting this 
handling across the different places where we have to deal with it.

Do you think there is good performance improvement to be had here?

Marc, what's your opinion on saving/restoring the PMR state in the hyp 
run code, relying on the I bit always being set beforehand (by HVC, or 
explicitly when in VHE)?

>>>
>>>> +	})
>>>> +#else
>>>>   #define kvm_call_hyp(f, ...) __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__)
>>>> +#endif
>>>>
>>>>   void force_vm_exit(const cpumask_t *mask);
>>>>   void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
>>>> @@ -349,6 +376,9 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
>>>>   				       unsigned long hyp_stack_ptr,
>>>>   				       unsigned long vector_ptr)
>>>>   {
>>>> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>>>> +	unsigned long flags;
>>>> +#endif
>>>>   	/*
>>>>   	 * Call initialization code, and switch to the full blown HYP code.
>>>>   	 * If the cpucaps haven't been finalized yet, something has gone very
>>>> @@ -356,7 +386,17 @@ static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
>>>>   	 * cpus_have_const_cap() wrapper.
>>>>   	 */
>>>>   	BUG_ON(!static_branch_likely(&arm64_const_caps_ready));
>>>> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>>>> +	flags = arch_local_irq_save();
>>>> +	gic_end_pmr_masking();
>>>> +#endif
>>>> +
>>>>   	__kvm_call_hyp((void *)pgd_ptr, hyp_stack_ptr, vector_ptr);
>>>> +
>>>> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>>>> +	gic_start_pmr_masking();
>>>> +	arch_local_irq_restore(flags);
>>>> +#endif
>>>
>>> I don't think you need any of this, because I don't think you'd ever
>>> need to handle interrupts while initializing hyp mode.
>>>
>>
>> I was more worried about getting an "NMI" during the EL2 initialisation
>> rather than missing the masked interrupts. Do you know whether this might be
>> an issue here or not?
>>
> 
> For non-VHE it won't be an issue, because making a hyp call will mask
> interrupts using PSTATE.I, and for VHE we don't do any work here, so
> it's not an issue there either.
> 

Noted, I'll get rid of that part.

Thanks,

-- 
Julien Thierry

* [RFC 6/7] arm64: Detect current view of GIC priorities
  2017-10-11 13:01 ` [RFC 6/7] arm64: Detect current view of GIC priorities Julien Thierry
@ 2017-11-03 14:02   ` Marc Zyngier
  0 siblings, 0 replies; 19+ messages in thread
From: Marc Zyngier @ 2017-11-03 14:02 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/10/17 14:01, Julien Thierry wrote:
> The values non-secure EL1 needs to use for priority registers depend on
> the value of SCR_EL3.FIQ.
> 
> Since we don't have access to SCR_EL3, we fake an interrupt and compare the
> GIC priority with the one present in the [re]distributor.
> 
> Also, add firmware requirements related to SCR_EL3.
> 
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Jason Cooper <jason@lakedaemon.net>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  Documentation/arm64/booting.txt     |  5 +++
>  arch/arm64/include/asm/arch_gicv3.h |  5 +++
>  arch/arm64/include/asm/irqflags.h   |  6 +++
>  arch/arm64/include/asm/sysreg.h     |  1 +
>  drivers/irqchip/irq-gic-v3.c        | 83 +++++++++++++++++++++++++++++++++++++
>  5 files changed, 100 insertions(+)
> 
> diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
> index 8d0df62..e387938 100644
> --- a/Documentation/arm64/booting.txt
> +++ b/Documentation/arm64/booting.txt
> @@ -188,6 +188,11 @@ Before jumping into the kernel, the following conditions must be met:
>    the kernel image will be entered must be initialised by software at a
>    higher exception level to prevent execution in an UNKNOWN state.
> 
> +  - SCR_EL3.FIQ must have the same value across all CPUs the kernel is
> +    executing on.
> +  - The value of SCR_EL3.FIQ must be the same as the one present at boot
> +    time whenever the kernel is executing.
> +
>    For systems with a GICv3 interrupt controller to be used in v3 mode:
>    - If EL3 is present:
>      ICC_SRE_EL3.Enable (bit 3) must be initialiased to 0b1.
> diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h
> index b6b430c..90a3dc3 100644
> --- a/arch/arm64/include/asm/arch_gicv3.h
> +++ b/arch/arm64/include/asm/arch_gicv3.h
> @@ -114,6 +114,11 @@ static inline void gic_write_bpr1(u32 val)
>  	write_sysreg_s(val, SYS_ICC_BPR1_EL1);
>  }
> 
> +static inline u32 gic_read_rpr(void)
> +{
> +	return read_sysreg_s(SYS_ICC_RPR_EL1);
> +}
> +
>  #define gic_read_typer(c)		readq_relaxed(c)
>  #define gic_write_irouter(v, c)		writeq_relaxed(v, c)
>  #define gic_read_lpir(c)		readq_relaxed(c)
> diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
> index 1c69ebb..4718ca3f 100644
> --- a/arch/arm64/include/asm/irqflags.h
> +++ b/arch/arm64/include/asm/irqflags.h
> @@ -205,6 +205,12 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
>  		!(ARCH_FLAGS_GET_PMR(flags) & ICC_PMR_EL1_EN_BIT);
>  }
> 
> +/* Mask IRQs at CPU level instead of GIC level */
> +static inline void arch_irqs_daif_disable(void)

nit: I find the naming pretty horrible. No, I don't have a better
suggestion.

> +{
> +	asm volatile ("msr daifset, #2" : : : "memory");
> +}
> +
>  void maybe_switch_to_sysreg_gic_cpuif(void);
> 
>  #endif /* CONFIG_IRQFLAGS_GIC_MASKING */
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index f707fed..e71bc81 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -205,6 +205,7 @@
>  #define SYS_ICC_SRE_EL1			sys_reg(3, 0, 12, 12, 5)
>  #define SYS_ICC_IGRPEN0_EL1		sys_reg(3, 0, 12, 12, 6)
>  #define SYS_ICC_IGRPEN1_EL1		sys_reg(3, 0, 12, 12, 7)
> +#define SYS_ICC_RPR_EL1			sys_reg(3, 0, 12, 11, 3)

We should try and keep the ordering of sysreg consistent, following the
(Op0, Op1, CRn, CRm, Op2) encoding. And actually, this sysreg has
already been added (see 43515894c06f8).

> 
>  #define SYS_CONTEXTIDR_EL1		sys_reg(3, 0, 13, 0, 1)
>  #define SYS_TPIDR_EL1			sys_reg(3, 0, 13, 0, 4)
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 1cdf495..e0b235f 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -62,6 +62,10 @@ struct gic_chip_data {
>  static struct gic_chip_data gic_data __read_mostly;
>  static struct static_key supports_deactivate = STATIC_KEY_INIT_TRUE;
> 
> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
> +DEFINE_STATIC_KEY_FALSE(have_non_secure_prio_view);
> +#endif
> +
>  static struct gic_kvm_info gic_v3_kvm_info;
> 
>  #define gic_data_rdist()		(this_cpu_ptr(gic_data.rdists.rdist))
> @@ -969,6 +973,81 @@ static int partition_domain_translate(struct irq_domain *d,
>  	.select = gic_irq_domain_select,
>  };
> 
> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
> +/*
> + * The behaviours of RPR and PMR registers differ depending on the value of
> + * SCR_EL3.FIQ, while the behaviour of priority registers of the distributor
> + * and redistributors is always the same.
> + *
> + * If SCR_EL3.FIQ == 1, the values used for RPR and PMR are the same as the ones
> + * programmed in the distributor and redistributors registers.
> + *
> + * Otherwise, the value presented by RPR as well as the value which will be
> + * compared against PMR is: (GIC_(R)DIST_PRI[irq] >> 1) | 0x80;
> + *
> + * see GICv3/GICv4 Architecture Specification (IHI0069D):
> + * - section 4.8.1 Non-secure accesses to register fields for Secure interrupt
> + *   priorities.
> + * - Figure 4-7 Secure read of the priority field for a Non-secure Group 1
> + *   interrupt.
> + */
> +static void __init gic_detect_prio_view(void)
> +{
> +	/*
> +	 * Randomly picked SGI, must be <= 8 as other SGIs might be
> +	 * used by the firmware.
> +	 */
> +	const u32 fake_irqnr = 7;
> +	const u32 fake_irqmask = BIT(fake_irqnr);
> +	void __iomem * const rdist_base = gic_data_rdist_sgi_base();
> +	unsigned long irq_flags;
> +	u32 acked_irqnr;
> +	bool was_enabled;
> +
> +	irq_flags = arch_local_save_flags();
> +
> +	arch_irqs_daif_disable();
> +
> +	was_enabled = !!(readl_relaxed(rdist_base + GICD_ISENABLER) &
> +			 fake_irqmask);
> +
> +	if (!was_enabled)
> +		writel_relaxed(fake_irqmask, rdist_base + GICD_ISENABLER);
> +
> +	/* Need to unmask to acknowledge the IRQ */
> +	gic_write_pmr(ICC_PMR_EL1_UNMASKED);
> +	dsb(sy);
> +
> +	/* Fake a pending SGI */
> +	writel_relaxed(fake_irqmask, rdist_base + GICD_ISPENDR);
> +	dsb(sy);
> +
> +	acked_irqnr = gic_read_iar();

Careful, there is no guarantee that you'll immediately get an interrupt
here. There is a propagation delay, and the only thing the spec says is
that you'll get something in finite time (give or take a few days).

> +
> +	if (acked_irqnr == fake_irqnr) {
> +		if (gic_read_rpr() == gic_get_irq_prio(acked_irqnr, rdist_base))
> +			static_branch_enable(&have_non_secure_prio_view);
> +	} else {
> +		pr_warn("Unexpected IRQ for priority detection: %u\n",
> +			acked_irqnr);
> +	}
> +
> +	if (acked_irqnr < 1020) {
> +		gic_write_eoir(acked_irqnr);
> +		if (static_key_true(&supports_deactivate))
> +			gic_write_dir(acked_irqnr);

I think we could move this construct to a helper (it is otherwise used
in the driver).

> +	}
> +
> +	/* Restore enabled state */
> +	if (!was_enabled) {
> +		writel_relaxed(fake_irqmask, rdist_base + GICD_ICENABLER);
> +		gic_redist_wait_for_rwp();
> +	}
> +
> +	arch_local_irq_restore(irq_flags);
> +}
> +#endif
> +
>  static int __init gic_init_bases(void __iomem *dist_base,
>  				 struct redist_region *rdist_regs,
>  				 u32 nr_redist_regions,
> @@ -1025,6 +1104,10 @@ static int __init gic_init_bases(void __iomem *dist_base,
>  	gic_cpu_init();
>  	gic_cpu_pm_init();
> 
> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
> +	gic_detect_prio_view();
> +#endif
> +
>  	return 0;
> 
>  out_free:
> --
> 1.9.1
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

* [RFC 4/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking
  2017-11-03 11:57         ` Julien Thierry
@ 2017-11-03 14:30           ` Marc Zyngier
  0 siblings, 0 replies; 19+ messages in thread
From: Marc Zyngier @ 2017-11-03 14:30 UTC (permalink / raw)
  To: linux-arm-kernel

On 03/11/17 11:57, Julien Thierry wrote:
> 
> 
> On 17/10/17 10:12, Christoffer Dall wrote:
>> On Tue, Oct 17, 2017 at 09:36:51AM +0100, Julien Thierry wrote:
>>>
>>>
>>> On 16/10/17 22:18, Christoffer Dall wrote:
>>>> On Wed, Oct 11, 2017 at 02:00:59PM +0100, Julien Thierry wrote:
>>>>> From: Daniel Thompson <daniel.thompson@linaro.org>
>>>>>
>>>>> Currently irqflags is implemented using the PSR's I bit. It is possible
>>>>> to implement irqflags by using the co-processor interface to the GIC.
>>>>> Using the co-processor interface makes it feasible to simulate NMIs
>>>>> using GIC interrupt prioritization.
>>>>>
>>>>> This patch changes the irqflags macros to modify, save and restore
>>>>> ICC_PMR_EL1. This has a substantial knock on effect for the rest of
>>>>> the kernel. There are four reasons for this:
>>>>>
>>>>> 1. The state of the PMR becomes part of the interrupt context and must be
>>>>>     saved and restored during exceptions. It is saved on the stack as part
>>>>>     of the saved context when an interrupt/exception is taken.
>>>>>
>>>>> 2. The hardware automatically masks the I bit (at boot, during traps, etc).
>>>>>     When the I bit is set by hardware we must add code to switch from I
>>>>>     bit masking to PMR masking:
>>>>>         - For IRQs, this is done after the interrupt has been acknowledged
>>>>>         	 avoiding the need to unmask.
>>>>>         - For other exceptions, this is done right after saving the context.
>>>>>
>>>>> 3. Some instructions, such as wfi, require that the PMR not be used
>>>>>     for interrupt masking. Before calling these instructions we must
>>>>>     switch from PMR masking to I bit masking.
>>>>>     This is also the case when KVM runs a guest, if the CPU receives
>>>>>     an interrupt from the host, interrupts must not be masked in PMR
>>>>>     otherwise the GIC will not signal it to the CPU.
>>>>>
>>>>> 4. We use the alternatives system to allow a single kernel to boot and
>>>>>     be switched to the alternative masking approach at runtime.
>>>>>
>>>>> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
>>>>> [julien.thierry@arm.com: changes reflected in commit,
>>>>> 			 message, fixes, renaming]
>>>>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>>> Cc: Will Deacon <will.deacon@arm.com>
>>>>> Cc: Christoffer Dall <christoffer.dall@linaro.org>
>>>>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>>>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>>>> Cc: Jason Cooper <jason@lakedaemon.net>
>>>> ---
>>>>
>>>> I just gave a quick look to the KVM part here but didn't get into
>>>> whether this rather invasive change is warranted or not (it's definitely
>>>> entertaining though).
>>>>
>>>
>>> I appreciate you looking into this, it wasn't very clear to me how to deal
>>> with that for KVM.
>>>
>>>> [...]
>>>>
>>>>
>>>>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>>>>> index e923b58..070e8a5 100644
>>>>> --- a/arch/arm64/include/asm/kvm_host.h
>>>>> +++ b/arch/arm64/include/asm/kvm_host.h
>>>>> @@ -38,6 +38,10 @@
>>>>>   #include <kvm/arm_arch_timer.h>
>>>>>   #include <kvm/arm_pmu.h>
>>>>>
>>>>> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>>>>> +#include <asm/arch_gicv3.h>
>>>>> +#endif
>>>>> +
>>>>>   #define KVM_MAX_VCPUS VGIC_V3_MAX_CPUS
>>>>>
>>>>>   #define KVM_VCPU_MAX_FEATURES 4
>>>>> @@ -332,7 +336,30 @@ int kvm_unmap_hva_range(struct kvm *kvm,
>>>>>   void kvm_arm_resume_guest(struct kvm *kvm);
>>>>>
>>>>>   u64 __kvm_call_hyp(void *hypfn, ...);
>>>>> +
>>>>> +#ifdef CONFIG_USE_ICC_SYSREGS_FOR_IRQFLAGS
>>>>> +/*
>>>>> + * If we use PMR to disable interrupts, interrupts happening while the
>>>>
>>>> I would describe this as 'masking' interrupts, as opposed to 'disabling'
>>>> interrupts, in general, and probably we can be more precise by
>>>> categorizing interrupts as 'normal' vs. 'priority (NMIs)' or something
>>>> like that.
>>>>
>>>
>>> Right, I'll do that.
>>>
>>>>> + * guest is executing will not be signaled to the host by the GIC  (and
>>>>
>>>> signaled to the CPU.  (Whether this then causes an exception to EL2
>>>> depends on the HCR_EL2.IMO flag and what KVM does with that exception).
>>>>
>>>
>>> True, I'll fix that.
>>>
>>>>> + * any call to a waiting kvm_kick_many_cpus will likely cause a
>>>>> + * deadlock).
>>>>
>>>> This kick thing is an unnecessarily specific example to bring out here; I
>>>> don't think it adds to the general understanding.
>>>>
>>>
>>> Yes, makes sense.
>>>
>>>>> + * We need to switch back to using PSR.I.
>>>>> + */
>>>>> +#define kvm_call_hyp(f, ...)						\
>>>>> +	({								\
>>>>> +		u64 res;						\
>>>>> +		unsigned long flags;					\
>>>>> +									\
>>>>> +		flags = arch_local_irq_save();				\
>>>>> +		gic_end_pmr_masking();					\
>>>>> +		res = __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__);	\
>>>>> +		gic_start_pmr_masking();				\
>>>>> +		arch_local_irq_restore(flags);				\
>>>>> +		res;							\
>>>>
>>>> This is in the critical path, so ideally this could instead be done
>>>> inside the function, because the trap will set PSTATE.I already, so it
>>>> should be possible to reduce this to a set of save/clear/restore
>>>> operations after __kvm_call_hyp.
>>>
>>> Hmmm, you are talking about the function called by __kvm_call_hyp, right?
>>>
>>> But this means we'll need to add the save/clear/restore to each function
>>> that can be called by __kvm_call_hyp, no?
>>
>> No, you only need to do this in the case where you run the guest,
>> __kvm_vcpu_run, because all the other callers don't need to be able to
>> take interrupts as they are under control of the host.
>>
>>>
>>> Also in the VHE case we don't have the trap setting the PSTATE.I (the code
>>> calling kvm_call_hyp disables interrupts before entering the guest, but now
>>> the function disabling interrupts is just masking them with PMR).
>>>
>>
>> Yes, for VHE you'd have to do something more.  One option would be to
>> disable interrupts using PSTATE.I in kvm_arch_vcpu_ioctl_run() for VHE,
>> another option is to simply mask interrupts in the PSTATE only for VHE
>> in __kvm_vcpu_run.
>>
> 
> I agree this should work, but I wonder whether it is worth splitting the 
> different places we have to deal with that.
> 
> Do you think there is good performance improvement to be had here?
> 
> Marc, what's your opinion on saving/restoring the PMR state in the hyp 
> run code, relying on the I bit being always set before hand by hvc or 
> explicitly when in VHE?

One thing I'm not overly keen on is multiplying the locations where we
reconfigure the interrupt delivery, because it is fragile.

If I have followed what should happen, we have the following situation:

- pre-VHE: after HVC, unmask the PMR so that we can get interrupts when
the guest is running. Restore it to its normal value before returning to
the host.

- VHE: Disable interrupts (mimicking the HVC), unmask PMR. Do the
opposite on return.

At the moment, you do a bit too much, but that's because we only have a
single entry point (__kvm_call_hyp). With Christoffer's VHE rework, we
get a different entry point per architecture, probably making it simpler
to hack something there.

You can have a look at those patches and see if that would make things
easier...

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

* [RFC 3/7] arm64: alternative: Apply alternatives early in boot process
  2017-10-11 13:00 ` [RFC 3/7] arm64: alternative: Apply alternatives early in boot process Julien Thierry
@ 2017-11-13 17:34   ` Suzuki K Poulose
  2017-11-13 17:39     ` Suzuki K Poulose
  0 siblings, 1 reply; 19+ messages in thread
From: Suzuki K Poulose @ 2017-11-13 17:34 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/10/17 14:00, Julien Thierry wrote:
> From: Daniel Thompson <daniel.thompson@linaro.org>
> 
> Currently alternatives are applied very late in the boot process (and
> a long time after we enable scheduling). Some alternative sequences,
> such as those that alter the way CPU context is stored, must be applied
> much earlier in the boot sequence.
> 
> Introduce apply_alternatives_early() to allow some alternatives to be
> applied immediately after we detect the CPU features of the boot CPU.
> 
> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> ---
>   arch/arm64/include/asm/alternative.h |  1 +
>   arch/arm64/kernel/alternative.c      | 39 +++++++++++++++++++++++++++++++++---
>   arch/arm64/kernel/smp.c              |  6 ++++++
>   3 files changed, 43 insertions(+), 3 deletions(-)
> 


> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 9f7195a..84c6983 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -455,6 +455,12 @@ void __init smp_prepare_boot_cpu(void)
>   	 * cpuinfo_store_boot_cpu() above.
>   	 */
>   	update_cpu_errata_workarounds();
> +	/*
> +	 * We now know enough about the boot CPU to apply the
> +	 * alternatives that cannot wait until interrupt handling
> +	 * and/or scheduling is enabled.
> +	 */
> +	apply_alternatives_early();
>   }

What happens if a secondary CPU, activated at boot, doesn't have the "early"
capabilities?

Suzuki

* [RFC 3/7] arm64: alternative: Apply alternatives early in boot process
  2017-11-13 17:34   ` Suzuki K Poulose
@ 2017-11-13 17:39     ` Suzuki K Poulose
  0 siblings, 0 replies; 19+ messages in thread
From: Suzuki K Poulose @ 2017-11-13 17:39 UTC (permalink / raw)
  To: linux-arm-kernel

On 13/11/17 17:34, Suzuki K Poulose wrote:
> On 11/10/17 14:00, Julien Thierry wrote:
>> From: Daniel Thompson <daniel.thompson@linaro.org>
>>
>> Currently alternatives are applied very late in the boot process (and
>> a long time after we enable scheduling). Some alternative sequences,
>> such as those that alter the way CPU context is stored, must be applied
>> much earlier in the boot sequence.
>>
>> Introduce apply_alternatives_early() to allow some alternatives to be
>> applied immediately after we detect the CPU features of the boot CPU.
>>
>> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will.deacon@arm.com>
>> ---
>>   arch/arm64/include/asm/alternative.h |  1 +
>>   arch/arm64/kernel/alternative.c      | 39 +++++++++++++++++++++++++++++++++---
>>   arch/arm64/kernel/smp.c              |  6 ++++++
>>   3 files changed, 43 insertions(+), 3 deletions(-)
>>
> 
> 
>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>> index 9f7195a..84c6983 100644
>> --- a/arch/arm64/kernel/smp.c
>> +++ b/arch/arm64/kernel/smp.c
>> @@ -455,6 +455,12 @@ void __init smp_prepare_boot_cpu(void)
>>   	 * cpuinfo_store_boot_cpu() above.
>>   	 */
>>   	update_cpu_errata_workarounds();
>> +	/*
>> +	 * We now know enough about the boot CPU to apply the
>> +	 * alternatives that cannot wait until interrupt handling
>> +	 * and/or scheduling is enabled.
>> +	 */
>> +	apply_alternatives_early();
>>   }
> 
> What happens if a secondary CPU, activated at boot, doesn't have the "early"
> capabilities?

Err, ignore this comment. I was somehow missing patch 2 of the series;
after looking it up in the archives, it looks fine.

Cheers
Suzuki

end of thread, other threads:[~2017-11-13 17:39 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-11 13:00 [RFC 0/7] arm64: provide pseudo NMI with GICv3 Julien Thierry
2017-10-11 13:00 ` [RFC 1/7] arm64: irqflags: Reorder the fiq & async macros Julien Thierry
2017-10-11 13:00 ` [RFC 2/7] arm64: cpufeature: Allow early detect of specific features Julien Thierry
2017-10-11 13:00 ` [RFC 3/7] arm64: alternative: Apply alternatives early in boot process Julien Thierry
2017-11-13 17:34   ` Suzuki K Poulose
2017-11-13 17:39     ` Suzuki K Poulose
2017-10-11 13:00 ` [RFC 4/7] arm64: irqflags: Use ICC sysregs to implement IRQ masking Julien Thierry
2017-10-16 21:18   ` Christoffer Dall
2017-10-17  8:36     ` Julien Thierry
2017-10-17  9:12       ` Christoffer Dall
2017-11-03 11:57         ` Julien Thierry
2017-11-03 14:30           ` Marc Zyngier
2017-10-11 13:01 ` [RFC 5/7] irqchip/gic: Add functions to access irq priorities Julien Thierry
2017-10-11 13:01 ` [RFC 6/7] arm64: Detect current view of GIC priorities Julien Thierry
2017-11-03 14:02   ` Marc Zyngier
2017-10-11 13:01 ` [RFC 7/7] arm64: Add support for pseudo-NMIs Julien Thierry
2017-10-11 15:55 ` [RFC 0/7] arm64: provide pseudo NMI with GICv3 Daniel Thompson
2017-10-11 16:10   ` Julien Thierry
2017-11-03 11:50 ` Julien Thierry

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).