* [PATCH v4 00/19] KVM/arm64: Randomise EL2 mappings
@ 2018-01-04 18:43 ` Marc Zyngier
  0 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm; +Cc: Catalin Marinas, Will Deacon

Whilst KVM benefits from the kernel randomisation via KASLR, there is
no additional randomisation when the kernel is running at EL1, as we
directly use a fixed offset from the linear mapping. This is not
necessarily a problem, but we could do a bit better by independently
randomising the HYP placement.

This series proposes to randomise the offset by inserting a few random
bits between the MSB of the RAM linear mapping and the top of the HYP
VA (VA_BITS - 2). That's not a lot of random bits (on my Mustang, I
get 13 bits), but that's better than nothing.
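
As a rough illustration (this snippet is not from the series, and the
names are made up), the amount of randomness is simply the gap between
those two bit positions:

	/* Illustration only: bits available for randomisation */
	static int hyp_random_bits(u64 linear_map_end)
	{
		int ram_msb = fls64(linear_map_end - 1);

		return (VA_BITS - 2) - ram_msb;
	}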

In order to achieve this, we need to be able to patch dynamic values
in the kernel text. This results in a bunch of changes to the
alternative framework, the insn library, and a few more hacks in KVM
itself (we get a new way to map the GIC at EL2). This series used to
depend on a number of cleanups in asm-offsets, which is not the case
anymore. I'm still including them as I think they are still pretty
useful.

This has been tested on the FVP model, Seattle (both 39 and 48bit VA),
Mustang and Thunder-X. I've also done a sanity check on 32bit (which
is only impacted by the HYP IO VA stuff).

Thanks,

	M.

* From v3:
  - Reworked the alternative code to leave the actual patching to
    the callback function. This should allow for more flexibility
    should someone or something require it
  - Now detects underflows in the IOVA allocator
  - Moved the VA patching code to va_layout.c

* From v2:
  - Fixed a crapload of bugs in the immediate generation patch
    I now have a test harness for it, making sure it generates the
    same thing as GAS...
  - Fixed a bug in the asm-offsets.h exclusion patch
  - Reworked the alternative_cb code to be nicer and avoid generating
    pointless nops

* From v1:
  - Now works correctly with KASLR
  - Dropped the callback field from alt_instr, and reuse one of the
    existing fields to store an offset to the callback
  - Fix HYP teardown path (depends on fixes previously posted)
  - Dropped the VA offset macros

Marc Zyngier (19):
  arm64: asm-offsets: Avoid clashing DMA definitions
  arm64: asm-offsets: Remove unused definitions
  arm64: asm-offsets: Remove potential circular dependency
  arm64: alternatives: Enforce alignment of struct alt_instr
  arm64: alternatives: Add dynamic patching feature
  arm64: insn: Add N immediate encoding
  arm64: insn: Add encoder for bitwise operations using literals
  arm64: KVM: Dynamically patch the kernel/hyp VA mask
  arm64: cpufeatures: Drop the ARM64_HYP_OFFSET_LOW feature flag
  KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_state
  KVM: arm/arm64: Demote HYP VA range display to being a debug feature
  KVM: arm/arm64: Move ioremap calls to create_hyp_io_mappings
  KVM: arm/arm64: Keep GICv2 HYP VAs in kvm_vgic_global_state
  KVM: arm/arm64: Move HYP IO VAs to the "idmap" range
  arm64: insn: Add encoder for the EXTR instruction
  arm64: insn: Allow ADD/SUB (immediate) with LSL #12
  arm64: KVM: Dynamically compute the HYP VA mask
  arm64: KVM: Introduce EL2 VA randomisation
  arm64: Update the KVM memory map documentation

 Documentation/arm64/memory.txt             |   8 +-
 arch/arm/include/asm/kvm_hyp.h             |   6 +
 arch/arm/include/asm/kvm_mmu.h             |   4 +-
 arch/arm64/include/asm/alternative.h       |  49 ++++++--
 arch/arm64/include/asm/alternative_types.h |  17 +++
 arch/arm64/include/asm/asm-offsets.h       |   2 +
 arch/arm64/include/asm/cpucaps.h           |   2 +-
 arch/arm64/include/asm/insn.h              |  16 +++
 arch/arm64/include/asm/kvm_hyp.h           |   9 ++
 arch/arm64/include/asm/kvm_mmu.h           |  57 ++++-----
 arch/arm64/kernel/alternative.c            |  43 +++++--
 arch/arm64/kernel/asm-offsets.c            |  17 +--
 arch/arm64/kernel/cpufeature.c             |  19 ---
 arch/arm64/kernel/insn.c                   | 190 ++++++++++++++++++++++++++++-
 arch/arm64/kvm/Makefile                    |   2 +-
 arch/arm64/kvm/va_layout.c                 | 144 ++++++++++++++++++++++
 arch/arm64/mm/cache.S                      |   4 +-
 include/kvm/arm_vgic.h                     |  12 +-
 virt/kvm/arm/hyp/vgic-v2-sr.c              |  12 +-
 virt/kvm/arm/mmu.c                         |  95 +++++++++++----
 virt/kvm/arm/vgic/vgic-init.c              |   6 -
 virt/kvm/arm/vgic/vgic-v2.c                |  40 ++----
 22 files changed, 589 insertions(+), 165 deletions(-)
 create mode 100644 arch/arm64/include/asm/alternative_types.h
 create mode 100644 arch/arm64/kvm/va_layout.c

-- 
2.14.2

^ permalink raw reply	[flat|nested] 104+ messages in thread

* [PATCH v4 01/19] arm64: asm-offsets: Avoid clashing DMA definitions
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: Christoffer Dall, Mark Rutland, Catalin Marinas, Will Deacon,
	James Morse, Steve Capper, Peter Maydell

asm-offsets.h contains a few DMA-related definitions that have
the exact same name as the enum members they are derived from.

While this is not a problem so far, it will become an issue if
both asm-offsets.h and include/linux/dma-direction.h are pulled
in by the same file.

Let's sidestep the issue by renaming the asm-offsets.h constants.
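
For illustration only (this snippet is not part of the patch), the
breakage we're trying to avoid looks like this: the asm-offsets.h
macro rewrites the enum declaration itself, and the build falls over.

	#define DMA_BIDIRECTIONAL	0	/* as emitted by asm-offsets.h */

	/* from include/linux/dma-direction.h */
	enum dma_data_direction {
		DMA_BIDIRECTIONAL = 0,	/* preprocessed into "0 = 0": error */
		DMA_TO_DEVICE = 1,
		DMA_FROM_DEVICE = 2,
		DMA_NONE = 3,
	};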

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kernel/asm-offsets.c | 6 +++---
 arch/arm64/mm/cache.S           | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 71bf088f1e4b..7e8be0c22ce0 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -87,9 +87,9 @@ int main(void)
   BLANK();
   DEFINE(PAGE_SZ,	       	PAGE_SIZE);
   BLANK();
-  DEFINE(DMA_BIDIRECTIONAL,	DMA_BIDIRECTIONAL);
-  DEFINE(DMA_TO_DEVICE,		DMA_TO_DEVICE);
-  DEFINE(DMA_FROM_DEVICE,	DMA_FROM_DEVICE);
+  DEFINE(__DMA_BIDIRECTIONAL,	DMA_BIDIRECTIONAL);
+  DEFINE(__DMA_TO_DEVICE,	DMA_TO_DEVICE);
+  DEFINE(__DMA_FROM_DEVICE,	DMA_FROM_DEVICE);
   BLANK();
   DEFINE(CLOCK_REALTIME,	CLOCK_REALTIME);
   DEFINE(CLOCK_MONOTONIC,	CLOCK_MONOTONIC);
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 7f1dbe962cf5..c1336be085eb 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -205,7 +205,7 @@ ENDPIPROC(__dma_flush_area)
  *	- dir	- DMA direction
  */
 ENTRY(__dma_map_area)
-	cmp	w2, #DMA_FROM_DEVICE
+	cmp	w2, #__DMA_FROM_DEVICE
 	b.eq	__dma_inv_area
 	b	__dma_clean_area
 ENDPIPROC(__dma_map_area)
@@ -217,7 +217,7 @@ ENDPIPROC(__dma_map_area)
  *	- dir	- DMA direction
  */
 ENTRY(__dma_unmap_area)
-	cmp	w2, #DMA_TO_DEVICE
+	cmp	w2, #__DMA_TO_DEVICE
 	b.ne	__dma_inv_area
 	ret
 ENDPIPROC(__dma_unmap_area)
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH v4 02/19] arm64: asm-offsets: Remove unused definitions
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: Christoffer Dall, Mark Rutland, Catalin Marinas, Will Deacon,
	James Morse, Steve Capper, Peter Maydell

asm-offsets.h contains a number of definitions that are not used
at all, and in some cases conflict with other definitions (such as
NSEC_PER_SEC).

Spring clean-up time.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kernel/asm-offsets.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 7e8be0c22ce0..742887330101 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -83,10 +83,6 @@ int main(void)
   DEFINE(VMA_VM_MM,		offsetof(struct vm_area_struct, vm_mm));
   DEFINE(VMA_VM_FLAGS,		offsetof(struct vm_area_struct, vm_flags));
   BLANK();
-  DEFINE(VM_EXEC,	       	VM_EXEC);
-  BLANK();
-  DEFINE(PAGE_SZ,	       	PAGE_SIZE);
-  BLANK();
   DEFINE(__DMA_BIDIRECTIONAL,	DMA_BIDIRECTIONAL);
   DEFINE(__DMA_TO_DEVICE,	DMA_TO_DEVICE);
   DEFINE(__DMA_FROM_DEVICE,	DMA_FROM_DEVICE);
@@ -98,7 +94,6 @@ int main(void)
   DEFINE(CLOCK_REALTIME_COARSE,	CLOCK_REALTIME_COARSE);
   DEFINE(CLOCK_MONOTONIC_COARSE,CLOCK_MONOTONIC_COARSE);
   DEFINE(CLOCK_COARSE_RES,	LOW_RES_NSEC);
-  DEFINE(NSEC_PER_SEC,		NSEC_PER_SEC);
   BLANK();
   DEFINE(VDSO_CS_CYCLE_LAST,	offsetof(struct vdso_data, cs_cycle_last));
   DEFINE(VDSO_RAW_TIME_SEC,	offsetof(struct vdso_data, raw_time_sec));
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH v4 03/19] arm64: asm-offsets: Remove potential circular dependency
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm; +Cc: Catalin Marinas, Will Deacon

So far, we've been lucky enough that none of the include files
that asm-offsets.c requires include asm-offsets.h. This is
about to change, and would introduce a nasty circular dependency...

Let's now guard the inclusion of asm-offsets.h so that it never
gets pulled in from asm-offsets.c. The same issue exists between
bounds.c and include/generated/bounds.h, and is worked around
by using the existing guard symbol.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/asm-offsets.h | 2 ++
 arch/arm64/kernel/asm-offsets.c      | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/asm-offsets.h b/arch/arm64/include/asm/asm-offsets.h
index d370ee36a182..7d6531a81eb3 100644
--- a/arch/arm64/include/asm/asm-offsets.h
+++ b/arch/arm64/include/asm/asm-offsets.h
@@ -1 +1,3 @@
+#if !defined(__GENERATING_ASM_OFFSETS_H) && !defined(__GENERATING_BOUNDS_H)
 #include <generated/asm-offsets.h>
+#endif
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 742887330101..5ab8841af382 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -18,6 +18,8 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#define __GENERATING_ASM_OFFSETS_H	1
+
 #include <linux/sched.h>
 #include <linux/mm.h>
 #include <linux/dma-mapping.h>
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH v4 04/19] arm64: alternatives: Enforce alignment of struct alt_instr
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: Christoffer Dall, Mark Rutland, Catalin Marinas, Will Deacon,
	James Morse, Steve Capper, Peter Maydell

We're playing a dangerous game with struct alt_instr, as we produce
it using assembly tricks, but parse it using the C structure.
We just assume that the respective alignments of the two will
be the same.

But as we add more fields to this structure, the alignment requirements
of the structure may change, and lead to all kinds of funky bugs.

To solve this, let's move the definition of struct alt_instr to its
own file, and use this to generate the alignment constraint from
asm-offsets.c. The various macros are then patched to take the
alignment into account.
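
As a standalone illustration (not part of the patch), the expression
added to asm-offsets.c below simply computes log2 of the structure
alignment, which is what the .align directive expects. The structure
here is a stand-in with the same layout:

	#include <stdio.h>

	struct alt_instr_example {
		int   orig_offset;
		int   alt_offset;
		short cpufeature;
		char  orig_len;
		char  alt_len;
	};

	int main(void)
	{
		unsigned long align = __alignof__(struct alt_instr_example);
		/* 63 - clzl(align) == log2(align) for a power-of-two
		   alignment, assuming a 64bit long as on arm64 */
		int shift = 63 - __builtin_clzl(align);

		printf("ALTINSTR_ALIGN = %d\n", shift);	/* prints 2 */
		return 0;
	}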

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/alternative.h       | 13 +++++--------
 arch/arm64/include/asm/alternative_types.h | 13 +++++++++++++
 arch/arm64/kernel/asm-offsets.c            |  4 ++++
 3 files changed, 22 insertions(+), 8 deletions(-)
 create mode 100644 arch/arm64/include/asm/alternative_types.h

diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
index 4a85c6952a22..395befde7595 100644
--- a/arch/arm64/include/asm/alternative.h
+++ b/arch/arm64/include/asm/alternative.h
@@ -2,28 +2,24 @@
 #ifndef __ASM_ALTERNATIVE_H
 #define __ASM_ALTERNATIVE_H
 
+#include <asm/asm-offsets.h>
 #include <asm/cpucaps.h>
 #include <asm/insn.h>
 
 #ifndef __ASSEMBLY__
 
+#include <asm/alternative_types.h>
+
 #include <linux/init.h>
 #include <linux/types.h>
 #include <linux/stddef.h>
 #include <linux/stringify.h>
 
-struct alt_instr {
-	s32 orig_offset;	/* offset to original instruction */
-	s32 alt_offset;		/* offset to replacement instruction */
-	u16 cpufeature;		/* cpufeature bit set for replacement */
-	u8  orig_len;		/* size of original instruction(s) */
-	u8  alt_len;		/* size of new instruction(s), <= orig_len */
-};
-
 void __init apply_alternatives_all(void);
 void apply_alternatives(void *start, size_t length);
 
 #define ALTINSTR_ENTRY(feature)						      \
+	" .align " __stringify(ALTINSTR_ALIGN) "\n"			      \
 	" .word 661b - .\n"				/* label           */ \
 	" .word 663f - .\n"				/* new instruction */ \
 	" .hword " __stringify(feature) "\n"		/* feature bit     */ \
@@ -69,6 +65,7 @@ void apply_alternatives(void *start, size_t length);
 #include <asm/assembler.h>
 
 .macro altinstruction_entry orig_offset alt_offset feature orig_len alt_len
+	.align ALTINSTR_ALIGN
 	.word \orig_offset - .
 	.word \alt_offset - .
 	.hword \feature
diff --git a/arch/arm64/include/asm/alternative_types.h b/arch/arm64/include/asm/alternative_types.h
new file mode 100644
index 000000000000..26cf76167f2d
--- /dev/null
+++ b/arch/arm64/include/asm/alternative_types.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_ALTERNATIVE_TYPES_H
+#define __ASM_ALTERNATIVE_TYPES_H
+
+struct alt_instr {
+	s32 orig_offset;	/* offset to original instruction */
+	s32 alt_offset;		/* offset to replacement instruction */
+	u16 cpufeature;		/* cpufeature bit set for replacement */
+	u8  orig_len;		/* size of original instruction(s) */
+	u8  alt_len;		/* size of new instruction(s), <= orig_len */
+};
+
+#endif
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 5ab8841af382..f00666341ae2 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -25,6 +25,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/kvm_host.h>
 #include <linux/suspend.h>
+#include <asm/alternative_types.h>
 #include <asm/cpufeature.h>
 #include <asm/thread_info.h>
 #include <asm/memory.h>
@@ -151,5 +152,8 @@ int main(void)
   DEFINE(HIBERN_PBE_ADDR,	offsetof(struct pbe, address));
   DEFINE(HIBERN_PBE_NEXT,	offsetof(struct pbe, next));
   DEFINE(ARM64_FTR_SYSVAL,	offsetof(struct arm64_ftr_reg, sys_val));
+  BLANK();
+  DEFINE(ALTINSTR_ALIGN,	(63 - __builtin_clzl(__alignof__(struct alt_instr))));
+
   return 0;
 }
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH v4 05/19] arm64: alternatives: Add dynamic patching feature
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm; +Cc: Catalin Marinas, Will Deacon

We've so far relied on a patching infrastructure that only gave us
a single alternative, without any way to finely control what gets
patched. For a single feature, this is an all-or-nothing thing.

It would be interesting to have a more fine-grained way of patching
the kernel though, where we could dynamically tune the code that gets
injected.

In order to achieve this, let's introduce a new form of alternative
that is associated with a callback. This callback gets the instruction
sequence number and the old instruction as a parameter, and returns
the new instruction. This callback is always called, as the patching
decision is now done at runtime (not patching is equivalent to returning
the same instruction).

Patching with a callback is declared with the new ALTERNATIVE_CB
and alternative_cb directives:

	asm volatile(ALTERNATIVE_CB("mov %0, #0\n", callback)
		     : "r" (v));
or
	alternative_cb callback
		mov	x0, #0
	alternative_cb_end

where callback is the C function computing the alternative.
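
For illustration, here is a made-up callback (not one from this patch)
with the alternative_cb_t signature: it gets handed the original and
destination buffers plus the number of instructions, and must write
nr_inst instructions to updptr. Writing back the original instructions
unchanged is the "no patching" case.

	static void example_nop_cb(struct alt_instr *alt,
				   __le32 *origptr, __le32 *updptr,
				   int nr_inst)
	{
		int i;

		/* Blat the whole sequence with NOPs */
		for (i = 0; i < nr_inst; i++)
			updptr[i] = cpu_to_le32(aarch64_insn_gen_nop());
	}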

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/alternative.h       | 36 ++++++++++++++++++++++---
 arch/arm64/include/asm/alternative_types.h |  4 +++
 arch/arm64/kernel/alternative.c            | 43 ++++++++++++++++++++++--------
 3 files changed, 68 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
index 395befde7595..04f66f6173fc 100644
--- a/arch/arm64/include/asm/alternative.h
+++ b/arch/arm64/include/asm/alternative.h
@@ -18,10 +18,14 @@
 void __init apply_alternatives_all(void);
 void apply_alternatives(void *start, size_t length);
 
-#define ALTINSTR_ENTRY(feature)						      \
+#define ALTINSTR_ENTRY(feature,cb)					      \
 	" .align " __stringify(ALTINSTR_ALIGN) "\n"			      \
 	" .word 661b - .\n"				/* label           */ \
+	" .if " __stringify(cb) " == 0\n"				      \
 	" .word 663f - .\n"				/* new instruction */ \
+	" .else\n"							      \
+	" .word " __stringify(cb) "- .\n"		/* callback */	      \
+	" .endif\n"							      \
 	" .hword " __stringify(feature) "\n"		/* feature bit     */ \
 	" .byte 662b-661b\n"				/* source len      */ \
 	" .byte 664f-663f\n"				/* replacement len */
@@ -39,15 +43,18 @@ void apply_alternatives(void *start, size_t length);
  * but most assemblers die if insn1 or insn2 have a .inst. This should
  * be fixed in a binutils release posterior to 2.25.51.0.2 (anything
  * containing commit 4e4d08cf7399b606 or c1baaddf8861).
+ *
+ * Alternatives with callbacks do not generate replacement instructions.
  */
-#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled)	\
+#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled, cb)	\
 	".if "__stringify(cfg_enabled)" == 1\n"				\
 	"661:\n\t"							\
 	oldinstr "\n"							\
 	"662:\n"							\
 	".pushsection .altinstructions,\"a\"\n"				\
-	ALTINSTR_ENTRY(feature)						\
+	ALTINSTR_ENTRY(feature,cb)					\
 	".popsection\n"							\
+	" .if " __stringify(cb) " == 0\n"				\
 	".pushsection .altinstr_replacement, \"a\"\n"			\
 	"663:\n\t"							\
 	newinstr "\n"							\
@@ -55,11 +62,17 @@ void apply_alternatives(void *start, size_t length);
 	".popsection\n\t"						\
 	".org	. - (664b-663b) + (662b-661b)\n\t"			\
 	".org	. - (662b-661b) + (664b-663b)\n"			\
+	".else\n\t"							\
+	"663:\n\t"							\
+	"664:\n\t"							\
+	".endif\n"							\
 	".endif\n"
 
 #define _ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg, ...)	\
-	__ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg))
+	__ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg), 0)
 
+#define ALTERNATIVE_CB(oldinstr, cb) \
+	__ALTERNATIVE_CFG(oldinstr, "NOT_AN_INSTRUCTION", ARM64_NCAPS, 1, cb)
 #else
 
 #include <asm/assembler.h>
@@ -127,6 +140,14 @@ void apply_alternatives(void *start, size_t length);
 661:
 .endm
 
+.macro alternative_cb cb
+	.set .Lasm_alt_mode, 0
+	.pushsection .altinstructions, "a"
+	altinstruction_entry 661f, \cb, ARM64_NCAPS, 662f-661f, 0
+	.popsection
+661:
+.endm
+
 /*
  * Provide the other half of the alternative code sequence.
  */
@@ -152,6 +173,13 @@ void apply_alternatives(void *start, size_t length);
 	.org	. - (662b-661b) + (664b-663b)
 .endm
 
+/*
+ * Callback-based alternative epilogue
+ */
+.macro alternative_cb_end
+662:
+.endm
+
 /*
  * Provides a trivial alternative or default sequence consisting solely
  * of NOPs. The number of NOPs is chosen automatically to match the
diff --git a/arch/arm64/include/asm/alternative_types.h b/arch/arm64/include/asm/alternative_types.h
index 26cf76167f2d..e400b9061957 100644
--- a/arch/arm64/include/asm/alternative_types.h
+++ b/arch/arm64/include/asm/alternative_types.h
@@ -2,6 +2,10 @@
 #ifndef __ASM_ALTERNATIVE_TYPES_H
 #define __ASM_ALTERNATIVE_TYPES_H
 
+struct alt_instr;
+typedef void (*alternative_cb_t)(struct alt_instr *alt,
+				 __le32 *origptr, __le32 *updptr, int nr_inst);
+
 struct alt_instr {
 	s32 orig_offset;	/* offset to original instruction */
 	s32 alt_offset;		/* offset to replacement instruction */
diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index 6dd0a3a3e5c9..0f52627fbb29 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -105,32 +105,53 @@ static u32 get_alt_insn(struct alt_instr *alt, __le32 *insnptr, __le32 *altinsnp
 	return insn;
 }
 
+static void patch_alternative(struct alt_instr *alt,
+			      __le32 *origptr, __le32 *updptr, int nr_inst)
+{
+	__le32 *replptr;
+	int i;
+
+	replptr = ALT_REPL_PTR(alt);
+	for (i = 0; i < nr_inst; i++) {
+		u32 insn;
+
+		insn = get_alt_insn(alt, origptr + i, replptr + i);
+		updptr[i] = cpu_to_le32(insn);
+	}
+}
+
 static void __apply_alternatives(void *alt_region, bool use_linear_alias)
 {
 	struct alt_instr *alt;
 	struct alt_region *region = alt_region;
-	__le32 *origptr, *replptr, *updptr;
+	__le32 *origptr, *updptr;
+	alternative_cb_t alt_cb;
 
 	for (alt = region->begin; alt < region->end; alt++) {
-		u32 insn;
-		int i, nr_inst;
+		int nr_inst;
 
-		if (!cpus_have_cap(alt->cpufeature))
+		/* Use ARM64_NCAPS as an unconditional patch */
+		if (alt->cpufeature < ARM64_NCAPS &&
+		    !cpus_have_cap(alt->cpufeature))
 			continue;
 
-		BUG_ON(alt->alt_len != alt->orig_len);
+		if (alt->cpufeature == ARM64_NCAPS)
+			BUG_ON(alt->alt_len != 0);
+		else
+			BUG_ON(alt->alt_len != alt->orig_len);
 
 		pr_info_once("patching kernel code\n");
 
 		origptr = ALT_ORIG_PTR(alt);
-		replptr = ALT_REPL_PTR(alt);
 		updptr = use_linear_alias ? lm_alias(origptr) : origptr;
-		nr_inst = alt->alt_len / sizeof(insn);
+		nr_inst = alt->orig_len / AARCH64_INSN_SIZE;
 
-		for (i = 0; i < nr_inst; i++) {
-			insn = get_alt_insn(alt, origptr + i, replptr + i);
-			updptr[i] = cpu_to_le32(insn);
-		}
+		if (alt->cpufeature < ARM64_NCAPS)
+			alt_cb = patch_alternative;
+		else
+			alt_cb  = ALT_REPL_PTR(alt);
+
+		alt_cb(alt, origptr, updptr, nr_inst);
 
 		flush_icache_range((uintptr_t)origptr,
 				   (uintptr_t)(origptr + nr_inst));
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH v4 06/19] arm64: insn: Add N immediate encoding
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm; +Cc: Catalin Marinas, Will Deacon

We're missing a way to generate the encoding of the N immediate,
which is only a single bit, used in a number of instructions that
take an immediate.
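
As a hypothetical use (not taken from this patch), setting the N bit
of an existing logical-immediate instruction then looks like:

	/* N lives in bit 22 of the logical (immediate) encoding */
	insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_N, insn, 1);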

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/insn.h | 1 +
 arch/arm64/kernel/insn.c      | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 4214c38d016b..21fffdd290a3 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -70,6 +70,7 @@ enum aarch64_insn_imm_type {
 	AARCH64_INSN_IMM_6,
 	AARCH64_INSN_IMM_S,
 	AARCH64_INSN_IMM_R,
+	AARCH64_INSN_IMM_N,
 	AARCH64_INSN_IMM_MAX
 };
 
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 2718a77da165..7e432662d454 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -343,6 +343,10 @@ static int __kprobes aarch64_get_imm_shift_mask(enum aarch64_insn_imm_type type,
 		mask = BIT(6) - 1;
 		shift = 16;
 		break;
+	case AARCH64_INSN_IMM_N:
+		mask = 1;
+		shift = 22;
+		break;
 	default:
 		return -EINVAL;
 	}
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH v4 07/19] arm64: insn: Add encoder for bitwise operations using literals
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: Christoffer Dall, Mark Rutland, Catalin Marinas, Will Deacon,
	James Morse, Steve Capper, Peter Maydell

We lack a way to encode operations such as AND, ORR, EOR that take
an immediate value. Doing so is quite involved, and is all about
reverse engineering the decoding algorithm described in the
pseudocode function DecodeBitMasks().

This has been tested by feeding it all the possible literal values
and comparing the output with that of GAS.
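
A hypothetical caller (not part of the patch) would use the new
encoder along these lines, checking for AARCH64_BREAK_FAULT when the
value has no logical-immediate encoding:

	u32 insn;

	/* Encode "and w0, w1, #0x00ff00ff", a repeating 16bit pattern */
	insn = aarch64_insn_gen_logical_immediate(AARCH64_INSN_LOGIC_AND,
						  AARCH64_INSN_VARIANT_32BIT,
						  AARCH64_INSN_REG_1, /* Rn */
						  AARCH64_INSN_REG_0, /* Rd */
						  0x00ff00ff);
	if (insn == AARCH64_BREAK_FAULT)
		return -EINVAL;	/* all zeroes/all ones can't be encoded */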

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/insn.h |   9 +++
 arch/arm64/kernel/insn.c      | 136 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 145 insertions(+)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 21fffdd290a3..815b35bc53ed 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -315,6 +315,10 @@ __AARCH64_INSN_FUNCS(eor,	0x7F200000, 0x4A000000)
 __AARCH64_INSN_FUNCS(eon,	0x7F200000, 0x4A200000)
 __AARCH64_INSN_FUNCS(ands,	0x7F200000, 0x6A000000)
 __AARCH64_INSN_FUNCS(bics,	0x7F200000, 0x6A200000)
+__AARCH64_INSN_FUNCS(and_imm,	0x7F800000, 0x12000000)
+__AARCH64_INSN_FUNCS(orr_imm,	0x7F800000, 0x32000000)
+__AARCH64_INSN_FUNCS(eor_imm,	0x7F800000, 0x52000000)
+__AARCH64_INSN_FUNCS(ands_imm,	0x7F800000, 0x72000000)
 __AARCH64_INSN_FUNCS(b,		0xFC000000, 0x14000000)
 __AARCH64_INSN_FUNCS(bl,	0xFC000000, 0x94000000)
 __AARCH64_INSN_FUNCS(cbz,	0x7F000000, 0x34000000)
@@ -424,6 +428,11 @@ u32 aarch64_insn_gen_logical_shifted_reg(enum aarch64_insn_register dst,
 					 int shift,
 					 enum aarch64_insn_variant variant,
 					 enum aarch64_insn_logic_type type);
+u32 aarch64_insn_gen_logical_immediate(enum aarch64_insn_logic_type type,
+				       enum aarch64_insn_variant variant,
+				       enum aarch64_insn_register Rn,
+				       enum aarch64_insn_register Rd,
+				       u64 imm);
 u32 aarch64_insn_gen_prefetch(enum aarch64_insn_register base,
 			      enum aarch64_insn_prfm_type type,
 			      enum aarch64_insn_prfm_target target,
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 7e432662d454..72cb1721c63f 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -1485,3 +1485,139 @@ pstate_check_t * const aarch32_opcode_cond_checks[16] = {
 	__check_hi, __check_ls, __check_ge, __check_lt,
 	__check_gt, __check_le, __check_al, __check_al
 };
+
+static bool range_of_ones(u64 val)
+{
+	/* Doesn't handle full ones or full zeroes */
+	u64 sval = val >> __ffs64(val);
+
+	/* One of Sean Eron Anderson's bithack tricks */
+	return ((sval + 1) & (sval)) == 0;
+}
+
+static u32 aarch64_encode_immediate(u64 imm,
+				    enum aarch64_insn_variant variant,
+				    u32 insn)
+{
+	unsigned int immr, imms, n, ones, ror, esz, tmp;
+	u64 mask = ~0UL;
+
+	/* Can't encode full zeroes or full ones */
+	if (!imm || !~imm)
+		return AARCH64_BREAK_FAULT;
+
+	switch (variant) {
+	case AARCH64_INSN_VARIANT_32BIT:
+		if (upper_32_bits(imm))
+			return AARCH64_BREAK_FAULT;
+		esz = 32;
+		break;
+	case AARCH64_INSN_VARIANT_64BIT:
+		insn |= AARCH64_INSN_SF_BIT;
+		esz = 64;
+		break;
+	default:
+		pr_err("%s: unknown variant encoding %d\n", __func__, variant);
+		return AARCH64_BREAK_FAULT;
+	}
+
+	/*
+	 * Inverse of Replicate(). Try to spot a repeating pattern
+	 * with a pow2 stride.
+	 */
+	for (tmp = esz / 2; tmp >= 2; tmp /= 2) {
+		u64 emask = BIT(tmp) - 1;
+
+		if ((imm & emask) != ((imm >> (tmp / 2)) & emask))
+			break;
+
+		esz = tmp;
+		mask = emask;
+	}
+
+	/* N is only set if we're encoding a 64bit value */
+	n = esz == 64;
+
+	/* Trim imm to the element size */
+	imm &= mask;
+
+	/* That's how many ones we need to encode */
+	ones = hweight64(imm);
+
+	/*
+	 * imms is set to (ones - 1), prefixed with a string of ones
+	 * and a zero if they fit. Cap it to 6 bits.
+	 */
+	imms  = ones - 1;
+	imms |= 0xf << ffs(esz);
+	imms &= BIT(6) - 1;
+
+	/* Compute the rotation */
+	if (range_of_ones(imm)) {
+		/*
+		 * Pattern: 0..01..10..0
+		 *
+		 * Compute how many rotate we need to align it right
+		 */
+		ror = __ffs64(imm);
+	} else {
+		/*
+		 * Pattern: 0..01..10..01..1
+		 *
+		 * Fill the unused top bits with ones, and check if
+		 * the result is a valid immediate (all ones with a
+		 * contiguous ranges of zeroes).
+		 */
+		imm |= ~mask;
+		if (!range_of_ones(~imm))
+			return AARCH64_BREAK_FAULT;
+
+		/*
+		 * Compute the rotation to get a continuous set of
+		 * ones, with the first bit set at position 0
+		 */
+		ror = fls(~imm);
+	}
+
+	/*
+	 * immr is the number of bits we need to rotate back to the
+	 * original set of ones. Note that this is relative to the
+	 * element size...
+	 */
+	immr = (esz - ror) % esz;
+
+	insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_N, insn, n);
+	insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_R, insn, immr);
+	return aarch64_insn_encode_immediate(AARCH64_INSN_IMM_S, insn, imms);
+}
+
+u32 aarch64_insn_gen_logical_immediate(enum aarch64_insn_logic_type type,
+				       enum aarch64_insn_variant variant,
+				       enum aarch64_insn_register Rn,
+				       enum aarch64_insn_register Rd,
+				       u64 imm)
+{
+	u32 insn;
+
+	switch (type) {
+	case AARCH64_INSN_LOGIC_AND:
+		insn = aarch64_insn_get_and_imm_value();
+		break;
+	case AARCH64_INSN_LOGIC_ORR:
+		insn = aarch64_insn_get_orr_imm_value();
+		break;
+	case AARCH64_INSN_LOGIC_EOR:
+		insn = aarch64_insn_get_eor_imm_value();
+		break;
+	case AARCH64_INSN_LOGIC_AND_SETFLAGS:
+		insn = aarch64_insn_get_ands_imm_value();
+		break;
+	default:
+		pr_err("%s: unknown logical encoding %d\n", __func__, type);
+		return AARCH64_BREAK_FAULT;
+	}
+
+	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RD, insn, Rd);
+	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, Rn);
+	return aarch64_encode_immediate(imm, variant, insn);
+}
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH v4 07/19] arm64: insn: Add encoder for bitwise operations using literals
@ 2018-01-04 18:43   ` Marc Zyngier
  0 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel

We lack a way to encode operations such as AND, ORR, EOR that take
an immediate value. Doing so is quite involved, and is all about
reverse engineering the decoding algorithm described in the
pseudocode function DecodeBitMasks().

This has been tested by feeding it all the possible literal values
and comparing the output with that of GAS.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/insn.h |   9 +++
 arch/arm64/kernel/insn.c      | 136 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 145 insertions(+)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 21fffdd290a3..815b35bc53ed 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -315,6 +315,10 @@ __AARCH64_INSN_FUNCS(eor,	0x7F200000, 0x4A000000)
 __AARCH64_INSN_FUNCS(eon,	0x7F200000, 0x4A200000)
 __AARCH64_INSN_FUNCS(ands,	0x7F200000, 0x6A000000)
 __AARCH64_INSN_FUNCS(bics,	0x7F200000, 0x6A200000)
+__AARCH64_INSN_FUNCS(and_imm,	0x7F800000, 0x12000000)
+__AARCH64_INSN_FUNCS(orr_imm,	0x7F800000, 0x32000000)
+__AARCH64_INSN_FUNCS(eor_imm,	0x7F800000, 0x52000000)
+__AARCH64_INSN_FUNCS(ands_imm,	0x7F800000, 0x72000000)
 __AARCH64_INSN_FUNCS(b,		0xFC000000, 0x14000000)
 __AARCH64_INSN_FUNCS(bl,	0xFC000000, 0x94000000)
 __AARCH64_INSN_FUNCS(cbz,	0x7F000000, 0x34000000)
@@ -424,6 +428,11 @@ u32 aarch64_insn_gen_logical_shifted_reg(enum aarch64_insn_register dst,
 					 int shift,
 					 enum aarch64_insn_variant variant,
 					 enum aarch64_insn_logic_type type);
+u32 aarch64_insn_gen_logical_immediate(enum aarch64_insn_logic_type type,
+				       enum aarch64_insn_variant variant,
+				       enum aarch64_insn_register Rn,
+				       enum aarch64_insn_register Rd,
+				       u64 imm);
 u32 aarch64_insn_gen_prefetch(enum aarch64_insn_register base,
 			      enum aarch64_insn_prfm_type type,
 			      enum aarch64_insn_prfm_target target,
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 7e432662d454..72cb1721c63f 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -1485,3 +1485,139 @@ pstate_check_t * const aarch32_opcode_cond_checks[16] = {
 	__check_hi, __check_ls, __check_ge, __check_lt,
 	__check_gt, __check_le, __check_al, __check_al
 };
+
+static bool range_of_ones(u64 val)
+{
+	/* Doesn't handle full ones or full zeroes */
+	u64 sval = val >> __ffs64(val);
+
+	/* One of Sean Eron Anderson's bithack tricks */
+	return ((sval + 1) & (sval)) == 0;
+}
+
+static u32 aarch64_encode_immediate(u64 imm,
+				    enum aarch64_insn_variant variant,
+				    u32 insn)
+{
+	unsigned int immr, imms, n, ones, ror, esz, tmp;
+	u64 mask = ~0UL;
+
+	/* Can't encode full zeroes or full ones */
+	if (!imm || !~imm)
+		return AARCH64_BREAK_FAULT;
+
+	switch (variant) {
+	case AARCH64_INSN_VARIANT_32BIT:
+		if (upper_32_bits(imm))
+			return AARCH64_BREAK_FAULT;
+		esz = 32;
+		break;
+	case AARCH64_INSN_VARIANT_64BIT:
+		insn |= AARCH64_INSN_SF_BIT;
+		esz = 64;
+		break;
+	default:
+		pr_err("%s: unknown variant encoding %d\n", __func__, variant);
+		return AARCH64_BREAK_FAULT;
+	}
+
+	/*
+	 * Inverse of Replicate(). Try to spot a repeating pattern
+	 * with a pow2 stride.
+	 */
+	for (tmp = esz / 2; tmp >= 2; tmp /= 2) {
+		u64 emask = BIT(tmp) - 1;
+
+		if ((imm & emask) != ((imm >> tmp) & emask))
+			break;
+
+		esz = tmp;
+		mask = emask;
+	}
+
+	/* N is only set if we're encoding a 64bit value */
+	n = esz == 64;
+
+	/* Trim imm to the element size */
+	imm &= mask;
+
+	/* That's how many ones we need to encode */
+	ones = hweight64(imm);
+
+	/*
+	 * imms is set to (ones - 1), prefixed with a string of ones
+	 * and a zero if they fit. Cap it to 6 bits.
+	 */
+	imms  = ones - 1;
+	imms |= 0xf << ffs(esz);
+	imms &= BIT(6) - 1;
+
+	/* Compute the rotation */
+	if (range_of_ones(imm)) {
+		/*
+		 * Pattern: 0..01..10..0
+		 *
+		 * Compute how many rotate we need to align it right
+		 */
+		ror = __ffs64(imm);
+	} else {
+		/*
+		 * Pattern: 0..01..10..01..1
+		 *
+		 * Fill the unused top bits with ones, and check if
+		 * the result is a valid immediate (all ones with a
+		 * contiguous range of zeroes).
+		 */
+		imm |= ~mask;
+		if (!range_of_ones(~imm))
+			return AARCH64_BREAK_FAULT;
+
+		/*
+		 * Compute the rotation to get a continuous set of
+		 * ones, with the first bit set at position 0
+		 */
+		ror = fls64(~imm);
+	}
+
+	/*
+	 * immr is the number of bits we need to rotate back to the
+	 * original set of ones. Note that this is relative to the
+	 * element size...
+	 */
+	immr = (esz - ror) % esz;
+
+	insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_N, insn, n);
+	insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_R, insn, immr);
+	return aarch64_insn_encode_immediate(AARCH64_INSN_IMM_S, insn, imms);
+}
+
+u32 aarch64_insn_gen_logical_immediate(enum aarch64_insn_logic_type type,
+				       enum aarch64_insn_variant variant,
+				       enum aarch64_insn_register Rn,
+				       enum aarch64_insn_register Rd,
+				       u64 imm)
+{
+	u32 insn;
+
+	switch (type) {
+	case AARCH64_INSN_LOGIC_AND:
+		insn = aarch64_insn_get_and_imm_value();
+		break;
+	case AARCH64_INSN_LOGIC_ORR:
+		insn = aarch64_insn_get_orr_imm_value();
+		break;
+	case AARCH64_INSN_LOGIC_EOR:
+		insn = aarch64_insn_get_eor_imm_value();
+		break;
+	case AARCH64_INSN_LOGIC_AND_SETFLAGS:
+		insn = aarch64_insn_get_ands_imm_value();
+		break;
+	default:
+		pr_err("%s: unknown logical encoding %d\n", __func__, type);
+		return AARCH64_BREAK_FAULT;
+	}
+
+	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RD, insn, Rd);
+	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, Rn);
+	return aarch64_encode_immediate(imm, variant, insn);
+}
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH v4 08/19] arm64: KVM: Dynamically patch the kernel/hyp VA mask
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: Christoffer Dall, Mark Rutland, Catalin Marinas, Will Deacon,
	James Morse, Steve Capper, Peter Maydell

So far, we're using a complicated sequence of alternatives to
patch the kernel/hyp VA mask on non-VHE, and NOP out the
masking altogether when on VHE.

The newly introduced dynamic patching gives us the opportunity
to simplify that code by patching a single instruction with
the correct mask (instead of the mind-bending cumulative masking
we have at the moment) or even a single NOP on VHE.
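
As a concrete example (not part of the patch; VA_BITS == 48 is assumed,
and the values simply follow the HYP_PAGE_OFFSET_*_MASK definitions
that now live in va_layout.c), the single patched instruction ends up
being one of the following:

#include <stdio.h>

#define VA_BITS	48	/* assumed for the example */

int main(void)
{
	unsigned long long high = (1ULL << VA_BITS) - 1;	/* HYP_PAGE_OFFSET_HIGH_MASK */
	unsigned long long low  = (1ULL << (VA_BITS - 1)) - 1;	/* HYP_PAGE_OFFSET_LOW_MASK  */

	/* non-VHE: the "and reg, reg, #1" placeholder becomes one of: */
	printf("and x0, x0, #0x%llx\n", high);
	printf("and x0, x0, #0x%llx\n", low);
	/* VHE: the placeholder is simply turned into a NOP */
	printf("nop\n");
	return 0;
}

Either mask is a contiguous run of ones, so the encoder introduced in
the previous patch can always generate the corresponding AND immediate.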

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/kvm_mmu.h | 45 ++++++--------------
 arch/arm64/kvm/Makefile          |  2 +-
 arch/arm64/kvm/va_layout.c       | 91 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 104 insertions(+), 34 deletions(-)
 create mode 100644 arch/arm64/kvm/va_layout.c

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 672c8684d5c2..b0c3cbe9b513 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -69,9 +69,6 @@
  * mappings, and none of this applies in that case.
  */
 
-#define HYP_PAGE_OFFSET_HIGH_MASK	((UL(1) << VA_BITS) - 1)
-#define HYP_PAGE_OFFSET_LOW_MASK	((UL(1) << (VA_BITS - 1)) - 1)
-
 #ifdef __ASSEMBLY__
 
 #include <asm/alternative.h>
@@ -81,28 +78,14 @@
  * Convert a kernel VA into a HYP VA.
  * reg: VA to be converted.
  *
- * This generates the following sequences:
- * - High mask:
- *		and x0, x0, #HYP_PAGE_OFFSET_HIGH_MASK
- *		nop
- * - Low mask:
- *		and x0, x0, #HYP_PAGE_OFFSET_HIGH_MASK
- *		and x0, x0, #HYP_PAGE_OFFSET_LOW_MASK
- * - VHE:
- *		nop
- *		nop
- *
- * The "low mask" version works because the mask is a strict subset of
- * the "high mask", hence performing the first mask for nothing.
- * Should be completely invisible on any viable CPU.
+ * The actual code generation takes place in kvm_update_va_mask, and
+ * the instructions below are only there to reserve the space and
+ * perform the register allocation.
  */
 .macro kern_hyp_va	reg
-alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
-	and     \reg, \reg, #HYP_PAGE_OFFSET_HIGH_MASK
-alternative_else_nop_endif
-alternative_if ARM64_HYP_OFFSET_LOW
-	and     \reg, \reg, #HYP_PAGE_OFFSET_LOW_MASK
-alternative_else_nop_endif
+alternative_cb kvm_update_va_mask
+	and     \reg, \reg, #1
+alternative_cb_end
 .endm
 
 #else
@@ -113,18 +96,14 @@ alternative_else_nop_endif
 #include <asm/mmu_context.h>
 #include <asm/pgtable.h>
 
+void kvm_update_va_mask(struct alt_instr *alt,
+			__le32 *origptr, __le32 *updptr, int nr_inst);
+
 static inline unsigned long __kern_hyp_va(unsigned long v)
 {
-	asm volatile(ALTERNATIVE("and %0, %0, %1",
-				 "nop",
-				 ARM64_HAS_VIRT_HOST_EXTN)
-		     : "+r" (v)
-		     : "i" (HYP_PAGE_OFFSET_HIGH_MASK));
-	asm volatile(ALTERNATIVE("nop",
-				 "and %0, %0, %1",
-				 ARM64_HYP_OFFSET_LOW)
-		     : "+r" (v)
-		     : "i" (HYP_PAGE_OFFSET_LOW_MASK));
+	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n",
+				    kvm_update_va_mask)
+		     : "+r" (v));
 	return v;
 }
 
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 87c4f7ae24de..93afff91cb7c 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -16,7 +16,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/e
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
 kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
 
-kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o
+kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
 kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
 kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o sys_regs_generic_v8.o
 kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o
diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
new file mode 100644
index 000000000000..aee758574e61
--- /dev/null
+++ b/arch/arm64/kvm/va_layout.c
@@ -0,0 +1,91 @@
+/*
+ * Copyright (C) 2017 ARM Ltd.
+ * Author: Marc Zyngier <marc.zyngier@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/kvm_host.h>
+#include <asm/alternative.h>
+#include <asm/debug-monitors.h>
+#include <asm/insn.h>
+#include <asm/kvm_mmu.h>
+
+#define HYP_PAGE_OFFSET_HIGH_MASK	((UL(1) << VA_BITS) - 1)
+#define HYP_PAGE_OFFSET_LOW_MASK	((UL(1) << (VA_BITS - 1)) - 1)
+
+static u64 va_mask;
+
+static void compute_layout(void)
+{
+	phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
+	unsigned long mask = HYP_PAGE_OFFSET_HIGH_MASK;
+
+	/*
+	 * Activate the lower HYP offset only if the idmap doesn't
+	 * clash with it.
+	 */
+	if (idmap_addr > HYP_PAGE_OFFSET_LOW_MASK)
+		mask = HYP_PAGE_OFFSET_LOW_MASK;
+
+	va_mask = mask;
+}
+
+static u32 compute_instruction(int n, u32 rd, u32 rn)
+{
+	u32 insn = AARCH64_BREAK_FAULT;
+
+	switch (n) {
+	case 0:
+		insn = aarch64_insn_gen_logical_immediate(AARCH64_INSN_LOGIC_AND,
+							  AARCH64_INSN_VARIANT_64BIT,
+							  rn, rd, va_mask);
+		break;
+	}
+
+	return insn;
+}
+
+void __init kvm_update_va_mask(struct alt_instr *alt,
+			       __le32 *origptr, __le32 *updptr, int nr_inst)
+{
+	int i;
+
+	/* We only expect a 1 instruction sequence */
+	BUG_ON(nr_inst != 1);
+
+	if (!has_vhe() && !va_mask)
+		compute_layout();
+
+	for (i = 0; i < nr_inst; i++) {
+		u32 rd, rn, insn, oinsn;
+
+		/*
+		 * VHE doesn't need any address translation, let's NOP
+		 * everything.
+		 */
+		if (has_vhe()) {
+			updptr[i] = aarch64_insn_gen_nop();
+			continue;
+		}
+
+		oinsn = le32_to_cpu(origptr[i]);
+		rd = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RD, oinsn);
+		rn = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, oinsn);
+
+		insn = compute_instruction(i, rd, rn);
+		BUG_ON(insn == AARCH64_BREAK_FAULT);
+
+		updptr[i] = cpu_to_le32(insn);
+	}
+}
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH v4 09/19] arm64: cpufeatures: Drop the ARM64_HYP_OFFSET_LOW feature flag
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: Christoffer Dall, Mark Rutland, Catalin Marinas, Will Deacon,
	James Morse, Steve Capper, Peter Maydell

Now that we can dynamically compute the kernel/hyp VA mask, there
is no need for a feature flag to trigger the alternative patching.
Let's drop the flag and everything that depends on it.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/cpucaps.h |  2 +-
 arch/arm64/kernel/cpufeature.c   | 19 -------------------
 2 files changed, 1 insertion(+), 20 deletions(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 2ff7c5e8efab..f130f35dca3c 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -32,7 +32,7 @@
 #define ARM64_HAS_VIRT_HOST_EXTN		11
 #define ARM64_WORKAROUND_CAVIUM_27456		12
 #define ARM64_HAS_32BIT_EL0			13
-#define ARM64_HYP_OFFSET_LOW			14
+/* #define ARM64_UNALLOCATED_ENTRY			14 */
 #define ARM64_MISMATCHED_CACHE_LINE_SIZE	15
 #define ARM64_HAS_NO_FPSIMD			16
 #define ARM64_WORKAROUND_REPEAT_TLBI		17
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index a73a5928f09b..b99f8b1688c3 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -825,19 +825,6 @@ static bool runs_at_el2(const struct arm64_cpu_capabilities *entry, int __unused
 	return is_kernel_in_hyp_mode();
 }
 
-static bool hyp_offset_low(const struct arm64_cpu_capabilities *entry,
-			   int __unused)
-{
-	phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
-
-	/*
-	 * Activate the lower HYP offset only if:
-	 * - the idmap doesn't clash with it,
-	 * - the kernel is not running at EL2.
-	 */
-	return idmap_addr > GENMASK(VA_BITS - 2, 0) && !is_kernel_in_hyp_mode();
-}
-
 static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unused)
 {
 	u64 pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
@@ -926,12 +913,6 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.field_pos = ID_AA64PFR0_EL0_SHIFT,
 		.min_field_value = ID_AA64PFR0_EL0_32BIT_64BIT,
 	},
-	{
-		.desc = "Reduced HYP mapping offset",
-		.capability = ARM64_HYP_OFFSET_LOW,
-		.def_scope = SCOPE_SYSTEM,
-		.matches = hyp_offset_low,
-	},
 	{
 		/* FP/SIMD is not implemented */
 		.capability = ARM64_HAS_NO_FPSIMD,
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH v4 10/19] KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_state
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: Christoffer Dall, Mark Rutland, Catalin Marinas, Will Deacon,
	James Morse, Steve Capper, Peter Maydell

kvm_vgic_global_state is part of the read-only section, and is
usually accessed using PC-relative address generation (adrp + add).

It is thus useless to use kern_hyp_va() on it, and actively problematic
if kern_hyp_va() becomes non-idempotent. On the other hand, there is
no way that the compiler is going to guarantee that such an access
is always PC-relative.

So let's bite the bullet and provide our own accessor.
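
As a sketch (the function below is made up for illustration, but
hyp_symbol_addr() and the nr_lr field are exactly what this patch
touches), a HYP function now gets at the global like this:

static int __hyp_text example_nr_lr(void)
{
	/* adrp/add gives a PC-relative address, valid at EL2 */
	return hyp_symbol_addr(kvm_vgic_global_state)->nr_lr;
}

Because the address is computed relative to the PC, the result is
correct wherever the HYP text is mapped, without any kern_hyp_va()
translation.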

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/include/asm/kvm_hyp.h   | 6 ++++++
 arch/arm64/include/asm/kvm_hyp.h | 9 +++++++++
 virt/kvm/arm/hyp/vgic-v2-sr.c    | 4 ++--
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
index ab20ffa8b9e7..1d42d0aa2feb 100644
--- a/arch/arm/include/asm/kvm_hyp.h
+++ b/arch/arm/include/asm/kvm_hyp.h
@@ -26,6 +26,12 @@
 
 #define __hyp_text __section(.hyp.text) notrace
 
+#define hyp_symbol_addr(s)						\
+	({								\
+		typeof(s) *addr = &(s);					\
+		addr;							\
+	})
+
 #define __ACCESS_VFP(CRn)			\
 	"mrc", "mcr", __stringify(p10, 7, %0, CRn, cr0, 0), u32
 
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 08d3bb66c8b7..a2d98c539023 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -25,6 +25,15 @@
 
 #define __hyp_text __section(.hyp.text) notrace
 
+#define hyp_symbol_addr(s)						\
+	({								\
+		typeof(s) *addr;					\
+		asm volatile("adrp	%0, %1\n"			\
+			     "add	%0, %0, :lo12:%1\n"		\
+			     : "=r" (addr) : "S" (&s));			\
+		addr;							\
+	})
+
 #define read_sysreg_elx(r,nvh,vh)					\
 	({								\
 		u64 reg;						\
diff --git a/virt/kvm/arm/hyp/vgic-v2-sr.c b/virt/kvm/arm/hyp/vgic-v2-sr.c
index d7fd46fe9efb..4573d0552af3 100644
--- a/virt/kvm/arm/hyp/vgic-v2-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v2-sr.c
@@ -25,7 +25,7 @@
 static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
 {
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-	int nr_lr = (kern_hyp_va(&kvm_vgic_global_state))->nr_lr;
+	int nr_lr = hyp_symbol_addr(kvm_vgic_global_state)->nr_lr;
 	u32 elrsr0, elrsr1;
 
 	elrsr0 = readl_relaxed(base + GICH_ELRSR0);
@@ -139,7 +139,7 @@ int __hyp_text __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu)
 		return -1;
 
 	rd = kvm_vcpu_dabt_get_rd(vcpu);
-	addr  = kern_hyp_va((kern_hyp_va(&kvm_vgic_global_state))->vcpu_base_va);
+	addr  = kern_hyp_va(hyp_symbol_addr(kvm_vgic_global_state)->vcpu_base_va);
 	addr += fault_ipa - vgic->vgic_cpu_base;
 
 	if (kvm_vcpu_dabt_iswrite(vcpu)) {
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH v4 11/19] KVM: arm/arm64: Demote HYP VA range display to being a debug feature
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: Christoffer Dall, Mark Rutland, Catalin Marinas, Will Deacon,
	James Morse, Steve Capper, Peter Maydell

Displaying the HYP VA information is slightly counterproductive when
using VA randomisation. Turn it into a debug feature only, and adjust
the last displayed value to reflect the top of RAM instead of ~0.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/mmu.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index b4b69c2d1012..84d09f1a44d4 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1760,9 +1760,10 @@ int kvm_mmu_init(void)
 	 */
 	BUG_ON((hyp_idmap_start ^ (hyp_idmap_end - 1)) & PAGE_MASK);
 
-	kvm_info("IDMAP page: %lx\n", hyp_idmap_start);
-	kvm_info("HYP VA range: %lx:%lx\n",
-		 kern_hyp_va(PAGE_OFFSET), kern_hyp_va(~0UL));
+	kvm_debug("IDMAP page: %lx\n", hyp_idmap_start);
+	kvm_debug("HYP VA range: %lx:%lx\n",
+		  kern_hyp_va(PAGE_OFFSET),
+		  kern_hyp_va((unsigned long)high_memory - 1));
 
 	if (hyp_idmap_start >= kern_hyp_va(PAGE_OFFSET) &&
 	    hyp_idmap_start <  kern_hyp_va(~0UL) &&
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH v4 12/19] KVM: arm/arm64: Move ioremap calls to create_hyp_io_mappings
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: Christoffer Dall, Mark Rutland, Catalin Marinas, Will Deacon,
	James Morse, Steve Capper, Peter Maydell

Both HYP IO mappings call ioremap, followed by create_hyp_io_mappings.
Let's move the ioremap call into create_hyp_io_mappings itself, which
simplifies the code a bit and allows for further refactoring.
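
In other words, a caller goes from the old calling convention to the
new one as below (illustration only, with made-up variable names; base
is a void __iomem pointer, and phys/size describe the device region):

	/* before this patch */
	base = ioremap(phys, size);
	ret  = create_hyp_io_mappings(base, base + size, phys);

	/* after this patch */
	ret  = create_hyp_io_mappings(phys, size, &base);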

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h   |  3 ++-
 arch/arm64/include/asm/kvm_mmu.h |  3 ++-
 virt/kvm/arm/mmu.c               | 24 ++++++++++++++----------
 virt/kvm/arm/vgic/vgic-v2.c      | 31 ++++++++-----------------------
 4 files changed, 26 insertions(+), 35 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index fa6f2174276b..cb3bef71ec9b 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -41,7 +41,8 @@
 #include <asm/stage2_pgtable.h>
 
 int create_hyp_mappings(void *from, void *to, pgprot_t prot);
-int create_hyp_io_mappings(void *from, void *to, phys_addr_t);
+int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
+			   void __iomem **kaddr);
 void free_hyp_pgds(void);
 
 void stage2_unmap_vm(struct kvm *kvm);
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index b0c3cbe9b513..09a208014457 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -119,7 +119,8 @@ static inline unsigned long __kern_hyp_va(unsigned long v)
 #include <asm/stage2_pgtable.h>
 
 int create_hyp_mappings(void *from, void *to, pgprot_t prot);
-int create_hyp_io_mappings(void *from, void *to, phys_addr_t);
+int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
+			   void __iomem **kaddr);
 void free_hyp_pgds(void);
 
 void stage2_unmap_vm(struct kvm *kvm);
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 84d09f1a44d4..38adbe0a016c 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -709,26 +709,30 @@ int create_hyp_mappings(void *from, void *to, pgprot_t prot)
 }
 
 /**
- * create_hyp_io_mappings - duplicate a kernel IO mapping into Hyp mode
- * @from:	The kernel start VA of the range
- * @to:		The kernel end VA of the range (exclusive)
+ * create_hyp_io_mappings - Map IO into both kernel and HYP
  * @phys_addr:	The physical start address which gets mapped
+ * @size:	Size of the region being mapped
+ * @kaddr:	Kernel VA for this mapping
  *
  * The resulting HYP VA is the same as the kernel VA, modulo
  * HYP_PAGE_OFFSET.
  */
-int create_hyp_io_mappings(void *from, void *to, phys_addr_t phys_addr)
+int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
+			   void __iomem **kaddr)
 {
-	unsigned long start = kern_hyp_va((unsigned long)from);
-	unsigned long end = kern_hyp_va((unsigned long)to);
+	unsigned long start, end;
 
-	if (is_kernel_in_hyp_mode())
+	*kaddr = ioremap(phys_addr, size);
+	if (!*kaddr)
+		return -ENOMEM;
+
+	if (is_kernel_in_hyp_mode()) {
 		return 0;
+	}
 
-	/* Check for a valid kernel IO mapping */
-	if (!is_vmalloc_addr(from) || !is_vmalloc_addr(to - 1))
-		return -EINVAL;
 
+	start = kern_hyp_va((unsigned long)*kaddr);
+	end = kern_hyp_va((unsigned long)*kaddr + size);
 	return __create_hyp_mappings(hyp_pgd, start, end,
 				     __phys_to_pfn(phys_addr), PAGE_HYP_DEVICE);
 }
diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
index 80897102da26..bc49d702f9f0 100644
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -332,16 +332,10 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
 	if (!PAGE_ALIGNED(info->vcpu.start) ||
 	    !PAGE_ALIGNED(resource_size(&info->vcpu))) {
 		kvm_info("GICV region size/alignment is unsafe, using trapping (reduced performance)\n");
-		kvm_vgic_global_state.vcpu_base_va = ioremap(info->vcpu.start,
-							     resource_size(&info->vcpu));
-		if (!kvm_vgic_global_state.vcpu_base_va) {
-			kvm_err("Cannot ioremap GICV\n");
-			return -ENOMEM;
-		}
 
-		ret = create_hyp_io_mappings(kvm_vgic_global_state.vcpu_base_va,
-					     kvm_vgic_global_state.vcpu_base_va + resource_size(&info->vcpu),
-					     info->vcpu.start);
+		ret = create_hyp_io_mappings(info->vcpu.start,
+					     resource_size(&info->vcpu),
+					     &kvm_vgic_global_state.vcpu_base_va);
 		if (ret) {
 			kvm_err("Cannot map GICV into hyp\n");
 			goto out;
@@ -350,26 +344,17 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
 		static_branch_enable(&vgic_v2_cpuif_trap);
 	}
 
-	kvm_vgic_global_state.vctrl_base = ioremap(info->vctrl.start,
-						   resource_size(&info->vctrl));
-	if (!kvm_vgic_global_state.vctrl_base) {
-		kvm_err("Cannot ioremap GICH\n");
-		ret = -ENOMEM;
+	ret = create_hyp_io_mappings(info->vctrl.start,
+				     resource_size(&info->vctrl),
+				     &kvm_vgic_global_state.vctrl_base);
+	if (ret) {
+		kvm_err("Cannot map VCTRL into hyp\n");
 		goto out;
 	}
 
 	vtr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_VTR);
 	kvm_vgic_global_state.nr_lr = (vtr & 0x3f) + 1;
 
-	ret = create_hyp_io_mappings(kvm_vgic_global_state.vctrl_base,
-				     kvm_vgic_global_state.vctrl_base +
-					 resource_size(&info->vctrl),
-				     info->vctrl.start);
-	if (ret) {
-		kvm_err("Cannot map VCTRL into hyp\n");
-		goto out;
-	}
-
 	ret = kvm_register_vgic_device(KVM_DEV_TYPE_ARM_VGIC_V2);
 	if (ret) {
 		kvm_err("Cannot register GICv2 KVM device\n");
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH v4 13/19] KVM: arm/arm64: Keep GICv2 HYP VAs in kvm_vgic_global_state
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm; +Cc: Catalin Marinas, Will Deacon

As we're about to change the way we map devices at HYP, we need
to move away from kern_hyp_va on an IO address.

One way of achieving this is to store the VAs in kvm_vgic_global_state,
and use that directly from the HYP code. This requires a small change
to create_hyp_io_mappings so that it can also return a HYP VA.

We take this opportunity to nuke the vctrl_base field in the emulated
distributor, as it is not used anymore.
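
Sketched with placeholder names (phys and size are made up; the real
callers are in the vgic-v2 hunks below), the EL1 side now receives both
addresses at map time, and the EL2 side picks the HYP one up through
hyp_symbol_addr() instead of translating a kernel pointer:

	/* EL1, at probe time */
	ret = create_hyp_io_mappings(phys, size,
				     &kvm_vgic_global_state.vctrl_base,	/* kernel VA */
				     &kvm_vgic_global_state.vctrl_hyp);	/* HYP VA    */

	/* EL2, from __hyp_text code */
	void __iomem *base = hyp_symbol_addr(kvm_vgic_global_state)->vctrl_hyp;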

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h   |  3 ++-
 arch/arm64/include/asm/kvm_mmu.h |  3 ++-
 include/kvm/arm_vgic.h           | 12 ++++++------
 virt/kvm/arm/hyp/vgic-v2-sr.c    | 10 +++-------
 virt/kvm/arm/mmu.c               | 20 ++++++++++++++++----
 virt/kvm/arm/vgic/vgic-init.c    |  6 ------
 virt/kvm/arm/vgic/vgic-v2.c      | 13 +++++++------
 7 files changed, 36 insertions(+), 31 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index cb3bef71ec9b..feff24b34506 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -42,7 +42,8 @@
 
 int create_hyp_mappings(void *from, void *to, pgprot_t prot);
 int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
-			   void __iomem **kaddr);
+			   void __iomem **kaddr,
+			   void __iomem **haddr);
 void free_hyp_pgds(void);
 
 void stage2_unmap_vm(struct kvm *kvm);
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 09a208014457..cc882e890bb1 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -120,7 +120,8 @@ static inline unsigned long __kern_hyp_va(unsigned long v)
 
 int create_hyp_mappings(void *from, void *to, pgprot_t prot);
 int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
-			   void __iomem **kaddr);
+			   void __iomem **kaddr,
+			   void __iomem **haddr);
 void free_hyp_pgds(void);
 
 void stage2_unmap_vm(struct kvm *kvm);
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 8c896540a72c..8b3fbc03293b 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -57,11 +57,15 @@ struct vgic_global {
 	/* Physical address of vgic virtual cpu interface */
 	phys_addr_t		vcpu_base;
 
-	/* GICV mapping */
+	/* GICV mapping, kernel VA */
 	void __iomem		*vcpu_base_va;
+	/* GICV mapping, HYP VA */
+	void __iomem		*vcpu_hyp_va;
 
-	/* virtual control interface mapping */
+	/* virtual control interface mapping, kernel VA */
 	void __iomem		*vctrl_base;
+	/* virtual control interface mapping, HYP VA */
+	void __iomem		*vctrl_hyp;
 
 	/* Number of implemented list registers */
 	int			nr_lr;
@@ -198,10 +202,6 @@ struct vgic_dist {
 
 	int			nr_spis;
 
-	/* TODO: Consider moving to global state */
-	/* Virtual control interface mapping */
-	void __iomem		*vctrl_base;
-
 	/* base addresses in guest physical address space: */
 	gpa_t			vgic_dist_base;		/* distributor */
 	union {
diff --git a/virt/kvm/arm/hyp/vgic-v2-sr.c b/virt/kvm/arm/hyp/vgic-v2-sr.c
index 4573d0552af3..dbd109f3a4ab 100644
--- a/virt/kvm/arm/hyp/vgic-v2-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v2-sr.c
@@ -56,10 +56,8 @@ static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
 /* vcpu is already in the HYP VA space */
 void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
 {
-	struct kvm *kvm = kern_hyp_va(vcpu->kvm);
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-	struct vgic_dist *vgic = &kvm->arch.vgic;
-	void __iomem *base = kern_hyp_va(vgic->vctrl_base);
+	void __iomem *base = hyp_symbol_addr(kvm_vgic_global_state)->vctrl_hyp;
 	u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
 
 	if (!base)
@@ -81,10 +79,8 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
 /* vcpu is already in the HYP VA space */
 void __hyp_text __vgic_v2_restore_state(struct kvm_vcpu *vcpu)
 {
-	struct kvm *kvm = kern_hyp_va(vcpu->kvm);
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-	struct vgic_dist *vgic = &kvm->arch.vgic;
-	void __iomem *base = kern_hyp_va(vgic->vctrl_base);
+	void __iomem *base = hyp_symbol_addr(kvm_vgic_global_state)->vctrl_hyp;
 	int i;
 	u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
 
@@ -139,7 +135,7 @@ int __hyp_text __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu)
 		return -1;
 
 	rd = kvm_vcpu_dabt_get_rd(vcpu);
-	addr  = kern_hyp_va(hyp_symbol_addr(kvm_vgic_global_state)->vcpu_base_va);
+	addr  = hyp_symbol_addr(kvm_vgic_global_state)->vcpu_hyp_va;
 	addr += fault_ipa - vgic->vgic_cpu_base;
 
 	if (kvm_vcpu_dabt_iswrite(vcpu)) {
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 38adbe0a016c..6192d45d1e1a 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -713,28 +713,40 @@ int create_hyp_mappings(void *from, void *to, pgprot_t prot)
  * @phys_addr:	The physical start address which gets mapped
  * @size:	Size of the region being mapped
  * @kaddr:	Kernel VA for this mapping
+ * @haddr:	HYP VA for this mapping
  *
- * The resulting HYP VA is the same as the kernel VA, modulo
- * HYP_PAGE_OFFSET.
+ * The resulting HYP VA is completely unrelated to the kernel VA.
  */
 int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
-			   void __iomem **kaddr)
+			   void __iomem **kaddr,
+			   void __iomem **haddr)
 {
 	unsigned long start, end;
+	int ret;
 
 	*kaddr = ioremap(phys_addr, size);
 	if (!*kaddr)
 		return -ENOMEM;
 
 	if (is_kernel_in_hyp_mode()) {
+		*haddr = *kaddr;
 		return 0;
 	}
 
 
 	start = kern_hyp_va((unsigned long)*kaddr);
 	end = kern_hyp_va((unsigned long)*kaddr + size);
-	return __create_hyp_mappings(hyp_pgd, start, end,
+	ret = __create_hyp_mappings(hyp_pgd, start, end,
 				     __phys_to_pfn(phys_addr), PAGE_HYP_DEVICE);
+
+	if (ret) {
+		iounmap(*kaddr);
+		*kaddr = NULL;
+	} else {
+		*haddr = (void __iomem *)start;
+	}
+
+	return ret;
 }
 
 /**
diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 62310122ee78..3f01b5975055 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -166,12 +166,6 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
 	kvm->arch.vgic.in_kernel = true;
 	kvm->arch.vgic.vgic_model = type;
 
-	/*
-	 * kvm_vgic_global_state.vctrl_base is set on vgic probe (kvm_arch_init)
-	 * it is stored in distributor struct for asm save/restore purpose
-	 */
-	kvm->arch.vgic.vctrl_base = kvm_vgic_global_state.vctrl_base;
-
 	kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
 	kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
 	kvm->arch.vgic.vgic_redist_base = VGIC_ADDR_UNDEF;
diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
index bc49d702f9f0..f0f566e4494e 100644
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -335,7 +335,8 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
 
 		ret = create_hyp_io_mappings(info->vcpu.start,
 					     resource_size(&info->vcpu),
-					     &kvm_vgic_global_state.vcpu_base_va);
+					     &kvm_vgic_global_state.vcpu_base_va,
+					     &kvm_vgic_global_state.vcpu_hyp_va);
 		if (ret) {
 			kvm_err("Cannot map GICV into hyp\n");
 			goto out;
@@ -346,7 +347,8 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
 
 	ret = create_hyp_io_mappings(info->vctrl.start,
 				     resource_size(&info->vctrl),
-				     &kvm_vgic_global_state.vctrl_base);
+				     &kvm_vgic_global_state.vctrl_base,
+				     &kvm_vgic_global_state.vctrl_hyp);
 	if (ret) {
 		kvm_err("Cannot map VCTRL into hyp\n");
 		goto out;
@@ -381,15 +383,14 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
 void vgic_v2_load(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-	struct vgic_dist *vgic = &vcpu->kvm->arch.vgic;
 
-	writel_relaxed(cpu_if->vgic_vmcr, vgic->vctrl_base + GICH_VMCR);
+	writel_relaxed(cpu_if->vgic_vmcr,
+		       kvm_vgic_global_state.vctrl_base + GICH_VMCR);
 }
 
 void vgic_v2_put(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-	struct vgic_dist *vgic = &vcpu->kvm->arch.vgic;
 
-	cpu_if->vgic_vmcr = readl_relaxed(vgic->vctrl_base + GICH_VMCR);
+	cpu_if->vgic_vmcr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_VMCR);
 }
-- 
2.14.2

^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH v4 13/19] KVM: arm/arm64: Keep GICv2 HYP VAs in kvm_vgic_global_state
@ 2018-01-04 18:43   ` Marc Zyngier
  0 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel

As we're about to change the way we map devices at HYP, we need
to move away from kern_hyp_va on an IO address.

One way of achieving this is to store the VAs in kvm_vgic_global_state,
and use that directly from the HYP code. This requires a small change
to create_hyp_io_mappings so that it can also return a HYP VA.

We take this opportunity to nuke the vctrl_base field in the emulated
distributor, as it is not used anymore.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h   |  3 ++-
 arch/arm64/include/asm/kvm_mmu.h |  3 ++-
 include/kvm/arm_vgic.h           | 12 ++++++------
 virt/kvm/arm/hyp/vgic-v2-sr.c    | 10 +++-------
 virt/kvm/arm/mmu.c               | 20 ++++++++++++++++----
 virt/kvm/arm/vgic/vgic-init.c    |  6 ------
 virt/kvm/arm/vgic/vgic-v2.c      | 13 +++++++------
 7 files changed, 36 insertions(+), 31 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index cb3bef71ec9b..feff24b34506 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -42,7 +42,8 @@
 
 int create_hyp_mappings(void *from, void *to, pgprot_t prot);
 int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
-			   void __iomem **kaddr);
+			   void __iomem **kaddr,
+			   void __iomem **haddr);
 void free_hyp_pgds(void);
 
 void stage2_unmap_vm(struct kvm *kvm);
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 09a208014457..cc882e890bb1 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -120,7 +120,8 @@ static inline unsigned long __kern_hyp_va(unsigned long v)
 
 int create_hyp_mappings(void *from, void *to, pgprot_t prot);
 int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
-			   void __iomem **kaddr);
+			   void __iomem **kaddr,
+			   void __iomem **haddr);
 void free_hyp_pgds(void);
 
 void stage2_unmap_vm(struct kvm *kvm);
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 8c896540a72c..8b3fbc03293b 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -57,11 +57,15 @@ struct vgic_global {
 	/* Physical address of vgic virtual cpu interface */
 	phys_addr_t		vcpu_base;
 
-	/* GICV mapping */
+	/* GICV mapping, kernel VA */
 	void __iomem		*vcpu_base_va;
+	/* GICV mapping, HYP VA */
+	void __iomem		*vcpu_hyp_va;
 
-	/* virtual control interface mapping */
+	/* virtual control interface mapping, kernel VA */
 	void __iomem		*vctrl_base;
+	/* virtual control interface mapping, HYP VA */
+	void __iomem		*vctrl_hyp;
 
 	/* Number of implemented list registers */
 	int			nr_lr;
@@ -198,10 +202,6 @@ struct vgic_dist {
 
 	int			nr_spis;
 
-	/* TODO: Consider moving to global state */
-	/* Virtual control interface mapping */
-	void __iomem		*vctrl_base;
-
 	/* base addresses in guest physical address space: */
 	gpa_t			vgic_dist_base;		/* distributor */
 	union {
diff --git a/virt/kvm/arm/hyp/vgic-v2-sr.c b/virt/kvm/arm/hyp/vgic-v2-sr.c
index 4573d0552af3..dbd109f3a4ab 100644
--- a/virt/kvm/arm/hyp/vgic-v2-sr.c
+++ b/virt/kvm/arm/hyp/vgic-v2-sr.c
@@ -56,10 +56,8 @@ static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
 /* vcpu is already in the HYP VA space */
 void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
 {
-	struct kvm *kvm = kern_hyp_va(vcpu->kvm);
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-	struct vgic_dist *vgic = &kvm->arch.vgic;
-	void __iomem *base = kern_hyp_va(vgic->vctrl_base);
+	void __iomem *base = hyp_symbol_addr(kvm_vgic_global_state)->vctrl_hyp;
 	u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
 
 	if (!base)
@@ -81,10 +79,8 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
 /* vcpu is already in the HYP VA space */
 void __hyp_text __vgic_v2_restore_state(struct kvm_vcpu *vcpu)
 {
-	struct kvm *kvm = kern_hyp_va(vcpu->kvm);
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-	struct vgic_dist *vgic = &kvm->arch.vgic;
-	void __iomem *base = kern_hyp_va(vgic->vctrl_base);
+	void __iomem *base = hyp_symbol_addr(kvm_vgic_global_state)->vctrl_hyp;
 	int i;
 	u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
 
@@ -139,7 +135,7 @@ int __hyp_text __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu)
 		return -1;
 
 	rd = kvm_vcpu_dabt_get_rd(vcpu);
-	addr  = kern_hyp_va(hyp_symbol_addr(kvm_vgic_global_state)->vcpu_base_va);
+	addr  = hyp_symbol_addr(kvm_vgic_global_state)->vcpu_hyp_va;
 	addr += fault_ipa - vgic->vgic_cpu_base;
 
 	if (kvm_vcpu_dabt_iswrite(vcpu)) {
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 38adbe0a016c..6192d45d1e1a 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -713,28 +713,40 @@ int create_hyp_mappings(void *from, void *to, pgprot_t prot)
  * @phys_addr:	The physical start address which gets mapped
  * @size:	Size of the region being mapped
  * @kaddr:	Kernel VA for this mapping
+ * @haddr:	HYP VA for this mapping
  *
- * The resulting HYP VA is the same as the kernel VA, modulo
- * HYP_PAGE_OFFSET.
+ * The resulting HYP VA is completely unrelated to the kernel VA.
  */
 int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
-			   void __iomem **kaddr)
+			   void __iomem **kaddr,
+			   void __iomem **haddr)
 {
 	unsigned long start, end;
+	int ret;
 
 	*kaddr = ioremap(phys_addr, size);
 	if (!*kaddr)
 		return -ENOMEM;
 
 	if (is_kernel_in_hyp_mode()) {
+		*haddr = *kaddr;
 		return 0;
 	}
 
 
 	start = kern_hyp_va((unsigned long)*kaddr);
 	end = kern_hyp_va((unsigned long)*kaddr + size);
-	return __create_hyp_mappings(hyp_pgd, start, end,
+	ret = __create_hyp_mappings(hyp_pgd, start, end,
 				     __phys_to_pfn(phys_addr), PAGE_HYP_DEVICE);
+
+	if (ret) {
+		iounmap(*kaddr);
+		*kaddr = NULL;
+	} else {
+		*haddr = (void __iomem *)start;
+	}
+
+	return ret;
 }
 
 /**
diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 62310122ee78..3f01b5975055 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -166,12 +166,6 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
 	kvm->arch.vgic.in_kernel = true;
 	kvm->arch.vgic.vgic_model = type;
 
-	/*
-	 * kvm_vgic_global_state.vctrl_base is set on vgic probe (kvm_arch_init)
-	 * it is stored in distributor struct for asm save/restore purpose
-	 */
-	kvm->arch.vgic.vctrl_base = kvm_vgic_global_state.vctrl_base;
-
 	kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
 	kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
 	kvm->arch.vgic.vgic_redist_base = VGIC_ADDR_UNDEF;
diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
index bc49d702f9f0..f0f566e4494e 100644
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -335,7 +335,8 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
 
 		ret = create_hyp_io_mappings(info->vcpu.start,
 					     resource_size(&info->vcpu),
-					     &kvm_vgic_global_state.vcpu_base_va);
+					     &kvm_vgic_global_state.vcpu_base_va,
+					     &kvm_vgic_global_state.vcpu_hyp_va);
 		if (ret) {
 			kvm_err("Cannot map GICV into hyp\n");
 			goto out;
@@ -346,7 +347,8 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
 
 	ret = create_hyp_io_mappings(info->vctrl.start,
 				     resource_size(&info->vctrl),
-				     &kvm_vgic_global_state.vctrl_base);
+				     &kvm_vgic_global_state.vctrl_base,
+				     &kvm_vgic_global_state.vctrl_hyp);
 	if (ret) {
 		kvm_err("Cannot map VCTRL into hyp\n");
 		goto out;
@@ -381,15 +383,14 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
 void vgic_v2_load(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-	struct vgic_dist *vgic = &vcpu->kvm->arch.vgic;
 
-	writel_relaxed(cpu_if->vgic_vmcr, vgic->vctrl_base + GICH_VMCR);
+	writel_relaxed(cpu_if->vgic_vmcr,
+		       kvm_vgic_global_state.vctrl_base + GICH_VMCR);
 }
 
 void vgic_v2_put(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-	struct vgic_dist *vgic = &vcpu->kvm->arch.vgic;
 
-	cpu_if->vgic_vmcr = readl_relaxed(vgic->vctrl_base + GICH_VMCR);
+	cpu_if->vgic_vmcr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_VMCR);
 }
-- 
2.14.2


* [PATCH v4 14/19] KVM: arm/arm64: Move HYP IO VAs to the "idmap" range
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm; +Cc: Catalin Marinas, Will Deacon

So far we have mapped our HYP IO (which is essentially the GICv2 control
registers) using the same method as for memory. It recently appeared
that this is a bit unsafe:

We compute the HYP VA using the kern_hyp_va helper, but that helper
is only designed to deal with kernel VAs coming from the linear map,
and not from the vmalloc region... This could in turn cause some bad
aliasing between the two, amplified by the upcoming VA randomisation.

A solution is to come up with our very own basic VA allocator for
MMIO. Since half of the HYP address space only contains a single
page (the idmap), we have plenty to borrow from. Let's use the idmap
as a base, and allocate downwards from it. GICv2 now lives on the
other side of the great VA barrier.
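
As a rough illustration of the allocator (mirroring the hunk below,
with locking, overflow checking and error handling omitted):

	/* io_map_base is initialised to hyp_idmap_start */
	base = io_map_base - size;	/* grow downwards from the idmap   */
	base &= ~(size - 1);		/* keep the block naturally aligned */
	/* ... create the HYP mapping at [base, base + size) ... */
	io_map_base = base;		/* next allocation starts below us */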

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/mmu.c | 56 +++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 43 insertions(+), 13 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 6192d45d1e1a..14c5e5534f2f 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -43,6 +43,9 @@ static unsigned long hyp_idmap_start;
 static unsigned long hyp_idmap_end;
 static phys_addr_t hyp_idmap_vector;
 
+static DEFINE_MUTEX(io_map_lock);
+static unsigned long io_map_base;
+
 #define S2_PGD_SIZE	(PTRS_PER_S2_PGD * sizeof(pgd_t))
 #define hyp_pgd_order get_order(PTRS_PER_PGD * sizeof(pgd_t))
 
@@ -502,27 +505,31 @@ static void unmap_hyp_range(pgd_t *pgdp, phys_addr_t start, u64 size)
  *
  * Assumes hyp_pgd is a page table used strictly in Hyp-mode and
  * therefore contains either mappings in the kernel memory area (above
- * PAGE_OFFSET), or device mappings in the vmalloc range (from
- * VMALLOC_START to VMALLOC_END).
+ * PAGE_OFFSET), or device mappings in the idmap range.
  *
- * boot_hyp_pgd should only map two pages for the init code.
+ * boot_hyp_pgd should only map the idmap range, and is only used in
+ * the extended idmap case.
  */
 void free_hyp_pgds(void)
 {
+	pgd_t *id_pgd;
+
 	mutex_lock(&kvm_hyp_pgd_mutex);
 
+	id_pgd = boot_hyp_pgd ? boot_hyp_pgd : hyp_pgd;
+
+	if (id_pgd)
+		unmap_hyp_range(id_pgd, io_map_base,
+				hyp_idmap_start + PAGE_SIZE - io_map_base);
+
 	if (boot_hyp_pgd) {
-		unmap_hyp_range(boot_hyp_pgd, hyp_idmap_start, PAGE_SIZE);
 		free_pages((unsigned long)boot_hyp_pgd, hyp_pgd_order);
 		boot_hyp_pgd = NULL;
 	}
 
 	if (hyp_pgd) {
-		unmap_hyp_range(hyp_pgd, hyp_idmap_start, PAGE_SIZE);
 		unmap_hyp_range(hyp_pgd, kern_hyp_va(PAGE_OFFSET),
 				(uintptr_t)high_memory - PAGE_OFFSET);
-		unmap_hyp_range(hyp_pgd, kern_hyp_va(VMALLOC_START),
-				VMALLOC_END - VMALLOC_START);
 
 		free_pages((unsigned long)hyp_pgd, hyp_pgd_order);
 		hyp_pgd = NULL;
@@ -721,7 +728,8 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
 			   void __iomem **kaddr,
 			   void __iomem **haddr)
 {
-	unsigned long start, end;
+	pgd_t *pgd = hyp_pgd;
+	unsigned long base;
 	int ret;
 
 	*kaddr = ioremap(phys_addr, size);
@@ -733,17 +741,38 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
 		return 0;
 	}
 
+	mutex_lock(&io_map_lock);
+
+	base = io_map_base - size;
+	base &= ~(size - 1);
 
-	start = kern_hyp_va((unsigned long)*kaddr);
-	end = kern_hyp_va((unsigned long)*kaddr + size);
-	ret = __create_hyp_mappings(hyp_pgd, start, end,
+	/*
+	 * Verify that BIT(VA_BITS - 1) hasn't been flipped by
+	 * allocating the new area, as it would indicate we've
+	 * overflowed the idmap/IO address range.
+	 */
+	if ((base ^ io_map_base) & BIT(VA_BITS - 1)) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	if (__kvm_cpu_uses_extended_idmap())
+		pgd = boot_hyp_pgd;
+
+	ret = __create_hyp_mappings(pgd, base, base + size,
 				     __phys_to_pfn(phys_addr), PAGE_HYP_DEVICE);
 
+	if (!ret) {
+		*haddr = (void __iomem *)base;
+		io_map_base = base;
+	}
+
+out:
+	mutex_unlock(&io_map_lock);
+
 	if (ret) {
 		iounmap(*kaddr);
 		*kaddr = NULL;
-	} else {
-		*haddr = (void __iomem *)start;
 	}
 
 	return ret;
@@ -1826,6 +1855,7 @@ int kvm_mmu_init(void)
 			goto out;
 	}
 
+	io_map_base = hyp_idmap_start;
 	return 0;
 out:
 	free_hyp_pgds();
-- 
2.14.2


* [PATCH v4 14/19] KVM: arm/arm64: Move HYP IO VAs to the "idmap" range
@ 2018-01-04 18:43   ` Marc Zyngier
  0 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel

So far we have mapped our HYP IO (which is essentially the GICv2 control
registers) using the same method as for memory. It recently appeared
that this is a bit unsafe:

We compute the HYP VA using the kern_hyp_va helper, but that helper
is only designed to deal with kernel VAs coming from the linear map,
and not from the vmalloc region... This could in turn cause some bad
aliasing between the two, amplified by the upcoming VA randomisation.

A solution is to come up with our very own basic VA allocator for
MMIO. Since half of the HYP address space only contains a single
page (the idmap), we have plenty to borrow from. Let's use the idmap
as a base, and allocate downwards from it. GICv2 now lives on the
other side of the great VA barrier.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/mmu.c | 56 +++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 43 insertions(+), 13 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 6192d45d1e1a..14c5e5534f2f 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -43,6 +43,9 @@ static unsigned long hyp_idmap_start;
 static unsigned long hyp_idmap_end;
 static phys_addr_t hyp_idmap_vector;
 
+static DEFINE_MUTEX(io_map_lock);
+static unsigned long io_map_base;
+
 #define S2_PGD_SIZE	(PTRS_PER_S2_PGD * sizeof(pgd_t))
 #define hyp_pgd_order get_order(PTRS_PER_PGD * sizeof(pgd_t))
 
@@ -502,27 +505,31 @@ static void unmap_hyp_range(pgd_t *pgdp, phys_addr_t start, u64 size)
  *
  * Assumes hyp_pgd is a page table used strictly in Hyp-mode and
  * therefore contains either mappings in the kernel memory area (above
- * PAGE_OFFSET), or device mappings in the vmalloc range (from
- * VMALLOC_START to VMALLOC_END).
+ * PAGE_OFFSET), or device mappings in the idmap range.
  *
- * boot_hyp_pgd should only map two pages for the init code.
+ * boot_hyp_pgd should only map the idmap range, and is only used in
+ * the extended idmap case.
  */
 void free_hyp_pgds(void)
 {
+	pgd_t *id_pgd;
+
 	mutex_lock(&kvm_hyp_pgd_mutex);
 
+	id_pgd = boot_hyp_pgd ? boot_hyp_pgd : hyp_pgd;
+
+	if (id_pgd)
+		unmap_hyp_range(id_pgd, io_map_base,
+				hyp_idmap_start + PAGE_SIZE - io_map_base);
+
 	if (boot_hyp_pgd) {
-		unmap_hyp_range(boot_hyp_pgd, hyp_idmap_start, PAGE_SIZE);
 		free_pages((unsigned long)boot_hyp_pgd, hyp_pgd_order);
 		boot_hyp_pgd = NULL;
 	}
 
 	if (hyp_pgd) {
-		unmap_hyp_range(hyp_pgd, hyp_idmap_start, PAGE_SIZE);
 		unmap_hyp_range(hyp_pgd, kern_hyp_va(PAGE_OFFSET),
 				(uintptr_t)high_memory - PAGE_OFFSET);
-		unmap_hyp_range(hyp_pgd, kern_hyp_va(VMALLOC_START),
-				VMALLOC_END - VMALLOC_START);
 
 		free_pages((unsigned long)hyp_pgd, hyp_pgd_order);
 		hyp_pgd = NULL;
@@ -721,7 +728,8 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
 			   void __iomem **kaddr,
 			   void __iomem **haddr)
 {
-	unsigned long start, end;
+	pgd_t *pgd = hyp_pgd;
+	unsigned long base;
 	int ret;
 
 	*kaddr = ioremap(phys_addr, size);
@@ -733,17 +741,38 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
 		return 0;
 	}
 
+	mutex_lock(&io_map_lock);
+
+	base = io_map_base - size;
+	base &= ~(size - 1);
 
-	start = kern_hyp_va((unsigned long)*kaddr);
-	end = kern_hyp_va((unsigned long)*kaddr + size);
-	ret = __create_hyp_mappings(hyp_pgd, start, end,
+	/*
+	 * Verify that BIT(VA_BITS - 1) hasn't been flipped by
+	 * allocating the new area, as it would indicate we've
+	 * overflowed the idmap/IO address range.
+	 */
+	if ((base ^ io_map_base) & BIT(VA_BITS - 1)) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	if (__kvm_cpu_uses_extended_idmap())
+		pgd = boot_hyp_pgd;
+
+	ret = __create_hyp_mappings(pgd, base, base + size,
 				     __phys_to_pfn(phys_addr), PAGE_HYP_DEVICE);
 
+	if (!ret) {
+		*haddr = (void __iomem *)base;
+		io_map_base = base;
+	}
+
+out:
+	mutex_unlock(&io_map_lock);
+
 	if (ret) {
 		iounmap(*kaddr);
 		*kaddr = NULL;
-	} else {
-		*haddr = (void __iomem *)start;
 	}
 
 	return ret;
@@ -1826,6 +1855,7 @@ int kvm_mmu_init(void)
 			goto out;
 	}
 
+	io_map_base = hyp_idmap_start;
 	return 0;
 out:
 	free_hyp_pgds();
-- 
2.14.2


* [PATCH v4 15/19] arm64: insn: Add encoder for the EXTR instruction
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm; +Cc: Catalin Marinas, Will Deacon

Add an encoder for the EXTR instruction, which also implements the ROR
variant (where Rn == Rm).
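
As a usage sketch (not part of this patch), generating "ror x0, x0, #13"
would look something like:

	/* ROR Xd, Xn, #lsb is encoded as EXTR Xd, Xn, Xn, #lsb */
	insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
				     AARCH64_INSN_REG_0,	/* Rm */
				     AARCH64_INSN_REG_0,	/* Rn */
				     AARCH64_INSN_REG_0,	/* Rd */
				     13);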

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/insn.h |  6 ++++++
 arch/arm64/kernel/insn.c      | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 815b35bc53ed..f62c56b1793f 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -319,6 +319,7 @@ __AARCH64_INSN_FUNCS(and_imm,	0x7F800000, 0x12000000)
 __AARCH64_INSN_FUNCS(orr_imm,	0x7F800000, 0x32000000)
 __AARCH64_INSN_FUNCS(eor_imm,	0x7F800000, 0x52000000)
 __AARCH64_INSN_FUNCS(ands_imm,	0x7F800000, 0x72000000)
+__AARCH64_INSN_FUNCS(extr,	0x7FA00000, 0x13800000)
 __AARCH64_INSN_FUNCS(b,		0xFC000000, 0x14000000)
 __AARCH64_INSN_FUNCS(bl,	0xFC000000, 0x94000000)
 __AARCH64_INSN_FUNCS(cbz,	0x7F000000, 0x34000000)
@@ -433,6 +434,11 @@ u32 aarch64_insn_gen_logical_immediate(enum aarch64_insn_logic_type type,
 				       enum aarch64_insn_register Rn,
 				       enum aarch64_insn_register Rd,
 				       u64 imm);
+u32 aarch64_insn_gen_extr(enum aarch64_insn_variant variant,
+			  enum aarch64_insn_register Rm,
+			  enum aarch64_insn_register Rn,
+			  enum aarch64_insn_register Rd,
+			  u8 lsb);
 u32 aarch64_insn_gen_prefetch(enum aarch64_insn_register base,
 			      enum aarch64_insn_prfm_type type,
 			      enum aarch64_insn_prfm_target target,
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 72cb1721c63f..59669d7d4383 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -1621,3 +1621,35 @@ u32 aarch64_insn_gen_logical_immediate(enum aarch64_insn_logic_type type,
 	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, Rn);
 	return aarch64_encode_immediate(imm, variant, insn);
 }
+
+u32 aarch64_insn_gen_extr(enum aarch64_insn_variant variant,
+			  enum aarch64_insn_register Rm,
+			  enum aarch64_insn_register Rn,
+			  enum aarch64_insn_register Rd,
+			  u8 lsb)
+{
+	u32 insn;
+
+	insn = aarch64_insn_get_extr_value();
+
+	switch (variant) {
+	case AARCH64_INSN_VARIANT_32BIT:
+		if (lsb > 31)
+			return AARCH64_BREAK_FAULT;
+		break;
+	case AARCH64_INSN_VARIANT_64BIT:
+		if (lsb > 63)
+			return AARCH64_BREAK_FAULT;
+		insn |= AARCH64_INSN_SF_BIT;
+		insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_N, insn, 1);
+		break;
+	default:
+		pr_err("%s: unknown variant encoding %d\n", __func__, variant);
+		return AARCH64_BREAK_FAULT;
+	}
+
+	insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_S, insn, lsb);
+	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RD, insn, Rd);
+	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, Rn);
+	return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RM, insn, Rm);
+}
-- 
2.14.2


* [PATCH v4 15/19] arm64: insn: Add encoder for the EXTR instruction
@ 2018-01-04 18:43   ` Marc Zyngier
  0 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel

Add an encoder for the EXTR instruction, which also implements the ROR
variant (where Rn == Rm).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/insn.h |  6 ++++++
 arch/arm64/kernel/insn.c      | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 815b35bc53ed..f62c56b1793f 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -319,6 +319,7 @@ __AARCH64_INSN_FUNCS(and_imm,	0x7F800000, 0x12000000)
 __AARCH64_INSN_FUNCS(orr_imm,	0x7F800000, 0x32000000)
 __AARCH64_INSN_FUNCS(eor_imm,	0x7F800000, 0x52000000)
 __AARCH64_INSN_FUNCS(ands_imm,	0x7F800000, 0x72000000)
+__AARCH64_INSN_FUNCS(extr,	0x7FA00000, 0x13800000)
 __AARCH64_INSN_FUNCS(b,		0xFC000000, 0x14000000)
 __AARCH64_INSN_FUNCS(bl,	0xFC000000, 0x94000000)
 __AARCH64_INSN_FUNCS(cbz,	0x7F000000, 0x34000000)
@@ -433,6 +434,11 @@ u32 aarch64_insn_gen_logical_immediate(enum aarch64_insn_logic_type type,
 				       enum aarch64_insn_register Rn,
 				       enum aarch64_insn_register Rd,
 				       u64 imm);
+u32 aarch64_insn_gen_extr(enum aarch64_insn_variant variant,
+			  enum aarch64_insn_register Rm,
+			  enum aarch64_insn_register Rn,
+			  enum aarch64_insn_register Rd,
+			  u8 lsb);
 u32 aarch64_insn_gen_prefetch(enum aarch64_insn_register base,
 			      enum aarch64_insn_prfm_type type,
 			      enum aarch64_insn_prfm_target target,
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 72cb1721c63f..59669d7d4383 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -1621,3 +1621,35 @@ u32 aarch64_insn_gen_logical_immediate(enum aarch64_insn_logic_type type,
 	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, Rn);
 	return aarch64_encode_immediate(imm, variant, insn);
 }
+
+u32 aarch64_insn_gen_extr(enum aarch64_insn_variant variant,
+			  enum aarch64_insn_register Rm,
+			  enum aarch64_insn_register Rn,
+			  enum aarch64_insn_register Rd,
+			  u8 lsb)
+{
+	u32 insn;
+
+	insn = aarch64_insn_get_extr_value();
+
+	switch (variant) {
+	case AARCH64_INSN_VARIANT_32BIT:
+		if (lsb > 31)
+			return AARCH64_BREAK_FAULT;
+		break;
+	case AARCH64_INSN_VARIANT_64BIT:
+		if (lsb > 63)
+			return AARCH64_BREAK_FAULT;
+		insn |= AARCH64_INSN_SF_BIT;
+		insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_N, insn, 1);
+		break;
+	default:
+		pr_err("%s: unknown variant encoding %d\n", __func__, variant);
+		return AARCH64_BREAK_FAULT;
+	}
+
+	insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_S, insn, lsb);
+	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RD, insn, Rd);
+	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, Rn);
+	return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RM, insn, Rm);
+}
-- 
2.14.2


* [PATCH v4 16/19] arm64: insn: Allow ADD/SUB (immediate) with LSL #12
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: Christoffer Dall, Mark Rutland, Catalin Marinas, Will Deacon,
	James Morse, Steve Capper, Peter Maydell

The encoder for ADD/SUB (immediate) can only cope with 12bit
immediates, while there is an encoding for a 12bit immediate shifted
by 12 bits to the left.

Let's fix this small oversight by allowing the LSL_12 bit to be set.
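
As an illustrative example (not part of this patch), an immediate with
only its top 12 bits set can now be encoded, while a value straddling
both halves is still rejected:

	/* Encodes "add x0, x1, #0x123, lsl #12" (imm = 0x123000) */
	insn = aarch64_insn_gen_add_sub_imm(AARCH64_INSN_REG_0,
					    AARCH64_INSN_REG_1,
					    0x123000,
					    AARCH64_INSN_VARIANT_64BIT,
					    AARCH64_INSN_ADSB_ADD);
	/* 0x123456 would still return AARCH64_BREAK_FAULT */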

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kernel/insn.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 59669d7d4383..20655537cdd1 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -35,6 +35,7 @@
 
 #define AARCH64_INSN_SF_BIT	BIT(31)
 #define AARCH64_INSN_N_BIT	BIT(22)
+#define AARCH64_INSN_LSL_12	BIT(22)
 
 static int aarch64_insn_encoding_class[] = {
 	AARCH64_INSN_CLS_UNKNOWN,
@@ -903,9 +904,18 @@ u32 aarch64_insn_gen_add_sub_imm(enum aarch64_insn_register dst,
 		return AARCH64_BREAK_FAULT;
 	}
 
+	/* We can't encode more than a 24bit value (12bit + 12bit shift) */
+	if (imm & ~(BIT(24) - 1))
+		goto out;
+
+	/* If we have something in the top 12 bits... */
 	if (imm & ~(SZ_4K - 1)) {
-		pr_err("%s: invalid immediate encoding %d\n", __func__, imm);
-		return AARCH64_BREAK_FAULT;
+		/* ... and in the low 12 bits -> error */
+		if (imm & (SZ_4K - 1))
+			goto out;
+
+		imm >>= 12;
+		insn |= AARCH64_INSN_LSL_12;
 	}
 
 	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RD, insn, dst);
@@ -913,6 +923,10 @@ u32 aarch64_insn_gen_add_sub_imm(enum aarch64_insn_register dst,
 	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, src);
 
 	return aarch64_insn_encode_immediate(AARCH64_INSN_IMM_12, insn, imm);
+
+out:
+	pr_err("%s: invalid immediate encoding %d\n", __func__, imm);
+	return AARCH64_BREAK_FAULT;
 }
 
 u32 aarch64_insn_gen_bitfield(enum aarch64_insn_register dst,
-- 
2.14.2


* [PATCH v4 16/19] arm64: insn: Allow ADD/SUB (immediate) with LSL #12
@ 2018-01-04 18:43   ` Marc Zyngier
  0 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel

The encoder for ADD/SUB (immediate) can only cope with 12bit
immediates, while there is an encoding for a 12bit immediate shifted
by 12 bits to the left.

Let's fix this small oversight by allowing the LSL_12 bit to be set.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kernel/insn.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 59669d7d4383..20655537cdd1 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -35,6 +35,7 @@
 
 #define AARCH64_INSN_SF_BIT	BIT(31)
 #define AARCH64_INSN_N_BIT	BIT(22)
+#define AARCH64_INSN_LSL_12	BIT(22)
 
 static int aarch64_insn_encoding_class[] = {
 	AARCH64_INSN_CLS_UNKNOWN,
@@ -903,9 +904,18 @@ u32 aarch64_insn_gen_add_sub_imm(enum aarch64_insn_register dst,
 		return AARCH64_BREAK_FAULT;
 	}
 
+	/* We can't encode more than a 24bit value (12bit + 12bit shift) */
+	if (imm & ~(BIT(24) - 1))
+		goto out;
+
+	/* If we have something in the top 12 bits... */
 	if (imm & ~(SZ_4K - 1)) {
-		pr_err("%s: invalid immediate encoding %d\n", __func__, imm);
-		return AARCH64_BREAK_FAULT;
+		/* ... and in the low 12 bits -> error */
+		if (imm & (SZ_4K - 1))
+			goto out;
+
+		imm >>= 12;
+		insn |= AARCH64_INSN_LSL_12;
 	}
 
 	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RD, insn, dst);
@@ -913,6 +923,10 @@ u32 aarch64_insn_gen_add_sub_imm(enum aarch64_insn_register dst,
 	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, src);
 
 	return aarch64_insn_encode_immediate(AARCH64_INSN_IMM_12, insn, imm);
+
+out:
+	pr_err("%s: invalid immediate encoding %d\n", __func__, imm);
+	return AARCH64_BREAK_FAULT;
 }
 
 u32 aarch64_insn_gen_bitfield(enum aarch64_insn_register dst,
-- 
2.14.2


* [PATCH v4 17/19] arm64: KVM: Dynamically compute the HYP VA mask
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm; +Cc: Catalin Marinas, Will Deacon

As we're moving towards a much more dynamic way to compute our
HYP VA, let's express the mask in a slightly different way.

Instead of comparing the idmap position to the "low" VA mask,
we directly compute the mask by taking into account the idmap's
(VA_BITS - 1) bit.

No functional change.
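
As a worked example (assuming VA_BITS == 48, purely for illustration),
with the idmap sitting in the lower half of the HYP address space:

	region  = idmap_addr & BIT(47);	/* 0: the idmap is in the low half    */
	region ^= BIT(47);		/* so kernel VAs take the high half   */
	va_mask  = BIT(47) - 1;		/* keep the low 47 bits of the VA     */
	va_mask |= region;		/* and set bit 47: 0x0000ffffffffffff */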

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/va_layout.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
index aee758574e61..75bb1c6772b0 100644
--- a/arch/arm64/kvm/va_layout.c
+++ b/arch/arm64/kvm/va_layout.c
@@ -21,24 +21,19 @@
 #include <asm/insn.h>
 #include <asm/kvm_mmu.h>
 
-#define HYP_PAGE_OFFSET_HIGH_MASK	((UL(1) << VA_BITS) - 1)
-#define HYP_PAGE_OFFSET_LOW_MASK	((UL(1) << (VA_BITS - 1)) - 1)
-
 static u64 va_mask;
 
 static void compute_layout(void)
 {
 	phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
-	unsigned long mask = HYP_PAGE_OFFSET_HIGH_MASK;
+	u64 region;
 
-	/*
-	 * Activate the lower HYP offset only if the idmap doesn't
-	 * clash with it,
-	 */
-	if (idmap_addr > HYP_PAGE_OFFSET_LOW_MASK)
-		mask = HYP_PAGE_OFFSET_HIGH_MASK;
+	/* Where is my RAM region? */
+	region  = idmap_addr & BIT(VA_BITS - 1);
+	region ^= BIT(VA_BITS - 1);
 
-	va_mask = mask;
+	va_mask  = BIT(VA_BITS - 1) - 1;
+	va_mask |= region;
 }
 
 static u32 compute_instruction(int n, u32 rd, u32 rn)
-- 
2.14.2


* [PATCH v4 17/19] arm64: KVM: Dynamically compute the HYP VA mask
@ 2018-01-04 18:43   ` Marc Zyngier
  0 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel

As we're moving towards a much more dynamic way to compute our
HYP VA, let's express the mask in a slightly different way.

Instead of comparing the idmap position to the "low" VA mask,
we directly compute the mask by taking into account the idmap's
(VA_BITS - 1) bit.

No functional change.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/kvm/va_layout.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
index aee758574e61..75bb1c6772b0 100644
--- a/arch/arm64/kvm/va_layout.c
+++ b/arch/arm64/kvm/va_layout.c
@@ -21,24 +21,19 @@
 #include <asm/insn.h>
 #include <asm/kvm_mmu.h>
 
-#define HYP_PAGE_OFFSET_HIGH_MASK	((UL(1) << VA_BITS) - 1)
-#define HYP_PAGE_OFFSET_LOW_MASK	((UL(1) << (VA_BITS - 1)) - 1)
-
 static u64 va_mask;
 
 static void compute_layout(void)
 {
 	phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
-	unsigned long mask = HYP_PAGE_OFFSET_HIGH_MASK;
+	u64 region;
 
-	/*
-	 * Activate the lower HYP offset only if the idmap doesn't
-	 * clash with it,
-	 */
-	if (idmap_addr > HYP_PAGE_OFFSET_LOW_MASK)
-		mask = HYP_PAGE_OFFSET_HIGH_MASK;
+	/* Where is my RAM region? */
+	region  = idmap_addr & BIT(VA_BITS - 1);
+	region ^= BIT(VA_BITS - 1);
 
-	va_mask = mask;
+	va_mask  = BIT(VA_BITS - 1) - 1;
+	va_mask |= region;
 }
 
 static u32 compute_instruction(int n, u32 rd, u32 rn)
-- 
2.14.2


* [PATCH v4 18/19] arm64: KVM: Introduce EL2 VA randomisation
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: Christoffer Dall, Mark Rutland, Catalin Marinas, Will Deacon,
	James Morse, Steve Capper, Peter Maydell

The main idea behind randomising the EL2 VA is that we usually have
a few spare bits between the most significant bit of the VA mask
and the most significant bit of the linear mapping.

Those bits could be a bunch of zeroes, and could be useful
to move things around a bit. Of course, the more memory you have,
the less randomisation you get...

Alternatively, these bits could be the result of KASLR, in which
case they are already random. But it would be nice to have a
*different* randomisation, just to make the job of a potential
attacker a bit more difficult.

Inserting these random bits is a bit involved. We don't have a spare
register (short of rewriting all the kern_hyp_va call sites), and
the immediate we want to insert is too random to be used with the
ORR instruction. The best option I could come up with is the following
sequence:

	and x0, x0, #va_mask
	ror x0, x0, #first_random_bit
	add x0, x0, #(random & 0xfff)
	add x0, x0, #(random >> 12), lsl #12
	ror x0, x0, #(63 - first_random_bit)

making it a fairly long sequence, but one that a decent CPU should
be able to execute without breaking a sweat. It is of course NOPed
out on VHE. The last 4 instructions can also be turned into NOPs
if it appears that there are no free bits to use.
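
In rough C terms (a sketch for the randomised case only, reusing the
va_mask/tag_lsb/tag_val names introduced below and the kernel's ror64()
helper), the sequence amounts to:

	v  = va & va_mask;		/* and x0, x0, #va_mask             */
	v  = ror64(v, tag_lsb);		/* move the low bits out of the way */
	v += tag_val;			/* the two adds insert the tag      */
	v  = ror64(v, 64 - tag_lsb);	/* rotate everything back in place  */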

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/kvm_mmu.h | 10 +++++-
 arch/arm64/kvm/va_layout.c       | 68 +++++++++++++++++++++++++++++++++++++---
 virt/kvm/arm/mmu.c               |  2 +-
 3 files changed, 73 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index cc882e890bb1..4fca6ddadccc 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -85,6 +85,10 @@
 .macro kern_hyp_va	reg
 alternative_cb kvm_update_va_mask
 	and     \reg, \reg, #1
+	ror	\reg, \reg, #1
+	add	\reg, \reg, #0
+	add	\reg, \reg, #0
+	ror	\reg, \reg, #63
 alternative_cb_end
 .endm
 
@@ -101,7 +105,11 @@ void kvm_update_va_mask(struct alt_instr *alt,
 
 static inline unsigned long __kern_hyp_va(unsigned long v)
 {
-	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n",
+	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n"
+				    "ror %0, %0, #1\n"
+				    "add %0, %0, #0\n"
+				    "add %0, %0, #0\n"
+				    "ror %0, %0, #63\n",
 				    kvm_update_va_mask)
 		     : "+r" (v));
 	return v;
diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
index 75bb1c6772b0..bf0d6bdf5f14 100644
--- a/arch/arm64/kvm/va_layout.c
+++ b/arch/arm64/kvm/va_layout.c
@@ -16,11 +16,15 @@
  */
 
 #include <linux/kvm_host.h>
+#include <linux/random.h>
+#include <linux/memblock.h>
 #include <asm/alternative.h>
 #include <asm/debug-monitors.h>
 #include <asm/insn.h>
 #include <asm/kvm_mmu.h>
 
+static u8 tag_lsb;
+static u64 tag_val;
 static u64 va_mask;
 
 static void compute_layout(void)
@@ -32,8 +36,31 @@ static void compute_layout(void)
 	region  = idmap_addr & BIT(VA_BITS - 1);
 	region ^= BIT(VA_BITS - 1);
 
-	va_mask  = BIT(VA_BITS - 1) - 1;
-	va_mask |= region;
+	tag_lsb = fls64((u64)phys_to_virt(memblock_start_of_DRAM()) ^
+			(u64)(high_memory - 1));
+
+	if (tag_lsb == (VA_BITS - 1)) {
+		/*
+		 * No space in the address, let's compute the mask so
+		 * that it covers (VA_BITS - 1) bits, and the region
+		 * bit. The tag is set to zero.
+		 */
+		tag_lsb = tag_val = 0;
+		va_mask  = BIT(VA_BITS - 1) - 1;
+		va_mask |= region;
+	} else {
+		/*
+		 * We do have some free bits. Let's have the mask to
+		 * cover the low bits of the VA, and the tag to
+		 * contain the random stuff plus the region bit.
+		 */
+		u64 mask = GENMASK_ULL(VA_BITS - 2, tag_lsb);
+
+		va_mask = BIT(tag_lsb) - 1;
+		tag_val  = get_random_long() & mask;
+		tag_val |= region;
+		tag_val >>= tag_lsb;
+	}
 }
 
 static u32 compute_instruction(int n, u32 rd, u32 rn)
@@ -46,6 +73,33 @@ static u32 compute_instruction(int n, u32 rd, u32 rn)
 							  AARCH64_INSN_VARIANT_64BIT,
 							  rn, rd, va_mask);
 		break;
+
+	case 1:
+		/* ROR is a variant of EXTR with Rm = Rn */
+		insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
+					     rn, rn, rd,
+					     tag_lsb);
+		break;
+
+	case 2:
+		insn = aarch64_insn_gen_add_sub_imm(rd, rn,
+						    tag_val & (SZ_4K - 1),
+						    AARCH64_INSN_VARIANT_64BIT,
+						    AARCH64_INSN_ADSB_ADD);
+		break;
+
+	case 3:
+		insn = aarch64_insn_gen_add_sub_imm(rd, rn,
+						    tag_val & GENMASK(23, 12),
+						    AARCH64_INSN_VARIANT_64BIT,
+						    AARCH64_INSN_ADSB_ADD);
+		break;
+
+	case 4:
+		/* ROR is a variant of EXTR with Rm = Rn */
+		insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
+					     rn, rn, rd, 64 - tag_lsb);
+		break;
 	}
 
 	return insn;
@@ -56,8 +110,8 @@ void __init kvm_update_va_mask(struct alt_instr *alt,
 {
 	int i;
 
-	/* We only expect a 1 instruction sequence */
-	BUG_ON(nr_inst != 1);
+	/* We only expect a 5 instruction sequence */
+	BUG_ON(nr_inst != 5);
 
 	if (!has_vhe() && !va_mask)
 		compute_layout();
@@ -68,8 +122,12 @@ void __init kvm_update_va_mask(struct alt_instr *alt,
 		/*
 		 * VHE doesn't need any address translation, let's NOP
 		 * everything.
+		 *
+		 * Alternatively, if we don't have any spare bits in
+		 * the address, NOP everything after masking the
+		 * kernel VA.
 		 */
-		if (has_vhe()) {
+		if (has_vhe() || (!tag_lsb && i > 1)) {
 			updptr[i] = aarch64_insn_gen_nop();
 			continue;
 		}
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 14c5e5534f2f..d01c7111b1f7 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1811,7 +1811,7 @@ int kvm_mmu_init(void)
 		  kern_hyp_va((unsigned long)high_memory - 1));
 
 	if (hyp_idmap_start >= kern_hyp_va(PAGE_OFFSET) &&
-	    hyp_idmap_start <  kern_hyp_va(~0UL) &&
+	    hyp_idmap_start <  kern_hyp_va((unsigned long)high_memory - 1) &&
 	    hyp_idmap_start != (unsigned long)__hyp_idmap_text_start) {
 		/*
 		 * The idmap page is intersecting with the VA space,
-- 
2.14.2


* [PATCH v4 18/19] arm64: KVM: Introduce EL2 VA randomisation
@ 2018-01-04 18:43   ` Marc Zyngier
  0 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel

The main idea behind randomising the EL2 VA is that we usually have
a few spare bits between the most significant bit of the VA mask
and the most significant bit of the linear mapping.

Those bits could be a bunch of zeroes, and could be useful
to move things around a bit. Of course, the more memory you have,
the less randomisation you get...

Alternatively, these bits could be the result of KASLR, in which
case they are already random. But it would be nice to have a
*different* randomisation, just to make the job of a potential
attacker a bit more difficult.

Inserting these random bits is a bit involved. We don't have a spare
register (short of rewriting all the kern_hyp_va call sites), and
the immediate we want to insert is too random to be used with the
ORR instruction. The best option I could come up with is the following
sequence:

	and x0, x0, #va_mask
	ror x0, x0, #first_random_bit
	add x0, x0, #(random & 0xfff)
	add x0, x0, #(random >> 12), lsl #12
	ror x0, x0, #(63 - first_random_bit)

making it a fairly long sequence, but one that a decent CPU should
be able to execute without breaking a sweat. It is of course NOPed
out on VHE. The last 4 instructions can also be turned into NOPs
if it appears that there are no free bits to use.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 arch/arm64/include/asm/kvm_mmu.h | 10 +++++-
 arch/arm64/kvm/va_layout.c       | 68 +++++++++++++++++++++++++++++++++++++---
 virt/kvm/arm/mmu.c               |  2 +-
 3 files changed, 73 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index cc882e890bb1..4fca6ddadccc 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -85,6 +85,10 @@
 .macro kern_hyp_va	reg
 alternative_cb kvm_update_va_mask
 	and     \reg, \reg, #1
+	ror	\reg, \reg, #1
+	add	\reg, \reg, #0
+	add	\reg, \reg, #0
+	ror	\reg, \reg, #63
 alternative_cb_end
 .endm
 
@@ -101,7 +105,11 @@ void kvm_update_va_mask(struct alt_instr *alt,
 
 static inline unsigned long __kern_hyp_va(unsigned long v)
 {
-	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n",
+	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n"
+				    "ror %0, %0, #1\n"
+				    "add %0, %0, #0\n"
+				    "add %0, %0, #0\n"
+				    "ror %0, %0, #63\n",
 				    kvm_update_va_mask)
 		     : "+r" (v));
 	return v;
diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
index 75bb1c6772b0..bf0d6bdf5f14 100644
--- a/arch/arm64/kvm/va_layout.c
+++ b/arch/arm64/kvm/va_layout.c
@@ -16,11 +16,15 @@
  */
 
 #include <linux/kvm_host.h>
+#include <linux/random.h>
+#include <linux/memblock.h>
 #include <asm/alternative.h>
 #include <asm/debug-monitors.h>
 #include <asm/insn.h>
 #include <asm/kvm_mmu.h>
 
+static u8 tag_lsb;
+static u64 tag_val;
 static u64 va_mask;
 
 static void compute_layout(void)
@@ -32,8 +36,31 @@ static void compute_layout(void)
 	region  = idmap_addr & BIT(VA_BITS - 1);
 	region ^= BIT(VA_BITS - 1);
 
-	va_mask  = BIT(VA_BITS - 1) - 1;
-	va_mask |= region;
+	tag_lsb = fls64((u64)phys_to_virt(memblock_start_of_DRAM()) ^
+			(u64)(high_memory - 1));
+
+	if (tag_lsb == (VA_BITS - 1)) {
+		/*
+		 * No space in the address, let's compute the mask so
+		 * that it covers (VA_BITS - 1) bits, and the region
+		 * bit. The tag is set to zero.
+		 */
+		tag_lsb = tag_val = 0;
+		va_mask  = BIT(VA_BITS - 1) - 1;
+		va_mask |= region;
+	} else {
+		/*
+		 * We do have some free bits. Let's have the mask to
+		 * cover the low bits of the VA, and the tag to
+		 * contain the random stuff plus the region bit.
+		 */
+		u64 mask = GENMASK_ULL(VA_BITS - 2, tag_lsb);
+
+		va_mask = BIT(tag_lsb) - 1;
+		tag_val  = get_random_long() & mask;
+		tag_val |= region;
+		tag_val >>= tag_lsb;
+	}
 }
 
 static u32 compute_instruction(int n, u32 rd, u32 rn)
@@ -46,6 +73,33 @@ static u32 compute_instruction(int n, u32 rd, u32 rn)
 							  AARCH64_INSN_VARIANT_64BIT,
 							  rn, rd, va_mask);
 		break;
+
+	case 1:
+		/* ROR is a variant of EXTR with Rm = Rn */
+		insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
+					     rn, rn, rd,
+					     tag_lsb);
+		break;
+
+	case 2:
+		insn = aarch64_insn_gen_add_sub_imm(rd, rn,
+						    tag_val & (SZ_4K - 1),
+						    AARCH64_INSN_VARIANT_64BIT,
+						    AARCH64_INSN_ADSB_ADD);
+		break;
+
+	case 3:
+		insn = aarch64_insn_gen_add_sub_imm(rd, rn,
+						    tag_val & GENMASK(23, 12),
+						    AARCH64_INSN_VARIANT_64BIT,
+						    AARCH64_INSN_ADSB_ADD);
+		break;
+
+	case 4:
+		/* ROR is a variant of EXTR with Rm = Rn */
+		insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
+					     rn, rn, rd, 64 - tag_lsb);
+		break;
 	}
 
 	return insn;
@@ -56,8 +110,8 @@ void __init kvm_update_va_mask(struct alt_instr *alt,
 {
 	int i;
 
-	/* We only expect a 1 instruction sequence */
-	BUG_ON(nr_inst != 1);
+	/* We only expect a 5 instruction sequence */
+	BUG_ON(nr_inst != 5);
 
 	if (!has_vhe() && !va_mask)
 		compute_layout();
@@ -68,8 +122,12 @@ void __init kvm_update_va_mask(struct alt_instr *alt,
 		/*
 		 * VHE doesn't need any address translation, let's NOP
 		 * everything.
+		 *
+		 * Alternatively, if we don't have any spare bits in
+		 * the address, NOP everything after masking the
+		 * kernel VA.
 		 */
-		if (has_vhe()) {
+		if (has_vhe() || (!tag_lsb && i > 1)) {
 			updptr[i] = aarch64_insn_gen_nop();
 			continue;
 		}
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 14c5e5534f2f..d01c7111b1f7 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1811,7 +1811,7 @@ int kvm_mmu_init(void)
 		  kern_hyp_va((unsigned long)high_memory - 1));
 
 	if (hyp_idmap_start >= kern_hyp_va(PAGE_OFFSET) &&
-	    hyp_idmap_start <  kern_hyp_va(~0UL) &&
+	    hyp_idmap_start <  kern_hyp_va((unsigned long)high_memory - 1) &&
 	    hyp_idmap_start != (unsigned long)__hyp_idmap_text_start) {
 		/*
 		 * The idmap page is intersecting with the VA space,
-- 
2.14.2


* [PATCH v4 19/19] arm64: Update the KVM memory map documentation
  2018-01-04 18:43 ` Marc Zyngier
@ 2018-01-04 18:43   ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm
  Cc: Christoffer Dall, Mark Rutland, Catalin Marinas, Will Deacon,
	James Morse, Steve Capper, Peter Maydell

Update the documentation to reflect the new tricks we play on the
EL2 mappings...

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 Documentation/arm64/memory.txt | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/Documentation/arm64/memory.txt b/Documentation/arm64/memory.txt
index 671bc0639262..ea64e20037f6 100644
--- a/Documentation/arm64/memory.txt
+++ b/Documentation/arm64/memory.txt
@@ -86,9 +86,11 @@ Translation table lookup with 64KB pages:
  +-------------------------------------------------> [63] TTBR0/1
 
 
-When using KVM without the Virtualization Host Extensions, the hypervisor
-maps kernel pages in EL2 at a fixed offset from the kernel VA. See the
-kern_hyp_va macro for more details.
+When using KVM without the Virtualization Host Extensions, the
+hypervisor maps kernel pages in EL2 at a fixed offset (modulo a random
+offset) from the linear mapping. See the kern_hyp_va macro and
+kvm_update_va_mask function for more details. MMIO devices such as
+GICv2 get mapped next to the HYP idmap page.
 
 When using KVM with the Virtualization Host Extensions, no additional
 mappings are created, since the host kernel runs directly in EL2.
-- 
2.14.2


* [PATCH v4 19/19] arm64: Update the KVM memory map documentation
@ 2018-01-04 18:43   ` Marc Zyngier
  0 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-04 18:43 UTC (permalink / raw)
  To: linux-arm-kernel

Update the documentation to reflect the new tricks we play on the
EL2 mappings...

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 Documentation/arm64/memory.txt | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/Documentation/arm64/memory.txt b/Documentation/arm64/memory.txt
index 671bc0639262..ea64e20037f6 100644
--- a/Documentation/arm64/memory.txt
+++ b/Documentation/arm64/memory.txt
@@ -86,9 +86,11 @@ Translation table lookup with 64KB pages:
  +-------------------------------------------------> [63] TTBR0/1
 
 
-When using KVM without the Virtualization Host Extensions, the hypervisor
-maps kernel pages in EL2 at a fixed offset from the kernel VA. See the
-kern_hyp_va macro for more details.
+When using KVM without the Virtualization Host Extensions, the
+hypervisor maps kernel pages in EL2 at a fixed offset (modulo a random
+offset) from the linear mapping. See the kern_hyp_va macro and
+kvm_update_va_mask function for more details. MMIO devices such as
+GICv2 get mapped next to the HYP idmap page.
 
 When using KVM with the Virtualization Host Extensions, no additional
 mappings are created, since the host kernel runs directly in EL2.
-- 
2.14.2


* Re: [PATCH v4 03/19] arm64: asm-offsets: Remove potential circular dependency
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-15  8:34     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-15  8:34 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Catalin Marinas, Will Deacon, linux-arm-kernel, kvmarm

On Thu, Jan 04, 2018 at 06:43:18PM +0000, Marc Zyngier wrote:
> So far, we've been lucky enough that none of the include files
> that asm-offsets.c requires include asm-offsets.h. This is
> about to change, and would introduce a nasty circular dependency...
> 
> Let's now guard the inclusion of asm-offsets.h so that it never
> gets pulled from asm-offsets.c. The same issue exists between
> bounce.c and include/generated/bounds.h, and is worked around
> by using the existing guard symbol.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/asm-offsets.h | 2 ++
>  arch/arm64/kernel/asm-offsets.c      | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/asm-offsets.h b/arch/arm64/include/asm/asm-offsets.h
> index d370ee36a182..7d6531a81eb3 100644
> --- a/arch/arm64/include/asm/asm-offsets.h
> +++ b/arch/arm64/include/asm/asm-offsets.h
> @@ -1 +1,3 @@
> +#if !defined(__GENERATING_ASM_OFFSETS_H) && !defined(__GENERATING_BOUNDS_H)

I don't understand why we need to check __GENERATING_BOUNDS_H here?
What is the interaction between asm-offsets and bounds?

Thanks,
-Christoffer


* [PATCH v4 03/19] arm64: asm-offsets: Remove potential circular dependency
@ 2018-01-15  8:34     ` Christoffer Dall
  0 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-15  8:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jan 04, 2018 at 06:43:18PM +0000, Marc Zyngier wrote:
> So far, we've been lucky enough that none of the include files
> that asm-offsets.c requires include asm-offsets.h. This is
> about to change, and would introduce a nasty circular dependency...
> 
> Let's now guard the inclusion of asm-offsets.h so that it never
> gets pulled from asm-offsets.c. The same issue exists between
> bounce.c and include/generated/bounds.h, and is worked around
> by using the existing guard symbol.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/asm-offsets.h | 2 ++
>  arch/arm64/kernel/asm-offsets.c      | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/asm-offsets.h b/arch/arm64/include/asm/asm-offsets.h
> index d370ee36a182..7d6531a81eb3 100644
> --- a/arch/arm64/include/asm/asm-offsets.h
> +++ b/arch/arm64/include/asm/asm-offsets.h
> @@ -1 +1,3 @@
> +#if !defined(__GENERATING_ASM_OFFSETS_H) && !defined(__GENERATING_BOUNDS_H)

I don't understand why we need to check __GENERATING_BOUNDS_H here?
What is the interaction between asm-offsets and bounds?

Thanks,
-Christoffer


* Re: [PATCH v4 03/19] arm64: asm-offsets: Remove potential circular dependency
  2018-01-15  8:34     ` Christoffer Dall
@ 2018-01-15  8:42       ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-15  8:42 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: linux-arm-kernel, kvm, kvmarm, Christoffer Dall, Mark Rutland,
	Catalin Marinas, Will Deacon, James Morse, Steve Capper,
	Peter Maydell

On 15/01/18 08:34, Christoffer Dall wrote:
> On Thu, Jan 04, 2018 at 06:43:18PM +0000, Marc Zyngier wrote:
>> So far, we've been lucky enough that none of the include files
>> that asm-offsets.c requires include asm-offsets.h. This is
>> about to change, and would introduce a nasty circular dependency...
>>
>> Let's now guard the inclusion of asm-offsets.h so that it never
>> gets pulled from asm-offsets.c. The same issue exists between
>> bounce.c and include/generated/bounds.h, and is worked around
>> by using the existing guard symbol.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm64/include/asm/asm-offsets.h | 2 ++
>>  arch/arm64/kernel/asm-offsets.c      | 2 ++
>>  2 files changed, 4 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/asm-offsets.h b/arch/arm64/include/asm/asm-offsets.h
>> index d370ee36a182..7d6531a81eb3 100644
>> --- a/arch/arm64/include/asm/asm-offsets.h
>> +++ b/arch/arm64/include/asm/asm-offsets.h
>> @@ -1 +1,3 @@
>> +#if !defined(__GENERATING_ASM_OFFSETS_H) && !defined(__GENERATING_BOUNDS_H)
> 
> I don't understand why we need to check __GENERATING_BOUNDS_H here?
> What is the interaction between asm-offsets and bounds?

bounds.c ends up including asm-offsets.h as well once you start adding
the dependency between it and alternatives.h. See the report there:

https://www.spinics.net/lists/arm-kernel/msg623723.html

which this check addresses.
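
For reference, and assuming asm/asm-offsets.h is just a wrapper around
the generated header, the patched file ends up looking roughly like:

	/* sketch only -- see the actual patch for the exact change */
	#if !defined(__GENERATING_ASM_OFFSETS_H) && !defined(__GENERATING_BOUNDS_H)
	#include <generated/asm-offsets.h>
	#endif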

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


* [PATCH v4 03/19] arm64: asm-offsets: Remove potential circular dependency
@ 2018-01-15  8:42       ` Marc Zyngier
  0 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-01-15  8:42 UTC (permalink / raw)
  To: linux-arm-kernel

On 15/01/18 08:34, Christoffer Dall wrote:
> On Thu, Jan 04, 2018 at 06:43:18PM +0000, Marc Zyngier wrote:
>> So far, we've been lucky enough that none of the include files
>> that asm-offsets.c requires include asm-offsets.h. This is
>> about to change, and would introduce a nasty circular dependency...
>>
>> Let's now guard the inclusion of asm-offsets.h so that it never
>> gets pulled from asm-offsets.c. The same issue exists between
>> bounce.c and include/generated/bounds.h, and is worked around
>> by using the existing guard symbol.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm64/include/asm/asm-offsets.h | 2 ++
>>  arch/arm64/kernel/asm-offsets.c      | 2 ++
>>  2 files changed, 4 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/asm-offsets.h b/arch/arm64/include/asm/asm-offsets.h
>> index d370ee36a182..7d6531a81eb3 100644
>> --- a/arch/arm64/include/asm/asm-offsets.h
>> +++ b/arch/arm64/include/asm/asm-offsets.h
>> @@ -1 +1,3 @@
>> +#if !defined(__GENERATING_ASM_OFFSETS_H) && !defined(__GENERATING_BOUNDS_H)
> 
> I don't understand why we need to check __GENERATING_BOUNDS_H here?
> What is the interaction between asm-offsets and bounds?

bounds.c ends up including asm-offsets.h as well once you start adding
the dependency between it and alternatives.h. See the report there:

https://www.spinics.net/lists/arm-kernel/msg623723.html

which this check addresses.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...


* Re: [PATCH v4 04/19] arm64: alternatives: Enforce alignment of struct alt_instr
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-15  9:11     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-15  9:11 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jan 04, 2018 at 06:43:19PM +0000, Marc Zyngier wrote:
> We're playing a dangerous game with struct alt_instr, as we produce
> it using assembly tricks, but parse them using the C structure.
> We just assume that the respective alignments of the two will
> be the same.
> 
> But as we add more fields to this structure, the alignment requirements
> of the structure may change, and lead to all kind of funky bugs.
> 
> TO solve this, let's move the definition of struct alt_instr to its
> own file, and use this to generate the alignment constraint from
> asm-offsets.c. The various macros are then patched to take the
> alignment into account.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/alternative.h       | 13 +++++--------
>  arch/arm64/include/asm/alternative_types.h | 13 +++++++++++++
>  arch/arm64/kernel/asm-offsets.c            |  4 ++++
>  3 files changed, 22 insertions(+), 8 deletions(-)
>  create mode 100644 arch/arm64/include/asm/alternative_types.h
> 
> diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
> index 4a85c6952a22..395befde7595 100644
> --- a/arch/arm64/include/asm/alternative.h
> +++ b/arch/arm64/include/asm/alternative.h
> @@ -2,28 +2,24 @@
>  #ifndef __ASM_ALTERNATIVE_H
>  #define __ASM_ALTERNATIVE_H
>  
> +#include <asm/asm-offsets.h>
>  #include <asm/cpucaps.h>
>  #include <asm/insn.h>
>  
>  #ifndef __ASSEMBLY__
>  
> +#include <asm/alternative_types.h>
> +
>  #include <linux/init.h>
>  #include <linux/types.h>
>  #include <linux/stddef.h>
>  #include <linux/stringify.h>
>  
> -struct alt_instr {
> -	s32 orig_offset;	/* offset to original instruction */
> -	s32 alt_offset;		/* offset to replacement instruction */
> -	u16 cpufeature;		/* cpufeature bit set for replacement */
> -	u8  orig_len;		/* size of original instruction(s) */
> -	u8  alt_len;		/* size of new instruction(s), <= orig_len */
> -};
> -
>  void __init apply_alternatives_all(void);
>  void apply_alternatives(void *start, size_t length);
>  
>  #define ALTINSTR_ENTRY(feature)						      \
> +	" .align " __stringify(ALTINSTR_ALIGN) "\n"			      \
>  	" .word 661b - .\n"				/* label           */ \
>  	" .word 663f - .\n"				/* new instruction */ \
>  	" .hword " __stringify(feature) "\n"		/* feature bit     */ \
> @@ -69,6 +65,7 @@ void apply_alternatives(void *start, size_t length);
>  #include <asm/assembler.h>
>  
>  .macro altinstruction_entry orig_offset alt_offset feature orig_len alt_len
> +	.align ALTINSTR_ALIGN
>  	.word \orig_offset - .
>  	.word \alt_offset - .
>  	.hword \feature

I don't understand much of the magics in this file, but...

> diff --git a/arch/arm64/include/asm/alternative_types.h b/arch/arm64/include/asm/alternative_types.h
> new file mode 100644
> index 000000000000..26cf76167f2d
> --- /dev/null
> +++ b/arch/arm64/include/asm/alternative_types.h
> @@ -0,0 +1,13 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_ALTERNATIVE_TYPES_H
> +#define __ASM_ALTERNATIVE_TYPES_H
> +
> +struct alt_instr {
> +	s32 orig_offset;	/* offset to original instruction */
> +	s32 alt_offset;		/* offset to replacement instruction */
> +	u16 cpufeature;		/* cpufeature bit set for replacement */
> +	u8  orig_len;		/* size of original instruction(s) */
> +	u8  alt_len;		/* size of new instruction(s), <= orig_len */
> +};
> +
> +#endif
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index 5ab8841af382..f00666341ae2 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -25,6 +25,7 @@
>  #include <linux/dma-mapping.h>
>  #include <linux/kvm_host.h>
>  #include <linux/suspend.h>
> +#include <asm/alternative_types.h>
>  #include <asm/cpufeature.h>
>  #include <asm/thread_info.h>
>  #include <asm/memory.h>
> @@ -151,5 +152,8 @@ int main(void)
>    DEFINE(HIBERN_PBE_ADDR,	offsetof(struct pbe, address));
>    DEFINE(HIBERN_PBE_NEXT,	offsetof(struct pbe, next));
>    DEFINE(ARM64_FTR_SYSVAL,	offsetof(struct arm64_ftr_reg, sys_val));
> +  BLANK();
> +  DEFINE(ALTINSTR_ALIGN,	(63 - __builtin_clzl(__alignof__(struct alt_instr))));
> +
>    return 0;
>  }
> -- 
> 2.14.2
> 

... the rest looks correct to me.  FWIW:

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>


* [PATCH v4 04/19] arm64: alternatives: Enforce alignment of struct alt_instr
@ 2018-01-15  9:11     ` Christoffer Dall
  0 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-15  9:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jan 04, 2018 at 06:43:19PM +0000, Marc Zyngier wrote:
> We're playing a dangerous game with struct alt_instr, as we produce
> it using assembly tricks, but parse them using the C structure.
> We just assume that the respective alignments of the two will
> be the same.
> 
> But as we add more fields to this structure, the alignment requirements
> of the structure may change, and lead to all kind of funky bugs.
> 
> TO solve this, let's move the definition of struct alt_instr to its
> own file, and use this to generate the alignment constraint from
> asm-offsets.c. The various macros are then patched to take the
> alignment into account.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/alternative.h       | 13 +++++--------
>  arch/arm64/include/asm/alternative_types.h | 13 +++++++++++++
>  arch/arm64/kernel/asm-offsets.c            |  4 ++++
>  3 files changed, 22 insertions(+), 8 deletions(-)
>  create mode 100644 arch/arm64/include/asm/alternative_types.h
> 
> diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
> index 4a85c6952a22..395befde7595 100644
> --- a/arch/arm64/include/asm/alternative.h
> +++ b/arch/arm64/include/asm/alternative.h
> @@ -2,28 +2,24 @@
>  #ifndef __ASM_ALTERNATIVE_H
>  #define __ASM_ALTERNATIVE_H
>  
> +#include <asm/asm-offsets.h>
>  #include <asm/cpucaps.h>
>  #include <asm/insn.h>
>  
>  #ifndef __ASSEMBLY__
>  
> +#include <asm/alternative_types.h>
> +
>  #include <linux/init.h>
>  #include <linux/types.h>
>  #include <linux/stddef.h>
>  #include <linux/stringify.h>
>  
> -struct alt_instr {
> -	s32 orig_offset;	/* offset to original instruction */
> -	s32 alt_offset;		/* offset to replacement instruction */
> -	u16 cpufeature;		/* cpufeature bit set for replacement */
> -	u8  orig_len;		/* size of original instruction(s) */
> -	u8  alt_len;		/* size of new instruction(s), <= orig_len */
> -};
> -
>  void __init apply_alternatives_all(void);
>  void apply_alternatives(void *start, size_t length);
>  
>  #define ALTINSTR_ENTRY(feature)						      \
> +	" .align " __stringify(ALTINSTR_ALIGN) "\n"			      \
>  	" .word 661b - .\n"				/* label           */ \
>  	" .word 663f - .\n"				/* new instruction */ \
>  	" .hword " __stringify(feature) "\n"		/* feature bit     */ \
> @@ -69,6 +65,7 @@ void apply_alternatives(void *start, size_t length);
>  #include <asm/assembler.h>
>  
>  .macro altinstruction_entry orig_offset alt_offset feature orig_len alt_len
> +	.align ALTINSTR_ALIGN
>  	.word \orig_offset - .
>  	.word \alt_offset - .
>  	.hword \feature

I don't understand much of the magics in this file, but...

> diff --git a/arch/arm64/include/asm/alternative_types.h b/arch/arm64/include/asm/alternative_types.h
> new file mode 100644
> index 000000000000..26cf76167f2d
> --- /dev/null
> +++ b/arch/arm64/include/asm/alternative_types.h
> @@ -0,0 +1,13 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_ALTERNATIVE_TYPES_H
> +#define __ASM_ALTERNATIVE_TYPES_H
> +
> +struct alt_instr {
> +	s32 orig_offset;	/* offset to original instruction */
> +	s32 alt_offset;		/* offset to replacement instruction */
> +	u16 cpufeature;		/* cpufeature bit set for replacement */
> +	u8  orig_len;		/* size of original instruction(s) */
> +	u8  alt_len;		/* size of new instruction(s), <= orig_len */
> +};
> +
> +#endif
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index 5ab8841af382..f00666341ae2 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -25,6 +25,7 @@
>  #include <linux/dma-mapping.h>
>  #include <linux/kvm_host.h>
>  #include <linux/suspend.h>
> +#include <asm/alternative_types.h>
>  #include <asm/cpufeature.h>
>  #include <asm/thread_info.h>
>  #include <asm/memory.h>
> @@ -151,5 +152,8 @@ int main(void)
>    DEFINE(HIBERN_PBE_ADDR,	offsetof(struct pbe, address));
>    DEFINE(HIBERN_PBE_NEXT,	offsetof(struct pbe, next));
>    DEFINE(ARM64_FTR_SYSVAL,	offsetof(struct arm64_ftr_reg, sys_val));
> +  BLANK();
> +  DEFINE(ALTINSTR_ALIGN,	(63 - __builtin_clzl(__alignof__(struct alt_instr))));
> +
>    return 0;
>  }
> -- 
> 2.14.2
> 

... the rest looks correct to me.  FWIW:

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>


* Re: [PATCH v4 03/19] arm64: asm-offsets: Remove potential circular dependency
  2018-01-15  8:42       ` Marc Zyngier
@ 2018-01-15  9:46         ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-15  9:46 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On Mon, Jan 15, 2018 at 9:42 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 15/01/18 08:34, Christoffer Dall wrote:
>> On Thu, Jan 04, 2018 at 06:43:18PM +0000, Marc Zyngier wrote:
>>> So far, we've been lucky enough that none of the include files
>>> that asm-offsets.c requires include asm-offsets.h. This is
>>> about to change, and would introduce a nasty circular dependency...
>>>
>>> Let's now guard the inclusion of asm-offsets.h so that it never
>>> gets pulled from asm-offsets.c. The same issue exists between
>>> bounds.c and include/generated/bounds.h, and is worked around
>>> by using the existing guard symbol.
>>>
>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>> ---
>>>  arch/arm64/include/asm/asm-offsets.h | 2 ++
>>>  arch/arm64/kernel/asm-offsets.c      | 2 ++
>>>  2 files changed, 4 insertions(+)
>>>
>>> diff --git a/arch/arm64/include/asm/asm-offsets.h b/arch/arm64/include/asm/asm-offsets.h
>>> index d370ee36a182..7d6531a81eb3 100644
>>> --- a/arch/arm64/include/asm/asm-offsets.h
>>> +++ b/arch/arm64/include/asm/asm-offsets.h
>>> @@ -1 +1,3 @@
>>> +#if !defined(__GENERATING_ASM_OFFSETS_H) && !defined(__GENERATING_BOUNDS_H)
>>
>> I don't understand why we need to check __GENERATING_BOUNDS_H here?
>> What is the interaction between asm-offsets and bounds?
>
> bounds.c ends up including asm-offsets.h as well once you start adding
> the dependency between it and alternatives.h. See the report there:
>
> https://www.spinics.net/lists/arm-kernel/msg623723.html
>
> which this check addresses.
>
Ah, I see.

Thanks,
-Christoffer
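
For context, the guarded wrapper ends up being tiny.  Based on the hunk
quoted above and the usual shape of asm-offsets wrappers (only the #if
line is visible in the quote, so the rest is an assumption),
arch/arm64/include/asm/asm-offsets.h presumably reads:

	#if !defined(__GENERATING_ASM_OFFSETS_H) && !defined(__GENERATING_BOUNDS_H)
	#include <generated/asm-offsets.h>
	#endif

with asm-offsets.c presumably defining __GENERATING_ASM_OFFSETS_H before
its includes, mirroring how bounds.c already defines __GENERATING_BOUNDS_H.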

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 05/19] arm64: alternatives: Add dynamic patching feature
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-15 11:26     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-15 11:26 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jan 04, 2018 at 06:43:20PM +0000, Marc Zyngier wrote:
> We've so far relied on a patching infrastructure that only gave us
> a single alternative, without any way to finely control what gets
> patched. 

Not sure I understand this point.  Do you mean without any way to
provide a range of potential replacement instructions?

> For a single feature, this is an all or nothing thing.
> 
> It would be interesting to have a more fine grained way of patching
> the kernel though, where we could dynamically tune the code that gets
> injected.

Again, I'm not really sure how this is more fine grained than what we
had before, but it's certainly more flexible, in that we can now apply
arbitrary run-time logic in the form of a C-function to decide which
instructions should be used in a given location.

> 
> In order to achieve this, let's introduce a new form of alternative
> that is associated with a callback.

And we call this new form of alternative "dynamic patching"?  I think
it's good to introduce a bit of consistent nomenclature here.

> This callback gets the instruction
> sequence number and the old instruction as a parameter, and returns
> the new instruction. This callback is always called, as the patching
> decision is now done at runtime (not patching is equivalent to returning
> the same instruction).

Sorry to be a bit nit-picky here, but didn't we also patch instructions
at runtime before this feature?  It's just that we now apply logic to
figure out the replacement instruction, and can even compose an
instruction at runtime.

> 
> Patching with a callback is declared with the new ALTERNATIVE_CB

So here we could say "callback patching" or "dynamic patching", and...

> and alternative_cb directives:
> 
> 	asm volatile(ALTERNATIVE_CB("mov %0, #0\n", callback)
> 		     : "r" (v));
> or
> 	alternative_cb callback
> 		mov	x0, #0
> 	alternative_cb_end
> 
> where callback is the C function computing the alternative.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/alternative.h       | 36 ++++++++++++++++++++++---
>  arch/arm64/include/asm/alternative_types.h |  4 +++
>  arch/arm64/kernel/alternative.c            | 43 ++++++++++++++++++++++--------
>  3 files changed, 68 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
> index 395befde7595..04f66f6173fc 100644
> --- a/arch/arm64/include/asm/alternative.h
> +++ b/arch/arm64/include/asm/alternative.h
> @@ -18,10 +18,14 @@
>  void __init apply_alternatives_all(void);
>  void apply_alternatives(void *start, size_t length);
>  
> -#define ALTINSTR_ENTRY(feature)						      \
> +#define ALTINSTR_ENTRY(feature,cb)					      \
>  	" .align " __stringify(ALTINSTR_ALIGN) "\n"			      \
>  	" .word 661b - .\n"				/* label           */ \
> +	" .if " __stringify(cb) " == 0\n"				      \
>  	" .word 663f - .\n"				/* new instruction */ \
> +	" .else\n"							      \
> +	" .word " __stringify(cb) "- .\n"		/* callback */	      \
> +	" .endif\n"							      \
>  	" .hword " __stringify(feature) "\n"		/* feature bit     */ \
>  	" .byte 662b-661b\n"				/* source len      */ \
>  	" .byte 664f-663f\n"				/* replacement len */
> @@ -39,15 +43,18 @@ void apply_alternatives(void *start, size_t length);
>   * but most assemblers die if insn1 or insn2 have a .inst. This should
>   * be fixed in a binutils release posterior to 2.25.51.0.2 (anything
>   * containing commit 4e4d08cf7399b606 or c1baaddf8861).
> + *
> + * Alternatives with callbacks do not generate replacement instructions.
>   */
> -#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled)	\
> +#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled, cb)	\
>  	".if "__stringify(cfg_enabled)" == 1\n"				\
>  	"661:\n\t"							\
>  	oldinstr "\n"							\
>  	"662:\n"							\
>  	".pushsection .altinstructions,\"a\"\n"				\
> -	ALTINSTR_ENTRY(feature)						\
> +	ALTINSTR_ENTRY(feature,cb)					\
>  	".popsection\n"							\
> +	" .if " __stringify(cb) " == 0\n"				\
>  	".pushsection .altinstr_replacement, \"a\"\n"			\
>  	"663:\n\t"							\
>  	newinstr "\n"							\
> @@ -55,11 +62,17 @@ void apply_alternatives(void *start, size_t length);
>  	".popsection\n\t"						\
>  	".org	. - (664b-663b) + (662b-661b)\n\t"			\
>  	".org	. - (662b-661b) + (664b-663b)\n"			\
> +	".else\n\t"							\
> +	"663:\n\t"							\
> +	"664:\n\t"							\
> +	".endif\n"							\
>  	".endif\n"
>  
>  #define _ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg, ...)	\
> -	__ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg))
> +	__ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg), 0)
>  
> +#define ALTERNATIVE_CB(oldinstr, cb) \
> +	__ALTERNATIVE_CFG(oldinstr, "NOT_AN_INSTRUCTION", ARM64_NCAPS, 1, cb)
>  #else
>  
>  #include <asm/assembler.h>
> @@ -127,6 +140,14 @@ void apply_alternatives(void *start, size_t length);
>  661:
>  .endm
>  
> +.macro alternative_cb cb
> +	.set .Lasm_alt_mode, 0
> +	.pushsection .altinstructions, "a"
> +	altinstruction_entry 661f, \cb, ARM64_NCAPS, 662f-661f, 0
> +	.popsection
> +661:
> +.endm
> +
>  /*
>   * Provide the other half of the alternative code sequence.
>   */
> @@ -152,6 +173,13 @@ void apply_alternatives(void *start, size_t length);
>  	.org	. - (662b-661b) + (664b-663b)
>  .endm
>  
> +/*
> + * Callback-based alternative epilogue
> + */
> +.macro alternative_cb_end
> +662:
> +.endm
> +
>  /*
>   * Provides a trivial alternative or default sequence consisting solely
>   * of NOPs. The number of NOPs is chosen automatically to match the
> diff --git a/arch/arm64/include/asm/alternative_types.h b/arch/arm64/include/asm/alternative_types.h
> index 26cf76167f2d..e400b9061957 100644
> --- a/arch/arm64/include/asm/alternative_types.h
> +++ b/arch/arm64/include/asm/alternative_types.h
> @@ -2,6 +2,10 @@
>  #ifndef __ASM_ALTERNATIVE_TYPES_H
>  #define __ASM_ALTERNATIVE_TYPES_H
>  
> +struct alt_instr;
> +typedef void (*alternative_cb_t)(struct alt_instr *alt,
> +				 __le32 *origptr, __le32 *updptr, int nr_inst);
> +
>  struct alt_instr {
>  	s32 orig_offset;	/* offset to original instruction */
>  	s32 alt_offset;		/* offset to replacement instruction */
> diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
> index 6dd0a3a3e5c9..0f52627fbb29 100644
> --- a/arch/arm64/kernel/alternative.c
> +++ b/arch/arm64/kernel/alternative.c
> @@ -105,32 +105,53 @@ static u32 get_alt_insn(struct alt_instr *alt, __le32 *insnptr, __le32 *altinsnp
>  	return insn;
>  }
>  
> +static void patch_alternative(struct alt_instr *alt,
> +			      __le32 *origptr, __le32 *updptr, int nr_inst)
> +{
> +	__le32 *replptr;
> +	int i;
> +
> +	replptr = ALT_REPL_PTR(alt);
> +	for (i = 0; i < nr_inst; i++) {
> +		u32 insn;
> +
> +		insn = get_alt_insn(alt, origptr + i, replptr + i);
> +		updptr[i] = cpu_to_le32(insn);
> +	}
> +}
> +
>  static void __apply_alternatives(void *alt_region, bool use_linear_alias)
>  {
>  	struct alt_instr *alt;
>  	struct alt_region *region = alt_region;
> -	__le32 *origptr, *replptr, *updptr;
> +	__le32 *origptr, *updptr;
> +	alternative_cb_t alt_cb;
>  
>  	for (alt = region->begin; alt < region->end; alt++) {
> -		u32 insn;
> -		int i, nr_inst;
> +		int nr_inst;
>  
> -		if (!cpus_have_cap(alt->cpufeature))
> +		/* Use ARM64_NCAPS as an unconditional patch */

... here I think we can re-use this term, if I correctly understand that
ARM64_NCAPS means it's a "callback patch" (or "dynamic patch").

You could consider #define ARM64_CB_PATCH ARM64_NCAPS as well I suppose.

> +		if (alt->cpufeature < ARM64_NCAPS &&
> +		    !cpus_have_cap(alt->cpufeature))
>  			continue;
>  
> -		BUG_ON(alt->alt_len != alt->orig_len);
> +		if (alt->cpufeature == ARM64_NCAPS)
> +			BUG_ON(alt->alt_len != 0);
> +		else
> +			BUG_ON(alt->alt_len != alt->orig_len);
>  
>  		pr_info_once("patching kernel code\n");
>  
>  		origptr = ALT_ORIG_PTR(alt);
> -		replptr = ALT_REPL_PTR(alt);
>  		updptr = use_linear_alias ? lm_alias(origptr) : origptr;
> -		nr_inst = alt->alt_len / sizeof(insn);
> +		nr_inst = alt->orig_len / AARCH64_INSN_SIZE;
>  
> -		for (i = 0; i < nr_inst; i++) {
> -			insn = get_alt_insn(alt, origptr + i, replptr + i);
> -			updptr[i] = cpu_to_le32(insn);
> -		}
> +		if (alt->cpufeature < ARM64_NCAPS)
> +			alt_cb = patch_alternative;
> +		else
> +			alt_cb  = ALT_REPL_PTR(alt);
> +
> +		alt_cb(alt, origptr, updptr, nr_inst);
>  
>  		flush_icache_range((uintptr_t)origptr,
>  				   (uintptr_t)(origptr + nr_inst));
> -- 
> 2.14.2
> 

So despite my language nit-picks above, I haven't been able to spot
anything problematic with this patch:

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
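
To make the callback contract concrete, a minimal user of ALTERNATIVE_CB
could look like the sketch below.  Only the alternative_cb_t prototype
comes from the patch; example_nop_cb and frob are made-up names, and the
sketch assumes <asm/alternative.h> and <asm/insn.h> are in scope:

	/*
	 * Illustrative only: turn the patched instruction into a NOP at
	 * boot, whatever it originally was (what patch 8 does for VHE).
	 */
	void __init example_nop_cb(struct alt_instr *alt, __le32 *origptr,
				   __le32 *updptr, int nr_inst)
	{
		int i;

		for (i = 0; i < nr_inst; i++)
			updptr[i] = cpu_to_le32(aarch64_insn_gen_nop());
	}

	static inline unsigned long frob(unsigned long v)
	{
		/* "and" is just a placeholder the callback will overwrite */
		asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n", example_nop_cb)
			     : "+r" (v));
		return v;
	}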

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 06/19] arm64: insn: Add N immediate encoding
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-15 11:26     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-15 11:26 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jan 04, 2018 at 06:43:21PM +0000, Marc Zyngier wrote:
> We're missing a way to generate the encoding of the N immediate,
> which is only a single bit used in a number of instructions that take
> an immediate.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

> ---
>  arch/arm64/include/asm/insn.h | 1 +
>  arch/arm64/kernel/insn.c      | 4 ++++
>  2 files changed, 5 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
> index 4214c38d016b..21fffdd290a3 100644
> --- a/arch/arm64/include/asm/insn.h
> +++ b/arch/arm64/include/asm/insn.h
> @@ -70,6 +70,7 @@ enum aarch64_insn_imm_type {
>  	AARCH64_INSN_IMM_6,
>  	AARCH64_INSN_IMM_S,
>  	AARCH64_INSN_IMM_R,
> +	AARCH64_INSN_IMM_N,
>  	AARCH64_INSN_IMM_MAX
>  };
>  
> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
> index 2718a77da165..7e432662d454 100644
> --- a/arch/arm64/kernel/insn.c
> +++ b/arch/arm64/kernel/insn.c
> @@ -343,6 +343,10 @@ static int __kprobes aarch64_get_imm_shift_mask(enum aarch64_insn_imm_type type,
>  		mask = BIT(6) - 1;
>  		shift = 16;
>  		break;
> +	case AARCH64_INSN_IMM_N:
> +		mask = 1;
> +		shift = 22;
> +		break;
>  	default:
>  		return -EINVAL;
>  	}
> -- 
> 2.14.2
> 
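
For the record, the new immediate type is consumed like any other; the
encoder added later in the series (patch 7) ends up doing exactly this
to set the field:

	/* n is 0 or 1 and lands in bit 22 of the logical-immediate encoding */
	insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_N, insn, n);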

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 07/19] arm64: insn: Add encoder for bitwise operations using literals
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-15 11:26     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-15 11:26 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On Thu, Jan 04, 2018 at 06:43:22PM +0000, Marc Zyngier wrote:
> We lack a way to encode operations such as AND, ORR, EOR that take
> an immediate value. Doing so is quite involved, and is all about
> reverse engineering the decoding algorithm described in the
> pseudocode function DecodeBitMasks().

Black magic.

> 
> This has been tested by feeding it all the possible literal values
> and comparing the output with that of GAS.

That's comforting.

I didn't attempt to verify the functionality or every hard-coded
value or dirty bit trick in this patch, but I did glance over the parts
I could vaguely understand and didn't see any issues.

I suppose that's a weak sort of:

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/insn.h |   9 +++
>  arch/arm64/kernel/insn.c      | 136 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 145 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
> index 21fffdd290a3..815b35bc53ed 100644
> --- a/arch/arm64/include/asm/insn.h
> +++ b/arch/arm64/include/asm/insn.h
> @@ -315,6 +315,10 @@ __AARCH64_INSN_FUNCS(eor,	0x7F200000, 0x4A000000)
>  __AARCH64_INSN_FUNCS(eon,	0x7F200000, 0x4A200000)
>  __AARCH64_INSN_FUNCS(ands,	0x7F200000, 0x6A000000)
>  __AARCH64_INSN_FUNCS(bics,	0x7F200000, 0x6A200000)
> +__AARCH64_INSN_FUNCS(and_imm,	0x7F800000, 0x12000000)
> +__AARCH64_INSN_FUNCS(orr_imm,	0x7F800000, 0x32000000)
> +__AARCH64_INSN_FUNCS(eor_imm,	0x7F800000, 0x52000000)
> +__AARCH64_INSN_FUNCS(ands_imm,	0x7F800000, 0x72000000)
>  __AARCH64_INSN_FUNCS(b,		0xFC000000, 0x14000000)
>  __AARCH64_INSN_FUNCS(bl,	0xFC000000, 0x94000000)
>  __AARCH64_INSN_FUNCS(cbz,	0x7F000000, 0x34000000)
> @@ -424,6 +428,11 @@ u32 aarch64_insn_gen_logical_shifted_reg(enum aarch64_insn_register dst,
>  					 int shift,
>  					 enum aarch64_insn_variant variant,
>  					 enum aarch64_insn_logic_type type);
> +u32 aarch64_insn_gen_logical_immediate(enum aarch64_insn_logic_type type,
> +				       enum aarch64_insn_variant variant,
> +				       enum aarch64_insn_register Rn,
> +				       enum aarch64_insn_register Rd,
> +				       u64 imm);
>  u32 aarch64_insn_gen_prefetch(enum aarch64_insn_register base,
>  			      enum aarch64_insn_prfm_type type,
>  			      enum aarch64_insn_prfm_target target,
> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
> index 7e432662d454..72cb1721c63f 100644
> --- a/arch/arm64/kernel/insn.c
> +++ b/arch/arm64/kernel/insn.c
> @@ -1485,3 +1485,139 @@ pstate_check_t * const aarch32_opcode_cond_checks[16] = {
>  	__check_hi, __check_ls, __check_ge, __check_lt,
>  	__check_gt, __check_le, __check_al, __check_al
>  };
> +
> +static bool range_of_ones(u64 val)
> +{
> +	/* Doesn't handle full ones or full zeroes */
> +	u64 sval = val >> __ffs64(val);
> +
> +	/* One of Sean Eron Anderson's bithack tricks */
> +	return ((sval + 1) & (sval)) == 0;
> +}
> +
> +static u32 aarch64_encode_immediate(u64 imm,
> +				    enum aarch64_insn_variant variant,
> +				    u32 insn)
> +{
> +	unsigned int immr, imms, n, ones, ror, esz, tmp;
> +	u64 mask = ~0UL;
> +
> +	/* Can't encode full zeroes or full ones */
> +	if (!imm || !~imm)
> +		return AARCH64_BREAK_FAULT;
> +
> +	switch (variant) {
> +	case AARCH64_INSN_VARIANT_32BIT:
> +		if (upper_32_bits(imm))
> +			return AARCH64_BREAK_FAULT;
> +		esz = 32;
> +		break;
> +	case AARCH64_INSN_VARIANT_64BIT:
> +		insn |= AARCH64_INSN_SF_BIT;
> +		esz = 64;
> +		break;
> +	default:
> +		pr_err("%s: unknown variant encoding %d\n", __func__, variant);
> +		return AARCH64_BREAK_FAULT;
> +	}
> +
> +	/*
> +	 * Inverse of Replicate(). Try to spot a repeating pattern
> +	 * with a pow2 stride.
> +	 */
> +	for (tmp = esz / 2; tmp >= 2; tmp /= 2) {
> +		u64 emask = BIT(tmp) - 1;
> +
> +		if ((imm & emask) != ((imm >> (tmp / 2)) & emask))
> +			break;
> +
> +		esz = tmp;
> +		mask = emask;
> +	}
> +
> +	/* N is only set if we're encoding a 64bit value */
> +	n = esz == 64;
> +
> +	/* Trim imm to the element size */
> +	imm &= mask;
> +
> +	/* That's how many ones we need to encode */
> +	ones = hweight64(imm);
> +
> +	/*
> +	 * imms is set to (ones - 1), prefixed with a string of ones
> +	 * and a zero if they fit. Cap it to 6 bits.
> +	 */
> +	imms  = ones - 1;
> +	imms |= 0xf << ffs(esz);
> +	imms &= BIT(6) - 1;
> +
> +	/* Compute the rotation */
> +	if (range_of_ones(imm)) {
> +		/*
> +		 * Pattern: 0..01..10..0
> +		 *
> +		 * Compute how many rotate we need to align it right
> +		 */
> +		ror = __ffs64(imm);
> +	} else {
> +		/*
> +		 * Pattern: 0..01..10..01..1
> +		 *
> +		 * Fill the unused top bits with ones, and check if
> +		 * the result is a valid immediate (all ones with a
> +		 * contiguous ranges of zeroes).
> +		 */
> +		imm |= ~mask;
> +		if (!range_of_ones(~imm))
> +			return AARCH64_BREAK_FAULT;
> +
> +		/*
> +		 * Compute the rotation to get a continuous set of
> +		 * ones, with the first bit set at position 0
> +		 */
> +		ror = fls(~imm);
> +	}
> +
> +	/*
> +	 * immr is the number of bits we need to rotate back to the
> +	 * original set of ones. Note that this is relative to the
> +	 * element size...
> +	 */
> +	immr = (esz - ror) % esz;
> +
> +	insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_N, insn, n);
> +	insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_R, insn, immr);
> +	return aarch64_insn_encode_immediate(AARCH64_INSN_IMM_S, insn, imms);
> +}
> +
> +u32 aarch64_insn_gen_logical_immediate(enum aarch64_insn_logic_type type,
> +				       enum aarch64_insn_variant variant,
> +				       enum aarch64_insn_register Rn,
> +				       enum aarch64_insn_register Rd,
> +				       u64 imm)
> +{
> +	u32 insn;
> +
> +	switch (type) {
> +	case AARCH64_INSN_LOGIC_AND:
> +		insn = aarch64_insn_get_and_imm_value();
> +		break;
> +	case AARCH64_INSN_LOGIC_ORR:
> +		insn = aarch64_insn_get_orr_imm_value();
> +		break;
> +	case AARCH64_INSN_LOGIC_EOR:
> +		insn = aarch64_insn_get_eor_imm_value();
> +		break;
> +	case AARCH64_INSN_LOGIC_AND_SETFLAGS:
> +		insn = aarch64_insn_get_ands_imm_value();
> +		break;
> +	default:
> +		pr_err("%s: unknown logical encoding %d\n", __func__, type);
> +		return AARCH64_BREAK_FAULT;
> +	}
> +
> +	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RD, insn, Rd);
> +	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, Rn);
> +	return aarch64_encode_immediate(imm, variant, insn);
> +}
> -- 
> 2.14.2
> 
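
As a worked example of the flow above (hand-computed, so treat it as a
sanity check rather than gospel): for the 64-bit immediate 0xf0 there is
no repeating sub-pattern, so esz stays 64 and N = 1; ones = 4 gives
imms = 0b000011; range_of_ones() holds with ror = 4, so
immr = (64 - 4) % 64 = 60, which should match what GAS emits for
"and x0, x0, #0xf0".  As a call:

	u32 insn;

	/* AND x0, x0, #0xf0 (x0 is AARCH64_INSN_REG_0) */
	insn = aarch64_insn_gen_logical_immediate(AARCH64_INSN_LOGIC_AND,
						  AARCH64_INSN_VARIANT_64BIT,
						  AARCH64_INSN_REG_0,	/* Rn */
						  AARCH64_INSN_REG_0,	/* Rd */
						  0xf0);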

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 08/19] arm64: KVM: Dynamically patch the kernel/hyp VA mask
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-15 11:47     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-15 11:47 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jan 04, 2018 at 06:43:23PM +0000, Marc Zyngier wrote:
> So far, we're using a complicated sequence of alternatives to
> patch the kernel/hyp VA mask on non-VHE, and NOP out the
> masking altogether when on VHE.
> 
> THe newly introduced dynamic patching gives us the opportunity

nit: s/THe/The/

> to simplify that code by patching a single instruction with
> the correct mask (instead of the mind-bending cumulative masking
> we have at the moment) or even a single NOP on VHE.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/kvm_mmu.h | 45 ++++++--------------
>  arch/arm64/kvm/Makefile          |  2 +-
>  arch/arm64/kvm/va_layout.c       | 91 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 104 insertions(+), 34 deletions(-)
>  create mode 100644 arch/arm64/kvm/va_layout.c
> 
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 672c8684d5c2..b0c3cbe9b513 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -69,9 +69,6 @@
>   * mappings, and none of this applies in that case.
>   */
>  
> -#define HYP_PAGE_OFFSET_HIGH_MASK	((UL(1) << VA_BITS) - 1)
> -#define HYP_PAGE_OFFSET_LOW_MASK	((UL(1) << (VA_BITS - 1)) - 1)
> -
>  #ifdef __ASSEMBLY__
>  
>  #include <asm/alternative.h>
> @@ -81,28 +78,14 @@
>   * Convert a kernel VA into a HYP VA.
>   * reg: VA to be converted.
>   *
> - * This generates the following sequences:
> - * - High mask:
> - *		and x0, x0, #HYP_PAGE_OFFSET_HIGH_MASK
> - *		nop
> - * - Low mask:
> - *		and x0, x0, #HYP_PAGE_OFFSET_HIGH_MASK
> - *		and x0, x0, #HYP_PAGE_OFFSET_LOW_MASK
> - * - VHE:
> - *		nop
> - *		nop
> - *
> - * The "low mask" version works because the mask is a strict subset of
> - * the "high mask", hence performing the first mask for nothing.
> - * Should be completely invisible on any viable CPU.
> + * The actual code generation takes place in kvm_update_va_mask, and
> + * the instructions below are only there to reserve the space and
> + * perform the register allocation.

Not just register allocation, but also to tell the generating code which
registers to use for its operands, right?

>   */
>  .macro kern_hyp_va	reg
> -alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
> -	and     \reg, \reg, #HYP_PAGE_OFFSET_HIGH_MASK
> -alternative_else_nop_endif
> -alternative_if ARM64_HYP_OFFSET_LOW
> -	and     \reg, \reg, #HYP_PAGE_OFFSET_LOW_MASK
> -alternative_else_nop_endif
> +alternative_cb kvm_update_va_mask
> +	and     \reg, \reg, #1
> +alternative_cb_end
>  .endm
>  
>  #else
> @@ -113,18 +96,14 @@ alternative_else_nop_endif
>  #include <asm/mmu_context.h>
>  #include <asm/pgtable.h>
>  
> +void kvm_update_va_mask(struct alt_instr *alt,
> +			__le32 *origptr, __le32 *updptr, int nr_inst);
> +
>  static inline unsigned long __kern_hyp_va(unsigned long v)
>  {
> -	asm volatile(ALTERNATIVE("and %0, %0, %1",
> -				 "nop",
> -				 ARM64_HAS_VIRT_HOST_EXTN)
> -		     : "+r" (v)
> -		     : "i" (HYP_PAGE_OFFSET_HIGH_MASK));
> -	asm volatile(ALTERNATIVE("nop",
> -				 "and %0, %0, %1",
> -				 ARM64_HYP_OFFSET_LOW)
> -		     : "+r" (v)
> -		     : "i" (HYP_PAGE_OFFSET_LOW_MASK));
> +	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n",
> +				    kvm_update_va_mask)
> +		     : "+r" (v));
>  	return v;
>  }
>  
> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> index 87c4f7ae24de..93afff91cb7c 100644
> --- a/arch/arm64/kvm/Makefile
> +++ b/arch/arm64/kvm/Makefile
> @@ -16,7 +16,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/e
>  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
>  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
>  
> -kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o
> +kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
>  kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
>  kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o sys_regs_generic_v8.o
>  kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o
> diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
> new file mode 100644
> index 000000000000..aee758574e61
> --- /dev/null
> +++ b/arch/arm64/kvm/va_layout.c
> @@ -0,0 +1,91 @@
> +/*
> + * Copyright (C) 2017 ARM Ltd.
> + * Author: Marc Zyngier <marc.zyngier@arm.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <linux/kvm_host.h>
> +#include <asm/alternative.h>
> +#include <asm/debug-monitors.h>
> +#include <asm/insn.h>
> +#include <asm/kvm_mmu.h>
> +
> +#define HYP_PAGE_OFFSET_HIGH_MASK	((UL(1) << VA_BITS) - 1)
> +#define HYP_PAGE_OFFSET_LOW_MASK	((UL(1) << (VA_BITS - 1)) - 1)
> +
> +static u64 va_mask;
> +
> +static void compute_layout(void)
> +{
> +	phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
> +	unsigned long mask = HYP_PAGE_OFFSET_HIGH_MASK;
> +
> +	/*
> +	 * Activate the lower HYP offset only if the idmap doesn't
> +	 * clash with it,
> +	 */

The commentary is a bit strange given the logic below...

> +	if (idmap_addr > HYP_PAGE_OFFSET_LOW_MASK)
> +		mask = HYP_PAGE_OFFSET_HIGH_MASK;

... was the initialization supposed to be LOW_MASK?

(and does this imply that this was never tested on a system that
actually used the low mask?)

> +
> +	va_mask = mask;
> +}
> +
> +static u32 compute_instruction(int n, u32 rd, u32 rn)
> +{
> +	u32 insn = AARCH64_BREAK_FAULT;
> +
> +	switch (n) {
> +	case 0:

hmmm, wonder why we need this n==0 check...

> +		insn = aarch64_insn_gen_logical_immediate(AARCH64_INSN_LOGIC_AND,
> +							  AARCH64_INSN_VARIANT_64BIT,
> +							  rn, rd, va_mask);
> +		break;
> +	}
> +
> +	return insn;
> +}
> +
> +void __init kvm_update_va_mask(struct alt_instr *alt,
> +			       __le32 *origptr, __le32 *updptr, int nr_inst)
> +{
> +	int i;
> +
> +	/* We only expect a 1 instruction sequence */

nit: wording is a bit strange, how about
"We only expect a single instruction in the alternative sequence"

> +	BUG_ON(nr_inst != 1);
> +
> +	if (!has_vhe() && !va_mask)
> +		compute_layout();
> +
> +	for (i = 0; i < nr_inst; i++) {

It's a bit funny to have a loop with the above BUG_ON.

(I'm beginning to wonder if a future patch expands on this single
instruction thing, in which case a hint in the commit message would have
been nice.)

> +		u32 rd, rn, insn, oinsn;
> +
> +		/*
> +		 * VHE doesn't need any address translation, let's NOP
> +		 * everything.
> +		 */
> +		if (has_vhe()) {
> +			updptr[i] = aarch64_insn_gen_nop();
> +			continue;
> +		}
> +
> +		oinsn = le32_to_cpu(origptr[i]);
> +		rd = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RD, oinsn);
> +		rn = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, oinsn);
> +
> +		insn = compute_instruction(i, rd, rn);
> +		BUG_ON(insn == AARCH64_BREAK_FAULT);
> +
> +		updptr[i] = cpu_to_le32(insn);
> +	}
> +}
> -- 
> 2.14.2
> 

Thanks,
-Christoffer
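
To visualise what the callback produces, here is the before/after for the
single placeholder instruction, assuming a non-VHE kernel with
VA_BITS = 48 and the "high" mask (i.e. no idmap clash), so the exact
constant is illustrative:

	// emitted at compile time as a placeholder:
	and	x0, x0, #0x1
	// patched at boot on non-VHE ((1 << 48) - 1):
	and	x0, x0, #0xffffffffffff
	// or patched on VHE, where no masking is needed:
	nop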

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 09/19] arm64: cpufeatures: Drop the ARM64_HYP_OFFSET_LOW feature flag
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-15 11:48     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-15 11:48 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On Thu, Jan 04, 2018 at 06:43:24PM +0000, Marc Zyngier wrote:
> Now that we can dynamically compute the kernek/hyp VA mask, there

nit: s/kernek/kernel/

> is need for a feature flag to trigger the alternative patching.

is *no* need?

> Let's drop the flag and everything that depends on it.

Otherwise:

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/cpucaps.h |  2 +-
>  arch/arm64/kernel/cpufeature.c   | 19 -------------------
>  2 files changed, 1 insertion(+), 20 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> index 2ff7c5e8efab..f130f35dca3c 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -32,7 +32,7 @@
>  #define ARM64_HAS_VIRT_HOST_EXTN		11
>  #define ARM64_WORKAROUND_CAVIUM_27456		12
>  #define ARM64_HAS_32BIT_EL0			13
> -#define ARM64_HYP_OFFSET_LOW			14
> +/* #define ARM64_UNALLOCATED_ENTRY			14 */
>  #define ARM64_MISMATCHED_CACHE_LINE_SIZE	15
>  #define ARM64_HAS_NO_FPSIMD			16
>  #define ARM64_WORKAROUND_REPEAT_TLBI		17
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index a73a5928f09b..b99f8b1688c3 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -825,19 +825,6 @@ static bool runs_at_el2(const struct arm64_cpu_capabilities *entry, int __unused
>  	return is_kernel_in_hyp_mode();
>  }
>  
> -static bool hyp_offset_low(const struct arm64_cpu_capabilities *entry,
> -			   int __unused)
> -{
> -	phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
> -
> -	/*
> -	 * Activate the lower HYP offset only if:
> -	 * - the idmap doesn't clash with it,
> -	 * - the kernel is not running at EL2.
> -	 */
> -	return idmap_addr > GENMASK(VA_BITS - 2, 0) && !is_kernel_in_hyp_mode();
> -}
> -
>  static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unused)
>  {
>  	u64 pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> @@ -926,12 +913,6 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.field_pos = ID_AA64PFR0_EL0_SHIFT,
>  		.min_field_value = ID_AA64PFR0_EL0_32BIT_64BIT,
>  	},
> -	{
> -		.desc = "Reduced HYP mapping offset",
> -		.capability = ARM64_HYP_OFFSET_LOW,
> -		.def_scope = SCOPE_SYSTEM,
> -		.matches = hyp_offset_low,
> -	},
>  	{
>  		/* FP/SIMD is not implemented */
>  		.capability = ARM64_HAS_NO_FPSIMD,
> -- 
> 2.14.2
> 

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 10/19] KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_state
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-15 15:36     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-15 15:36 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jan 04, 2018 at 06:43:25PM +0000, Marc Zyngier wrote:
> kvm_vgic_global_state is part of the read-only section, and is
> usually accessed using a PC-relative address generation (adrp + add).
> 
> It is thus useless to use kern_hyp_va() on it, and actively problematic
> if kern_hyp_va() becomes non-idempotent. On the other hand, there is
> no way that the compiler is going to guarantee that such access is
> always be PC relative.

nit: s/is always be/is always/

> 
> So let's bite the bullet and provide our own accessor.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm/include/asm/kvm_hyp.h   | 6 ++++++
>  arch/arm64/include/asm/kvm_hyp.h | 9 +++++++++
>  virt/kvm/arm/hyp/vgic-v2-sr.c    | 4 ++--
>  3 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
> index ab20ffa8b9e7..1d42d0aa2feb 100644
> --- a/arch/arm/include/asm/kvm_hyp.h
> +++ b/arch/arm/include/asm/kvm_hyp.h
> @@ -26,6 +26,12 @@
>  
>  #define __hyp_text __section(.hyp.text) notrace
>  
> +#define hyp_symbol_addr(s)						\
> +	({								\
> +		typeof(s) *addr = &(s);					\
> +		addr;							\
> +	})
> +
>  #define __ACCESS_VFP(CRn)			\
>  	"mrc", "mcr", __stringify(p10, 7, %0, CRn, cr0, 0), u32
>  
> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> index 08d3bb66c8b7..a2d98c539023 100644
> --- a/arch/arm64/include/asm/kvm_hyp.h
> +++ b/arch/arm64/include/asm/kvm_hyp.h
> @@ -25,6 +25,15 @@
>  
>  #define __hyp_text __section(.hyp.text) notrace
>  
> +#define hyp_symbol_addr(s)						\
> +	({								\
> +		typeof(s) *addr;					\
> +		asm volatile("adrp	%0, %1\n"			\
> +			     "add	%0, %0, :lo12:%1\n"		\
> +			     : "=r" (addr) : "S" (&s));			\

Can't we use adr_l here?

> +		addr;							\
> +	})
> +

I don't fully appreciate the semantics of this macro going by its name
only.  My understanding is that if you want to resolve a symbol to an
address which is mapped in hyp, then use this.  Is this correct?

If so, can we add a small comment (because I can't come up with a better
name).
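
To be concrete, something along these lines on the arm64 side would be
enough for me (the wording is mine, so consider it a sketch):

/*
 * Obtain the address of a kernel symbol using PC-relative addressing
 * only, so that the resulting address is valid in whatever context the
 * code is running in (EL1 kernel or HYP/EL2), rather than relying on
 * an absolute kernel VA being generated.
 */
#define hyp_symbol_addr(s)						\
	({								\
		typeof(s) *addr;					\
		asm volatile("adrp	%0, %1\n"			\
			     "add	%0, %0, :lo12:%1\n"		\
			     : "=r" (addr) : "S" (&s));			\
		addr;							\
	})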


>  #define read_sysreg_elx(r,nvh,vh)					\
>  	({								\
>  		u64 reg;						\
> diff --git a/virt/kvm/arm/hyp/vgic-v2-sr.c b/virt/kvm/arm/hyp/vgic-v2-sr.c
> index d7fd46fe9efb..4573d0552af3 100644
> --- a/virt/kvm/arm/hyp/vgic-v2-sr.c
> +++ b/virt/kvm/arm/hyp/vgic-v2-sr.c
> @@ -25,7 +25,7 @@
>  static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
>  {
>  	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> -	int nr_lr = (kern_hyp_va(&kvm_vgic_global_state))->nr_lr;
> +	int nr_lr = hyp_symbol_addr(kvm_vgic_global_state)->nr_lr;
>  	u32 elrsr0, elrsr1;
>  
>  	elrsr0 = readl_relaxed(base + GICH_ELRSR0);
> @@ -139,7 +139,7 @@ int __hyp_text __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu)
>  		return -1;
>  
>  	rd = kvm_vcpu_dabt_get_rd(vcpu);
> -	addr  = kern_hyp_va((kern_hyp_va(&kvm_vgic_global_state))->vcpu_base_va);
> +	addr  = kern_hyp_va(hyp_symbol_addr(kvm_vgic_global_state)->vcpu_base_va);
>  	addr += fault_ipa - vgic->vgic_cpu_base;
>  
>  	if (kvm_vcpu_dabt_iswrite(vcpu)) {
> -- 
> 2.14.2
> 
Otherwise looks good.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 11/19] KVM: arm/arm64: Demote HYP VA range display to being a debug feature
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-15 15:54     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-15 15:54 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On Thu, Jan 04, 2018 at 06:43:26PM +0000, Marc Zyngier wrote:
> Displaying the HYP VA information is slightly counterproductive when
> using VA randomization. Turn it into a debug feature only, and adjust
> the last displayed value to reflect the top of RAM instead of ~0.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

> ---
>  virt/kvm/arm/mmu.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index b4b69c2d1012..84d09f1a44d4 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1760,9 +1760,10 @@ int kvm_mmu_init(void)
>  	 */
>  	BUG_ON((hyp_idmap_start ^ (hyp_idmap_end - 1)) & PAGE_MASK);
>  
> -	kvm_info("IDMAP page: %lx\n", hyp_idmap_start);
> -	kvm_info("HYP VA range: %lx:%lx\n",
> -		 kern_hyp_va(PAGE_OFFSET), kern_hyp_va(~0UL));
> +	kvm_debug("IDMAP page: %lx\n", hyp_idmap_start);
> +	kvm_debug("HYP VA range: %lx:%lx\n",
> +		  kern_hyp_va(PAGE_OFFSET),
> +		  kern_hyp_va((unsigned long)high_memory - 1));
>  
>  	if (hyp_idmap_start >= kern_hyp_va(PAGE_OFFSET) &&
>  	    hyp_idmap_start <  kern_hyp_va(~0UL) &&
> -- 
> 2.14.2
> 

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 12/19] KVM: arm/arm64: Move ioremap calls to create_hyp_io_mappings
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-15 18:07     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-15 18:07 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jan 04, 2018 at 06:43:27PM +0000, Marc Zyngier wrote:
> Both HYP io mappings call ioremap, followed by create_hyp_io_mappings.
> Let's move the ioremap call into create_hyp_io_mappings itself, which
> simplifies the code a bit and allows for further refactoring.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm/include/asm/kvm_mmu.h   |  3 ++-
>  arch/arm64/include/asm/kvm_mmu.h |  3 ++-
>  virt/kvm/arm/mmu.c               | 24 ++++++++++++++----------
>  virt/kvm/arm/vgic/vgic-v2.c      | 31 ++++++++-----------------------
>  4 files changed, 26 insertions(+), 35 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index fa6f2174276b..cb3bef71ec9b 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -41,7 +41,8 @@
>  #include <asm/stage2_pgtable.h>
>  
>  int create_hyp_mappings(void *from, void *to, pgprot_t prot);
> -int create_hyp_io_mappings(void *from, void *to, phys_addr_t);
> +int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
> +			   void __iomem **kaddr);
>  void free_hyp_pgds(void);
>  
>  void stage2_unmap_vm(struct kvm *kvm);
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index b0c3cbe9b513..09a208014457 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -119,7 +119,8 @@ static inline unsigned long __kern_hyp_va(unsigned long v)
>  #include <asm/stage2_pgtable.h>
>  
>  int create_hyp_mappings(void *from, void *to, pgprot_t prot);
> -int create_hyp_io_mappings(void *from, void *to, phys_addr_t);
> +int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
> +			   void __iomem **kaddr);
>  void free_hyp_pgds(void);
>  
>  void stage2_unmap_vm(struct kvm *kvm);
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 84d09f1a44d4..38adbe0a016c 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -709,26 +709,30 @@ int create_hyp_mappings(void *from, void *to, pgprot_t prot)
>  }
>  
>  /**
> - * create_hyp_io_mappings - duplicate a kernel IO mapping into Hyp mode
> - * @from:	The kernel start VA of the range
> - * @to:		The kernel end VA of the range (exclusive)
> + * create_hyp_io_mappings - Map IO into both kernel and HYP
>   * @phys_addr:	The physical start address which gets mapped
> + * @size:	Size of the region being mapped
> + * @kaddr:	Kernel VA for this mapping
>   *
>   * The resulting HYP VA is the same as the kernel VA, modulo
>   * HYP_PAGE_OFFSET.
>   */
> -int create_hyp_io_mappings(void *from, void *to, phys_addr_t phys_addr)
> +int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
> +			   void __iomem **kaddr)

nit: you could make this return a "void __iomem *" and use ERR_PTR etc.
as well, but it'll probably look worse on the calling side, so either
way is fine with me.
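
For reference, the shape I had in mind was something like this
(hypothetical signature, untested, and I agree the call sites would
probably end up looking worse):

	/* hypothetical ERR_PTR-style variant, not what this patch does */
	void __iomem *create_hyp_io_mappings(phys_addr_t phys_addr, size_t size);

	/* ... which at a call site such as vgic_v2_probe() becomes: */
	kvm_vgic_global_state.vctrl_base = create_hyp_io_mappings(info->vctrl.start,
								   resource_size(&info->vctrl));
	if (IS_ERR(kvm_vgic_global_state.vctrl_base)) {
		ret = PTR_ERR(kvm_vgic_global_state.vctrl_base);
		kvm_vgic_global_state.vctrl_base = NULL;
		goto out;
	}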

>  {
> -	unsigned long start = kern_hyp_va((unsigned long)from);
> -	unsigned long end = kern_hyp_va((unsigned long)to);
> +	unsigned long start, end;
>  
> -	if (is_kernel_in_hyp_mode())
> +	*kaddr = ioremap(phys_addr, size);
> +	if (!*kaddr)
> +		return -ENOMEM;
> +
> +	if (is_kernel_in_hyp_mode()) {
>  		return 0;
> +	}

nit: we don't need the curly braces

>  
> -	/* Check for a valid kernel IO mapping */
> -	if (!is_vmalloc_addr(from) || !is_vmalloc_addr(to - 1))
> -		return -EINVAL;
>  
> +	start = kern_hyp_va((unsigned long)*kaddr);
> +	end = kern_hyp_va((unsigned long)*kaddr + size);
>  	return __create_hyp_mappings(hyp_pgd, start, end,
>  				     __phys_to_pfn(phys_addr), PAGE_HYP_DEVICE);
>  }
> diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
> index 80897102da26..bc49d702f9f0 100644
> --- a/virt/kvm/arm/vgic/vgic-v2.c
> +++ b/virt/kvm/arm/vgic/vgic-v2.c
> @@ -332,16 +332,10 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
>  	if (!PAGE_ALIGNED(info->vcpu.start) ||
>  	    !PAGE_ALIGNED(resource_size(&info->vcpu))) {
>  		kvm_info("GICV region size/alignment is unsafe, using trapping (reduced performance)\n");
> -		kvm_vgic_global_state.vcpu_base_va = ioremap(info->vcpu.start,
> -							     resource_size(&info->vcpu));
> -		if (!kvm_vgic_global_state.vcpu_base_va) {
> -			kvm_err("Cannot ioremap GICV\n");
> -			return -ENOMEM;
> -		}
>  
> -		ret = create_hyp_io_mappings(kvm_vgic_global_state.vcpu_base_va,
> -					     kvm_vgic_global_state.vcpu_base_va + resource_size(&info->vcpu),
> -					     info->vcpu.start);
> +		ret = create_hyp_io_mappings(info->vcpu.start,
> +					     resource_size(&info->vcpu),
> +					     &kvm_vgic_global_state.vcpu_base_va);
>  		if (ret) {
>  			kvm_err("Cannot map GICV into hyp\n");
>  			goto out;
> @@ -350,26 +344,17 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
>  		static_branch_enable(&vgic_v2_cpuif_trap);
>  	}
>  
> -	kvm_vgic_global_state.vctrl_base = ioremap(info->vctrl.start,
> -						   resource_size(&info->vctrl));
> -	if (!kvm_vgic_global_state.vctrl_base) {
> -		kvm_err("Cannot ioremap GICH\n");
> -		ret = -ENOMEM;
> +	ret = create_hyp_io_mappings(info->vctrl.start,
> +				     resource_size(&info->vctrl),
> +				     &kvm_vgic_global_state.vctrl_base);
> +	if (ret) {
> +		kvm_err("Cannot map VCTRL into hyp\n");
>  		goto out;
>  	}
>  
>  	vtr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_VTR);
>  	kvm_vgic_global_state.nr_lr = (vtr & 0x3f) + 1;
>  
> -	ret = create_hyp_io_mappings(kvm_vgic_global_state.vctrl_base,
> -				     kvm_vgic_global_state.vctrl_base +
> -					 resource_size(&info->vctrl),
> -				     info->vctrl.start);
> -	if (ret) {
> -		kvm_err("Cannot map VCTRL into hyp\n");
> -		goto out;
> -	}
> -
>  	ret = kvm_register_vgic_device(KVM_DEV_TYPE_ARM_VGIC_V2);
>  	if (ret) {
>  		kvm_err("Cannot register GICv2 KVM device\n");
> -- 
> 2.14.2
> 

Otherwise looks good:

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 13/19] KVM: arm/arm64: Keep GICv2 HYP VAs in kvm_vgic_global_state
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-18 14:39     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-18 14:39 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jan 04, 2018 at 06:43:28PM +0000, Marc Zyngier wrote:
> As we're about to change the way we map devices at HYP, we need
> to move away from kern_hyp_va on an IO address.
> 
> One way of achieving this is to store the VAs in kvm_vgic_global_state,
> and use that directly from the HYP code. This requires a small change
> to create_hyp_io_mappings so that it can also return a HYP VA.
> 
> We take this opportunity to nuke the vctrl_base field in the emulated
> distributor, as it is not used anymore.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm/include/asm/kvm_mmu.h   |  3 ++-
>  arch/arm64/include/asm/kvm_mmu.h |  3 ++-
>  include/kvm/arm_vgic.h           | 12 ++++++------
>  virt/kvm/arm/hyp/vgic-v2-sr.c    | 10 +++-------
>  virt/kvm/arm/mmu.c               | 20 ++++++++++++++++----
>  virt/kvm/arm/vgic/vgic-init.c    |  6 ------
>  virt/kvm/arm/vgic/vgic-v2.c      | 13 +++++++------
>  7 files changed, 36 insertions(+), 31 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index cb3bef71ec9b..feff24b34506 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -42,7 +42,8 @@
>  
>  int create_hyp_mappings(void *from, void *to, pgprot_t prot);
>  int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
> -			   void __iomem **kaddr);
> +			   void __iomem **kaddr,
> +			   void __iomem **haddr);
>  void free_hyp_pgds(void);
>  
>  void stage2_unmap_vm(struct kvm *kvm);
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 09a208014457..cc882e890bb1 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -120,7 +120,8 @@ static inline unsigned long __kern_hyp_va(unsigned long v)
>  
>  int create_hyp_mappings(void *from, void *to, pgprot_t prot);
>  int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
> -			   void __iomem **kaddr);
> +			   void __iomem **kaddr,
> +			   void __iomem **haddr);
>  void free_hyp_pgds(void);
>  
>  void stage2_unmap_vm(struct kvm *kvm);
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 8c896540a72c..8b3fbc03293b 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -57,11 +57,15 @@ struct vgic_global {
>  	/* Physical address of vgic virtual cpu interface */
>  	phys_addr_t		vcpu_base;
>  
> -	/* GICV mapping */
> +	/* GICV mapping, kernel VA */
>  	void __iomem		*vcpu_base_va;
> +	/* GICV mapping, HYP VA */
> +	void __iomem		*vcpu_hyp_va;
>  
> -	/* virtual control interface mapping */
> +	/* virtual control interface mapping, kernel VA */
>  	void __iomem		*vctrl_base;
> +	/* virtual control interface mapping, HYP VA */
> +	void __iomem		*vctrl_hyp;
>  
>  	/* Number of implemented list registers */
>  	int			nr_lr;
> @@ -198,10 +202,6 @@ struct vgic_dist {
>  
>  	int			nr_spis;
>  
> -	/* TODO: Consider moving to global state */

yay!

> -	/* Virtual control interface mapping */
> -	void __iomem		*vctrl_base;
> -
>  	/* base addresses in guest physical address space: */
>  	gpa_t			vgic_dist_base;		/* distributor */
>  	union {
> diff --git a/virt/kvm/arm/hyp/vgic-v2-sr.c b/virt/kvm/arm/hyp/vgic-v2-sr.c
> index 4573d0552af3..dbd109f3a4ab 100644
> --- a/virt/kvm/arm/hyp/vgic-v2-sr.c
> +++ b/virt/kvm/arm/hyp/vgic-v2-sr.c
> @@ -56,10 +56,8 @@ static void __hyp_text save_lrs(struct kvm_vcpu *vcpu, void __iomem *base)
>  /* vcpu is already in the HYP VA space */
>  void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
>  {
> -	struct kvm *kvm = kern_hyp_va(vcpu->kvm);
>  	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> -	struct vgic_dist *vgic = &kvm->arch.vgic;
> -	void __iomem *base = kern_hyp_va(vgic->vctrl_base);
> +	void __iomem *base = hyp_symbol_addr(kvm_vgic_global_state)->vctrl_hyp;
>  	u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
>  
>  	if (!base)
> @@ -81,10 +79,8 @@ void __hyp_text __vgic_v2_save_state(struct kvm_vcpu *vcpu)
>  /* vcpu is already in the HYP VA space */
>  void __hyp_text __vgic_v2_restore_state(struct kvm_vcpu *vcpu)
>  {
> -	struct kvm *kvm = kern_hyp_va(vcpu->kvm);
>  	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> -	struct vgic_dist *vgic = &kvm->arch.vgic;
> -	void __iomem *base = kern_hyp_va(vgic->vctrl_base);
> +	void __iomem *base = hyp_symbol_addr(kvm_vgic_global_state)->vctrl_hyp;
>  	int i;
>  	u64 used_lrs = vcpu->arch.vgic_cpu.used_lrs;
>  
> @@ -139,7 +135,7 @@ int __hyp_text __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu)
>  		return -1;
>  
>  	rd = kvm_vcpu_dabt_get_rd(vcpu);
> -	addr  = kern_hyp_va(hyp_symbol_addr(kvm_vgic_global_state)->vcpu_base_va);
> +	addr  = hyp_symbol_addr(kvm_vgic_global_state)->vcpu_hyp_va;
>  	addr += fault_ipa - vgic->vgic_cpu_base;
>  
>  	if (kvm_vcpu_dabt_iswrite(vcpu)) {
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 38adbe0a016c..6192d45d1e1a 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -713,28 +713,40 @@ int create_hyp_mappings(void *from, void *to, pgprot_t prot)
>   * @phys_addr:	The physical start address which gets mapped
>   * @size:	Size of the region being mapped
>   * @kaddr:	Kernel VA for this mapping
> + * @haddr:	HYP VA for this mapping
>   *
> - * The resulting HYP VA is the same as the kernel VA, modulo
> - * HYP_PAGE_OFFSET.
> + * The resulting HYP VA is completely unrelated to the kernel VA.
>   */
>  int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
> -			   void __iomem **kaddr)
> +			   void __iomem **kaddr,
> +			   void __iomem **haddr)
>  {
>  	unsigned long start, end;
> +	int ret;
>  
>  	*kaddr = ioremap(phys_addr, size);
>  	if (!*kaddr)
>  		return -ENOMEM;
>  
>  	if (is_kernel_in_hyp_mode()) {
> +		*haddr = *kaddr;
>  		return 0;
>  	}
>  
>  
>  	start = kern_hyp_va((unsigned long)*kaddr);
>  	end = kern_hyp_va((unsigned long)*kaddr + size);
> -	return __create_hyp_mappings(hyp_pgd, start, end,
> +	ret = __create_hyp_mappings(hyp_pgd, start, end,
>  				     __phys_to_pfn(phys_addr), PAGE_HYP_DEVICE);
> +
> +	if (ret) {
> +		iounmap(*kaddr);
> +		*kaddr = NULL;
> +	} else {
> +		*haddr = (void __iomem *)start;
> +	}

potential waste-of-time nit: this could perhaps be written slightly
more elegantly as:

	if (ret) {
		iounmap(*kaddr);
		*kaddr = NULL;
		return ret;
	}

	*haddr = (void __iomem *)start;
	return 0;

>  }
>  
>  /**
> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> index 62310122ee78..3f01b5975055 100644
> --- a/virt/kvm/arm/vgic/vgic-init.c
> +++ b/virt/kvm/arm/vgic/vgic-init.c
> @@ -166,12 +166,6 @@ int kvm_vgic_create(struct kvm *kvm, u32 type)
>  	kvm->arch.vgic.in_kernel = true;
>  	kvm->arch.vgic.vgic_model = type;
>  
> -	/*
> -	 * kvm_vgic_global_state.vctrl_base is set on vgic probe (kvm_arch_init)
> -	 * it is stored in distributor struct for asm save/restore purpose
> -	 */
> -	kvm->arch.vgic.vctrl_base = kvm_vgic_global_state.vctrl_base;
> -
>  	kvm->arch.vgic.vgic_dist_base = VGIC_ADDR_UNDEF;
>  	kvm->arch.vgic.vgic_cpu_base = VGIC_ADDR_UNDEF;
>  	kvm->arch.vgic.vgic_redist_base = VGIC_ADDR_UNDEF;
> diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
> index bc49d702f9f0..f0f566e4494e 100644
> --- a/virt/kvm/arm/vgic/vgic-v2.c
> +++ b/virt/kvm/arm/vgic/vgic-v2.c
> @@ -335,7 +335,8 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
>  
>  		ret = create_hyp_io_mappings(info->vcpu.start,
>  					     resource_size(&info->vcpu),
> -					     &kvm_vgic_global_state.vcpu_base_va);
> +					     &kvm_vgic_global_state.vcpu_base_va,
> +					     &kvm_vgic_global_state.vcpu_hyp_va);
>  		if (ret) {
>  			kvm_err("Cannot map GICV into hyp\n");
>  			goto out;
> @@ -346,7 +347,8 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
>  
>  	ret = create_hyp_io_mappings(info->vctrl.start,
>  				     resource_size(&info->vctrl),
> -				     &kvm_vgic_global_state.vctrl_base);
> +				     &kvm_vgic_global_state.vctrl_base,
> +				     &kvm_vgic_global_state.vctrl_hyp);
>  	if (ret) {
>  		kvm_err("Cannot map VCTRL into hyp\n");
>  		goto out;
> @@ -381,15 +383,14 @@ int vgic_v2_probe(const struct gic_kvm_info *info)
>  void vgic_v2_load(struct kvm_vcpu *vcpu)
>  {
>  	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> -	struct vgic_dist *vgic = &vcpu->kvm->arch.vgic;
>  
> -	writel_relaxed(cpu_if->vgic_vmcr, vgic->vctrl_base + GICH_VMCR);
> +	writel_relaxed(cpu_if->vgic_vmcr,
> +		       kvm_vgic_global_state.vctrl_base + GICH_VMCR);
>  }
>  
>  void vgic_v2_put(struct kvm_vcpu *vcpu)
>  {
>  	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
> -	struct vgic_dist *vgic = &vcpu->kvm->arch.vgic;
>  
> -	cpu_if->vgic_vmcr = readl_relaxed(vgic->vctrl_base + GICH_VMCR);
> +	cpu_if->vgic_vmcr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_VMCR);
>  }
> -- 
> 2.14.2
> 

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 14/19] KVM: arm/arm64: Move HYP IO VAs to the "idmap" range
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-18 14:39     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-18 14:39 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jan 04, 2018 at 06:43:29PM +0000, Marc Zyngier wrote:
> We so far mapped our HYP IO (which is essentially the GICv2 control
> registers) using the same method as for memory. It recently appeared
> that this is a bit unsafe:
> 
> We compute the HYP VA using the kern_hyp_va helper, but that helper
> is only designed to deal with kernel VAs coming from the linear map,
> and not from the vmalloc region... This could in turn cause some bad
> aliasing between the two, amplified by the upcoming VA randomisation.
> 
> A solution is to come up with our very own basic VA allocator for
> MMIO. Since half of the HYP address space only contains a single
> page (the idmap), we have plenty to borrow from. Let's use the idmap
> as a base, and allocate downwards from it. GICv2 now lives on the
> other side of the great VA barrier.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  virt/kvm/arm/mmu.c | 56 +++++++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 43 insertions(+), 13 deletions(-)
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 6192d45d1e1a..14c5e5534f2f 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -43,6 +43,9 @@ static unsigned long hyp_idmap_start;
>  static unsigned long hyp_idmap_end;
>  static phys_addr_t hyp_idmap_vector;
>  
> +static DEFINE_MUTEX(io_map_lock);
> +static unsigned long io_map_base;
> +
>  #define S2_PGD_SIZE	(PTRS_PER_S2_PGD * sizeof(pgd_t))
>  #define hyp_pgd_order get_order(PTRS_PER_PGD * sizeof(pgd_t))
>  
> @@ -502,27 +505,31 @@ static void unmap_hyp_range(pgd_t *pgdp, phys_addr_t start, u64 size)
>   *
>   * Assumes hyp_pgd is a page table used strictly in Hyp-mode and
>   * therefore contains either mappings in the kernel memory area (above
> - * PAGE_OFFSET), or device mappings in the vmalloc range (from
> - * VMALLOC_START to VMALLOC_END).
> + * PAGE_OFFSET), or device mappings in the idmap range.
>   *
> - * boot_hyp_pgd should only map two pages for the init code.
> + * boot_hyp_pgd should only map the idmap range, and is only used in
> + * the extended idmap case.
>   */
>  void free_hyp_pgds(void)
>  {
> +	pgd_t *id_pgd;
> +
>  	mutex_lock(&kvm_hyp_pgd_mutex);
>  
> +	id_pgd = boot_hyp_pgd ? boot_hyp_pgd : hyp_pgd;
> +
> +	if (id_pgd)
> +		unmap_hyp_range(id_pgd, io_map_base,
> +				hyp_idmap_start + PAGE_SIZE - io_map_base);
> +
>  	if (boot_hyp_pgd) {
> -		unmap_hyp_range(boot_hyp_pgd, hyp_idmap_start, PAGE_SIZE);
>  		free_pages((unsigned long)boot_hyp_pgd, hyp_pgd_order);
>  		boot_hyp_pgd = NULL;
>  	}
>  
>  	if (hyp_pgd) {
> -		unmap_hyp_range(hyp_pgd, hyp_idmap_start, PAGE_SIZE);
>  		unmap_hyp_range(hyp_pgd, kern_hyp_va(PAGE_OFFSET),
>  				(uintptr_t)high_memory - PAGE_OFFSET);
> -		unmap_hyp_range(hyp_pgd, kern_hyp_va(VMALLOC_START),
> -				VMALLOC_END - VMALLOC_START);
>  
>  		free_pages((unsigned long)hyp_pgd, hyp_pgd_order);
>  		hyp_pgd = NULL;
> @@ -721,7 +728,8 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
>  			   void __iomem **kaddr,
>  			   void __iomem **haddr)
>  {
> -	unsigned long start, end;
> +	pgd_t *pgd = hyp_pgd;
> +	unsigned long base;
>  	int ret;
>  
>  	*kaddr = ioremap(phys_addr, size);
> @@ -733,17 +741,38 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
>  		return 0;
>  	}
>  
> +	mutex_lock(&io_map_lock);
> +
> +	base = io_map_base - size;

Are we guaranteed that hyp_idmap_start (and therefore io_map_base) is
sufficiently greater than 0?  I suppose that even if RAM starts at 0,
and the kernel was loaded at 0, the idmap page for Hyp would be at some
reasonable offset from the start of the kernel image?
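
If we can't rely on that, an explicit check before the subtraction
would at least make the failure mode obvious, e.g. (sketch only, on
top of this hunk):

	/* hypothetical guard against the allocation wrapping below zero */
	if (size > io_map_base) {
		ret = -ENOMEM;
		goto out;
	}

	base = io_map_base - size;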

> +	base &= ~(size - 1);

I'm not sure I understand this line.  Wouldn't it make more sense to use
PAGE_SIZE here?
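
i.e. my naive expectation was something along these lines (sketch):

	base = io_map_base - size;
	base &= PAGE_MASK;	/* only page-align the allocation downwards */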

>  
> -	start = kern_hyp_va((unsigned long)*kaddr);
> -	end = kern_hyp_va((unsigned long)*kaddr + size);
> -	ret = __create_hyp_mappings(hyp_pgd, start, end,
> +	/*
> +	 * Verify that BIT(VA_BITS - 1) hasn't been flipped by
> +	 * allocating the new area, as it would indicate we've
> +	 * overflowed the idmap/IO address range.
> +	 */
> +	if ((base ^ io_map_base) & BIT(VA_BITS - 1)) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +
> +	if (__kvm_cpu_uses_extended_idmap())
> +		pgd = boot_hyp_pgd;
> +

I don't understand why you can change the pgd used for mappings here;
perhaps this is a patch-splitting issue?

> +	ret = __create_hyp_mappings(pgd, base, base + size,
>  				     __phys_to_pfn(phys_addr), PAGE_HYP_DEVICE);
>  
> +	if (!ret) {
> +		*haddr = (void __iomem *)base;
> +		io_map_base = base;
> +	}
> +
> +out:
> +	mutex_unlock(&io_map_lock);
> +
>  	if (ret) {
>  		iounmap(*kaddr);
>  		*kaddr = NULL;
> -	} else {
> -		*haddr = (void __iomem *)start;

ah, it gets reworked here, so never mind the comment on the previous patch.

>  	}
>  
>  	return ret;
> @@ -1826,6 +1855,7 @@ int kvm_mmu_init(void)
>  			goto out;
>  	}
>  
> +	io_map_base = hyp_idmap_start;
>  	return 0;
>  out:
>  	free_hyp_pgds();
> -- 
> 2.14.2
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 15/19] arm64; insn: Add encoder for the EXTR instruction
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-18 20:27     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-18 20:27 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On Thu, Jan 04, 2018 at 06:43:30PM +0000, Marc Zyngier wrote:
> Add an encoder for the EXTR instruction, which also implements the ROR
> variant (where Rn == Rm).
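
(As an aside, for anyone else reading along: the ROR alias is simply
EXTR with both source registers being the same, i.e. "ror xD, xS, #n"
is "extr xD, xS, xS, #n". A sketch of what I'd expect a caller to look
like, with the register numbers picked at random:)

	/* sketch: encode "ror x1, x2, #24" via the EXTR alias */
	u32 insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
					 AARCH64_INSN_REG_2,	/* Rm */
					 AARCH64_INSN_REG_2,	/* Rn */
					 AARCH64_INSN_REG_1,	/* Rd */
					 24);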
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/insn.h |  6 ++++++
>  arch/arm64/kernel/insn.c      | 32 ++++++++++++++++++++++++++++++++
>  2 files changed, 38 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
> index 815b35bc53ed..f62c56b1793f 100644
> --- a/arch/arm64/include/asm/insn.h
> +++ b/arch/arm64/include/asm/insn.h
> @@ -319,6 +319,7 @@ __AARCH64_INSN_FUNCS(and_imm,	0x7F800000, 0x12000000)
>  __AARCH64_INSN_FUNCS(orr_imm,	0x7F800000, 0x32000000)
>  __AARCH64_INSN_FUNCS(eor_imm,	0x7F800000, 0x52000000)
>  __AARCH64_INSN_FUNCS(ands_imm,	0x7F800000, 0x72000000)
> +__AARCH64_INSN_FUNCS(extr,	0x7FA00000, 0x13800000)
>  __AARCH64_INSN_FUNCS(b,		0xFC000000, 0x14000000)
>  __AARCH64_INSN_FUNCS(bl,	0xFC000000, 0x94000000)
>  __AARCH64_INSN_FUNCS(cbz,	0x7F000000, 0x34000000)
> @@ -433,6 +434,11 @@ u32 aarch64_insn_gen_logical_immediate(enum aarch64_insn_logic_type type,
>  				       enum aarch64_insn_register Rn,
>  				       enum aarch64_insn_register Rd,
>  				       u64 imm);
> +u32 aarch64_insn_gen_extr(enum aarch64_insn_variant variant,
> +			  enum aarch64_insn_register Rm,
> +			  enum aarch64_insn_register Rn,
> +			  enum aarch64_insn_register Rd,
> +			  u8 lsb);
>  u32 aarch64_insn_gen_prefetch(enum aarch64_insn_register base,
>  			      enum aarch64_insn_prfm_type type,
>  			      enum aarch64_insn_prfm_target target,
> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
> index 72cb1721c63f..59669d7d4383 100644
> --- a/arch/arm64/kernel/insn.c
> +++ b/arch/arm64/kernel/insn.c
> @@ -1621,3 +1621,35 @@ u32 aarch64_insn_gen_logical_immediate(enum aarch64_insn_logic_type type,
>  	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, Rn);
>  	return aarch64_encode_immediate(imm, variant, insn);
>  }
> +
> +u32 aarch64_insn_gen_extr(enum aarch64_insn_variant variant,
> +			  enum aarch64_insn_register Rm,
> +			  enum aarch64_insn_register Rn,
> +			  enum aarch64_insn_register Rd,
> +			  u8 lsb)
> +{
> +	u32 insn;
> +
> +	insn = aarch64_insn_get_extr_value();
> +
> +	switch (variant) {
> +	case AARCH64_INSN_VARIANT_32BIT:
> +		if (lsb > 31)
> +			return AARCH64_BREAK_FAULT;
> +		break;
> +	case AARCH64_INSN_VARIANT_64BIT:
> +		if (lsb > 63)
> +			return AARCH64_BREAK_FAULT;
> +		insn |= AARCH64_INSN_SF_BIT;
> +		insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_N, insn, 1);
> +		break;
> +	default:
> +		pr_err("%s: unknown variant encoding %d\n", __func__, variant);
> +		return AARCH64_BREAK_FAULT;
> +	}
> +
> +	insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_S, insn, lsb);
> +	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RD, insn, Rd);
> +	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, Rn);
> +	return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RM, insn, Rm);
> +}
> -- 
> 2.14.2
> 
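
As a quick sanity check of the ROR case, I would expect something like
the following (hypothetical usage, not part of this series) to encode
"ror x0, x1, #4" as "extr x0, x1, x1, #4":

	u32 insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
					 AARCH64_INSN_REG_1,	/* Rm */
					 AARCH64_INSN_REG_1,	/* Rn == Rm, so this is a ROR */
					 AARCH64_INSN_REG_0,	/* Rd */
					 4);			/* lsb */

and that is indeed how the later VA randomisation patch seems to use it.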

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 16/19] arm64: insn: Allow ADD/SUB (immediate) with LSL #12
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-18 20:28     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-18 20:28 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On Thu, Jan 04, 2018 at 06:43:31PM +0000, Marc Zyngier wrote:
> The encoder for ADD/SUB (immediate) can only cope with 12bit
> immediates, while there is an encoding for a 12bit immediate shifted
> by 12 bits to the left.
> 
> Let's fix this small oversight by allowing the LSL_12 bit to be set.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kernel/insn.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
> index 59669d7d4383..20655537cdd1 100644
> --- a/arch/arm64/kernel/insn.c
> +++ b/arch/arm64/kernel/insn.c
> @@ -35,6 +35,7 @@
>  
>  #define AARCH64_INSN_SF_BIT	BIT(31)
>  #define AARCH64_INSN_N_BIT	BIT(22)
> +#define AARCH64_INSN_LSL_12	BIT(22)
>  
>  static int aarch64_insn_encoding_class[] = {
>  	AARCH64_INSN_CLS_UNKNOWN,
> @@ -903,9 +904,18 @@ u32 aarch64_insn_gen_add_sub_imm(enum aarch64_insn_register dst,
>  		return AARCH64_BREAK_FAULT;
>  	}
>  
> +	/* We can't encode more than a 24bit value (12bit + 12bit shift) */
> +	if (imm & ~(BIT(24) - 1))
> +		goto out;
> +
> +	/* If we have something in the top 12 bits... */
>  	if (imm & ~(SZ_4K - 1)) {
> -		pr_err("%s: invalid immediate encoding %d\n", __func__, imm);
> -		return AARCH64_BREAK_FAULT;
> +		/* ... and in the low 12 bits -> error */
> +		if (imm & (SZ_4K - 1))
> +			goto out;
> +
> +		imm >>= 12;
> +		insn |= AARCH64_INSN_LSL_12;
>  	}
>  
>  	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RD, insn, dst);
> @@ -913,6 +923,10 @@ u32 aarch64_insn_gen_add_sub_imm(enum aarch64_insn_register dst,
>  	insn = aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RN, insn, src);
>  
>  	return aarch64_insn_encode_immediate(AARCH64_INSN_IMM_12, insn, imm);
> +
> +out:
> +	pr_err("%s: invalid immediate encoding %d\n", __func__, imm);
> +	return AARCH64_BREAK_FAULT;
>  }
>  
>  u32 aarch64_insn_gen_bitfield(enum aarch64_insn_register dst,
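
For my own understanding, with this change the encoder should accept an
immediate such as 0x5000 (encoded as 0x5 with LSL #12) and still reject
something like 0x1001, which has bits set in both 12-bit halves.
Hypothetical examples, not from the patch:

	/* OK: becomes "add x0, x1, #0x5, lsl #12" */
	aarch64_insn_gen_add_sub_imm(AARCH64_INSN_REG_0, AARCH64_INSN_REG_1,
				     0x5000,
				     AARCH64_INSN_VARIANT_64BIT,
				     AARCH64_INSN_ADSB_ADD);

	/* Rejected: returns AARCH64_BREAK_FAULT, as before */
	aarch64_insn_gen_add_sub_imm(AARCH64_INSN_REG_0, AARCH64_INSN_REG_1,
				     0x1001,
				     AARCH64_INSN_VARIANT_64BIT,
				     AARCH64_INSN_ADSB_ADD);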


Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 17/19] arm64: KVM: Dynamically compute the HYP VA mask
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-18 20:28     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-18 20:28 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jan 04, 2018 at 06:43:32PM +0000, Marc Zyngier wrote:
> As we're moving towards a much more dynamic way to compute our
> HYP VA, let's express the mask in a slightly different way.
> 
> Instead of comparing the idmap position to the "low" VA mask,
> we directly compute the mask by taking into account the idmap's
> (VA_BIT-1) bit.
> 
> No functionnal change.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/kvm/va_layout.c | 17 ++++++-----------
>  1 file changed, 6 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
> index aee758574e61..75bb1c6772b0 100644
> --- a/arch/arm64/kvm/va_layout.c
> +++ b/arch/arm64/kvm/va_layout.c
> @@ -21,24 +21,19 @@
>  #include <asm/insn.h>
>  #include <asm/kvm_mmu.h>
>  
> -#define HYP_PAGE_OFFSET_HIGH_MASK	((UL(1) << VA_BITS) - 1)
> -#define HYP_PAGE_OFFSET_LOW_MASK	((UL(1) << (VA_BITS - 1)) - 1)
> -
>  static u64 va_mask;
>  
>  static void compute_layout(void)
>  {
>  	phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
> -	unsigned long mask = HYP_PAGE_OFFSET_HIGH_MASK;
> +	u64 region;

The naming here really confused me.  Would it make sense to call this
'hyp_va_msb' or something like that instead?

>  
> -	/*
> -	 * Activate the lower HYP offset only if the idmap doesn't
> -	 * clash with it,
> -	 */
> -	if (idmap_addr > HYP_PAGE_OFFSET_LOW_MASK)
> -		mask = HYP_PAGE_OFFSET_HIGH_MASK;

Ah, the series was tested; it was just that this code only existed for a
short while.  Amusingly, I think this ephemeral bug goes against the "No
function change" statement in the commit message.

> +	/* Where is my RAM region? */
> +	region  = idmap_addr & BIT(VA_BITS - 1);
> +	region ^= BIT(VA_BITS - 1);
>  
> -	va_mask = mask;
> +	va_mask  = BIT(VA_BITS - 1) - 1;

nit: This could also be written as GENMASK_ULL(VA_BITS - 2, 0) --- and
now I'm not sure which one I prefer.
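
(Both spellings should be strictly the same value; something like

	BUILD_BUG_ON((BIT(VA_BITS - 1) - 1) != GENMASK_ULL(VA_BITS - 2, 0));

ought to hold, so it really is just a readability question.)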

> +	va_mask |= region;
>  }
>  
>  static u32 compute_instruction(int n, u32 rd, u32 rn)
> -- 
> 2.14.2
> 
Otherwise looks good:

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 18/19] arm64: KVM: Introduce EL2 VA randomisation
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-18 20:28     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-18 20:28 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On Thu, Jan 04, 2018 at 06:43:33PM +0000, Marc Zyngier wrote:
> The main idea behind randomising the EL2 VA is that we usually have
> a few spare bits between the most significant bit of the VA mask
> and the most significant bit of the linear mapping.
> 
> Those bits could be a bunch of zeroes, and could be useful
> to move things around a bit. Of course, the more memory you have,
> the less randomisation you get...
> 
> Alternatively, these bits could be the result of KASLR, in which
> case they are already random. But it would be nice to have a
> *different* randomization, just to make the job of a potential
> attacker a bit more difficult.
> 
> Inserting these random bits is a bit involved. We don't have a spare
> register (short of rewriting all the kern_hyp_va call sites), and
> the immediate we want to insert is too random to be used with the
> ORR instruction. The best option I could come up with is the following
> sequence:
> 
> 	and x0, x0, #va_mask

So if I get this right, you want to insert an arbitrary random value
without an extra register in bits [(VA_BITS-1):first_random_bit] and
BIT(VA_BITS-1) is always set in the input because it's a kernel address.

> 	ror x0, x0, #first_random_bit

Then you rotate so that the random bits become the LSBs and the random
value should be inserted into bits [NR_RAND_BITS-1:0] in x0 ?

> 	add x0, x0, #(random & 0xfff)

So you do this via two rounds, first the lower 12 bits

> 	add x0, x0, #(random >> 12), lsl #12

Then the upper 12 bits (permitting a maximum of 24 randomized bits)

> 	ror x0, x0, #(63 - first_random_bit)

And then you rotate things back into their place.

Only, I don't understand why this isn't (64 - first_random_bit), then?
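
For my own sanity, here is the whole sequence written out as plain C (a
sketch only: it assumes the randomised case where tag_lsb != 0, tag_val
already shifted down by tag_lsb as in compute_layout(), and ror64() from
<linux/bitops.h>):

	static unsigned long kern_hyp_va_model(unsigned long va)
	{
		va &= va_mask;			/* and x0, x0, #va_mask */
		va = ror64(va, tag_lsb);	/* ror x0, x0, #tag_lsb */
		va += tag_val & (SZ_4K - 1);	/* add x0, x0, #(tag & 0xfff) */
		va += tag_val & GENMASK(23, 12);/* add x0, x0, #(tag >> 12), lsl #12 */
		return ror64(va, 64 - tag_lsb);	/* ror x0, x0, #(64 - tag_lsb) */
	}

The two rotates add up to a full 64-bit rotation (i.e. the identity),
which is why 64 rather than 63 looks right to me.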

> 
> making it a fairly long sequence, but one that a decent CPU should
> be able to execute without breaking a sweat. It is of course NOPed
> out on VHE. The last 4 instructions can also be turned into NOPs
> if it appears that there is no free bits to use.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  arch/arm64/include/asm/kvm_mmu.h | 10 +++++-
>  arch/arm64/kvm/va_layout.c       | 68 +++++++++++++++++++++++++++++++++++++---
>  virt/kvm/arm/mmu.c               |  2 +-
>  3 files changed, 73 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index cc882e890bb1..4fca6ddadccc 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -85,6 +85,10 @@
>  .macro kern_hyp_va	reg
>  alternative_cb kvm_update_va_mask
>  	and     \reg, \reg, #1
> +	ror	\reg, \reg, #1
> +	add	\reg, \reg, #0
> +	add	\reg, \reg, #0
> +	ror	\reg, \reg, #63
>  alternative_cb_end
>  .endm
>  
> @@ -101,7 +105,11 @@ void kvm_update_va_mask(struct alt_instr *alt,
>  
>  static inline unsigned long __kern_hyp_va(unsigned long v)
>  {
> -	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n",
> +	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n"
> +				    "ror %0, %0, #1\n"
> +				    "add %0, %0, #0\n"
> +				    "add %0, %0, #0\n"
> +				    "ror %0, %0, #63\n",

This now sort of serves as the documentation if you don't have the
commit message, so I think you should annotate each line like the commit
message does.

Alternatively, since you're duplicating a bunch of code which will be
replaced at runtime anyway, you could make all of these "and %0, %0, #1"
and then copy the documentation assembly code as a comment to
compute_instruction() and put a comment reference here.

>  				    kvm_update_va_mask)
>  		     : "+r" (v));
>  	return v;
> diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
> index 75bb1c6772b0..bf0d6bdf5f14 100644
> --- a/arch/arm64/kvm/va_layout.c
> +++ b/arch/arm64/kvm/va_layout.c
> @@ -16,11 +16,15 @@
>   */
>  
>  #include <linux/kvm_host.h>
> +#include <linux/random.h>
> +#include <linux/memblock.h>
>  #include <asm/alternative.h>
>  #include <asm/debug-monitors.h>
>  #include <asm/insn.h>
>  #include <asm/kvm_mmu.h>
>  

It would be nice to have a comment on these, something like:

/* The LSB of the random hyp VA tag or 0 if no randomization is used. */
> +static u8 tag_lsb;
/* The random hyp VA tag value with the region bit, if hyp randomization is used */
> +static u64 tag_val;


>  static u64 va_mask;
>  
>  static void compute_layout(void)
> @@ -32,8 +36,31 @@ static void compute_layout(void)
>  	region  = idmap_addr & BIT(VA_BITS - 1);
>  	region ^= BIT(VA_BITS - 1);
>  
> -	va_mask  = BIT(VA_BITS - 1) - 1;
> -	va_mask |= region;
> +	tag_lsb = fls64((u64)phys_to_virt(memblock_start_of_DRAM()) ^
> +			(u64)(high_memory - 1));
> +
> +	if (tag_lsb == (VA_BITS - 1)) {
> +		/*
> +		 * No space in the address, let's compute the mask so
> +		 * that it covers (VA_BITS - 1) bits, and the region
> +		 * bit. The tag is set to zero.
> +		 */
> +		tag_lsb = tag_val = 0;

tag_val should already be 0, right?

and wouldn't it be slightly nicer to have a temporary variable, called
something like linear_bits, and only set tag_lsb when needed?

> +		va_mask  = BIT(VA_BITS - 1) - 1;
> +		va_mask |= region;
> +	} else {
> +		/*
> +		 * We do have some free bits. Let's have the mask to
> +		 * cover the low bits of the VA, and the tag to
> +		 * contain the random stuff plus the region bit.
> +		 */

Since you have two masks below, this comment is a bit hard to parse; how
about explaining what makes up a Hyp address from a kernel linear
address instead, something like:

		/*
		 * We do have some free bits to insert a random tag.
		 * Hyp VAs are now created from kernel linear map VAs
		 * using the following formula (with V == VA_BITS):
		 *
		 *  63 ... V |   V-1  | V-2 ... tag_lsb | tag_lsb - 1 ... 0
		 *  -------------------------------------------------------
		 * | 0000000 | region |    random tag   |  kern linear VA  |
		 */

(assuming I got this vaguely correct).

> +		u64 mask = GENMASK_ULL(VA_BITS - 2, tag_lsb);

For consistency it would then be nicer to use GENMASK_ULL(VA_BITS - 2, 0)
above, as suggested in the other patch.  And we could also call this
tag_mask to be super explicit.

> +
> +		va_mask = BIT(tag_lsb) - 1;

and here, GENMASK_ULL(tag_lsb - 1, 0).

> +		tag_val  = get_random_long() & mask;
> +		tag_val |= region;

it's actually unclear to me why you need the region bit included in
tag_val?

> +		tag_val >>= tag_lsb;
> +	}
>  }
>  
>  static u32 compute_instruction(int n, u32 rd, u32 rn)
> @@ -46,6 +73,33 @@ static u32 compute_instruction(int n, u32 rd, u32 rn)
>  							  AARCH64_INSN_VARIANT_64BIT,
>  							  rn, rd, va_mask);
>  		break;
> +
> +	case 1:
> +		/* ROR is a variant of EXTR with Rm = Rn */
> +		insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
> +					     rn, rn, rd,
> +					     tag_lsb);
> +		break;
> +
> +	case 2:
> +		insn = aarch64_insn_gen_add_sub_imm(rd, rn,
> +						    tag_val & (SZ_4K - 1),
> +						    AARCH64_INSN_VARIANT_64BIT,
> +						    AARCH64_INSN_ADSB_ADD);
> +		break;
> +
> +	case 3:
> +		insn = aarch64_insn_gen_add_sub_imm(rd, rn,
> +						    tag_val & GENMASK(23, 12),
> +						    AARCH64_INSN_VARIANT_64BIT,
> +						    AARCH64_INSN_ADSB_ADD);
> +		break;
> +
> +	case 4:
> +		/* ROR is a variant of EXTR with Rm = Rn */
> +		insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
> +					     rn, rn, rd, 64 - tag_lsb);

Ah, you do use 64 - first_rand in the code.  Well, I approve of this
line of code then.

> +		break;
>  	}
>  
>  	return insn;
> @@ -56,8 +110,8 @@ void __init kvm_update_va_mask(struct alt_instr *alt,
>  {
>  	int i;
>  
> -	/* We only expect a 1 instruction sequence */
> -	BUG_ON(nr_inst != 1);
> +	/* We only expect a 5 instruction sequence */

Still sounds strange to me; I think we should just drop the comment if
we keep the BUG_ON.

> +	BUG_ON(nr_inst != 5);
>  
>  	if (!has_vhe() && !va_mask)
>  		compute_layout();
> @@ -68,8 +122,12 @@ void __init kvm_update_va_mask(struct alt_instr *alt,
>  		/*
>  		 * VHE doesn't need any address translation, let's NOP
>  		 * everything.
> +		 *
> +		 * Alternatively, if we don't have any spare bits in
> +		 * the address, NOP everything after masking tha

s/tha/the/

> +		 * kernel VA.
>  		 */
> -		if (has_vhe()) {
> +		if (has_vhe() || (!tag_lsb && i > 1)) {
>  			updptr[i] = aarch64_insn_gen_nop();
>  			continue;
>  		}
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 14c5e5534f2f..d01c7111b1f7 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1811,7 +1811,7 @@ int kvm_mmu_init(void)
>  		  kern_hyp_va((unsigned long)high_memory - 1));
>  
>  	if (hyp_idmap_start >= kern_hyp_va(PAGE_OFFSET) &&
> -	    hyp_idmap_start <  kern_hyp_va(~0UL) &&
> +	    hyp_idmap_start <  kern_hyp_va((unsigned long)high_memory - 1) &&

Is this actually required for this patch or are we just trying to be
nice?

I'm actually not sure I remember what this is about beyond the
VA=idmap-for-everything case on 32-bit; I thought we chose the hyp
address space exactly so that it wouldn't overlap with the idmap?

>  	    hyp_idmap_start != (unsigned long)__hyp_idmap_text_start) {
>  		/*
>  		 * The idmap page is intersecting with the VA space,
> -- 
> 2.14.2
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 19/19] arm64: Update the KVM memory map documentation
  2018-01-04 18:43   ` Marc Zyngier
@ 2018-01-18 20:28     ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-01-18 20:28 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kvm, Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On Thu, Jan 04, 2018 at 06:43:34PM +0000, Marc Zyngier wrote:
> Update the documentation to reflect the new tricks we play on the
> EL2 mappings...
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

> ---
>  Documentation/arm64/memory.txt | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/arm64/memory.txt b/Documentation/arm64/memory.txt
> index 671bc0639262..ea64e20037f6 100644
> --- a/Documentation/arm64/memory.txt
> +++ b/Documentation/arm64/memory.txt
> @@ -86,9 +86,11 @@ Translation table lookup with 64KB pages:
>   +-------------------------------------------------> [63] TTBR0/1
>  
>  
> -When using KVM without the Virtualization Host Extensions, the hypervisor
> -maps kernel pages in EL2 at a fixed offset from the kernel VA. See the
> -kern_hyp_va macro for more details.
> +When using KVM without the Virtualization Host Extensions, the
> +hypervisor maps kernel pages in EL2 at a fixed offset (modulo a random
> +offset) from the linear mapping. See the kern_hyp_va macro and
> +kvm_update_va_mask function for more details. MMIO devices such as
> +GICv2 gets mapped next to the HYP idmap page.
>  
>  When using KVM with the Virtualization Host Extensions, no additional
>  mappings are created, since the host kernel runs directly in EL2.
> -- 
> 2.14.2
> 

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 08/19] arm64: KVM: Dynamically patch the kernel/hyp VA mask
  2018-01-15 11:47     ` Christoffer Dall
@ 2018-02-15 13:11       ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-02-15 13:11 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On 15/01/18 11:47, Christoffer Dall wrote:
> On Thu, Jan 04, 2018 at 06:43:23PM +0000, Marc Zyngier wrote:
>> So far, we're using a complicated sequence of alternatives to
>> patch the kernel/hyp VA mask on non-VHE, and NOP out the
>> masking altogether when on VHE.
>>
>> THe newly introduced dynamic patching gives us the opportunity
> 
> nit: s/THe/The/
> 
>> to simplify that code by patching a single instruction with
>> the correct mask (instead of the mind bending cummulative masking
>> we have at the moment) or even a single NOP on VHE.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm64/include/asm/kvm_mmu.h | 45 ++++++--------------
>>  arch/arm64/kvm/Makefile          |  2 +-
>>  arch/arm64/kvm/va_layout.c       | 91 ++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 104 insertions(+), 34 deletions(-)
>>  create mode 100644 arch/arm64/kvm/va_layout.c
>>
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> index 672c8684d5c2..b0c3cbe9b513 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -69,9 +69,6 @@
>>   * mappings, and none of this applies in that case.
>>   */
>>  
>> -#define HYP_PAGE_OFFSET_HIGH_MASK	((UL(1) << VA_BITS) - 1)
>> -#define HYP_PAGE_OFFSET_LOW_MASK	((UL(1) << (VA_BITS - 1)) - 1)
>> -
>>  #ifdef __ASSEMBLY__
>>  
>>  #include <asm/alternative.h>
>> @@ -81,28 +78,14 @@
>>   * Convert a kernel VA into a HYP VA.
>>   * reg: VA to be converted.
>>   *
>> - * This generates the following sequences:
>> - * - High mask:
>> - *		and x0, x0, #HYP_PAGE_OFFSET_HIGH_MASK
>> - *		nop
>> - * - Low mask:
>> - *		and x0, x0, #HYP_PAGE_OFFSET_HIGH_MASK
>> - *		and x0, x0, #HYP_PAGE_OFFSET_LOW_MASK
>> - * - VHE:
>> - *		nop
>> - *		nop
>> - *
>> - * The "low mask" version works because the mask is a strict subset of
>> - * the "high mask", hence performing the first mask for nothing.
>> - * Should be completely invisible on any viable CPU.
>> + * The actual code generation takes place in kvm_update_va_mask, and
>> + * the instructions below are only there to reserve the space and
>> + * perform the register allocation.
> 
> Not just register allocation, but also to tell the generating code which
> registers to use for its operands, right?

That's what I meant by register allocation.

> 
>>   */
>>  .macro kern_hyp_va	reg
>> -alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
>> -	and     \reg, \reg, #HYP_PAGE_OFFSET_HIGH_MASK
>> -alternative_else_nop_endif
>> -alternative_if ARM64_HYP_OFFSET_LOW
>> -	and     \reg, \reg, #HYP_PAGE_OFFSET_LOW_MASK
>> -alternative_else_nop_endif
>> +alternative_cb kvm_update_va_mask
>> +	and     \reg, \reg, #1
>> +alternative_cb_end
>>  .endm
>>  
>>  #else
>> @@ -113,18 +96,14 @@ alternative_else_nop_endif
>>  #include <asm/mmu_context.h>
>>  #include <asm/pgtable.h>
>>  
>> +void kvm_update_va_mask(struct alt_instr *alt,
>> +			__le32 *origptr, __le32 *updptr, int nr_inst);
>> +
>>  static inline unsigned long __kern_hyp_va(unsigned long v)
>>  {
>> -	asm volatile(ALTERNATIVE("and %0, %0, %1",
>> -				 "nop",
>> -				 ARM64_HAS_VIRT_HOST_EXTN)
>> -		     : "+r" (v)
>> -		     : "i" (HYP_PAGE_OFFSET_HIGH_MASK));
>> -	asm volatile(ALTERNATIVE("nop",
>> -				 "and %0, %0, %1",
>> -				 ARM64_HYP_OFFSET_LOW)
>> -		     : "+r" (v)
>> -		     : "i" (HYP_PAGE_OFFSET_LOW_MASK));
>> +	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n",
>> +				    kvm_update_va_mask)
>> +		     : "+r" (v));
>>  	return v;
>>  }
>>  
>> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
>> index 87c4f7ae24de..93afff91cb7c 100644
>> --- a/arch/arm64/kvm/Makefile
>> +++ b/arch/arm64/kvm/Makefile
>> @@ -16,7 +16,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/e
>>  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
>>  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
>>  
>> -kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o
>> +kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
>>  kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
>>  kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o sys_regs_generic_v8.o
>>  kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o
>> diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
>> new file mode 100644
>> index 000000000000..aee758574e61
>> --- /dev/null
>> +++ b/arch/arm64/kvm/va_layout.c
>> @@ -0,0 +1,91 @@
>> +/*
>> + * Copyright (C) 2017 ARM Ltd.
>> + * Author: Marc Zyngier <marc.zyngier@arm.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <linux/kvm_host.h>
>> +#include <asm/alternative.h>
>> +#include <asm/debug-monitors.h>
>> +#include <asm/insn.h>
>> +#include <asm/kvm_mmu.h>
>> +
>> +#define HYP_PAGE_OFFSET_HIGH_MASK	((UL(1) << VA_BITS) - 1)
>> +#define HYP_PAGE_OFFSET_LOW_MASK	((UL(1) << (VA_BITS - 1)) - 1)
>> +
>> +static u64 va_mask;
>> +
>> +static void compute_layout(void)
>> +{
>> +	phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
>> +	unsigned long mask = HYP_PAGE_OFFSET_HIGH_MASK;
>> +
>> +	/*
>> +	 * Activate the lower HYP offset only if the idmap doesn't
>> +	 * clash with it,
>> +	 */
> 
> The commentary is a bit strange given the logic below...
> 
>> +	if (idmap_addr > HYP_PAGE_OFFSET_LOW_MASK)
>> +		mask = HYP_PAGE_OFFSET_HIGH_MASK;
> 
> ... was the initialization supposed to be LOW_MASK?
> 
> (and does this imply that this was never tested on a system that
> actually used the low mask?)

I must have messed up in a later refactoring, as that code gets
replaced pretty quickly in the series. The full series was definitely
tested on Seattle with 39bit VAs, which is the only configuration I have
that triggers the low mask.

> 
>> +
>> +	va_mask = mask;
>> +}
>> +
>> +static u32 compute_instruction(int n, u32 rd, u32 rn)
>> +{
>> +	u32 insn = AARCH64_BREAK_FAULT;
>> +
>> +	switch (n) {
>> +	case 0:
> 
> hmmm, wonder why we need this n==0 check...

Makes patch splitting a bit easier. I can rework that if that helps.

> 
>> +		insn = aarch64_insn_gen_logical_immediate(AARCH64_INSN_LOGIC_AND,
>> +							  AARCH64_INSN_VARIANT_64BIT,
>> +							  rn, rd, va_mask);
>> +		break;
>> +	}
>> +
>> +	return insn;
>> +}
>> +
>> +void __init kvm_update_va_mask(struct alt_instr *alt,
>> +			       __le32 *origptr, __le32 *updptr, int nr_inst)
>> +{
>> +	int i;
>> +
>> +	/* We only expect a 1 instruction sequence */
> 
> nit: wording is a bit strange, how about
> "We only expect a single instruction in the alternative sequence"

Sure.

> 
>> +	BUG_ON(nr_inst != 1);
>> +
>> +	if (!has_vhe() && !va_mask)
>> +		compute_layout();
>> +
>> +	for (i = 0; i < nr_inst; i++) {
> 
> It's a bit funny to have a loop with the above BUG_ON.
> 
> (I'm beginning to wonder if a future patch expands on this single
> instruction thing, in which case a hint in the commit message would have
> been nice.)

That's indeed what is happening. A further patch expands the single
instruction to a 5-instruction sequence. I'll add a comment to that effect.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 10/19] KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_state
  2018-01-15 15:36     ` Christoffer Dall
@ 2018-02-15 13:22       ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-02-15 13:22 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On 15/01/18 15:36, Christoffer Dall wrote:
> On Thu, Jan 04, 2018 at 06:43:25PM +0000, Marc Zyngier wrote:
>> kvm_vgic_global_state is part of the read-only section, and is
>> usually accessed using a PC-relative address generation (adrp + add).
>>
>> It is thus useless to use kern_hyp_va() on it, and actively problematic
>> if kern_hyp_va() becomes non-idempotent. On the other hand, there is
>> no way that the compiler is going to guarantee that such access is
>> always be PC relative.
> 
> nit: is always be
> 
>>
>> So let's bite the bullet and provide our own accessor.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm/include/asm/kvm_hyp.h   | 6 ++++++
>>  arch/arm64/include/asm/kvm_hyp.h | 9 +++++++++
>>  virt/kvm/arm/hyp/vgic-v2-sr.c    | 4 ++--
>>  3 files changed, 17 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
>> index ab20ffa8b9e7..1d42d0aa2feb 100644
>> --- a/arch/arm/include/asm/kvm_hyp.h
>> +++ b/arch/arm/include/asm/kvm_hyp.h
>> @@ -26,6 +26,12 @@
>>  
>>  #define __hyp_text __section(.hyp.text) notrace
>>  
>> +#define hyp_symbol_addr(s)						\
>> +	({								\
>> +		typeof(s) *addr = &(s);					\
>> +		addr;							\
>> +	})
>> +
>>  #define __ACCESS_VFP(CRn)			\
>>  	"mrc", "mcr", __stringify(p10, 7, %0, CRn, cr0, 0), u32
>>  
>> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
>> index 08d3bb66c8b7..a2d98c539023 100644
>> --- a/arch/arm64/include/asm/kvm_hyp.h
>> +++ b/arch/arm64/include/asm/kvm_hyp.h
>> @@ -25,6 +25,15 @@
>>  
>>  #define __hyp_text __section(.hyp.text) notrace
>>  
>> +#define hyp_symbol_addr(s)						\
>> +	({								\
>> +		typeof(s) *addr;					\
>> +		asm volatile("adrp	%0, %1\n"			\
>> +			     "add	%0, %0, :lo12:%1\n"		\
>> +			     : "=r" (addr) : "S" (&s));			\
> 
> Can't we use adr_l here?

Unfortunately not. All the asm/assembler.h macros are unavailable to
inline assembly. We could start introducing equivalent macros for that
purpose, but that's starting to be outside of the scope of this series.

> 
>> +		addr;							\
>> +	})
>> +
> 
> I don't fully appreciate the semantics of this macro going by its name
> only.  My understanding is that if you want to resolve a symbol to an
> address which is mapped in hyp, then use this.  Is this correct?

The goal of this macro is to return a symbol's address based on a
PC-relative computation, as opposed to loading the VA from a constant
pool or something similar. This works well for HYP, as an absolute VA is
guaranteed to be wrong.
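
To make that concrete, here is a minimal sketch (illustration only, not
part of the patch) reusing the kvm_vgic_global_state example from the
hunk below:

/*
 * Taking the address directly leaves the compiler free to load an
 * absolute kernel VA from a constant pool, which is meaningless at EL2.
 * hyp_symbol_addr() forces the adrp/add PC-relative form, so the result
 * is valid wherever the code actually runs.
 */
static int __hyp_text example_nr_lr(void)
{
	return hyp_symbol_addr(kvm_vgic_global_state)->nr_lr;
}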

> 
> If so, can we add a small comment (because I can't come up with a better
> name).

I'll add the above if that works for you.

> 
> 
>>  #define read_sysreg_elx(r,nvh,vh)					\
>>  	({								\
>>  		u64 reg;						\
>> diff --git a/virt/kvm/arm/hyp/vgic-v2-sr.c b/virt/kvm/arm/hyp/vgic-v2-sr.c
>> index d7fd46fe9efb..4573d0552af3 100644
>> --- a/virt/kvm/arm/hyp/vgic-v2-sr.c
>> +++ b/virt/kvm/arm/hyp/vgic-v2-sr.c
>> @@ -25,7 +25,7 @@
>>  static void __hyp_text save_elrsr(struct kvm_vcpu *vcpu, void __iomem *base)
>>  {
>>  	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
>> -	int nr_lr = (kern_hyp_va(&kvm_vgic_global_state))->nr_lr;
>> +	int nr_lr = hyp_symbol_addr(kvm_vgic_global_state)->nr_lr;
>>  	u32 elrsr0, elrsr1;
>>  
>>  	elrsr0 = readl_relaxed(base + GICH_ELRSR0);
>> @@ -139,7 +139,7 @@ int __hyp_text __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu)
>>  		return -1;
>>  
>>  	rd = kvm_vcpu_dabt_get_rd(vcpu);
>> -	addr  = kern_hyp_va((kern_hyp_va(&kvm_vgic_global_state))->vcpu_base_va);
>> +	addr  = kern_hyp_va(hyp_symbol_addr(kvm_vgic_global_state)->vcpu_base_va);
>>  	addr += fault_ipa - vgic->vgic_cpu_base;
>>  
>>  	if (kvm_vcpu_dabt_iswrite(vcpu)) {
>> -- 
>> 2.14.2
>>
> Otherwise looks good.
> 
> Thanks,
> -Christoffer
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 14/19] KVM: arm/arm64: Move HYP IO VAs to the "idmap" range
  2018-01-18 14:39     ` Christoffer Dall
@ 2018-02-15 13:52       ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-02-15 13:52 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On 18/01/18 14:39, Christoffer Dall wrote:
> On Thu, Jan 04, 2018 at 06:43:29PM +0000, Marc Zyngier wrote:
>> We so far mapped our HYP IO (which is essencially the GICv2 control
>> registers) using the same method as for memory. It recently appeared
>> that is a bit unsafe:
>>
>> We compute the HYP VA using the kern_hyp_va helper, but that helper
>> is only designed to deal with kernel VAs coming from the linear map,
>> and not from the vmalloc region... This could in turn cause some bad
>> aliasing between the two, amplified by the upcoming VA randomisation.
>>
>> A solution is to come up with our very own basic VA allocator for
>> MMIO. Since half of the HYP address space only contains a single
>> page (the idmap), we have plenty to borrow from. Let's use the idmap
>> as a base, and allocate downwards from it. GICv2 now lives on the
>> other side of the great VA barrier.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  virt/kvm/arm/mmu.c | 56 +++++++++++++++++++++++++++++++++++++++++-------------
>>  1 file changed, 43 insertions(+), 13 deletions(-)
>>
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 6192d45d1e1a..14c5e5534f2f 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -43,6 +43,9 @@ static unsigned long hyp_idmap_start;
>>  static unsigned long hyp_idmap_end;
>>  static phys_addr_t hyp_idmap_vector;
>>  
>> +static DEFINE_MUTEX(io_map_lock);
>> +static unsigned long io_map_base;
>> +
>>  #define S2_PGD_SIZE	(PTRS_PER_S2_PGD * sizeof(pgd_t))
>>  #define hyp_pgd_order get_order(PTRS_PER_PGD * sizeof(pgd_t))
>>  
>> @@ -502,27 +505,31 @@ static void unmap_hyp_range(pgd_t *pgdp, phys_addr_t start, u64 size)
>>   *
>>   * Assumes hyp_pgd is a page table used strictly in Hyp-mode and
>>   * therefore contains either mappings in the kernel memory area (above
>> - * PAGE_OFFSET), or device mappings in the vmalloc range (from
>> - * VMALLOC_START to VMALLOC_END).
>> + * PAGE_OFFSET), or device mappings in the idmap range.
>>   *
>> - * boot_hyp_pgd should only map two pages for the init code.
>> + * boot_hyp_pgd should only map the idmap range, and is only used in
>> + * the extended idmap case.
>>   */
>>  void free_hyp_pgds(void)
>>  {
>> +	pgd_t *id_pgd;
>> +
>>  	mutex_lock(&kvm_hyp_pgd_mutex);
>>  
>> +	id_pgd = boot_hyp_pgd ? boot_hyp_pgd : hyp_pgd;
>> +
>> +	if (id_pgd)
>> +		unmap_hyp_range(id_pgd, io_map_base,
>> +				hyp_idmap_start + PAGE_SIZE - io_map_base);
>> +
>>  	if (boot_hyp_pgd) {
>> -		unmap_hyp_range(boot_hyp_pgd, hyp_idmap_start, PAGE_SIZE);
>>  		free_pages((unsigned long)boot_hyp_pgd, hyp_pgd_order);
>>  		boot_hyp_pgd = NULL;
>>  	}
>>  
>>  	if (hyp_pgd) {
>> -		unmap_hyp_range(hyp_pgd, hyp_idmap_start, PAGE_SIZE);
>>  		unmap_hyp_range(hyp_pgd, kern_hyp_va(PAGE_OFFSET),
>>  				(uintptr_t)high_memory - PAGE_OFFSET);
>> -		unmap_hyp_range(hyp_pgd, kern_hyp_va(VMALLOC_START),
>> -				VMALLOC_END - VMALLOC_START);
>>  
>>  		free_pages((unsigned long)hyp_pgd, hyp_pgd_order);
>>  		hyp_pgd = NULL;
>> @@ -721,7 +728,8 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
>>  			   void __iomem **kaddr,
>>  			   void __iomem **haddr)
>>  {
>> -	unsigned long start, end;
>> +	pgd_t *pgd = hyp_pgd;
>> +	unsigned long base;
>>  	int ret;
>>  
>>  	*kaddr = ioremap(phys_addr, size);
>> @@ -733,17 +741,38 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
>>  		return 0;
>>  	}
>>  
>> +	mutex_lock(&io_map_lock);
>> +
>> +	base = io_map_base - size;
> 
> are we guaranteed that hyp_idmap_start (and therefore io_map_base) is
> sufficiently greater than 0 ?  I suppose that even if RAM starts at 0,
> and the kernel was loaded at 0, the idmap page for Hyp would be at some
> reasonable offset from the start of the kernel image?

On my kernel image:
ffff000008080000 t _head
ffff000008cc6000 T __hyp_idmap_text_start
ffff000009aaa000 B swapper_pg_end

__hyp_idmap_text_start is about 12MB from the beginning of the image, and
about 14MB from the end. Yes, it is a big kernel. But we're only mapping
a few pages there, even with my upcoming crazy vector remapping crap. So
the likelihood of this failing is close to zero.

Now, close to zero is not necessarily close enough. What I could do is
to switch the allocator around on failure, so that if we can't allocate
on one side, we can at least try to allocate on the other side. I'm
pretty sure we'll never trigger that code, but I can implement it if you
think that's worth it.

> 
>> +	base &= ~(size - 1);
> 
> I'm not sure I understand this line.  Wouldn't it make more sense to use
> PAGE_SIZE here?

This is trying to align the base of the allocation to its natural size
(an 8kB allocation on an 8kB boundary, for example), which is what other
allocators in the kernel do. I've now added a roundup_pow_of_two(size)
so that we're guaranteed to always deal with power-of-two sizes.
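
In other words, something along these lines -- just a sketch of the
allocation step with the extra rounding, assuming the usual
roundup_pow_of_two()/ALIGN_DOWN() helpers, not the exact hunk:

	size = roundup_pow_of_two(size);   /* only deal with power-of-two sizes */
	base = io_map_base - size;         /* allocate downwards from the idmap */
	base = ALIGN_DOWN(base, size);     /* natural alignment: base &= ~(size - 1) */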

> 
>>  
>> -	start = kern_hyp_va((unsigned long)*kaddr);
>> -	end = kern_hyp_va((unsigned long)*kaddr + size);
>> -	ret = __create_hyp_mappings(hyp_pgd, start, end,
>> +	/*
>> +	 * Verify that BIT(VA_BITS - 1) hasn't been flipped by
>> +	 * allocating the new area, as it would indicate we've
>> +	 * overflowed the idmap/IO address range.
>> +	 */
>> +	if ((base ^ io_map_base) & BIT(VA_BITS - 1)) {
>> +		ret = -ENOMEM;
>> +		goto out;
>> +	}
>> +
>> +	if (__kvm_cpu_uses_extended_idmap())
>> +		pgd = boot_hyp_pgd;
>> +
> 
> I don't understand why you can change the pgd used for mappings here,
> perhaps a patch splitting issue?

No, that's absolutely required. If you use an extended idmap, you cannot
reach the idmap from the main PGD. You can only reach it either from the
merged_hyp_pgd (which is not usable because it has an extra level), or
from the idmap_pgd itself. See for example how kvm_map_idmap_text() is
used at boot time.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 17/19] arm64: KVM: Dynamically compute the HYP VA mask
  2018-01-18 20:28     ` Christoffer Dall
@ 2018-02-15 13:58       ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-02-15 13:58 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On 18/01/18 20:28, Christoffer Dall wrote:
> On Thu, Jan 04, 2018 at 06:43:32PM +0000, Marc Zyngier wrote:
>> As we're moving towards a much more dynamic way to compute our
>> HYP VA, let's express the mask in a slightly different way.
>>
>> Instead of comparing the idmap position to the "low" VA mask,
>> we directly compute the mask by taking into account the idmap's
>> (VA_BIT-1) bit.
>>
>> No functionnal change.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm64/kvm/va_layout.c | 17 ++++++-----------
>>  1 file changed, 6 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
>> index aee758574e61..75bb1c6772b0 100644
>> --- a/arch/arm64/kvm/va_layout.c
>> +++ b/arch/arm64/kvm/va_layout.c
>> @@ -21,24 +21,19 @@
>>  #include <asm/insn.h>
>>  #include <asm/kvm_mmu.h>
>>  
>> -#define HYP_PAGE_OFFSET_HIGH_MASK	((UL(1) << VA_BITS) - 1)
>> -#define HYP_PAGE_OFFSET_LOW_MASK	((UL(1) << (VA_BITS - 1)) - 1)
>> -
>>  static u64 va_mask;
>>  
>>  static void compute_layout(void)
>>  {
>>  	phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
>> -	unsigned long mask = HYP_PAGE_OFFSET_HIGH_MASK;
>> +	u64 region;
> 
> the naming here really confused me.  Would it make sense to call this
> 'hyp_va_msb' or something like that instead?
> 
>>  
>> -	/*
>> -	 * Activate the lower HYP offset only if the idmap doesn't
>> -	 * clash with it,
>> -	 */
>> -	if (idmap_addr > HYP_PAGE_OFFSET_LOW_MASK)
>> -		mask = HYP_PAGE_OFFSET_HIGH_MASK;
> 
> Ah, the series was tested, it was just that this code only existed for a
> short while.  Amusingly, I think this ephemeral bug goes against the "No
> function change" statement in the commit message.
> 
>> +	/* Where is my RAM region? */
>> +	region  = idmap_addr & BIT(VA_BITS - 1);
>> +	region ^= BIT(VA_BITS - 1);
>>  
>> -	va_mask = mask;
>> +	va_mask  = BIT(VA_BITS - 1) - 1;
> 
> nit: This could also be written as GENMASK_ULL(VA_BITS - 2, 0) --- and
> now I'm not sure which one I prefer.

Good point. I think GENMASK makes it clearer what the intent is, and
assigning a mask to a mask has a certain degree of consistency (/me fondly
remembers dimensional analysis...).
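
For the record, the two spellings produce the same constant; with
VA_BITS == 48, for instance:

	BIT(VA_BITS - 1) - 1        == 0x00007fffffffffff
	GENMASK_ULL(VA_BITS - 2, 0) == 0x00007fffffffffff

i.e. bits [46:0] set and bit 47 (the region bit) clear.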

> 
>> +	va_mask |= region;
>>  }
>>  
>>  static u32 compute_instruction(int n, u32 rd, u32 rn)
>> -- 
>> 2.14.2
>>
> Otherwise looks good:
> 
> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 18/19] arm64: KVM: Introduce EL2 VA randomisation
  2018-01-18 20:28     ` Christoffer Dall
@ 2018-02-15 15:32       ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-02-15 15:32 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: kvm, Catalin Marinas, Will Deacon, kvmarm, linux-arm-kernel

On 18/01/18 20:28, Christoffer Dall wrote:
> On Thu, Jan 04, 2018 at 06:43:33PM +0000, Marc Zyngier wrote:
>> The main idea behind randomising the EL2 VA is that we usually have
>> a few spare bits between the most significant bit of the VA mask
>> and the most significant bit of the linear mapping.
>>
>> Those bits could be a bunch of zeroes, and could be useful
>> to move things around a bit. Of course, the more memory you have,
>> the less randomisation you get...
>>
>> Alternatively, these bits could be the result of KASLR, in which
>> case they are already random. But it would be nice to have a
>> *different* randomization, just to make the job of a potential
>> attacker a bit more difficult.
>>
>> Inserting these random bits is a bit involved. We don't have a spare
>> register (short of rewriting all the kern_hyp_va call sites), and
>> the immediate we want to insert is too random to be used with the
>> ORR instruction. The best option I could come up with is the following
>> sequence:
>>
>> 	and x0, x0, #va_mask
> 
> So if I get this right, you want to insert an arbitrary random value
> without an extra register in bits [(VA_BITS-1):first_random_bit] and
> BIT(VA_BITS-1) is always set in the input because it's a kernel address.

Correct.

> 
>> 	ror x0, x0, #first_random_bit
> 
> Then you rotate so that the random bits become the LSBs and the random
> value should be inserted into bits [NR_RAND_BITS-1:0] in x0 ?

Correct again. The important thing to notice is that the bottom bits are
guaranteed to be zero, making sure that the subsequent adds act as ors.

> 
>> 	add x0, x0, #(random & 0xfff)
> 
> So you do this via two rounds, first the lower 12 bits
> 
>> 	add x0, x0, #(random >> 12), lsl #12
> 
> Then the upper 12 bits (permitting a maximum of 24 randomized bits)

Still correct. It is debatable whether allowing more than 12 bits is
really useful, as only platforms with very little memory will be able to
reach past 12 bits of entropy.
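
As a rough illustration (the numbers are made up for the example, not a
claim about any particular board): with VA_BITS == 48 and a linear map
spanning about 16GB, the start and end VAs first differ around bit 33,
fls64() returns 34, and the random tag covers bits [46:34] -- about 13
bits of entropy. With 1TB of RAM the same sum leaves only 6 or 7 bits.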

> 
>> 	ror x0, x0, #(63 - first_random_bit)
> 
> And then you rotate things back into their place.
> 
> Only, I don't understand why this isn't then (64 - first_random_bit) ?

That looks like a typo.
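
For reference, here is a rough, standalone C model of what the five
patched instructions end up doing (using the corrected
64 - first_random_bit rotation); the names are made up for the sketch,
and it assumes the randomised case, i.e. 0 < tag_lsb < 64:

#include <stdint.h>

/*
 * Model only, not kernel code: va_mask keeps bits [tag_lsb-1:0] of the
 * kernel VA, and tag_val is (random tag | region bit) >> tag_lsb, which
 * must fit in 24 bits to match the two 12-bit ADD immediates.
 */
static uint64_t model_kern_hyp_va(uint64_t va, uint64_t va_mask,
				  uint64_t tag_val, unsigned int tag_lsb)
{
	uint64_t v = va & va_mask;			/* and x0, x0, #va_mask */

	v = (v >> tag_lsb) | (v << (64 - tag_lsb));	/* ror x0, x0, #tag_lsb */
	v += tag_val & 0xfff;				/* add x0, x0, #(tag & 0xfff) */
	v += tag_val & 0xfff000;			/* add x0, x0, #(tag >> 12), lsl #12 */
	return (v << tag_lsb) | (v >> (64 - tag_lsb));	/* ror x0, x0, #(64 - tag_lsb) */
}

The low bits that were rotated out of the way come back untouched, and
the tag (region bit included) lands in bits [VA_BITS-1:tag_lsb]; the
ADDs really do behave as ORs because the bits they target are zero.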

> 
>>
>> making it a fairly long sequence, but one that a decent CPU should
>> be able to execute without breaking a sweat. It is of course NOPed
>> out on VHE. The last 4 instructions can also be turned into NOPs
>> if it appears that there is no free bits to use.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  arch/arm64/include/asm/kvm_mmu.h | 10 +++++-
>>  arch/arm64/kvm/va_layout.c       | 68 +++++++++++++++++++++++++++++++++++++---
>>  virt/kvm/arm/mmu.c               |  2 +-
>>  3 files changed, 73 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> index cc882e890bb1..4fca6ddadccc 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -85,6 +85,10 @@
>>  .macro kern_hyp_va	reg
>>  alternative_cb kvm_update_va_mask
>>  	and     \reg, \reg, #1
>> +	ror	\reg, \reg, #1
>> +	add	\reg, \reg, #0
>> +	add	\reg, \reg, #0
>> +	ror	\reg, \reg, #63
>>  alternative_cb_end
>>  .endm
>>  
>> @@ -101,7 +105,11 @@ void kvm_update_va_mask(struct alt_instr *alt,
>>  
>>  static inline unsigned long __kern_hyp_va(unsigned long v)
>>  {
>> -	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n",
>> +	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n"
>> +				    "ror %0, %0, #1\n"
>> +				    "add %0, %0, #0\n"
>> +				    "add %0, %0, #0\n"
>> +				    "ror %0, %0, #63\n",
> 
> This now sort of serves as the documentation if you don't have the
> commit message, so I think you should annotate each line like the commit
> message does.
> 
> Alternative, since you're duplicating a bunch of code which will be
> replaced at runtime anyway, you could make all of these "and %0, %0, #1"
> and then copy the documentation assembly code as a comment to
> compute_instruction() and put a comment reference here.

I found that adding something that looks a bit like the generated code
helps a lot. I'll add some documentation there.

> 
>>  				    kvm_update_va_mask)
>>  		     : "+r" (v));
>>  	return v;
>> diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
>> index 75bb1c6772b0..bf0d6bdf5f14 100644
>> --- a/arch/arm64/kvm/va_layout.c
>> +++ b/arch/arm64/kvm/va_layout.c
>> @@ -16,11 +16,15 @@
>>   */
>>  
>>  #include <linux/kvm_host.h>
>> +#include <linux/random.h>
>> +#include <linux/memblock.h>
>>  #include <asm/alternative.h>
>>  #include <asm/debug-monitors.h>
>>  #include <asm/insn.h>
>>  #include <asm/kvm_mmu.h>
>>  
> 
> It would be nice to have a comment on these, something like:
> 
> /* The LSB of the random hyp VA tag or 0 if no randomization is used. */
>> +static u8 tag_lsb;
> /* The random hyp VA tag value with the region bit, if hyp randomization is used */
>> +static u64 tag_val;

Sure.

> 
> 
>>  static u64 va_mask;
>>  
>>  static void compute_layout(void)
>> @@ -32,8 +36,31 @@ static void compute_layout(void)
>>  	region  = idmap_addr & BIT(VA_BITS - 1);
>>  	region ^= BIT(VA_BITS - 1);
>>  
>> -	va_mask  = BIT(VA_BITS - 1) - 1;
>> -	va_mask |= region;
>> +	tag_lsb = fls64((u64)phys_to_virt(memblock_start_of_DRAM()) ^
>> +			(u64)(high_memory - 1));
>> +
>> +	if (tag_lsb == (VA_BITS - 1)) {
>> +		/*
>> +		 * No space in the address, let's compute the mask so
>> +		 * that it covers (VA_BITS - 1) bits, and the region
>> +		 * bit. The tag is set to zero.
>> +		 */
>> +		tag_lsb = tag_val = 0;
> 
> tag_val should already be 0, right?
> 
> and wouldn't it be slightly nicer to have a temporary variable and only
> set tag_lsb when needed, called something like linear_bits ?

OK.

> 
>> +		va_mask  = BIT(VA_BITS - 1) - 1;
>> +		va_mask |= region;
>> +	} else {
>> +		/*
>> +		 * We do have some free bits. Let's have the mask to
>> +		 * cover the low bits of the VA, and the tag to
>> +		 * contain the random stuff plus the region bit.
>> +		 */
> 
> Since you have two masks below this comment is a bit hard to parse, how
> about explaining what makes up a Hyp address from a kernel linear
> address instead, something like:
> 
> 		/*
> 		 * We do have some free bits to insert a random tag.
> 		 * Hyp VAs are now created from kernel linear map VAs
> 		 * using the following formula (with V == VA_BITS):
> 		 *
> 		 *  63 ... V |   V-1  | V-2 ... tag_lsb | tag_lsb - 1 ... 0
> 		 *  -------------------------------------------------------
> 		 * | 0000000 | region |    random tag   |  kern linear VA  |
> 		 */
> 
> (assuming I got this vaguely correct).

/me copy-pastes...

> 
>> +		u64 mask = GENMASK_ULL(VA_BITS - 2, tag_lsb);
> 
> for consistency it would be nicer to use GENMASK_ULL(VA_BITS - 2, 0)
> above as suggested in the other patch then.  And we could also call this
> tag_mask to be super explicit.
> 
>> +
>> +		va_mask = BIT(tag_lsb) - 1;
> 
> and here, GENMASK_ULL(tag_lsb - 1, 0).

Yup.

> 
>> +		tag_val  = get_random_long() & mask;
>> +		tag_val |= region;
> 
> it's actually unclear to me why you need the region bit included in
> tag_val?

Because the initial masking strips it from the VA, and we need to add it
back. Storing it as part of the tag makes it easy to ORR in.
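
Concretely (illustrative numbers only): with VA_BITS == 48 and
tag_lsb == 40, the initial AND clears bits [63:40], region bit 47
included; tag_val then carries that bit at position 7 (47 - 40), and
once the ADDs and the final ROR have done their job it sits back at
bit 47.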

> 
>> +		tag_val >>= tag_lsb;
>> +	}
>>  }
>>  
>>  static u32 compute_instruction(int n, u32 rd, u32 rn)
>> @@ -46,6 +73,33 @@ static u32 compute_instruction(int n, u32 rd, u32 rn)
>>  							  AARCH64_INSN_VARIANT_64BIT,
>>  							  rn, rd, va_mask);
>>  		break;
>> +
>> +	case 1:
>> +		/* ROR is a variant of EXTR with Rm = Rn */
>> +		insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
>> +					     rn, rn, rd,
>> +					     tag_lsb);
>> +		break;
>> +
>> +	case 2:
>> +		insn = aarch64_insn_gen_add_sub_imm(rd, rn,
>> +						    tag_val & (SZ_4K - 1),
>> +						    AARCH64_INSN_VARIANT_64BIT,
>> +						    AARCH64_INSN_ADSB_ADD);
>> +		break;
>> +
>> +	case 3:
>> +		insn = aarch64_insn_gen_add_sub_imm(rd, rn,
>> +						    tag_val & GENMASK(23, 12),
>> +						    AARCH64_INSN_VARIANT_64BIT,
>> +						    AARCH64_INSN_ADSB_ADD);
>> +		break;
>> +
>> +	case 4:
>> +		/* ROR is a variant of EXTR with Rm = Rn */
>> +		insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
>> +					     rn, rn, rd, 64 - tag_lsb);
> 
> Ah, you do use 64 - first_rand in the code.  Well, I approve of this
> line of code then.
> 
>> +		break;
>>  	}
>>  
>>  	return insn;
>> @@ -56,8 +110,8 @@ void __init kvm_update_va_mask(struct alt_instr *alt,
>>  {
>>  	int i;
>>  
>> -	/* We only expect a 1 instruction sequence */
>> -	BUG_ON(nr_inst != 1);
>> +	/* We only expect a 5 instruction sequence */
> 
> Still sounds strange to me, just drop the comment I think if we keep the
> BUG_ON.

Sure.

> 
>> +	BUG_ON(nr_inst != 5);
>>  
>>  	if (!has_vhe() && !va_mask)
>>  		compute_layout();
>> @@ -68,8 +122,12 @@ void __init kvm_update_va_mask(struct alt_instr *alt,
>>  		/*
>>  		 * VHE doesn't need any address translation, let's NOP
>>  		 * everything.
>> +		 *
>> +		 * Alternatively, if we don't have any spare bits in
>> +		 * the address, NOP everything after masking tha
> 
> s/tha/the/
> 
>> +		 * kernel VA.
>>  		 */
>> -		if (has_vhe()) {
>> +		if (has_vhe() || (!tag_lsb && i > 1)) {
>>  			updptr[i] = aarch64_insn_gen_nop();
>>  			continue;
>>  		}
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 14c5e5534f2f..d01c7111b1f7 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1811,7 +1811,7 @@ int kvm_mmu_init(void)
>>  		  kern_hyp_va((unsigned long)high_memory - 1));
>>  
>>  	if (hyp_idmap_start >= kern_hyp_va(PAGE_OFFSET) &&
>> -	    hyp_idmap_start <  kern_hyp_va(~0UL) &&
>> +	    hyp_idmap_start <  kern_hyp_va((unsigned long)high_memory - 1) &&
> 
> Is this actually required for this patch or are we just trying to be
> nice?

You really need something like that. Remember that we compute the tag
based on the available memory, so something that goes beyond that is not
a valid input to kern_hyp_va anymore.
> 
> I'm actually not sure I remember what this is about beyond the VA=idmap
> for everything on 32-bit case; I thought we chose the hyp address space
> exactly so that it wouldn't overlap with the idmap?

This is just a sanity check that kern_hyp_va returns the right thing
with respect to the idmap (i.e. we cannot hit the idmap by feeding
something to the macro). Is that clear enough (I'm not sure it is...)?

> 
>>  	    hyp_idmap_start != (unsigned long)__hyp_idmap_text_start) {
>>  		/*
>>  		 * The idmap page is intersecting with the VA space,
>> -- 
>> 2.14.2
>>
> 
> Thanks,
> -Christoffer
> 

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 08/19] arm64: KVM: Dynamically patch the kernel/hyp VA mask
  2018-02-15 13:11       ` Marc Zyngier
@ 2018-02-16  9:02         ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-02-16  9:02 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On Thu, Feb 15, 2018 at 01:11:02PM +0000, Marc Zyngier wrote:
> On 15/01/18 11:47, Christoffer Dall wrote:
> > On Thu, Jan 04, 2018 at 06:43:23PM +0000, Marc Zyngier wrote:
> >> So far, we're using a complicated sequence of alternatives to
> >> patch the kernel/hyp VA mask on non-VHE, and NOP out the
> >> masking altogether when on VHE.
> >>
> >> THe newly introduced dynamic patching gives us the opportunity
> > 
> > nit: s/THe/The/
> > 
> >> to simplify that code by patching a single instruction with
> >> the correct mask (instead of the mind bending cummulative masking
> >> we have at the moment) or even a single NOP on VHE.
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  arch/arm64/include/asm/kvm_mmu.h | 45 ++++++--------------
> >>  arch/arm64/kvm/Makefile          |  2 +-
> >>  arch/arm64/kvm/va_layout.c       | 91 ++++++++++++++++++++++++++++++++++++++++
> >>  3 files changed, 104 insertions(+), 34 deletions(-)
> >>  create mode 100644 arch/arm64/kvm/va_layout.c
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> >> index 672c8684d5c2..b0c3cbe9b513 100644
> >> --- a/arch/arm64/include/asm/kvm_mmu.h
> >> +++ b/arch/arm64/include/asm/kvm_mmu.h
> >> @@ -69,9 +69,6 @@
> >>   * mappings, and none of this applies in that case.
> >>   */
> >>  
> >> -#define HYP_PAGE_OFFSET_HIGH_MASK	((UL(1) << VA_BITS) - 1)
> >> -#define HYP_PAGE_OFFSET_LOW_MASK	((UL(1) << (VA_BITS - 1)) - 1)
> >> -
> >>  #ifdef __ASSEMBLY__
> >>  
> >>  #include <asm/alternative.h>
> >> @@ -81,28 +78,14 @@
> >>   * Convert a kernel VA into a HYP VA.
> >>   * reg: VA to be converted.
> >>   *
> >> - * This generates the following sequences:
> >> - * - High mask:
> >> - *		and x0, x0, #HYP_PAGE_OFFSET_HIGH_MASK
> >> - *		nop
> >> - * - Low mask:
> >> - *		and x0, x0, #HYP_PAGE_OFFSET_HIGH_MASK
> >> - *		and x0, x0, #HYP_PAGE_OFFSET_LOW_MASK
> >> - * - VHE:
> >> - *		nop
> >> - *		nop
> >> - *
> >> - * The "low mask" version works because the mask is a strict subset of
> >> - * the "high mask", hence performing the first mask for nothing.
> >> - * Should be completely invisible on any viable CPU.
> >> + * The actual code generation takes place in kvm_update_va_mask, and
> >> + * the instructions below are only there to reserve the space and
> >> + * perform the register allocation.
> > 
> > Not just register allocation, but also to tell the generating code which
> > registers to use for its operands, right?
> 
> That's what I meant by register allocation.
> 

I suppose that's included in the term.  My confusion was that I
initially looked at this like the clobber list in inline asm, but then
realized that you really use the specific registers for each instruction
in the order listed here.
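
In other words, writing kern_hyp_va x3 makes the assembler emit the
placeholder "and x3, x3, #1", and at patch time the callback picks
Rd/Rn (both x3) out of that placeholder and re-encodes the real
sequence on the same register.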

> > 
> >>   */
> >>  .macro kern_hyp_va	reg
> >> -alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
> >> -	and     \reg, \reg, #HYP_PAGE_OFFSET_HIGH_MASK
> >> -alternative_else_nop_endif
> >> -alternative_if ARM64_HYP_OFFSET_LOW
> >> -	and     \reg, \reg, #HYP_PAGE_OFFSET_LOW_MASK
> >> -alternative_else_nop_endif
> >> +alternative_cb kvm_update_va_mask
> >> +	and     \reg, \reg, #1
> >> +alternative_cb_end
> >>  .endm
> >>  
> >>  #else
> >> @@ -113,18 +96,14 @@ alternative_else_nop_endif
> >>  #include <asm/mmu_context.h>
> >>  #include <asm/pgtable.h>
> >>  
> >> +void kvm_update_va_mask(struct alt_instr *alt,
> >> +			__le32 *origptr, __le32 *updptr, int nr_inst);
> >> +
> >>  static inline unsigned long __kern_hyp_va(unsigned long v)
> >>  {
> >> -	asm volatile(ALTERNATIVE("and %0, %0, %1",
> >> -				 "nop",
> >> -				 ARM64_HAS_VIRT_HOST_EXTN)
> >> -		     : "+r" (v)
> >> -		     : "i" (HYP_PAGE_OFFSET_HIGH_MASK));
> >> -	asm volatile(ALTERNATIVE("nop",
> >> -				 "and %0, %0, %1",
> >> -				 ARM64_HYP_OFFSET_LOW)
> >> -		     : "+r" (v)
> >> -		     : "i" (HYP_PAGE_OFFSET_LOW_MASK));
> >> +	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n",
> >> +				    kvm_update_va_mask)
> >> +		     : "+r" (v));
> >>  	return v;
> >>  }
> >>  
> >> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> >> index 87c4f7ae24de..93afff91cb7c 100644
> >> --- a/arch/arm64/kvm/Makefile
> >> +++ b/arch/arm64/kvm/Makefile
> >> @@ -16,7 +16,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/e
> >>  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arm.o $(KVM)/arm/mmu.o $(KVM)/arm/mmio.o
> >>  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o $(KVM)/arm/perf.o
> >>  
> >> -kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o
> >> +kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
> >>  kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
> >>  kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o sys_regs_generic_v8.o
> >>  kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o
> >> diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
> >> new file mode 100644
> >> index 000000000000..aee758574e61
> >> --- /dev/null
> >> +++ b/arch/arm64/kvm/va_layout.c
> >> @@ -0,0 +1,91 @@
> >> +/*
> >> + * Copyright (C) 2017 ARM Ltd.
> >> + * Author: Marc Zyngier <marc.zyngier@arm.com>
> >> + *
> >> + * This program is free software; you can redistribute it and/or modify
> >> + * it under the terms of the GNU General Public License version 2 as
> >> + * published by the Free Software Foundation.
> >> + *
> >> + * This program is distributed in the hope that it will be useful,
> >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >> + * GNU General Public License for more details.
> >> + *
> >> + * You should have received a copy of the GNU General Public License
> >> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> >> + */
> >> +
> >> +#include <linux/kvm_host.h>
> >> +#include <asm/alternative.h>
> >> +#include <asm/debug-monitors.h>
> >> +#include <asm/insn.h>
> >> +#include <asm/kvm_mmu.h>
> >> +
> >> +#define HYP_PAGE_OFFSET_HIGH_MASK	((UL(1) << VA_BITS) - 1)
> >> +#define HYP_PAGE_OFFSET_LOW_MASK	((UL(1) << (VA_BITS - 1)) - 1)
> >> +
> >> +static u64 va_mask;
> >> +
> >> +static void compute_layout(void)
> >> +{
> >> +	phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
> >> +	unsigned long mask = HYP_PAGE_OFFSET_HIGH_MASK;
> >> +
> >> +	/*
> >> +	 * Activate the lower HYP offset only if the idmap doesn't
> >> +	 * clash with it,
> >> +	 */
> > 
> > The commentary is a bit strange given the logic below...
> > 
> >> +	if (idmap_addr > HYP_PAGE_OFFSET_LOW_MASK)
> >> +		mask = HYP_PAGE_OFFSET_HIGH_MASK;
> > 
> > ... was the initialization supposed to be LOW_MASK?
> > 
> > (and does this imply that this was never tested on a system that
> > actually used the low mask?)
> 
> I must have messed up in a later refactoring, as that code gets
> replaced pretty quickly in the series. The full series was definitely
> tested on Seattle with 39bit VAs, which is the only configuration I have
> that triggers the low mask.
> 

I realized later that this was just a temporary thing in the patch
series. Still, for posterity, we should probably fix it up.

> > 
> >> +
> >> +	va_mask = mask;
> >> +}
> >> +
> >> +static u32 compute_instruction(int n, u32 rd, u32 rn)
> >> +{
> >> +	u32 insn = AARCH64_BREAK_FAULT;
> >> +
> >> +	switch (n) {
> >> +	case 0:
> > 
> > hmmm, wonder why we need this n==0 check...
> 
> Makes patch splitting a bit easier. I can rework that if that helps.
> 

No need, I get it now.

> > 
> >> +		insn = aarch64_insn_gen_logical_immediate(AARCH64_INSN_LOGIC_AND,
> >> +							  AARCH64_INSN_VARIANT_64BIT,
> >> +							  rn, rd, va_mask);
> >> +		break;
> >> +	}
> >> +
> >> +	return insn;
> >> +}
> >> +
> >> +void __init kvm_update_va_mask(struct alt_instr *alt,
> >> +			       __le32 *origptr, __le32 *updptr, int nr_inst)
> >> +{
> >> +	int i;
> >> +
> >> +	/* We only expect a 1 instruction sequence */
> > 
> > nit: wording is a bit strange, how about
> > "We only expect a single instruction in the alternative sequence"
> 
> Sure.
> 
> > 
> >> +	BUG_ON(nr_inst != 1);
> >> +
> >> +	if (!has_vhe() && !va_mask)
> >> +		compute_layout();
> >> +
> >> +	for (i = 0; i < nr_inst; i++) {
> > 
> > It's a bit funny to have a loop with the above BUG_ON.
> > 
> > (I'm beginning to wonder if a future patch expands on this single
> > instruction thing, in which case a hint in the commit message would have
> > been nice.)
> 
> That's indeed what is happening. A further patch expands the single
> instruction to a 5 instruction sequence. I'll add a comment to that effect.
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 10/19] KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_state
  2018-02-15 13:22       ` Marc Zyngier
@ 2018-02-16  9:05         ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-02-16  9:05 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On Thu, Feb 15, 2018 at 01:22:56PM +0000, Marc Zyngier wrote:
> On 15/01/18 15:36, Christoffer Dall wrote:
> > On Thu, Jan 04, 2018 at 06:43:25PM +0000, Marc Zyngier wrote:
> >> kvm_vgic_global_state is part of the read-only section, and is
> >> usually accessed using a PC-relative address generation (adrp + add).
> >>
> >> It is thus useless to use kern_hyp_va() on it, and actively problematic
> >> if kern_hyp_va() becomes non-idempotent. On the other hand, there is
> >> no way that the compiler is going to guarantee that such access is
> >> always be PC relative.
> > 
> > nit: is always be
> > 
> >>
> >> So let's bite the bullet and provide our own accessor.
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  arch/arm/include/asm/kvm_hyp.h   | 6 ++++++
> >>  arch/arm64/include/asm/kvm_hyp.h | 9 +++++++++
> >>  virt/kvm/arm/hyp/vgic-v2-sr.c    | 4 ++--
> >>  3 files changed, 17 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
> >> index ab20ffa8b9e7..1d42d0aa2feb 100644
> >> --- a/arch/arm/include/asm/kvm_hyp.h
> >> +++ b/arch/arm/include/asm/kvm_hyp.h
> >> @@ -26,6 +26,12 @@
> >>  
> >>  #define __hyp_text __section(.hyp.text) notrace
> >>  
> >> +#define hyp_symbol_addr(s)						\
> >> +	({								\
> >> +		typeof(s) *addr = &(s);					\
> >> +		addr;							\
> >> +	})
> >> +
> >>  #define __ACCESS_VFP(CRn)			\
> >>  	"mrc", "mcr", __stringify(p10, 7, %0, CRn, cr0, 0), u32
> >>  
> >> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> >> index 08d3bb66c8b7..a2d98c539023 100644
> >> --- a/arch/arm64/include/asm/kvm_hyp.h
> >> +++ b/arch/arm64/include/asm/kvm_hyp.h
> >> @@ -25,6 +25,15 @@
> >>  
> >>  #define __hyp_text __section(.hyp.text) notrace
> >>  
> >> +#define hyp_symbol_addr(s)						\
> >> +	({								\
> >> +		typeof(s) *addr;					\
> >> +		asm volatile("adrp	%0, %1\n"			\
> >> +			     "add	%0, %0, :lo12:%1\n"		\
> >> +			     : "=r" (addr) : "S" (&s));			\
> > 
> > Can't we use adr_l here?
> 
> Unfortunately not. All the asm/assembler.h macros are unavailable to
> inline assembly. We could start introducing equivalent macros for that
> purpose, but that's starting to be outside of the scope of this series.
> 

Absolutely.  Forget I asked.

> > 
> >> +		addr;							\
> >> +	})
> >> +
> > 
> > I don't fully appreciate the semantics of this macro going by its name
> > only.  My understanding is that if you want to resolve a symbol to an
> > address which is mapped in hyp, then use this.  Is this correct?
> 
> The goal of this macro is to return a symbol's address based on a
> PC-relative computation, as opposed to loading the VA from a constant
> pool or something similar. This works well for HYP, as an absolute VA is
> guaranteed to be wrong.
> 
> > 
> > If so, can we add a small comment (because I can't come up with a better
> > name).
> 
> I'll add the above if that works for you.
> 

Yes it does.  The only thing that remains a bit unclear is what the
difference between this and kern_hyp_va is, and when you'd choose to use
one over the other.  Perhaps we need a single place which documents our
primitives and tells us what to use when.  At least, I'm for sure not
going to be able to figure this out later on.


Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 14/19] KVM: arm/arm64: Move HYP IO VAs to the "idmap" range
  2018-02-15 13:52       ` Marc Zyngier
@ 2018-02-16  9:25         ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-02-16  9:25 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On Thu, Feb 15, 2018 at 01:52:05PM +0000, Marc Zyngier wrote:
> On 18/01/18 14:39, Christoffer Dall wrote:
> > On Thu, Jan 04, 2018 at 06:43:29PM +0000, Marc Zyngier wrote:
> >> We so far mapped our HYP IO (which is essencially the GICv2 control
> >> registers) using the same method as for memory. It recently appeared
> >> that is a bit unsafe:
> >>
> >> We compute the HYP VA using the kern_hyp_va helper, but that helper
> >> is only designed to deal with kernel VAs coming from the linear map,
> >> and not from the vmalloc region... This could in turn cause some bad
> >> aliasing between the two, amplified by the upcoming VA randomisation.
> >>
> >> A solution is to come up with our very own basic VA allocator for
> >> MMIO. Since half of the HYP address space only contains a single
> >> page (the idmap), we have plenty to borrow from. Let's use the idmap
> >> as a base, and allocate downwards from it. GICv2 now lives on the
> >> other side of the great VA barrier.
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  virt/kvm/arm/mmu.c | 56 +++++++++++++++++++++++++++++++++++++++++-------------
> >>  1 file changed, 43 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> >> index 6192d45d1e1a..14c5e5534f2f 100644
> >> --- a/virt/kvm/arm/mmu.c
> >> +++ b/virt/kvm/arm/mmu.c
> >> @@ -43,6 +43,9 @@ static unsigned long hyp_idmap_start;
> >>  static unsigned long hyp_idmap_end;
> >>  static phys_addr_t hyp_idmap_vector;
> >>  
> >> +static DEFINE_MUTEX(io_map_lock);
> >> +static unsigned long io_map_base;
> >> +
> >>  #define S2_PGD_SIZE	(PTRS_PER_S2_PGD * sizeof(pgd_t))
> >>  #define hyp_pgd_order get_order(PTRS_PER_PGD * sizeof(pgd_t))
> >>  
> >> @@ -502,27 +505,31 @@ static void unmap_hyp_range(pgd_t *pgdp, phys_addr_t start, u64 size)
> >>   *
> >>   * Assumes hyp_pgd is a page table used strictly in Hyp-mode and
> >>   * therefore contains either mappings in the kernel memory area (above
> >> - * PAGE_OFFSET), or device mappings in the vmalloc range (from
> >> - * VMALLOC_START to VMALLOC_END).
> >> + * PAGE_OFFSET), or device mappings in the idmap range.
> >>   *
> >> - * boot_hyp_pgd should only map two pages for the init code.
> >> + * boot_hyp_pgd should only map the idmap range, and is only used in
> >> + * the extended idmap case.
> >>   */
> >>  void free_hyp_pgds(void)
> >>  {
> >> +	pgd_t *id_pgd;
> >> +
> >>  	mutex_lock(&kvm_hyp_pgd_mutex);
> >>  
> >> +	id_pgd = boot_hyp_pgd ? boot_hyp_pgd : hyp_pgd;
> >> +
> >> +	if (id_pgd)
> >> +		unmap_hyp_range(id_pgd, io_map_base,
> >> +				hyp_idmap_start + PAGE_SIZE - io_map_base);
> >> +
> >>  	if (boot_hyp_pgd) {
> >> -		unmap_hyp_range(boot_hyp_pgd, hyp_idmap_start, PAGE_SIZE);
> >>  		free_pages((unsigned long)boot_hyp_pgd, hyp_pgd_order);
> >>  		boot_hyp_pgd = NULL;
> >>  	}
> >>  
> >>  	if (hyp_pgd) {
> >> -		unmap_hyp_range(hyp_pgd, hyp_idmap_start, PAGE_SIZE);
> >>  		unmap_hyp_range(hyp_pgd, kern_hyp_va(PAGE_OFFSET),
> >>  				(uintptr_t)high_memory - PAGE_OFFSET);
> >> -		unmap_hyp_range(hyp_pgd, kern_hyp_va(VMALLOC_START),
> >> -				VMALLOC_END - VMALLOC_START);
> >>  
> >>  		free_pages((unsigned long)hyp_pgd, hyp_pgd_order);
> >>  		hyp_pgd = NULL;
> >> @@ -721,7 +728,8 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
> >>  			   void __iomem **kaddr,
> >>  			   void __iomem **haddr)
> >>  {
> >> -	unsigned long start, end;
> >> +	pgd_t *pgd = hyp_pgd;
> >> +	unsigned long base;
> >>  	int ret;
> >>  
> >>  	*kaddr = ioremap(phys_addr, size);
> >> @@ -733,17 +741,38 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
> >>  		return 0;
> >>  	}
> >>  
> >> +	mutex_lock(&io_map_lock);
> >> +
> >> +	base = io_map_base - size;
> > 
> > are we guaranteed that hyp_idmap_start (and therefore io_map_base) is
> > sufficiently greater than 0 ?  I suppose that even if RAM starts at 0,
> > and the kernel was loaded at 0, the idmap page for Hyp would be at some
> > reasonable offset from the start of the kernel image?
> 
> On my kernel image:
> ffff000008080000 t _head
> ffff000008cc6000 T __hyp_idmap_text_start
> ffff000009aaa000 B swapper_pg_end
> 
> _hyp_idmap_text_start is about 12MB from the beginning of the image, and
> about 14MB from the end. Yes, it is a big kernel. But we're only mapping
> a few pages there, even with my upcoming crazy vector remapping crap. So
> the likeliness of this failing is close to zero.
> 
> Now, close to zero is not necessarily close enough. What I could do is
> to switch the allocator around on failure, so that if we can't allocate
> on one side, we can at least try to allocate on the other side. I'm
> pretty sure we'll never trigger that code, but I can implement it if you
> think that's worth it.
> 

I don't think we should necessarily implement it; my main concern is
that, when reading the code, the reader has to convince himself/herself
that this is always safe (and not just very likely to be safe).  I'm fine
with adding a comment that explains this instead of implementing
complicated logic though.  What do you think?

> > 
> >> +	base &= ~(size - 1);
> > 
> > I'm not sure I understand this line.  Wouldn't it make more sense to use
> > PAGE_SIZE here?
> 
> This is trying to align the base of the allocation to its natural size
> (8kB aligned on an 8kB boundary, for example), which is what other
> allocators in the kernel do. I've now added a roundup_pow_of_two(size)
> so that we're guaranteed to deal with those.
> 

Ah right, it's just that I wasn't thinking of the size as always being
page aligned, so couldn't you end up with two 4K allocations on a 64K
page system here, where one would overwrite the other?

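(For reference, the allocation step with the roundup_pow_of_two() that
Marc mentions would presumably end up looking something like the sketch
below; whether the size also needs to be pushed up to at least PAGE_SIZE
first is exactly the question above. The helper name is made up, and
this is not the final code from the series.)

	/* roundup_pow_of_two() comes from <linux/log2.h>. */
	static unsigned long io_va_alloc_below(unsigned long top, size_t size)
	{
		unsigned long base;

		size = roundup_pow_of_two(size);
		base = top - size;
		base &= ~(size - 1);	/* natural alignment of the allocation */

		return base;
	}
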
Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 18/19] arm64: KVM: Introduce EL2 VA randomisation
  2018-02-15 15:32       ` Marc Zyngier
@ 2018-02-16  9:33         ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-02-16  9:33 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On Thu, Feb 15, 2018 at 03:32:52PM +0000, Marc Zyngier wrote:
> On 18/01/18 20:28, Christoffer Dall wrote:
> > On Thu, Jan 04, 2018 at 06:43:33PM +0000, Marc Zyngier wrote:
> >> The main idea behind randomising the EL2 VA is that we usually have
> >> a few spare bits between the most significant bit of the VA mask
> >> and the most significant bit of the linear mapping.
> >>
> >> Those bits could be a bunch of zeroes, and could be useful
> >> to move things around a bit. Of course, the more memory you have,
> >> the less randomisation you get...
> >>
> >> Alternatively, these bits could be the result of KASLR, in which
> >> case they are already random. But it would be nice to have a
> >> *different* randomization, just to make the job of a potential
> >> attacker a bit more difficult.
> >>
> >> Inserting these random bits is a bit involved. We don't have a spare
> >> register (short of rewriting all the kern_hyp_va call sites), and
> >> the immediate we want to insert is too random to be used with the
> >> ORR instruction. The best option I could come up with is the following
> >> sequence:
> >>
> >> 	and x0, x0, #va_mask
> > 
> > So if I get this right, you want to insert an arbitrary random value
> > without an extra register in bits [(VA_BITS-1):first_random_bit] and
> > BIT(VA_BITS-1) is always set in the input because it's a kernel address.
> 
> Correct.
> 
> > 
> >> 	ror x0, x0, #first_random_bit
> > 
> > Then you rotate so that the random bits become the LSBs and the random
> > value should be inserted into bits [NR_RAND_BITS-1:0] in x0 ?
> 
> Correct again. The important thing to notice is that the bottom bits are
> guaranteed to be zero, making sure that the subsequent adds act as ors.
> 
> > 
> >> 	add x0, x0, #(random & 0xfff)
> > 
> > So you do this via two rounds, first the lower 12 bits
> > 
> >> 	add x0, x0, #(random >> 12), lsl #12
> > 
> > Then the upper 12 bits (permitting a maximum of 24 randomized bits)
> 
> Still correct. It is debatable whether allowing more than 12 bits is
> really useful, as only platforms with very little memory will be able to
> reach past 12 bits of entropy.
> 
> > 
> >> 	ror x0, x0, #(63 - first_random_bit)
> > 
> > And then you rotate things back into their place.
> > 
> > Only, I don't understand why this isn't then (64 - first_random_bit) ?
> 
> That looks like a typo.
> 
> > 
> >>
> >> making it a fairly long sequence, but one that a decent CPU should
> >> be able to execute without breaking a sweat. It is of course NOPed
> >> out on VHE. The last 4 instructions can also be turned into NOPs
> >> if it appears that there is no free bits to use.
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  arch/arm64/include/asm/kvm_mmu.h | 10 +++++-
> >>  arch/arm64/kvm/va_layout.c       | 68 +++++++++++++++++++++++++++++++++++++---
> >>  virt/kvm/arm/mmu.c               |  2 +-
> >>  3 files changed, 73 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> >> index cc882e890bb1..4fca6ddadccc 100644
> >> --- a/arch/arm64/include/asm/kvm_mmu.h
> >> +++ b/arch/arm64/include/asm/kvm_mmu.h
> >> @@ -85,6 +85,10 @@
> >>  .macro kern_hyp_va	reg
> >>  alternative_cb kvm_update_va_mask
> >>  	and     \reg, \reg, #1
> >> +	ror	\reg, \reg, #1
> >> +	add	\reg, \reg, #0
> >> +	add	\reg, \reg, #0
> >> +	ror	\reg, \reg, #63
> >>  alternative_cb_end
> >>  .endm
> >>  
> >> @@ -101,7 +105,11 @@ void kvm_update_va_mask(struct alt_instr *alt,
> >>  
> >>  static inline unsigned long __kern_hyp_va(unsigned long v)
> >>  {
> >> -	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n",
> >> +	asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n"
> >> +				    "ror %0, %0, #1\n"
> >> +				    "add %0, %0, #0\n"
> >> +				    "add %0, %0, #0\n"
> >> +				    "ror %0, %0, #63\n",
> > 
> > This now sort of serves as the documentation if you don't have the
> > commit message, so I think you should annotate each line like the commit
> > message does.
> > 
> > Alternatively, since you're duplicating a bunch of code which will be
> > replaced at runtime anyway, you could make all of these "and %0, %0, #1"
> > and then copy the documentation assembly code as a comment to
> > compute_instruction() and put a comment reference here.
> 
> I found that adding something that looks a bit like the generated code
> helps a lot. I'll add some documentation there.
> 

The annotation in the commit message was quite nice.

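(For reference, the net effect of the patched 5-instruction sequence can
be modelled in standalone C roughly as below. This is purely
illustrative: tag_val is assumed to have already been shifted down by
tag_lsb, as compute_layout() does, and the model is only meaningful for
a non-zero tag_lsb, since the last four instructions get NOPed out
otherwise.)

	#include <stdint.h>

	static uint64_t hyp_va_model(uint64_t va, uint64_t va_mask,
				     uint64_t tag_val, unsigned int tag_lsb)
	{
		uint64_t v = va & va_mask;			/* and x0, x0, #va_mask */

		v  = (v >> tag_lsb) | (v << (64 - tag_lsb));	/* ror x0, x0, #tag_lsb */
		v += tag_val & 0xfff;				/* add x0, x0, #(tag & 0xfff) */
		v += tag_val & 0xfff000;			/* add x0, x0, #(tag >> 12), lsl #12 */
		v  = (v >> (64 - tag_lsb)) | (v << tag_lsb);	/* ror x0, x0, #(64 - tag_lsb) */

		return v;
	}
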
> > 
> >>  				    kvm_update_va_mask)
> >>  		     : "+r" (v));
> >>  	return v;
> >> diff --git a/arch/arm64/kvm/va_layout.c b/arch/arm64/kvm/va_layout.c
> >> index 75bb1c6772b0..bf0d6bdf5f14 100644
> >> --- a/arch/arm64/kvm/va_layout.c
> >> +++ b/arch/arm64/kvm/va_layout.c
> >> @@ -16,11 +16,15 @@
> >>   */
> >>  
> >>  #include <linux/kvm_host.h>
> >> +#include <linux/random.h>
> >> +#include <linux/memblock.h>
> >>  #include <asm/alternative.h>
> >>  #include <asm/debug-monitors.h>
> >>  #include <asm/insn.h>
> >>  #include <asm/kvm_mmu.h>
> >>  
> > 
> > It would be nice to have a comment on these, something like:
> > 
> > /* The LSB of the random hyp VA tag or 0 if no randomization is used. */
> >> +static u8 tag_lsb;
> > /* The random hyp VA tag value with the region bit, if hyp randomization is used */
> >> +static u64 tag_val;
> 
> Sure.
> 
> > 
> > 
> >>  static u64 va_mask;
> >>  
> >>  static void compute_layout(void)
> >> @@ -32,8 +36,31 @@ static void compute_layout(void)
> >>  	region  = idmap_addr & BIT(VA_BITS - 1);
> >>  	region ^= BIT(VA_BITS - 1);
> >>  
> >> -	va_mask  = BIT(VA_BITS - 1) - 1;
> >> -	va_mask |= region;
> >> +	tag_lsb = fls64((u64)phys_to_virt(memblock_start_of_DRAM()) ^
> >> +			(u64)(high_memory - 1));
> >> +
> >> +	if (tag_lsb == (VA_BITS - 1)) {
> >> +		/*
> >> +		 * No space in the address, let's compute the mask so
> >> +		 * that it covers (VA_BITS - 1) bits, and the region
> >> +		 * bit. The tag is set to zero.
> >> +		 */
> >> +		tag_lsb = tag_val = 0;
> > 
> > tag_val should already be 0, right?
> > 
> > and wouldn't it be slightly nicer to have a temporary variable and only
> > set tag_lsb when needed, called something like linear_bits ?
> 
> OK.
> 
> > 
> >> +		va_mask  = BIT(VA_BITS - 1) - 1;
> >> +		va_mask |= region;
> >> +	} else {
> >> +		/*
> >> +		 * We do have some free bits. Let's have the mask to
> >> +		 * cover the low bits of the VA, and the tag to
> >> +		 * contain the random stuff plus the region bit.
> >> +		 */
> > 
> > Since you have two masks below this comment is a bit hard to parse, how
> > about explaining what makes up a Hyp address from a kernel linear
> > address instead, something like:
> > 
> > 		/*
> > 		 * We do have some free bits to insert a random tag.
> > 		 * Hyp VAs are now created from kernel linear map VAs
> > 		 * using the following formula (with V == VA_BITS):
> > 		 *
> > 		 *  63 ... V |   V-1  | V-2 ... tag_lsb | tag_lsb - 1 ... 0
> > 		 *  -------------------------------------------------------
> > 		 * | 0000000 | region |    random tag   |  kern linear VA  |
> > 		 */
> > 
> > (assuming I got this vaguely correct).
> 
> /me copy-pastes...
> 
> > 
> >> +		u64 mask = GENMASK_ULL(VA_BITS - 2, tag_lsb);
> > 
> > for consistency it would be nicer to use GENMASK_ULL(VA_BITS - 2, 0)
> > above as suggested in the other patch then.  And we could also call this
> > tag_mask to be super explicit.
> > 
> >> +
> >> +		va_mask = BIT(tag_lsb) - 1;
> > 
> > and here, GENMASK_ULL(tag_lsb - 1, 0).
> 
> Yup.
> 
> > 
> >> +		tag_val  = get_random_long() & mask;
> >> +		tag_val |= region;
> > 
> > it's actually unclear to me why you need the region bit included in
> > tag_val?
> 
> Because the initial masking strips it from the VA, and we need to add it
> back. storing it as part of the tag makes it easy to ORR in.
> 

Right, ok, it just slightly ticked my OCD to have a separate notion of a
tag and a region, and then merge them into a variable named 'tag_val'.
But with the comment above, I think I can still manage to sleep at
night.

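(Folding the suggestions above together, the revised computation would
presumably look roughly like the sketch below: GENMASK_ULL() for the
masks, a linear_bits temporary, and the region bit carried in tag_val.
A sketch only, not the committed code.)

	static void compute_layout(void)
	{
		phys_addr_t idmap_addr = __pa_symbol(__hyp_idmap_text_start);
		u64 region = (idmap_addr & BIT(VA_BITS - 1)) ^ BIT(VA_BITS - 1);
		u8 linear_bits = fls64((u64)phys_to_virt(memblock_start_of_DRAM()) ^
				       (u64)(high_memory - 1));

		if (linear_bits == VA_BITS - 1) {
			/* No spare bits: tag_lsb and tag_val stay 0. */
			va_mask = GENMASK_ULL(VA_BITS - 2, 0) | region;
		} else {
			/* Spare bits between the linear map MSB and VA_BITS - 2. */
			tag_lsb = linear_bits;
			va_mask = GENMASK_ULL(tag_lsb - 1, 0);
			tag_val = get_random_long() & GENMASK_ULL(VA_BITS - 2, tag_lsb);
			tag_val |= region;
			tag_val >>= tag_lsb;
		}
	}
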
> > 
> >> +		tag_val >>= tag_lsb;
> >> +	}
> >>  }
> >>  
> >>  static u32 compute_instruction(int n, u32 rd, u32 rn)
> >> @@ -46,6 +73,33 @@ static u32 compute_instruction(int n, u32 rd, u32 rn)
> >>  							  AARCH64_INSN_VARIANT_64BIT,
> >>  							  rn, rd, va_mask);
> >>  		break;
> >> +
> >> +	case 1:
> >> +		/* ROR is a variant of EXTR with Rm = Rn */
> >> +		insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
> >> +					     rn, rn, rd,
> >> +					     tag_lsb);
> >> +		break;
> >> +
> >> +	case 2:
> >> +		insn = aarch64_insn_gen_add_sub_imm(rd, rn,
> >> +						    tag_val & (SZ_4K - 1),
> >> +						    AARCH64_INSN_VARIANT_64BIT,
> >> +						    AARCH64_INSN_ADSB_ADD);
> >> +		break;
> >> +
> >> +	case 3:
> >> +		insn = aarch64_insn_gen_add_sub_imm(rd, rn,
> >> +						    tag_val & GENMASK(23, 12),
> >> +						    AARCH64_INSN_VARIANT_64BIT,
> >> +						    AARCH64_INSN_ADSB_ADD);
> >> +		break;
> >> +
> >> +	case 4:
> >> +		/* ROR is a variant of EXTR with Rm = Rn */
> >> +		insn = aarch64_insn_gen_extr(AARCH64_INSN_VARIANT_64BIT,
> >> +					     rn, rn, rd, 64 - tag_lsb);
> > 
> > Ah, you do use 64 - first_rand in the code.  Well, I approve of this
> > line of code then.
> > 
> >> +		break;
> >>  	}
> >>  
> >>  	return insn;
> >> @@ -56,8 +110,8 @@ void __init kvm_update_va_mask(struct alt_instr *alt,
> >>  {
> >>  	int i;
> >>  
> >> -	/* We only expect a 1 instruction sequence */
> >> -	BUG_ON(nr_inst != 1);
> >> +	/* We only expect a 5 instruction sequence */
> > 
> > Still sounds strange to me, just drop the comment I think if we keep the
> > BUG_ON.
> 
> Sure.
> 
> > 
> >> +	BUG_ON(nr_inst != 5);
> >>  
> >>  	if (!has_vhe() && !va_mask)
> >>  		compute_layout();
> >> @@ -68,8 +122,12 @@ void __init kvm_update_va_mask(struct alt_instr *alt,
> >>  		/*
> >>  		 * VHE doesn't need any address translation, let's NOP
> >>  		 * everything.
> >> +		 *
> >> +		 * Alternatively, if we don't have any spare bits in
> >> +		 * the address, NOP everything after masking tha
> > 
> > s/tha/the/
> > 
> >> +		 * kernel VA.
> >>  		 */
> >> -		if (has_vhe()) {
> >> +		if (has_vhe() || (!tag_lsb && i > 1)) {
> >>  			updptr[i] = aarch64_insn_gen_nop();
> >>  			continue;
> >>  		}
> >> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> >> index 14c5e5534f2f..d01c7111b1f7 100644
> >> --- a/virt/kvm/arm/mmu.c
> >> +++ b/virt/kvm/arm/mmu.c
> >> @@ -1811,7 +1811,7 @@ int kvm_mmu_init(void)
> >>  		  kern_hyp_va((unsigned long)high_memory - 1));
> >>  
> >>  	if (hyp_idmap_start >= kern_hyp_va(PAGE_OFFSET) &&
> >> -	    hyp_idmap_start <  kern_hyp_va(~0UL) &&
> >> +	    hyp_idmap_start <  kern_hyp_va((unsigned long)high_memory - 1) &&
> > 
> > Is this actually required for this patch or are we just trying to be
> > nice?
> 
> You really need something like that. Remember that we compute the tag
> based on the available memory, so something that goes beyond that is not
> a valid input to kern_hyp_va anymore.

ah right.

> > 
> > I'm actually not sure I remember what this is about beyond the VA=idmap
> > for everything on 32-bit case; I thought we chose the hyp address space
> > exactly so that it wouldn't overlap with the idmap?
> 
> This is just a sanity check that kern_hyp_va returns the right thing
> with respect to the idmap (i.e. we cannot hit the idmap by feeding
> something to the macro). Is that clear enough (I'm not sure it is...)?
> 

It feels a bit like defensive coding, which we don't normally do, so I
was confused about whether this was something we actually expected to
encounter in some cases, or if we were just covering our backs.  If it's the latter,
then it makes perfect sense to me, and I'd rather catch an error here
than see the world explode later.


Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 10/19] KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_state
  2018-02-16  9:05         ` Christoffer Dall
@ 2018-02-16  9:33           ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-02-16  9:33 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On 16/02/18 09:05, Christoffer Dall wrote:
> On Thu, Feb 15, 2018 at 01:22:56PM +0000, Marc Zyngier wrote:
>> On 15/01/18 15:36, Christoffer Dall wrote:
>>> On Thu, Jan 04, 2018 at 06:43:25PM +0000, Marc Zyngier wrote:
>>>> kvm_vgic_global_state is part of the read-only section, and is
>>>> usually accessed using a PC-relative address generation (adrp + add).
>>>>
>>>> It is thus useless to use kern_hyp_va() on it, and actively problematic
>>>> if kern_hyp_va() becomes non-idempotent. On the other hand, there is
>>>> no way that the compiler is going to guarantee that such access is
>>>> always be PC relative.
>>>
>>> nit: is always be
>>>
>>>>
>>>> So let's bite the bullet and provide our own accessor.
>>>>
>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>> ---
>>>>  arch/arm/include/asm/kvm_hyp.h   | 6 ++++++
>>>>  arch/arm64/include/asm/kvm_hyp.h | 9 +++++++++
>>>>  virt/kvm/arm/hyp/vgic-v2-sr.c    | 4 ++--
>>>>  3 files changed, 17 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
>>>> index ab20ffa8b9e7..1d42d0aa2feb 100644
>>>> --- a/arch/arm/include/asm/kvm_hyp.h
>>>> +++ b/arch/arm/include/asm/kvm_hyp.h
>>>> @@ -26,6 +26,12 @@
>>>>  
>>>>  #define __hyp_text __section(.hyp.text) notrace
>>>>  
>>>> +#define hyp_symbol_addr(s)						\
>>>> +	({								\
>>>> +		typeof(s) *addr = &(s);					\
>>>> +		addr;							\
>>>> +	})
>>>> +
>>>>  #define __ACCESS_VFP(CRn)			\
>>>>  	"mrc", "mcr", __stringify(p10, 7, %0, CRn, cr0, 0), u32
>>>>  
>>>> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
>>>> index 08d3bb66c8b7..a2d98c539023 100644
>>>> --- a/arch/arm64/include/asm/kvm_hyp.h
>>>> +++ b/arch/arm64/include/asm/kvm_hyp.h
>>>> @@ -25,6 +25,15 @@
>>>>  
>>>>  #define __hyp_text __section(.hyp.text) notrace
>>>>  
>>>> +#define hyp_symbol_addr(s)						\
>>>> +	({								\
>>>> +		typeof(s) *addr;					\
>>>> +		asm volatile("adrp	%0, %1\n"			\
>>>> +			     "add	%0, %0, :lo12:%1\n"		\
>>>> +			     : "=r" (addr) : "S" (&s));			\
>>>
>>> Can't we use adr_l here?
>>
>> Unfortunately not. All the asm/assembler.h macros are unavailable to
>> inline assembly. We could start introducing equivalent macros for that
>> purpose, but that's starting to be outside of the scope of this series.
>>
> 
> Absolutely.  Forget I asked.
> 
>>>
>>>> +		addr;							\
>>>> +	})
>>>> +
>>>
>>> I don't fully appreciate the semantics of this macro going by its name
>>> only.  My understanding is that if you want to resolve a symbol to an
>>> address which is mapped in hyp, then use this.  Is this correct?
>>
>> The goal of this macro is to return a symbol's address based on a
>> PC-relative computation, as opposed to a loading the VA from a constant
>> pool or something similar. This works well for HYP, as an absolute VA is
>> guaranteed to be wrong.
>>
>>>
>>> If so, can we add a small comment (because I can't come up with a better
>>> name).
>>
>> I'll add the above if that works for you.
>>
> 
> Yes it does.  The only thing that remains a bit unclear is what the
> difference between this and kern_hyp_va is, and when you'd choose to use
> one over the other.  Perhaps we need a single place which documents our
> primitives and tells us what to use when.  At least, I'm for sure not
> going to be able to figure this out later on.

Let me try to explain that:

The two primitives work on different "objects". kern_hyp_va() works on
an address. If what you have is a pointer, then kern_hyp_va is your
friend. On the contrary, if what you have is a symbol instead of the
address of that object (and thus not something we obtain by reading a
variable), then hyp_symbol_addr is probably what you need.

Of course, a symbol can also be a variable, which makes things a bit
harder. The asm constraints are such that compilation will break if you try
to treat a local variable as a symbol (the 'S' constraint is defined as
"An absolute symbolic address or a label reference", and the '&s' makes
it pretty hard to fool).

I've tried to make it foolproof, but who knows... ;-)

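(As a rough illustration of the distinction; all names below are made
up, and this is not code from the series:)

	struct example_state {
		void *some_kernel_ptr;		/* holds a kernel linear-map VA */
	};

	extern struct example_state example_state;	/* a symbol */

	static void *__hyp_text example(void)
	{
		/* A symbol: generate its address PC-relatively. */
		struct example_state *s = hyp_symbol_addr(example_state);

		/* A pointer value read from memory: convert the kernel VA. */
		return kern_hyp_va(s->some_kernel_ptr);
	}
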
Hope this helps,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 14/19] KVM: arm/arm64: Move HYP IO VAs to the "idmap" range
  2018-02-16  9:25         ` Christoffer Dall
@ 2018-02-16 15:20           ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-02-16 15:20 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: linux-arm-kernel, kvm, kvmarm, Mark Rutland, Catalin Marinas,
	Will Deacon, James Morse, Steve Capper, Peter Maydell

On 16/02/18 09:25, Christoffer Dall wrote:
> On Thu, Feb 15, 2018 at 01:52:05PM +0000, Marc Zyngier wrote:
>> On 18/01/18 14:39, Christoffer Dall wrote:
>>> On Thu, Jan 04, 2018 at 06:43:29PM +0000, Marc Zyngier wrote:
>>>> We so far mapped our HYP IO (which is essentially the GICv2 control
>>>> registers) using the same method as for memory. It recently appeared
>>>> that it is a bit unsafe:
>>>>
>>>> We compute the HYP VA using the kern_hyp_va helper, but that helper
>>>> is only designed to deal with kernel VAs coming from the linear map,
>>>> and not from the vmalloc region... This could in turn cause some bad
>>>> aliasing between the two, amplified by the upcoming VA randomisation.
>>>>
>>>> A solution is to come up with our very own basic VA allocator for
>>>> MMIO. Since half of the HYP address space only contains a single
>>>> page (the idmap), we have plenty to borrow from. Let's use the idmap
>>>> as a base, and allocate downwards from it. GICv2 now lives on the
>>>> other side of the great VA barrier.
>>>>
>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>> ---
>>>>  virt/kvm/arm/mmu.c | 56 +++++++++++++++++++++++++++++++++++++++++-------------
>>>>  1 file changed, 43 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>>> index 6192d45d1e1a..14c5e5534f2f 100644
>>>> --- a/virt/kvm/arm/mmu.c
>>>> +++ b/virt/kvm/arm/mmu.c
>>>> @@ -43,6 +43,9 @@ static unsigned long hyp_idmap_start;
>>>>  static unsigned long hyp_idmap_end;
>>>>  static phys_addr_t hyp_idmap_vector;
>>>>  
>>>> +static DEFINE_MUTEX(io_map_lock);
>>>> +static unsigned long io_map_base;
>>>> +
>>>>  #define S2_PGD_SIZE	(PTRS_PER_S2_PGD * sizeof(pgd_t))
>>>>  #define hyp_pgd_order get_order(PTRS_PER_PGD * sizeof(pgd_t))
>>>>  
>>>> @@ -502,27 +505,31 @@ static void unmap_hyp_range(pgd_t *pgdp, phys_addr_t start, u64 size)
>>>>   *
>>>>   * Assumes hyp_pgd is a page table used strictly in Hyp-mode and
>>>>   * therefore contains either mappings in the kernel memory area (above
>>>> - * PAGE_OFFSET), or device mappings in the vmalloc range (from
>>>> - * VMALLOC_START to VMALLOC_END).
>>>> + * PAGE_OFFSET), or device mappings in the idmap range.
>>>>   *
>>>> - * boot_hyp_pgd should only map two pages for the init code.
>>>> + * boot_hyp_pgd should only map the idmap range, and is only used in
>>>> + * the extended idmap case.
>>>>   */
>>>>  void free_hyp_pgds(void)
>>>>  {
>>>> +	pgd_t *id_pgd;
>>>> +
>>>>  	mutex_lock(&kvm_hyp_pgd_mutex);
>>>>  
>>>> +	id_pgd = boot_hyp_pgd ? boot_hyp_pgd : hyp_pgd;
>>>> +
>>>> +	if (id_pgd)
>>>> +		unmap_hyp_range(id_pgd, io_map_base,
>>>> +				hyp_idmap_start + PAGE_SIZE - io_map_base);
>>>> +
>>>>  	if (boot_hyp_pgd) {
>>>> -		unmap_hyp_range(boot_hyp_pgd, hyp_idmap_start, PAGE_SIZE);
>>>>  		free_pages((unsigned long)boot_hyp_pgd, hyp_pgd_order);
>>>>  		boot_hyp_pgd = NULL;
>>>>  	}
>>>>  
>>>>  	if (hyp_pgd) {
>>>> -		unmap_hyp_range(hyp_pgd, hyp_idmap_start, PAGE_SIZE);
>>>>  		unmap_hyp_range(hyp_pgd, kern_hyp_va(PAGE_OFFSET),
>>>>  				(uintptr_t)high_memory - PAGE_OFFSET);
>>>> -		unmap_hyp_range(hyp_pgd, kern_hyp_va(VMALLOC_START),
>>>> -				VMALLOC_END - VMALLOC_START);
>>>>  
>>>>  		free_pages((unsigned long)hyp_pgd, hyp_pgd_order);
>>>>  		hyp_pgd = NULL;
>>>> @@ -721,7 +728,8 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
>>>>  			   void __iomem **kaddr,
>>>>  			   void __iomem **haddr)
>>>>  {
>>>> -	unsigned long start, end;
>>>> +	pgd_t *pgd = hyp_pgd;
>>>> +	unsigned long base;
>>>>  	int ret;
>>>>  
>>>>  	*kaddr = ioremap(phys_addr, size);
>>>> @@ -733,17 +741,38 @@ int create_hyp_io_mappings(phys_addr_t phys_addr, size_t size,
>>>>  		return 0;
>>>>  	}
>>>>  
>>>> +	mutex_lock(&io_map_lock);
>>>> +
>>>> +	base = io_map_base - size;
>>>
>>> are we guaranteed that hyp_idmap_start (and therefore io_map_base) is
>>> sufficiently greater than 0 ?  I suppose that even if RAM starts at 0,
>>> and the kernel was loaded at 0, the idmap page for Hyp would be at some
>>> reasonable offset from the start of the kernel image?
>>
>> On my kernel image:
>> ffff000008080000 t _head
>> ffff000008cc6000 T __hyp_idmap_text_start
>> ffff000009aaa000 B swapper_pg_end
>>
>> _hyp_idmap_text_start is about 12MB from the beginning of the image, and
>> about 14MB from the end. Yes, it is a big kernel. But we're only mapping
>> a few pages there, even with my upcoming crazy vector remapping crap. So
>> the likeliness of this failing is close to zero.
>>
>> Now, close to zero is not necessarily close enough. What I could do is
>> to switch the allocator around on failure, so that if we can't allocate
>> on one side, we can at least try to allocate on the other side. I'm
>> pretty sure we'll never trigger that code, but I can implement it if you
>> think that's worth it.
>>
> 
> I don't think we should necessarily implement it, my main concern is
> when reading the code, the reader has to convince himself/herself why
> this is always safe (and not just very likely to be safe).  I'm fine
> with adding a comment that explains this instead of implementing
> complicated logic though.  What do you think?

Oh, absolutely. I'll add a blurb about this.

> 
>>>
>>>> +	base &= ~(size - 1);
>>>
>>> I'm not sure I understand this line.  Wouldn't it make more sense to use
>>> PAGE_SIZE here?
>>
>> This is trying to align the base of the allocation to its natural size
>> (8kB aligned on an 8kB boundary, for example), which is what other
>> allocators in the kernel do. I've now added a roundup_pow_of_two(size)
>> so that we're guaranteed to deal with those.
>>
> 
> Ah right, it's just that I wasn't thinking of the size in terms of
> always being page aligned, so couldn't you here attempt two 4K
> allocations on a 64K page system, where one would overwrite the other?

Ah, I see what you mean. Well spotted. I'll turn that into a
max(roundup_pow_of_two(size), PAGE_SIZE).
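
Roughly something like this (just a sketch of the fix -- "alloc" is a
made-up local, and the actual respin may end up looking different):

	unsigned long alloc = max(roundup_pow_of_two(size), PAGE_SIZE);

	base = io_map_base - alloc;
	base &= ~(alloc - 1);	/* naturally aligned, at least PAGE_SIZE */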

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 10/19] KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_state
  2018-02-16  9:33           ` Marc Zyngier
@ 2018-02-19 14:39             ` Christoffer Dall
  -1 siblings, 0 replies; 104+ messages in thread
From: Christoffer Dall @ 2018-02-19 14:39 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Mark Rutland, Peter Maydell, kvm, Steve Capper, Catalin Marinas,
	Will Deacon, James Morse, kvmarm, linux-arm-kernel

On Fri, Feb 16, 2018 at 09:33:39AM +0000, Marc Zyngier wrote:
> On 16/02/18 09:05, Christoffer Dall wrote:
> > On Thu, Feb 15, 2018 at 01:22:56PM +0000, Marc Zyngier wrote:
> >> On 15/01/18 15:36, Christoffer Dall wrote:
> >>> On Thu, Jan 04, 2018 at 06:43:25PM +0000, Marc Zyngier wrote:
> >>>> kvm_vgic_global_state is part of the read-only section, and is
> >>>> usually accessed using a PC-relative address generation (adrp + add).
> >>>>
> >>>> It is thus useless to use kern_hyp_va() on it, and actively problematic
> >>>> if kern_hyp_va() becomes non-idempotent. On the other hand, there is
> >>>> no way that the compiler is going to guarantee that such access is
> >>>> always be PC relative.
> >>>
> >>> nit: is always be
> >>>
> >>>>
> >>>> So let's bite the bullet and provide our own accessor.
> >>>>
> >>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >>>> ---
> >>>>  arch/arm/include/asm/kvm_hyp.h   | 6 ++++++
> >>>>  arch/arm64/include/asm/kvm_hyp.h | 9 +++++++++
> >>>>  virt/kvm/arm/hyp/vgic-v2-sr.c    | 4 ++--
> >>>>  3 files changed, 17 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
> >>>> index ab20ffa8b9e7..1d42d0aa2feb 100644
> >>>> --- a/arch/arm/include/asm/kvm_hyp.h
> >>>> +++ b/arch/arm/include/asm/kvm_hyp.h
> >>>> @@ -26,6 +26,12 @@
> >>>>  
> >>>>  #define __hyp_text __section(.hyp.text) notrace
> >>>>  
> >>>> +#define hyp_symbol_addr(s)						\
> >>>> +	({								\
> >>>> +		typeof(s) *addr = &(s);					\
> >>>> +		addr;							\
> >>>> +	})
> >>>> +
> >>>>  #define __ACCESS_VFP(CRn)			\
> >>>>  	"mrc", "mcr", __stringify(p10, 7, %0, CRn, cr0, 0), u32
> >>>>  
> >>>> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> >>>> index 08d3bb66c8b7..a2d98c539023 100644
> >>>> --- a/arch/arm64/include/asm/kvm_hyp.h
> >>>> +++ b/arch/arm64/include/asm/kvm_hyp.h
> >>>> @@ -25,6 +25,15 @@
> >>>>  
> >>>>  #define __hyp_text __section(.hyp.text) notrace
> >>>>  
> >>>> +#define hyp_symbol_addr(s)						\
> >>>> +	({								\
> >>>> +		typeof(s) *addr;					\
> >>>> +		asm volatile("adrp	%0, %1\n"			\
> >>>> +			     "add	%0, %0, :lo12:%1\n"		\
> >>>> +			     : "=r" (addr) : "S" (&s));			\
> >>>
> >>> Can't we use adr_l here?
> >>
> >> Unfortunately not. All the asm/assembler.h macros are unavailable to
> >> inline assembly. We could start introducing equivalent macros for that
> >> purpose, but that's starting to be outside of the scope of this series.
> >>
> > 
> > Absolutely.  Forget I asked.
> > 
> >>>
> >>>> +		addr;							\
> >>>> +	})
> >>>> +
> >>>
> >>> I don't fully appreciate the semantics of this macro going by its name
> >>> only.  My understanding is that if you want to resolve a symbol to an
> >>> address which is mapped in hyp, then use this.  Is this correct?
> >>
> >> The goal of this macro is to return a symbol's address based on a
> >> PC-relative computation, as opposed to loading the VA from a constant
> >> pool or something similar. This works well for HYP, as an absolute VA is
> >> guaranteed to be wrong.
> >>
> >>>
> >>> If so, can we add a small comment (because I can't come up with a better
> >>> name).
> >>
> >> I'll add the above if that works for you.
> >>
> > 
> > Yes it does.  The only thing that remains a bit unclear is what the
> > difference between this and kern_hyp_va is, and when you'd choose to use
> > one over the other.  Perhaps we need a single place which documents our
> > primitives and tells us what to use when.  At least, I'm for sure not
> > going to be able to figure this out later on.
> 
> Let me try to explain that:
> 
> The two primitives work on different "objects". kern_hyp_va() works on
> an address. If what you have is a pointer, then kern_hyp_va is your
> friend. On the contrary, if what you have is a symbol instead of the
> address of that object (and thus not something we obtain by reading a
> variable), then hyp_symbol_addr is probably what you need.
> 
> Of course, a symbol can also be a variable, which makes things a bit
> harder. The asm constraints are such that compilation will break if you try
> to treat a local variable as a symbol (the 'S' constraint is defined as
> "An absolute symbolic address or a label reference", and the '&s' makes
> it pretty hard to fool).
> 
> I've tried to make it foolproof, but who knows... ;-)
> 
ok, so what exactly is this macro doing differently from simply using the
symbol directly?  Only ensuring that the generated code is using
PC-relative addressing?  If so, should we be using this macro everywhere
in Hyp code where we dereference a symbol?

What is it about hyp code that makes us need this when it's not needed
for the kernel itself, and both support address space randomization?  Is
that because the main kernel is properly relocated after deciding on the
randomization, but Hyp is not?

It may be worth trying to put hyp_symbol_addr and kern_hyp_va in the
same header file and have something explaining when to use what and why
in that single place, but if that breaks the world, then never mind...

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH v4 10/19] KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_state
  2018-02-19 14:39             ` Christoffer Dall
@ 2018-02-20 11:40               ` Marc Zyngier
  -1 siblings, 0 replies; 104+ messages in thread
From: Marc Zyngier @ 2018-02-20 11:40 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Mark Rutland, Peter Maydell, kvm, Steve Capper, Catalin Marinas,
	Will Deacon, James Morse, kvmarm, linux-arm-kernel

On 19/02/18 14:39, Christoffer Dall wrote:
> On Fri, Feb 16, 2018 at 09:33:39AM +0000, Marc Zyngier wrote:
>> On 16/02/18 09:05, Christoffer Dall wrote:
>>> On Thu, Feb 15, 2018 at 01:22:56PM +0000, Marc Zyngier wrote:
>>>> On 15/01/18 15:36, Christoffer Dall wrote:
>>>>> On Thu, Jan 04, 2018 at 06:43:25PM +0000, Marc Zyngier wrote:
>>>>>> kvm_vgic_global_state is part of the read-only section, and is
>>>>>> usually accessed using a PC-relative address generation (adrp + add).
>>>>>>
>>>>>> It is thus useless to use kern_hyp_va() on it, and actively problematic
>>>>>> if kern_hyp_va() becomes non-idempotent. On the other hand, there is
>>>>>> no way that the compiler is going to guarantee that such access is
>>>>>> always be PC relative.
>>>>>
>>>>> nit: is always be
>>>>>
>>>>>>
>>>>>> So let's bite the bullet and provide our own accessor.
>>>>>>
>>>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>>> ---
>>>>>>  arch/arm/include/asm/kvm_hyp.h   | 6 ++++++
>>>>>>  arch/arm64/include/asm/kvm_hyp.h | 9 +++++++++
>>>>>>  virt/kvm/arm/hyp/vgic-v2-sr.c    | 4 ++--
>>>>>>  3 files changed, 17 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
>>>>>> index ab20ffa8b9e7..1d42d0aa2feb 100644
>>>>>> --- a/arch/arm/include/asm/kvm_hyp.h
>>>>>> +++ b/arch/arm/include/asm/kvm_hyp.h
>>>>>> @@ -26,6 +26,12 @@
>>>>>>  
>>>>>>  #define __hyp_text __section(.hyp.text) notrace
>>>>>>  
>>>>>> +#define hyp_symbol_addr(s)						\
>>>>>> +	({								\
>>>>>> +		typeof(s) *addr = &(s);					\
>>>>>> +		addr;							\
>>>>>> +	})
>>>>>> +
>>>>>>  #define __ACCESS_VFP(CRn)			\
>>>>>>  	"mrc", "mcr", __stringify(p10, 7, %0, CRn, cr0, 0), u32
>>>>>>  
>>>>>> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
>>>>>> index 08d3bb66c8b7..a2d98c539023 100644
>>>>>> --- a/arch/arm64/include/asm/kvm_hyp.h
>>>>>> +++ b/arch/arm64/include/asm/kvm_hyp.h
>>>>>> @@ -25,6 +25,15 @@
>>>>>>  
>>>>>>  #define __hyp_text __section(.hyp.text) notrace
>>>>>>  
>>>>>> +#define hyp_symbol_addr(s)						\
>>>>>> +	({								\
>>>>>> +		typeof(s) *addr;					\
>>>>>> +		asm volatile("adrp	%0, %1\n"			\
>>>>>> +			     "add	%0, %0, :lo12:%1\n"		\
>>>>>> +			     : "=r" (addr) : "S" (&s));			\
>>>>>
>>>>> Can't we use adr_l here?
>>>>
>>>> Unfortunately not. All the asm/assembler.h macros are unavailable to
>>>> inline assembly. We could start introducing equivalent macros for that
>>>> purpose, but that's starting to be outside of the scope of this series.
>>>>
>>>
>>> Absolutely.  Forget I asked.
>>>
>>>>>
>>>>>> +		addr;							\
>>>>>> +	})
>>>>>> +
>>>>>
>>>>> I don't fully appreciate the semantics of this macro going by its name
>>>>> only.  My understanding is that if you want to resolve a symbol to an
>>>>> address which is mapped in hyp, then use this.  Is this correct?
>>>>
>>>> The goal of this macro is to return a symbol's address based on a
>>>> PC-relative computation, as opposed to loading the VA from a constant
>>>> pool or something similar. This works well for HYP, as an absolute VA is
>>>> guaranteed to be wrong.
>>>>
>>>>>
>>>>> If so, can we add a small comment (because I can't come up with a better
>>>>> name).
>>>>
>>>> I'll add the above if that works for you.
>>>>
>>>
>>> Yes it does.  The only thing that remains a bit unclear is what the
>>> difference between this and kern_hyp_va is, and when you'd choose to use
>>> one over the other.  Perhaps we need a single place which documents our
>>> primitives and tells us what to use when.  At least, I'm for sure not
>>> going to be able to figure this out later on.
>>
>> Let me try to explain that:
>>
>> The two primitives work on different "objects". kern_hyp_va() works on
>> an address. If what you have is a pointer, then kern_hyp_va is your
>> friend. On the contrary, if what you have is a symbol instead of the
>> address of that object (and thus not something we obtain by reading a
>> variable), then hyp_symbol_addr is probably what you need.
>>
>> Of course, a symbol can also be a variable, which makes things a bit
>> harder. The asm constraints are such that compilation will break if you try
>> to treat a local variable as a symbol (the 'S' constraint is defined as
>> "An absolute symbolic address or a label reference", and the '&s' makes
>> it pretty hard to fool).
>>
>> I've tried to make it foolproof, but who knows... ;-)
>>
> ok, so what exactly is this macro doing differently from simply using the
> symbol directly?  Only ensuring that the generated code is using
> PC-relative addressing?  If so, should we be using this macro everywhere
> in Hyp code where we dereference a symbol?

I think we should. We so far rely on the fact that the compiler won't be
stupid enough to generate a constant pool load to compute an address
that it can reach using PC-relative addressing, but you never know.
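
To illustrate, here are the two forms the compiler could legitimately
pick for something like &kvm_vgic_global_state (hand-written for the
sake of the example, not actual compiler output):

	// PC-relative: correct wherever the code happens to run, and
	// what hyp_symbol_addr() forces:
	adrp	x0, kvm_vgic_global_state
	add	x0, x0, :lo12:kvm_vgic_global_state

	// Constant-pool load: yields the symbol's kernel VA, which is
	// guaranteed to be wrong once running at EL2:
	ldr	x0, =kvm_vgic_global_state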

> What is it about hyp code that makes us need this when it's not needed
> for the kernel itself, and both support address space randomization?  Is
> that because the main kernel is properly relocated after deciding on the
> randomization, but Hyp is not?

The issue is that the kernel is linked at a particular VA, and we run it
at another. This hasn't much to do with randomization.

> It may be worth trying to put hyp_symbol_addr and kern_hyp_va in the
> same header file and have something explaining when to use what and why
> in that single place, but if that breaks the world, then never mind...

Seems sensible.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 104+ messages in thread

end of thread, other threads:[~2018-02-20 11:40 UTC | newest]

Thread overview: 104+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-04 18:43 [PATCH v4 00/19] KVM/arm64: Randomise EL2 mappings Marc Zyngier
2018-01-04 18:43 ` Marc Zyngier
2018-01-04 18:43 ` [PATCH v4 01/19] arm64: asm-offsets: Avoid clashing DMA definitions Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-04 18:43 ` [PATCH v4 02/19] arm64: asm-offsets: Remove unused definitions Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-04 18:43 ` [PATCH v4 03/19] arm64: asm-offsets: Remove potential circular dependency Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-15  8:34   ` Christoffer Dall
2018-01-15  8:34     ` Christoffer Dall
2018-01-15  8:42     ` Marc Zyngier
2018-01-15  8:42       ` Marc Zyngier
2018-01-15  9:46       ` Christoffer Dall
2018-01-15  9:46         ` Christoffer Dall
2018-01-04 18:43 ` [PATCH v4 04/19] arm64: alternatives: Enforce alignment of struct alt_instr Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-15  9:11   ` Christoffer Dall
2018-01-15  9:11     ` Christoffer Dall
2018-01-04 18:43 ` [PATCH v4 05/19] arm64: alternatives: Add dynamic patching feature Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-15 11:26   ` Christoffer Dall
2018-01-15 11:26     ` Christoffer Dall
2018-01-04 18:43 ` [PATCH v4 06/19] arm64: insn: Add N immediate encoding Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-15 11:26   ` Christoffer Dall
2018-01-15 11:26     ` Christoffer Dall
2018-01-04 18:43 ` [PATCH v4 07/19] arm64: insn: Add encoder for bitwise operations using literals Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-15 11:26   ` Christoffer Dall
2018-01-15 11:26     ` Christoffer Dall
2018-01-04 18:43 ` [PATCH v4 08/19] arm64: KVM: Dynamically patch the kernel/hyp VA mask Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-15 11:47   ` Christoffer Dall
2018-01-15 11:47     ` Christoffer Dall
2018-02-15 13:11     ` Marc Zyngier
2018-02-15 13:11       ` Marc Zyngier
2018-02-16  9:02       ` Christoffer Dall
2018-02-16  9:02         ` Christoffer Dall
2018-01-04 18:43 ` [PATCH v4 09/19] arm64: cpufeatures: Drop the ARM64_HYP_OFFSET_LOW feature flag Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-15 11:48   ` Christoffer Dall
2018-01-15 11:48     ` Christoffer Dall
2018-01-04 18:43 ` [PATCH v4 10/19] KVM: arm/arm64: Do not use kern_hyp_va() with kvm_vgic_global_state Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-15 15:36   ` Christoffer Dall
2018-01-15 15:36     ` Christoffer Dall
2018-02-15 13:22     ` Marc Zyngier
2018-02-15 13:22       ` Marc Zyngier
2018-02-16  9:05       ` Christoffer Dall
2018-02-16  9:05         ` Christoffer Dall
2018-02-16  9:33         ` Marc Zyngier
2018-02-16  9:33           ` Marc Zyngier
2018-02-19 14:39           ` Christoffer Dall
2018-02-19 14:39             ` Christoffer Dall
2018-02-20 11:40             ` Marc Zyngier
2018-02-20 11:40               ` Marc Zyngier
2018-01-04 18:43 ` [PATCH v4 11/19] KVM: arm/arm64: Demote HYP VA range display to being a debug feature Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-15 15:54   ` Christoffer Dall
2018-01-15 15:54     ` Christoffer Dall
2018-01-04 18:43 ` [PATCH v4 12/19] KVM: arm/arm64: Move ioremap calls to create_hyp_io_mappings Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-15 18:07   ` Christoffer Dall
2018-01-15 18:07     ` Christoffer Dall
2018-01-04 18:43 ` [PATCH v4 13/19] KVM: arm/arm64: Keep GICv2 HYP VAs in kvm_vgic_global_state Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-18 14:39   ` Christoffer Dall
2018-01-18 14:39     ` Christoffer Dall
2018-01-04 18:43 ` [PATCH v4 14/19] KVM: arm/arm64: Move HYP IO VAs to the "idmap" range Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-18 14:39   ` Christoffer Dall
2018-01-18 14:39     ` Christoffer Dall
2018-02-15 13:52     ` Marc Zyngier
2018-02-15 13:52       ` Marc Zyngier
2018-02-16  9:25       ` Christoffer Dall
2018-02-16  9:25         ` Christoffer Dall
2018-02-16 15:20         ` Marc Zyngier
2018-02-16 15:20           ` Marc Zyngier
2018-01-04 18:43 ` [PATCH v4 15/19] arm64; insn: Add encoder for the EXTR instruction Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-18 20:27   ` Christoffer Dall
2018-01-18 20:27     ` Christoffer Dall
2018-01-04 18:43 ` [PATCH v4 16/19] arm64: insn: Allow ADD/SUB (immediate) with LSL #12 Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-18 20:28   ` Christoffer Dall
2018-01-18 20:28     ` Christoffer Dall
2018-01-04 18:43 ` [PATCH v4 17/19] arm64: KVM: Dynamically compute the HYP VA mask Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-18 20:28   ` Christoffer Dall
2018-01-18 20:28     ` Christoffer Dall
2018-02-15 13:58     ` Marc Zyngier
2018-02-15 13:58       ` Marc Zyngier
2018-01-04 18:43 ` [PATCH v4 18/19] arm64: KVM: Introduce EL2 VA randomisation Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-18 20:28   ` Christoffer Dall
2018-01-18 20:28     ` Christoffer Dall
2018-02-15 15:32     ` Marc Zyngier
2018-02-15 15:32       ` Marc Zyngier
2018-02-16  9:33       ` Christoffer Dall
2018-02-16  9:33         ` Christoffer Dall
2018-01-04 18:43 ` [PATCH v4 19/19] arm64: Update the KVM memory map documentation Marc Zyngier
2018-01-04 18:43   ` Marc Zyngier
2018-01-18 20:28   ` Christoffer Dall
2018-01-18 20:28     ` Christoffer Dall
