linux-coco.lists.linux.dev archive mirror
* [RFC] Support for Arm CCA VMs on Linux
@ 2023-01-27 11:22 Suzuki K Poulose
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
                   ` (8 more replies)
  0 siblings, 9 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:22 UTC (permalink / raw)
  To: linux-coco, linux-kernel, kvm, kvmarm, linux-arm-kernel
  Cc: Alexandru Elisei, Andrew Jones, Catalin Marinas, Chao Peng,
	Christoffer Dall, Fuad Tabba, James Morse, Jean-Philippe Brucker,
	Joey Gouly, Marc Zyngier, Mark Rutland, Oliver Upton,
	Paolo Bonzini, Quentin Perret, Sean Christopherson, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

We are happy to announce the early RFC version of the Arm
Confidential Compute Architecture (CCA) support for the Linux
stack. The intention is to seek early feedback in the following areas:
 * KVM integration of the Arm CCA
 * KVM UABI for managing the Realms, seeking to generalise the operations
   wherever possible with other Confidential Compute solutions.
   Note: This version doesn't support Guest Private memory, which will be added
   later (see below).
 * Linux Guest support for Realms

Arm CCA Introduction
=====================

The Arm CCA is a reference software architecture and implementation that builds
on the Realm Management Extension (RME), enabling the execution of virtual
machines while preventing access by more privileged software, such as the
hypervisor. The Arm CCA allows the hypervisor to control the VM, but removes
its right to access the code, register state or data used by the VM.
More information on the architecture is available here[0].

    Arm CCA Reference Software Architecture

        Realm World    ||    Normal World   ||  Secure World  ||
                       ||        |          ||                ||
 EL0 x-------x         || x----x | x------x ||                ||
     | Realm |         || |    | | |      | ||                ||
     |       |         || | VM | | |      | ||                ||
 ----|  VM*  |---------||-|    |---|      |-||----------------||
     |       |         || |    | | |  H   | ||                ||
 EL1 x-------x         || x----x | |      | ||                ||
         ^             ||        | |  o   | ||                ||
         |             ||        | |      | ||                ||
 ------- R*------------------------|  s  -|---------------------
         S             ||          |      | ||                ||
         I             ||          |  t   | ||                ||
         |             ||          |      | ||                || 
         v             ||          x------x ||                ||
 EL2    RMM*           ||              ^    ||                ||
         ^             ||              |    ||                ||
 ========|=============================|========================
         |                             | SMC
         x--------- *RMI* -------------x

 EL3                   Root World
                       EL3 Firmware
 ===============================================================
Where:
 RMM - Realm Management Monitor
 RMI - Realm Management Interface
 RSI - Realm Service Interface
 SMC - Secure Monitor Call

RME introduces a new security state, the "Realm world", in addition to the
traditional Secure and Non-Secure states. The Arm CCA defines a new component,
the Realm Management Monitor (RMM), which runs at R-EL2. This is a standard
piece of firmware, verified, installed and loaded by the EL3 firmware (e.g.,
TF-A) at system boot.

The RMM provides standard interfaces - the Realm Management Interface (RMI) - to
the Normal world hypervisor to manage the VMs running in the Realm world (also
called Realms for short). These are exposed via SMC and are routed through the
EL3 firmware.
The RMI interface includes:
  - Moving a physical page from the Normal world to the Realm world
  - Creating a Realm with requested parameters, tracked via a Realm Descriptor (RD)
  - Creating VCPUs, aka Realm Execution Contexts (RECs), with initial register state
  - Creating stage2 translation tables at any level
  - Loading initial images into Realm memory from Normal world memory
  - Scheduling RECs (vCPUs) and handling exits
  - Injecting virtual interrupts into the Realm
  - Servicing stage2 runtime faults with pages (provided by the host, scrubbed by the RMM)
  - Creating "shared" mappings that can be accessed by the VMM/Hyp
  - Reclaiming the memory allocated for the RAM and RTTs (Realm Translation Tables)
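
The delegation model behind the first and last items can be illustrated with a
toy state machine. This is a sketch only: the function names echo the RMM
specification's naming style, but the helpers, return values and state tracking
below are hypothetical stand-ins, not the real RMI ABI.

```c
#include <assert.h>

/* Simplified granule life-cycle: Normal world -> delegated -> realm-owned */
enum granule_state { GRANULE_NS, GRANULE_DELEGATED, GRANULE_REALM };

/* RMI_GRANULE_DELEGATE-like step: host hands a physical page to the Realm world */
static int rmi_granule_delegate(enum granule_state *s)
{
	if (*s != GRANULE_NS)
		return -1;	/* an error code in the real ABI */
	*s = GRANULE_DELEGATED;
	return 0;
}

/* RMI_DATA_CREATE-like step: a delegated granule is populated for a Realm */
static int rmi_data_create(enum granule_state *s)
{
	if (*s != GRANULE_DELEGATED)
		return -1;
	*s = GRANULE_REALM;
	return 0;
}

/* RMI_GRANULE_UNDELEGATE-like step: reclaim; the RMM scrubs contents first */
static int rmi_granule_undelegate(enum granule_state *s)
{
	if (*s != GRANULE_DELEGATED)
		return -1;
	*s = GRANULE_NS;
	return 0;
}
```

The point of the model is that a page can never move directly from Normal world
use to Realm contents (or back) without passing through the delegated state, so
the host can reclaim memory only after the RMM has scrubbed it.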

However, v1.0 of the RMM specification doesn't support:
 - Paging out protected memory of a Realm VM. Thus the pages backing the
   protected memory region must be pinned.
 - Live migration of Realms.
 - Trusted device assignment.
 - Virtual interrupts backed by physical interrupts for Realms.

The RMM also provides certain services to the Realms via SMC, through the Realm
Service Interface (RSI). These include:
 - Realm Guest Configuration.
 - Attestation & Measurement services
 - Managing the state of an Intermediate Physical Address (IPA aka GPA) page.
 - Host Call service (Communication with the Normal world Hypervisor)
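
As a small concrete example of the ABI shape, the RSI version negotiated by the
guest is a simple major/minor packing; the following mirrors the
RSI_ABI_VERSION definitions and the compatibility check used by the patches
later in this thread.

```c
#include <assert.h>

/* Version packing as defined by this series' rsi_smc.h */
#define RSI_ABI_VERSION_MAJOR		1
#define RSI_ABI_VERSION_MINOR		0
#define RSI_ABI_VERSION			((RSI_ABI_VERSION_MAJOR << 16) | \
					 RSI_ABI_VERSION_MINOR)
#define RSI_ABI_VERSION_GET_MAJOR(v)	((v) >> 16)
#define RSI_ABI_VERSION_GET_MINOR(v)	((v) & 0xFFFF)

/* Compatibility rule used by the guest: same major, minor at least ours */
static int rsi_version_compatible(unsigned long ver)
{
	return ver >= RSI_ABI_VERSION &&
	       RSI_ABI_VERSION_GET_MAJOR(ver) == RSI_ABI_VERSION_MAJOR;
}
```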

The specification for the RMM software is currently at *v1.0-Beta2*; the
latest version is available here [1].

The Trusted Firmware foundation has an implementation of the RMM - TF-RMM -
available here [3].

Implementation
=================

This version of the stack is based on the RMM specification v1.0-Beta0[2], with
the following exceptions:
  - TF-RMM/KVM currently doesn't support the optional features of PMU,
    SVE and self-hosted debug (coming soon).
  - The RSI_HOST_CALL structure alignment requirement is reduced to match
    RMM v1.0 Beta1.
  - RMI/RSI version numbers do not match the RMM spec. This will be
    resolved once the spec/implementation is complete, across the
    TF-RMM+Linux stack.

We plan to update the stack to support the latest version of the RMMv1.0 spec
in the coming revisions.

This release includes the following components:

 a) Linux Kernel
     i) Host / KVM support - Support for driving the Realms via RMI. This is
     dependent on the kernel running at EL2 (aka VHE mode). It also provides a
     UABI for VMMs to manage the Realm VMs. The support is restricted to 4K
     page size, matching the stage2 granule supported by the RMM. The VMM is
     responsible for making sure the guest memory is locked.

       TODO: Guest Private memory[10] integration - We have been following the
       series and support will be added once it is merged upstream.
     
     ii) Guest support - Support for a Linux kernel to run in the Realm VM at
     Realm-EL1, using RSI services. This includes virtio support (virtio-v1.0
     only). All I/O is treated as non-secure/shared.
 
 b) kvmtool - VMM changes required to manage Realm VMs. No guest private memory
    support, as mentioned above.
 c) kvm-unit-tests - Support for running in Realms, along with additional tests
    for the RSI ABI.

Running the stack
====================

To run/test the stack, you would need the following components:

1) FVP Base AEM RevC model with FEAT_RME support [4]
2) TF-A firmware for EL3 [5]
3) TF-A RMM for R-EL2 [3]
4) Linux Kernel [6]
5) kvmtool [7]
6) kvm-unit-tests [8]

Instructions for building the firmware components and running the model are
available here [9]. Once the host kernel is booted, a Realm can be launched by
invoking the `lkvm` command as follows:

 $ lkvm run --realm 				 \
	 --measurement-algo=[sha256|sha512]	 \
	 --disable-sve				 \
	 <normal-vm-options>

Where:
 * --measurement-algo (optional) specifies the algorithm the RMM uses for the
   initial measurements of this Realm (defaults to sha256).
 * GICv3 is mandatory for the Realms.
 * SVE is not yet supported in the TF-RMM, and thus must be disabled using
   --disable-sve

You may also run the kvm-unit-tests inside the Realm world, using similar
options to the above.


Links
============

[0] Arm CCA Landing page (See Key Resources section for various documentations)
    https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture

[1] RMM Specification Latest
    https://developer.arm.com/documentation/den0137/latest

[2] RMM v1.0-Beta0 specification
    https://developer.arm.com/documentation/den0137/1-0bet0/

[3] Trusted Firmware RMM - TF-RMM
    https://www.trustedfirmware.org/projects/tf-rmm/
    GIT: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git

[4] FVP Base RevC AEM Model (available on x86_64 / Arm64 Linux)
    https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms

[5] Trusted Firmware for A class
    https://www.trustedfirmware.org/projects/tf-a/

[6] Linux kernel support for Arm-CCA
    https://gitlab.arm.com/linux-arm/linux-cca
    Host Support branch:	cca-host/rfc-v1
    Guest Support branch:	cca-guest/rfc-v1

[7] kvmtool support for Arm CCA
    https://gitlab.arm.com/linux-arm/kvmtool-cca cca/rfc-v1

[8] kvm-unit-tests support for Arm CCA
    https://gitlab.arm.com/linux-arm/kvm-unit-tests-cca  cca/rfc-v1

[9] Instructions for Building Firmware components and running the model, see
    section 4.19.2 "Building and running TF-A with RME"
    https://trustedfirmware-a.readthedocs.io/en/latest/components/realm-management-extension.html#building-and-running-tf-a-with-rme

[10] fd based Guest Private memory for KVM
   https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng@linux.intel.com

Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Andrew Jones <andrew.jones@linux.dev>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chao Peng <chao.p.peng@linux.intel.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Joey Gouly <Joey.Gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Quentin Perret <qperret@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zenghui Yu <yuzenghui@huawei.com>
To: linux-coco@lists.linux.dev
To: kvmarm@lists.linux.dev
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-arm-kernel@lists.infradead.org
To: linux-kernel@vger.kernel.org
To: kvm@vger.kernel.org


* [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA
  2023-01-27 11:22 [RFC] Support for Arm CCA VMs on Linux Suzuki K Poulose
@ 2023-01-27 11:27 ` Steven Price
  2023-01-27 11:27   ` [RFC PATCH 01/14] arm64: remove redundant 'extern' Steven Price
                     ` (13 more replies)
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                   ` (7 subsequent siblings)
  8 siblings, 14 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

This series is an RFC adding support for running Linux in a protected
VM under the new Arm Confidential Compute Architecture (CCA). The
purpose of this series is to gather feedback on the proposed changes to
the architecture code for CCA.

The ABI to the RMM from a realm (the RSI) is based on the Beta 0
specification[2] and will be updated in the future when a final version
of the specification is published.

This series is based on v6.2-rc1. It is also available as a git
repository:

https://gitlab.arm.com/linux-arm/linux-cca cca-guest/rfc-v1

Introduction
============
A more general introduction to Arm CCA is available on the Arm
website[3], and links to the other components involved are available in
the overall cover letter[4].

Arm Confidential Compute Architecture adds two new 'worlds' to the
architecture: Root and Realm. A new software component known as the RMM
(Realm Management Monitor) runs in Realm EL2 and is trusted by both the
Normal World and VMs running within Realms. This enables mutual
distrust between the Realm VMs and the Normal World.

Virtual machines running within a Realm can decide on a (4k) page-by-page
granularity whether to share a page with the (Normal World) host or to keep it
private (protected). This protection is provided by the hardware, and any
attempt by the Normal World to access a page which isn't shared will trigger a
Granule Protection Fault.

Realm VMs can communicate with the RMM via another SMC interface known
as RSI (Realm Services Interface). This series adds wrappers for the
full set of RSI commands and uses them to manage the Realm IPA State
(RIPAS) and to discover the configuration of the realm.

The VM running within the Realm needs to ensure that memory it is going to
use is marked as 'RIPAS_RAM' (i.e. protected memory accessible only to the
guest). This could be done by the VMM (and is subject to measurement to
ensure it is set up correctly), or the VM can set it itself. This series
includes a patch which iterates over all described RAM and sets the RIPAS.
An alternative would be to update booting.rst and state this as a
requirement, but that would reduce the flexibility of the VMM to manage the
memory available to the guest (as the initial RIPAS state is part of the
guest's measurement).

Within the Realm, the most significant active bit of the IPA is used to
select whether an access is to protected memory or to memory shared with the
host. This series treats this bit as if it were an attribute bit in the page
tables and modifies it when sharing/unsharing memory with the host.

This top bit usage also necessitates that the IPA width is made more
dynamic in the guest. The VMM will choose a width (and therefore which
bit controls the shared flag) and the guest must be able to identify
this bit to mask it out when necessary. PHYS_MASK_SHIFT/PHYS_MASK are
therefore made dynamic.
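
A minimal sketch of the resulting address arithmetic, assuming a hypothetical
48-bit IPA width reported by the RMM; the helper names below are illustrative,
not the symbols the series actually uses.

```c
#include <assert.h>

typedef unsigned long long u64;

/* With ipa_bits reported by the RMM, the top bit selects shared vs protected */
static u64 prot_ns_shared(unsigned int ipa_bits)
{
	return 1ULL << (ipa_bits - 1);
}

/* Dynamic PHYS_MASK: all address bits below the shared-selector bit */
static u64 phys_mask(unsigned int ipa_bits)
{
	return prot_ns_shared(ipa_bits) - 1;
}

/* Mark an IPA as shared with the host by setting the attribute bit */
static u64 ipa_to_shared(u64 ipa, unsigned int ipa_bits)
{
	return ipa | prot_ns_shared(ipa_bits);
}

/* Strip the attribute bit to recover the underlying address */
static u64 ipa_to_phys(u64 ipa, unsigned int ipa_bits)
{
	return ipa & phys_mask(ipa_bits);
}
```

Because ipa_bits is only known at runtime, the mask cannot be a compile-time
constant, which is why PHYS_MASK_SHIFT/PHYS_MASK become dynamic.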

To allow virtio to communicate with the host, the shared buffers must be
placed in memory which has this top IPA bit set. This is achieved by
implementing the set_memory_{encrypted,decrypted} APIs for arm64 and forcing
the use of bounce buffers. For now all device access is considered to
require the memory to be shared; at this stage there is no support for
assigning real devices to a realm guest. Obviously, if device assignment is
added this will have to change.
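
A hedged sketch of how such an API could couple the page-table attribute with
a RIPAS transition. Everything below (the stub RSI call, the toy single-entry
PTE, the fixed shared-bit position, and the ordering of the two steps) is a
hypothetical stand-in for the series' actual implementation.

```c
#include <assert.h>

#define PAGE_SIZE	4096ULL
#define SHARED_BIT	(1ULL << 47)	/* assumed top IPA bit for this sketch */

/* Stub standing in for an SMC_RSI_IPA_STATE_SET call to the RMM */
static int rsi_set_ipa_state(unsigned long long start, unsigned long long end,
			     int ripas_empty)
{
	(void)start; (void)end; (void)ripas_empty;
	return 0;	/* RSI_SUCCESS */
}

/* Toy linear-map "PTE" for one page, storing the output address */
static unsigned long long pte;

/* Share a range with the host: tell the RMM, then flip the attribute bit */
static int set_memory_decrypted(unsigned long long pa, int numpages)
{
	if (rsi_set_ipa_state(pa, pa + numpages * PAGE_SIZE, 1))
		return -1;
	pte = pa | SHARED_BIT;
	return 0;
}

/* Reclaim a range as protected: clear the bit, then update the RIPAS */
static int set_memory_encrypted(unsigned long long pa, int numpages)
{
	pte = pa & ~SHARED_BIT;
	return rsi_set_ipa_state(pa, pa + numpages * PAGE_SIZE, 0);
}
```

With these hooks in place, the generic swiotlb bounce-buffer machinery can
treat the realm like any other memory-encryption platform.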

Finally the GIC is (largely) emulated by the (untrusted) host. The RMM
provides some management (including register save/restore) but the
ITS buffers must be placed into shared memory for the host to emulate.
There is likely to be future work to harden the GIC driver against a
malicious host (along with any other drivers used within a Realm guest).

[1] https://lore.kernel.org/r/20221202061347.1070246-1-chao.p.peng%40linux.intel.com
[2] https://developer.arm.com/documentation/den0137/1-0bet0/
[3] https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture
[4] https://lore.kernel.org/r/20230127112248.136810-1-suzuki.poulose%40arm.com

Steven Price (9):
  arm64: remove redundant 'extern'
  arm64: Detect if in a realm and set RIPAS RAM
  arm64: realm: Query IPA size from the RMM
  arm64: Mark all I/O as non-secure shared
  arm64: Make the PHYS_MASK_SHIFT dynamic
  arm64: Enforce bounce buffers for realm DMA
  arm64: Enable memory encrypt for Realms
  arm64: realm: Support nonsecure ITS emulation shared
  HACK: Accept prototype RSI version

Suzuki K Poulose (5):
  arm64: rsi: Add RSI definitions
  fixmap: Allow architecture overriding set_fixmap_io
  arm64: Override set_fixmap_io
  arm64: Force device mappings to be non-secure shared
  efi: arm64: Map Device with Prot Shared

 arch/arm64/Kconfig                     |   3 +
 arch/arm64/include/asm/fixmap.h        |   4 +-
 arch/arm64/include/asm/io.h            |   6 +-
 arch/arm64/include/asm/kvm_arm.h       |   2 +-
 arch/arm64/include/asm/mem_encrypt.h   |  19 ++++
 arch/arm64/include/asm/pgtable-hwdef.h |   4 +-
 arch/arm64/include/asm/pgtable-prot.h  |   2 +
 arch/arm64/include/asm/pgtable.h       |   7 +-
 arch/arm64/include/asm/rsi.h           |  46 +++++++++
 arch/arm64/include/asm/rsi_cmds.h      |  71 +++++++++++++
 arch/arm64/include/asm/rsi_smc.h       | 132 +++++++++++++++++++++++++
 arch/arm64/kernel/Makefile             |   2 +-
 arch/arm64/kernel/efi.c                |   2 +-
 arch/arm64/kernel/head.S               |   2 +-
 arch/arm64/kernel/rsi.c                |  82 +++++++++++++++
 arch/arm64/kernel/setup.c              |   3 +
 arch/arm64/kvm/Kconfig                 |   8 ++
 arch/arm64/mm/init.c                   |  10 +-
 arch/arm64/mm/mmu.c                    |  13 +++
 arch/arm64/mm/pageattr.c               |  48 ++++++++-
 drivers/irqchip/irq-gic-v3-its.c       |  95 +++++++++++++-----
 include/asm-generic/fixmap.h           |   2 +
 22 files changed, 523 insertions(+), 40 deletions(-)
 create mode 100644 arch/arm64/include/asm/mem_encrypt.h
 create mode 100644 arch/arm64/include/asm/rsi.h
 create mode 100644 arch/arm64/include/asm/rsi_cmds.h
 create mode 100644 arch/arm64/include/asm/rsi_smc.h
 create mode 100644 arch/arm64/kernel/rsi.c

-- 
2.34.1



* [RFC PATCH 01/14] arm64: remove redundant 'extern'
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
@ 2023-01-27 11:27   ` Steven Price
  2023-01-27 11:27   ` [RFC PATCH 02/14] arm64: rsi: Add RSI definitions Steven Price
                     ` (12 subsequent siblings)
  13 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

It isn't necessary to mark function declarations 'extern', and doing so goes
against the kernel coding style. Remove the redundant extern keyword.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/fixmap.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 71ed5fdf718b..09ba9fe3b02c 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -107,7 +107,7 @@ void __init early_fixmap_init(void);
 #define __late_set_fixmap __set_fixmap
 #define __late_clear_fixmap(idx) __set_fixmap((idx), 0, FIXMAP_PAGE_CLEAR)
 
-extern void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot);
+void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot);
 
 #include <asm-generic/fixmap.h>
 
-- 
2.34.1



* [RFC PATCH 02/14] arm64: rsi: Add RSI definitions
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
  2023-01-27 11:27   ` [RFC PATCH 01/14] arm64: remove redundant 'extern' Steven Price
@ 2023-01-27 11:27   ` Steven Price
  2023-01-27 11:27   ` [RFC PATCH 03/14] arm64: Detect if in a realm and set RIPAS RAM Steven Price
                     ` (11 subsequent siblings)
  13 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

From: Suzuki K Poulose <suzuki.poulose@arm.com>

The RMM (Realm Management Monitor) provides functionality that can be
accessed by a realm guest through SMC calls defined by the Realm Services
Interface (RSI).

The SMC definitions are based on DEN0137[1] version A-bet0.

[1] https://developer.arm.com/documentation/den0137/latest

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/rsi_cmds.h |  57 +++++++++++++
 arch/arm64/include/asm/rsi_smc.h  | 130 ++++++++++++++++++++++++++++++
 2 files changed, 187 insertions(+)
 create mode 100644 arch/arm64/include/asm/rsi_cmds.h
 create mode 100644 arch/arm64/include/asm/rsi_smc.h

diff --git a/arch/arm64/include/asm/rsi_cmds.h b/arch/arm64/include/asm/rsi_cmds.h
new file mode 100644
index 000000000000..a0b3c1bd786a
--- /dev/null
+++ b/arch/arm64/include/asm/rsi_cmds.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 ARM Ltd.
+ */
+
+#ifndef __ASM_RSI_CMDS_H
+#define __ASM_RSI_CMDS_H
+
+#include <linux/arm-smccc.h>
+
+#include <asm/rsi_smc.h>
+
+enum ripas {
+	RSI_RIPAS_EMPTY,
+	RSI_RIPAS_RAM,
+};
+
+static inline unsigned long rsi_get_version(void)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_smc(SMC_RSI_ABI_VERSION, 0, 0, 0, 0, 0, 0, 0, &res);
+
+	return res.a0;
+}
+
+static inline unsigned long invoke_rsi_fn_smc(unsigned long function_id,
+					      unsigned long arg0,
+					      unsigned long arg1,
+					      unsigned long arg2,
+					      unsigned long arg3)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_smc(function_id, arg0, arg1, arg2, arg3, 0, 0, 0, &res);
+	return res.a0;
+}
+
+static inline void invoke_rsi_fn_smc_with_res(unsigned long function_id,
+					      unsigned long arg0,
+					      unsigned long arg1,
+					      unsigned long arg2,
+					      unsigned long arg3,
+					      struct arm_smccc_res *res)
+{
+	arm_smccc_smc(function_id, arg0, arg1, arg2, arg3, 0, 0, 0, res);
+}
+
+static inline unsigned long rsi_set_addr_range_state(phys_addr_t start,
+						     phys_addr_t end,
+						     enum ripas state)
+{
+	return invoke_rsi_fn_smc(SMC_RSI_IPA_STATE_SET,
+				 start, (end - start), state, 0);
+}
+
+#endif
diff --git a/arch/arm64/include/asm/rsi_smc.h b/arch/arm64/include/asm/rsi_smc.h
new file mode 100644
index 000000000000..bc0cdd83f164
--- /dev/null
+++ b/arch/arm64/include/asm/rsi_smc.h
@@ -0,0 +1,130 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 ARM Ltd.
+ */
+
+#ifndef __SMC_RSI_H_
+#define __SMC_RSI_H_
+
+/*
+ * This file describes the Realm Services Interface (RSI) Application Binary
+ * Interface (ABI) for SMC calls made from within the Realm to the RMM and
+ * serviced by the RMM.
+ */
+
+#define SMC_RSI_CALL_BASE		0xC4000000
+
+/*
+ * The major version number of the RSI implementation.  Increase this whenever
+ * the binary format or semantics of the SMC calls change.
+ */
+#define RSI_ABI_VERSION_MAJOR		1
+
+/*
+ * The minor version number of the RSI implementation.  Increase this when
+ * a bug is fixed, or a feature is added without breaking binary compatibility.
+ */
+#define RSI_ABI_VERSION_MINOR		0
+
+#define RSI_ABI_VERSION			((RSI_ABI_VERSION_MAJOR << 16) | \
+					 RSI_ABI_VERSION_MINOR)
+
+#define RSI_ABI_VERSION_GET_MAJOR(_version) ((_version) >> 16)
+#define RSI_ABI_VERSION_GET_MINOR(_version) ((_version) & 0xFFFF)
+
+#define RSI_SUCCESS			0
+#define RSI_ERROR_INPUT			1
+#define RSI_ERROR_STATE			2
+#define RSI_INCOMPLETE			3
+
+#define SMC_RSI_FID(_x)			(SMC_RSI_CALL_BASE + (_x))
+
+#define SMC_RSI_ABI_VERSION			SMC_RSI_FID(0x190)
+
+/*
+ * arg1 == The IPA of token buffer
+ * arg2 == Challenge value, bytes:  0 -  7
+ * arg3 == Challenge value, bytes:  8 - 15
+ * arg4 == Challenge value, bytes: 16 - 23
+ * arg5 == Challenge value, bytes: 24 - 31
+ * arg6 == Challenge value, bytes: 32 - 39
+ * arg7 == Challenge value, bytes: 40 - 47
+ * arg8 == Challenge value, bytes: 48 - 55
+ * arg9 == Challenge value, bytes: 56 - 63
+ * ret0 == Status / error
+ */
+#define SMC_RSI_ATTESTATION_TOKEN_INIT		SMC_RSI_FID(0x194)
+
+/*
+ * arg1 == The IPA of token buffer
+ * ret0 == Status / error
+ * ret1 == Size of completed token in bytes
+ */
+#define SMC_RSI_ATTESTATION_TOKEN_CONTINUE	SMC_RSI_FID(0x195)
+
+/*
+ * arg1  == Index, which measurements slot to extend
+ * arg2  == Size of realm measurement in bytes, max 64 bytes
+ * arg3  == Measurement value, bytes:  0 -  7
+ * arg4  == Measurement value, bytes:  8 - 15
+ * arg5  == Measurement value, bytes: 16 - 23
+ * arg6  == Measurement value, bytes: 24 - 31
+ * arg7  == Measurement value, bytes: 32 - 39
+ * arg8  == Measurement value, bytes: 40 - 47
+ * arg9  == Measurement value, bytes: 48 - 55
+ * arg10 == Measurement value, bytes: 56 - 63
+ * ret0  == Status / error
+ */
+#define SMC_RSI_MEASUREMENT_EXTEND		SMC_RSI_FID(0x193)
+
+/*
+ * arg1 == Index, which measurements slot to read
+ * ret0 == Status / error
+ * ret1 == Measurement value, bytes:  0 -  7
+ * ret2 == Measurement value, bytes:  8 - 15
+ * ret3 == Measurement value, bytes: 16 - 23
+ * ret4 == Measurement value, bytes: 24 - 31
+ * ret5 == Measurement value, bytes: 32 - 39
+ * ret6 == Measurement value, bytes: 40 - 47
+ * ret7 == Measurement value, bytes: 48 - 55
+ * ret8 == Measurement value, bytes: 56 - 63
+ */
+#define SMC_RSI_MEASUREMENT_READ		SMC_RSI_FID(0x192)
+
+#ifndef __ASSEMBLY__
+
+struct realm_config {
+	unsigned long ipa_bits; /* Width of IPA in bits */
+};
+
+#endif /* __ASSEMBLY__ */
+
+/*
+ * arg1 == struct realm_config addr
+ * ret0 == Status / error
+ */
+#define SMC_RSI_REALM_CONFIG			SMC_RSI_FID(0x196)
+
+/*
+ * arg1 == IPA address of target region
+ * arg2 == size of target region in bytes
+ * arg3 == RIPAS value
+ * ret0 == Status / error
+ * ret1 == Top of modified IPA range
+ */
+#define SMC_RSI_IPA_STATE_SET			SMC_RSI_FID(0x197)
+
+/*
+ * arg1 == IPA of target page
+ * ret0 == Status / error
+ * ret1 == RIPAS value
+ */
+#define SMC_RSI_IPA_STATE_GET			SMC_RSI_FID(0x198)
+
+/*
+ * arg1 == IPA of host call structure
+ * ret0 == Status / error
+ */
+#define SMC_RSI_HOST_CALL			SMC_RSI_FID(0x199)
+
+#endif /* __SMC_RSI_H_ */
-- 
2.34.1



* [RFC PATCH 03/14] arm64: Detect if in a realm and set RIPAS RAM
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
  2023-01-27 11:27   ` [RFC PATCH 01/14] arm64: remove redundant 'extern' Steven Price
  2023-01-27 11:27   ` [RFC PATCH 02/14] arm64: rsi: Add RSI definitions Steven Price
@ 2023-01-27 11:27   ` Steven Price
  2023-01-27 11:27   ` [RFC PATCH 04/14] arm64: realm: Query IPA size from the RMM Steven Price
                     ` (10 subsequent siblings)
  13 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

Detect that the VM is a realm guest by the presence of the RSI interface.

If in a realm then all memory needs to be marked as RIPAS RAM initially; the
loader may or may not have done this for us. To be sure, iterate over all
RAM and mark it as such. Any failure is fatal, as that implies the RAM
regions passed to Linux are incorrect - which would mean failing later when
attempting to access non-existent RAM.

Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/rsi.h      | 46 ++++++++++++++++++++++++++++
 arch/arm64/include/asm/rsi_cmds.h | 12 ++++++--
 arch/arm64/kernel/Makefile        |  2 +-
 arch/arm64/kernel/rsi.c           | 50 +++++++++++++++++++++++++++++++
 arch/arm64/kernel/setup.c         |  3 ++
 arch/arm64/mm/init.c              |  2 ++
 6 files changed, 111 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm64/include/asm/rsi.h
 create mode 100644 arch/arm64/kernel/rsi.c

diff --git a/arch/arm64/include/asm/rsi.h b/arch/arm64/include/asm/rsi.h
new file mode 100644
index 000000000000..3b56aac5dc43
--- /dev/null
+++ b/arch/arm64/include/asm/rsi.h
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 ARM Ltd.
+ */
+
+#ifndef __ASM_RSI_H_
+#define __ASM_RSI_H_
+
+#include <linux/jump_label.h>
+#include <asm/rsi_cmds.h>
+
+extern struct static_key_false rsi_present;
+
+void arm64_setup_memory(void);
+
+void __init arm64_rsi_init(void);
+static inline bool is_realm_world(void)
+{
+	return static_branch_unlikely(&rsi_present);
+}
+
+static inline void set_memory_range(phys_addr_t start, phys_addr_t end,
+				    enum ripas state)
+{
+	unsigned long ret;
+	phys_addr_t top;
+
+	while (start != end) {
+		ret = rsi_set_addr_range_state(start, end, state, &top);
+		BUG_ON(ret);
+		BUG_ON(top < start);
+		BUG_ON(top > end);
+		start = top;
+	}
+}
+
+static inline void set_memory_range_protected(phys_addr_t start, phys_addr_t end)
+{
+	set_memory_range(start, end, RSI_RIPAS_RAM);
+}
+
+static inline void set_memory_range_shared(phys_addr_t start, phys_addr_t end)
+{
+	set_memory_range(start, end, RSI_RIPAS_EMPTY);
+}
+#endif
diff --git a/arch/arm64/include/asm/rsi_cmds.h b/arch/arm64/include/asm/rsi_cmds.h
index a0b3c1bd786a..ee0df00efd87 100644
--- a/arch/arm64/include/asm/rsi_cmds.h
+++ b/arch/arm64/include/asm/rsi_cmds.h
@@ -48,10 +48,16 @@ static inline void invoke_rsi_fn_smc_with_res(unsigned long function_id,
 
 static inline unsigned long rsi_set_addr_range_state(phys_addr_t start,
 						     phys_addr_t end,
-						     enum ripas state)
+						     enum ripas state,
+						     phys_addr_t *top)
 {
-	return invoke_rsi_fn_smc(SMC_RSI_IPA_STATE_SET,
-				 start, (end - start), state, 0);
+	struct arm_smccc_res res;
+
+	invoke_rsi_fn_smc_with_res(SMC_RSI_IPA_STATE_SET,
+				   start, (end - start), state, 0, &res);
+
+	*top = res.a1;
+	return res.a0;
 }
 
 #endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index ceba6792f5b3..f301c2ad2fa7 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -34,7 +34,7 @@ obj-y			:= debug-monitors.o entry.o irq.o fpsimd.o		\
 			   cpufeature.o alternative.o cacheinfo.o		\
 			   smp.o smp_spin_table.o topology.o smccc-call.o	\
 			   syscall.o proton-pack.o idreg-override.o idle.o	\
-			   patching.o
+			   patching.o rsi.o
 
 obj-$(CONFIG_COMPAT)			+= sys32.o signal32.o			\
 					   sys_compat.o
diff --git a/arch/arm64/kernel/rsi.c b/arch/arm64/kernel/rsi.c
new file mode 100644
index 000000000000..b354ac661c9d
--- /dev/null
+++ b/arch/arm64/kernel/rsi.c
@@ -0,0 +1,50 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023 ARM Ltd.
+ */
+
+#include <linux/jump_label.h>
+#include <linux/memblock.h>
+#include <asm/rsi.h>
+
+DEFINE_STATIC_KEY_FALSE_RO(rsi_present);
+
+static bool rsi_version_matches(void)
+{
+	unsigned long ver = rsi_get_version();
+
+	if (ver == SMCCC_RET_NOT_SUPPORTED)
+		return false;
+
+	pr_info("RME: RSI version %lu.%lu advertised\n",
+		RSI_ABI_VERSION_GET_MAJOR(ver),
+		RSI_ABI_VERSION_GET_MINOR(ver));
+
+	return (ver >= RSI_ABI_VERSION &&
+		RSI_ABI_VERSION_GET_MAJOR(ver) == RSI_ABI_VERSION_MAJOR);
+}
+
+void arm64_setup_memory(void)
+{
+	u64 i;
+	phys_addr_t start, end;
+
+	if (!static_branch_unlikely(&rsi_present))
+		return;
+
+	/*
+	 * Iterate over the available memory ranges
+	 * and convert the state to protected memory.
+	 */
+	for_each_mem_range(i, &start, &end) {
+		set_memory_range_protected(start, end);
+	}
+}
+
+void __init arm64_rsi_init(void)
+{
+	if (!rsi_version_matches())
+		return;
+
+	static_branch_enable(&rsi_present);
+}
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 12cfe9d0d3fa..ea89ee563135 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -43,6 +43,7 @@
 #include <asm/cpu_ops.h>
 #include <asm/kasan.h>
 #include <asm/numa.h>
+#include <asm/rsi.h>
 #include <asm/scs.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
@@ -312,6 +313,8 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
 	 * cpufeature code and early parameters.
 	 */
 	jump_label_init();
+	/* Init RSI after jump_labels are active */
+	arm64_rsi_init();
 	parse_early_param();
 
 	dynamic_scs_init();
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 58a0bb2c17f1..fa9088add624 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -39,6 +39,7 @@
 #include <asm/kvm_host.h>
 #include <asm/memory.h>
 #include <asm/numa.h>
+#include <asm/rsi.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
 #include <linux/sizes.h>
@@ -412,6 +413,7 @@ void __init arm64_memblock_init(void)
 		reserve_crashkernel();
 
 	high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
+	arm64_setup_memory();
 }
 
 void __init bootmem_init(void)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC PATCH 04/14] arm64: realm: Query IPA size from the RMM
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
                     ` (2 preceding siblings ...)
  2023-01-27 11:27   ` [RFC PATCH 03/14] arm64: Detect if in a realm and set RIPAS RAM Steven Price
@ 2023-01-27 11:27   ` Steven Price
  2023-01-27 11:27   ` [RFC PATCH 05/14] arm64: Mark all I/O as non-secure shared Steven Price
                     ` (9 subsequent siblings)
  13 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

The top bit of the configured IPA size is used as an attribute to
control whether the address is protected or shared. Query the
configuration from the RMM to ascertain which bit this is.

Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/pgtable-prot.h | 2 ++
 arch/arm64/include/asm/rsi_cmds.h     | 8 ++++++++
 arch/arm64/kernel/rsi.c               | 8 ++++++++
 3 files changed, 18 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index 9b165117a454..3f24080d6cc9 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -33,7 +33,9 @@
 #include <asm/pgtable-types.h>
 
 extern bool arm64_use_ng_mappings;
+extern unsigned long prot_ns_shared;
 
+#define PROT_NS_SHARED		((prot_ns_shared))
 #define _PROT_DEFAULT		(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
 #define _PROT_SECT_DEFAULT	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
 
diff --git a/arch/arm64/include/asm/rsi_cmds.h b/arch/arm64/include/asm/rsi_cmds.h
index ee0df00efd87..e13f702de6c2 100644
--- a/arch/arm64/include/asm/rsi_cmds.h
+++ b/arch/arm64/include/asm/rsi_cmds.h
@@ -46,6 +46,14 @@ static inline void invoke_rsi_fn_smc_with_res(unsigned long function_id,
 	arm_smccc_smc(function_id, arg0, arg1, arg2, arg3, 0, 0, 0, res);
 }
 
+static inline unsigned long rsi_get_realm_config(struct realm_config *cfg)
+{
+	struct arm_smccc_res res;
+
+	invoke_rsi_fn_smc_with_res(SMC_RSI_REALM_CONFIG, virt_to_phys(cfg), 0, 0, 0, &res);
+	return res.a0;
+}
+
 static inline unsigned long rsi_set_addr_range_state(phys_addr_t start,
 						     phys_addr_t end,
 						     enum ripas state,
diff --git a/arch/arm64/kernel/rsi.c b/arch/arm64/kernel/rsi.c
index b354ac661c9d..9c63ee1c6979 100644
--- a/arch/arm64/kernel/rsi.c
+++ b/arch/arm64/kernel/rsi.c
@@ -7,6 +7,11 @@
 #include <linux/memblock.h>
 #include <asm/rsi.h>
 
+struct realm_config __attribute((aligned(PAGE_SIZE))) config;
+
+unsigned long prot_ns_shared;
+EXPORT_SYMBOL(prot_ns_shared);
+
 DEFINE_STATIC_KEY_FALSE_RO(rsi_present);
 
 static bool rsi_version_matches(void)
@@ -45,6 +50,9 @@ void __init arm64_rsi_init(void)
 {
 	if (!rsi_version_matches())
 		return;
+	if (rsi_get_realm_config(&config))
+		return;
+	prot_ns_shared = BIT(config.ipa_bits - 1);
 
 	static_branch_enable(&rsi_present);
 }
-- 
2.34.1



* [RFC PATCH 05/14] arm64: Mark all I/O as non-secure shared
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
                     ` (3 preceding siblings ...)
  2023-01-27 11:27   ` [RFC PATCH 04/14] arm64: realm: Query IPA size from the RMM Steven Price
@ 2023-01-27 11:27   ` Steven Price
  2023-01-27 11:27   ` [RFC PATCH 06/14] fixmap: Allow architecture overriding set_fixmap_io Steven Price
                     ` (8 subsequent siblings)
  13 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

All I/O is considered non-secure for realms by default. As such, mark
I/O mappings as shared with the host.

Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/io.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 877495a0fd0c..b1a9c22aed72 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -142,12 +142,12 @@ extern void __memset_io(volatile void __iomem *, int, size_t);
 bool ioremap_allowed(phys_addr_t phys_addr, size_t size, unsigned long prot);
 #define ioremap_allowed ioremap_allowed
 
-#define _PAGE_IOREMAP PROT_DEVICE_nGnRE
+#define _PAGE_IOREMAP (PROT_DEVICE_nGnRE | PROT_NS_SHARED)
 
 #define ioremap_wc(addr, size)	\
-	ioremap_prot((addr), (size), PROT_NORMAL_NC)
+	ioremap_prot((addr), (size), (PROT_NORMAL_NC | PROT_NS_SHARED))
 #define ioremap_np(addr, size)	\
-	ioremap_prot((addr), (size), PROT_DEVICE_nGnRnE)
+	ioremap_prot((addr), (size), (PROT_DEVICE_nGnRnE | PROT_NS_SHARED))
 
 /*
  * io{read,write}{16,32,64}be() macros
-- 
2.34.1



* [RFC PATCH 06/14] fixmap: Allow architecture overriding set_fixmap_io
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
                     ` (4 preceding siblings ...)
  2023-01-27 11:27   ` [RFC PATCH 05/14] arm64: Mark all I/O as non-secure shared Steven Price
@ 2023-01-27 11:27   ` Steven Price
  2023-01-27 11:27   ` [RFC PATCH 07/14] arm64: Override set_fixmap_io Steven Price
                     ` (7 subsequent siblings)
  13 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

From: Suzuki K Poulose <suzuki.poulose@arm.com>

For a realm guest it will be necessary to ensure I/O mappings are shared
so that the VMM can emulate the device. The following patch will provide
an implementation of set_fixmap_io for arm64 that sets the shared bit
(if in a realm).

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 include/asm-generic/fixmap.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/asm-generic/fixmap.h b/include/asm-generic/fixmap.h
index 8cc7b09c1bc7..c5ce0368c1ee 100644
--- a/include/asm-generic/fixmap.h
+++ b/include/asm-generic/fixmap.h
@@ -94,8 +94,10 @@ static inline unsigned long virt_to_fix(const unsigned long vaddr)
 /*
  * Some fixmaps are for IO
  */
+#ifndef set_fixmap_io
 #define set_fixmap_io(idx, phys) \
 	__set_fixmap(idx, phys, FIXMAP_PAGE_IO)
+#endif
 
 #define set_fixmap_offset_io(idx, phys) \
 	__set_fixmap_offset(idx, phys, FIXMAP_PAGE_IO)
-- 
2.34.1



* [RFC PATCH 07/14] arm64: Override set_fixmap_io
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
                     ` (5 preceding siblings ...)
  2023-01-27 11:27   ` [RFC PATCH 06/14] fixmap: Allow architecture overriding set_fixmap_io Steven Price
@ 2023-01-27 11:27   ` Steven Price
  2023-01-27 11:27   ` [RFC PATCH 08/14] arm64: Make the PHYS_MASK_SHIFT dynamic Steven Price
                     ` (6 subsequent siblings)
  13 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

From: Suzuki K Poulose <suzuki.poulose@arm.com>

Override set_fixmap_io to set shared permission for the host in the
case of a confidential (CC) guest. For now the mapping is marked shared
unconditionally; future changes could filter on the physical address
and make the decision accordingly.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/fixmap.h |  2 ++
 arch/arm64/mm/mmu.c             | 13 +++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 09ba9fe3b02c..1acafc1c7fae 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -107,6 +107,8 @@ void __init early_fixmap_init(void);
 #define __late_set_fixmap __set_fixmap
 #define __late_clear_fixmap(idx) __set_fixmap((idx), 0, FIXMAP_PAGE_CLEAR)
 
+#define set_fixmap_io set_fixmap_io
+void set_fixmap_io(enum fixed_addresses idx, phys_addr_t phys);
 void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot);
 
 #include <asm-generic/fixmap.h>
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 14c87e8d69d8..33fda73c669b 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1279,6 +1279,19 @@ void __set_fixmap(enum fixed_addresses idx,
 	}
 }
 
+void set_fixmap_io(enum fixed_addresses idx, phys_addr_t phys)
+{
+	pgprot_t prot = FIXMAP_PAGE_IO;
+
+	/*
+	 * For now we consider all I/O as non-secure. For future
+	 * filter the I/O base for setting appropriate permissions.
+	 */
+	prot = __pgprot(pgprot_val(prot) | PROT_NS_SHARED);
+
+	return __set_fixmap(idx, phys, prot);
+}
+
 void *__init fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot)
 {
 	const u64 dt_virt_base = __fix_to_virt(FIX_FDT);
-- 
2.34.1



* [RFC PATCH 08/14] arm64: Make the PHYS_MASK_SHIFT dynamic
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
                     ` (6 preceding siblings ...)
  2023-01-27 11:27   ` [RFC PATCH 07/14] arm64: Override set_fixmap_io Steven Price
@ 2023-01-27 11:27   ` Steven Price
  2023-01-27 11:27   ` [RFC PATCH 09/14] arm64: Enforce bounce buffers for realm DMA Steven Price
                     ` (5 subsequent siblings)
  13 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

Make PHYS_MASK_SHIFT dynamic for realms. This is only required for
masking the PFN out of a pte entry; elsewhere, we can still use the PA
bits configured by the kernel. So, this patch:

 -> renames PHYS_MASK_SHIFT to MAX_PHYS_MASK_SHIFT, the maximum
    supported by the kernel
 -> makes PHYS_MASK_SHIFT a dynamic value holding the (I)PA bit width
 -> for a realm: reduces phys_mask_shift if the RMM reports a smaller
    configured size for the guest

Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_arm.h       | 2 +-
 arch/arm64/include/asm/pgtable-hwdef.h | 4 ++--
 arch/arm64/include/asm/pgtable.h       | 5 +++++
 arch/arm64/kernel/head.S               | 2 +-
 arch/arm64/kernel/rsi.c                | 5 +++++
 5 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 0df3fc3a0173..924f84024009 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -346,7 +346,7 @@
  * bits in PAR are res0.
  */
 #define PAR_TO_HPFAR(par)		\
-	(((par) & GENMASK_ULL(52 - 1, 12)) >> 8)
+	(((par) & GENMASK_ULL(MAX_PHYS_MASK_SHIFT - 1, 12)) >> 8)
 
 #define ECN(x) { ESR_ELx_EC_##x, #x }
 
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index f658aafc47df..677bf7a91616 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -178,8 +178,8 @@
 /*
  * Highest possible physical address supported.
  */
-#define PHYS_MASK_SHIFT		(CONFIG_ARM64_PA_BITS)
-#define PHYS_MASK		((UL(1) << PHYS_MASK_SHIFT) - 1)
+#define MAX_PHYS_MASK_SHIFT	(CONFIG_ARM64_PA_BITS)
+#define MAX_PHYS_MASK		((UL(1) << MAX_PHYS_MASK_SHIFT) - 1)
 
 #define TTBR_CNP_BIT		(UL(1) << 0)
 
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index b4bbeed80fb6..a1319a743b38 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -35,6 +35,11 @@
 #include <linux/sched.h>
 #include <linux/page_table_check.h>
 
+extern unsigned int phys_mask_shift;
+
+#define PHYS_MASK_SHIFT		(phys_mask_shift)
+#define PHYS_MASK		((1UL << PHYS_MASK_SHIFT) - 1)
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
 
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 952e17bd1c0b..a05504667b69 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -321,7 +321,7 @@ SYM_FUNC_START_LOCAL(create_idmap)
 #error "Mismatch between VA_BITS and page size/number of translation levels"
 #endif
 #else
-#define IDMAP_PGD_ORDER	(PHYS_MASK_SHIFT - PGDIR_SHIFT)
+#define IDMAP_PGD_ORDER	(MAX_PHYS_MASK_SHIFT - PGDIR_SHIFT)
 #define EXTRA_SHIFT
 	/*
 	 * If VA_BITS == 48, we don't have to configure an additional
diff --git a/arch/arm64/kernel/rsi.c b/arch/arm64/kernel/rsi.c
index 9c63ee1c6979..49d36dfe0064 100644
--- a/arch/arm64/kernel/rsi.c
+++ b/arch/arm64/kernel/rsi.c
@@ -12,6 +12,8 @@ struct realm_config __attribute((aligned(PAGE_SIZE))) config;
 unsigned long prot_ns_shared;
 EXPORT_SYMBOL(prot_ns_shared);
 
+unsigned int phys_mask_shift = CONFIG_ARM64_PA_BITS;
+
 DEFINE_STATIC_KEY_FALSE_RO(rsi_present);
 
 static bool rsi_version_matches(void)
@@ -54,5 +56,8 @@ void __init arm64_rsi_init(void)
 		return;
 	prot_ns_shared = BIT(config.ipa_bits - 1);
 
+	if (config.ipa_bits - 1 < phys_mask_shift)
+		phys_mask_shift = config.ipa_bits - 1;
+
 	static_branch_enable(&rsi_present);
 }
-- 
2.34.1



* [RFC PATCH 09/14] arm64: Enforce bounce buffers for realm DMA
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
                     ` (7 preceding siblings ...)
  2023-01-27 11:27   ` [RFC PATCH 08/14] arm64: Make the PHYS_MASK_SHIFT dynamic Steven Price
@ 2023-01-27 11:27   ` Steven Price
  2023-01-27 11:27   ` [RFC PATCH 10/14] arm64: Enable memory encrypt for Realms Steven Price
                     ` (4 subsequent siblings)
  13 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

Within a realm guest it's not possible for a device emulated by the VMM
to access arbitrary guest memory. So force the use of bounce buffers to
ensure that the memory the emulated devices are accessing is in memory
which is explicitly shared with the host.

Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kernel/rsi.c | 2 ++
 arch/arm64/mm/init.c    | 8 +++++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/rsi.c b/arch/arm64/kernel/rsi.c
index 49d36dfe0064..1a07eefdd2e9 100644
--- a/arch/arm64/kernel/rsi.c
+++ b/arch/arm64/kernel/rsi.c
@@ -5,6 +5,8 @@
 
 #include <linux/jump_label.h>
 #include <linux/memblock.h>
+#include <linux/swiotlb.h>
+
 #include <asm/rsi.h>
 
 struct realm_config __attribute((aligned(PAGE_SIZE))) config;
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index fa9088add624..32a4710ad861 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -472,7 +472,13 @@ void __init bootmem_init(void)
  */
 void __init mem_init(void)
 {
-	swiotlb_init(max_pfn > PFN_DOWN(arm64_dma_phys_limit), SWIOTLB_VERBOSE);
+	if (is_realm_world()) {
+		swiotlb_init(true, SWIOTLB_VERBOSE | SWIOTLB_FORCE);
+		swiotlb_update_mem_attributes();
+	} else {
+		swiotlb_init(max_pfn > PFN_DOWN(arm64_dma_phys_limit),
+			     SWIOTLB_VERBOSE);
+	}
 
 	/* this will put all unused low memory onto the freelists */
 	memblock_free_all();
-- 
2.34.1



* [RFC PATCH 10/14] arm64: Enable memory encrypt for Realms
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
                     ` (8 preceding siblings ...)
  2023-01-27 11:27   ` [RFC PATCH 09/14] arm64: Enforce bounce buffers for realm DMA Steven Price
@ 2023-01-27 11:27   ` Steven Price
  2023-01-27 11:27   ` [RFC PATCH 11/14] arm64: Force device mappings to be non-secure shared Steven Price
                     ` (3 subsequent siblings)
  13 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

Use the memory encryption APIs to trigger an RSI call requesting a
transition between protected and shared memory (or vice versa), and
update the kernel's linear map of the modified pages to flip the top
bit of the IPA. This requires that block mappings are not used in the
direct map for realm guests.

Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/Kconfig                   |  3 ++
 arch/arm64/include/asm/mem_encrypt.h | 19 +++++++++++
 arch/arm64/kernel/rsi.c              | 12 +++++++
 arch/arm64/mm/pageattr.c             | 48 +++++++++++++++++++++++++---
 4 files changed, 78 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm64/include/asm/mem_encrypt.h

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 03934808b2ed..0aac44a993ac 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -20,6 +20,7 @@ config ARM64
 	select ARCH_ENABLE_SPLIT_PMD_PTLOCK if PGTABLE_LEVELS > 2
 	select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
 	select ARCH_HAS_CACHE_LINE_SIZE
+	select ARCH_HAS_CC_PLATFORM
 	select ARCH_HAS_CURRENT_STACK_POINTER
 	select ARCH_HAS_DEBUG_VIRTUAL
 	select ARCH_HAS_DEBUG_VM_PGTABLE
@@ -39,6 +40,8 @@ config ARM64
 	select ARCH_HAS_SETUP_DMA_OPS
 	select ARCH_HAS_SET_DIRECT_MAP
 	select ARCH_HAS_SET_MEMORY
+	select ARCH_HAS_MEM_ENCRYPT
+	select ARCH_HAS_FORCE_DMA_UNENCRYPTED
 	select ARCH_STACKWALK
 	select ARCH_HAS_STRICT_KERNEL_RWX
 	select ARCH_HAS_STRICT_MODULE_RWX
diff --git a/arch/arm64/include/asm/mem_encrypt.h b/arch/arm64/include/asm/mem_encrypt.h
new file mode 100644
index 000000000000..7381f9585321
--- /dev/null
+++ b/arch/arm64/include/asm/mem_encrypt.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2023 ARM Ltd.
+ */
+
+#ifndef __ASM_MEM_ENCRYPT_H
+#define __ASM_MEM_ENCRYPT_H
+
+#include <asm/rsi.h>
+
+/* All DMA must be to non-secure memory for now */
+static inline bool force_dma_unencrypted(struct device *dev)
+{
+	return is_realm_world();
+}
+
+int set_memory_encrypted(unsigned long addr, int numpages);
+int set_memory_decrypted(unsigned long addr, int numpages);
+#endif
diff --git a/arch/arm64/kernel/rsi.c b/arch/arm64/kernel/rsi.c
index 1a07eefdd2e9..1cc292826f2b 100644
--- a/arch/arm64/kernel/rsi.c
+++ b/arch/arm64/kernel/rsi.c
@@ -6,6 +6,7 @@
 #include <linux/jump_label.h>
 #include <linux/memblock.h>
 #include <linux/swiotlb.h>
+#include <linux/cc_platform.h>
 
 #include <asm/rsi.h>
 
@@ -18,6 +19,17 @@ unsigned int phys_mask_shift = CONFIG_ARM64_PA_BITS;
 
 DEFINE_STATIC_KEY_FALSE_RO(rsi_present);
 
+bool cc_platform_has(enum cc_attr attr)
+{
+	switch (attr) {
+	case CC_ATTR_MEM_ENCRYPT:
+		return is_realm_world();
+	default:
+		return false;
+	}
+}
+EXPORT_SYMBOL_GPL(cc_platform_has);
+
 static bool rsi_version_matches(void)
 {
 	unsigned long ver = rsi_get_version();
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index 79dd201c59d8..bbd7364dd9a8 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -5,10 +5,12 @@
 #include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/module.h>
+#include <linux/mem_encrypt.h>
 #include <linux/sched.h>
 #include <linux/vmalloc.h>
 
 #include <asm/cacheflush.h>
+#include <asm/pgtable-prot.h>
 #include <asm/set_memory.h>
 #include <asm/tlbflush.h>
 
@@ -22,12 +24,12 @@ bool rodata_full __ro_after_init = IS_ENABLED(CONFIG_RODATA_FULL_DEFAULT_ENABLED
 bool can_set_direct_map(void)
 {
 	/*
-	 * rodata_full, DEBUG_PAGEALLOC and KFENCE require linear map to be
-	 * mapped at page granularity, so that it is possible to
-	 * protect/unprotect single pages.
+	 * rodata_full, DEBUG_PAGEALLOC, KFENCE and a Realm guest all require
+	 * linear map to be mapped at page granularity, so that it is possible
+	 * to protect/unprotect single pages.
 	 */
 	return (rodata_enabled && rodata_full) || debug_pagealloc_enabled() ||
-		IS_ENABLED(CONFIG_KFENCE);
+		IS_ENABLED(CONFIG_KFENCE) || is_realm_world();
 }
 
 static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
@@ -38,6 +40,7 @@ static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
 	pte = clear_pte_bit(pte, cdata->clear_mask);
 	pte = set_pte_bit(pte, cdata->set_mask);
 
+	/* TODO: Break before make for PROT_NS_SHARED updates */
 	set_pte(ptep, pte);
 	return 0;
 }
@@ -190,6 +193,43 @@ int set_direct_map_default_noflush(struct page *page)
 				   PAGE_SIZE, change_page_range, &data);
 }
 
+static int __set_memory_encrypted(unsigned long addr,
+				  int numpages,
+				  bool encrypt)
+{
+	unsigned long set_prot = 0, clear_prot = 0;
+	phys_addr_t start, end;
+
+	if (!is_realm_world())
+		return 0;
+
+	WARN_ON(!__is_lm_address(addr));
+	start = __virt_to_phys(addr);
+	end = start + numpages * PAGE_SIZE;
+
+	if (encrypt) {
+		clear_prot = PROT_NS_SHARED;
+		set_memory_range_protected(start, end);
+	} else {
+		set_prot = PROT_NS_SHARED;
+		set_memory_range_shared(start, end);
+	}
+
+	return __change_memory_common(addr, PAGE_SIZE * numpages,
+				      __pgprot(set_prot),
+				      __pgprot(clear_prot));
+}
+
+int set_memory_encrypted(unsigned long addr, int numpages)
+{
+	return __set_memory_encrypted(addr, numpages, true);
+}
+
+int set_memory_decrypted(unsigned long addr, int numpages)
+{
+	return __set_memory_encrypted(addr, numpages, false);
+}
+
 #ifdef CONFIG_DEBUG_PAGEALLOC
 void __kernel_map_pages(struct page *page, int numpages, int enable)
 {
-- 
2.34.1



* [RFC PATCH 11/14] arm64: Force device mappings to be non-secure shared
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
                     ` (9 preceding siblings ...)
  2023-01-27 11:27   ` [RFC PATCH 10/14] arm64: Enable memory encrypt for Realms Steven Price
@ 2023-01-27 11:27   ` Steven Price
  2023-01-27 11:27   ` [RFC PATCH 12/14] efi: arm64: Map Device with Prot Shared Steven Price
                     ` (2 subsequent siblings)
  13 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

From: Suzuki K Poulose <suzuki.poulose@arm.com>

Device mappings (currently) need to be emulated by the VMM so must be
mapped shared with the host.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/pgtable.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index a1319a743b38..f283ac3fb905 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -567,7 +567,7 @@ static inline void set_pud_at(struct mm_struct *mm, unsigned long addr,
 #define pgprot_writecombine(prot) \
 	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC) | PTE_PXN | PTE_UXN)
 #define pgprot_device(prot) \
-	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN)
+	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRE) | PTE_PXN | PTE_UXN | PROT_NS_SHARED)
 #define pgprot_tagged(prot) \
 	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_TAGGED))
 #define pgprot_mhp	pgprot_tagged
-- 
2.34.1



* [RFC PATCH 12/14] efi: arm64: Map Device with Prot Shared
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
                     ` (10 preceding siblings ...)
  2023-01-27 11:27   ` [RFC PATCH 11/14] arm64: Force device mappings to be non-secure shared Steven Price
@ 2023-01-27 11:27   ` Steven Price
  2023-01-27 11:27   ` [RFC PATCH 13/14] arm64: realm: Support nonsecure ITS emulation shared Steven Price
  2023-01-27 11:27   ` [RFC PATCH 14/14] HACK: Accept prototype RSI version Steven Price
  13 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

From: Suzuki K Poulose <suzuki.poulose@arm.com>

Device mappings need to be emulated by the VMM and so must be mapped
shared with the host.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kernel/efi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index fab05de2e12d..03a876707fc5 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -31,7 +31,7 @@ static __init pteval_t create_mapping_protection(efi_memory_desc_t *md)
 	u32 type = md->type;
 
 	if (type == EFI_MEMORY_MAPPED_IO)
-		return PROT_DEVICE_nGnRE;
+		return PROT_NS_SHARED | PROT_DEVICE_nGnRE;
 
 	if (region_is_misaligned(md)) {
 		static bool __initdata code_is_misaligned;
-- 
2.34.1



* [RFC PATCH 13/14] arm64: realm: Support nonsecure ITS emulation shared
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
                     ` (11 preceding siblings ...)
  2023-01-27 11:27   ` [RFC PATCH 12/14] efi: arm64: Map Device with Prot Shared Steven Price
@ 2023-01-27 11:27   ` Steven Price
  2023-01-27 11:27   ` [RFC PATCH 14/14] HACK: Accept prototype RSI version Steven Price
  13 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

Within a realm guest the ITS is emulated by the host. This means the
allocations must have been made available to the host by a call to
set_memory_decrypted(). Introduce an allocation function which performs
this extra call.

Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 drivers/irqchip/irq-gic-v3-its.c | 95 ++++++++++++++++++++++++--------
 1 file changed, 71 insertions(+), 24 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 973ede0197e3..5f9829376f6c 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -18,6 +18,7 @@
 #include <linux/irqdomain.h>
 #include <linux/list.h>
 #include <linux/log2.h>
+#include <linux/mem_encrypt.h>
 #include <linux/memblock.h>
 #include <linux/mm.h>
 #include <linux/msi.h>
@@ -27,6 +28,7 @@
 #include <linux/of_pci.h>
 #include <linux/of_platform.h>
 #include <linux/percpu.h>
+#include <linux/set_memory.h>
 #include <linux/slab.h>
 #include <linux/syscore_ops.h>
 
@@ -165,6 +167,7 @@ struct its_device {
 	struct its_node		*its;
 	struct event_lpi_map	event_map;
 	void			*itt;
+	u32			itt_order;
 	u32			nr_ites;
 	u32			device_id;
 	bool			shared;
@@ -200,6 +203,33 @@ static DEFINE_IDA(its_vpeid_ida);
 #define gic_data_rdist_rd_base()	(gic_data_rdist()->rd_base)
 #define gic_data_rdist_vlpi_base()	(gic_data_rdist_rd_base() + SZ_128K)
 
+static struct page *its_alloc_shared_pages_node(int node, gfp_t gfp,
+						unsigned int order)
+{
+	struct page *page;
+
+	if (node == NUMA_NO_NODE)
+		page = alloc_pages(gfp, order);
+	else
+		page = alloc_pages_node(node, gfp, order);
+
+	if (page)
+		set_memory_decrypted((unsigned long)page_address(page),
+				     1 << order);
+	return page;
+}
+
+static struct page *its_alloc_shared_pages(gfp_t gfp, unsigned int order)
+{
+	return its_alloc_shared_pages_node(NUMA_NO_NODE, gfp, order);
+}
+
+static void its_free_shared_pages(void *addr, unsigned int order)
+{
+	set_memory_encrypted((unsigned long)addr, 1 << order);
+	free_pages((unsigned long)addr, order);
+}
+
 /*
  * Skip ITSs that have no vLPIs mapped, unless we're on GICv4.1, as we
  * always have vSGIs mapped.
@@ -2178,7 +2208,8 @@ static struct page *its_allocate_prop_table(gfp_t gfp_flags)
 {
 	struct page *prop_page;
 
-	prop_page = alloc_pages(gfp_flags, get_order(LPI_PROPBASE_SZ));
+	prop_page = its_alloc_shared_pages(gfp_flags,
+					   get_order(LPI_PROPBASE_SZ));
 	if (!prop_page)
 		return NULL;
 
@@ -2189,8 +2220,8 @@ static struct page *its_allocate_prop_table(gfp_t gfp_flags)
 
 static void its_free_prop_table(struct page *prop_page)
 {
-	free_pages((unsigned long)page_address(prop_page),
-		   get_order(LPI_PROPBASE_SZ));
+	its_free_shared_pages(page_address(prop_page),
+			      get_order(LPI_PROPBASE_SZ));
 }
 
 static bool gic_check_reserved_range(phys_addr_t addr, unsigned long size)
@@ -2312,10 +2343,10 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
 		order = get_order(GITS_BASER_PAGES_MAX * psz);
 	}
 
-	page = alloc_pages_node(its->numa_node, GFP_KERNEL | __GFP_ZERO, order);
+	page = its_alloc_shared_pages_node(its->numa_node,
+					   GFP_KERNEL | __GFP_ZERO, order);
 	if (!page)
 		return -ENOMEM;
-
 	base = (void *)page_address(page);
 	baser_phys = virt_to_phys(base);
 
@@ -2325,7 +2356,7 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
 		/* 52bit PA is supported only when PageSize=64K */
 		if (psz != SZ_64K) {
 			pr_err("ITS: no 52bit PA support when psz=%d\n", psz);
-			free_pages((unsigned long)base, order);
+			its_free_shared_pages(base, order);
 			return -ENXIO;
 		}
 
@@ -2379,7 +2410,7 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser,
 		pr_err("ITS@%pa: %s doesn't stick: %llx %llx\n",
 		       &its->phys_base, its_base_type_string[type],
 		       val, tmp);
-		free_pages((unsigned long)base, order);
+		its_free_shared_pages(base, order);
 		return -ENXIO;
 	}
 
@@ -2518,8 +2549,8 @@ static void its_free_tables(struct its_node *its)
 
 	for (i = 0; i < GITS_BASER_NR_REGS; i++) {
 		if (its->tables[i].base) {
-			free_pages((unsigned long)its->tables[i].base,
-				   its->tables[i].order);
+			its_free_shared_pages(its->tables[i].base,
+					      its->tables[i].order);
 			its->tables[i].base = NULL;
 		}
 	}
@@ -2778,7 +2809,8 @@ static bool allocate_vpe_l2_table(int cpu, u32 id)
 
 	/* Allocate memory for 2nd level table */
 	if (!table[idx]) {
-		page = alloc_pages(GFP_KERNEL | __GFP_ZERO, get_order(psz));
+		page = its_alloc_shared_pages(GFP_KERNEL | __GFP_ZERO,
+					      get_order(psz));
 		if (!page)
 			return false;
 
@@ -2897,7 +2929,8 @@ static int allocate_vpe_l1_table(void)
 
 	pr_debug("np = %d, npg = %lld, psz = %d, epp = %d, esz = %d\n",
 		 np, npg, psz, epp, esz);
-	page = alloc_pages(GFP_ATOMIC | __GFP_ZERO, get_order(np * PAGE_SIZE));
+	page = its_alloc_shared_pages(GFP_ATOMIC | __GFP_ZERO,
+				      get_order(np * PAGE_SIZE));
 	if (!page)
 		return -ENOMEM;
 
@@ -2941,8 +2974,8 @@ static struct page *its_allocate_pending_table(gfp_t gfp_flags)
 {
 	struct page *pend_page;
 
-	pend_page = alloc_pages(gfp_flags | __GFP_ZERO,
-				get_order(LPI_PENDBASE_SZ));
+	pend_page = its_alloc_shared_pages(gfp_flags | __GFP_ZERO,
+					   get_order(LPI_PENDBASE_SZ));
 	if (!pend_page)
 		return NULL;
 
@@ -2954,7 +2987,8 @@ static struct page *its_allocate_pending_table(gfp_t gfp_flags)
 
 static void its_free_pending_table(struct page *pt)
 {
-	free_pages((unsigned long)page_address(pt), get_order(LPI_PENDBASE_SZ));
+	its_free_shared_pages(page_address(pt),
+			      get_order(LPI_PENDBASE_SZ));
 }
 
 /*
@@ -3283,8 +3317,9 @@ static bool its_alloc_table_entry(struct its_node *its,
 
 	/* Allocate memory for 2nd level table */
 	if (!table[idx]) {
-		page = alloc_pages_node(its->numa_node, GFP_KERNEL | __GFP_ZERO,
-					get_order(baser->psz));
+		page = its_alloc_shared_pages_node(its->numa_node,
+						   GFP_KERNEL | __GFP_ZERO,
+						   get_order(baser->psz));
 		if (!page)
 			return false;
 
@@ -3367,7 +3402,9 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 	unsigned long *lpi_map = NULL;
 	unsigned long flags;
 	u16 *col_map = NULL;
+	struct page *page;
 	void *itt;
+	int itt_order;
 	int lpi_base;
 	int nr_lpis;
 	int nr_ites;
@@ -3379,7 +3416,6 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 	if (WARN_ON(!is_power_of_2(nvecs)))
 		nvecs = roundup_pow_of_two(nvecs);
 
-	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
 	/*
 	 * Even if the device wants a single LPI, the ITT must be
 	 * sized as a power of two (and you need at least one bit...).
@@ -3387,7 +3423,16 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 	nr_ites = max(2, nvecs);
 	sz = nr_ites * (FIELD_GET(GITS_TYPER_ITT_ENTRY_SIZE, its->typer) + 1);
 	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
-	itt = kzalloc_node(sz, GFP_KERNEL, its->numa_node);
+	itt_order = get_order(sz);
+	page = its_alloc_shared_pages_node(its->numa_node,
+					   GFP_KERNEL | __GFP_ZERO,
+					   itt_order);
+	if (!page)
+		return NULL;
+	itt = (void *)page_address(page);
+
+	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+
 	if (alloc_lpis) {
 		lpi_map = its_lpi_alloc(nvecs, &lpi_base, &nr_lpis);
 		if (lpi_map)
@@ -3399,9 +3444,9 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 		lpi_base = 0;
 	}
 
-	if (!dev || !itt ||  !col_map || (!lpi_map && alloc_lpis)) {
+	if (!dev || !col_map || (!lpi_map && alloc_lpis)) {
 		kfree(dev);
-		kfree(itt);
+		its_free_shared_pages(itt, itt_order);
 		bitmap_free(lpi_map);
 		kfree(col_map);
 		return NULL;
@@ -3411,6 +3456,7 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 
 	dev->its = its;
 	dev->itt = itt;
+	dev->itt_order = itt_order;
 	dev->nr_ites = nr_ites;
 	dev->event_map.lpi_map = lpi_map;
 	dev->event_map.col_map = col_map;
@@ -3438,7 +3484,7 @@ static void its_free_device(struct its_device *its_dev)
 	list_del(&its_dev->entry);
 	raw_spin_unlock_irqrestore(&its_dev->its->lock, flags);
 	kfree(its_dev->event_map.col_map);
-	kfree(its_dev->itt);
+	its_free_shared_pages(its_dev->itt, its_dev->itt_order);
 	kfree(its_dev);
 }
 
@@ -5064,8 +5110,9 @@ static int __init its_probe_one(struct resource *res,
 
 	its->numa_node = numa_node;
 
-	page = alloc_pages_node(its->numa_node, GFP_KERNEL | __GFP_ZERO,
-				get_order(ITS_CMD_QUEUE_SZ));
+	page = its_alloc_shared_pages_node(its->numa_node,
+					   GFP_KERNEL | __GFP_ZERO,
+					   get_order(ITS_CMD_QUEUE_SZ));
 	if (!page) {
 		err = -ENOMEM;
 		goto out_unmap_sgir;
@@ -5131,7 +5178,7 @@ static int __init its_probe_one(struct resource *res,
 out_free_tables:
 	its_free_tables(its);
 out_free_cmd:
-	free_pages((unsigned long)its->cmd_base, get_order(ITS_CMD_QUEUE_SZ));
+	its_free_shared_pages(its->cmd_base, get_order(ITS_CMD_QUEUE_SZ));
 out_unmap_sgir:
 	if (its->sgir_base)
 		iounmap(its->sgir_base);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC PATCH 14/14] HACK: Accept prototype RSI version
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
                     ` (12 preceding siblings ...)
  2023-01-27 11:27   ` [RFC PATCH 13/14] arm64: realm: Support nonsecure ITS emulation shared Steven Price
@ 2023-01-27 11:27   ` Steven Price
  13 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:27 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel
  Cc: Steven Price, Catalin Marinas, Ard Biesheuvel, Marc Zyngier,
	Will Deacon, Suzuki K Poulose, James Morse, Oliver Upton,
	Zenghui Yu, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

The upstream RMM currently advertises the major version of an internal
prototype (v12.0) rather than the expected version from the RMM
architecture specification (v1.0).

Add a config option to enable support for the prototype RSI v12.0.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/rsi_smc.h | 2 ++
 arch/arm64/kernel/rsi.c          | 5 +++++
 arch/arm64/kvm/Kconfig           | 8 ++++++++
 3 files changed, 15 insertions(+)

diff --git a/arch/arm64/include/asm/rsi_smc.h b/arch/arm64/include/asm/rsi_smc.h
index bc0cdd83f164..baf07f905353 100644
--- a/arch/arm64/include/asm/rsi_smc.h
+++ b/arch/arm64/include/asm/rsi_smc.h
@@ -29,6 +29,8 @@
 #define RSI_ABI_VERSION			((RSI_ABI_VERSION_MAJOR << 16) | \
 					 RSI_ABI_VERSION_MINOR)
 
+#define RSI_LEGACY_ABI_VERSION		0xc0000
+
 #define RSI_ABI_VERSION_GET_MAJOR(_version) ((_version) >> 16)
 #define RSI_ABI_VERSION_GET_MINOR(_version) ((_version) & 0xFFFF)
 
diff --git a/arch/arm64/kernel/rsi.c b/arch/arm64/kernel/rsi.c
index 1cc292826f2b..45b26f23e706 100644
--- a/arch/arm64/kernel/rsi.c
+++ b/arch/arm64/kernel/rsi.c
@@ -41,6 +41,11 @@ static bool rsi_version_matches(void)
 		RSI_ABI_VERSION_GET_MAJOR(ver),
 		RSI_ABI_VERSION_GET_MINOR(ver));
 
+#ifdef CONFIG_RME_USE_PROTOTYPE_HACKS
+	if (ver == RSI_LEGACY_ABI_VERSION)
+		return true;
+#endif
+
 	return (ver >= RSI_ABI_VERSION &&
 		RSI_ABI_VERSION_GET_MAJOR(ver) == RSI_ABI_VERSION_MAJOR);
 }
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 05da3c8f7e88..13858a5047fd 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -58,6 +58,14 @@ config NVHE_EL2_DEBUG
 
 	  If unsure, say N.
 
+config RME_USE_PROTOTYPE_HACKS
+	bool "Allow RMM prototype version numbers"
+	default y
+	help
+	  For compatibility with the current RMM code, allow version
+	  numbers from a prototype implementation as well as the expected
+	  version number from the RMM specification.
+
 config PROTECTED_NVHE_STACKTRACE
 	bool "Protected KVM hypervisor stacktraces"
 	depends on NVHE_EL2_DEBUG
-- 
2.34.1



* [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM
  2023-01-27 11:22 [RFC] Support for Arm CCA VMs on Linux Suzuki K Poulose
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
@ 2023-01-27 11:29 ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 01/28] arm64: RME: Handle Granule Protection Faults (GPFs) Steven Price
                     ` (27 more replies)
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                   ` (6 subsequent siblings)
  8 siblings, 28 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

This series is an RFC adding support for running protected VMs using KVM
under the new Arm Confidential Compute Architecture (CCA). The purpose
of this series is to gather feedback on the proposed changes to the
architecture code for CCA.

The user ABI is not in its final form; we plan to make use of the
memfd_restricted() allocator[1] and associated infrastructure, which
will avoid a problem in the current user ABI where a malicious VMM may
be able to cause a Granule Protection Fault in the kernel (which is
fatal).

The ABI to the RMM (the RMI) is based on the Beta 0 specification[2] and
will be updated in the future when a final version of the specification
is published.

This series is based on v6.2-rc1. It is also available as a git
repository:

https://gitlab.arm.com/linux-arm/linux-cca cca-host/rfc-v1

Introduction
============
A more general introduction to Arm CCA is available on the Arm
website[3], and links to the other components involved are available in
the overall cover letter[4].

Arm Confidential Compute Architecture adds two new 'worlds' to the
architecture: Root and Realm. A new software component known as the RMM
(Realm Management Monitor) runs in Realm EL2 and is trusted by both the
Normal World and VMs running within Realms. This enables mutual
distrust between the Realm VMs and the Normal World.

Virtual machines running within a Realm can decide, at a (4k) page
granularity, whether to share a page with the (Normal World) host or to
keep it private (protected). This protection is provided by the
hardware, and an attempt to access a page which isn't shared with the
Normal World will trigger a Granule Protection Fault. The series starts
by adding handling for these: faults within user space are handled by
killing the process, while faults within kernel space are considered
fatal.

The Normal World host can communicate with the RMM via an SMC interface
known as RMI (Realm Management Interface), and Realm VMs can communicate
with the RMM via another SMC interface known as RSI (Realm Services
Interface). This series adds wrappers for the full set of RMI commands
and uses them to manage the realm guests.

The Normal World can use RMI commands to delegate pages to the Realm
world and to create, manage and run Realm VMs. Once delegated the pages
are inaccessible to the Normal World (unless explicitly shared by the
guest). However the Normal World may destroy the Realm VM at any time to
be able to reclaim (undelegate) the pages.

Entry/exit of a Realm VM attempts to reuse the KVM infrastructure, but
ultimately the final mechanism is different. So this series has a bunch
of commits handling the differences. As much as possible is placed in
two new files: rme.c and rme-exit.c.

The RMM specification provides a new mechanism for a guest to
communicate with the host, which goes by the name "Host Call". For now
this is simply hooked up to the existing support for HVC calls from a
normal guest.

[1] https://lore.kernel.org/r/20221202061347.1070246-1-chao.p.peng%40linux.intel.com
[2] https://developer.arm.com/documentation/den0137/1-0bet0/
[3] https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture
[4] .. cover letter ..

Joey Gouly (2):
  arm64: rme: allow userspace to inject aborts
  arm64: rme: support RSI_HOST_CALL

Steven Price (25):
  arm64: RME: Handle Granule Protection Faults (GPFs)
  arm64: RME: Add SMC definitions for calling the RMM
  arm64: RME: Add wrappers for RMI calls
  arm64: RME: Check for RME support at KVM init
  arm64: RME: Define the user ABI
  arm64: RME: ioctls to create and configure realms
  arm64: kvm: Allow passing machine type in KVM creation
  arm64: RME: Keep a spare page delegated to the RMM
  arm64: RME: RTT handling
  arm64: RME: Allocate/free RECs to match vCPUs
  arm64: RME: Support for the VGIC in realms
  KVM: arm64: Support timers in realm RECs
  arm64: RME: Allow VMM to set RIPAS
  arm64: RME: Handle realm enter/exit
  KVM: arm64: Handle realm MMIO emulation
  arm64: RME: Allow populating initial contents
  arm64: RME: Runtime faulting of memory
  KVM: arm64: Handle realm VCPU load
  KVM: arm64: Validate register access for a Realm VM
  KVM: arm64: Handle Realm PSCI requests
  KVM: arm64: WARN on injected undef exceptions
  arm64: Don't expose stolen time for realm guests
  KVM: arm64: Allow activating realms
  arm64: RME: Always use 4k pages for realms
  HACK: Accept prototype RMI versions

Suzuki K Poulose (1):
  arm64: rme: Allow checking SVE on VM instance

 Documentation/virt/kvm/api.rst       |    3 +
 arch/arm64/include/asm/kvm_emulate.h |   29 +
 arch/arm64/include/asm/kvm_host.h    |    7 +
 arch/arm64/include/asm/kvm_rme.h     |   98 ++
 arch/arm64/include/asm/rmi_cmds.h    |  259 +++++
 arch/arm64/include/asm/rmi_smc.h     |  242 +++++
 arch/arm64/include/asm/virt.h        |    1 +
 arch/arm64/include/uapi/asm/kvm.h    |   63 ++
 arch/arm64/kvm/Kconfig               |    8 +
 arch/arm64/kvm/Makefile              |    3 +-
 arch/arm64/kvm/arch_timer.c          |   53 +-
 arch/arm64/kvm/arm.c                 |  105 +-
 arch/arm64/kvm/guest.c               |   50 +
 arch/arm64/kvm/inject_fault.c        |    2 +
 arch/arm64/kvm/mmio.c                |    7 +
 arch/arm64/kvm/mmu.c                 |   80 +-
 arch/arm64/kvm/psci.c                |   23 +
 arch/arm64/kvm/reset.c               |   41 +
 arch/arm64/kvm/rme-exit.c            |  194 ++++
 arch/arm64/kvm/rme.c                 | 1453 ++++++++++++++++++++++++++
 arch/arm64/kvm/vgic/vgic-v3.c        |    9 +-
 arch/arm64/kvm/vgic/vgic.c           |   37 +-
 arch/arm64/mm/fault.c                |   29 +-
 include/kvm/arm_arch_timer.h         |    2 +
 include/uapi/linux/kvm.h             |   21 +-
 25 files changed, 2772 insertions(+), 47 deletions(-)
 create mode 100644 arch/arm64/include/asm/kvm_rme.h
 create mode 100644 arch/arm64/include/asm/rmi_cmds.h
 create mode 100644 arch/arm64/include/asm/rmi_smc.h
 create mode 100644 arch/arm64/kvm/rme-exit.c
 create mode 100644 arch/arm64/kvm/rme.c

-- 
2.34.1



* [RFC PATCH 01/28] arm64: RME: Handle Granule Protection Faults (GPFs)
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 02/28] arm64: RME: Add SMC definitions for calling the RMM Steven Price
                     ` (26 subsequent siblings)
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

If the host attempts to access granules that have been delegated for
use in a realm, these accesses will be caught and will trigger a
Granule Protection Fault (GPF).

A fault during a page walk signals a bug in the kernel and is handled
by oopsing the kernel. A non-page-walk fault could be caused by user
space having access to a page which has been delegated, and triggers a
SIGBUS to allow debugging why user space is trying to access a
delegated page.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/mm/fault.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 596f46dabe4e..fd84be115657 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -756,6 +756,25 @@ static int do_tag_check_fault(unsigned long far, unsigned long esr,
 	return 0;
 }
 
+static int do_gpf_ptw(unsigned long far, unsigned long esr, struct pt_regs *regs)
+{
+	const struct fault_info *inf = esr_to_fault_info(esr);
+
+	die_kernel_fault(inf->name, far, esr, regs);
+	return 0;
+}
+
+static int do_gpf(unsigned long far, unsigned long esr, struct pt_regs *regs)
+{
+	const struct fault_info *inf = esr_to_fault_info(esr);
+
+	if (!is_el1_instruction_abort(esr) && fixup_exception(regs))
+		return 0;
+
+	arm64_notify_die(inf->name, regs, inf->sig, inf->code, far, esr);
+	return 0;
+}
+
 static const struct fault_info fault_info[] = {
 	{ do_bad,		SIGKILL, SI_KERNEL,	"ttbr address size fault"	},
 	{ do_bad,		SIGKILL, SI_KERNEL,	"level 1 address size fault"	},
@@ -793,11 +812,11 @@ static const struct fault_info fault_info[] = {
 	{ do_alignment_fault,	SIGBUS,  BUS_ADRALN,	"alignment fault"		},
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 34"			},
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 35"			},
-	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 36"			},
-	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 37"			},
-	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 38"			},
-	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 39"			},
-	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 40"			},
+	{ do_gpf_ptw,		SIGKILL, SI_KERNEL,	"Granule Protection Fault at level 0" },
+	{ do_gpf_ptw,		SIGKILL, SI_KERNEL,	"Granule Protection Fault at level 1" },
+	{ do_gpf_ptw,		SIGKILL, SI_KERNEL,	"Granule Protection Fault at level 2" },
+	{ do_gpf_ptw,		SIGKILL, SI_KERNEL,	"Granule Protection Fault at level 3" },
+	{ do_gpf,		SIGBUS,  SI_KERNEL,	"Granule Protection Fault not on table walk" },
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 41"			},
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 42"			},
 	{ do_bad,		SIGKILL, SI_KERNEL,	"unknown 43"			},
-- 
2.34.1



* [RFC PATCH 02/28] arm64: RME: Add SMC definitions for calling the RMM
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
  2023-01-27 11:29   ` [RFC PATCH 01/28] arm64: RME: Handle Granule Protection Faults (GPFs) Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 03/28] arm64: RME: Add wrappers for RMI calls Steven Price
                     ` (25 subsequent siblings)
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

The RMM (Realm Management Monitor) provides functionality that can be
accessed by SMC calls from the host.

The SMC definitions are based on DEN0137[1] version A-bet0.

[1] https://developer.arm.com/documentation/den0137/latest

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/rmi_smc.h | 235 +++++++++++++++++++++++++++++++
 1 file changed, 235 insertions(+)
 create mode 100644 arch/arm64/include/asm/rmi_smc.h

diff --git a/arch/arm64/include/asm/rmi_smc.h b/arch/arm64/include/asm/rmi_smc.h
new file mode 100644
index 000000000000..16ff65090f3a
--- /dev/null
+++ b/arch/arm64/include/asm/rmi_smc.h
@@ -0,0 +1,235 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2023 ARM Ltd.
+ */
+
+#ifndef __ASM_RME_SMC_H
+#define __ASM_RME_SMC_H
+
+#include <linux/arm-smccc.h>
+
+#define SMC_RxI_CALL(func)				\
+	ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL,		\
+			   ARM_SMCCC_SMC_64,		\
+			   ARM_SMCCC_OWNER_STANDARD,	\
+			   (func))
+
+/* FID numbers from alp10 specification */
+
+#define SMC_RMI_DATA_CREATE		SMC_RxI_CALL(0x0153)
+#define SMC_RMI_DATA_CREATE_UNKNOWN	SMC_RxI_CALL(0x0154)
+#define SMC_RMI_DATA_DESTROY		SMC_RxI_CALL(0x0155)
+#define SMC_RMI_FEATURES		SMC_RxI_CALL(0x0165)
+#define SMC_RMI_GRANULE_DELEGATE	SMC_RxI_CALL(0x0151)
+#define SMC_RMI_GRANULE_UNDELEGATE	SMC_RxI_CALL(0x0152)
+#define SMC_RMI_PSCI_COMPLETE		SMC_RxI_CALL(0x0164)
+#define SMC_RMI_REALM_ACTIVATE		SMC_RxI_CALL(0x0157)
+#define SMC_RMI_REALM_CREATE		SMC_RxI_CALL(0x0158)
+#define SMC_RMI_REALM_DESTROY		SMC_RxI_CALL(0x0159)
+#define SMC_RMI_REC_AUX_COUNT		SMC_RxI_CALL(0x0167)
+#define SMC_RMI_REC_CREATE		SMC_RxI_CALL(0x015a)
+#define SMC_RMI_REC_DESTROY		SMC_RxI_CALL(0x015b)
+#define SMC_RMI_REC_ENTER		SMC_RxI_CALL(0x015c)
+#define SMC_RMI_RTT_CREATE		SMC_RxI_CALL(0x015d)
+#define SMC_RMI_RTT_DESTROY		SMC_RxI_CALL(0x015e)
+#define SMC_RMI_RTT_FOLD		SMC_RxI_CALL(0x0166)
+#define SMC_RMI_RTT_INIT_RIPAS		SMC_RxI_CALL(0x0168)
+#define SMC_RMI_RTT_MAP_UNPROTECTED	SMC_RxI_CALL(0x015f)
+#define SMC_RMI_RTT_READ_ENTRY		SMC_RxI_CALL(0x0161)
+#define SMC_RMI_RTT_SET_RIPAS		SMC_RxI_CALL(0x0169)
+#define SMC_RMI_RTT_UNMAP_UNPROTECTED	SMC_RxI_CALL(0x0162)
+#define SMC_RMI_VERSION			SMC_RxI_CALL(0x0150)
+
+#define RMI_ABI_MAJOR_VERSION	1
+#define RMI_ABI_MINOR_VERSION	0
+
+#define RMI_UNASSIGNED			0
+#define RMI_DESTROYED			1
+#define RMI_ASSIGNED			2
+#define RMI_TABLE			3
+#define RMI_VALID_NS			4
+
+#define RMI_ABI_VERSION_GET_MAJOR(version) ((version) >> 16)
+#define RMI_ABI_VERSION_GET_MINOR(version) ((version) & 0xFFFF)
+
+#define RMI_RETURN_STATUS(ret)		((ret) & 0xFF)
+#define RMI_RETURN_INDEX(ret)		(((ret) >> 8) & 0xFF)
+
+#define RMI_SUCCESS		0
+#define RMI_ERROR_INPUT		1
+#define RMI_ERROR_REALM		2
+#define RMI_ERROR_REC		3
+#define RMI_ERROR_RTT		4
+#define RMI_ERROR_IN_USE	5
+
+#define RMI_EMPTY		0
+#define RMI_RAM			1
+
+#define RMI_NO_MEASURE_CONTENT	0
+#define RMI_MEASURE_CONTENT	1
+
+#define RMI_FEATURE_REGISTER_0_S2SZ		GENMASK(7, 0)
+#define RMI_FEATURE_REGISTER_0_LPA2		BIT(8)
+#define RMI_FEATURE_REGISTER_0_SVE_EN		BIT(9)
+#define RMI_FEATURE_REGISTER_0_SVE_VL		GENMASK(13, 10)
+#define RMI_FEATURE_REGISTER_0_NUM_BPS		GENMASK(17, 14)
+#define RMI_FEATURE_REGISTER_0_NUM_WPS		GENMASK(21, 18)
+#define RMI_FEATURE_REGISTER_0_PMU_EN		BIT(22)
+#define RMI_FEATURE_REGISTER_0_PMU_NUM_CTRS	GENMASK(27, 23)
+#define RMI_FEATURE_REGISTER_0_HASH_SHA_256	BIT(28)
+#define RMI_FEATURE_REGISTER_0_HASH_SHA_512	BIT(29)
+
+struct realm_params {
+	union {
+		u64 features_0;
+		u8 padding_1[0x100];
+	};
+	union {
+		u8 measurement_algo;
+		u8 padding_2[0x300];
+	};
+	union {
+		u8 rpv[64];
+		u8 padding_3[0x400];
+	};
+	union {
+		struct {
+			u16 vmid;
+			u8 padding_4[6];
+			u64 rtt_base;
+			u64 rtt_level_start;
+			u32 rtt_num_start;
+		};
+		u8 padding_5[0x800];
+	};
+};
+
+/*
+ * The number of GPRs (starting from X0) that are
+ * configured by the host when a REC is created.
+ */
+#define REC_CREATE_NR_GPRS		8
+
+#define REC_PARAMS_FLAG_RUNNABLE	BIT_ULL(0)
+
+#define REC_PARAMS_AUX_GRANULES		16
+
+struct rec_params {
+	union {
+		u64 flags;
+		u8 padding1[0x100];
+	};
+	union {
+		u64 mpidr;
+		u8 padding2[0x100];
+	};
+	union {
+		u64 pc;
+		u8 padding3[0x100];
+	};
+	union {
+		u64 gprs[REC_CREATE_NR_GPRS];
+		u8 padding4[0x500];
+	};
+	u64 num_rec_aux;
+	u64 aux[REC_PARAMS_AUX_GRANULES];
+};
+
+#define RMI_EMULATED_MMIO		BIT(0)
+#define RMI_INJECT_SEA			BIT(1)
+#define RMI_TRAP_WFI			BIT(2)
+#define RMI_TRAP_WFE			BIT(3)
+
+#define REC_RUN_GPRS			31
+#define REC_GIC_NUM_LRS			16
+
+struct rec_entry {
+	union { /* 0x000 */
+		u64 flags;
+		u8 padding0[0x200];
+	};
+	union { /* 0x200 */
+		u64 gprs[REC_RUN_GPRS];
+		u8 padding2[0x100];
+	};
+	union { /* 0x300 */
+		struct {
+			u64 gicv3_hcr;
+			u64 gicv3_lrs[REC_GIC_NUM_LRS];
+		};
+		u8 padding3[0x100];
+	};
+	u8 padding4[0x400];
+};
+
+struct rec_exit {
+	union { /* 0x000 */
+		u8 exit_reason;
+		u8 padding0[0x100];
+	};
+	union { /* 0x100 */
+		struct {
+			u64 esr;
+			u64 far;
+			u64 hpfar;
+		};
+		u8 padding1[0x100];
+	};
+	union { /* 0x200 */
+		u64 gprs[REC_RUN_GPRS];
+		u8 padding2[0x100];
+	};
+	union { /* 0x300 */
+		struct {
+			u64 gicv3_hcr;
+			u64 gicv3_lrs[REC_GIC_NUM_LRS];
+			u64 gicv3_misr;
+			u64 gicv3_vmcr;
+		};
+		u8 padding3[0x100];
+	};
+	union { /* 0x400 */
+		struct {
+			u64 cntp_ctl;
+			u64 cntp_cval;
+			u64 cntv_ctl;
+			u64 cntv_cval;
+		};
+		u8 padding4[0x100];
+	};
+	union { /* 0x500 */
+		struct {
+			u64 ripas_base;
+			u64 ripas_size;
+			u64 ripas_value; /* Only lowest bit */
+		};
+		u8 padding5[0x100];
+	};
+	union { /* 0x600 */
+		u16 imm;
+		u8 padding6[0x100];
+	};
+	union { /* 0x700 */
+		struct {
+			u64 pmu_ovf;
+			u64 pmu_intr_en;
+			u64 pmu_cntr_en;
+		};
+		u8 padding7[0x100];
+	};
+};
+
+struct rec_run {
+	struct rec_entry entry;
+	struct rec_exit exit;
+};
+
+#define RMI_EXIT_SYNC			0x00
+#define RMI_EXIT_IRQ			0x01
+#define RMI_EXIT_FIQ			0x02
+#define RMI_EXIT_PSCI			0x03
+#define RMI_EXIT_RIPAS_CHANGE		0x04
+#define RMI_EXIT_HOST_CALL		0x05
+#define RMI_EXIT_SERROR			0x06
+
+#endif
-- 
2.34.1



* [RFC PATCH 03/28] arm64: RME: Add wrappers for RMI calls
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
  2023-01-27 11:29   ` [RFC PATCH 01/28] arm64: RME: Handle Granule Protection Faults (GPFs) Steven Price
  2023-01-27 11:29   ` [RFC PATCH 02/28] arm64: RME: Add SMC definitions for calling the RMM Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-02-13 16:43     ` Zhi Wang
  2024-03-18  7:03     ` Ganapatrao Kulkarni
  2023-01-27 11:29   ` [RFC PATCH 04/28] arm64: RME: Check for RME support at KVM init Steven Price
                     ` (24 subsequent siblings)
  27 siblings, 2 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

The wrappers make the call sites easier to read and deal with the
boilerplate of handling the error codes from the RMM.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/rmi_cmds.h | 259 ++++++++++++++++++++++++++++++
 1 file changed, 259 insertions(+)
 create mode 100644 arch/arm64/include/asm/rmi_cmds.h

diff --git a/arch/arm64/include/asm/rmi_cmds.h b/arch/arm64/include/asm/rmi_cmds.h
new file mode 100644
index 000000000000..d5468ee46f35
--- /dev/null
+++ b/arch/arm64/include/asm/rmi_cmds.h
@@ -0,0 +1,259 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2023 ARM Ltd.
+ */
+
+#ifndef __ASM_RMI_CMDS_H
+#define __ASM_RMI_CMDS_H
+
+#include <linux/arm-smccc.h>
+
+#include <asm/rmi_smc.h>
+
+struct rtt_entry {
+	unsigned long walk_level;
+	unsigned long desc;
+	int state;
+	bool ripas;
+};
+
+static inline int rmi_data_create(unsigned long data, unsigned long rd,
+				  unsigned long map_addr, unsigned long src,
+				  unsigned long flags)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_DATA_CREATE, data, rd, map_addr, src,
+			     flags, &res);
+
+	return res.a0;
+}
+
+static inline int rmi_data_create_unknown(unsigned long data,
+					  unsigned long rd,
+					  unsigned long map_addr)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_DATA_CREATE_UNKNOWN, data, rd, map_addr,
+			     &res);
+
+	return res.a0;
+}
+
+static inline int rmi_data_destroy(unsigned long rd, unsigned long map_addr)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_DATA_DESTROY, rd, map_addr, &res);
+
+	return res.a0;
+}
+
+static inline int rmi_features(unsigned long index, unsigned long *out)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_FEATURES, index, &res);
+
+	*out = res.a1;
+	return res.a0;
+}
+
+static inline int rmi_granule_delegate(unsigned long phys)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_GRANULE_DELEGATE, phys, &res);
+
+	return res.a0;
+}
+
+static inline int rmi_granule_undelegate(unsigned long phys)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_GRANULE_UNDELEGATE, phys, &res);
+
+	return res.a0;
+}
+
+static inline int rmi_psci_complete(unsigned long calling_rec,
+				    unsigned long target_rec)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_PSCI_COMPLETE, calling_rec, target_rec,
+			     &res);
+
+	return res.a0;
+}
+
+static inline int rmi_realm_activate(unsigned long rd)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_REALM_ACTIVATE, rd, &res);
+
+	return res.a0;
+}
+
+static inline int rmi_realm_create(unsigned long rd, unsigned long params_ptr)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_REALM_CREATE, rd, params_ptr, &res);
+
+	return res.a0;
+}
+
+static inline int rmi_realm_destroy(unsigned long rd)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_REALM_DESTROY, rd, &res);
+
+	return res.a0;
+}
+
+static inline int rmi_rec_aux_count(unsigned long rd, unsigned long *aux_count)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_REC_AUX_COUNT, rd, &res);
+
+	*aux_count = res.a1;
+	return res.a0;
+}
+
+static inline int rmi_rec_create(unsigned long rec, unsigned long rd,
+				 unsigned long params_ptr)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_REC_CREATE, rec, rd, params_ptr, &res);
+
+	return res.a0;
+}
+
+static inline int rmi_rec_destroy(unsigned long rec)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_REC_DESTROY, rec, &res);
+
+	return res.a0;
+}
+
+static inline int rmi_rec_enter(unsigned long rec, unsigned long run_ptr)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_REC_ENTER, rec, run_ptr, &res);
+
+	return res.a0;
+}
+
+static inline int rmi_rtt_create(unsigned long rtt, unsigned long rd,
+				 unsigned long map_addr, unsigned long level)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_RTT_CREATE, rtt, rd, map_addr, level,
+			     &res);
+
+	return res.a0;
+}
+
+static inline int rmi_rtt_destroy(unsigned long rtt, unsigned long rd,
+				  unsigned long map_addr, unsigned long level)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_RTT_DESTROY, rtt, rd, map_addr, level,
+			     &res);
+
+	return res.a0;
+}
+
+static inline int rmi_rtt_fold(unsigned long rtt, unsigned long rd,
+			       unsigned long map_addr, unsigned long level)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_RTT_FOLD, rtt, rd, map_addr, level, &res);
+
+	return res.a0;
+}
+
+static inline int rmi_rtt_init_ripas(unsigned long rd, unsigned long map_addr,
+				     unsigned long level)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_RTT_INIT_RIPAS, rd, map_addr, level, &res);
+
+	return res.a0;
+}
+
+static inline int rmi_rtt_map_unprotected(unsigned long rd,
+					  unsigned long map_addr,
+					  unsigned long level,
+					  unsigned long desc)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_RTT_MAP_UNPROTECTED, rd, map_addr, level,
+			     desc, &res);
+
+	return res.a0;
+}
+
+static inline int rmi_rtt_read_entry(unsigned long rd, unsigned long map_addr,
+				     unsigned long level, struct rtt_entry *rtt)
+{
+	struct arm_smccc_1_2_regs regs = {
+		SMC_RMI_RTT_READ_ENTRY,
+		rd, map_addr, level
+	};
+
+	arm_smccc_1_2_smc(&regs, &regs);
+
+	rtt->walk_level = regs.a1;
+	rtt->state = regs.a2 & 0xFF;
+	rtt->desc = regs.a3;
+	rtt->ripas = regs.a4 & 1;
+
+	return regs.a0;
+}
+
+static inline int rmi_rtt_set_ripas(unsigned long rd, unsigned long rec,
+				    unsigned long map_addr, unsigned long level,
+				    unsigned long ripas)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_RTT_SET_RIPAS, rd, rec, map_addr, level,
+			     ripas, &res);
+
+	return res.a0;
+}
+
+static inline int rmi_rtt_unmap_unprotected(unsigned long rd,
+					    unsigned long map_addr,
+					    unsigned long level)
+{
+	struct arm_smccc_res res;
+
+	arm_smccc_1_1_invoke(SMC_RMI_RTT_UNMAP_UNPROTECTED, rd, map_addr,
+			     level, &res);
+
+	return res.a0;
+}
+
+static inline phys_addr_t rmi_rtt_get_phys(struct rtt_entry *rtt)
+{
+	return rtt->desc & GENMASK(47, 12);
+}
+
+#endif
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC PATCH 04/28] arm64: RME: Check for RME support at KVM init
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (2 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 03/28] arm64: RME: Add wrappers for RMI calls Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-02-13 15:48     ` Zhi Wang
                       ` (2 more replies)
  2023-01-27 11:29   ` [RFC PATCH 05/28] arm64: RME: Define the user ABI Steven Price
                     ` (23 subsequent siblings)
  27 siblings, 3 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

Query the RMI version number and check if it is a compatible version. A
static key is also provided to signal that a supported RMM is available.

Functions are provided to query whether a VM or VCPU is a realm (or REC);
for now these always return false.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_emulate.h | 17 ++++++++++
 arch/arm64/include/asm/kvm_host.h    |  4 +++
 arch/arm64/include/asm/kvm_rme.h     | 22 +++++++++++++
 arch/arm64/include/asm/virt.h        |  1 +
 arch/arm64/kvm/Makefile              |  3 +-
 arch/arm64/kvm/arm.c                 |  8 +++++
 arch/arm64/kvm/rme.c                 | 49 ++++++++++++++++++++++++++++
 7 files changed, 103 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/kvm_rme.h
 create mode 100644 arch/arm64/kvm/rme.c

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 9bdba47f7e14..5a2b7229e83f 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -490,4 +490,21 @@ static inline bool vcpu_has_feature(struct kvm_vcpu *vcpu, int feature)
 	return test_bit(feature, vcpu->arch.features);
 }
 
+static inline bool kvm_is_realm(struct kvm *kvm)
+{
+	if (static_branch_unlikely(&kvm_rme_is_available))
+		return kvm->arch.is_realm;
+	return false;
+}
+
+static inline enum realm_state kvm_realm_state(struct kvm *kvm)
+{
+	return READ_ONCE(kvm->arch.realm.state);
+}
+
+static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
+
 #endif /* __ARM64_KVM_EMULATE_H__ */
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 35a159d131b5..04347c3a8c6b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -26,6 +26,7 @@
 #include <asm/fpsimd.h>
 #include <asm/kvm.h>
 #include <asm/kvm_asm.h>
+#include <asm/kvm_rme.h>
 
 #define __KVM_HAVE_ARCH_INTC_INITIALIZED
 
@@ -240,6 +241,9 @@ struct kvm_arch {
 	 * the associated pKVM instance in the hypervisor.
 	 */
 	struct kvm_protected_vm pkvm;
+
+	bool is_realm;
+	struct realm realm;
 };
 
 struct kvm_vcpu_fault_info {
diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
new file mode 100644
index 000000000000..c26bc2c6770d
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_rme.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2023 ARM Ltd.
+ */
+
+#ifndef __ASM_KVM_RME_H
+#define __ASM_KVM_RME_H
+
+enum realm_state {
+	REALM_STATE_NONE,
+	REALM_STATE_NEW,
+	REALM_STATE_ACTIVE,
+	REALM_STATE_DYING
+};
+
+struct realm {
+	enum realm_state state;
+};
+
+int kvm_init_rme(void);
+
+#endif
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 4eb601e7de50..be1383e26626 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -80,6 +80,7 @@ void __hyp_set_vectors(phys_addr_t phys_vector_base);
 void __hyp_reset_vectors(void);
 
 DECLARE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);
+DECLARE_STATIC_KEY_FALSE(kvm_rme_is_available);
 
 /* Reports the availability of HYP mode */
 static inline bool is_hyp_mode_available(void)
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 5e33c2d4645a..d2f0400c50da 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -20,7 +20,8 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
 	 vgic/vgic-v3.o vgic/vgic-v4.o \
 	 vgic/vgic-mmio.o vgic/vgic-mmio-v2.o \
 	 vgic/vgic-mmio-v3.o vgic/vgic-kvm-device.o \
-	 vgic/vgic-its.o vgic/vgic-debug.o
+	 vgic/vgic-its.o vgic/vgic-debug.o \
+	 rme.o
 
 kvm-$(CONFIG_HW_PERF_EVENTS)  += pmu-emul.o pmu.o
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 9c5573bc4614..d97b39d042ab 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -38,6 +38,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_mmu.h>
 #include <asm/kvm_pkvm.h>
+#include <asm/kvm_rme.h>
 #include <asm/kvm_emulate.h>
 #include <asm/sections.h>
 
@@ -47,6 +48,7 @@
 
 static enum kvm_mode kvm_mode = KVM_MODE_DEFAULT;
 DEFINE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);
+DEFINE_STATIC_KEY_FALSE(kvm_rme_is_available);
 
 DECLARE_KVM_HYP_PER_CPU(unsigned long, kvm_hyp_vector);
 
@@ -2213,6 +2215,12 @@ int kvm_arch_init(void *opaque)
 
 	in_hyp_mode = is_kernel_in_hyp_mode();
 
+	if (in_hyp_mode) {
+		err = kvm_init_rme();
+		if (err)
+			return err;
+	}
+
 	if (cpus_have_final_cap(ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE) ||
 	    cpus_have_final_cap(ARM64_WORKAROUND_1508412))
 		kvm_info("Guests without required CPU erratum workarounds can deadlock system!\n" \
diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
new file mode 100644
index 000000000000..f6b587bc116e
--- /dev/null
+++ b/arch/arm64/kvm/rme.c
@@ -0,0 +1,49 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2023 ARM Ltd.
+ */
+
+#include <linux/kvm_host.h>
+
+#include <asm/rmi_cmds.h>
+#include <asm/virt.h>
+
+static int rmi_check_version(void)
+{
+	struct arm_smccc_res res;
+	int version_major, version_minor;
+
+	arm_smccc_1_1_invoke(SMC_RMI_VERSION, &res);
+
+	if (res.a0 == SMCCC_RET_NOT_SUPPORTED)
+		return -ENXIO;
+
+	version_major = RMI_ABI_VERSION_GET_MAJOR(res.a0);
+	version_minor = RMI_ABI_VERSION_GET_MINOR(res.a0);
+
+	if (version_major != RMI_ABI_MAJOR_VERSION) {
+		kvm_err("Unsupported RMI ABI (version %d.%d) we support %d\n",
+			version_major, version_minor,
+			RMI_ABI_MAJOR_VERSION);
+		return -ENXIO;
+	}
+
+	kvm_info("RMI ABI version %d.%d\n", version_major, version_minor);
+
+	return 0;
+}
+
+int kvm_init_rme(void)
+{
+	if (PAGE_SIZE != SZ_4K)
+		/* Only 4k page size on the host is supported */
+		return 0;
+
+	if (rmi_check_version())
+		/* Continue without realm support */
+		return 0;
+
+	/* Future patch will enable static branch kvm_rme_is_available */
+
+	return 0;
+}
-- 
2.34.1



* [RFC PATCH 05/28] arm64: RME: Define the user ABI
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (3 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 04/28] arm64: RME: Check for RME support at KVM init Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-02-13 16:04     ` Zhi Wang
  2023-01-27 11:29   ` [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms Steven Price
                     ` (22 subsequent siblings)
  27 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

A single (multiplexed) CAP is provided, which can be used to configure,
create, populate and then activate the realm.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 Documentation/virt/kvm/api.rst    |  1 +
 arch/arm64/include/uapi/asm/kvm.h | 63 +++++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h          |  2 +
 3 files changed, 66 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 0dd5d8733dd5..f1a59d6fb7fc 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -4965,6 +4965,7 @@ Recognised values for feature:
 
   =====      ===========================================
   arm64      KVM_ARM_VCPU_SVE (requires KVM_CAP_ARM_SVE)
+  arm64      KVM_ARM_VCPU_REC (requires KVM_CAP_ARM_RME)
   =====      ===========================================
 
 Finalizes the configuration of the specified vcpu feature.
diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index a7a857f1784d..fcc0b8dce29b 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -109,6 +109,7 @@ struct kvm_regs {
 #define KVM_ARM_VCPU_SVE		4 /* enable SVE for this CPU */
 #define KVM_ARM_VCPU_PTRAUTH_ADDRESS	5 /* VCPU uses address authentication */
 #define KVM_ARM_VCPU_PTRAUTH_GENERIC	6 /* VCPU uses generic authentication */
+#define KVM_ARM_VCPU_REC		7 /* VCPU REC state as part of Realm */
 
 struct kvm_vcpu_init {
 	__u32 target;
@@ -401,6 +402,68 @@ enum {
 #define   KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES	3
 #define   KVM_DEV_ARM_ITS_CTRL_RESET		4
 
+/* KVM_CAP_ARM_RME on VM fd */
+#define KVM_CAP_ARM_RME_CONFIG_REALM		0
+#define KVM_CAP_ARM_RME_CREATE_RD		1
+#define KVM_CAP_ARM_RME_INIT_IPA_REALM		2
+#define KVM_CAP_ARM_RME_POPULATE_REALM		3
+#define KVM_CAP_ARM_RME_ACTIVATE_REALM		4
+
+#define KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256		0
+#define KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA512		1
+
+#define KVM_CAP_ARM_RME_RPV_SIZE 64
+
+/* List of configuration items accepted for KVM_CAP_ARM_RME_CONFIG_REALM */
+#define KVM_CAP_ARM_RME_CFG_RPV			0
+#define KVM_CAP_ARM_RME_CFG_HASH_ALGO		1
+#define KVM_CAP_ARM_RME_CFG_SVE			2
+#define KVM_CAP_ARM_RME_CFG_DBG			3
+#define KVM_CAP_ARM_RME_CFG_PMU			4
+
+struct kvm_cap_arm_rme_config_item {
+	__u32 cfg;
+	union {
+		/* cfg == KVM_CAP_ARM_RME_CFG_RPV */
+		struct {
+			__u8	rpv[KVM_CAP_ARM_RME_RPV_SIZE];
+		};
+
+		/* cfg == KVM_CAP_ARM_RME_CFG_HASH_ALGO */
+		struct {
+			__u32	hash_algo;
+		};
+
+		/* cfg == KVM_CAP_ARM_RME_CFG_SVE */
+		struct {
+			__u32	sve_vq;
+		};
+
+		/* cfg == KVM_CAP_ARM_RME_CFG_DBG */
+		struct {
+			__u32	num_brps;
+			__u32	num_wrps;
+		};
+
+		/* cfg == KVM_CAP_ARM_RME_CFG_PMU */
+		struct {
+			__u32	num_pmu_cntrs;
+		};
+		/* Fix the size of the union */
+		__u8	reserved[256];
+	};
+};
+
+struct kvm_cap_arm_rme_populate_realm_args {
+	__u64 populate_ipa_base;
+	__u64 populate_ipa_size;
+};
+
+struct kvm_cap_arm_rme_init_ipa_args {
+	__u64 init_ipa_base;
+	__u64 init_ipa_size;
+};
+
 /* Device Control API on vcpu fd */
 #define KVM_ARM_VCPU_PMU_V3_CTRL	0
 #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 20522d4ba1e0..fec1909e8b73 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1176,6 +1176,8 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
 #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
 
+#define KVM_CAP_ARM_RME 300 // FIXME: Large number to prevent conflicts
+
 #ifdef KVM_CAP_IRQ_ROUTING
 
 struct kvm_irq_routing_irqchip {
-- 
2.34.1



* [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (4 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 05/28] arm64: RME: Define the user ABI Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-02-07 12:25     ` Jean-Philippe Brucker
                       ` (3 more replies)
  2023-01-27 11:29   ` [RFC PATCH 07/28] arm64: kvm: Allow passing machine type in KVM creation Steven Price
                     ` (21 subsequent siblings)
  27 siblings, 4 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

Add the KVM_CAP_ARM_RME_CREATE_RD ioctl to create a realm. This involves
delegating pages to the RMM to hold the Realm Descriptor (RD) and the
base level of the Realm Translation Tables (RTT). A VMID also needs to
be picked; since the RMM has a separate VMID address space, a dedicated
allocator is added for this purpose.

KVM_CAP_ARM_RME_CONFIG_REALM is provided to allow configuring the realm
before it is created.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_rme.h |  14 ++
 arch/arm64/kvm/arm.c             |  19 ++
 arch/arm64/kvm/mmu.c             |   6 +
 arch/arm64/kvm/reset.c           |  33 +++
 arch/arm64/kvm/rme.c             | 357 +++++++++++++++++++++++++++++++
 5 files changed, 429 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
index c26bc2c6770d..055a22accc08 100644
--- a/arch/arm64/include/asm/kvm_rme.h
+++ b/arch/arm64/include/asm/kvm_rme.h
@@ -6,6 +6,8 @@
 #ifndef __ASM_KVM_RME_H
 #define __ASM_KVM_RME_H
 
+#include <uapi/linux/kvm.h>
+
 enum realm_state {
 	REALM_STATE_NONE,
 	REALM_STATE_NEW,
@@ -15,8 +17,20 @@ enum realm_state {
 
 struct realm {
 	enum realm_state state;
+
+	void *rd;
+	struct realm_params *params;
+
+	unsigned long num_aux;
+	unsigned int vmid;
+	unsigned int ia_bits;
 };
 
 int kvm_init_rme(void);
+u32 kvm_realm_ipa_limit(void);
+
+int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
+int kvm_init_realm_vm(struct kvm *kvm);
+void kvm_destroy_realm(struct kvm *kvm);
 
 #endif
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index d97b39d042ab..50f54a63732a 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -103,6 +103,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		r = 0;
 		set_bit(KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED, &kvm->arch.flags);
 		break;
+	case KVM_CAP_ARM_RME:
+		if (!static_branch_unlikely(&kvm_rme_is_available))
+			return -EINVAL;
+		mutex_lock(&kvm->lock);
+		r = kvm_realm_enable_cap(kvm, cap);
+		mutex_unlock(&kvm->lock);
+		break;
 	default:
 		r = -EINVAL;
 		break;
@@ -172,6 +179,13 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	 */
 	kvm->arch.dfr0_pmuver.imp = kvm_arm_pmu_get_pmuver_limit();
 
+	/* Initialise the realm bits after the generic bits are enabled */
+	if (kvm_is_realm(kvm)) {
+		ret = kvm_init_realm_vm(kvm);
+		if (ret)
+			goto err_free_cpumask;
+	}
+
 	return 0;
 
 err_free_cpumask:
@@ -204,6 +218,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 	kvm_destroy_vcpus(kvm);
 
 	kvm_unshare_hyp(kvm, kvm + 1);
+
+	kvm_destroy_realm(kvm);
 }
 
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
@@ -300,6 +316,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_ARM_PTRAUTH_GENERIC:
 		r = system_has_full_ptr_auth();
 		break;
+	case KVM_CAP_ARM_RME:
+		r = static_key_enabled(&kvm_rme_is_available);
+		break;
 	default:
 		r = 0;
 	}
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 31d7fa4c7c14..d0f707767d05 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -840,6 +840,12 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
 	struct kvm_pgtable *pgt = NULL;
 
 	write_lock(&kvm->mmu_lock);
+	if (kvm_is_realm(kvm) &&
+	    kvm_realm_state(kvm) != REALM_STATE_DYING) {
+		/* TODO: teardown rtts */
+		write_unlock(&kvm->mmu_lock);
+		return;
+	}
 	pgt = mmu->pgt;
 	if (pgt) {
 		mmu->pgd_phys = 0;
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index e0267f672b8a..c165df174737 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -395,3 +395,36 @@ int kvm_set_ipa_limit(void)
 
 	return 0;
 }
+
+int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
+{
+	u64 mmfr0, mmfr1;
+	u32 phys_shift;
+	u32 ipa_limit = kvm_ipa_limit;
+
+	if (kvm_is_realm(kvm))
+		ipa_limit = kvm_realm_ipa_limit();
+
+	if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
+		return -EINVAL;
+
+	phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type);
+	if (phys_shift) {
+		if (phys_shift > ipa_limit ||
+		    phys_shift < ARM64_MIN_PARANGE_BITS)
+			return -EINVAL;
+	} else {
+		phys_shift = KVM_PHYS_SHIFT;
+		if (phys_shift > ipa_limit) {
+			pr_warn_once("%s using unsupported default IPA limit, upgrade your VMM\n",
+				     current->comm);
+			return -EINVAL;
+		}
+	}
+
+	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+	mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
+	kvm->arch.vtcr = kvm_get_vtcr(mmfr0, mmfr1, phys_shift);
+
+	return 0;
+}
diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
index f6b587bc116e..9f8c5a91b8fc 100644
--- a/arch/arm64/kvm/rme.c
+++ b/arch/arm64/kvm/rme.c
@@ -5,9 +5,49 @@
 
 #include <linux/kvm_host.h>
 
+#include <asm/kvm_emulate.h>
+#include <asm/kvm_mmu.h>
 #include <asm/rmi_cmds.h>
 #include <asm/virt.h>
 
+/************ FIXME: Copied from kvm/hyp/pgtable.c **********/
+#include <asm/kvm_pgtable.h>
+
+struct kvm_pgtable_walk_data {
+	struct kvm_pgtable		*pgt;
+	struct kvm_pgtable_walker	*walker;
+
+	u64				addr;
+	u64				end;
+};
+
+static u32 __kvm_pgd_page_idx(struct kvm_pgtable *pgt, u64 addr)
+{
+	u64 shift = kvm_granule_shift(pgt->start_level - 1); /* May underflow */
+	u64 mask = BIT(pgt->ia_bits) - 1;
+
+	return (addr & mask) >> shift;
+}
+
+static u32 kvm_pgd_pages(u32 ia_bits, u32 start_level)
+{
+	struct kvm_pgtable pgt = {
+		.ia_bits	= ia_bits,
+		.start_level	= start_level,
+	};
+
+	return __kvm_pgd_page_idx(&pgt, -1ULL) + 1;
+}
+
+/******************/
+
+static unsigned long rmm_feat_reg0;
+
+static bool rme_supports(unsigned long feature)
+{
+	return !!u64_get_bits(rmm_feat_reg0, feature);
+}
+
 static int rmi_check_version(void)
 {
 	struct arm_smccc_res res;
@@ -33,8 +73,319 @@ static int rmi_check_version(void)
 	return 0;
 }
 
+static unsigned long create_realm_feat_reg0(struct kvm *kvm)
+{
+	unsigned long ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
+	u64 feat_reg0 = 0;
+
+	int num_bps = u64_get_bits(rmm_feat_reg0,
+				   RMI_FEATURE_REGISTER_0_NUM_BPS);
+	int num_wps = u64_get_bits(rmm_feat_reg0,
+				   RMI_FEATURE_REGISTER_0_NUM_WPS);
+
+	feat_reg0 |= u64_encode_bits(ia_bits, RMI_FEATURE_REGISTER_0_S2SZ);
+	feat_reg0 |= u64_encode_bits(num_bps, RMI_FEATURE_REGISTER_0_NUM_BPS);
+	feat_reg0 |= u64_encode_bits(num_wps, RMI_FEATURE_REGISTER_0_NUM_WPS);
+
+	return feat_reg0;
+}
+
+u32 kvm_realm_ipa_limit(void)
+{
+	return u64_get_bits(rmm_feat_reg0, RMI_FEATURE_REGISTER_0_S2SZ);
+}
+
+static u32 get_start_level(struct kvm *kvm)
+{
+	long sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, kvm->arch.vtcr);
+
+	return VTCR_EL2_TGRAN_SL0_BASE - sl0;
+}
+
+static int realm_create_rd(struct kvm *kvm)
+{
+	struct realm *realm = &kvm->arch.realm;
+	struct realm_params *params = realm->params;
+	void *rd = NULL;
+	phys_addr_t rd_phys, params_phys;
+	struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
+	unsigned int pgd_sz;
+	int i, r;
+
+	if (WARN_ON(realm->rd) || WARN_ON(!realm->params))
+		return -EEXIST;
+
+	rd = (void *)__get_free_page(GFP_KERNEL);
+	if (!rd)
+		return -ENOMEM;
+
+	rd_phys = virt_to_phys(rd);
+	if (rmi_granule_delegate(rd_phys)) {
+		r = -ENXIO;
+		goto out;
+	}
+
+	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
+	for (i = 0; i < pgd_sz; i++) {
+		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
+
+		if (rmi_granule_delegate(pgd_phys)) {
+			r = -ENXIO;
+			goto out_undelegate_tables;
+		}
+	}
+
+	params->rtt_level_start = get_start_level(kvm);
+	params->rtt_num_start = pgd_sz;
+	params->rtt_base = kvm->arch.mmu.pgd_phys;
+	params->vmid = realm->vmid;
+
+	params_phys = virt_to_phys(params);
+
+	if (rmi_realm_create(rd_phys, params_phys)) {
+		r = -ENXIO;
+		goto out_undelegate_tables;
+	}
+
+	realm->rd = rd;
+	realm->ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
+
+	if (WARN_ON(rmi_rec_aux_count(rd_phys, &realm->num_aux))) {
+		WARN_ON(rmi_realm_destroy(rd_phys));
+		goto out_undelegate_tables;
+	}
+
+	return 0;
+
+out_undelegate_tables:
+	while (--i >= 0) {
+		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
+
+		WARN_ON(rmi_granule_undelegate(pgd_phys));
+	}
+	WARN_ON(rmi_granule_undelegate(rd_phys));
+out:
+	free_page((unsigned long)rd);
+	return r;
+}
+
+/* Protects access to rme_vmid_bitmap */
+static DEFINE_SPINLOCK(rme_vmid_lock);
+static unsigned long *rme_vmid_bitmap;
+
+static int rme_vmid_init(void)
+{
+	unsigned int vmid_count = 1 << kvm_get_vmid_bits();
+
+	rme_vmid_bitmap = bitmap_zalloc(vmid_count, GFP_KERNEL);
+	if (!rme_vmid_bitmap) {
+		kvm_err("%s: Couldn't allocate rme vmid bitmap\n", __func__);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static int rme_vmid_reserve(void)
+{
+	int ret;
+	unsigned int vmid_count = 1 << kvm_get_vmid_bits();
+
+	spin_lock(&rme_vmid_lock);
+	ret = bitmap_find_free_region(rme_vmid_bitmap, vmid_count, 0);
+	spin_unlock(&rme_vmid_lock);
+
+	return ret;
+}
+
+static void rme_vmid_release(unsigned int vmid)
+{
+	spin_lock(&rme_vmid_lock);
+	bitmap_release_region(rme_vmid_bitmap, vmid, 0);
+	spin_unlock(&rme_vmid_lock);
+}
+
+static int kvm_create_realm(struct kvm *kvm)
+{
+	struct realm *realm = &kvm->arch.realm;
+	int ret;
+
+	if (!kvm_is_realm(kvm) || kvm_realm_state(kvm) != REALM_STATE_NONE)
+		return -EEXIST;
+
+	ret = rme_vmid_reserve();
+	if (ret < 0)
+		return ret;
+	realm->vmid = ret;
+
+	ret = realm_create_rd(kvm);
+	if (ret) {
+		rme_vmid_release(realm->vmid);
+		return ret;
+	}
+
+	WRITE_ONCE(realm->state, REALM_STATE_NEW);
+
+	/* The realm is up, free the parameters.  */
+	free_page((unsigned long)realm->params);
+	realm->params = NULL;
+
+	return 0;
+}
+
+static int config_realm_hash_algo(struct realm *realm,
+				  struct kvm_cap_arm_rme_config_item *cfg)
+{
+	switch (cfg->hash_algo) {
+	case KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256:
+		if (!rme_supports(RMI_FEATURE_REGISTER_0_HASH_SHA_256))
+			return -EINVAL;
+		break;
+	case KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA512:
+		if (!rme_supports(RMI_FEATURE_REGISTER_0_HASH_SHA_512))
+			return -EINVAL;
+		break;
+	default:
+		return -EINVAL;
+	}
+	realm->params->measurement_algo = cfg->hash_algo;
+	return 0;
+}
+
+static int config_realm_sve(struct realm *realm,
+			    struct kvm_cap_arm_rme_config_item *cfg)
+{
+	u64 features_0 = realm->params->features_0;
+	int max_sve_vq = u64_get_bits(rmm_feat_reg0,
+				      RMI_FEATURE_REGISTER_0_SVE_VL);
+
+	if (!rme_supports(RMI_FEATURE_REGISTER_0_SVE_EN))
+		return -EINVAL;
+
+	if (cfg->sve_vq > max_sve_vq)
+		return -EINVAL;
+
+	features_0 &= ~(RMI_FEATURE_REGISTER_0_SVE_EN |
+			RMI_FEATURE_REGISTER_0_SVE_VL);
+	features_0 |= u64_encode_bits(1, RMI_FEATURE_REGISTER_0_SVE_EN);
+	features_0 |= u64_encode_bits(cfg->sve_vq,
+				      RMI_FEATURE_REGISTER_0_SVE_VL);
+
+	realm->params->features_0 = features_0;
+	return 0;
+}
+
+static int kvm_rme_config_realm(struct kvm *kvm, struct kvm_enable_cap *cap)
+{
+	struct kvm_cap_arm_rme_config_item cfg;
+	struct realm *realm = &kvm->arch.realm;
+	int r = 0;
+
+	if (kvm_realm_state(kvm) != REALM_STATE_NONE)
+		return -EBUSY;
+
+	if (copy_from_user(&cfg, (void __user *)cap->args[1], sizeof(cfg)))
+		return -EFAULT;
+
+	switch (cfg.cfg) {
+	case KVM_CAP_ARM_RME_CFG_RPV:
+		memcpy(&realm->params->rpv, &cfg.rpv, sizeof(cfg.rpv));
+		break;
+	case KVM_CAP_ARM_RME_CFG_HASH_ALGO:
+		r = config_realm_hash_algo(realm, &cfg);
+		break;
+	case KVM_CAP_ARM_RME_CFG_SVE:
+		r = config_realm_sve(realm, &cfg);
+		break;
+	default:
+		r = -EINVAL;
+	}
+
+	return r;
+}
+
+int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
+{
+	int r = 0;
+
+	switch (cap->args[0]) {
+	case KVM_CAP_ARM_RME_CONFIG_REALM:
+		r = kvm_rme_config_realm(kvm, cap);
+		break;
+	case KVM_CAP_ARM_RME_CREATE_RD:
+		if (kvm->created_vcpus) {
+			r = -EBUSY;
+			break;
+		}
+
+		r = kvm_create_realm(kvm);
+		break;
+	default:
+		r = -EINVAL;
+		break;
+	}
+
+	return r;
+}
+
+void kvm_destroy_realm(struct kvm *kvm)
+{
+	struct realm *realm = &kvm->arch.realm;
+	struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
+	unsigned int pgd_sz;
+	int i;
+
+	if (realm->params) {
+		free_page((unsigned long)realm->params);
+		realm->params = NULL;
+	}
+
+	if (kvm_realm_state(kvm) == REALM_STATE_NONE)
+		return;
+
+	WRITE_ONCE(realm->state, REALM_STATE_DYING);
+
+	rme_vmid_release(realm->vmid);
+
+	if (realm->rd) {
+		phys_addr_t rd_phys = virt_to_phys(realm->rd);
+
+		if (WARN_ON(rmi_realm_destroy(rd_phys)))
+			return;
+		if (WARN_ON(rmi_granule_undelegate(rd_phys)))
+			return;
+		free_page((unsigned long)realm->rd);
+		realm->rd = NULL;
+	}
+
+	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
+	for (i = 0; i < pgd_sz; i++) {
+		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
+
+		if (WARN_ON(rmi_granule_undelegate(pgd_phys)))
+			return;
+	}
+
+	kvm_free_stage2_pgd(&kvm->arch.mmu);
+}
+
+int kvm_init_realm_vm(struct kvm *kvm)
+{
+	struct realm_params *params;
+
+	params = (struct realm_params *)get_zeroed_page(GFP_KERNEL);
+	if (!params)
+		return -ENOMEM;
+
+	params->features_0 = create_realm_feat_reg0(kvm);
+	kvm->arch.realm.params = params;
+	return 0;
+}
+
 int kvm_init_rme(void)
 {
+	int ret;
+
 	if (PAGE_SIZE != SZ_4K)
 		/* Only 4k page size on the host is supported */
 		return 0;
@@ -43,6 +394,12 @@ int kvm_init_rme(void)
 		/* Continue without realm support */
 		return 0;
 
+	ret = rme_vmid_init();
+	if (ret)
+		return ret;
+
+	WARN_ON(rmi_features(0, &rmm_feat_reg0));
+
 	/* Future patch will enable static branch kvm_rme_is_available */
 
 	return 0;
-- 
2.34.1



* [RFC PATCH 07/28] arm64: kvm: Allow passing machine type in KVM creation
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (5 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-02-13 16:35     ` Zhi Wang
  2023-01-27 11:29   ` [RFC PATCH 08/28] arm64: RME: Keep a spare page delegated to the RMM Steven Price
                     ` (20 subsequent siblings)
  27 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

Previously, the machine type was used purely to specify the physical
address size of the guest. Reserve the higher bits to specify an
ARM-specific machine type and declare a new type, 'KVM_VM_TYPE_ARM_REALM',
used to create a realm guest.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kvm/arm.c     | 13 +++++++++++++
 arch/arm64/kvm/mmu.c     |  3 ---
 arch/arm64/kvm/reset.c   |  3 ---
 include/uapi/linux/kvm.h | 19 +++++++++++++++----
 4 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 50f54a63732a..badd775547b8 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -147,6 +147,19 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 {
 	int ret;
 
+	if (type & ~(KVM_VM_TYPE_ARM_MASK | KVM_VM_TYPE_ARM_IPA_SIZE_MASK))
+		return -EINVAL;
+
+	switch (type & KVM_VM_TYPE_ARM_MASK) {
+	case KVM_VM_TYPE_ARM_NORMAL:
+		break;
+	case KVM_VM_TYPE_ARM_REALM:
+		kvm->arch.is_realm = true;
+		break;
+	default:
+		return -EINVAL;
+	}
+
 	ret = kvm_share_hyp(kvm, kvm + 1);
 	if (ret)
 		return ret;
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index d0f707767d05..22c00274884a 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -709,9 +709,6 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t
 	u64 mmfr0, mmfr1;
 	u32 phys_shift;
 
-	if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
-		return -EINVAL;
-
 	phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type);
 	if (is_protected_kvm_enabled()) {
 		phys_shift = kvm_ipa_limit;
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index c165df174737..9e71d69e051f 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -405,9 +405,6 @@ int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
 	if (kvm_is_realm(kvm))
 		ipa_limit = kvm_realm_ipa_limit();
 
-	if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
-		return -EINVAL;
-
 	phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type);
 	if (phys_shift) {
 		if (phys_shift > ipa_limit ||
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index fec1909e8b73..bcfc4d58dc19 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -898,14 +898,25 @@ struct kvm_ppc_resize_hpt {
 #define KVM_S390_SIE_PAGE_OFFSET 1
 
 /*
- * On arm64, machine type can be used to request the physical
- * address size for the VM. Bits[7-0] are reserved for the guest
- * PA size shift (i.e, log2(PA_Size)). For backward compatibility,
- * value 0 implies the default IPA size, 40bits.
+ * On arm64, machine type can be used to request both the machine type and
+ * the physical address size for the VM.
+ *
+ * Bits[11-8] are reserved for the ARM specific machine type.
+ *
+ * Bits[7-0] are reserved for the guest PA size shift (i.e, log2(PA_Size)).
+ * For backward compatibility, value 0 implies the default IPA size, 40bits.
  */
+#define KVM_VM_TYPE_ARM_SHIFT		8
+#define KVM_VM_TYPE_ARM_MASK		(0xfULL << KVM_VM_TYPE_ARM_SHIFT)
+#define KVM_VM_TYPE_ARM(_type)		\
+	(((_type) << KVM_VM_TYPE_ARM_SHIFT) & KVM_VM_TYPE_ARM_MASK)
+#define KVM_VM_TYPE_ARM_NORMAL		KVM_VM_TYPE_ARM(0)
+#define KVM_VM_TYPE_ARM_REALM		KVM_VM_TYPE_ARM(1)
+
 #define KVM_VM_TYPE_ARM_IPA_SIZE_MASK	0xffULL
 #define KVM_VM_TYPE_ARM_IPA_SIZE(x)		\
 	((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
+
 /*
  * ioctls for /dev/kvm fds:
  */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC PATCH 08/28] arm64: RME: Keep a spare page delegated to the RMM
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (6 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 07/28] arm64: kvm: Allow passing machine type in KVM creation Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-02-13 16:47     ` Zhi Wang
  2023-01-27 11:29   ` [RFC PATCH 09/28] arm64: RME: RTT handling Steven Price
                     ` (19 subsequent siblings)
  27 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

Pages can only be populated/destroyed on the RMM at the 4KB granule;
this requires creating the full depth of RTTs. However, if the pages are
going to be combined into a 2MB huge page the last level RTT is only
temporarily needed. Similarly, when freeing memory the huge page must be
temporarily split, requiring temporary usage of the full depth of RTTs.

To avoid needing to perform a temporary allocation and delegation of a
page for this purpose, we keep a spare delegated page around. In
particular this avoids the need for memory allocation while destroying
the realm guest.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_rme.h | 3 +++
 arch/arm64/kvm/rme.c             | 6 ++++++
 2 files changed, 9 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
index 055a22accc08..a6318af3ed11 100644
--- a/arch/arm64/include/asm/kvm_rme.h
+++ b/arch/arm64/include/asm/kvm_rme.h
@@ -21,6 +21,9 @@ struct realm {
 	void *rd;
 	struct realm_params *params;
 
+	/* A spare already delegated page */
+	phys_addr_t spare_page;
+
 	unsigned long num_aux;
 	unsigned int vmid;
 	unsigned int ia_bits;
diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
index 9f8c5a91b8fc..0c9d70e4d9e6 100644
--- a/arch/arm64/kvm/rme.c
+++ b/arch/arm64/kvm/rme.c
@@ -148,6 +148,7 @@ static int realm_create_rd(struct kvm *kvm)
 	}
 
 	realm->rd = rd;
+	realm->spare_page = PHYS_ADDR_MAX;
 	realm->ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
 
 	if (WARN_ON(rmi_rec_aux_count(rd_phys, &realm->num_aux))) {
@@ -357,6 +358,11 @@ void kvm_destroy_realm(struct kvm *kvm)
 		free_page((unsigned long)realm->rd);
 		realm->rd = NULL;
 	}
+	if (realm->spare_page != PHYS_ADDR_MAX) {
+		if (!WARN_ON(rmi_granule_undelegate(realm->spare_page)))
+			free_page((unsigned long)phys_to_virt(realm->spare_page));
+		realm->spare_page = PHYS_ADDR_MAX;
+	}
 
 	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
 	for (i = 0; i < pgd_sz; i++) {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC PATCH 09/28] arm64: RME: RTT handling
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (7 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 08/28] arm64: RME: Keep a spare page delegated to the RMM Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-02-13 17:44     ` Zhi Wang
  2024-03-18 11:01     ` Ganapatrao Kulkarni
  2023-01-27 11:29   ` [RFC PATCH 10/28] arm64: RME: Allocate/free RECs to match vCPUs Steven Price
                     ` (18 subsequent siblings)
  27 siblings, 2 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

The RMM owns the stage 2 page tables for a realm, and KVM must request
that the RMM creates/destroys entries as necessary. The physical pages
to store the page tables are delegated to the realm as required, and can
be undelegated when no longer used.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_rme.h |  19 +++++
 arch/arm64/kvm/mmu.c             |   7 +-
 arch/arm64/kvm/rme.c             | 139 +++++++++++++++++++++++++++++++
 3 files changed, 162 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
index a6318af3ed11..eea5118dfa8a 100644
--- a/arch/arm64/include/asm/kvm_rme.h
+++ b/arch/arm64/include/asm/kvm_rme.h
@@ -35,5 +35,24 @@ u32 kvm_realm_ipa_limit(void);
 int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
 int kvm_init_realm_vm(struct kvm *kvm);
 void kvm_destroy_realm(struct kvm *kvm);
+void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level);
+
+#define RME_RTT_BLOCK_LEVEL	2
+#define RME_RTT_MAX_LEVEL	3
+
+#define RME_PAGE_SHIFT		12
+#define RME_PAGE_SIZE		BIT(RME_PAGE_SHIFT)
+/* See ARM64_HW_PGTABLE_LEVEL_SHIFT() */
+#define RME_RTT_LEVEL_SHIFT(l)	\
+	((RME_PAGE_SHIFT - 3) * (4 - (l)) + 3)
+#define RME_L2_BLOCK_SIZE	BIT(RME_RTT_LEVEL_SHIFT(2))
+
+static inline unsigned long rme_rtt_level_mapsize(int level)
+{
+	if (WARN_ON(level > RME_RTT_MAX_LEVEL))
+		return RME_PAGE_SIZE;
+
+	return (1UL << RME_RTT_LEVEL_SHIFT(level));
+}
 
 #endif
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 22c00274884a..f29558c5dcbc 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -834,16 +834,17 @@ void stage2_unmap_vm(struct kvm *kvm)
 void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
 {
 	struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
-	struct kvm_pgtable *pgt = NULL;
+	struct kvm_pgtable *pgt;
 
 	write_lock(&kvm->mmu_lock);
+	pgt = mmu->pgt;
 	if (kvm_is_realm(kvm) &&
 	    kvm_realm_state(kvm) != REALM_STATE_DYING) {
-		/* TODO: teardown rtts */
 		write_unlock(&kvm->mmu_lock);
+		kvm_realm_destroy_rtts(&kvm->arch.realm, pgt->ia_bits,
+				       pgt->start_level);
 		return;
 	}
-	pgt = mmu->pgt;
 	if (pgt) {
 		mmu->pgd_phys = 0;
 		mmu->pgt = NULL;
diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
index 0c9d70e4d9e6..f7b0e5a779f8 100644
--- a/arch/arm64/kvm/rme.c
+++ b/arch/arm64/kvm/rme.c
@@ -73,6 +73,28 @@ static int rmi_check_version(void)
 	return 0;
 }
 
+static void realm_destroy_undelegate_range(struct realm *realm,
+					   unsigned long ipa,
+					   unsigned long addr,
+					   ssize_t size)
+{
+	unsigned long rd = virt_to_phys(realm->rd);
+	int ret;
+
+	while (size > 0) {
+		ret = rmi_data_destroy(rd, ipa);
+		WARN_ON(ret);
+		ret = rmi_granule_undelegate(addr);
+
+		if (ret)
+			get_page(phys_to_page(addr));
+
+		addr += PAGE_SIZE;
+		ipa += PAGE_SIZE;
+		size -= PAGE_SIZE;
+	}
+}
+
 static unsigned long create_realm_feat_reg0(struct kvm *kvm)
 {
 	unsigned long ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
@@ -170,6 +192,123 @@ static int realm_create_rd(struct kvm *kvm)
 	return r;
 }
 
+static int realm_rtt_destroy(struct realm *realm, unsigned long addr,
+			     int level, phys_addr_t rtt_granule)
+{
+	addr = ALIGN_DOWN(addr, rme_rtt_level_mapsize(level - 1));
+	return rmi_rtt_destroy(rtt_granule, virt_to_phys(realm->rd), addr,
+			level);
+}
+
+static int realm_destroy_free_rtt(struct realm *realm, unsigned long addr,
+				  int level, phys_addr_t rtt_granule)
+{
+	if (realm_rtt_destroy(realm, addr, level, rtt_granule))
+		return -ENXIO;
+	if (!WARN_ON(rmi_granule_undelegate(rtt_granule)))
+		put_page(phys_to_page(rtt_granule));
+
+	return 0;
+}
+
+static int realm_rtt_create(struct realm *realm,
+			    unsigned long addr,
+			    int level,
+			    phys_addr_t phys)
+{
+	addr = ALIGN_DOWN(addr, rme_rtt_level_mapsize(level - 1));
+	return rmi_rtt_create(phys, virt_to_phys(realm->rd), addr, level);
+}
+
+static int realm_tear_down_rtt_range(struct realm *realm, int level,
+				     unsigned long start, unsigned long end)
+{
+	phys_addr_t rd = virt_to_phys(realm->rd);
+	ssize_t map_size = rme_rtt_level_mapsize(level);
+	unsigned long addr, next_addr;
+	bool failed = false;
+
+	for (addr = start; addr < end; addr = next_addr) {
+		phys_addr_t rtt_addr, tmp_rtt;
+		struct rtt_entry rtt;
+		unsigned long end_addr;
+
+		next_addr = ALIGN(addr + 1, map_size);
+
+		end_addr = min(next_addr, end);
+
+		if (rmi_rtt_read_entry(rd, ALIGN_DOWN(addr, map_size),
+				       level, &rtt)) {
+			failed = true;
+			continue;
+		}
+
+		rtt_addr = rmi_rtt_get_phys(&rtt);
+		WARN_ON(level != rtt.walk_level);
+
+		switch (rtt.state) {
+		case RMI_UNASSIGNED:
+		case RMI_DESTROYED:
+			break;
+		case RMI_TABLE:
+			if (realm_tear_down_rtt_range(realm, level + 1,
+						      addr, end_addr)) {
+				failed = true;
+				break;
+			}
+			if (IS_ALIGNED(addr, map_size) &&
+			    next_addr <= end &&
+			    realm_destroy_free_rtt(realm, addr, level + 1,
+						   rtt_addr))
+				failed = true;
+			break;
+		case RMI_ASSIGNED:
+			WARN_ON(!rtt_addr);
+			/*
+			 * If there is a block mapping, break it now, using the
+			 * spare_page. We are sure to have a valid delegated
+			 * page at spare_page before we enter here, otherwise
+			 * WARN once, which will be followed by further
+			 * warnings.
+			 */
+			tmp_rtt = realm->spare_page;
+			if (level == 2 &&
+			    !WARN_ON_ONCE(tmp_rtt == PHYS_ADDR_MAX) &&
+			    realm_rtt_create(realm, addr,
+					     RME_RTT_MAX_LEVEL, tmp_rtt)) {
+				WARN_ON(1);
+				failed = true;
+				break;
+			}
+			realm_destroy_undelegate_range(realm, addr,
+						       rtt_addr, map_size);
+			/*
+			 * Collapse the last level table and make the spare page
+			 * reusable again.
+			 */
+			if (level == 2 &&
+			    realm_rtt_destroy(realm, addr, RME_RTT_MAX_LEVEL,
+					      tmp_rtt))
+				failed = true;
+			break;
+		case RMI_VALID_NS:
+			WARN_ON(rmi_rtt_unmap_unprotected(rd, addr, level));
+			break;
+		default:
+			WARN_ON(1);
+			failed = true;
+			break;
+		}
+	}
+
+	return failed ? -EINVAL : 0;
+}
+
+void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level)
+{
+	realm_tear_down_rtt_range(realm, start_level, 0, (1UL << ia_bits));
+}
+
 /* Protects access to rme_vmid_bitmap */
 static DEFINE_SPINLOCK(rme_vmid_lock);
 static unsigned long *rme_vmid_bitmap;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC PATCH 10/28] arm64: RME: Allocate/free RECs to match vCPUs
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (8 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 09/28] arm64: RME: RTT handling Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-02-13 18:08     ` Zhi Wang
  2023-01-27 11:29   ` [RFC PATCH 11/28] arm64: RME: Support for the VGIC in realms Steven Price
                     ` (17 subsequent siblings)
  27 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

The RMM maintains a data structure known as the Realm Execution Context
(or REC). It is similar to struct kvm_vcpu and tracks the state of the
virtual CPUs. KVM must delegate memory and request that the RMM creates
the structures when vCPUs are created, and suitably tear them down on
destruction.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_emulate.h |   2 +
 arch/arm64/include/asm/kvm_host.h    |   3 +
 arch/arm64/include/asm/kvm_rme.h     |  10 ++
 arch/arm64/kvm/arm.c                 |   1 +
 arch/arm64/kvm/reset.c               |  11 ++
 arch/arm64/kvm/rme.c                 | 144 +++++++++++++++++++++++++++
 6 files changed, 171 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 5a2b7229e83f..285e62914ca4 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -504,6 +504,8 @@ static inline enum realm_state kvm_realm_state(struct kvm *kvm)
 
 static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
 {
+	if (static_branch_unlikely(&kvm_rme_is_available))
+		return vcpu->arch.rec.mpidr != INVALID_HWID;
 	return false;
 }
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 04347c3a8c6b..ef497b718cdb 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -505,6 +505,9 @@ struct kvm_vcpu_arch {
 		u64 last_steal;
 		gpa_t base;
 	} steal;
+
+	/* Realm meta data */
+	struct rec rec;
 };
 
 /*
diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
index eea5118dfa8a..4b219ebe1400 100644
--- a/arch/arm64/include/asm/kvm_rme.h
+++ b/arch/arm64/include/asm/kvm_rme.h
@@ -6,6 +6,7 @@
 #ifndef __ASM_KVM_RME_H
 #define __ASM_KVM_RME_H
 
+#include <asm/rmi_smc.h>
 #include <uapi/linux/kvm.h>
 
 enum realm_state {
@@ -29,6 +30,13 @@ struct realm {
 	unsigned int ia_bits;
 };
 
+struct rec {
+	unsigned long mpidr;
+	void *rec_page;
+	struct page *aux_pages[REC_PARAMS_AUX_GRANULES];
+	struct rec_run *run;
+};
+
 int kvm_init_rme(void);
 u32 kvm_realm_ipa_limit(void);
 
@@ -36,6 +44,8 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
 int kvm_init_realm_vm(struct kvm *kvm);
 void kvm_destroy_realm(struct kvm *kvm);
 void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level);
+int kvm_create_rec(struct kvm_vcpu *vcpu);
+void kvm_destroy_rec(struct kvm_vcpu *vcpu);
 
 #define RME_RTT_BLOCK_LEVEL	2
 #define RME_RTT_MAX_LEVEL	3
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index badd775547b8..52affed2f3cf 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -373,6 +373,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 	/* Force users to call KVM_ARM_VCPU_INIT */
 	vcpu->arch.target = -1;
 	bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
+	vcpu->arch.rec.mpidr = INVALID_HWID;
 
 	vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
 
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 9e71d69e051f..0c84392a4bf2 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -135,6 +135,11 @@ int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature)
 			return -EPERM;
 
 		return kvm_vcpu_finalize_sve(vcpu);
+	case KVM_ARM_VCPU_REC:
+		if (!kvm_is_realm(vcpu->kvm))
+			return -EINVAL;
+
+		return kvm_create_rec(vcpu);
 	}
 
 	return -EINVAL;
@@ -145,6 +150,11 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu)
 	if (vcpu_has_sve(vcpu) && !kvm_arm_vcpu_sve_finalized(vcpu))
 		return false;
 
+	if (kvm_is_realm(vcpu->kvm) &&
+	    !(vcpu_is_rec(vcpu) &&
+	      READ_ONCE(vcpu->kvm->arch.realm.state) == REALM_STATE_ACTIVE))
+		return false;
+
 	return true;
 }
 
@@ -157,6 +167,7 @@ void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu)
 	if (sve_state)
 		kvm_unshare_hyp(sve_state, sve_state + vcpu_sve_state_size(vcpu));
 	kfree(sve_state);
+	kvm_destroy_rec(vcpu);
 }
 
 static void kvm_vcpu_reset_sve(struct kvm_vcpu *vcpu)
diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
index f7b0e5a779f8..d79ed889ca4d 100644
--- a/arch/arm64/kvm/rme.c
+++ b/arch/arm64/kvm/rme.c
@@ -514,6 +514,150 @@ void kvm_destroy_realm(struct kvm *kvm)
 	kvm_free_stage2_pgd(&kvm->arch.mmu);
 }
 
+static void free_rec_aux(struct page **aux_pages,
+			 unsigned int num_aux)
+{
+	unsigned int i;
+
+	for (i = 0; i < num_aux; i++) {
+		phys_addr_t aux_page_phys = page_to_phys(aux_pages[i]);
+
+		if (WARN_ON(rmi_granule_undelegate(aux_page_phys)))
+			continue;
+
+		__free_page(aux_pages[i]);
+	}
+}
+
+static int alloc_rec_aux(struct page **aux_pages,
+			 u64 *aux_phys_pages,
+			 unsigned int num_aux)
+{
+	int ret;
+	unsigned int i;
+
+	for (i = 0; i < num_aux; i++) {
+		struct page *aux_page;
+		phys_addr_t aux_page_phys;
+
+		aux_page = alloc_page(GFP_KERNEL);
+		if (!aux_page) {
+			ret = -ENOMEM;
+			goto out_err;
+		}
+		aux_page_phys = page_to_phys(aux_page);
+		if (rmi_granule_delegate(aux_page_phys)) {
+			__free_page(aux_page);
+			ret = -ENXIO;
+			goto out_err;
+		}
+		aux_pages[i] = aux_page;
+		aux_phys_pages[i] = aux_page_phys;
+	}
+
+	return 0;
+out_err:
+	free_rec_aux(aux_pages, i);
+	return ret;
+}
+
+int kvm_create_rec(struct kvm_vcpu *vcpu)
+{
+	struct user_pt_regs *vcpu_regs = vcpu_gp_regs(vcpu);
+	unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu);
+	struct realm *realm = &vcpu->kvm->arch.realm;
+	struct rec *rec = &vcpu->arch.rec;
+	unsigned long rec_page_phys;
+	struct rec_params *params;
+	int r, i;
+
+	if (kvm_realm_state(vcpu->kvm) != REALM_STATE_NEW)
+		return -ENOENT;
+
+	/*
+	 * The RMM will report PSCI v1.0 to Realms and the KVM_ARM_VCPU_PSCI_0_2
+	 * flag covers v0.2 and onwards.
+	 */
+	if (!test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
+		return -EINVAL;
+
+	BUILD_BUG_ON(sizeof(*params) > PAGE_SIZE);
+	BUILD_BUG_ON(sizeof(*rec->run) > PAGE_SIZE);
+
+	params = (struct rec_params *)get_zeroed_page(GFP_KERNEL);
+	rec->rec_page = (void *)__get_free_page(GFP_KERNEL);
+	rec->run = (void *)get_zeroed_page(GFP_KERNEL);
+	if (!params || !rec->rec_page || !rec->run) {
+		r = -ENOMEM;
+		goto out_free_pages;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(params->gprs); i++)
+		params->gprs[i] = vcpu_regs->regs[i];
+
+	params->pc = vcpu_regs->pc;
+
+	if (vcpu->vcpu_id == 0)
+		params->flags |= REC_PARAMS_FLAG_RUNNABLE;
+
+	rec_page_phys = virt_to_phys(rec->rec_page);
+
+	if (rmi_granule_delegate(rec_page_phys)) {
+		r = -ENXIO;
+		goto out_free_pages;
+	}
+
+	r = alloc_rec_aux(rec->aux_pages, params->aux, realm->num_aux);
+	if (r)
+		goto out_undelegate_rmm_rec;
+
+	params->num_rec_aux = realm->num_aux;
+	params->mpidr = mpidr;
+
+	if (rmi_rec_create(rec_page_phys,
+			   virt_to_phys(realm->rd),
+			   virt_to_phys(params))) {
+		r = -ENXIO;
+		goto out_free_rec_aux;
+	}
+
+	rec->mpidr = mpidr;
+
+	free_page((unsigned long)params);
+	return 0;
+
+out_free_rec_aux:
+	free_rec_aux(rec->aux_pages, realm->num_aux);
+out_undelegate_rmm_rec:
+	if (WARN_ON(rmi_granule_undelegate(rec_page_phys)))
+		rec->rec_page = NULL;
+out_free_pages:
+	free_page((unsigned long)rec->run);
+	free_page((unsigned long)rec->rec_page);
+	free_page((unsigned long)params);
+	return r;
+}
+
+void kvm_destroy_rec(struct kvm_vcpu *vcpu)
+{
+	struct realm *realm = &vcpu->kvm->arch.realm;
+	struct rec *rec = &vcpu->arch.rec;
+	unsigned long rec_page_phys;
+
+	if (!vcpu_is_rec(vcpu))
+		return;
+
+	rec_page_phys = virt_to_phys(rec->rec_page);
+
+	if (WARN_ON(rmi_rec_destroy(rec_page_phys)))
+		return;
+	if (WARN_ON(rmi_granule_undelegate(rec_page_phys)))
+		return;
+
+	free_rec_aux(rec->aux_pages, realm->num_aux);
+	free_page((unsigned long)rec->rec_page);
+}
+
 int kvm_init_realm_vm(struct kvm *kvm)
 {
 	struct realm_params *params;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC PATCH 11/28] arm64: RME: Support for the VGIC in realms
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (9 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 10/28] arm64: RME: Allocate/free RECs to match vCPUs Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 12/28] KVM: arm64: Support timers in realm RECs Steven Price
                     ` (16 subsequent siblings)
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

The RMM provides emulation of a VGIC to the realm guest but delegates
much of the handling to the host. Implement support in KVM for
saving/restoring state to/from the REC structure.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kvm/arm.c          | 15 +++++++++++---
 arch/arm64/kvm/vgic/vgic-v3.c |  9 +++++++--
 arch/arm64/kvm/vgic/vgic.c    | 37 +++++++++++++++++++++++++++++++++--
 3 files changed, 54 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 52affed2f3cf..1b2547516f62 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -475,17 +475,22 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
+	kvm_timer_vcpu_put(vcpu);
+	kvm_vgic_put(vcpu);
+
+	vcpu->cpu = -1;
+
+	if (vcpu_is_rec(vcpu))
+		return;
+
 	kvm_arch_vcpu_put_debug_state_flags(vcpu);
 	kvm_arch_vcpu_put_fp(vcpu);
 	if (has_vhe())
 		kvm_vcpu_put_sysregs_vhe(vcpu);
-	kvm_timer_vcpu_put(vcpu);
-	kvm_vgic_put(vcpu);
 	kvm_vcpu_pmu_restore_host(vcpu);
 	kvm_arm_vmid_clear_active();
 
 	vcpu_clear_on_unsupported_cpu(vcpu);
-	vcpu->cpu = -1;
 }
 
 void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu)
@@ -623,6 +628,10 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
 	}
 
 	if (!irqchip_in_kernel(kvm)) {
+		/* Userspace irqchip not yet supported with Realms */
+		if (kvm_is_realm(vcpu->kvm))
+			return -EOPNOTSUPP;
+
 		/*
 		 * Tell the rest of the code that there are userspace irqchip
 		 * VMs in the wild.
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 826ff6f2a4e7..121c7a68c397 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -6,9 +6,11 @@
 #include <linux/kvm.h>
 #include <linux/kvm_host.h>
 #include <kvm/arm_vgic.h>
+#include <asm/kvm_emulate.h>
 #include <asm/kvm_hyp.h>
 #include <asm/kvm_mmu.h>
 #include <asm/kvm_asm.h>
+#include <asm/rmi_smc.h>
 
 #include "vgic.h"
 
@@ -669,7 +671,8 @@ int vgic_v3_probe(const struct gic_kvm_info *info)
 			(unsigned long long)info->vcpu.start);
 	} else if (kvm_get_mode() != KVM_MODE_PROTECTED) {
 		kvm_vgic_global_state.vcpu_base = info->vcpu.start;
-		kvm_vgic_global_state.can_emulate_gicv2 = true;
+		if (!static_branch_unlikely(&kvm_rme_is_available))
+			kvm_vgic_global_state.can_emulate_gicv2 = true;
 		ret = kvm_register_vgic_device(KVM_DEV_TYPE_ARM_VGIC_V2);
 		if (ret) {
 			kvm_err("Cannot register GICv2 KVM device.\n");
@@ -744,7 +747,9 @@ void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
 
-	if (likely(cpu_if->vgic_sre))
+	if (vcpu_is_rec(vcpu))
+		cpu_if->vgic_vmcr = vcpu->arch.rec.run->exit.gicv3_vmcr;
+	else if (likely(cpu_if->vgic_sre))
 		cpu_if->vgic_vmcr = kvm_call_hyp_ret(__vgic_v3_read_vmcr);
 }
 
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index d97e6080b421..bc77660f7051 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -10,7 +10,9 @@
 #include <linux/list_sort.h>
 #include <linux/nospec.h>
 
+#include <asm/kvm_emulate.h>
 #include <asm/kvm_hyp.h>
+#include <asm/rmi_smc.h>
 
 #include "vgic.h"
 
@@ -848,10 +850,23 @@ static inline bool can_access_vgic_from_kernel(void)
 	return !static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif) || has_vhe();
 }
 
+static inline void vgic_rmm_save_state(struct kvm_vcpu *vcpu)
+{
+	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
+	int i;
+
+	for (i = 0; i < kvm_vgic_global_state.nr_lr; i++) {
+		cpu_if->vgic_lr[i] = vcpu->arch.rec.run->exit.gicv3_lrs[i];
+		vcpu->arch.rec.run->entry.gicv3_lrs[i] = 0;
+	}
+}
+
 static inline void vgic_save_state(struct kvm_vcpu *vcpu)
 {
 	if (!static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
 		vgic_v2_save_state(vcpu);
+	else if (vcpu_is_rec(vcpu))
+		vgic_rmm_save_state(vcpu);
 	else
 		__vgic_v3_save_state(&vcpu->arch.vgic_cpu.vgic_v3);
 }
@@ -878,10 +893,28 @@ void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
 	vgic_prune_ap_list(vcpu);
 }
 
+static inline void vgic_rmm_restore_state(struct kvm_vcpu *vcpu)
+{
+	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
+	int i;
+
+	for (i = 0; i < kvm_vgic_global_state.nr_lr; i++) {
+		vcpu->arch.rec.run->entry.gicv3_lrs[i] = cpu_if->vgic_lr[i];
+		/*
+		 * Also populate the rec.run->exit copies so that a late
+		 * decision to back out from entering the realm doesn't cause
+		 * the state to be lost
+		 */
+		vcpu->arch.rec.run->exit.gicv3_lrs[i] = cpu_if->vgic_lr[i];
+	}
+}
+
 static inline void vgic_restore_state(struct kvm_vcpu *vcpu)
 {
 	if (!static_branch_unlikely(&kvm_vgic_global_state.gicv3_cpuif))
 		vgic_v2_restore_state(vcpu);
+	else if (vcpu_is_rec(vcpu))
+		vgic_rmm_restore_state(vcpu);
 	else
 		__vgic_v3_restore_state(&vcpu->arch.vgic_cpu.vgic_v3);
 }
@@ -922,7 +955,7 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
 
 void kvm_vgic_load(struct kvm_vcpu *vcpu)
 {
-	if (unlikely(!vgic_initialized(vcpu->kvm)))
+	if (unlikely(!vgic_initialized(vcpu->kvm)) || vcpu_is_rec(vcpu))
 		return;
 
 	if (kvm_vgic_global_state.type == VGIC_V2)
@@ -933,7 +966,7 @@ void kvm_vgic_load(struct kvm_vcpu *vcpu)
 
 void kvm_vgic_put(struct kvm_vcpu *vcpu)
 {
-	if (unlikely(!vgic_initialized(vcpu->kvm)))
+	if (unlikely(!vgic_initialized(vcpu->kvm)) || vcpu_is_rec(vcpu))
 		return;
 
 	if (kvm_vgic_global_state.type == VGIC_V2)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC PATCH 12/28] KVM: arm64: Support timers in realm RECs
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (10 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 11/28] arm64: RME: Support for the VGIC in realms Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2024-03-18 11:28     ` Ganapatrao Kulkarni
  2023-01-27 11:29   ` [RFC PATCH 13/28] arm64: RME: Allow VMM to set RIPAS Steven Price
                     ` (15 subsequent siblings)
  27 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

The RMM keeps track of the timer while the realm REC is running, but on
exit to the normal world KVM is responsible for handling the timers.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kvm/arch_timer.c  | 53 ++++++++++++++++++++++++++++++++----
 include/kvm/arm_arch_timer.h |  2 ++
 2 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index bb24a76b4224..d4af9ee58550 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -130,6 +130,11 @@ static void timer_set_offset(struct arch_timer_context *ctxt, u64 offset)
 {
 	struct kvm_vcpu *vcpu = ctxt->vcpu;
 
+	if (kvm_is_realm(vcpu->kvm)) {
+		WARN_ON(offset);
+		return;
+	}
+
 	switch(arch_timer_ctx_index(ctxt)) {
 	case TIMER_VTIMER:
 		__vcpu_sys_reg(vcpu, CNTVOFF_EL2) = offset;
@@ -411,6 +416,21 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
 	}
 }
 
+void kvm_realm_timers_update(struct kvm_vcpu *vcpu)
+{
+	struct arch_timer_cpu *arch_timer = &vcpu->arch.timer_cpu;
+	int i;
+
+	for (i = 0; i < NR_KVM_TIMERS; i++) {
+		struct arch_timer_context *timer = &arch_timer->timers[i];
+		bool status = timer_get_ctl(timer) & ARCH_TIMER_CTRL_IT_STAT;
+		bool level = kvm_timer_irq_can_fire(timer) && status;
+
+		if (level != timer->irq.level)
+			kvm_timer_update_irq(vcpu, level, timer);
+	}
+}
+
 /* Only called for a fully emulated timer */
 static void timer_emulate(struct arch_timer_context *ctx)
 {
@@ -621,6 +641,11 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
 	if (unlikely(!timer->enabled))
 		return;
 
+	kvm_timer_unblocking(vcpu);
+
+	if (vcpu_is_rec(vcpu))
+		return;
+
 	get_timer_map(vcpu, &map);
 
 	if (static_branch_likely(&has_gic_active_state)) {
@@ -633,8 +658,6 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
 
 	set_cntvoff(timer_get_offset(map.direct_vtimer));
 
-	kvm_timer_unblocking(vcpu);
-
 	timer_restore_state(map.direct_vtimer);
 	if (map.direct_ptimer)
 		timer_restore_state(map.direct_ptimer);
@@ -668,6 +691,9 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
 	if (unlikely(!timer->enabled))
 		return;
 
+	if (vcpu_is_rec(vcpu))
+		goto out;
+
 	get_timer_map(vcpu, &map);
 
 	timer_save_state(map.direct_vtimer);
@@ -686,9 +712,6 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
 	if (map.emul_ptimer)
 		soft_timer_cancel(&map.emul_ptimer->hrtimer);
 
-	if (kvm_vcpu_is_blocking(vcpu))
-		kvm_timer_blocking(vcpu);
-
 	/*
 	 * The kernel may decide to run userspace after calling vcpu_put, so
 	 * we reset cntvoff to 0 to ensure a consistent read between user
@@ -697,6 +720,11 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
 	 * virtual offset of zero, so no need to zero CNTVOFF_EL2 register.
 	 */
 	set_cntvoff(0);
+
+out:
+	if (kvm_vcpu_is_blocking(vcpu))
+		kvm_timer_blocking(vcpu);
+
 }
 
 /*
@@ -785,12 +813,18 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
 	struct arch_timer_cpu *timer = vcpu_timer(vcpu);
 	struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
 	struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
+	u64 cntvoff;
 
 	vtimer->vcpu = vcpu;
 	ptimer->vcpu = vcpu;
 
+	if (kvm_is_realm(vcpu->kvm))
+		cntvoff = 0;
+	else
+		cntvoff = kvm_phys_timer_read();
+
 	/* Synchronize cntvoff across all vtimers of a VM. */
-	update_vtimer_cntvoff(vcpu, kvm_phys_timer_read());
+	update_vtimer_cntvoff(vcpu, cntvoff);
 	timer_set_offset(ptimer, 0);
 
 	hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
@@ -1265,6 +1299,13 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
 		return -EINVAL;
 	}
 
+	/*
+	 * We don't use mapped IRQs for Realms because the RMI doesn't allow
+	 * us setting the LR.HW bit in the VGIC.
+	 */
+	if (vcpu_is_rec(vcpu))
+		return 0;
+
 	get_timer_map(vcpu, &map);
 
 	ret = kvm_vgic_map_phys_irq(vcpu,
diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index cd6d8f260eab..158280e15a33 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -76,6 +76,8 @@ int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr);
 int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr);
 int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr);
 
+void kvm_realm_timers_update(struct kvm_vcpu *vcpu);
+
 u64 kvm_phys_timer_read(void);
 
 void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC PATCH 13/28] arm64: RME: Allow VMM to set RIPAS
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (11 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 12/28] KVM: arm64: Support timers in realm RECs Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-02-17 13:07     ` Zhi Wang
  2023-01-27 11:29   ` [RFC PATCH 14/28] arm64: RME: Handle realm enter/exit Steven Price
                     ` (14 subsequent siblings)
  27 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

Each page within the protected region of the realm guest can be marked
as either RAM or EMPTY. Allow the VMM to control this before the guest
has started and provide the equivalent functions to change this (with
the guest's approval) at runtime.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_rme.h |   4 +
 arch/arm64/kvm/rme.c             | 288 +++++++++++++++++++++++++++++++
 2 files changed, 292 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
index 4b219ebe1400..3e75cedaad18 100644
--- a/arch/arm64/include/asm/kvm_rme.h
+++ b/arch/arm64/include/asm/kvm_rme.h
@@ -47,6 +47,10 @@ void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level);
 int kvm_create_rec(struct kvm_vcpu *vcpu);
 void kvm_destroy_rec(struct kvm_vcpu *vcpu);
 
+int realm_set_ipa_state(struct kvm_vcpu *vcpu,
+			unsigned long addr, unsigned long end,
+			unsigned long ripas);
+
 #define RME_RTT_BLOCK_LEVEL	2
 #define RME_RTT_MAX_LEVEL	3
 
diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
index d79ed889ca4d..b3ea79189839 100644
--- a/arch/arm64/kvm/rme.c
+++ b/arch/arm64/kvm/rme.c
@@ -73,6 +73,58 @@ static int rmi_check_version(void)
 	return 0;
 }
 
+static phys_addr_t __alloc_delegated_page(struct realm *realm,
+					  struct kvm_mmu_memory_cache *mc, gfp_t flags)
+{
+	phys_addr_t phys = PHYS_ADDR_MAX;
+	void *virt;
+
+	if (realm->spare_page != PHYS_ADDR_MAX) {
+		swap(realm->spare_page, phys);
+		goto out;
+	}
+
+	if (mc)
+		virt = kvm_mmu_memory_cache_alloc(mc);
+	else
+		virt = (void *)__get_free_page(flags);
+
+	if (!virt)
+		goto out;
+
+	phys = virt_to_phys(virt);
+
+	if (rmi_granule_delegate(phys)) {
+		free_page((unsigned long)virt);
+
+		phys = PHYS_ADDR_MAX;
+	}
+
+out:
+	return phys;
+}
+
+static phys_addr_t alloc_delegated_page(struct realm *realm,
+					struct kvm_mmu_memory_cache *mc)
+{
+	return __alloc_delegated_page(realm, mc, GFP_KERNEL);
+}
+
+static void free_delegated_page(struct realm *realm, phys_addr_t phys)
+{
+	if (realm->spare_page == PHYS_ADDR_MAX) {
+		realm->spare_page = phys;
+		return;
+	}
+
+	if (WARN_ON(rmi_granule_undelegate(phys))) {
+		/* Undelegate failed: leak the page */
+		return;
+	}
+
+	free_page((unsigned long)phys_to_virt(phys));
+}
+
 static void realm_destroy_undelegate_range(struct realm *realm,
 					   unsigned long ipa,
 					   unsigned long addr,
@@ -220,6 +272,30 @@ static int realm_rtt_create(struct realm *realm,
 	return rmi_rtt_create(phys, virt_to_phys(realm->rd), addr, level);
 }
 
+static int realm_create_rtt_levels(struct realm *realm,
+				   unsigned long ipa,
+				   int level,
+				   int max_level,
+				   struct kvm_mmu_memory_cache *mc)
+{
+	if (WARN_ON(level == max_level))
+		return 0;
+
+	while (level++ < max_level) {
+		phys_addr_t rtt = alloc_delegated_page(realm, mc);
+
+		if (rtt == PHYS_ADDR_MAX)
+			return -ENOMEM;
+
+		if (realm_rtt_create(realm, ipa, level, rtt)) {
+			free_delegated_page(realm, rtt);
+			return -ENXIO;
+		}
+	}
+
+	return 0;
+}
+
 static int realm_tear_down_rtt_range(struct realm *realm, int level,
 				     unsigned long start, unsigned long end)
 {
@@ -309,6 +385,206 @@ void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level)
 	realm_tear_down_rtt_range(realm, start_level, 0, (1UL << ia_bits));
 }
 
+void kvm_realm_unmap_range(struct kvm *kvm, unsigned long ipa, u64 size)
+{
+	u32 ia_bits = kvm->arch.mmu.pgt->ia_bits;
+	u32 start_level = kvm->arch.mmu.pgt->start_level;
+	unsigned long end = ipa + size;
+	struct realm *realm = &kvm->arch.realm;
+	phys_addr_t tmp_rtt = PHYS_ADDR_MAX;
+
+	if (end > (1UL << ia_bits))
+		end = 1UL << ia_bits;
+	/*
+	 * Make sure we have a spare delegated page for tearing down the
+	 * block mappings. We must use atomic allocations as we are called
+	 * with kvm->mmu_lock held.
+	 */
+	if (realm->spare_page == PHYS_ADDR_MAX) {
+		tmp_rtt = __alloc_delegated_page(realm, NULL, GFP_ATOMIC);
+		/*
+		 * We don't have to check the status here, as we may not
+		 * have a block level mapping. Delay any error to the point
+		 * where we need it.
+		 */
+		realm->spare_page = tmp_rtt;
+	}
+
+	realm_tear_down_rtt_range(&kvm->arch.realm, start_level, ipa, end);
+
+	/* Free up the atomic page, if there were any */
+	if (tmp_rtt != PHYS_ADDR_MAX) {
+		free_delegated_page(realm, tmp_rtt);
+		/*
+	 * Update spare_page only after freeing the page
+	 * above, so that the freed page doesn't get
+	 * cached back into spare_page.
+	 * This should be reworked to always keep a
+	 * dedicated page for handling block mappings.
+		 */
+		realm->spare_page = PHYS_ADDR_MAX;
+	}
+}
+
+static int set_ipa_state(struct kvm_vcpu *vcpu,
+			 unsigned long ipa,
+			 unsigned long end,
+			 int level,
+			 unsigned long ripas)
+{
+	struct kvm *kvm = vcpu->kvm;
+	struct realm *realm = &kvm->arch.realm;
+	struct rec *rec = &vcpu->arch.rec;
+	phys_addr_t rd_phys = virt_to_phys(realm->rd);
+	phys_addr_t rec_phys = virt_to_phys(rec->rec_page);
+	unsigned long map_size = rme_rtt_level_mapsize(level);
+	int ret;
+
+	while (ipa < end) {
+		ret = rmi_rtt_set_ripas(rd_phys, rec_phys, ipa, level, ripas);
+
+		if (!ret) {
+			if (!ripas)
+				kvm_realm_unmap_range(kvm, ipa, map_size);
+		} else if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
+			int walk_level = RMI_RETURN_INDEX(ret);
+
+			if (walk_level < level) {
+				ret = realm_create_rtt_levels(realm, ipa,
+							      walk_level,
+							      level, NULL);
+				if (ret)
+					return ret;
+				continue;
+			}
+
+			if (WARN_ON(level >= RME_RTT_MAX_LEVEL))
+				return -EINVAL;
+
+			/* Recurse one level lower */
+			ret = set_ipa_state(vcpu, ipa, ipa + map_size,
+					    level + 1, ripas);
+			if (ret)
+				return ret;
+		} else {
+			WARN(1, "Unexpected error in %s: %#x\n", __func__,
+			     ret);
+			return -EINVAL;
+		}
+		ipa += map_size;
+	}
+
+	return 0;
+}
+
+static int realm_init_ipa_state(struct realm *realm,
+				unsigned long ipa,
+				unsigned long end,
+				int level)
+{
+	unsigned long map_size = rme_rtt_level_mapsize(level);
+	phys_addr_t rd_phys = virt_to_phys(realm->rd);
+	int ret;
+
+	while (ipa < end) {
+		ret = rmi_rtt_init_ripas(rd_phys, ipa, level);
+
+		if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
+			int cur_level = RMI_RETURN_INDEX(ret);
+
+			if (cur_level < level) {
+				ret = realm_create_rtt_levels(realm, ipa,
+							      cur_level,
+							      level, NULL);
+				if (ret)
+					return ret;
+				/* Retry with the RTT levels in place */
+				continue;
+			}
+
+			/* There's an entry at a lower level, recurse */
+			if (WARN_ON(level >= RME_RTT_MAX_LEVEL))
+				return -EINVAL;
+
+			realm_init_ipa_state(realm, ipa, ipa + map_size,
+					     level + 1);
+		} else if (WARN_ON(ret)) {
+			return -ENXIO;
+		}
+
+		ipa += map_size;
+	}
+
+	return 0;
+}
+
+static int find_map_level(struct kvm *kvm, unsigned long start, unsigned long end)
+{
+	int level = RME_RTT_MAX_LEVEL;
+
+	while (level > get_start_level(kvm) + 1) {
+		unsigned long map_size = rme_rtt_level_mapsize(level - 1);
+
+		if (!IS_ALIGNED(start, map_size) ||
+		    (start + map_size) > end)
+			break;
+
+		level--;
+	}
+
+	return level;
+}
+
+int realm_set_ipa_state(struct kvm_vcpu *vcpu,
+			unsigned long addr, unsigned long end,
+			unsigned long ripas)
+{
+	int ret = 0;
+
+	while (addr < end) {
+		int level = find_map_level(vcpu->kvm, addr, end);
+		unsigned long map_size = rme_rtt_level_mapsize(level);
+
+		ret = set_ipa_state(vcpu, addr, addr + map_size, level, ripas);
+		if (ret)
+			break;
+
+		addr += map_size;
+	}
+
+	return ret;
+}
+
+static int kvm_init_ipa_range_realm(struct kvm *kvm,
+				    struct kvm_cap_arm_rme_init_ipa_args *args)
+{
+	int ret = 0;
+	gpa_t addr, end;
+	struct realm *realm = &kvm->arch.realm;
+
+	addr = args->init_ipa_base;
+	end = addr + args->init_ipa_size;
+
+	if (end < addr)
+		return -EINVAL;
+
+	if (kvm_realm_state(kvm) != REALM_STATE_NEW)
+		return -EBUSY;
+
+	while (addr < end) {
+		int level = find_map_level(kvm, addr, end);
+		unsigned long map_size = rme_rtt_level_mapsize(level);
+
+		ret = realm_init_ipa_state(realm, addr, addr + map_size, level);
+		if (ret)
+			break;
+
+		addr += map_size;
+	}
+
+	return ret;
+}
+
 /* Protects access to rme_vmid_bitmap */
 static DEFINE_SPINLOCK(rme_vmid_lock);
 static unsigned long *rme_vmid_bitmap;
@@ -460,6 +736,18 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
 
 		r = kvm_create_realm(kvm);
 		break;
+	case KVM_CAP_ARM_RME_INIT_IPA_REALM: {
+		struct kvm_cap_arm_rme_init_ipa_args args;
+		void __user *argp = u64_to_user_ptr(cap->args[1]);
+
+		if (copy_from_user(&args, argp, sizeof(args))) {
+			r = -EFAULT;
+			break;
+		}
+
+		r = kvm_init_ipa_range_realm(kvm, &args);
+		break;
+	}
 	default:
 		r = -EINVAL;
 		break;
-- 
2.34.1



* [RFC PATCH 14/28] arm64: RME: Handle realm enter/exit
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (12 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 13/28] arm64: RME: Allow VMM to set RIPAS Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 15/28] KVM: arm64: Handle realm MMIO emulation Steven Price
                     ` (13 subsequent siblings)
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

Entering a realm is done using an SMC call to the RMM. On exit, the
exit codes need to be handled slightly differently from the normal KVM
path, so define our own functions for realm enter/exit and hook them
in if the guest is a realm guest.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_rme.h |  11 ++
 arch/arm64/kvm/Makefile          |   2 +-
 arch/arm64/kvm/arm.c             |  19 +++-
 arch/arm64/kvm/rme-exit.c        | 168 +++++++++++++++++++++++++++++++
 arch/arm64/kvm/rme.c             |  11 ++
 5 files changed, 205 insertions(+), 6 deletions(-)
 create mode 100644 arch/arm64/kvm/rme-exit.c

diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
index 3e75cedaad18..9d1583c44a99 100644
--- a/arch/arm64/include/asm/kvm_rme.h
+++ b/arch/arm64/include/asm/kvm_rme.h
@@ -47,6 +47,9 @@ void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level);
 int kvm_create_rec(struct kvm_vcpu *vcpu);
 void kvm_destroy_rec(struct kvm_vcpu *vcpu);
 
+int kvm_rec_enter(struct kvm_vcpu *vcpu);
+int handle_rme_exit(struct kvm_vcpu *vcpu, int rec_run_status);
+
 int realm_set_ipa_state(struct kvm_vcpu *vcpu,
 			unsigned long addr, unsigned long end,
 			unsigned long ripas);
@@ -69,4 +72,12 @@ static inline unsigned long rme_rtt_level_mapsize(int level)
 	return (1UL << RME_RTT_LEVEL_SHIFT(level));
 }
 
+static inline bool realm_is_addr_protected(struct realm *realm,
+					   unsigned long addr)
+{
+	unsigned int ia_bits = realm->ia_bits;
+
+	return !(addr & ~(BIT(ia_bits - 1) - 1));
+}
+
 #endif
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index d2f0400c50da..884c7c44439f 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -21,7 +21,7 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
 	 vgic/vgic-mmio.o vgic/vgic-mmio-v2.o \
 	 vgic/vgic-mmio-v3.o vgic/vgic-kvm-device.o \
 	 vgic/vgic-its.o vgic/vgic-debug.o \
-	 rme.o
+	 rme.o rme-exit.o
 
 kvm-$(CONFIG_HW_PERF_EVENTS)  += pmu-emul.o pmu.o
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 1b2547516f62..fd9e28f48903 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -985,7 +985,10 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 		trace_kvm_entry(*vcpu_pc(vcpu));
 		guest_timing_enter_irqoff();
 
-		ret = kvm_arm_vcpu_enter_exit(vcpu);
+		if (vcpu_is_rec(vcpu))
+			ret = kvm_rec_enter(vcpu);
+		else
+			ret = kvm_arm_vcpu_enter_exit(vcpu);
 
 		vcpu->mode = OUTSIDE_GUEST_MODE;
 		vcpu->stat.exits++;
@@ -1039,10 +1042,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 
 		local_irq_enable();
 
-		trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
-
 		/* Exit types that need handling before we can be preempted */
-		handle_exit_early(vcpu, ret);
+		if (!vcpu_is_rec(vcpu)) {
+			trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu),
+				       *vcpu_pc(vcpu));
+
+			handle_exit_early(vcpu, ret);
+		}
 
 		preempt_enable();
 
@@ -1065,7 +1071,10 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 			ret = ARM_EXCEPTION_IL;
 		}
 
-		ret = handle_exit(vcpu, ret);
+		if (vcpu_is_rec(vcpu))
+			ret = handle_rme_exit(vcpu, ret);
+		else
+			ret = handle_exit(vcpu, ret);
 	}
 
 	/* Tell userspace about in-kernel device output levels */
diff --git a/arch/arm64/kvm/rme-exit.c b/arch/arm64/kvm/rme-exit.c
new file mode 100644
index 000000000000..15a4ff3517db
--- /dev/null
+++ b/arch/arm64/kvm/rme-exit.c
@@ -0,0 +1,168 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023 ARM Ltd.
+ */
+
+#include <linux/kvm_host.h>
+#include <kvm/arm_psci.h>
+
+#include <asm/rmi_smc.h>
+#include <asm/kvm_emulate.h>
+#include <asm/kvm_rme.h>
+#include <asm/kvm_mmu.h>
+
+typedef int (*exit_handler_fn)(struct kvm_vcpu *vcpu);
+
+static int rec_exit_reason_notimpl(struct kvm_vcpu *vcpu)
+{
+	struct rec *rec = &vcpu->arch.rec;
+
+	pr_err("[vcpu %d] Unhandled exit reason from realm (ESR: %#llx)\n",
+	       vcpu->vcpu_id, rec->run->exit.esr);
+	return -ENXIO;
+}
+
+static int rec_exit_sync_dabt(struct kvm_vcpu *vcpu)
+{
+	struct rec *rec = &vcpu->arch.rec;
+
+	if (kvm_vcpu_dabt_iswrite(vcpu) && kvm_vcpu_dabt_isvalid(vcpu))
+		vcpu_set_reg(vcpu, kvm_vcpu_dabt_get_rd(vcpu),
+			     rec->run->exit.gprs[0]);
+
+	return kvm_handle_guest_abort(vcpu);
+}
+
+static int rec_exit_sync_iabt(struct kvm_vcpu *vcpu)
+{
+	struct rec *rec = &vcpu->arch.rec;
+
+	pr_err("[vcpu %d] Unhandled instruction abort (ESR: %#llx).\n",
+	       vcpu->vcpu_id, rec->run->exit.esr);
+	return -ENXIO;
+}
+
+static int rec_exit_sys_reg(struct kvm_vcpu *vcpu)
+{
+	struct rec *rec = &vcpu->arch.rec;
+	unsigned long esr = kvm_vcpu_get_esr(vcpu);
+	int rt = kvm_vcpu_sys_get_rt(vcpu);
+	bool is_write = !(esr & 1);
+	int ret;
+
+	if (is_write)
+		vcpu_set_reg(vcpu, rt, rec->run->exit.gprs[0]);
+
+	ret = kvm_handle_sys_reg(vcpu);
+
+	if (ret >= 0 && !is_write)
+		rec->run->entry.gprs[0] = vcpu_get_reg(vcpu, rt);
+
+	return ret;
+}
+
+static exit_handler_fn rec_exit_handlers[] = {
+	[0 ... ESR_ELx_EC_MAX]	= rec_exit_reason_notimpl,
+	[ESR_ELx_EC_SYS64]	= rec_exit_sys_reg,
+	[ESR_ELx_EC_DABT_LOW]	= rec_exit_sync_dabt,
+	[ESR_ELx_EC_IABT_LOW]	= rec_exit_sync_iabt
+};
+
+static int rec_exit_psci(struct kvm_vcpu *vcpu)
+{
+	struct rec *rec = &vcpu->arch.rec;
+	int i;
+
+	for (i = 0; i < REC_RUN_GPRS; i++)
+		vcpu_set_reg(vcpu, i, rec->run->exit.gprs[i]);
+
+	return kvm_psci_call(vcpu);
+}
+
+static int rec_exit_ripas_change(struct kvm_vcpu *vcpu)
+{
+	struct realm *realm = &vcpu->kvm->arch.realm;
+	struct rec *rec = &vcpu->arch.rec;
+	unsigned long base = rec->run->exit.ripas_base;
+	unsigned long size = rec->run->exit.ripas_size;
+	unsigned long ripas = rec->run->exit.ripas_value & 1;
+	int ret = -EINVAL;
+
+	if (realm_is_addr_protected(realm, base) &&
+	    realm_is_addr_protected(realm, base + size))
+		ret = realm_set_ipa_state(vcpu, base, base + size, ripas);
+
+	WARN(ret, "Unable to satisfy SET_RIPAS for %#lx - %#lx, ripas: %#lx\n",
+	     base, base + size, ripas);
+
+	return 1;
+}
+
+static void update_arch_timer_irq_lines(struct kvm_vcpu *vcpu)
+{
+	struct rec *rec = &vcpu->arch.rec;
+
+	__vcpu_sys_reg(vcpu, CNTV_CTL_EL0) = rec->run->exit.cntv_ctl;
+	__vcpu_sys_reg(vcpu, CNTV_CVAL_EL0) = rec->run->exit.cntv_cval;
+	__vcpu_sys_reg(vcpu, CNTP_CTL_EL0) = rec->run->exit.cntp_ctl;
+	__vcpu_sys_reg(vcpu, CNTP_CVAL_EL0) = rec->run->exit.cntp_cval;
+
+	kvm_realm_timers_update(vcpu);
+}
+
+/*
+ * Return > 0 to return to guest, < 0 on error, 0 (and set exit_reason) on
+ * proper exit to userspace.
+ */
+int handle_rme_exit(struct kvm_vcpu *vcpu, int rec_run_ret)
+{
+	struct rec *rec = &vcpu->arch.rec;
+	u8 esr_ec = ESR_ELx_EC(rec->run->exit.esr);
+	unsigned long status, index;
+
+	status = RMI_RETURN_STATUS(rec_run_ret);
+	index = RMI_RETURN_INDEX(rec_run_ret);
+
+	/*
+	 * If a PSCI_SYSTEM_OFF request raced with a vcpu executing, we might
+	 * see the following status code and index indicating an attempt to run
+	 * a REC when the RD state is SYSTEM_OFF.  In this case, we just need to
+	 * return to user space which can deal with the system event or will try
+	 * to run the KVM VCPU again, at which point we will no longer attempt
+	 * to enter the Realm because we will have a sleep request pending on
+	 * the VCPU as a result of KVM's PSCI handling.
+	 */
+	if (status == RMI_ERROR_REALM && index == 1) {
+		vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
+		return 0;
+	}
+
+	if (rec_run_ret)
+		return -ENXIO;
+
+	vcpu->arch.fault.esr_el2 = rec->run->exit.esr;
+	vcpu->arch.fault.far_el2 = rec->run->exit.far;
+	vcpu->arch.fault.hpfar_el2 = rec->run->exit.hpfar;
+
+	update_arch_timer_irq_lines(vcpu);
+
+	/* Reset the emulation flags for the next run of the REC */
+	rec->run->entry.flags = 0;
+
+	switch (rec->run->exit.exit_reason) {
+	case RMI_EXIT_SYNC:
+		return rec_exit_handlers[esr_ec](vcpu);
+	case RMI_EXIT_IRQ:
+	case RMI_EXIT_FIQ:
+		return 1;
+	case RMI_EXIT_PSCI:
+		return rec_exit_psci(vcpu);
+	case RMI_EXIT_RIPAS_CHANGE:
+		return rec_exit_ripas_change(vcpu);
+	}
+
+	kvm_pr_unimpl("Unsupported exit reason: %u\n",
+		      rec->run->exit.exit_reason);
+	vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+	return 0;
+}
diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
index b3ea79189839..16e0bfea98b1 100644
--- a/arch/arm64/kvm/rme.c
+++ b/arch/arm64/kvm/rme.c
@@ -802,6 +802,17 @@ void kvm_destroy_realm(struct kvm *kvm)
 	kvm_free_stage2_pgd(&kvm->arch.mmu);
 }
 
+int kvm_rec_enter(struct kvm_vcpu *vcpu)
+{
+	struct rec *rec = &vcpu->arch.rec;
+
+	if (kvm_realm_state(vcpu->kvm) != REALM_STATE_ACTIVE)
+		return -EINVAL;
+
+	return rmi_rec_enter(virt_to_phys(rec->rec_page),
+			     virt_to_phys(rec->run));
+}
+
 static void free_rec_aux(struct page **aux_pages,
 			 unsigned int num_aux)
 {
-- 
2.34.1



* [RFC PATCH 15/28] KVM: arm64: Handle realm MMIO emulation
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (13 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 14/28] arm64: RME: Handle realm enter/exit Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-03-06 15:37     ` Zhi Wang
  2023-01-27 11:29   ` [RFC PATCH 16/28] arm64: RME: Allow populating initial contents Steven Price
                     ` (12 subsequent siblings)
  27 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

MMIO emulation for a realm cannot be done directly with the VM's
registers as they are protected from the host. However, the RMM
interface provides a structure member for the read/written value; we
can transfer this to the appropriate VCPU register entry and then rely
on the generic MMIO handling code in KVM.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kvm/mmio.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
index 3dd38a151d2a..c4879fa3a8d3 100644
--- a/arch/arm64/kvm/mmio.c
+++ b/arch/arm64/kvm/mmio.c
@@ -6,6 +6,7 @@
 
 #include <linux/kvm_host.h>
 #include <asm/kvm_emulate.h>
+#include <asm/rmi_smc.h>
 #include <trace/events/kvm.h>
 
 #include "trace.h"
@@ -109,6 +110,9 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
 			       &data);
 		data = vcpu_data_host_to_guest(vcpu, data, len);
 		vcpu_set_reg(vcpu, kvm_vcpu_dabt_get_rd(vcpu), data);
+
+		if (vcpu_is_rec(vcpu))
+			vcpu->arch.rec.run->entry.gprs[0] = data;
 	}
 
 	/*
@@ -179,6 +183,9 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
 	run->mmio.len		= len;
 	vcpu->mmio_needed	= 1;
 
+	if (vcpu_is_rec(vcpu))
+		vcpu->arch.rec.run->entry.flags |= RMI_EMULATED_MMIO;
+
 	if (!ret) {
 		/* We handled the access successfully in the kernel. */
 		if (!is_write)
-- 
2.34.1



* [RFC PATCH 16/28] arm64: RME: Allow populating initial contents
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (14 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 15/28] KVM: arm64: Handle realm MMIO emulation Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-03-06 17:34     ` Zhi Wang
  2023-01-27 11:29   ` [RFC PATCH 17/28] arm64: RME: Runtime faulting of memory Steven Price
                     ` (11 subsequent siblings)
  27 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

The VMM needs to populate the realm with some data before starting (e.g.
a kernel and initrd). This is measured by the RMM and used as part of
the attestation later on.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kvm/rme.c | 366 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 366 insertions(+)

diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
index 16e0bfea98b1..3405b43e1421 100644
--- a/arch/arm64/kvm/rme.c
+++ b/arch/arm64/kvm/rme.c
@@ -4,6 +4,7 @@
  */
 
 #include <linux/kvm_host.h>
+#include <linux/hugetlb.h>
 
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_mmu.h>
@@ -426,6 +427,359 @@ void kvm_realm_unmap_range(struct kvm *kvm, unsigned long ipa, u64 size)
 	}
 }
 
+static int realm_create_protected_data_page(struct realm *realm,
+					    unsigned long ipa,
+					    struct page *dst_page,
+					    struct page *tmp_page)
+{
+	phys_addr_t dst_phys, tmp_phys;
+	int ret;
+
+	copy_page(page_address(tmp_page), page_address(dst_page));
+
+	dst_phys = page_to_phys(dst_page);
+	tmp_phys = page_to_phys(tmp_page);
+
+	if (rmi_granule_delegate(dst_phys))
+		return -ENXIO;
+
+	ret = rmi_data_create(dst_phys, virt_to_phys(realm->rd), ipa, tmp_phys,
+			      RMI_MEASURE_CONTENT);
+
+	if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
+		/* Create missing RTTs and retry */
+		int level = RMI_RETURN_INDEX(ret);
+
+		ret = realm_create_rtt_levels(realm, ipa, level,
+					      RME_RTT_MAX_LEVEL, NULL);
+		if (ret)
+			goto err;
+
+		ret = rmi_data_create(dst_phys, virt_to_phys(realm->rd), ipa,
+				      tmp_phys, RMI_MEASURE_CONTENT);
+	}
+
+	if (ret)
+		goto err;
+
+	return 0;
+
+err:
+	if (WARN_ON(rmi_granule_undelegate(dst_phys))) {
+		/* Page can't be returned to NS world so is lost */
+		get_page(dst_page);
+	}
+	return -ENXIO;
+}
+
+static int fold_rtt(phys_addr_t rd, unsigned long addr, int level,
+		    struct realm *realm)
+{
+	struct rtt_entry rtt;
+	phys_addr_t rtt_addr;
+
+	if (rmi_rtt_read_entry(rd, addr, level, &rtt))
+		return -ENXIO;
+
+	if (rtt.state != RMI_TABLE)
+		return -EINVAL;
+
+	rtt_addr = rmi_rtt_get_phys(&rtt);
+	if (rmi_rtt_fold(rtt_addr, rd, addr, level + 1))
+		return -ENXIO;
+
+	free_delegated_page(realm, rtt_addr);
+
+	return 0;
+}
+
+int realm_map_protected(struct realm *realm,
+			unsigned long hva,
+			unsigned long base_ipa,
+			struct page *dst_page,
+			unsigned long map_size,
+			struct kvm_mmu_memory_cache *memcache)
+{
+	phys_addr_t dst_phys = page_to_phys(dst_page);
+	phys_addr_t rd = virt_to_phys(realm->rd);
+	unsigned long phys = dst_phys;
+	unsigned long ipa = base_ipa;
+	unsigned long size;
+	int map_level;
+	int ret = 0;
+
+	if (WARN_ON(!IS_ALIGNED(ipa, map_size)))
+		return -EINVAL;
+
+	switch (map_size) {
+	case PAGE_SIZE:
+		map_level = 3;
+		break;
+	case RME_L2_BLOCK_SIZE:
+		map_level = 2;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	if (map_level < RME_RTT_MAX_LEVEL) {
+		/*
+		 * A temporary RTT is needed during the map, precreate it,
+		 * however if there is an error (e.g. missing parent tables)
+		 * this will be handled below.
+		 */
+		realm_create_rtt_levels(realm, ipa, map_level,
+					RME_RTT_MAX_LEVEL, memcache);
+	}
+
+	for (size = 0; size < map_size; size += PAGE_SIZE) {
+		if (rmi_granule_delegate(phys)) {
+			struct rtt_entry rtt;
+
+			/*
+			 * It's possible we raced with another VCPU on the same
+			 * fault. If the entry exists and matches then exit
+			 * early and assume the other VCPU will handle the
+			 * mapping.
+			 */
+			if (rmi_rtt_read_entry(rd, ipa, RME_RTT_MAX_LEVEL, &rtt))
+				goto err;
+
+			// FIXME: For a block mapping this could race at level
+			// 2 or 3...
+			if (WARN_ON((rtt.walk_level != RME_RTT_MAX_LEVEL ||
+				     rtt.state != RMI_ASSIGNED ||
+				     rtt.desc != phys))) {
+				goto err;
+			}
+
+			return 0;
+		}
+
+		ret = rmi_data_create_unknown(phys, rd, ipa);
+
+		if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
+			/* Create missing RTTs and retry */
+			int level = RMI_RETURN_INDEX(ret);
+
+			ret = realm_create_rtt_levels(realm, ipa, level,
+						      RME_RTT_MAX_LEVEL,
+						      memcache);
+			WARN_ON(ret);
+			if (ret)
+				goto err_undelegate;
+
+			ret = rmi_data_create_unknown(phys, rd, ipa);
+		}
+		WARN_ON(ret);
+
+		if (ret)
+			goto err_undelegate;
+
+		phys += PAGE_SIZE;
+		ipa += PAGE_SIZE;
+	}
+
+	if (map_size == RME_L2_BLOCK_SIZE)
+		ret = fold_rtt(rd, base_ipa, map_level, realm);
+	if (WARN_ON(ret))
+		goto err;
+
+	return 0;
+
+err_undelegate:
+	if (WARN_ON(rmi_granule_undelegate(phys))) {
+		/* Page can't be returned to NS world so is lost */
+		get_page(phys_to_page(phys));
+	}
+err:
+	while (size > 0) {
+		phys -= PAGE_SIZE;
+		size -= PAGE_SIZE;
+		ipa -= PAGE_SIZE;
+
+		rmi_data_destroy(rd, ipa);
+
+		if (WARN_ON(rmi_granule_undelegate(phys))) {
+			/* Page can't be returned to NS world so is lost */
+			get_page(phys_to_page(phys));
+		}
+	}
+	return -ENXIO;
+}
+
+static int populate_par_region(struct kvm *kvm,
+			       phys_addr_t ipa_base,
+			       phys_addr_t ipa_end)
+{
+	struct realm *realm = &kvm->arch.realm;
+	struct kvm_memory_slot *memslot;
+	gfn_t base_gfn, end_gfn;
+	int idx;
+	phys_addr_t ipa;
+	int ret = 0;
+	struct page *tmp_page;
+	phys_addr_t rd = virt_to_phys(realm->rd);
+
+	base_gfn = gpa_to_gfn(ipa_base);
+	end_gfn = gpa_to_gfn(ipa_end);
+
+	idx = srcu_read_lock(&kvm->srcu);
+	memslot = gfn_to_memslot(kvm, base_gfn);
+	if (!memslot) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	/* We require the region to be contained within a single memslot */
+	if (memslot->base_gfn + memslot->npages < end_gfn) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	tmp_page = alloc_page(GFP_KERNEL);
+	if (!tmp_page) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	mmap_read_lock(current->mm);
+
+	ipa = ipa_base;
+
+	while (ipa < ipa_end) {
+		struct vm_area_struct *vma;
+		unsigned long map_size;
+		unsigned int vma_shift;
+		unsigned long offset;
+		unsigned long hva;
+		struct page *page;
+		kvm_pfn_t pfn;
+		int level;
+
+		hva = gfn_to_hva_memslot(memslot, gpa_to_gfn(ipa));
+		vma = vma_lookup(current->mm, hva);
+		if (!vma) {
+			ret = -EFAULT;
+			break;
+		}
+
+		if (is_vm_hugetlb_page(vma))
+			vma_shift = huge_page_shift(hstate_vma(vma));
+		else
+			vma_shift = PAGE_SHIFT;
+
+		map_size = 1 << vma_shift;
+
+		/*
+		 * FIXME: This causes over mapping, but there's no good
+		 * solution here with the ABI as it stands
+		 */
+		ipa = ALIGN_DOWN(ipa, map_size);
+
+		switch (map_size) {
+		case RME_L2_BLOCK_SIZE:
+			level = 2;
+			break;
+		case PAGE_SIZE:
+			level = 3;
+			break;
+		default:
+			WARN_ONCE(1, "Unsupported vma_shift %d", vma_shift);
+			ret = -EFAULT;
+			break;
+		}
+
+		pfn = gfn_to_pfn_memslot(memslot, gpa_to_gfn(ipa));
+
+		if (is_error_pfn(pfn)) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret = rmi_rtt_init_ripas(rd, ipa, level);
+		if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
+			ret = realm_create_rtt_levels(realm, ipa,
+						      RMI_RETURN_INDEX(ret),
+						      level, NULL);
+			if (ret)
+				break;
+			ret = rmi_rtt_init_ripas(rd, ipa, level);
+			if (ret) {
+				ret = -ENXIO;
+				break;
+			}
+		}
+
+		if (level < RME_RTT_MAX_LEVEL) {
+			/*
+			 * A temporary RTT is needed during the map, precreate
+			 * it, however if there is an error (e.g. missing
+			 * parent tables) this will be handled in the
+			 * realm_create_protected_data_page() call.
+			 */
+			realm_create_rtt_levels(realm, ipa, level,
+						RME_RTT_MAX_LEVEL, NULL);
+		}
+
+		page = pfn_to_page(pfn);
+
+		for (offset = 0; offset < map_size && !ret;
+		     offset += PAGE_SIZE, page++) {
+			phys_addr_t page_ipa = ipa + offset;
+
+			ret = realm_create_protected_data_page(realm, page_ipa,
+							       page, tmp_page);
+		}
+		if (ret)
+			goto err_release_pfn;
+
+		if (level == 2) {
+			ret = fold_rtt(rd, ipa, level, realm);
+			if (ret)
+				goto err_release_pfn;
+		}
+
+		ipa += map_size;
+		kvm_set_pfn_accessed(pfn);
+		kvm_set_pfn_dirty(pfn);
+		kvm_release_pfn_dirty(pfn);
+err_release_pfn:
+		if (ret) {
+			kvm_release_pfn_clean(pfn);
+			break;
+		}
+	}
+
+	mmap_read_unlock(current->mm);
+	__free_page(tmp_page);
+
+out:
+	srcu_read_unlock(&kvm->srcu, idx);
+	return ret;
+}
+
+static int kvm_populate_realm(struct kvm *kvm,
+			      struct kvm_cap_arm_rme_populate_realm_args *args)
+{
+	phys_addr_t ipa_base, ipa_end;
+
+	if (kvm_realm_state(kvm) != REALM_STATE_NEW)
+		return -EBUSY;
+
+	if (!IS_ALIGNED(args->populate_ipa_base, PAGE_SIZE) ||
+	    !IS_ALIGNED(args->populate_ipa_size, PAGE_SIZE))
+		return -EINVAL;
+
+	ipa_base = args->populate_ipa_base;
+	ipa_end = ipa_base + args->populate_ipa_size;
+
+	if (ipa_end < ipa_base)
+		return -EINVAL;
+
+	return populate_par_region(kvm, ipa_base, ipa_end);
+}
+
 static int set_ipa_state(struct kvm_vcpu *vcpu,
 			 unsigned long ipa,
 			 unsigned long end,
@@ -748,6 +1102,18 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
 		r = kvm_init_ipa_range_realm(kvm, &args);
 		break;
 	}
+	case KVM_CAP_ARM_RME_POPULATE_REALM: {
+		struct kvm_cap_arm_rme_populate_realm_args args;
+		void __user *argp = u64_to_user_ptr(cap->args[1]);
+
+		if (copy_from_user(&args, argp, sizeof(args))) {
+			r = -EFAULT;
+			break;
+		}
+
+		r = kvm_populate_realm(kvm, &args);
+		break;
+	}
 	default:
 		r = -EINVAL;
 		break;
-- 
2.34.1



* [RFC PATCH 17/28] arm64: RME: Runtime faulting of memory
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (15 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 16/28] arm64: RME: Allow populating initial contents Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-03-06 18:20     ` Zhi Wang
  2023-01-27 11:29   ` [RFC PATCH 18/28] KVM: arm64: Handle realm VCPU load Steven Price
                     ` (10 subsequent siblings)
  27 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

At runtime, if the realm guest accesses memory which hasn't yet been
mapped, KVM needs to either populate the region or fault the guest.

For memory in the lower (protected) region of IPA, a fresh page is
provided to the RMM, which zeroes its contents. For memory in the
upper (shared) region of IPA, the memory from the memslot is mapped
into the realm VM as non-secure.
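
As a rough, self-contained sketch of the IPA split described above (names
and the fixed 4k page shift are illustrative, not the kernel's): with
ia_bits of usable IPA space, the top bit selects the upper, shared half,
and clearing it recovers the gfn used to look up the memslot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative model only: the top IPA bit is "stolen" to
 * distinguish the shared (upper) half from the protected (lower)
 * half of the guest physical address space.
 */
static uint64_t gpa_stolen_mask(unsigned int ia_bits)
{
	return 1ULL << (ia_bits - 1);
}

static bool ipa_is_protected(uint64_t ipa, unsigned int ia_bits)
{
	/* The lower half of the IPA space is the protected region */
	return !(ipa & gpa_stolen_mask(ia_bits));
}

static uint64_t ipa_to_gfn(uint64_t ipa, unsigned int ia_bits)
{
	/* Mask off the stolen bit before converting to a frame number */
	return (ipa & ~gpa_stolen_mask(ia_bits)) >> 12;
}
```

With a 40-bit IPA space, for example, IPA bit 39 distinguishes shared from
protected accesses, and both halves index the same set of memslots.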

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_emulate.h | 10 +++++
 arch/arm64/include/asm/kvm_rme.h     | 12 ++++++
 arch/arm64/kvm/mmu.c                 | 64 +++++++++++++++++++++++++---
 arch/arm64/kvm/rme.c                 | 48 +++++++++++++++++++++
 4 files changed, 128 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 285e62914ca4..3a71b3d2e10a 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -502,6 +502,16 @@ static inline enum realm_state kvm_realm_state(struct kvm *kvm)
 	return READ_ONCE(kvm->arch.realm.state);
 }
 
+static inline gpa_t kvm_gpa_stolen_bits(struct kvm *kvm)
+{
+	if (kvm_is_realm(kvm)) {
+		struct realm *realm = &kvm->arch.realm;
+
+		return BIT(realm->ia_bits - 1);
+	}
+	return 0;
+}
+
 static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
 {
 	if (static_branch_unlikely(&kvm_rme_is_available))
diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
index 9d1583c44a99..303e4a5e5704 100644
--- a/arch/arm64/include/asm/kvm_rme.h
+++ b/arch/arm64/include/asm/kvm_rme.h
@@ -50,6 +50,18 @@ void kvm_destroy_rec(struct kvm_vcpu *vcpu);
 int kvm_rec_enter(struct kvm_vcpu *vcpu);
 int handle_rme_exit(struct kvm_vcpu *vcpu, int rec_run_status);
 
+void kvm_realm_unmap_range(struct kvm *kvm, unsigned long ipa, u64 size);
+int realm_map_protected(struct realm *realm,
+			unsigned long hva,
+			unsigned long base_ipa,
+			struct page *dst_page,
+			unsigned long map_size,
+			struct kvm_mmu_memory_cache *memcache);
+int realm_map_non_secure(struct realm *realm,
+			 unsigned long ipa,
+			 struct page *page,
+			 unsigned long map_size,
+			 struct kvm_mmu_memory_cache *memcache);
 int realm_set_ipa_state(struct kvm_vcpu *vcpu,
 			unsigned long addr, unsigned long end,
 			unsigned long ripas);
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index f29558c5dcbc..5417c273861b 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -235,8 +235,13 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64
 
 	lockdep_assert_held_write(&kvm->mmu_lock);
 	WARN_ON(size & ~PAGE_MASK);
-	WARN_ON(stage2_apply_range(kvm, start, end, kvm_pgtable_stage2_unmap,
-				   may_block));
+
+	if (kvm_is_realm(kvm))
+		kvm_realm_unmap_range(kvm, start, size);
+	else
+		WARN_ON(stage2_apply_range(kvm, start, end,
+					   kvm_pgtable_stage2_unmap,
+					   may_block));
 }
 
 static void unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size)
@@ -250,7 +255,11 @@ static void stage2_flush_memslot(struct kvm *kvm,
 	phys_addr_t addr = memslot->base_gfn << PAGE_SHIFT;
 	phys_addr_t end = addr + PAGE_SIZE * memslot->npages;
 
-	stage2_apply_range_resched(kvm, addr, end, kvm_pgtable_stage2_flush);
+	if (kvm_is_realm(kvm))
+		kvm_realm_unmap_range(kvm, addr, end - addr);
+	else
+		stage2_apply_range_resched(kvm, addr, end,
+					   kvm_pgtable_stage2_flush);
 }
 
 /**
@@ -818,6 +827,10 @@ void stage2_unmap_vm(struct kvm *kvm)
 	struct kvm_memory_slot *memslot;
 	int idx, bkt;
 
+	/* For realms this is handled by the RMM so nothing to do here */
+	if (kvm_is_realm(kvm))
+		return;
+
 	idx = srcu_read_lock(&kvm->srcu);
 	mmap_read_lock(current->mm);
 	write_lock(&kvm->mmu_lock);
@@ -840,6 +853,7 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
 	pgt = mmu->pgt;
 	if (kvm_is_realm(kvm) &&
 	    kvm_realm_state(kvm) != REALM_STATE_DYING) {
+		unmap_stage2_range(mmu, 0, (~0ULL) & PAGE_MASK);
 		write_unlock(&kvm->mmu_lock);
 		kvm_realm_destroy_rtts(&kvm->arch.realm, pgt->ia_bits,
 				       pgt->start_level);
@@ -1190,6 +1204,24 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_MTE_ALLOWED;
 }
 
+static int realm_map_ipa(struct kvm *kvm, phys_addr_t ipa, unsigned long hva,
+			 kvm_pfn_t pfn, unsigned long map_size,
+			 enum kvm_pgtable_prot prot,
+			 struct kvm_mmu_memory_cache *memcache)
+{
+	struct realm *realm = &kvm->arch.realm;
+	struct page *page = pfn_to_page(pfn);
+
+	if (WARN_ON(!(prot & KVM_PGTABLE_PROT_W)))
+		return -EFAULT;
+
+	if (!realm_is_addr_protected(realm, ipa))
+		return realm_map_non_secure(realm, ipa, page, map_size,
+					    memcache);
+
+	return realm_map_protected(realm, hva, ipa, page, map_size, memcache);
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
 			  unsigned long fault_status)
@@ -1210,9 +1242,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	unsigned long vma_pagesize, fault_granule;
 	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
 	struct kvm_pgtable *pgt;
+	gpa_t gpa_stolen_mask = kvm_gpa_stolen_bits(vcpu->kvm);
 
 	fault_granule = 1UL << ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level);
 	write_fault = kvm_is_write_fault(vcpu);
+
+	/* Realms cannot map read-only */
+	if (vcpu_is_rec(vcpu))
+		write_fault = true;
+
 	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
 	VM_BUG_ON(write_fault && exec_fault);
 
@@ -1272,7 +1310,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
 		fault_ipa &= ~(vma_pagesize - 1);
 
-	gfn = fault_ipa >> PAGE_SHIFT;
+	gfn = (fault_ipa & ~gpa_stolen_mask) >> PAGE_SHIFT;
 	mmap_read_unlock(current->mm);
 
 	/*
@@ -1345,7 +1383,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 * If we are not forced to use page mapping, check if we are
 	 * backed by a THP and thus use block mapping if possible.
 	 */
-	if (vma_pagesize == PAGE_SIZE && !(force_pte || device)) {
+	/* FIXME: We shouldn't need to disable this for realms */
+	if (vma_pagesize == PAGE_SIZE && !(force_pte || device || kvm_is_realm(kvm))) {
 		if (fault_status == FSC_PERM && fault_granule > PAGE_SIZE)
 			vma_pagesize = fault_granule;
 		else
@@ -1382,6 +1421,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 */
 	if (fault_status == FSC_PERM && vma_pagesize == fault_granule)
 		ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot);
+	else if (kvm_is_realm(kvm))
+		ret = realm_map_ipa(kvm, fault_ipa, hva, pfn, vma_pagesize,
+				    prot, memcache);
 	else
 		ret = kvm_pgtable_stage2_map(pgt, fault_ipa, vma_pagesize,
 					     __pfn_to_phys(pfn), prot,
@@ -1437,6 +1479,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 	struct kvm_memory_slot *memslot;
 	unsigned long hva;
 	bool is_iabt, write_fault, writable;
+	gpa_t gpa_stolen_mask = kvm_gpa_stolen_bits(vcpu->kvm);
 	gfn_t gfn;
 	int ret, idx;
 
@@ -1491,7 +1534,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 
 	idx = srcu_read_lock(&vcpu->kvm->srcu);
 
-	gfn = fault_ipa >> PAGE_SHIFT;
+	gfn = (fault_ipa & ~gpa_stolen_mask) >> PAGE_SHIFT;
 	memslot = gfn_to_memslot(vcpu->kvm, gfn);
 	hva = gfn_to_hva_memslot_prot(memslot, gfn, &writable);
 	write_fault = kvm_is_write_fault(vcpu);
@@ -1536,6 +1579,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 		 * of the page size.
 		 */
 		fault_ipa |= kvm_vcpu_get_hfar(vcpu) & ((1 << 12) - 1);
+		fault_ipa &= ~gpa_stolen_mask;
 		ret = io_mem_abort(vcpu, fault_ipa);
 		goto out_unlock;
 	}
@@ -1617,6 +1661,10 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 	if (!kvm->arch.mmu.pgt)
 		return false;
 
+	/* We don't support aging for Realms */
+	if (kvm_is_realm(kvm))
+		return true;
+
 	WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
 
 	kpte = kvm_pgtable_stage2_mkold(kvm->arch.mmu.pgt,
@@ -1630,6 +1678,10 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 	if (!kvm->arch.mmu.pgt)
 		return false;
 
+	/* We don't support aging for Realms */
+	if (kvm_is_realm(kvm))
+		return true;
+
 	return kvm_pgtable_stage2_is_young(kvm->arch.mmu.pgt,
 					   range->start << PAGE_SHIFT);
 }
diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
index 3405b43e1421..3d46191798e5 100644
--- a/arch/arm64/kvm/rme.c
+++ b/arch/arm64/kvm/rme.c
@@ -608,6 +608,54 @@ int realm_map_protected(struct realm *realm,
 	return -ENXIO;
 }
 
+int realm_map_non_secure(struct realm *realm,
+			 unsigned long ipa,
+			 struct page *page,
+			 unsigned long map_size,
+			 struct kvm_mmu_memory_cache *memcache)
+{
+	phys_addr_t rd = virt_to_phys(realm->rd);
+	int map_level;
+	int ret = 0;
+	unsigned long desc = page_to_phys(page) |
+			     PTE_S2_MEMATTR(MT_S2_FWB_NORMAL) |
+			     /* FIXME: Read+Write permissions for now */
+			     (3 << 6) |
+			     PTE_SHARED;
+
+	if (WARN_ON(!IS_ALIGNED(ipa, map_size)))
+		return -EINVAL;
+
+	switch (map_size) {
+	case PAGE_SIZE:
+		map_level = 3;
+		break;
+	case RME_L2_BLOCK_SIZE:
+		map_level = 2;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	ret = rmi_rtt_map_unprotected(rd, ipa, map_level, desc);
+
+	if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
+		/* Create missing RTTs and retry */
+		int level = RMI_RETURN_INDEX(ret);
+
+		ret = realm_create_rtt_levels(realm, ipa, level, map_level,
+					      memcache);
+		if (WARN_ON(ret))
+			return -ENXIO;
+
+		ret = rmi_rtt_map_unprotected(rd, ipa, map_level, desc);
+	}
+	if (WARN_ON(ret))
+		return -ENXIO;
+
+	return 0;
+}
+
 static int populate_par_region(struct kvm *kvm,
 			       phys_addr_t ipa_base,
 			       phys_addr_t ipa_end)
-- 
2.34.1



* [RFC PATCH 18/28] KVM: arm64: Handle realm VCPU load
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (16 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 17/28] arm64: RME: Runtime faulting of memory Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 19/28] KVM: arm64: Validate register access for a Realm VM Steven Price
                     ` (9 subsequent siblings)
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

When loading a realm VCPU, much of the work is handled by the RMM, so
only some of the actions are required. Rearrange kvm_arch_vcpu_load()
slightly so we can bail out early for a realm guest.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kvm/arm.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index fd9e28f48903..46c152a9a150 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -451,19 +451,25 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 
 	vcpu->cpu = cpu;
 
+	if (single_task_running())
+		vcpu_clear_wfx_traps(vcpu);
+	else
+		vcpu_set_wfx_traps(vcpu);
+
 	kvm_vgic_load(vcpu);
 	kvm_timer_vcpu_load(vcpu);
+
+	if (kvm_arm_is_pvtime_enabled(&vcpu->arch))
+		kvm_make_request(KVM_REQ_RECORD_STEAL, vcpu);
+
+	/* No additional state needs to be loaded on Realm VMs */
+	if (vcpu_is_rec(vcpu))
+		return;
+
 	if (has_vhe())
 		kvm_vcpu_load_sysregs_vhe(vcpu);
 	kvm_arch_vcpu_load_fp(vcpu);
 	kvm_vcpu_pmu_restore_guest(vcpu);
-	if (kvm_arm_is_pvtime_enabled(&vcpu->arch))
-		kvm_make_request(KVM_REQ_RECORD_STEAL, vcpu);
-
-	if (single_task_running())
-		vcpu_clear_wfx_traps(vcpu);
-	else
-		vcpu_set_wfx_traps(vcpu);
 
 	if (vcpu_has_ptrauth(vcpu))
 		vcpu_ptrauth_disable(vcpu);
-- 
2.34.1



* [RFC PATCH 19/28] KVM: arm64: Validate register access for a Realm VM
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (17 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 18/28] KVM: arm64: Handle realm VCPU load Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 20/28] KVM: arm64: Handle Realm PSCI requests Steven Price
                     ` (8 subsequent siblings)
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

The RMM only allows setting the lower GPRs (x0-x7) and PC for a realm
guest. Check this in kvm_arm_set_reg() so that the VMM receives a
suitable error return if other registers are accessed.
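
A toy, runnable model of the access rule (the enum offsets mimic a
user_pt_regs-style regs.regs[]/sp/pc ordering and are purely
illustrative; real KVM encodes registers in reg->id):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative model: for a realm VCPU, only core registers x0..x7
 * and PC may be set by the VMM; the RMM resets everything else to
 * architectural or otherwise defined values.
 */
enum { REG_X0 = 0, REG_X7 = 7, REG_SP = 31, REG_PC = 32 };

static bool realm_reg_is_settable(unsigned int off)
{
	/* x0..x7 occupy the lowest offsets; PC follows sp */
	return off <= REG_X7 || off == REG_PC;
}
```

A VMM using this rule would expect -EINVAL back from KVM_SET_ONE_REG for
any register outside this set on a realm VCPU.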

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kvm/guest.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 5626ddb540ce..93468bbfb50e 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -768,12 +768,38 @@ int kvm_arm_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	return kvm_arm_sys_reg_get_reg(vcpu, reg);
 }
 
+/*
+ * The RMI ABI only enables setting the lower GPRs (x0-x7) and PC.
+ * All other registers are reset to architectural or otherwise defined reset
+ * values by the RMM
+ */
+static bool validate_realm_set_reg(struct kvm_vcpu *vcpu,
+				   const struct kvm_one_reg *reg)
+{
+	u64 off = core_reg_offset_from_id(reg->id);
+
+	if ((reg->id & KVM_REG_ARM_COPROC_MASK) != KVM_REG_ARM_CORE)
+		return false;
+
+	switch (off) {
+	case KVM_REG_ARM_CORE_REG(regs.regs[0]) ...
+	     KVM_REG_ARM_CORE_REG(regs.regs[7]):
+	case KVM_REG_ARM_CORE_REG(regs.pc):
+		return true;
+	}
+
+	return false;
+}
+
 int kvm_arm_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 {
 	/* We currently use nothing arch-specific in upper 32 bits */
 	if ((reg->id & ~KVM_REG_SIZE_MASK) >> 32 != KVM_REG_ARM64 >> 32)
 		return -EINVAL;
 
+	if (kvm_is_realm(vcpu->kvm) && !validate_realm_set_reg(vcpu, reg))
+		return -EINVAL;
+
 	switch (reg->id & KVM_REG_ARM_COPROC_MASK) {
 	case KVM_REG_ARM_CORE:	return set_core_reg(vcpu, reg);
 	case KVM_REG_ARM_FW:
-- 
2.34.1



* [RFC PATCH 20/28] KVM: arm64: Handle Realm PSCI requests
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (18 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 19/28] KVM: arm64: Validate register access for a Realm VM Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 21/28] KVM: arm64: WARN on injected undef exceptions Steven Price
                     ` (7 subsequent siblings)
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

The RMM needs to be informed of the target REC when a PSCI call is made
with an MPIDR argument.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_rme.h |  1 +
 arch/arm64/kvm/psci.c            | 23 +++++++++++++++++++++++
 arch/arm64/kvm/rme.c             | 13 +++++++++++++
 3 files changed, 37 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
index 303e4a5e5704..2254e28c855e 100644
--- a/arch/arm64/include/asm/kvm_rme.h
+++ b/arch/arm64/include/asm/kvm_rme.h
@@ -65,6 +65,7 @@ int realm_map_non_secure(struct realm *realm,
 int realm_set_ipa_state(struct kvm_vcpu *vcpu,
 			unsigned long addr, unsigned long end,
 			unsigned long ripas);
+int realm_psci_complete(struct kvm_vcpu *calling, struct kvm_vcpu *target);
 
 #define RME_RTT_BLOCK_LEVEL	2
 #define RME_RTT_MAX_LEVEL	3
diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index 7fbc4c1b9df0..e2061cab9b26 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -76,6 +76,10 @@ static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 	 */
 	if (!vcpu)
 		return PSCI_RET_INVALID_PARAMS;
+
+	if (vcpu_is_rec(vcpu))
+		realm_psci_complete(source_vcpu, vcpu);
+
 	if (!kvm_arm_vcpu_stopped(vcpu)) {
 		if (kvm_psci_version(source_vcpu) != KVM_ARM_PSCI_0_1)
 			return PSCI_RET_ALREADY_ON;
@@ -135,6 +139,25 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
 	/* Ignore other bits of target affinity */
 	target_affinity &= target_affinity_mask;
 
+	if (vcpu_is_rec(vcpu)) {
+		struct kvm_vcpu *target_vcpu;
+
+		/* RMM supports only zero affinity level */
+		if (lowest_affinity_level != 0)
+			return PSCI_RET_INVALID_PARAMS;
+
+		target_vcpu = kvm_mpidr_to_vcpu(kvm, target_affinity);
+		if (!target_vcpu)
+			return PSCI_RET_INVALID_PARAMS;
+
+		/*
+		 * Provide the references of running and target RECs to the RMM
+		 * so that the RMM can complete the PSCI request.
+		 */
+		realm_psci_complete(vcpu, target_vcpu);
+		return PSCI_RET_SUCCESS;
+	}
+
 	/*
 	 * If one or more VCPU matching target affinity are running
 	 * then ON else OFF
diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
index 3d46191798e5..6ac50481a138 100644
--- a/arch/arm64/kvm/rme.c
+++ b/arch/arm64/kvm/rme.c
@@ -126,6 +126,19 @@ static void free_delegated_page(struct realm *realm, phys_addr_t phys)
 	free_page((unsigned long)phys_to_virt(phys));
 }
 
+int realm_psci_complete(struct kvm_vcpu *calling, struct kvm_vcpu *target)
+{
+	int ret;
+
+	ret = rmi_psci_complete(virt_to_phys(calling->arch.rec.rec_page),
+				virt_to_phys(target->arch.rec.rec_page));
+
+	if (ret)
+		return -EINVAL;
+
+	return 0;
+}
+
 static void realm_destroy_undelegate_range(struct realm *realm,
 					   unsigned long ipa,
 					   unsigned long addr,
-- 
2.34.1



* [RFC PATCH 21/28] KVM: arm64: WARN on injected undef exceptions
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (19 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 20/28] KVM: arm64: Handle Realm PSCI requests Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 22/28] arm64: Don't expose stolen time for realm guests Steven Price
                     ` (6 subsequent siblings)
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

The RMM doesn't allow injection of an undefined exception into a realm
guest. Add a WARN to catch if this ever happens.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kvm/inject_fault.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index f32f4a2a347f..29966a3e5a71 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -175,6 +175,8 @@ void kvm_inject_size_fault(struct kvm_vcpu *vcpu)
  */
 void kvm_inject_undefined(struct kvm_vcpu *vcpu)
 {
+	if (vcpu_is_rec(vcpu))
+		WARN(1, "Cannot inject undefined exception into REC. Continuing with unknown behaviour");
 	if (vcpu_el1_is_32bit(vcpu))
 		inject_undef32(vcpu);
 	else
-- 
2.34.1



* [RFC PATCH 22/28] arm64: Don't expose stolen time for realm guests
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (20 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 21/28] KVM: arm64: WARN on injected undef exceptions Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 23/28] KVM: arm64: Allow activating realms Steven Price
                     ` (5 subsequent siblings)
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

Stolen time doesn't make much sense for realm guests, and with the ABI
as it stands it's a footgun for the VMM, making fatal granule protection
faults easy to trigger.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kvm/arm.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 46c152a9a150..645df5968e1e 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -302,7 +302,10 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		r = system_supports_mte();
 		break;
 	case KVM_CAP_STEAL_TIME:
-		r = kvm_arm_pvtime_supported();
+		if (kvm && kvm_is_realm(kvm))
+			r = 0;
+		else
+			r = kvm_arm_pvtime_supported();
 		break;
 	case KVM_CAP_ARM_EL1_32BIT:
 		r = cpus_have_const_cap(ARM64_HAS_32BIT_EL1);
-- 
2.34.1



* [RFC PATCH 23/28] KVM: arm64: Allow activating realms
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (21 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 22/28] arm64: Don't expose stolen time for realm guests Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 24/28] arm64: rme: allow userspace to inject aborts Steven Price
                     ` (4 subsequent siblings)
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

Add the ioctl to activate a realm and set the static branch to enable
access to the realm functionality if the RMM is detected.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kvm/rme.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
index 6ac50481a138..543e8d10f532 100644
--- a/arch/arm64/kvm/rme.c
+++ b/arch/arm64/kvm/rme.c
@@ -1000,6 +1000,20 @@ static int kvm_init_ipa_range_realm(struct kvm *kvm,
 	return ret;
 }
 
+static int kvm_activate_realm(struct kvm *kvm)
+{
+	struct realm *realm = &kvm->arch.realm;
+
+	if (kvm_realm_state(kvm) != REALM_STATE_NEW)
+		return -EBUSY;
+
+	if (rmi_realm_activate(virt_to_phys(realm->rd)))
+		return -ENXIO;
+
+	WRITE_ONCE(realm->state, REALM_STATE_ACTIVE);
+	return 0;
+}
+
 /* Protects access to rme_vmid_bitmap */
 static DEFINE_SPINLOCK(rme_vmid_lock);
 static unsigned long *rme_vmid_bitmap;
@@ -1175,6 +1189,9 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
 		r = kvm_populate_realm(kvm, &args);
 		break;
 	}
+	case KVM_CAP_ARM_RME_ACTIVATE_REALM:
+		r = kvm_activate_realm(kvm);
+		break;
 	default:
 		r = -EINVAL;
 		break;
@@ -1415,7 +1432,7 @@ int kvm_init_rme(void)
 
 	WARN_ON(rmi_features(0, &rmm_feat_reg0));
 
-	/* Future patch will enable static branch kvm_rme_is_available */
+	static_branch_enable(&kvm_rme_is_available);
 
 	return 0;
 }
-- 
2.34.1



* [RFC PATCH 24/28] arm64: rme: allow userspace to inject aborts
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (22 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 23/28] KVM: arm64: Allow activating realms Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 25/28] arm64: rme: support RSI_HOST_CALL Steven Price
                     ` (3 subsequent siblings)
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

From: Joey Gouly <joey.gouly@arm.com>

Extend KVM_SET_VCPU_EVENTS to support realms, where KVM cannot directly
set the system registers; instead the RMM must perform the injection on
the next REC entry.
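
The injection rules the patch adds can be modelled as a small, runnable
sketch (a toy decision function, not the kernel's implementation; the
mmio_exit flag stands in for the RMI_EMULATED_MMIO entry flag):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Illustrative model: for a realm VCPU, a pending SError is always
 * rejected, and a pending external data abort can only become an
 * RMM-injected SEA if the previous exit was an emulatable MMIO
 * access of an Unprotected IPA.
 */
static int realm_set_events(bool serror_pending, bool ext_dabt_pending,
			    bool mmio_exit)
{
	if (serror_pending)
		return -EINVAL;
	if (ext_dabt_pending && !mmio_exit)
		return -EINVAL;
	return 0;
}
```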

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 Documentation/virt/kvm/api.rst |  2 ++
 arch/arm64/kvm/guest.c         | 24 ++++++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index f1a59d6fb7fc..18a8ddaf31d8 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1238,6 +1238,8 @@ User space may need to inject several types of events to the guest.
 Set the pending SError exception state for this VCPU. It is not possible to
 'cancel' an Serror that has been made pending.
 
+User space cannot inject SErrors into Realms.
+
 If the guest performed an access to I/O memory which could not be handled by
 userspace, for example because of missing instruction syndrome decode
 information or because there is no device mapped at the accessed IPA, then
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 93468bbfb50e..6e53e0ef2fba 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -851,6 +851,30 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
 	bool has_esr = events->exception.serror_has_esr;
 	bool ext_dabt_pending = events->exception.ext_dabt_pending;
 
+	if (vcpu_is_rec(vcpu)) {
+		/* Cannot inject SError into a Realm. */
+		if (serror_pending)
+			return -EINVAL;
+
+		/*
+		 * If a data abort is pending, set the flag and let the RMM
+		 * inject an SEA when the REC is scheduled to be run.
+		 */
+		if (ext_dabt_pending) {
+			/*
+			 * Can only inject SEA into a Realm if the previous exit
+			 * was due to a data abort of an Unprotected IPA.
+			 */
+			if (!(vcpu->arch.rec.run->entry.flags & RMI_EMULATED_MMIO))
+				return -EINVAL;
+
+			vcpu->arch.rec.run->entry.flags &= ~RMI_EMULATED_MMIO;
+			vcpu->arch.rec.run->entry.flags |= RMI_INJECT_SEA;
+		}
+
+		return 0;
+	}
+
 	if (serror_pending && has_esr) {
 		if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
 			return -EINVAL;
-- 
2.34.1



* [RFC PATCH 25/28] arm64: rme: support RSI_HOST_CALL
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (23 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 24/28] arm64: rme: allow userspace to inject aborts Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 26/28] arm64: rme: Allow checking SVE on VM instance Steven Price
                     ` (2 subsequent siblings)
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

From: Joey Gouly <joey.gouly@arm.com>

Forward RSI_HOST_CALL exits to KVM's HVC handler.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kvm/rme-exit.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/arch/arm64/kvm/rme-exit.c b/arch/arm64/kvm/rme-exit.c
index 15a4ff3517db..fcdc87e8f6bc 100644
--- a/arch/arm64/kvm/rme-exit.c
+++ b/arch/arm64/kvm/rme-exit.c
@@ -4,6 +4,7 @@
  */
 
 #include <linux/kvm_host.h>
+#include <kvm/arm_hypercalls.h>
 #include <kvm/arm_psci.h>
 
 #include <asm/rmi_smc.h>
@@ -98,6 +99,29 @@ static int rec_exit_ripas_change(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+static int rec_exit_host_call(struct kvm_vcpu *vcpu)
+{
+	int ret, i;
+	struct rec *rec = &vcpu->arch.rec;
+
+	vcpu->stat.hvc_exit_stat++;
+
+	for (i = 0; i < REC_RUN_GPRS; i++)
+		vcpu_set_reg(vcpu, i, rec->run->exit.gprs[i]);
+
+	ret = kvm_hvc_call_handler(vcpu);
+
+	if (ret < 0) {
+		vcpu_set_reg(vcpu, 0, ~0UL);
+		ret = 1;
+	}
+
+	for (i = 0; i < REC_RUN_GPRS; i++)
+		rec->run->entry.gprs[i] = vcpu_get_reg(vcpu, i);
+
+	return ret;
+}
+
 static void update_arch_timer_irq_lines(struct kvm_vcpu *vcpu)
 {
 	struct rec *rec = &vcpu->arch.rec;
@@ -159,6 +183,8 @@ int handle_rme_exit(struct kvm_vcpu *vcpu, int rec_run_ret)
 		return rec_exit_psci(vcpu);
 	case RMI_EXIT_RIPAS_CHANGE:
 		return rec_exit_ripas_change(vcpu);
+	case RMI_EXIT_HOST_CALL:
+		return rec_exit_host_call(vcpu);
 	}
 
 	kvm_pr_unimpl("Unsupported exit reason: %u\n",
-- 
2.34.1



* [RFC PATCH 26/28] arm64: rme: Allow checking SVE on VM instance
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (24 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 25/28] arm64: rme: support RSI_HOST_CALL Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 27/28] arm64: RME: Always use 4k pages for realms Steven Price
  2023-01-27 11:29   ` [RFC PATCH 28/28] HACK: Accept prototype RMI versions Steven Price
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

From: Suzuki K Poulose <suzuki.poulose@arm.com>

Since we support different types of VMs, check SVE support for the
given VM instance in order to accurately report its status.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/kvm_rme.h | 2 ++
 arch/arm64/kvm/arm.c             | 5 ++++-
 arch/arm64/kvm/rme.c             | 7 ++++++-
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
index 2254e28c855e..68e99e5107bc 100644
--- a/arch/arm64/include/asm/kvm_rme.h
+++ b/arch/arm64/include/asm/kvm_rme.h
@@ -40,6 +40,8 @@ struct rec {
 int kvm_init_rme(void);
 u32 kvm_realm_ipa_limit(void);
 
+bool kvm_rme_supports_sve(void);
+
 int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
 int kvm_init_realm_vm(struct kvm *kvm);
 void kvm_destroy_realm(struct kvm *kvm);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 645df5968e1e..1d0b8ac7314f 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -326,7 +326,10 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		r = get_kvm_ipa_limit();
 		break;
 	case KVM_CAP_ARM_SVE:
-		r = system_supports_sve();
+		if (kvm && kvm_is_realm(kvm))
+			r = kvm_rme_supports_sve();
+		else
+			r = system_supports_sve();
 		break;
 	case KVM_CAP_ARM_PTRAUTH_ADDRESS:
 	case KVM_CAP_ARM_PTRAUTH_GENERIC:
diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
index 543e8d10f532..6ae7871aa6ed 100644
--- a/arch/arm64/kvm/rme.c
+++ b/arch/arm64/kvm/rme.c
@@ -49,6 +49,11 @@ static bool rme_supports(unsigned long feature)
 	return !!u64_get_bits(rmm_feat_reg0, feature);
 }
 
+bool kvm_rme_supports_sve(void)
+{
+	return rme_supports(RMI_FEATURE_REGISTER_0_SVE_EN);
+}
+
 static int rmi_check_version(void)
 {
 	struct arm_smccc_res res;
@@ -1104,7 +1109,7 @@ static int config_realm_sve(struct realm *realm,
 	int max_sve_vq = u64_get_bits(rmm_feat_reg0,
 				      RMI_FEATURE_REGISTER_0_SVE_VL);
 
-	if (!rme_supports(RMI_FEATURE_REGISTER_0_SVE_EN))
+	if (!kvm_rme_supports_sve())
 		return -EINVAL;
 
 	if (cfg->sve_vq > max_sve_vq)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC PATCH 27/28] arm64: RME: Always use 4k pages for realms
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (25 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 26/28] arm64: rme: Allow checking SVE on VM instance Steven Price
@ 2023-01-27 11:29   ` Steven Price
  2023-01-27 11:29   ` [RFC PATCH 28/28] HACK: Accept prototype RMI versions Steven Price
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

Always split up huge pages to avoid the problems of managing them. There
are two issues currently:

1. The uABI for the VMM allows populating memory on 4k boundaries even
   if the underlying allocator (e.g. hugetlbfs) is using a larger page
   size. Using a memfd for private allocations will push this issue onto
   the VMM as it will need to respect the granularity of the allocator.

2. The guest is able to request arbitrary ranges to be remapped as
   shared. Again, with a memfd approach it will be up to the VMM to deal
   with the complexity and either overmap (keep the huge mapping and add
   an additional 'overlapping' shared mapping) or reject the request as
   invalid due to the use of a huge page allocator.

For now, just break everything down to 4k pages in the RMM-controlled
stage 2.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/kvm/mmu.c | 4 ++++
 arch/arm64/kvm/rme.c | 4 +++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 5417c273861b..b5fc8d8f7049 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1278,6 +1278,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	if (logging_active) {
 		force_pte = true;
 		vma_shift = PAGE_SHIFT;
+	} else if (kvm_is_realm(kvm)) {
+		// Force PTE level mappings for realms
+		force_pte = true;
+		vma_shift = PAGE_SHIFT;
 	} else {
 		vma_shift = get_vma_page_shift(vma, hva);
 	}
diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
index 6ae7871aa6ed..1eb76cbee267 100644
--- a/arch/arm64/kvm/rme.c
+++ b/arch/arm64/kvm/rme.c
@@ -730,7 +730,9 @@ static int populate_par_region(struct kvm *kvm,
 			break;
 		}
 
-		if (is_vm_hugetlb_page(vma))
+		// FIXME: To avoid the overmapping issue (see below comment)
+		// force the use of 4k pages
+		if (is_vm_hugetlb_page(vma) && 0)
 			vma_shift = huge_page_shift(hstate_vma(vma));
 		else
 			vma_shift = PAGE_SHIFT;
-- 
2.34.1



* [RFC PATCH 28/28] HACK: Accept prototype RMI versions
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
                     ` (26 preceding siblings ...)
  2023-01-27 11:29   ` [RFC PATCH 27/28] arm64: RME: Always use 4k pages for realms Steven Price
@ 2023-01-27 11:29   ` Steven Price
  27 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-01-27 11:29 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: Steven Price, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

The upstream RMM currently advertises the major version of an internal
prototype (v56.0) rather than the expected version from the RMM
architecture specification (v1.0).

Add a config option to enable support for the prototype RMI v56.0.

Signed-off-by: Steven Price <steven.price@arm.com>
---
 arch/arm64/include/asm/rmi_smc.h | 7 +++++++
 arch/arm64/kvm/Kconfig           | 8 ++++++++
 arch/arm64/kvm/rme.c             | 8 ++++++++
 3 files changed, 23 insertions(+)

diff --git a/arch/arm64/include/asm/rmi_smc.h b/arch/arm64/include/asm/rmi_smc.h
index 16ff65090f3a..d6bbd7d92b8f 100644
--- a/arch/arm64/include/asm/rmi_smc.h
+++ b/arch/arm64/include/asm/rmi_smc.h
@@ -6,6 +6,13 @@
 #ifndef __ASM_RME_SMC_H
 #define __ASM_RME_SMC_H
 
+#ifdef CONFIG_RME_USE_PROTOTYPE_HACKS
+
+// Allow the prototype RMI version
+#define PROTOTYPE_RMI_ABI_MAJOR_VERSION  56
+
+#endif /* CONFIG_RME_USE_PROTOTYPE_HACKS */
+
 #include <linux/arm-smccc.h>
 
 #define SMC_RxI_CALL(func)				\
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 05da3c8f7e88..13858a5047fd 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -58,6 +58,14 @@ config NVHE_EL2_DEBUG
 
 	  If unsure, say N.
 
+config RME_USE_PROTOTYPE_HACKS
+	bool "Allow RMM prototype version numbers"
+	default y
+	help
	  For compatibility with the current RMM code, allow version
	  numbers from a prototype implementation as well as the expected
+	  version number from the RMM specification.
+
 config PROTECTED_NVHE_STACKTRACE
 	bool "Protected KVM hypervisor stacktraces"
 	depends on NVHE_EL2_DEBUG
diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
index 1eb76cbee267..894060635226 100644
--- a/arch/arm64/kvm/rme.c
+++ b/arch/arm64/kvm/rme.c
@@ -67,6 +67,14 @@ static int rmi_check_version(void)
 	version_major = RMI_ABI_VERSION_GET_MAJOR(res.a0);
 	version_minor = RMI_ABI_VERSION_GET_MINOR(res.a0);
 
+#ifdef PROTOTYPE_RMI_ABI_MAJOR_VERSION
+	// Support the prototype
+	if (version_major == PROTOTYPE_RMI_ABI_MAJOR_VERSION) {
+		kvm_err("Using prototype RMM support (version %d.%d)\n",
+			version_major, version_minor);
+		return 0;
+	}
+#endif
 	if (version_major != RMI_ABI_MAJOR_VERSION) {
 		kvm_err("Unsupported RMI ABI (version %d.%d) we support %d\n",
 			version_major, version_minor,
-- 
2.34.1



* [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture
  2023-01-27 11:22 [RFC] Support for Arm CCA VMs on Linux Suzuki K Poulose
  2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
  2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
@ 2023-01-27 11:39 ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 01/31] arm64: Disable MTE when CFI flash is emulated Suzuki K Poulose
                     ` (31 more replies)
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                   ` (5 subsequent siblings)
  8 siblings, 32 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

This series is an initial version of the support for running VMs under the
Arm Confidential Compute Architecture. The purpose of the series is to gather
feedback on the proposed UABI changes for running Confidential VMs with KVM.
More information on the Arm CCA and instructions for how to get, build and run
the entire software stack is available here [0].

A new option, `--realm`, is added to the `run` command to mark the VM as a
confidential compute VM. This version doesn't use the Guest private memory [1]
support yet; instead it uses normal anonymous/hugetlbfs backed memory. Our aim
is to switch to guest private memory for the Realm.

The host, including the kernel and kvmtool, must not access any memory allocated
to the protected IPA of the Realm.

The series adds support for managing the lifecycle of the Realm, which includes:
   * Configuration
   * Creation of the Realm (RD)
   * Loading of the initial memory images
   * Creation of Realm Execution Contexts (RECs, aka VCPUs)
   * Activation of the Realm

The patches are split as follows:

Patches 1 and 2 are fixes to existing code.
Patch 3 adds a new option, --nocompat, to disable compat warnings.
Patches 4 - 6 are preparations for the Realm specific changes.

The remaining patches add Realm support; the --realm option is enabled in
patch 30.

Version 1.0 of the Realm Management Monitor (RMM) specification doesn't
support paging out the protected memory of a Realm. Thus, all of the memory
backing the RAM is locked by the VMM.

Since the IPA space of a Realm is split into Protected and Unprotected halves,
with one an alias of the other, the VMM doubles the IPA size for a Realm VM.

The KVM support for Arm CCA is advertised with a new cap KVM_CAP_ARM_RME.
A new "VM type" field is defined in the vm_type argument of the CREATE_VM
ioctl to indicate that a VM is a "Realm". Once the VM is created, the
lifecycle of the Realm is managed via KVM_ENABLE_CAP of KVM_CAP_ARM_RME.

Command line options are also added to configure the Realm parameters.
These include:
 - Hash algorithm for measurements
 - Realm personalisation value
 - SVE vector length (optional feature in the v1.0 RMM spec; not yet
   supported by the TF-RMM, coming soon)

PMU and self-hosted debug (the number of watchpoint/breakpoint registers)
are not supported yet in the KVM/RMM implementation. This will be added soon.

The UABI doesn't support discovering the "supported" configuration values. In
the real world, the Realm configuration affects the initial measurement of the
Realm, which may be verified by a remote entity. Thus, the VMM is not at
liberty to make configuration choices based on the "host" capabilities.
Instead, the VMM should launch a Realm with the user-requested parameters; if
these cannot be satisfied, there is no point in running the Realm. We are
happy to change this if there is interest.

Special actions are required to load the initial memory images (e.g., kernel,
firmware, DTB, initrd) into the Realm memory.

For VCPUs, we add a new feature, KVM_ARM_VCPU_REC, which will be used to
control the creation of the REC object (via KVM_ARM_VCPU_FINALIZE). This must
be done after the initial register state of the VCPU is set.
The RMM imposes an order in which the RECs are created, i.e., they must be
created in ascending order of MPIDR. For now, this is the responsibility of
the VMM.

Once the Realm images are loaded and the VCPUs created, the Realm is
activated before the first vCPU is run.

virtio for Realms enforces the VIRTIO_F_ACCESS_PLATFORM flag.

Support is also added for injecting an SEA into the VM for unhandled MMIO accesses.

A tree with the patches is also available here:

	https://gitlab.arm.com/linux-arm/kvmtool-cca cca/rfc-v1

Running the Realm
------------------

A realm VM can be launched using:

 $ lkvm run						\
	 --realm					\
	 --disable-sve					\
	 [ --measurement-algo="sha256","sha512" ]	\
	 [ --realm-pv="<realm-pv>" ]			\
	 <normal-VM options>

[0] https://lkml.kernel.org/r/20230127112248.136810-1-suzuki.poulose@arm.com
[1] https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng@linux.intel.com

To: kvmarm@lists.linux.dev
To: kvm@vger.kernel.org
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Andrew Jones <andrew.jones@linux.dev>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Joey Gouly <Joey.Gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Quentin Perret <qperret@google.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zenghui Yu <yuzenghui@huawei.com>
Cc: linux-coco@lists.linux.dev
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org

Alexandru Elisei (11):
  Add --nocompat option to disable compat warnings
  arm64: Add --realm command line option
  arm64: Lock realm RAM in memory
  arm64: Create Realm Descriptor
  arm: Add kernel size to VM context
  arm64: Populate initial realm contents
  arm64: Finalize realm VCPU after reset
  init: Add last_{init, exit} list macros
  arm64: Activate realm before the first VCPU is run
  arm64: Don't try to debug a realm
  arm64: Allow the user to create a realm

Christoffer Dall (4):
  arm64: Create a realm virtual machine
  arm64: Add --measurement-algo command line option for a realm
  arm64: Don't try to set PSTATE for VCPUs belonging to a realm
  arm64: Specify SMC as the PSCI conduits for realms

Joey Gouly (2):
  mmio: add arch hook for an unhandled MMIO access
  arm64: realm: inject an abort on an unhandled MMIO access

Suzuki K Poulose (14):
  arm64: Disable MTE when CFI flash is emulated
  script: update_headers: Ignore missing architectures
  hw: cfi flash: Handle errors in memory transitions
  arm64: Check pvtime support against the KVM instance
  arm64: Check SVE capability on the VM instance
  arm64: Add option to disable SVE
  linux: Update kernel headers for RME support
  arm64: Add configuration step for Realms
  arm64: Add support for Realm Personalisation Value
  arm64: Add support for specifying the SVE vector length for Realm
  arm64: realm: Double the IPA space
  virtio: Add a wrapper for get_host_features
  virtio: Add arch specific hook for virtio host flags
  arm64: realm: Enforce virtio F_ACCESS_PLATFORM flag

 Makefile                                  |   1 +
 arm/aarch32/include/asm/realm.h           |  13 ++
 arm/aarch32/kvm.c                         |   5 +
 arm/aarch64/include/asm/kvm.h             |  64 ++++++
 arm/aarch64/include/asm/realm.h           |  13 ++
 arm/aarch64/include/kvm/kvm-config-arch.h |  16 +-
 arm/aarch64/kvm-cpu.c                     |  41 +++-
 arm/aarch64/kvm.c                         |  95 ++++++++-
 arm/aarch64/pvtime.c                      |   4 +-
 arm/aarch64/realm.c                       | 229 ++++++++++++++++++++++
 arm/fdt.c                                 |  15 +-
 arm/include/arm-common/kvm-arch.h         |   4 +
 arm/include/arm-common/kvm-config-arch.h  |   5 +
 arm/kvm-cpu.c                             |  13 ++
 arm/kvm.c                                 |  75 ++++++-
 builtin-run.c                             |   5 +-
 guest_compat.c                            |   1 +
 hw/cfi_flash.c                            |   4 +
 include/kvm/kvm-config.h                  |   1 +
 include/kvm/kvm-cpu.h                     |   2 +
 include/kvm/kvm.h                         |   2 +
 include/kvm/util-init.h                   |   6 +-
 include/kvm/virtio.h                      |   2 +
 include/linux/kernel.h                    |   1 +
 include/linux/kvm.h                       |  22 ++-
 include/linux/virtio_blk.h                |  19 --
 include/linux/virtio_net.h                |  14 +-
 include/linux/virtio_ring.h               |  16 +-
 mips/kvm-cpu.c                            |   4 +
 mips/kvm.c                                |   5 +
 mmio.c                                    |   3 +
 powerpc/kvm-cpu.c                         |   4 +
 powerpc/kvm.c                             |   5 +
 riscv/kvm-cpu.c                           |   4 +
 riscv/kvm.c                               |   5 +
 util/update_headers.sh                    |   1 +
 virtio/core.c                             |   8 +
 virtio/mmio-legacy.c                      |   2 +-
 virtio/mmio-modern.c                      |   2 +-
 virtio/pci-legacy.c                       |   2 +-
 virtio/pci-modern.c                       |   2 +-
 x86/kvm-cpu.c                             |   4 +
 x86/kvm.c                                 |   5 +
 43 files changed, 667 insertions(+), 77 deletions(-)
 create mode 100644 arm/aarch32/include/asm/realm.h
 create mode 100644 arm/aarch64/include/asm/realm.h
 create mode 100644 arm/aarch64/realm.c

-- 
2.34.1



* [RFC kvmtool 01/31] arm64: Disable MTE when CFI flash is emulated
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 02/31] script: update_headers: Ignore missing architectures Suzuki K Poulose
                     ` (30 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

The CFI flash file image cannot be mapped into the guest's memory
when MTE is enabled. Thus, disable MTE if flash emulation is
requested.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/kvm.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arm/aarch64/kvm.c b/arm/aarch64/kvm.c
index 54200c9e..5a53badb 100644
--- a/arm/aarch64/kvm.c
+++ b/arm/aarch64/kvm.c
@@ -145,6 +145,12 @@ void kvm__arch_enable_mte(struct kvm *kvm)
 		return;
 	}
 
+	if (kvm->cfg.flash_filename) {
+		kvm->cfg.arch.mte_disabled = true;
+		pr_info("MTE is incompatible with CFI flash support, disabling");
+		return;
+	}
+
 	if (kvm->cfg.arch.mte_disabled) {
 		pr_debug("MTE disabled by user");
 		return;
-- 
2.34.1



* [RFC kvmtool 02/31] script: update_headers: Ignore missing architectures
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 01/31] arm64: Disable MTE when CFI flash is emulated Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 03/31] hw: cfi flash: Handle errors in memory transitions Suzuki K Poulose
                     ` (29 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

Ignore architectures that are missing from the kernel tree when
updating headers, so that the script can be used with older kernels.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 util/update_headers.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/util/update_headers.sh b/util/update_headers.sh
index 789e2a42..bdfb798c 100755
--- a/util/update_headers.sh
+++ b/util/update_headers.sh
@@ -48,6 +48,7 @@ copy_optional_arch () {
 
 for arch in arm64 mips powerpc riscv x86
 do
+	[ -d $LINUX_ROOT/arch/${arch} ] || continue
 	case "$arch" in
 		arm64)	KVMTOOL_PATH=arm/aarch64
 			copy_optional_arch asm/sve_context.h ;;
-- 
2.34.1



* [RFC kvmtool 03/31] hw: cfi flash: Handle errors in memory transitions
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 01/31] arm64: Disable MTE when CFI flash is emulated Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 02/31] script: update_headers: Ignore missing architectures Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 04/31] Add --nocompat option to disable compat warnings Suzuki K Poulose
                     ` (28 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

Handle failures in the map and unmap operations used for the CFI
flash memory transitions. e.g., with MTE enabled, CFI flash
emulation breaks silently at the map operation, and we later hit
unhandled aborts in the guest.

To avoid such issues, make sure we catch the error and handle it
right at the source.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 hw/cfi_flash.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/cfi_flash.c b/hw/cfi_flash.c
index 7faecdfb..bce546bc 100644
--- a/hw/cfi_flash.c
+++ b/hw/cfi_flash.c
@@ -455,6 +455,8 @@ static int map_flash_memory(struct kvm *kvm, struct cfi_flash_device *sfdev)
 				KVM_MEM_TYPE_RAM | KVM_MEM_TYPE_READONLY);
 	if (!ret)
 		sfdev->is_mapped = true;
+	else
+		die("CFI Flash: ERROR: Unable to map memory: %d\n", ret);
 
 	return ret;
 }
@@ -472,6 +474,8 @@ static int unmap_flash_memory(struct kvm *kvm, struct cfi_flash_device *sfdev)
 
 	if (!ret)
 		sfdev->is_mapped = false;
+	else
+		die("CFI Flash: Failed to unmap Flash %d", ret);
 
 	return ret;
 }
-- 
2.34.1



* [RFC kvmtool 04/31] Add --nocompat option to disable compat warnings
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (2 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 03/31] hw: cfi flash: Handle errors in memory transitions Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 12:19     ` Alexandru Elisei
  2023-01-27 11:39   ` [RFC kvmtool 05/31] arm64: Check pvtime support against the KVM instance Suzuki K Poulose
                     ` (27 subsequent siblings)
  31 siblings, 1 reply; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Alexandru Elisei <alexandru.elisei@arm.com>

Commit e66942073035 ("kvm tools: Guest kernel compatability") added the
functionality that enables devices to print a warning message if the device
hasn't been initialized by the time the VM is destroyed. The purpose of
these messages is to let the user know if the kernel hasn't been built with
the correct Kconfig options to take advantage of the said devices (all
using virtio).

Since then, kvmtool has evolved and now supports loading different payloads
(like firmware images), and having those warnings even when it is entirely
intentional for the payload not to touch the devices can be confusing for
the user and makes the output unnecessarily verbose in those cases.

Add the --nocompat option to disable the warnings; the warnings are still
enabled by default.

Reported-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 builtin-run.c            | 5 ++++-
 guest_compat.c           | 1 +
 include/kvm/kvm-config.h | 1 +
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/builtin-run.c b/builtin-run.c
index bb7e6e8d..f8edfb3f 100644
--- a/builtin-run.c
+++ b/builtin-run.c
@@ -183,6 +183,8 @@ static int mem_parser(const struct option *opt, const char *arg, int unset)
 	OPT_BOOLEAN('\0', "nodefaults", &(cfg)->nodefaults, "Disable"   \
 			" implicit configuration that cannot be"	\
 			" disabled otherwise"),				\
+	OPT_BOOLEAN('\0', "nocompat", &(cfg)->nocompat, "Disable"	\
+			" compat warnings"),				\
 	OPT_CALLBACK('\0', "9p", NULL, "dir_to_share,tag_name",		\
 		     "Enable virtio 9p to share files between host and"	\
 		     " guest", virtio_9p_rootdir_parser, kvm),		\
@@ -797,7 +799,8 @@ static int kvm_cmd_run_work(struct kvm *kvm)
 
 static void kvm_cmd_run_exit(struct kvm *kvm, int guest_ret)
 {
-	compat__print_all_messages();
+	if (!kvm->cfg.nocompat)
+		compat__print_all_messages();
 
 	init_list__exit(kvm);
 
diff --git a/guest_compat.c b/guest_compat.c
index fd4704b2..a413c12c 100644
--- a/guest_compat.c
+++ b/guest_compat.c
@@ -88,6 +88,7 @@ int compat__print_all_messages(void)
 
 		printf("\n  # KVM compatibility warning.\n\t%s\n\t%s\n",
 			msg->title, msg->desc);
+		printf("\tTo stop seeing this warning, use the --nocompat option.\n");
 
 		list_del(&msg->list);
 		compat__free(msg);
diff --git a/include/kvm/kvm-config.h b/include/kvm/kvm-config.h
index 368e6c7d..88df7cc2 100644
--- a/include/kvm/kvm-config.h
+++ b/include/kvm/kvm-config.h
@@ -30,6 +30,7 @@ struct kvm_config {
 	u64 vsock_cid;
 	bool virtio_rng;
 	bool nodefaults;
+	bool nocompat;
 	int active_console;
 	int debug_iodelay;
 	int nrcpus;
-- 
2.34.1



* [RFC kvmtool 05/31] arm64: Check pvtime support against the KVM instance
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (3 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 04/31] Add --nocompat option to disable compat warnings Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 06/31] arm64: Check SVE capability on the VM instance Suzuki K Poulose
                     ` (26 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

KVM_CAP_STEAL_TIME can be checked on a VM instance. To allow the
feature to be controlled depending on the VM type, check the cap
against the VM rather than globally.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/pvtime.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arm/aarch64/pvtime.c b/arm/aarch64/pvtime.c
index 2933ac7c..839aa8a7 100644
--- a/arm/aarch64/pvtime.c
+++ b/arm/aarch64/pvtime.c
@@ -58,8 +58,8 @@ int kvm_cpu__setup_pvtime(struct kvm_cpu *vcpu)
 	if (kvm_cfg->no_pvtime)
 		return 0;
 
-	has_stolen_time = kvm__supports_extension(vcpu->kvm,
-						  KVM_CAP_STEAL_TIME);
+	has_stolen_time = kvm__supports_vm_extension(vcpu->kvm,
+						     KVM_CAP_STEAL_TIME);
 	if (!has_stolen_time) {
 		kvm_cfg->no_pvtime = true;
 		return 0;
-- 
2.34.1



* [RFC kvmtool 06/31] arm64: Check SVE capability on the VM instance
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (4 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 05/31] arm64: Check pvtime support against the KVM instance Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 07/31] arm64: Add option to disable SVE Suzuki K Poulose
                     ` (25 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

Similar to pvtime, check the SVE capability on the VM instance to
account for the different VM types and the support they provide.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/kvm-cpu.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arm/aarch64/kvm-cpu.c b/arm/aarch64/kvm-cpu.c
index c8be10b3..da809806 100644
--- a/arm/aarch64/kvm-cpu.c
+++ b/arm/aarch64/kvm-cpu.c
@@ -150,13 +150,15 @@ void kvm_cpu__select_features(struct kvm *kvm, struct kvm_vcpu_init *init)
 	}
 
 	/* Enable SVE if available */
-	if (kvm__supports_extension(kvm, KVM_CAP_ARM_SVE))
+	if (kvm__supports_vm_extension(kvm, KVM_CAP_ARM_SVE))
 		init->features[0] |= 1UL << KVM_ARM_VCPU_SVE;
 }
 
 int kvm_cpu__configure_features(struct kvm_cpu *vcpu)
 {
-	if (kvm__supports_extension(vcpu->kvm, KVM_CAP_ARM_SVE)) {
+	struct kvm *kvm = vcpu->kvm;
+
+	if (kvm__supports_vm_extension(kvm, KVM_CAP_ARM_SVE)) {
 		int feature = KVM_ARM_VCPU_SVE;
 
 		if (ioctl(vcpu->vcpu_fd, KVM_ARM_VCPU_FINALIZE, &feature)) {
-- 
2.34.1



* [RFC kvmtool 07/31] arm64: Add option to disable SVE
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (5 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 06/31] arm64: Check SVE capability on the VM instance Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 08/31] linux: Update kernel headers for RME support Suzuki K Poulose
                     ` (24 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

kvmtool enables SVE whenever it is supported by KVM. However, the
features of a Realm VM may need to be controlled, as they get
measured during creation. Thus, provide an option to disable SVE,
while preserving the current behaviour of SVE being on by default.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/include/kvm/kvm-config-arch.h | 4 +++-
 arm/aarch64/kvm-cpu.c                     | 8 +++++---
 arm/include/arm-common/kvm-config-arch.h  | 1 +
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arm/aarch64/include/kvm/kvm-config-arch.h b/arm/aarch64/include/kvm/kvm-config-arch.h
index eae8080d..b055fef4 100644
--- a/arm/aarch64/include/kvm/kvm-config-arch.h
+++ b/arm/aarch64/include/kvm/kvm-config-arch.h
@@ -19,7 +19,9 @@ int vcpu_affinity_parser(const struct option *opt, const char *arg, int unset);
 			"Specify random seed for Kernel Address Space "	\
 			"Layout Randomization (KASLR)"),		\
 	OPT_BOOLEAN('\0', "no-pvtime", &(cfg)->no_pvtime, "Disable"	\
-			" stolen time"),
+			" stolen time"),				\
+	OPT_BOOLEAN('\0', "disable-sve", &(cfg)->disable_sve,		\
+			"Disable SVE"),
 #include "arm-common/kvm-config-arch.h"
 
 #endif /* KVM__KVM_CONFIG_ARCH_H */
diff --git a/arm/aarch64/kvm-cpu.c b/arm/aarch64/kvm-cpu.c
index da809806..e7649239 100644
--- a/arm/aarch64/kvm-cpu.c
+++ b/arm/aarch64/kvm-cpu.c
@@ -149,8 +149,9 @@ void kvm_cpu__select_features(struct kvm *kvm, struct kvm_vcpu_init *init)
 		init->features[0] |= 1UL << KVM_ARM_VCPU_PTRAUTH_GENERIC;
 	}
 
-	/* Enable SVE if available */
-	if (kvm__supports_vm_extension(kvm, KVM_CAP_ARM_SVE))
+	/* If SVE is not disabled explicitly, enable if available */
+	if (!kvm->cfg.arch.disable_sve &&
+	    kvm__supports_vm_extension(kvm, KVM_CAP_ARM_SVE))
 		init->features[0] |= 1UL << KVM_ARM_VCPU_SVE;
 }
 
@@ -158,7 +159,8 @@ int kvm_cpu__configure_features(struct kvm_cpu *vcpu)
 {
 	struct kvm *kvm = vcpu->kvm;
 
-	if (kvm__supports_vm_extension(kvm, KVM_CAP_ARM_SVE)) {
+	if (!kvm->cfg.arch.disable_sve &&
+	    kvm__supports_vm_extension(kvm, KVM_CAP_ARM_SVE)) {
 		int feature = KVM_ARM_VCPU_SVE;
 
 		if (ioctl(vcpu->vcpu_fd, KVM_ARM_VCPU_FINALIZE, &feature)) {
diff --git a/arm/include/arm-common/kvm-config-arch.h b/arm/include/arm-common/kvm-config-arch.h
index 9949bfe4..6599305b 100644
--- a/arm/include/arm-common/kvm-config-arch.h
+++ b/arm/include/arm-common/kvm-config-arch.h
@@ -15,6 +15,7 @@ struct kvm_config_arch {
 	enum irqchip_type irqchip;
 	u64		fw_addr;
 	bool no_pvtime;
+	bool		disable_sve;
 };
 
 int irqchip_parser(const struct option *opt, const char *arg, int unset);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 08/31] linux: Update kernel headers for RME support
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (6 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 07/31] arm64: Add option to disable SVE Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 09/31] arm64: Add --realm command line option Suzuki K Poulose
                     ` (23 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

Update the RME specific ABI bits from the kernel headers.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/include/asm/kvm.h | 64 +++++++++++++++++++++++++++++++++++
 include/linux/kvm.h           | 22 +++++++++---
 include/linux/virtio_blk.h    | 19 -----------
 include/linux/virtio_net.h    | 14 ++++----
 include/linux/virtio_ring.h   | 16 +++------
 5 files changed, 93 insertions(+), 42 deletions(-)

diff --git a/arm/aarch64/include/asm/kvm.h b/arm/aarch64/include/asm/kvm.h
index 316917b9..653a08fb 100644
--- a/arm/aarch64/include/asm/kvm.h
+++ b/arm/aarch64/include/asm/kvm.h
@@ -108,6 +108,7 @@ struct kvm_regs {
 #define KVM_ARM_VCPU_SVE		4 /* enable SVE for this CPU */
 #define KVM_ARM_VCPU_PTRAUTH_ADDRESS	5 /* VCPU uses address authentication */
 #define KVM_ARM_VCPU_PTRAUTH_GENERIC	6 /* VCPU uses generic authentication */
+#define KVM_ARM_VCPU_REC		7 /* VCPU REC state as part of Realm */
 
 struct kvm_vcpu_init {
 	__u32 target;
@@ -400,6 +401,69 @@ enum {
 #define   KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES	3
 #define   KVM_DEV_ARM_ITS_CTRL_RESET		4
 
+/* KVM_CAP_ARM_RME kvm_enable_cap->args[0] points to this */
+#define KVM_CAP_ARM_RME_CONFIG_REALM		0
+#define KVM_CAP_ARM_RME_CREATE_RD		1
+#define KVM_CAP_ARM_RME_INIT_IPA_REALM		2
+#define KVM_CAP_ARM_RME_POPULATE_REALM		3
+#define KVM_CAP_ARM_RME_ACTIVATE_REALM		4
+
+#define KVM_CAP_ARM_RME_MEASUREMENT_ALGO_ZERO	(0x01ULL << 7)
+#define KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256 0
+#define KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA512 1
+
+#define KVM_CAP_ARM_RME_RPV_SIZE 64
+
+/* List of configuration items accepted for KVM_CAP_ARM_RME_CONFIG_REALM */
+#define KVM_CAP_ARM_RME_CFG_RPV			0
+#define KVM_CAP_ARM_RME_CFG_HASH_ALGO		1
+#define KVM_CAP_ARM_RME_CFG_SVE			2
+#define KVM_CAP_ARM_RME_CFG_DBG			3
+#define KVM_CAP_ARM_RME_CFG_PMU			4
+
+struct kvm_cap_arm_rme_config_item {
+	__u32 cfg;
+	union {
+		/* cfg == KVM_CAP_ARM_RME_CFG_RPV */
+		struct {
+			__u8	rpv[KVM_CAP_ARM_RME_RPV_SIZE];
+		};
+
+		/* cfg == KVM_CAP_ARM_RME_CFG_HASH_ALGO */
+		struct {
+			__u32	hash_algo;
+		};
+
+		/* cfg == KVM_CAP_ARM_RME_CFG_SVE */
+		struct {
+			__u32	sve_vq;
+		};
+
+		/* cfg == KVM_CAP_ARM_RME_CFG_DBG */
+		struct {
+			__u32	num_brps;
+			__u32	num_wrps;
+		};
+
+		/* cfg == KVM_CAP_ARM_RME_CFG_PMU */
+		struct {
+			__u32	num_pmu_cntrs;
+		};
+		/* Fix the size of the union */
+		__u8	reserved[256];
+	};
+};
+
+struct kvm_cap_arm_rme_populate_realm_args {
+	__u64 populate_ipa_base;
+	__u64 populate_ipa_size;
+};
+
+struct kvm_cap_arm_rme_init_ipa_args {
+	__u64 init_ipa_base;
+	__u64 init_ipa_size;
+};
+
 /* Device Control API on vcpu fd */
 #define KVM_ARM_VCPU_PMU_V3_CTRL	0
 #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 0d5d4419..789c7f89 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -903,14 +903,25 @@ struct kvm_ppc_resize_hpt {
 #define KVM_S390_SIE_PAGE_OFFSET 1
 
 /*
- * On arm64, machine type can be used to request the physical
- * address size for the VM. Bits[7-0] are reserved for the guest
- * PA size shift (i.e, log2(PA_Size)). For backward compatibility,
- * value 0 implies the default IPA size, 40bits.
+ * On arm64, machine type can be used to request both the machine type and
+ * the physical address size for the VM.
+ *
+ * Bits[11-8] are reserved for the ARM specific machine type.
+ *
+ * Bits[7-0] are reserved for the guest PA size shift (i.e, log2(PA_Size)).
+ * For backward compatibility, value 0 implies the default IPA size, 40bits.
  */
+#define KVM_VM_TYPE_ARM_SHIFT		8
+#define KVM_VM_TYPE_ARM_MASK		(0xfULL << KVM_VM_TYPE_ARM_SHIFT)
+#define KVM_VM_TYPE_ARM(_type)		\
+	(((_type) << KVM_VM_TYPE_ARM_SHIFT) & KVM_VM_TYPE_ARM_MASK)
+#define KVM_VM_TYPE_ARM_NORMAL		KVM_VM_TYPE_ARM(0)
+#define KVM_VM_TYPE_ARM_REALM		KVM_VM_TYPE_ARM(1)
+
 #define KVM_VM_TYPE_ARM_IPA_SIZE_MASK	0xffULL
 #define KVM_VM_TYPE_ARM_IPA_SIZE(x)		\
 	((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
+
 /*
  * ioctls for /dev/kvm fds:
  */
@@ -1177,7 +1188,8 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_VM_DISABLE_NX_HUGE_PAGES 220
 #define KVM_CAP_S390_ZPCI_OP 221
 #define KVM_CAP_S390_CPU_TOPOLOGY 222
-#define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
+
+#define KVM_CAP_ARM_RME 300 // FIXME: Large number to prevent conflicts
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/include/linux/virtio_blk.h b/include/linux/virtio_blk.h
index 58e70b24..d888f013 100644
--- a/include/linux/virtio_blk.h
+++ b/include/linux/virtio_blk.h
@@ -40,7 +40,6 @@
 #define VIRTIO_BLK_F_MQ		12	/* support more than one vq */
 #define VIRTIO_BLK_F_DISCARD	13	/* DISCARD is supported */
 #define VIRTIO_BLK_F_WRITE_ZEROES	14	/* WRITE ZEROES is supported */
-#define VIRTIO_BLK_F_SECURE_ERASE	16 /* Secure Erase is supported */
 
 /* Legacy feature bits */
 #ifndef VIRTIO_BLK_NO_LEGACY
@@ -122,21 +121,6 @@ struct virtio_blk_config {
 	__u8 write_zeroes_may_unmap;
 
 	__u8 unused1[3];
-
-	/* the next 3 entries are guarded by VIRTIO_BLK_F_SECURE_ERASE */
-	/*
-	 * The maximum secure erase sectors (in 512-byte sectors) for
-	 * one segment.
-	 */
-	__virtio32 max_secure_erase_sectors;
-	/*
-	 * The maximum number of secure erase segments in a
-	 * secure erase command.
-	 */
-	__virtio32 max_secure_erase_seg;
-	/* Secure erase commands must be aligned to this number of sectors. */
-	__virtio32 secure_erase_sector_alignment;
-
 } __attribute__((packed));
 
 /*
@@ -171,9 +155,6 @@ struct virtio_blk_config {
 /* Write zeroes command */
 #define VIRTIO_BLK_T_WRITE_ZEROES	13
 
-/* Secure erase command */
-#define VIRTIO_BLK_T_SECURE_ERASE	14
-
 #ifndef VIRTIO_BLK_NO_LEGACY
 /* Barrier before this op. */
 #define VIRTIO_BLK_T_BARRIER	0x80000000
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 6cb842ea..29ced555 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -56,7 +56,7 @@
 #define VIRTIO_NET_F_MQ	22	/* Device supports Receive Flow
 					 * Steering */
 #define VIRTIO_NET_F_CTRL_MAC_ADDR 23	/* Set MAC address */
-#define VIRTIO_NET_F_NOTF_COAL	53	/* Device supports notifications coalescing */
+#define VIRTIO_NET_F_NOTF_COAL	53	/* Guest can handle notifications coalescing */
 #define VIRTIO_NET_F_HASH_REPORT  57	/* Supports hash report */
 #define VIRTIO_NET_F_RSS	  60	/* Supports RSS RX steering */
 #define VIRTIO_NET_F_RSC_EXT	  61	/* extended coalescing info */
@@ -364,24 +364,24 @@ struct virtio_net_hash_config {
  */
 #define VIRTIO_NET_CTRL_NOTF_COAL		6
 /*
- * Set the tx-usecs/tx-max-packets parameters.
+ * Set the tx-usecs/tx-max-packets patameters.
+ * tx-usecs - Maximum number of usecs to delay a TX notification.
+ * tx-max-packets - Maximum number of packets to send before a TX notification.
  */
 struct virtio_net_ctrl_coal_tx {
-	/* Maximum number of packets to send before a TX notification */
 	__le32 tx_max_packets;
-	/* Maximum number of usecs to delay a TX notification */
 	__le32 tx_usecs;
 };
 
 #define VIRTIO_NET_CTRL_NOTF_COAL_TX_SET		0
 
 /*
- * Set the rx-usecs/rx-max-packets parameters.
+ * Set the rx-usecs/rx-max-packets patameters.
+ * rx-usecs - Maximum number of usecs to delay a RX notification.
+ * rx-max-frames - Maximum number of packets to receive before a RX notification.
  */
 struct virtio_net_ctrl_coal_rx {
-	/* Maximum number of packets to receive before a RX notification */
 	__le32 rx_max_packets;
-	/* Maximum number of usecs to delay a RX notification */
 	__le32 rx_usecs;
 };
 
diff --git a/include/linux/virtio_ring.h b/include/linux/virtio_ring.h
index f8c20d3d..476d3e5c 100644
--- a/include/linux/virtio_ring.h
+++ b/include/linux/virtio_ring.h
@@ -93,21 +93,15 @@
 #define VRING_USED_ALIGN_SIZE 4
 #define VRING_DESC_ALIGN_SIZE 16
 
-/**
- * struct vring_desc - Virtio ring descriptors,
- * 16 bytes long. These can chain together via @next.
- *
- * @addr: buffer address (guest-physical)
- * @len: buffer length
- * @flags: descriptor flags
- * @next: index of the next descriptor in the chain,
- *        if the VRING_DESC_F_NEXT flag is set. We chain unused
- *        descriptors via this, too.
- */
+/* Virtio ring descriptors: 16 bytes.  These can chain together via "next". */
 struct vring_desc {
+	/* Address (guest-physical). */
 	__virtio64 addr;
+	/* Length. */
 	__virtio32 len;
+	/* The flags as indicated above. */
 	__virtio16 flags;
+	/* We chain unused descriptors via this, too */
 	__virtio16 next;
 };
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 09/31] arm64: Add --realm command line option
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (7 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 08/31] linux: Update kernel headers for RME support Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 10/31] arm64: Create a realm virtual machine Suzuki K Poulose
                     ` (22 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Alexandru Elisei <alexandru.elisei@arm.com>

Add the --realm command line option which causes kvmtool to exit with an
error if specified, but which will be enabled once realms are fully
supported by kvmtool.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/include/kvm/kvm-config-arch.h |  5 ++++-
 arm/aarch64/kvm.c                         | 20 ++++++++++++++++++--
 arm/include/arm-common/kvm-config-arch.h  |  1 +
 3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/arm/aarch64/include/kvm/kvm-config-arch.h b/arm/aarch64/include/kvm/kvm-config-arch.h
index b055fef4..d2df850a 100644
--- a/arm/aarch64/include/kvm/kvm-config-arch.h
+++ b/arm/aarch64/include/kvm/kvm-config-arch.h
@@ -21,7 +21,10 @@ int vcpu_affinity_parser(const struct option *opt, const char *arg, int unset);
 	OPT_BOOLEAN('\0', "no-pvtime", &(cfg)->no_pvtime, "Disable"	\
 			" stolen time"),				\
 	OPT_BOOLEAN('\0', "disable-sve", &(cfg)->disable_sve,		\
-			"Disable SVE"),
+			"Disable SVE"),					\
+	OPT_BOOLEAN('\0', "realm", &(cfg)->is_realm,			\
+			"Create VM running in a realm using Arm RME"),
+
 #include "arm-common/kvm-config-arch.h"
 
 #endif /* KVM__KVM_CONFIG_ARCH_H */
diff --git a/arm/aarch64/kvm.c b/arm/aarch64/kvm.c
index 5a53badb..25be2f2d 100644
--- a/arm/aarch64/kvm.c
+++ b/arm/aarch64/kvm.c
@@ -38,9 +38,8 @@ int vcpu_affinity_parser(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
-void kvm__arch_validate_cfg(struct kvm *kvm)
+static void validate_mem_cfg(struct kvm *kvm)
 {
-
 	if (kvm->cfg.ram_addr < ARM_MEMORY_AREA) {
 		die("RAM address is below the I/O region ending at %luGB",
 		    ARM_MEMORY_AREA >> 30);
@@ -52,6 +51,23 @@ void kvm__arch_validate_cfg(struct kvm *kvm)
 	}
 }
 
+static void validate_realm_cfg(struct kvm *kvm)
+{
+	if (!kvm->cfg.arch.is_realm)
+		return;
+
+	if (kvm->cfg.arch.aarch32_guest)
+		die("Realms supported only for 64bit guests");
+
+	die("Realms not supported");
+}
+
+void kvm__arch_validate_cfg(struct kvm *kvm)
+{
+	validate_mem_cfg(kvm);
+	validate_realm_cfg(kvm);
+}
+
 u64 kvm__arch_default_ram_address(void)
 {
 	return ARM_MEMORY_AREA;
diff --git a/arm/include/arm-common/kvm-config-arch.h b/arm/include/arm-common/kvm-config-arch.h
index 6599305b..5eb791da 100644
--- a/arm/include/arm-common/kvm-config-arch.h
+++ b/arm/include/arm-common/kvm-config-arch.h
@@ -11,6 +11,7 @@ struct kvm_config_arch {
 	bool		aarch32_guest;
 	bool		has_pmuv3;
 	bool		mte_disabled;
+	bool		is_realm;
 	u64		kaslr_seed;
 	enum irqchip_type irqchip;
 	u64		fw_addr;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 10/31] arm64: Create a realm virtual machine
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (8 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 09/31] arm64: Add --realm command line option Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 11/31] arm64: Lock realm RAM in memory Suzuki K Poulose
                     ` (21 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Christoffer Dall <christoffer.dall@arm.com>

Set the machine type to realm when creating a VM via the KVM_CREATE_VM
ioctl.

Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
[ Alex E: Reworked patch, split the command line option into a different
          patch ]
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/kvm.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arm/aarch64/kvm.c b/arm/aarch64/kvm.c
index 25be2f2d..5db4c572 100644
--- a/arm/aarch64/kvm.c
+++ b/arm/aarch64/kvm.c
@@ -131,12 +131,15 @@ int kvm__arch_get_ipa_limit(struct kvm *kvm)
 int kvm__get_vm_type(struct kvm *kvm)
 {
 	unsigned int ipa_bits, max_ipa_bits;
-	unsigned long max_ipa;
+	unsigned long max_ipa, vm_type;
 
-	/* If we're running on an old kernel, use 0 as the VM type */
+	vm_type = kvm->cfg.arch.is_realm ? \
+		  KVM_VM_TYPE_ARM_REALM : KVM_VM_TYPE_ARM_NORMAL;
+
+	/* If we're running on an old kernel, use 0 as the IPA bits */
 	max_ipa_bits = kvm__arch_get_ipa_limit(kvm);
 	if (!max_ipa_bits)
-		return 0;
+		return vm_type;
 
 	/* Otherwise, compute the minimal required IPA size */
 	max_ipa = kvm->cfg.ram_addr + kvm->cfg.ram_size - 1;
@@ -147,7 +150,8 @@ int kvm__get_vm_type(struct kvm *kvm)
 	if (ipa_bits > max_ipa_bits)
 		die("Memory too large for this system (needs %d bits, %d available)", ipa_bits, max_ipa_bits);
 
-	return KVM_VM_TYPE_ARM_IPA_SIZE(ipa_bits);
+	vm_type |= KVM_VM_TYPE_ARM_IPA_SIZE(ipa_bits);
+	return vm_type;
 }
 
 void kvm__arch_enable_mte(struct kvm *kvm)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 11/31] arm64: Lock realm RAM in memory
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (9 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 10/31] arm64: Create a realm virtual machine Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 12/31] arm64: Create Realm Descriptor Suzuki K Poulose
                     ` (20 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Alexandru Elisei <alexandru.elisei@arm.com>

The RMM doesn't yet support paging out protected memory pages. Thus the
VMM must pin the entire VM memory.

Use mlock2 to keep the realm pages pinned in memory once they are faulted
in. Use the MLOCK_ONFAULT flag to avoid pre-mapping the pages and
maintain some semblance of on-demand paging for a realm VM.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/kvm.c | 44 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/arm/kvm.c b/arm/kvm.c
index d51cc15d..0e40b753 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -7,6 +7,8 @@
 
 #include "arm-common/gic.h"
 
+#include <sys/resource.h>
+
 #include <linux/kernel.h>
 #include <linux/kvm.h>
 #include <linux/sizes.h>
@@ -24,6 +26,25 @@ bool kvm__arch_cpu_supports_vm(void)
 	return true;
 }
 
+static void try_increase_mlock_limit(struct kvm *kvm)
+{
+	u64 size = kvm->arch.ram_alloc_size;
+	struct rlimit mlock_limit, new_limit;
+
+	if (getrlimit(RLIMIT_MEMLOCK, &mlock_limit)) {
+		perror("getrlimit(RLIMIT_MEMLOCK)");
+		return;
+	}
+
+	if (mlock_limit.rlim_cur > size)
+		return;
+
+	new_limit.rlim_cur = size;
+	new_limit.rlim_max = max((rlim_t)size, mlock_limit.rlim_max);
+	/* Requires CAP_SYS_RESOURCE capability. */
+	setrlimit(RLIMIT_MEMLOCK, &new_limit);
+}
+
 void kvm__init_ram(struct kvm *kvm)
 {
 	u64 phys_start, phys_size;
@@ -49,8 +70,27 @@ void kvm__init_ram(struct kvm *kvm)
 	kvm->ram_start = (void *)ALIGN((unsigned long)kvm->arch.ram_alloc_start,
 					SZ_2M);
 
-	madvise(kvm->arch.ram_alloc_start, kvm->arch.ram_alloc_size,
-		MADV_MERGEABLE);
+	/*
+	 * Do not merge pages if this is a Realm.
+	 *  a) We cannot replace a page in realm stage2 without export/import
+	 *
+	 * Pin the realm memory until we have export/import, due to the same
+	 * reason as above.
+	 *
+	 * Use mlock2(,,MLOCK_ONFAULT) to allow faulting in pages and thus
+	 * allowing to lazily populate the PAR.
+	 */
+	if (kvm->cfg.arch.is_realm) {
+		int ret;
+
+		try_increase_mlock_limit(kvm);
+		ret = mlock2(kvm->arch.ram_alloc_start, kvm->arch.ram_alloc_size,
+			     MLOCK_ONFAULT);
+		if (ret)
+			die_perror("mlock2");
+	} else {
+		madvise(kvm->arch.ram_alloc_start, kvm->arch.ram_alloc_size, MADV_MERGEABLE);
+	}
 
 	madvise(kvm->arch.ram_alloc_start, kvm->arch.ram_alloc_size,
 		MADV_HUGEPAGE);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 12/31] arm64: Create Realm Descriptor
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (10 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 11/31] arm64: Lock realm RAM in memory Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 13/31] arm64: Add --measurement-algo command line option for a realm Suzuki K Poulose
                     ` (19 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Alexandru Elisei <alexandru.elisei@arm.com>

Create the Realm Descriptor using the measurement algorithm set
with --measurement-algo.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 Makefile                        |  1 +
 arm/aarch32/include/asm/realm.h | 10 ++++++++++
 arm/aarch64/include/asm/realm.h | 10 ++++++++++
 arm/aarch64/realm.c             | 14 ++++++++++++++
 arm/kvm.c                       |  3 +++
 5 files changed, 38 insertions(+)
 create mode 100644 arm/aarch32/include/asm/realm.h
 create mode 100644 arm/aarch64/include/asm/realm.h
 create mode 100644 arm/aarch64/realm.c

diff --git a/Makefile b/Makefile
index ed2414bd..88cdf6d2 100644
--- a/Makefile
+++ b/Makefile
@@ -192,6 +192,7 @@ ifeq ($(ARCH), arm64)
 	OBJS		+= arm/aarch64/kvm.o
 	OBJS		+= arm/aarch64/pvtime.o
 	OBJS		+= arm/aarch64/pmu.o
+	OBJS		+= arm/aarch64/realm.o
 	ARCH_INCLUDE	:= $(HDRS_ARM_COMMON)
 	ARCH_INCLUDE	+= -Iarm/aarch64/include
 
diff --git a/arm/aarch32/include/asm/realm.h b/arm/aarch32/include/asm/realm.h
new file mode 100644
index 00000000..5aca6cca
--- /dev/null
+++ b/arm/aarch32/include/asm/realm.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_REALM_H
+#define __ASM_REALM_H
+
+#include "kvm/kvm.h"
+
+static inline void kvm_arm_realm_create_realm_descriptor(struct kvm *kvm) {}
+
+#endif /* ! __ASM_REALM_H */
diff --git a/arm/aarch64/include/asm/realm.h b/arm/aarch64/include/asm/realm.h
new file mode 100644
index 00000000..e176f15f
--- /dev/null
+++ b/arm/aarch64/include/asm/realm.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_REALM_H
+#define __ASM_REALM_H
+
+#include "kvm/kvm.h"
+
+void kvm_arm_realm_create_realm_descriptor(struct kvm *kvm);
+
+#endif /* ! __ASM_REALM_H */
diff --git a/arm/aarch64/realm.c b/arm/aarch64/realm.c
new file mode 100644
index 00000000..3a4adb66
--- /dev/null
+++ b/arm/aarch64/realm.c
@@ -0,0 +1,14 @@
+#include "kvm/kvm.h"
+
+#include <asm/realm.h>
+
+void kvm_arm_realm_create_realm_descriptor(struct kvm *kvm)
+{
+	struct kvm_enable_cap rme_create_rd = {
+		.cap = KVM_CAP_ARM_RME,
+		.args[0] = KVM_CAP_ARM_RME_CREATE_RD,
+	};
+
+	if (ioctl(kvm->vm_fd, KVM_ENABLE_CAP, &rme_create_rd) < 0)
+		die_perror("KVM_CAP_RME(KVM_CAP_ARM_RME_CREATE_RD)");
+}
diff --git a/arm/kvm.c b/arm/kvm.c
index 0e40b753..2510a322 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -127,6 +127,9 @@ void kvm__arch_set_cmdline(char *cmdline, bool video)
 
 void kvm__arch_init(struct kvm *kvm)
 {
+	if (kvm->cfg.arch.is_realm)
+		kvm_arm_realm_create_realm_descriptor(kvm);
+
 	/* Create the virtual GIC. */
 	if (gic__create(kvm, kvm->cfg.arch.irqchip))
 		die("Failed to create virtual GIC");
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 13/31] arm64: Add --measurement-algo command line option for a realm
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (11 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 12/31] arm64: Create Realm Descriptor Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 14/31] arm64: Add configuration step for Realms Suzuki K Poulose
                     ` (18 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Christoffer Dall <christoffer.dall@arm.com>

Add a command line option to specify the algorithm that will be used
to create the cryptographic measurement of the realm. Valid options are
"sha256" and "sha512". The final measurement will be a hash using the
selected algorithm.

Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/include/kvm/kvm-config-arch.h |  5 ++++-
 arm/aarch64/kvm.c                         | 17 ++++++++++++++++-
 arm/include/arm-common/kvm-arch.h         |  1 +
 arm/include/arm-common/kvm-config-arch.h  |  1 +
 4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arm/aarch64/include/kvm/kvm-config-arch.h b/arm/aarch64/include/kvm/kvm-config-arch.h
index d2df850a..b93999b6 100644
--- a/arm/aarch64/include/kvm/kvm-config-arch.h
+++ b/arm/aarch64/include/kvm/kvm-config-arch.h
@@ -23,7 +23,10 @@ int vcpu_affinity_parser(const struct option *opt, const char *arg, int unset);
 	OPT_BOOLEAN('\0', "disable-sve", &(cfg)->disable_sve,		\
 			"Disable SVE"),					\
 	OPT_BOOLEAN('\0', "realm", &(cfg)->is_realm,			\
-			"Create VM running in a realm using Arm RME"),
+			"Create VM running in a realm using Arm RME"),	\
+	OPT_STRING('\0', "measurement-algo", &(cfg)->measurement_algo,	\
+			 "sha256, sha512",				\
+			 "Realm Measurement algorithm, default: sha256"),
 
 #include "arm-common/kvm-config-arch.h"
 
diff --git a/arm/aarch64/kvm.c b/arm/aarch64/kvm.c
index 5db4c572..a5a98b2e 100644
--- a/arm/aarch64/kvm.c
+++ b/arm/aarch64/kvm.c
@@ -53,12 +53,27 @@ static void validate_mem_cfg(struct kvm *kvm)
 
 static void validate_realm_cfg(struct kvm *kvm)
 {
-	if (!kvm->cfg.arch.is_realm)
+	if (!kvm->cfg.arch.is_realm) {
+		if (kvm->cfg.arch.measurement_algo)
+			die("--measurement-algo valid only with --realm");
 		return;
+	}
 
 	if (kvm->cfg.arch.aarch32_guest)
 		die("Realms supported only for 64bit guests");
 
+	if (kvm->cfg.arch.measurement_algo) {
+		if (strcmp(kvm->cfg.arch.measurement_algo, "sha256") == 0)
+			kvm->arch.measurement_algo = KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256;
+		else if (strcmp(kvm->cfg.arch.measurement_algo, "sha512") == 0)
+			kvm->arch.measurement_algo = KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA512;
+		else
+			die("unknown realm measurement algorithm");
+	} else {
+		pr_debug("Realm Hash algorithm: Using default SHA256\n");
+		kvm->arch.measurement_algo = KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256;
+	}
+
 	die("Realms not supported");
 }
 
diff --git a/arm/include/arm-common/kvm-arch.h b/arm/include/arm-common/kvm-arch.h
index b2ae373c..68224b1c 100644
--- a/arm/include/arm-common/kvm-arch.h
+++ b/arm/include/arm-common/kvm-arch.h
@@ -113,6 +113,7 @@ struct kvm_arch {
 	u64	dtb_guest_start;
 
 	cpu_set_t *vcpu_affinity_cpuset;
+	u64	measurement_algo;
 };
 
 #endif /* ARM_COMMON__KVM_ARCH_H */
diff --git a/arm/include/arm-common/kvm-config-arch.h b/arm/include/arm-common/kvm-config-arch.h
index 5eb791da..a2faa3af 100644
--- a/arm/include/arm-common/kvm-config-arch.h
+++ b/arm/include/arm-common/kvm-config-arch.h
@@ -6,6 +6,7 @@
 struct kvm_config_arch {
 	const char	*dump_dtb_filename;
 	const char	*vcpu_affinity;
+	const char	*measurement_algo;
 	unsigned int	force_cntfrq;
 	bool		virtio_trans_pci;
 	bool		aarch32_guest;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 14/31] arm64: Add configuration step for Realms
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (12 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 13/31] arm64: Add --measurement-algo command line option for a realm Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 15/31] arm64: Add support for Realm Personalisation Value Suzuki K Poulose
                     ` (17 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

A Realm must be configured before it is created. Add a
step to specify the parameters for the Realm.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/realm.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/arm/aarch64/realm.c b/arm/aarch64/realm.c
index 3a4adb66..31543e55 100644
--- a/arm/aarch64/realm.c
+++ b/arm/aarch64/realm.c
@@ -2,6 +2,29 @@
 
 #include <asm/realm.h>
 
+
+static void realm_configure_hash_algo(struct kvm *kvm)
+{
+	struct kvm_cap_arm_rme_config_item hash_algo_cfg = {
+		.cfg	= KVM_CAP_ARM_RME_CFG_HASH_ALGO,
+		.hash_algo = kvm->arch.measurement_algo,
+	};
+
+	struct kvm_enable_cap rme_config = {
+		.cap = KVM_CAP_ARM_RME,
+		.args[0] = KVM_CAP_ARM_RME_CONFIG_REALM,
+		.args[1] = (u64)&hash_algo_cfg,
+	};
+
+	if (ioctl(kvm->vm_fd, KVM_ENABLE_CAP, &rme_config) < 0)
+		die_perror("KVM_CAP_RME(KVM_CAP_ARM_RME_CONFIG_REALM) hash_algo");
+}
+
+static void realm_configure_parameters(struct kvm *kvm)
+{
+	realm_configure_hash_algo(kvm);
+}
+
 void kvm_arm_realm_create_realm_descriptor(struct kvm *kvm)
 {
 	struct kvm_enable_cap rme_create_rd = {
@@ -9,6 +32,7 @@ void kvm_arm_realm_create_realm_descriptor(struct kvm *kvm)
 		.args[0] = KVM_CAP_ARM_RME_CREATE_RD,
 	};
 
+	realm_configure_parameters(kvm);
 	if (ioctl(kvm->vm_fd, KVM_ENABLE_CAP, &rme_create_rd) < 0)
 		die_perror("KVM_CAP_RME(KVM_CAP_ARM_RME_CREATE_RD)");
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 15/31] arm64: Add support for Realm Personalisation Value
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (13 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 14/31] arm64: Add configuration step for Realms Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 16/31] arm64: Add support for specifying the SVE vector length for Realm Suzuki K Poulose
                     ` (16 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

Add an option to specify the Realm personalisation value.
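The personalisation value is copied into a fixed-size field and zero-padded, as the diff below does with `memset()`/`memcpy()`. A minimal sketch of that padding logic, assuming `RPV_SIZE` stands in for `KVM_CAP_ARM_RME_RPV_SIZE` (the helper name and the value 64 are illustrative, not part of the patch):

```c
#include <string.h>

#define RPV_SIZE 64	/* assumed stand-in for KVM_CAP_ARM_RME_RPV_SIZE */

/*
 * Hypothetical helper: copy a user-supplied string into the fixed-size
 * RPV field, zero-padding the remainder; reject oversized strings, as
 * the --realm-pv length check in validate_realm_cfg() does.
 */
static int rpv_from_string(unsigned char rpv[RPV_SIZE], const char *s)
{
	size_t len = strlen(s);

	if (len > RPV_SIZE)
		return -1;	/* too long: mirrors the die() path */
	memset(rpv, 0, RPV_SIZE);	/* unused tail bytes stay zero */
	memcpy(rpv, s, len);		/* note: trailing NUL not copied */
	return 0;
}
```

Zero-padding matters because the whole fixed-size field, padding included, feeds into the realm configuration.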

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/include/kvm/kvm-config-arch.h |  6 +++++-
 arm/aarch64/kvm.c                         |  7 +++++++
 arm/aarch64/realm.c                       | 23 +++++++++++++++++++++++
 arm/include/arm-common/kvm-config-arch.h  |  1 +
 4 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/arm/aarch64/include/kvm/kvm-config-arch.h b/arm/aarch64/include/kvm/kvm-config-arch.h
index b93999b6..f2e659ad 100644
--- a/arm/aarch64/include/kvm/kvm-config-arch.h
+++ b/arm/aarch64/include/kvm/kvm-config-arch.h
@@ -26,7 +26,11 @@ int vcpu_affinity_parser(const struct option *opt, const char *arg, int unset);
 			"Create VM running in a realm using Arm RME"),	\
 	OPT_STRING('\0', "measurement-algo", &(cfg)->measurement_algo,	\
 			 "sha256, sha512",				\
-			 "Realm Measurement algorithm, default: sha256"),
+			 "Realm Measurement algorithm, default: sha256"),\
+	OPT_STRING('\0', "realm-pv", &(cfg)->realm_pv,			\
+			"personalisation value",			\
+			"Personalisation Value (only) for Realm VMs"),
+
 
 #include "arm-common/kvm-config-arch.h"
 
diff --git a/arm/aarch64/kvm.c b/arm/aarch64/kvm.c
index a5a98b2e..4798e359 100644
--- a/arm/aarch64/kvm.c
+++ b/arm/aarch64/kvm.c
@@ -56,6 +56,8 @@ static void validate_realm_cfg(struct kvm *kvm)
 	if (!kvm->cfg.arch.is_realm) {
 		if (kvm->cfg.arch.measurement_algo)
 			die("--measurement-algo valid only with --realm");
+		if (kvm->cfg.arch.realm_pv)
+			die("--realm-pv valid only with --realm");
 		return;
 	}
 
@@ -74,6 +76,11 @@ static void validate_realm_cfg(struct kvm *kvm)
 		kvm->arch.measurement_algo = KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256;
 	}
 
+	if (kvm->cfg.arch.realm_pv) {
+		if (strlen(kvm->cfg.arch.realm_pv) > KVM_CAP_ARM_RME_RPV_SIZE)
+			die("Invalid size for Realm Personalization Value\n");
+	}
+
 	die("Realms not supported");
 }
 
diff --git a/arm/aarch64/realm.c b/arm/aarch64/realm.c
index 31543e55..2e0be982 100644
--- a/arm/aarch64/realm.c
+++ b/arm/aarch64/realm.c
@@ -20,9 +20,32 @@ static void realm_configure_hash_algo(struct kvm *kvm)
 		die_perror("KVM_CAP_RME(KVM_CAP_ARM_RME_CONFIG_REALM) hash_algo");
 }
 
+static void realm_configure_rpv(struct kvm *kvm)
+{
+	struct kvm_cap_arm_rme_config_item rpv_cfg  = {
+		.cfg	= KVM_CAP_ARM_RME_CFG_RPV,
+	};
+
+	struct kvm_enable_cap rme_config = {
+		.cap = KVM_CAP_ARM_RME,
+		.args[0] = KVM_CAP_ARM_RME_CONFIG_REALM,
+		.args[1] = (u64)&rpv_cfg,
+	};
+
+	if (!kvm->cfg.arch.realm_pv)
+		return;
+
+	memset(&rpv_cfg.rpv, 0, sizeof(rpv_cfg.rpv));
+	memcpy(&rpv_cfg.rpv, kvm->cfg.arch.realm_pv, strlen(kvm->cfg.arch.realm_pv));
+
+	if (ioctl(kvm->vm_fd, KVM_ENABLE_CAP, &rme_config) < 0)
+		die_perror("KVM_CAP_RME(KVM_CAP_ARM_RME_CONFIG_REALM) RPV");
+}
+
 static void realm_configure_parameters(struct kvm *kvm)
 {
 	realm_configure_hash_algo(kvm);
+	realm_configure_rpv(kvm);
 }
 
 void kvm_arm_realm_create_realm_descriptor(struct kvm *kvm)
diff --git a/arm/include/arm-common/kvm-config-arch.h b/arm/include/arm-common/kvm-config-arch.h
index a2faa3af..80a3b18e 100644
--- a/arm/include/arm-common/kvm-config-arch.h
+++ b/arm/include/arm-common/kvm-config-arch.h
@@ -7,6 +7,7 @@ struct kvm_config_arch {
 	const char	*dump_dtb_filename;
 	const char	*vcpu_affinity;
 	const char	*measurement_algo;
+	const char	*realm_pv;
 	unsigned int	force_cntfrq;
 	bool		virtio_trans_pci;
 	bool		aarch32_guest;
-- 
2.34.1



* [RFC kvmtool 16/31] arm64: Add support for specifying the SVE vector length for Realm
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (14 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 15/31] arm64: Add support for Realm Personalisation Value Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 17/31] arm: Add kernel size to VM context Suzuki K Poulose
                     ` (15 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

Add option to specify SVE vector length for realms.
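The diff below converts the user-supplied vector length (in bits) into the `sve_vq` encoding expected by KVM, via `(sve_vl / 128) - 1`. A small sketch of that conversion (the helper name is illustrative; the 128-bit granule matches `SVE_VL_ALIGN` in the patch):

```c
/*
 * Hypothetical helper mirroring the patch's conversion: KVM encodes the
 * SVE vector length as "vector quanta", VQ = (VL / 128) - 1, where VL
 * is in bits and must be a non-zero multiple of 128.
 */
static int sve_vl_to_vq(unsigned int vl_bits)
{
	if (vl_bits == 0 || (vl_bits % 128) != 0)
		return -1;	/* unaligned VL: rejected, as in validate_realm_cfg() */
	return (int)(vl_bits / 128) - 1;
}
```

So the architectural minimum VL of 128 bits maps to VQ 0, and the maximum of 2048 bits maps to VQ 15.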

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/include/kvm/kvm-config-arch.h |  6 ++++--
 arm/aarch64/kvm.c                         | 23 +++++++++++++++++++++++
 arm/aarch64/realm.c                       | 21 +++++++++++++++++++++
 arm/include/arm-common/kvm-arch.h         |  1 +
 arm/include/arm-common/kvm-config-arch.h  |  1 +
 5 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/arm/aarch64/include/kvm/kvm-config-arch.h b/arm/aarch64/include/kvm/kvm-config-arch.h
index f2e659ad..0f42c2c2 100644
--- a/arm/aarch64/include/kvm/kvm-config-arch.h
+++ b/arm/aarch64/include/kvm/kvm-config-arch.h
@@ -29,8 +29,10 @@ int vcpu_affinity_parser(const struct option *opt, const char *arg, int unset);
 			 "Realm Measurement algorithm, default: sha256"),\
 	OPT_STRING('\0', "realm-pv", &(cfg)->realm_pv,			\
 			"personalisation value",			\
-			"Personalisation Value (only) for Realm VMs"),
-
+			"Personalisation Value (only) for Realm VMs"),	\
+	OPT_U64('\0', "sve-vl", &(cfg)->sve_vl,				\
+			"SVE Vector Length for the VM "			\
+			"(only supported for Realms)"),
 
 #include "arm-common/kvm-config-arch.h"
 
diff --git a/arm/aarch64/kvm.c b/arm/aarch64/kvm.c
index 4798e359..fca1410b 100644
--- a/arm/aarch64/kvm.c
+++ b/arm/aarch64/kvm.c
@@ -51,13 +51,19 @@ static void validate_mem_cfg(struct kvm *kvm)
 	}
 }
 
+#define SVE_VL_ALIGN	128
+
 static void validate_realm_cfg(struct kvm *kvm)
 {
+	u32 sve_vl;
+
 	if (!kvm->cfg.arch.is_realm) {
 		if (kvm->cfg.arch.measurement_algo)
 			die("--measurement-algo valid only with --realm");
 		if (kvm->cfg.arch.realm_pv)
 			die("--realm-pv valid only with --realm");
+		if (kvm->cfg.arch.sve_vl)
+			die("--sve-vl valid only with --realm");
 		return;
 	}
 
@@ -76,6 +82,23 @@ static void validate_realm_cfg(struct kvm *kvm)
 		kvm->arch.measurement_algo = KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256;
 	}
 
+	sve_vl = kvm->cfg.arch.sve_vl;
+	if (sve_vl) {
+		if (kvm->cfg.arch.disable_sve)
+			die("SVE VL requested when SVE is disabled");
+		if (!IS_ALIGNED(sve_vl, SVE_VL_ALIGN))
+			die("SVE VL is not aligned to %dbit\n", SVE_VL_ALIGN);
+		kvm->arch.sve_vq = (sve_vl / SVE_VL_ALIGN) - 1;
+	} else {
+		/*
+		 * Disable SVE for Realms, if a VL is not requested.
+		 * The SVE VL will be measured as part of the parameter
+		 * and we do not want to add an unknown entity to the
+		 * measurement.
+		 */
+		kvm->cfg.arch.disable_sve = true;
+	}
+
 	if (kvm->cfg.arch.realm_pv) {
 		if (strlen(kvm->cfg.arch.realm_pv) > KVM_CAP_ARM_RME_RPV_SIZE)
 			die("Invalid size for Realm Personalization Value\n");
diff --git a/arm/aarch64/realm.c b/arm/aarch64/realm.c
index 2e0be982..fc7f8d6a 100644
--- a/arm/aarch64/realm.c
+++ b/arm/aarch64/realm.c
@@ -42,10 +42,31 @@ static void realm_configure_rpv(struct kvm *kvm)
 		die_perror("KVM_CAP_RME(KVM_CAP_ARM_RME_CONFIG_REALM) RPV");
 }
 
+static void realm_configure_sve(struct kvm *kvm)
+{
+	struct kvm_cap_arm_rme_config_item sve_cfg = {
+		.cfg	= KVM_CAP_ARM_RME_CFG_SVE,
+		.sve_vq = kvm->arch.sve_vq,
+	};
+
+	struct kvm_enable_cap rme_config = {
+		.cap = KVM_CAP_ARM_RME,
+		.args[0] = KVM_CAP_ARM_RME_CONFIG_REALM,
+		.args[1] = (u64)&sve_cfg,
+	};
+
+	if (kvm->cfg.arch.disable_sve)
+		return;
+
+	if (ioctl(kvm->vm_fd, KVM_ENABLE_CAP, &rme_config) < 0)
+		die_perror("KVM_CAP_RME(KVM_CAP_ARM_RME_CONFIG_REALM) SVE");
+}
+
 static void realm_configure_parameters(struct kvm *kvm)
 {
 	realm_configure_hash_algo(kvm);
 	realm_configure_rpv(kvm);
+	realm_configure_sve(kvm);
 }
 
 void kvm_arm_realm_create_realm_descriptor(struct kvm *kvm)
diff --git a/arm/include/arm-common/kvm-arch.h b/arm/include/arm-common/kvm-arch.h
index 68224b1c..41b31f11 100644
--- a/arm/include/arm-common/kvm-arch.h
+++ b/arm/include/arm-common/kvm-arch.h
@@ -114,6 +114,7 @@ struct kvm_arch {
 
 	cpu_set_t *vcpu_affinity_cpuset;
 	u64	measurement_algo;
+	u64	sve_vq;
 };
 
 #endif /* ARM_COMMON__KVM_ARCH_H */
diff --git a/arm/include/arm-common/kvm-config-arch.h b/arm/include/arm-common/kvm-config-arch.h
index 80a3b18e..d923fd9e 100644
--- a/arm/include/arm-common/kvm-config-arch.h
+++ b/arm/include/arm-common/kvm-config-arch.h
@@ -19,6 +19,7 @@ struct kvm_config_arch {
 	u64		fw_addr;
 	bool no_pvtime;
 	bool		disable_sve;
+	u64		sve_vl;
 };
 
 int irqchip_parser(const struct option *opt, const char *arg, int unset);
-- 
2.34.1



* [RFC kvmtool 17/31] arm: Add kernel size to VM context
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (15 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 16/31] arm64: Add support for specifying the SVE vector length for Realm Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 18/31] arm64: Populate initial realm contents Suzuki K Poulose
                     ` (14 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Alexandru Elisei <alexandru.elisei@arm.com>

Add the kernel image size to the VM context, as we are going to use it
later. This matches what we already do with the initrd.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
[Fix kernel size printed in debug messages]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/include/arm-common/kvm-arch.h | 1 +
 arm/kvm.c                         | 8 +++++---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/arm/include/arm-common/kvm-arch.h b/arm/include/arm-common/kvm-arch.h
index 41b31f11..b5a4b851 100644
--- a/arm/include/arm-common/kvm-arch.h
+++ b/arm/include/arm-common/kvm-arch.h
@@ -108,6 +108,7 @@ struct kvm_arch {
 	 */
 	u64	memory_guest_start;
 	u64	kern_guest_start;
+	u64	kern_size;
 	u64	initrd_guest_start;
 	u64	initrd_size;
 	u64	dtb_guest_start;
diff --git a/arm/kvm.c b/arm/kvm.c
index 2510a322..acb627b2 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -153,7 +153,6 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
 	limit = kvm->ram_start + min(kvm->ram_size, (u64)SZ_256M) - 1;
 
 	pos = kvm->ram_start + kvm__arch_get_kern_offset(kvm, fd_kernel);
-	kvm->arch.kern_guest_start = host_to_guest_flat(kvm, pos);
 	file_size = read_file(fd_kernel, pos, limit - pos);
 	if (file_size < 0) {
 		if (errno == ENOMEM)
@@ -161,9 +160,12 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
 
 		die_perror("kernel read");
 	}
+
+	kvm->arch.kern_guest_start = host_to_guest_flat(kvm, pos);
+	kvm->arch.kern_size = file_size;
 	kernel_end = pos + file_size;
-	pr_debug("Loaded kernel to 0x%llx (%zd bytes)",
-		 kvm->arch.kern_guest_start, file_size);
+	pr_debug("Loaded kernel to 0x%llx (%llu bytes)",
+		 kvm->arch.kern_guest_start, kvm->arch.kern_size);
 
 	/*
 	 * Now load backwards from the end of memory so the kernel
-- 
2.34.1



* [RFC kvmtool 18/31] arm64: Populate initial realm contents
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (16 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 17/31] arm: Add kernel size to VM context Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-03-02 14:03     ` Piotr Sawicki
  2023-01-27 11:39   ` [RFC kvmtool 19/31] arm64: Don't try to set PSTATE for VCPUs belonging to a realm Suzuki K Poulose
                     ` (13 subsequent siblings)
  31 siblings, 1 reply; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Alexandru Elisei <alexandru.elisei@arm.com>

Populate the realm memory with the initial contents, which include
the device tree blob, the kernel image and the initrd (if specified),
or the firmware image.

Populating an image in the realm involves two steps:
 a) Mark the IPA area as RAM - INIT_IPA_REALM
 b) Load the contents into the IPA - POPULATE_REALM

Wherever we know the actual size of an image in memory, we make
sure the whole "memory area" is initialised as RAM. For example, the
Linux kernel image size is taken from the header, which includes the
BSS etc., while the "file size" of the Image on disk is much smaller.
So we mark a region of Image.header.image_size bytes, starting at the
kernel load address, as RAM (step a), and load the Image file contents
into memory (step b). At the moment we only detect the arm64 Linux
Image header format.
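The header detection and size extraction described above can be sketched as follows, assuming the standard arm64 Linux Image header layout (the struct layout matches the kernel's documented boot header; the helper names are illustrative):

```c
#include <stdint.h>
#include <string.h>

/* Layout of the arm64 Linux Image header (Documentation/arm64/booting.rst). */
struct arm64_image_header {
	uint32_t code0, code1;
	uint64_t text_offset;	/* little-endian: image load offset */
	uint64_t image_size;	/* little-endian: memory footprint, incl. BSS */
	uint64_t flags;
	uint64_t res2, res3, res4;
	uint32_t magic;		/* "ARM\x64" */
	uint32_t res5;
};

/* Check the magic bytes, as is_arm64_linux_kernel_image() in the patch does. */
static int is_arm64_image(const void *header)
{
	const struct arm64_image_header *hdr = header;

	return memcmp(&hdr->magic, "ARM\x64", sizeof(hdr->magic)) == 0;
}
```

`image_size` is why the RAM area initialised in step (a) can be larger than the bytes populated in step (b): it covers the kernel's whole memory footprint, while only the on-disk file contents are copied in. (On a little-endian host the field can be read directly; the patch uses `le64_to_cpu()` to stay portable.)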

Since we're already touching the code that copies the initrd into
guest memory, let's do a bit of cleanup and remove a useless local
variable.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
[ Make sure the Linux kernel image area is marked as RAM ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch32/include/asm/realm.h |   3 +
 arm/aarch64/include/asm/realm.h |   3 +
 arm/aarch64/realm.c             | 112 ++++++++++++++++++++++++++++++++
 arm/fdt.c                       |   6 ++
 arm/kvm.c                       |  20 ++++--
 include/linux/kernel.h          |   1 +
 6 files changed, 140 insertions(+), 5 deletions(-)

diff --git a/arm/aarch32/include/asm/realm.h b/arm/aarch32/include/asm/realm.h
index 5aca6cca..fcff0e55 100644
--- a/arm/aarch32/include/asm/realm.h
+++ b/arm/aarch32/include/asm/realm.h
@@ -6,5 +6,8 @@
 #include "kvm/kvm.h"
 
 static inline void kvm_arm_realm_create_realm_descriptor(struct kvm *kvm) {}
+static inline void kvm_arm_realm_populate_kernel(struct kvm *kvm) {}
+static inline void kvm_arm_realm_populate_initrd(struct kvm *kvm) {}
+static inline void kvm_arm_realm_populate_dtb(struct kvm *kvm) {}
 
 #endif /* ! __ASM_REALM_H */
diff --git a/arm/aarch64/include/asm/realm.h b/arm/aarch64/include/asm/realm.h
index e176f15f..6e760ac9 100644
--- a/arm/aarch64/include/asm/realm.h
+++ b/arm/aarch64/include/asm/realm.h
@@ -6,5 +6,8 @@
 #include "kvm/kvm.h"
 
 void kvm_arm_realm_create_realm_descriptor(struct kvm *kvm);
+void kvm_arm_realm_populate_kernel(struct kvm *kvm);
+void kvm_arm_realm_populate_initrd(struct kvm *kvm);
+void kvm_arm_realm_populate_dtb(struct kvm *kvm);
 
 #endif /* ! __ASM_REALM_H */
diff --git a/arm/aarch64/realm.c b/arm/aarch64/realm.c
index fc7f8d6a..eddccece 100644
--- a/arm/aarch64/realm.c
+++ b/arm/aarch64/realm.c
@@ -1,5 +1,7 @@
 #include "kvm/kvm.h"
 
+#include <linux/byteorder.h>
+#include <asm/image.h>
 #include <asm/realm.h>
 
 
@@ -80,3 +82,113 @@ void kvm_arm_realm_create_realm_descriptor(struct kvm *kvm)
 	if (ioctl(kvm->vm_fd, KVM_ENABLE_CAP, &rme_create_rd) < 0)
 		die_perror("KVM_CAP_RME(KVM_CAP_ARM_RME_CREATE_RD)");
 }
+
+static void realm_init_ipa_range(struct kvm *kvm, u64 start, u64 size)
+{
+	struct kvm_cap_arm_rme_init_ipa_args init_ipa_args = {
+		.init_ipa_base = start,
+		.init_ipa_size = size
+	};
+	struct kvm_enable_cap rme_init_ipa_realm = {
+		.cap = KVM_CAP_ARM_RME,
+		.args[0] = KVM_CAP_ARM_RME_INIT_IPA_REALM,
+		.args[1] = (u64)&init_ipa_args
+	};
+
+	if (ioctl(kvm->vm_fd, KVM_ENABLE_CAP, &rme_init_ipa_realm) < 0)
+		die("unable to initialise IPA range for Realm %llx - %llx (size %llu)",
+		    start, start + size, size);
+
+}
+
+static void __realm_populate(struct kvm *kvm, u64 start, u64 size)
+{
+	struct kvm_cap_arm_rme_populate_realm_args populate_args = {
+		.populate_ipa_base = start,
+		.populate_ipa_size = size
+	};
+	struct kvm_enable_cap rme_populate_realm = {
+		.cap = KVM_CAP_ARM_RME,
+		.args[0] = KVM_CAP_ARM_RME_POPULATE_REALM,
+		.args[1] = (u64)&populate_args
+	};
+
+	if (ioctl(kvm->vm_fd, KVM_ENABLE_CAP, &rme_populate_realm) < 0)
+		die("unable to populate Realm memory %llx - %llx (size %llu)",
+		    start, start + size, size);
+}
+
+static void realm_populate(struct kvm *kvm, u64 start, u64 size)
+{
+	realm_init_ipa_range(kvm, start, size);
+	__realm_populate(kvm, start, size);
+}
+
+static bool is_arm64_linux_kernel_image(void *header)
+{
+	struct arm64_image_header *hdr = header;
+
+	return memcmp(&hdr->magic, ARM64_IMAGE_MAGIC, sizeof(hdr->magic)) == 0;
+}
+
+static ssize_t arm64_linux_kernel_image_size(void *header)
+{
+	struct arm64_image_header *hdr = header;
+
+	if (is_arm64_linux_kernel_image(header))
+		return le64_to_cpu(hdr->image_size);
+	die("Not arm64 Linux kernel Image");
+}
+
+void kvm_arm_realm_populate_kernel(struct kvm *kvm)
+{
+	u64 start, end, mem_size;
+	void *header = guest_flat_to_host(kvm, kvm->arch.kern_guest_start);
+
+	start = ALIGN_DOWN(kvm->arch.kern_guest_start, SZ_4K);
+	end = ALIGN(kvm->arch.kern_guest_start + kvm->arch.kern_size, SZ_4K);
+
+	if (is_arm64_linux_kernel_image(header))
+		mem_size = arm64_linux_kernel_image_size(header);
+	else
+		mem_size = end - start;
+
+	realm_init_ipa_range(kvm, start, mem_size);
+	__realm_populate(kvm, start, end - start);
+}
+
+void kvm_arm_realm_populate_initrd(struct kvm *kvm)
+{
+	u64 kernel_end, start, end;
+
+	kernel_end = ALIGN(kvm->arch.kern_guest_start + kvm->arch.kern_size, SZ_4K);
+	start = ALIGN_DOWN(kvm->arch.initrd_guest_start, SZ_4K);
+	/*
+	 * Because we align the initrd to 4 bytes, it is theoretically possible
+	 * for the start of the initrd to overlap with the same page where the
+	 * kernel ends.
+	 */
+	if (start < kernel_end)
+		start = kernel_end;
+	end = ALIGN(kvm->arch.initrd_guest_start + kvm->arch.initrd_size, SZ_4K);
+	if (end > start)
+		realm_populate(kvm, start, end - start);
+}
+
+void kvm_arm_realm_populate_dtb(struct kvm *kvm)
+{
+	u64 initrd_end, start, end;
+
+	initrd_end = ALIGN(kvm->arch.initrd_guest_start + kvm->arch.initrd_size, SZ_4K);
+	start = ALIGN_DOWN(kvm->arch.dtb_guest_start, SZ_4K);
+	/*
+	 * Same situation as with the initrd, but now it is the DTB which is
+	 * overlapping with the last page of the initrd, because the initrd is
+	 * populated first.
+	 */
+	if (start < initrd_end)
+		start = initrd_end;
+	end = ALIGN(kvm->arch.dtb_guest_start + FDT_MAX_SIZE, SZ_4K);
+	if (end > start)
+		realm_populate(kvm, start, end - start);
+}
diff --git a/arm/fdt.c b/arm/fdt.c
index 286ccadf..762a604d 100644
--- a/arm/fdt.c
+++ b/arm/fdt.c
@@ -7,6 +7,8 @@
 #include "arm-common/gic.h"
 #include "arm-common/pci.h"
 
+#include <asm/realm.h>
+
 #include <stdbool.h>
 
 #include <linux/byteorder.h>
@@ -231,6 +233,10 @@ static int setup_fdt(struct kvm *kvm)
 
 	if (kvm->cfg.arch.dump_dtb_filename)
 		dump_fdt(kvm->cfg.arch.dump_dtb_filename, fdt_dest);
+
+	if (kvm->cfg.arch.is_realm)
+		kvm_arm_realm_populate_dtb(kvm);
+
 	return 0;
 }
 late_init(setup_fdt);
diff --git a/arm/kvm.c b/arm/kvm.c
index acb627b2..57c5b5f7 100644
--- a/arm/kvm.c
+++ b/arm/kvm.c
@@ -6,6 +6,7 @@
 #include "kvm/fdt.h"
 
 #include "arm-common/gic.h"
+#include <asm/realm.h>
 
 #include <sys/resource.h>
 
@@ -167,6 +168,9 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
 	pr_debug("Loaded kernel to 0x%llx (%llu bytes)",
 		 kvm->arch.kern_guest_start, kvm->arch.kern_size);
 
+	if (kvm->cfg.arch.is_realm)
+		kvm_arm_realm_populate_kernel(kvm);
+
 	/*
 	 * Now load backwards from the end of memory so the kernel
 	 * decompressor has plenty of space to work with. First up is
@@ -188,7 +192,6 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
 	/* ... and finally the initrd, if we have one. */
 	if (fd_initrd != -1) {
 		struct stat sb;
-		unsigned long initrd_start;
 
 		if (fstat(fd_initrd, &sb))
 			die_perror("fstat");
@@ -199,7 +202,6 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
 		if (pos < kernel_end)
 			die("initrd overlaps with kernel image.");
 
-		initrd_start = guest_addr;
 		file_size = read_file(fd_initrd, pos, limit - pos);
 		if (file_size == -1) {
 			if (errno == ENOMEM)
@@ -208,11 +210,13 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
 			die_perror("initrd read");
 		}
 
-		kvm->arch.initrd_guest_start = initrd_start;
+		kvm->arch.initrd_guest_start = guest_addr;
 		kvm->arch.initrd_size = file_size;
 		pr_debug("Loaded initrd to 0x%llx (%llu bytes)",
-			 kvm->arch.initrd_guest_start,
-			 kvm->arch.initrd_size);
+			 kvm->arch.initrd_guest_start, kvm->arch.initrd_size);
+
+		if (kvm->cfg.arch.is_realm)
+			kvm_arm_realm_populate_initrd(kvm);
 	} else {
 		kvm->arch.initrd_size = 0;
 	}
@@ -269,6 +273,8 @@ bool kvm__load_firmware(struct kvm *kvm, const char *firmware_filename)
 
 	/* Kernel isn't loaded by kvm, point start address to firmware */
 	kvm->arch.kern_guest_start = fw_addr;
+	kvm->arch.kern_size = fw_sz;
+
 	pr_debug("Loaded firmware to 0x%llx (%zd bytes)",
 		 kvm->arch.kern_guest_start, fw_sz);
 
@@ -283,6 +289,10 @@ bool kvm__load_firmware(struct kvm *kvm, const char *firmware_filename)
 		 kvm->arch.dtb_guest_start,
 		 kvm->arch.dtb_guest_start + FDT_MAX_SIZE);
 
+	if (kvm->cfg.arch.is_realm)
+		/* We hijack the kernel fields to describe the firmware. */
+		kvm_arm_realm_populate_kernel(kvm);
+
 	return true;
 }
 
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 6c22f1c0..25f19c20 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -9,6 +9,7 @@
 
 #define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
 
+#define ALIGN_DOWN(x,a)		__ALIGN_MASK(x - (typeof(x))((a) - 1),(typeof(x))(a)-1)
 #define ALIGN(x,a)		__ALIGN_MASK(x,(typeof(x))(a)-1)
 #define __ALIGN_MASK(x,mask)	(((x)+(mask))&~(mask))
 #define IS_ALIGNED(x, a)	(((x) & ((typeof(x))(a) - 1)) == 0)
-- 
2.34.1



* [RFC kvmtool 19/31] arm64: Don't try to set PSTATE for VCPUs belonging to a realm
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (17 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 18/31] arm64: Populate initial realm contents Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 20/31] arm64: Finalize realm VCPU after reset Suzuki K Poulose
                     ` (12 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Christoffer Dall <christoffer.dall@arm.com>

RME doesn't allow setting the PSTATE but resets it to an architectural
value, and KVM also does not allow setting this register from user
space, so stop trying to do that.

Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/kvm-cpu.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arm/aarch64/kvm-cpu.c b/arm/aarch64/kvm-cpu.c
index e7649239..37f9aa9d 100644
--- a/arm/aarch64/kvm-cpu.c
+++ b/arm/aarch64/kvm-cpu.c
@@ -92,11 +92,13 @@ static void reset_vcpu_aarch64(struct kvm_cpu *vcpu)
 
 	reg.addr = (u64)&data;
 
-	/* pstate = all interrupts masked */
-	data	= PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT | PSR_MODE_EL1h;
-	reg.id	= ARM64_CORE_REG(regs.pstate);
-	if (ioctl(vcpu->vcpu_fd, KVM_SET_ONE_REG, &reg) < 0)
-		die_perror("KVM_SET_ONE_REG failed (spsr[EL1])");
+	if (!kvm->cfg.arch.is_realm) {
+		/* pstate = all interrupts masked */
+		data	= PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT | PSR_MODE_EL1h;
+		reg.id	= ARM64_CORE_REG(regs.pstate);
+		if (ioctl(vcpu->vcpu_fd, KVM_SET_ONE_REG, &reg) < 0)
+			die_perror("KVM_SET_ONE_REG failed (PSTATE)");
+	}
 
 	/* x1...x3 = 0 */
 	data	= 0;
-- 
2.34.1



* [RFC kvmtool 20/31] arm64: Finalize realm VCPU after reset
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (18 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 19/31] arm64: Don't try to set PSTATE for VCPUs belonging to a realm Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 21/31] init: Add last_{init, exit} list macros Suzuki K Poulose
                     ` (11 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Alexandru Elisei <alexandru.elisei@arm.com>

In order to run a VCPU belonging to a realm, that VCPU must be in the
finalized state. Finalize the VCPU after reset, since kvmtool won't be
touching the VCPU state afterwards.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/kvm-cpu.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arm/aarch64/kvm-cpu.c b/arm/aarch64/kvm-cpu.c
index 37f9aa9d..24e570c4 100644
--- a/arm/aarch64/kvm-cpu.c
+++ b/arm/aarch64/kvm-cpu.c
@@ -128,6 +128,13 @@ static void reset_vcpu_aarch64(struct kvm_cpu *vcpu)
 		if (ioctl(vcpu->vcpu_fd, KVM_SET_ONE_REG, &reg) < 0)
 			die_perror("KVM_SET_ONE_REG failed (pc)");
 	}
+
+	if (kvm->cfg.arch.is_realm) {
+		int feature = KVM_ARM_VCPU_REC;
+
+		if (ioctl(vcpu->vcpu_fd, KVM_ARM_VCPU_FINALIZE, &feature) < 0)
+			die_perror("KVM_ARM_VCPU_FINALIZE(KVM_ARM_VCPU_REC)");
+	}
 }
 
 void kvm_cpu__select_features(struct kvm *kvm, struct kvm_vcpu_init *init)
-- 
2.34.1



* [RFC kvmtool 21/31] init: Add last_{init, exit} list macros
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (19 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 20/31] arm64: Finalize realm VCPU after reset Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 22/31] arm64: Activate realm before the first VCPU is run Suzuki K Poulose
                     ` (10 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Alexandru Elisei <alexandru.elisei@arm.com>

Add a last_init macro for constructor functions that will be executed last
in the initialization process. Add a symmetrical macro, last_exit, for
destructor functions that will be the last to be executed when kvmtool
exits.

The list priority for the late_{init, exit} macros has been bumped down a
spot, but their relative priority remains unchanged, to keep the same size
for the init_lists and exit_lists.
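The ordering guarantee these macros rely on can be illustrated with plain GCC/Clang constructor priorities (kvmtool actually registers callbacks into its own `init_lists` via constructors and walks them by priority; the direct-priority form below is only an analogy, and the 108/109 values are illustrative):

```c
/*
 * Sketch: constructors with a lower priority number run first, so a
 * "last_init"-style callback (109) runs after a "late_init" one (108),
 * matching the relative ordering the patch sets up.
 */
static int order[2];
static int n;

__attribute__((constructor(108))) static void late_cb(void) { order[n++] = 8; }
__attribute__((constructor(109))) static void last_cb(void) { order[n++] = 9; }
```

This is what lets the realm-finalize step added later in the series run strictly after every `late_init()` callback.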

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 include/kvm/util-init.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/kvm/util-init.h b/include/kvm/util-init.h
index 13d4f04d..e6a0e169 100644
--- a/include/kvm/util-init.h
+++ b/include/kvm/util-init.h
@@ -39,7 +39,8 @@ static void __attribute__ ((constructor)) __init__##cb(void)		\
 #define dev_init(cb) __init_list_add(cb, 5)
 #define virtio_dev_init(cb) __init_list_add(cb, 6)
 #define firmware_init(cb) __init_list_add(cb, 7)
-#define late_init(cb) __init_list_add(cb, 9)
+#define late_init(cb) __init_list_add(cb, 8)
+#define last_init(cb) __init_list_add(cb, 9)
 
 #define core_exit(cb) __exit_list_add(cb, 0)
 #define base_exit(cb) __exit_list_add(cb, 2)
@@ -47,5 +48,6 @@ static void __attribute__ ((constructor)) __init__##cb(void)		\
 #define dev_exit(cb) __exit_list_add(cb, 5)
 #define virtio_dev_exit(cb) __exit_list_add(cb, 6)
 #define firmware_exit(cb) __exit_list_add(cb, 7)
-#define late_exit(cb) __exit_list_add(cb, 9)
+#define late_exit(cb) __exit_list_add(cb, 8)
+#define last_exit(cb) __exit_list_add(cb, 9)
 #endif
-- 
2.34.1



* [RFC kvmtool 22/31] arm64: Activate realm before the first VCPU is run
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (20 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 21/31] init: Add last_{init, exit} list macros Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 23/31] arm64: Specify SMC as the PSCI conduits for realms Suzuki K Poulose
                     ` (9 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Alexandru Elisei <alexandru.elisei@arm.com>

Before KVM can run a VCPU belonging to a realm, the realm must be
activated. Activating a realm prevents the addition of new objects and
seals the cryptographic measurement of that realm. The VCPU state is
part of the measurement, which means that realm activation must be
performed after all VCPUs have been reset.

The current RMM implementation can only create RECs in the order of their
MPIDRs. VCPUs get assigned MPIDRs by KVM based on their VCPU id. Reset the
VCPUs in the order they were created from the main thread instead of doing
it from their own thread, which doesn't guarantee any ordering.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/kvm-cpu.c             |  4 ++++
 arm/aarch64/realm.c               | 35 +++++++++++++++++++++++++++++++
 arm/include/arm-common/kvm-arch.h |  1 +
 3 files changed, 40 insertions(+)

diff --git a/arm/aarch64/kvm-cpu.c b/arm/aarch64/kvm-cpu.c
index 24e570c4..32fa7609 100644
--- a/arm/aarch64/kvm-cpu.c
+++ b/arm/aarch64/kvm-cpu.c
@@ -187,6 +187,10 @@ void kvm_cpu__reset_vcpu(struct kvm_cpu *vcpu)
 	cpu_set_t *affinity;
 	int ret;
 
+	/* VCPU reset is done before activating the realm. */
+	if (kvm->arch.realm_is_active)
+		return;
+
 	affinity = kvm->arch.vcpu_affinity_cpuset;
 	if (affinity) {
 		ret = sched_setaffinity(0, sizeof(cpu_set_t), affinity);
diff --git a/arm/aarch64/realm.c b/arm/aarch64/realm.c
index eddccece..808d39c5 100644
--- a/arm/aarch64/realm.c
+++ b/arm/aarch64/realm.c
@@ -1,4 +1,5 @@
 #include "kvm/kvm.h"
+#include "kvm/kvm-cpu.h"
 
 #include <linux/byteorder.h>
 #include <asm/image.h>
@@ -192,3 +193,37 @@ void kvm_arm_realm_populate_dtb(struct kvm *kvm)
 	if (end > start)
 		realm_populate(kvm, start, end - start);
 }
+
+static void kvm_arm_realm_activate_realm(struct kvm *kvm)
+{
+	struct kvm_enable_cap activate_realm = {
+		.cap = KVM_CAP_ARM_RME,
+		.args[0] = KVM_CAP_ARM_RME_ACTIVATE_REALM,
+	};
+
+	if (ioctl(kvm->vm_fd, KVM_ENABLE_CAP, &activate_realm) < 0)
+		die_perror("KVM_CAP_ARM_RME(KVM_CAP_ARM_RME_ACTIVATE_REALM)");
+
+	kvm->arch.realm_is_active = true;
+}
+
+static int kvm_arm_realm_finalize(struct kvm *kvm)
+{
+	int i;
+
+	if (!kvm->cfg.arch.is_realm)
+		return 0;
+
+	/*
+	 * VCPU reset must happen before the realm is activated, because their
+	 * state is part of the cryptographic measurement for the realm.
+	 */
+	for (i = 0; i < kvm->nrcpus; i++)
+		kvm_cpu__reset_vcpu(kvm->cpus[i]);
+
+	/* Activate and seal the measurement for the realm. */
+	kvm_arm_realm_activate_realm(kvm);
+
+	return 0;
+}
+last_init(kvm_arm_realm_finalize)
diff --git a/arm/include/arm-common/kvm-arch.h b/arm/include/arm-common/kvm-arch.h
index b5a4b851..6d48e13c 100644
--- a/arm/include/arm-common/kvm-arch.h
+++ b/arm/include/arm-common/kvm-arch.h
@@ -116,6 +116,7 @@ struct kvm_arch {
 	cpu_set_t *vcpu_affinity_cpuset;
 	u64	measurement_algo;
 	u64	sve_vq;
+	bool	realm_is_active;
 };
 
 #endif /* ARM_COMMON__KVM_ARCH_H */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 23/31] arm64: Specify SMC as the PSCI conduits for realms
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (21 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 22/31] arm64: Activate realm before the first VCPU is run Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 24/31] arm64: Don't try to debug a realm Suzuki K Poulose
                     ` (8 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Christoffer Dall <christoffer.dall@arm.com>

This lets the VM use the RMM implementation for PSCI.

Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/fdt.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arm/fdt.c b/arm/fdt.c
index 762a604d..c46ff410 100644
--- a/arm/fdt.c
+++ b/arm/fdt.c
@@ -208,7 +208,14 @@ static int setup_fdt(struct kvm *kvm)
 		_FDT(fdt_property_string(fdt, "compatible", "arm,psci"));
 		fns = &psci_0_1_fns;
 	}
-	_FDT(fdt_property_string(fdt, "method", "hvc"));
+
+
+	if (kvm->cfg.arch.is_realm) {
+		_FDT(fdt_property_string(fdt, "method", "smc"));
+	} else {
+		_FDT(fdt_property_string(fdt, "method", "hvc"));
+	}
+
 	_FDT(fdt_property_cell(fdt, "cpu_suspend", fns->cpu_suspend));
 	_FDT(fdt_property_cell(fdt, "cpu_off", fns->cpu_off));
 	_FDT(fdt_property_cell(fdt, "cpu_on", fns->cpu_on));
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 24/31] arm64: Don't try to debug a realm
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (22 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 23/31] arm64: Specify SMC as the PSCI conduits for realms Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 25/31] arm64: realm: Double the IPA space Suzuki K Poulose
                     ` (7 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Alexandru Elisei <alexandru.elisei@arm.com>

Don't read the register values of a running realm, because they don't
reflect the actual hardware state of the realm. Also don't try to read realm
memory, because that will promptly lead to kvmtool being killed.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/kvm-cpu.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arm/aarch64/kvm-cpu.c b/arm/aarch64/kvm-cpu.c
index 32fa7609..a29a3413 100644
--- a/arm/aarch64/kvm-cpu.c
+++ b/arm/aarch64/kvm-cpu.c
@@ -250,6 +250,9 @@ void kvm_cpu__show_code(struct kvm_cpu *vcpu)
 
 	reg.addr = (u64)&data;
 
+	if (vcpu->kvm->cfg.arch.is_realm)
+		return;
+
 	dprintf(debug_fd, "\n*pc:\n");
 	reg.id = ARM64_CORE_REG(regs.pc);
 	if (ioctl(vcpu->vcpu_fd, KVM_GET_ONE_REG, &reg) < 0)
@@ -274,6 +277,11 @@ void kvm_cpu__show_registers(struct kvm_cpu *vcpu)
 	reg.addr = (u64)&data;
 	dprintf(debug_fd, "\n Registers:\n");
 
+	if (vcpu->kvm->cfg.arch.is_realm) {
+		dprintf(debug_fd, " UNACCESSIBLE\n");
+		return;
+	}
+
 	reg.id		= ARM64_CORE_REG(regs.pc);
 	if (ioctl(vcpu->vcpu_fd, KVM_GET_ONE_REG, &reg) < 0)
 		die("KVM_GET_ONE_REG failed (pc)");
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 25/31] arm64: realm: Double the IPA space
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (23 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 24/31] arm64: Don't try to debug a realm Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 26/31] virtio: Add a wrapper for get_host_features Suzuki K Poulose
                     ` (6 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

The Realm's IPA space is divided into two halves: protected
(lower half) and unprotected (upper half). KVM implements
aliasing of the IPA, where the unprotected IPA is an alias of
the corresponding protected IPA. Thus we must double the
IPA space required for a given VM.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/kvm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arm/aarch64/kvm.c b/arm/aarch64/kvm.c
index fca1410b..344c568b 100644
--- a/arm/aarch64/kvm.c
+++ b/arm/aarch64/kvm.c
@@ -189,6 +189,9 @@ int kvm__get_vm_type(struct kvm *kvm)
 	/* Otherwise, compute the minimal required IPA size */
 	max_ipa = kvm->cfg.ram_addr + kvm->cfg.ram_size - 1;
 	ipa_bits = max(32, fls_long(max_ipa));
+	/* Realm needs double the IPA space */
+	if (kvm->cfg.arch.is_realm)
+		ipa_bits++;
 	pr_debug("max_ipa %lx ipa_bits %d max_ipa_bits %d",
 		 max_ipa, ipa_bits, max_ipa_bits);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 26/31] virtio: Add a wrapper for get_host_features
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (24 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 25/31] arm64: realm: Double the IPA space Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 27/31] virtio: Add arch specific hook for virtio host flags Suzuki K Poulose
                     ` (5 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

Add a wrapper around vdev->ops->get_host_features() to allow
setting platform-specific flags outside the device.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 include/kvm/virtio.h | 2 ++
 virtio/core.c        | 5 +++++
 virtio/mmio-legacy.c | 2 +-
 virtio/mmio-modern.c | 2 +-
 virtio/pci-legacy.c  | 2 +-
 virtio/pci-modern.c  | 2 +-
 6 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/include/kvm/virtio.h b/include/kvm/virtio.h
index 94bddefe..e95cfad5 100644
--- a/include/kvm/virtio.h
+++ b/include/kvm/virtio.h
@@ -248,4 +248,6 @@ void virtio_set_guest_features(struct kvm *kvm, struct virtio_device *vdev,
 void virtio_notify_status(struct kvm *kvm, struct virtio_device *vdev,
 			  void *dev, u8 status);
 
+u64 virtio_dev_get_host_features(struct virtio_device *vdev, struct kvm *kvm, void *dev);
+
 #endif /* KVM__VIRTIO_H */
diff --git a/virtio/core.c b/virtio/core.c
index ea0e5b65..50e7f86d 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -283,6 +283,11 @@ void virtio_notify_status(struct kvm *kvm, struct virtio_device *vdev,
 		vdev->ops->notify_status(kvm, dev, ext_status);
 }
 
+u64 virtio_dev_get_host_features(struct virtio_device *vdev, struct kvm *kvm, void *dev)
+{
+	return vdev->ops->get_host_features(kvm, dev);
+}
+
 bool virtio_access_config(struct kvm *kvm, struct virtio_device *vdev,
 			  void *dev, unsigned long offset, void *data,
 			  size_t size, bool is_write)
diff --git a/virtio/mmio-legacy.c b/virtio/mmio-legacy.c
index 7ca7e69f..42673236 100644
--- a/virtio/mmio-legacy.c
+++ b/virtio/mmio-legacy.c
@@ -26,7 +26,7 @@ static void virtio_mmio_config_in(struct kvm_cpu *vcpu,
 		break;
 	case VIRTIO_MMIO_DEVICE_FEATURES:
 		if (vmmio->hdr.host_features_sel == 0)
-			val = vdev->ops->get_host_features(vmmio->kvm,
+			val = virtio_dev_get_host_features(vdev, vmmio->kvm,
 							   vmmio->dev);
 		ioport__write32(data, val);
 		break;
diff --git a/virtio/mmio-modern.c b/virtio/mmio-modern.c
index 6c0bb382..a09fa8e9 100644
--- a/virtio/mmio-modern.c
+++ b/virtio/mmio-modern.c
@@ -26,7 +26,7 @@ static void virtio_mmio_config_in(struct kvm_cpu *vcpu,
 	case VIRTIO_MMIO_DEVICE_FEATURES:
 		if (vmmio->hdr.host_features_sel > 1)
 			break;
-		features |= vdev->ops->get_host_features(vmmio->kvm, vmmio->dev);
+		features |= virtio_dev_get_host_features(vdev, vmmio->kvm, vmmio->dev);
 		val = features >> (32 * vmmio->hdr.host_features_sel);
 		break;
 	case VIRTIO_MMIO_QUEUE_NUM_MAX:
diff --git a/virtio/pci-legacy.c b/virtio/pci-legacy.c
index 58047967..d5f5dee7 100644
--- a/virtio/pci-legacy.c
+++ b/virtio/pci-legacy.c
@@ -44,7 +44,7 @@ static bool virtio_pci__data_in(struct kvm_cpu *vcpu, struct virtio_device *vdev
 
 	switch (offset) {
 	case VIRTIO_PCI_HOST_FEATURES:
-		val = vdev->ops->get_host_features(kvm, vpci->dev);
+		val = virtio_dev_get_host_features(vdev, kvm, vpci->dev);
 		ioport__write32(data, val);
 		break;
 	case VIRTIO_PCI_QUEUE_PFN:
diff --git a/virtio/pci-modern.c b/virtio/pci-modern.c
index c5b4bc50..2c5bf3f8 100644
--- a/virtio/pci-modern.c
+++ b/virtio/pci-modern.c
@@ -158,7 +158,7 @@ static bool virtio_pci__common_read(struct virtio_device *vdev,
 	case VIRTIO_PCI_COMMON_DF:
 		if (vpci->device_features_sel > 1)
 			break;
-		features |= vdev->ops->get_host_features(vpci->kvm, vpci->dev);
+		features |= virtio_dev_get_host_features(vdev, vpci->kvm, vpci->dev);
 		val = features >> (32 * vpci->device_features_sel);
 		ioport__write32(data, val);
 		break;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 27/31] virtio: Add arch specific hook for virtio host flags
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (25 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 26/31] virtio: Add a wrapper for get_host_features Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 28/31] arm64: realm: Enforce virtio F_ACCESS_PLATFORM flag Suzuki K Poulose
                     ` (4 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

Add callbacks for architectures to provide virtio host feature flags.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch32/kvm.c | 5 +++++
 arm/aarch64/kvm.c | 5 +++++
 include/kvm/kvm.h | 2 ++
 mips/kvm.c        | 5 +++++
 powerpc/kvm.c     | 5 +++++
 riscv/kvm.c       | 5 +++++
 virtio/core.c     | 5 ++++-
 x86/kvm.c         | 5 +++++
 8 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/arm/aarch32/kvm.c b/arm/aarch32/kvm.c
index 768a56bb..849c55d3 100644
--- a/arm/aarch32/kvm.c
+++ b/arm/aarch32/kvm.c
@@ -12,3 +12,8 @@ u64 kvm__arch_default_ram_address(void)
 {
 	return ARM_MEMORY_AREA;
 }
+
+u64 kvm__arch_get_virtio_host_features(struct kvm *kvm)
+{
+	return 0;
+}
diff --git a/arm/aarch64/kvm.c b/arm/aarch64/kvm.c
index 344c568b..a4664237 100644
--- a/arm/aarch64/kvm.c
+++ b/arm/aarch64/kvm.c
@@ -234,3 +234,8 @@ void kvm__arch_enable_mte(struct kvm *kvm)
 
 	pr_debug("MTE capability enabled");
 }
+
+u64 kvm__arch_get_virtio_host_features(struct kvm *kvm)
+{
+	return 0;
+}
diff --git a/include/kvm/kvm.h b/include/kvm/kvm.h
index 3872dc65..a3624de4 100644
--- a/include/kvm/kvm.h
+++ b/include/kvm/kvm.h
@@ -203,6 +203,8 @@ int kvm__arch_free_firmware(struct kvm *kvm);
 bool kvm__arch_cpu_supports_vm(void);
 void kvm__arch_read_term(struct kvm *kvm);
 
+u64 kvm__arch_get_virtio_host_features(struct kvm *kvm);
+
 #ifdef ARCH_HAS_CFG_RAM_ADDRESS
 static inline bool kvm__arch_has_cfg_ram_address(void)
 {
diff --git a/mips/kvm.c b/mips/kvm.c
index 0faa03a9..e23d5cf9 100644
--- a/mips/kvm.c
+++ b/mips/kvm.c
@@ -374,3 +374,8 @@ void ioport__map_irq(u8 *irq)
 void serial8250__inject_sysrq(struct kvm *kvm, char sysrq)
 {
 }
+
+u64 kvm__arch_get_virtio_host_features(struct kvm *kvm)
+{
+	return 0;
+}
diff --git a/powerpc/kvm.c b/powerpc/kvm.c
index 7b0d0669..6b3ab93f 100644
--- a/powerpc/kvm.c
+++ b/powerpc/kvm.c
@@ -529,3 +529,8 @@ int kvm__arch_free_firmware(struct kvm *kvm)
 {
 	return 0;
 }
+
+u64 kvm__arch_get_virtio_host_features(struct kvm *kvm)
+{
+	return 0;
+}
diff --git a/riscv/kvm.c b/riscv/kvm.c
index 4d6f5cb5..884321ca 100644
--- a/riscv/kvm.c
+++ b/riscv/kvm.c
@@ -182,3 +182,8 @@ int kvm__arch_setup_firmware(struct kvm *kvm)
 {
 	return 0;
 }
+
+u64 kvm__arch_get_virtio_host_features(struct kvm *kvm)
+{
+	return 0;
+}
diff --git a/virtio/core.c b/virtio/core.c
index 50e7f86d..674f6fae 100644
--- a/virtio/core.c
+++ b/virtio/core.c
@@ -285,7 +285,10 @@ void virtio_notify_status(struct kvm *kvm, struct virtio_device *vdev,
 
 u64 virtio_dev_get_host_features(struct virtio_device *vdev, struct kvm *kvm, void *dev)
 {
-	return vdev->ops->get_host_features(kvm, dev);
+	u64 features = kvm__arch_get_virtio_host_features(kvm);
+
+	features |= vdev->ops->get_host_features(kvm, dev);
+	return features;
 }
 
 bool virtio_access_config(struct kvm *kvm, struct virtio_device *vdev,
diff --git a/x86/kvm.c b/x86/kvm.c
index 328fa750..961b5d3f 100644
--- a/x86/kvm.c
+++ b/x86/kvm.c
@@ -387,3 +387,8 @@ void kvm__arch_read_term(struct kvm *kvm)
 	serial8250__update_consoles(kvm);
 	virtio_console__inject_interrupt(kvm);
 }
+
+u64 kvm__arch_get_virtio_host_features(struct kvm *kvm)
+{
+	return 0;
+}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 28/31] arm64: realm: Enforce virtio F_ACCESS_PLATFORM flag
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (26 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 27/31] virtio: Add arch specific hook for virtio host flags Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 29/31] mmio: add arch hook for an unhandled MMIO access Suzuki K Poulose
                     ` (3 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

For realms, force the F_ACCESS_PLATFORM flag to ensure that Linux
uses the DMA API for virtio.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/kvm.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arm/aarch64/kvm.c b/arm/aarch64/kvm.c
index a4664237..1f3a0def 100644
--- a/arm/aarch64/kvm.c
+++ b/arm/aarch64/kvm.c
@@ -5,6 +5,7 @@
 #include <linux/byteorder.h>
 #include <linux/cpumask.h>
 #include <linux/sizes.h>
+#include <linux/virtio_config.h>
 
 #include <kvm/util.h>
 
@@ -237,5 +238,10 @@ void kvm__arch_enable_mte(struct kvm *kvm)
 
 u64 kvm__arch_get_virtio_host_features(struct kvm *kvm)
 {
-	return 0;
+	u64 features = 0;
+
+	/* Enforce F_ACCESS_PLATFORM for Realms */
+	if (kvm->cfg.arch.is_realm)
+		features |= (1ULL << VIRTIO_F_ACCESS_PLATFORM);
+	return features;
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 29/31] mmio: add arch hook for an unhandled MMIO access
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (27 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 28/31] arm64: realm: Enforce virtio F_ACCESS_PLATFORM flag Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 30/31] arm64: realm: inject an abort on " Suzuki K Poulose
                     ` (2 subsequent siblings)
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel, Joey Gouly

From: Joey Gouly <joey.gouly@arm.com>

Add a hook that allows an architecture to run some code on an
unhandled MMIO access.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/kvm-cpu.c         | 4 ++++
 include/kvm/kvm-cpu.h | 2 ++
 mips/kvm-cpu.c        | 4 ++++
 mmio.c                | 3 +++
 powerpc/kvm-cpu.c     | 4 ++++
 riscv/kvm-cpu.c       | 4 ++++
 x86/kvm-cpu.c         | 4 ++++
 7 files changed, 25 insertions(+)

diff --git a/arm/kvm-cpu.c b/arm/kvm-cpu.c
index 98bc5fdf..90a15ae9 100644
--- a/arm/kvm-cpu.c
+++ b/arm/kvm-cpu.c
@@ -152,3 +152,7 @@ bool kvm_cpu__handle_exit(struct kvm_cpu *vcpu)
 void kvm_cpu__show_page_tables(struct kvm_cpu *vcpu)
 {
 }
+
+void kvm_cpu__arch_unhandled_mmio(struct kvm_cpu *vcpu)
+{
+}
diff --git a/include/kvm/kvm-cpu.h b/include/kvm/kvm-cpu.h
index 0f16f8d6..d0c40598 100644
--- a/include/kvm/kvm-cpu.h
+++ b/include/kvm/kvm-cpu.h
@@ -29,4 +29,6 @@ void kvm_cpu__show_page_tables(struct kvm_cpu *vcpu);
 void kvm_cpu__arch_nmi(struct kvm_cpu *cpu);
 void kvm_cpu__run_on_all_cpus(struct kvm *kvm, struct kvm_cpu_task *task);
 
+void kvm_cpu__arch_unhandled_mmio(struct kvm_cpu *cpu);
+
 #endif /* KVM__KVM_CPU_H */
diff --git a/mips/kvm-cpu.c b/mips/kvm-cpu.c
index 30a3de18..0ce88ac3 100644
--- a/mips/kvm-cpu.c
+++ b/mips/kvm-cpu.c
@@ -217,3 +217,7 @@ void kvm_cpu__show_code(struct kvm_cpu *vcpu)
 void kvm_cpu__show_page_tables(struct kvm_cpu *vcpu)
 {
 }
+
+void kvm_cpu__arch_unhandled_mmio(struct kvm_cpu *cpu)
+{
+}
diff --git a/mmio.c b/mmio.c
index 5a114e99..7e31079b 100644
--- a/mmio.c
+++ b/mmio.c
@@ -206,6 +206,9 @@ bool kvm__emulate_mmio(struct kvm_cpu *vcpu, u64 phys_addr, u8 *data,
 			fprintf(stderr,	"Warning: Ignoring MMIO %s at %016llx (length %u)\n",
 				to_direction(is_write),
 				(unsigned long long)phys_addr, len);
+
+		kvm_cpu__arch_unhandled_mmio(vcpu);
+
 		goto out;
 	}
 
diff --git a/powerpc/kvm-cpu.c b/powerpc/kvm-cpu.c
index 461e0a90..e0c20f9d 100644
--- a/powerpc/kvm-cpu.c
+++ b/powerpc/kvm-cpu.c
@@ -288,3 +288,7 @@ void kvm_cpu__show_page_tables(struct kvm_cpu *vcpu)
 {
 	/* Does nothing yet */
 }
+
+void kvm_cpu__arch_unhandled_mmio(struct kvm_cpu *cpu)
+{
+}
diff --git a/riscv/kvm-cpu.c b/riscv/kvm-cpu.c
index f98bd7ae..8417e361 100644
--- a/riscv/kvm-cpu.c
+++ b/riscv/kvm-cpu.c
@@ -461,3 +461,7 @@ void kvm_cpu__show_registers(struct kvm_cpu *vcpu)
 
 	kvm_cpu__show_csrs(vcpu);
 }
+
+void kvm_cpu__arch_unhandled_mmio(struct kvm_cpu *cpu)
+{
+}
diff --git a/x86/kvm-cpu.c b/x86/kvm-cpu.c
index b02ff65e..ac075ee4 100644
--- a/x86/kvm-cpu.c
+++ b/x86/kvm-cpu.c
@@ -444,3 +444,7 @@ void kvm_cpu__arch_nmi(struct kvm_cpu *cpu)
 
 	ioctl(cpu->vcpu_fd, KVM_NMI);
 }
+
+void kvm_cpu__arch_unhandled_mmio(struct kvm_cpu *cpu)
+{
+}
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 30/31] arm64: realm: inject an abort on an unhandled MMIO access
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (28 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 29/31] mmio: add arch hook for an unhandled MMIO access Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-01-27 11:39   ` [RFC kvmtool 31/31] arm64: Allow the user to create a realm Suzuki K Poulose
  2023-10-02  9:45   ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Piotr Sawicki
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel, Joey Gouly

From: Joey Gouly <joey.gouly@arm.com>

For Realms, inject a synchronous external abort instead of ignoring
unknown MMIO accesses.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/kvm-cpu.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arm/kvm-cpu.c b/arm/kvm-cpu.c
index 90a15ae9..c96d75eb 100644
--- a/arm/kvm-cpu.c
+++ b/arm/kvm-cpu.c
@@ -155,4 +155,13 @@ void kvm_cpu__show_page_tables(struct kvm_cpu *vcpu)
 
 void kvm_cpu__arch_unhandled_mmio(struct kvm_cpu *vcpu)
 {
+	struct kvm_vcpu_events events = { };
+
+	if (!vcpu->kvm->cfg.arch.is_realm)
+		return;
+
+	events.exception.ext_dabt_pending = 1;
+
+	if (ioctl(vcpu->vcpu_fd, KVM_SET_VCPU_EVENTS, &events) < 0)
+		die_perror("KVM_SET_VCPU_EVENTS failed");
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvmtool 31/31] arm64: Allow the user to create a realm
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (29 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 30/31] arm64: realm: inject an abort on " Suzuki K Poulose
@ 2023-01-27 11:39   ` Suzuki K Poulose
  2023-10-02  9:45   ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Piotr Sawicki
  31 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-01-27 11:39 UTC (permalink / raw)
  To: kvm, kvmarm
  Cc: suzuki.poulose, Alexandru Elisei, Andrew Jones, Christoffer Dall,
	Fuad Tabba, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, linux-coco,
	kvmarm, linux-arm-kernel, linux-kernel

From: Alexandru Elisei <alexandru.elisei@arm.com>

We have everything in place to create a realm, allow the user to do so.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arm/aarch64/kvm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arm/aarch64/kvm.c b/arm/aarch64/kvm.c
index 1f3a0def..422dbec2 100644
--- a/arm/aarch64/kvm.c
+++ b/arm/aarch64/kvm.c
@@ -104,8 +104,6 @@ static void validate_realm_cfg(struct kvm *kvm)
 		if (strlen(kvm->cfg.arch.realm_pv) > KVM_CAP_ARM_RME_RPV_SIZE)
 			die("Invalid size for Realm Personalization Value\n");
 	}
-
-	die("Realms not supported");
 }
 
 void kvm__arch_validate_cfg(struct kvm *kvm)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvm-unit-tests 00/27] Support for Arm Confidential Compute Architecture
  2023-01-27 11:22 [RFC] Support for Arm CCA VMs on Linux Suzuki K Poulose
                   ` (2 preceding siblings ...)
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
@ 2023-01-27 11:40 ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 01/27] lib/string: include stddef.h for size_t Joey Gouly
                     ` (26 more replies)
  2023-01-27 15:26 ` [RFC] Support for Arm CCA VMs on Linux Jean-Philippe Brucker
                   ` (4 subsequent siblings)
  8 siblings, 27 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

This series adds support for running the kvm-unit-tests in the Arm CCA reference
software architecture. See more details on Arm CCA and how to build/run the     
entire stack here [0].                                                          
                                                                                
This involves enlightening the boot/setup code with the Realm Service Interface 
(RSI). The series also includes new test cases that exercise the RSI calls.     
                                                                                
Currently we only support "kvmtool" as the VMM for running Realms. There was
an attempt to add support for running the test scripts using kvmtool here [1],
which hasn't progressed. It would be good to have that resolved, so that we can
run all the tests without manually specifying the command lines for each run.
For the purposes of running the Realm specific tests, we have added a "temporary"
script "run-realm-tests" until the kvmtool support is added. We do not expect
this to be merged.
                                                                                
                                                                                
Base Realm Support                                                              
-------------------                                                             
                                                                                
Realm IPA Space                                                                 
---------------                                                                 
When running in the Realm world, the (Guest) Physical Address - aka Intermediate
Physical Address (IPA) in Arm terminology - space of the VM is split into two
halves, protected (lower half) and unprotected (upper half). A protected IPA
always maps pages in the "realm world" and their contents are not accessible to
the host. An unprotected IPA, on the other hand, can be mapped to a page in the
"normal world" and thus shared with the host. All host emulated MMIO ranges must
be in the unprotected IPA space.
                                                                                
Realm can query the Realm Management Monitor for the configuration via RSI call 
(RSI_REALM_CONFIG) and identify the "boundary" of the "IPA" split.              
                                                                                
As far as the hyp/VMM is concerned, there is only one "IPA space" (the lower
half) of the memory map. The "upper half" is an "unprotected alias" of it.
                                                                                
In the guest, this is achieved by treating the MSB (1 << (IPA_WIDTH - 1)) as
a protection attribute (PTE_NS_SHARED), which the Realm applies to any address
it considers accessed/managed by the host (e.g., MMIO, shared pages). Given
that this value is only known at runtime (but fixed for a given Realm), we use
a variable to track it.
                                                                                
All I/O regions are marked as "shared". Care is taken to ensure I/O access (uart)
with MMU off uses the "Unprotected Physical address".                           
                                                                                
                                                                                
Realm IPA State                                                                 
---------------                                                                 
Additionally, each page (4K) in the protected IPA space has a state
(Realm IPA State - RIPAS) associated with it, which is one of:
   RIPAS_EMPTY
   RIPAS_RAM
                                                                                
Any IPA backed by RAM must be marked as RIPAS_RAM before an access is made to
it. The hypervisor/VMM does this for the initial image loaded into the Realm
memory before the Realm starts execution. Given that the kvm-unit-tests flat
files do not contain a metadata header (e.g., like the arm64 Linux kernel
Image) indicating the actual image size in memory, the VMM cannot transition
the area towards the end of the image (e.g., bss, stack), which is accessed
very early during boot. Thus the early boot assembly code marks the area up to
the stack as RAM.
                                                                                
Once we land in the C code, we mark the target relocation areas for the FDT
and initrd as RIPAS_RAM. At this point, we can scan the FDT and mark all RAM
memory blocks as RIPAS_RAM.
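The RIPAS transition loop described above might look like the following
(a runnable sketch under assumed names, not the series' actual code; the RSI
call is mocked here, since the real one is an SMC to the RMM that may convert
only part of the requested range per call):

```c
#include <assert.h>

#define PAGE_SIZE   4096UL
#define RSI_SUCCESS 0
#define RIPAS_RAM   1

/* Mock of RSI_IPA_STATE_SET: converts at most two pages per call and reports
 * the new top of the converted range, mimicking the real interface. */
static unsigned long mock_rsi_ipa_state_set(unsigned long base, unsigned long top,
					    unsigned long ripas, unsigned long *new_top)
{
	unsigned long chunk = 2 * PAGE_SIZE;

	(void)ripas;
	*new_top = (top - base > chunk) ? base + chunk : top;
	return RSI_SUCCESS;
}

/* Transition [start, start + size) to the requested RIPAS, retrying until
 * the whole range has been converted. */
static int set_memory_ripas(unsigned long start, unsigned long size,
			    unsigned long ripas)
{
	unsigned long top, end = start + size;

	while (start < end) {
		if (mock_rsi_ipa_state_set(start, end, ripas, &top) != RSI_SUCCESS)
			return -1;
		start = top;	/* resume from the last converted address */
	}
	return 0;
}
```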
                                                                                
TODO: It would be good to add an image header to the flat files indicating the
image size in memory, which would remove the need for the RSI calls in the
early assembly boot code.
                                                                                
Shared Memory support                                                           
---------------------                                                           
Given that the "default" memory of a VM is not accessible to the host, we add
new page alloc/free routines for memory "shared" with the host; e.g., the
GICv3-ITS must use shared pages for ITS emulation.
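The shared allocator's shape might look like this (purely illustrative; the
real series flips PTE_NS_SHARED in the idmap and invokes RSI to change the
IPA state, which is stubbed out here so the flow is runnable):

```c
#include <assert.h>
#include <stdlib.h>

#define PAGE_SIZE 4096UL

static int shared_pages;	/* pages currently shared with the host */

/* Stubs for the share/unshare steps: the real code sets/clears the
 * PTE_NS_SHARED attribute in the page tables and informs the RMM of the
 * IPA state change. */
static void set_page_shared(void *page)    { (void)page; shared_pages++; }
static void set_page_protected(void *page) { (void)page; shared_pages--; }

static void *alloc_page_shared(void)
{
	void *page = aligned_alloc(PAGE_SIZE, PAGE_SIZE);

	if (page)
		set_page_shared(page);	/* make it host-accessible */
	return page;
}

static void free_page_shared(void *page)
{
	if (page)
		set_page_protected(page); /* reclaim before reuse as private */
	free(page);
}
```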
                                                                                
RSI Test suites                                                                 
--------------                                                                  
There are new testcases added to exercise the RSI interfaces and the RMM flows. 
                                                                                
The RSI tests for the attestation and measurement services require parsing the
tokens and claims returned by the RMM. This is achieved with the help of the
QCBOR library [2], which is added as a submodule to the project. We have also
added a wrapper library - libtokenverifier - around QCBOR to parse the tokens
according to the RMM specification.
                                                                                
The patches are also available here:                                           
                                                                                
 https://gitlab.arm.com/linux-arm/kvm-unit-tests-cca cca/rfc-v1                     
                                                                                
                                                                                
 [0] https://lore.kernel.org/all/20230127112248.136810-1-suzuki.poulose@arm.com/
 [1] https://lkml.kernel.org/r/20210702163122.96110-1-alexandru.elisei@arm.com  
 [2] https://github.com/laurencelundblade/QCBOR   

Thanks,
Joey

Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Andrew Jones <andrew.jones@linux.dev>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Joey Gouly <Joey.Gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Quentin Perret <qperret@google.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zenghui Yu <yuzenghui@huawei.com>
Cc: linux-coco@lists.linux.dev
Cc: kvmarm@lists.linux.dev
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org

Alexandru Elisei (3):
  arm: Expand SMCCC arguments and return values
  arm: selftest: realm: skip pabt test when running in a realm
  NOT-FOR-MERGING: add run-realm-tests

Djordje Kovacevic (1):
  arm: realm: Add tests for in realm SEA

Gareth Stockwell (1):
  arm: realm: add hvc and RSI_HOST_CALL tests

Jean-Philippe Brucker (1):
  arm: Move io_init after vm initialization

Joey Gouly (9):
  lib/string: include stddef.h for size_t
  arm: realm: Add RSI interface header
  arm: Make physical address mask dynamic
  arm: Introduce NS_SHARED PTE attribute
  arm: realm: Make uart available before MMU is enabled
  arm: realm: Realm initialisation
  arm: realm: Add support for changing the state of memory
  arm: realm: Add RSI version test
  lib/alloc_page: Add shared page allocation support

Mate Toth-Pal (2):
  arm: Add a library to verify tokens using the QCBOR library
  arm: realm: Add Realm attestation tests

Subhasish Ghosh (1):
  arm: realm: Add test for FPU/SIMD context save/restore

Suzuki K Poulose (9):
  arm: realm: Set RIPAS state for RAM
  arm: realm: Early memory setup
  arm: gic-v3-its: Use shared pages wherever needed
  arm: realm: Enable memory encryption
  qcbor: Add QCBOR as a submodule
  arm: Add build steps for QCBOR library
  arm: realm: add RSI interface for attestation measurements
  arm: realm: Add helpers to decode RSI return codes
  arm: realm: Add a test for shared memory

 .gitmodules                         |    3 +
 arm/Makefile.arm64                  |   17 +-
 arm/Makefile.common                 |    1 +
 arm/cstart.S                        |   49 +-
 arm/cstart64.S                      |  123 ++-
 arm/realm-attest.c                  | 1125 +++++++++++++++++++++++++++
 arm/realm-fpu.c                     |  242 ++++++
 arm/realm-ns-memory.c               |   86 ++
 arm/realm-rsi.c                     |  157 ++++
 arm/realm-sea.c                     |  143 ++++
 arm/run-realm-tests                 |   56 ++
 arm/selftest.c                      |    9 +-
 arm/unittests.cfg                   |   94 +++
 lib/alloc_page.c                    |   34 +-
 lib/alloc_page.h                    |   24 +
 lib/arm/asm/arm-smccc.h             |   44 ++
 lib/arm/asm/psci.h                  |   13 +-
 lib/arm/asm/rsi.h                   |   16 +
 lib/arm/gic-v3.c                    |    6 +-
 lib/arm/io.c                        |   24 +-
 lib/arm/mmu.c                       |   73 +-
 lib/arm/psci.c                      |   19 +-
 lib/arm/setup.c                     |   17 +-
 lib/arm64/asm/arm-smccc.h           |    6 +
 lib/arm64/asm/io.h                  |    6 +
 lib/arm64/asm/pgtable-hwdef.h       |    6 -
 lib/arm64/asm/pgtable.h             |   20 +
 lib/arm64/asm/processor.h           |    8 +
 lib/arm64/asm/rsi.h                 |   84 ++
 lib/arm64/asm/smc-rsi.h             |  139 ++++
 lib/arm64/gic-v3-its.c              |    6 +-
 lib/arm64/rsi.c                     |  143 ++++
 lib/libcflat.h                      |    1 +
 lib/qcbor                           |    1 +
 lib/string.h                        |    2 +
 lib/token_verifier/attest_defines.h |   50 ++
 lib/token_verifier/token_dumper.c   |  158 ++++
 lib/token_verifier/token_dumper.h   |   15 +
 lib/token_verifier/token_verifier.c |  591 ++++++++++++++
 lib/token_verifier/token_verifier.h |   77 ++
 40 files changed, 3640 insertions(+), 48 deletions(-)
 create mode 100644 .gitmodules
 create mode 100644 arm/realm-attest.c
 create mode 100644 arm/realm-fpu.c
 create mode 100644 arm/realm-ns-memory.c
 create mode 100644 arm/realm-rsi.c
 create mode 100644 arm/realm-sea.c
 create mode 100755 arm/run-realm-tests
 create mode 100644 lib/arm/asm/arm-smccc.h
 create mode 100644 lib/arm/asm/rsi.h
 create mode 100644 lib/arm64/asm/arm-smccc.h
 create mode 100644 lib/arm64/asm/rsi.h
 create mode 100644 lib/arm64/asm/smc-rsi.h
 create mode 100644 lib/arm64/rsi.c
 create mode 160000 lib/qcbor
 create mode 100644 lib/token_verifier/attest_defines.h
 create mode 100644 lib/token_verifier/token_dumper.c
 create mode 100644 lib/token_verifier/token_dumper.h
 create mode 100644 lib/token_verifier/token_verifier.c
 create mode 100644 lib/token_verifier/token_verifier.h

-- 
2.17.1



* [RFC kvm-unit-tests 01/27] lib/string: include stddef.h for size_t
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-31 14:43     ` Thomas Huth
  2023-01-27 11:40   ` [RFC kvm-unit-tests 02/27] arm: Expand SMCCC arguments and return values Joey Gouly
                     ` (25 subsequent siblings)
  26 siblings, 1 reply; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

Don't implicitly rely on this header being included.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 lib/string.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/string.h b/lib/string.h
index b07763ea..758dca8a 100644
--- a/lib/string.h
+++ b/lib/string.h
@@ -7,6 +7,8 @@
 #ifndef _STRING_H_
 #define _STRING_H_
 
+#include <stddef.h>  /* For size_t */
+
 extern size_t strlen(const char *buf);
 extern size_t strnlen(const char *buf, size_t maxlen);
 extern char *strcat(char *dest, const char *src);
-- 
2.17.1



* [RFC kvm-unit-tests 02/27] arm: Expand SMCCC arguments and return values
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 01/27] lib/string: include stddef.h for size_t Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 03/27] arm: realm: Add RSI interface header Joey Gouly
                     ` (24 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

From: Alexandru Elisei <alexandru.elisei@arm.com>

PSCI uses the SMC Calling Convention (SMCCC) to communicate with the higher
level software. PSCI uses at most 4 arguments and expects only one return
value. However, SMCCC has provisions for more arguments (up to 17, depending
on the SMCCC version) and up to 10 distinct return values.

We are going to be adding tests that make use of it, so add support for the
extended number of arguments and return values.

Also rename the SMCCC functions to generic, non-PSCI names, so they
can be used for Realm services.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
[ Expand the number of args to 11 / results to 10 ]
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 arm/cstart.S              | 49 ++++++++++++++++++++++++++++------
 arm/cstart64.S            | 55 +++++++++++++++++++++++++++++++++------
 arm/selftest.c            |  2 +-
 lib/arm/asm/arm-smccc.h   | 44 +++++++++++++++++++++++++++++++
 lib/arm/asm/psci.h        | 13 +++++----
 lib/arm/psci.c            | 19 +++++++++++---
 lib/arm64/asm/arm-smccc.h |  6 +++++
 7 files changed, 160 insertions(+), 28 deletions(-)
 create mode 100644 lib/arm/asm/arm-smccc.h
 create mode 100644 lib/arm64/asm/arm-smccc.h

diff --git a/arm/cstart.S b/arm/cstart.S
index 7036e67f..db377668 100644
--- a/arm/cstart.S
+++ b/arm/cstart.S
@@ -96,26 +96,59 @@ start:
 .text
 
 /*
- * psci_invoke_hvc / psci_invoke_smc
+ * arm_smccc_hvc / arm_smccc_smc
  *
  * Inputs:
  *   r0 -- function_id
  *   r1 -- arg0
  *   r2 -- arg1
  *   r3 -- arg2
+ *   [sp] - arg3
+ *   [sp + #4] - arg4
+ *   [sp + #8] - arg5
+ *   [sp + #12] - arg6
+ *   [sp + #16] - arg7
+ *   [sp + #20] - arg8
+ *   [sp + #24] - arg9
+ *   [sp + #28] - arg10
+ *   [sp + #32] - result (as a pointer to a struct smccc_result)
  *
  * Outputs:
  *   r0 -- return code
+ *
+ * If result pointer is not NULL:
+ *   result.r0 -- return code
+ *   result.r1 -- r1
+ *   result.r2 -- r2
+ *   result.r3 -- r3
+ *   result.r4 -- r4
+ *   result.r5 -- r5
+ *   result.r6 -- r6
+ *   result.r7 -- r7
+ *   result.r8 -- r8
+ *   result.r9 -- r9
  */
-.globl psci_invoke_hvc
-psci_invoke_hvc:
-	hvc	#0
+.macro do_smccc_call instr
+	mov	r12, sp
+	push	{r4-r11}
+	ldm	r12, {r4-r11}
+	\instr	#0
+	ldr	r10, [sp, #64]
+	cmp	r10, #0
+	beq	1f
+	stm	r10, {r0-r9}
+1:
+	pop	{r4-r11}
 	mov	pc, lr
+.endm
 
-.globl psci_invoke_smc
-psci_invoke_smc:
-	smc	#0
-	mov	pc, lr
+.globl arm_smccc_hvc
+arm_smccc_hvc:
+	do_smccc_call hvc
+
+.globl arm_smccc_smc
+arm_smccc_smc:
+	do_smccc_call smc
 
 enable_vfp:
 	/* Enable full access to CP10 and CP11: */
diff --git a/arm/cstart64.S b/arm/cstart64.S
index e4ab7d06..b689b132 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -110,26 +110,65 @@ start:
 .text
 
 /*
- * psci_invoke_hvc / psci_invoke_smc
+ * arm_smccc_hvc / arm_smccc_smc
  *
  * Inputs:
  *   w0 -- function_id
  *   x1 -- arg0
  *   x2 -- arg1
  *   x3 -- arg2
+ *   x4 -- arg3
+ *   x5 -- arg4
+ *   x6 -- arg5
+ *   x7 -- arg6
+ *   sp -- { arg7, arg8, arg9, arg10, result }
  *
  * Outputs:
  *   x0 -- return code
+ *
+ * If result pointer is not NULL:
+ *   result.r0 -- return code
+ *   result.r1 -- x1
+ *   result.r2 -- x2
+ *   result.r3 -- x3
+ *   result.r4 -- x4
+ *   result.r5 -- x5
+ *   result.r6 -- x6
+ *   result.r7 -- x7
+ *   result.r8 -- x8
+ *   result.r9 -- x9
  */
-.globl psci_invoke_hvc
-psci_invoke_hvc:
-	hvc	#0
+.macro do_smccc_call instr
+	/* Save x8-x11 on stack */
+	stp	x9, x8,	  [sp, #-16]!
+	stp	x11, x10, [sp, #-16]!
+	/* Load arg7 - arg10 from the stack */
+	ldp	x8, x9,   [sp, #32]
+	ldp	x10, x11, [sp, #48]
+	\instr	#0
+	/* Get the result address */
+	ldr	x10, [sp, #64]
+	cmp	x10, xzr
+	b.eq	1f
+	stp	x0, x1, [x10, #0]
+	stp	x2, x3, [x10, #16]
+	stp	x4, x5, [x10, #32]
+	stp	x6, x7, [x10, #48]
+	stp	x8, x9, [x10, #64]
+1:
+	/* Restore x8-x11 from stack */
+	ldp	x11, x10, [sp], #16
+	ldp	x9, x8,   [sp], #16
 	ret
+.endm
 
-.globl psci_invoke_smc
-psci_invoke_smc:
-	smc	#0
-	ret
+.globl arm_smccc_hvc
+arm_smccc_hvc:
+	do_smccc_call hvc
+
+.globl arm_smccc_smc
+arm_smccc_smc:
+	do_smccc_call smc
 
 get_mmu_off:
 	adrp	x0, auxinfo
diff --git a/arm/selftest.c b/arm/selftest.c
index 9f459ed3..6f825add 100644
--- a/arm/selftest.c
+++ b/arm/selftest.c
@@ -405,7 +405,7 @@ static void psci_print(void)
 	int ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
 	report_info("PSCI version: %d.%d", PSCI_VERSION_MAJOR(ver),
 					  PSCI_VERSION_MINOR(ver));
-	report_info("PSCI method: %s", psci_invoke == psci_invoke_hvc ?
+	report_info("PSCI method: %s", psci_invoke_fn == arm_smccc_hvc ?
 				       "hvc" : "smc");
 }
 
diff --git a/lib/arm/asm/arm-smccc.h b/lib/arm/asm/arm-smccc.h
new file mode 100644
index 00000000..5d85b01a
--- /dev/null
+++ b/lib/arm/asm/arm-smccc.h
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+#ifndef _ASMARM_ARM_SMCCC_H_
+#define _ASMARM_ARM_SMCCC_H_
+
+struct smccc_result {
+	unsigned long r0;
+	unsigned long r1;
+	unsigned long r2;
+	unsigned long r3;
+	unsigned long r4;
+	unsigned long r5;
+	unsigned long r6;
+	unsigned long r7;
+	unsigned long r8;
+	unsigned long r9;
+};
+
+typedef int (*smccc_invoke_fn)(unsigned int function_id, unsigned long arg0,
+			       unsigned long arg1, unsigned long arg2,
+			       unsigned long arg3, unsigned long arg4,
+			       unsigned long arg5, unsigned long arg6,
+			       unsigned long arg7, unsigned long arg8,
+			       unsigned long arg9, unsigned long arg10,
+			       struct smccc_result *result);
+extern int arm_smccc_hvc(unsigned int function_id, unsigned long arg0,
+			 unsigned long arg1, unsigned long arg2,
+			 unsigned long arg3, unsigned long arg4,
+			 unsigned long arg5, unsigned long arg6,
+			 unsigned long arg7, unsigned long arg8,
+			 unsigned long arg9, unsigned long arg10,
+			 struct smccc_result *result);
+extern int arm_smccc_smc(unsigned int function_id, unsigned long arg0,
+			 unsigned long arg1, unsigned long arg2,
+			 unsigned long arg3, unsigned long arg4,
+			 unsigned long arg5, unsigned long arg6,
+			 unsigned long arg7, unsigned long arg8,
+			 unsigned long arg9, unsigned long arg10,
+			 struct smccc_result *result);
+
+#endif /* _ASMARM_ARM_SMCCC_H_ */
diff --git a/lib/arm/asm/psci.h b/lib/arm/asm/psci.h
index cf03449b..6a399621 100644
--- a/lib/arm/asm/psci.h
+++ b/lib/arm/asm/psci.h
@@ -3,13 +3,12 @@
 #include <libcflat.h>
 #include <linux/psci.h>
 
-typedef int (*psci_invoke_fn)(unsigned int function_id, unsigned long arg0,
-			      unsigned long arg1, unsigned long arg2);
-extern psci_invoke_fn psci_invoke;
-extern int psci_invoke_hvc(unsigned int function_id, unsigned long arg0,
-			   unsigned long arg1, unsigned long arg2);
-extern int psci_invoke_smc(unsigned int function_id, unsigned long arg0,
-			   unsigned long arg1, unsigned long arg2);
+#include <asm/arm-smccc.h>
+
+extern smccc_invoke_fn psci_invoke_fn;
+
+extern int psci_invoke(unsigned int function_id, unsigned long arg0,
+		       unsigned long arg1, unsigned long arg2);
 extern void psci_set_conduit(void);
 extern int psci_cpu_on(unsigned long cpuid, unsigned long entry_point);
 extern void psci_system_reset(void);
diff --git a/lib/arm/psci.c b/lib/arm/psci.c
index 9c031a12..0a1d0e82 100644
--- a/lib/arm/psci.c
+++ b/lib/arm/psci.c
@@ -13,13 +13,24 @@
 #include <asm/smp.h>
 
 static int psci_invoke_none(unsigned int function_id, unsigned long arg0,
-			    unsigned long arg1, unsigned long arg2)
+			    unsigned long arg1, unsigned long arg2,
+			    unsigned long arg3, unsigned long arg4,
+			    unsigned long arg5, unsigned long arg6,
+			    unsigned long arg7, unsigned long arg8,
+			    unsigned long arg9, unsigned long arg10,
+			    struct smccc_result *result)
 {
 	printf("No PSCI method configured! Can't invoke...\n");
 	return PSCI_RET_NOT_PRESENT;
 }
 
-psci_invoke_fn psci_invoke = psci_invoke_none;
+smccc_invoke_fn psci_invoke_fn = psci_invoke_none;
+
+int psci_invoke(unsigned int function_id, unsigned long arg0,
+		unsigned long arg1, unsigned long arg2)
+{
+	return psci_invoke_fn(function_id, arg0, arg1, arg2, 0, 0, 0, 0, 0, 0, 0, 0, NULL);
+}
 
 int psci_cpu_on(unsigned long cpuid, unsigned long entry_point)
 {
@@ -69,9 +80,9 @@ void psci_set_conduit(void)
 	assert(method != NULL && len == 4);
 
 	if (strcmp(method->data, "hvc") == 0)
-		psci_invoke = psci_invoke_hvc;
+		psci_invoke_fn = arm_smccc_hvc;
 	else if (strcmp(method->data, "smc") == 0)
-		psci_invoke = psci_invoke_smc;
+		psci_invoke_fn = arm_smccc_smc;
 	else
 		assert_msg(false, "Unknown PSCI conduit: %s", method->data);
 }
diff --git a/lib/arm64/asm/arm-smccc.h b/lib/arm64/asm/arm-smccc.h
new file mode 100644
index 00000000..ab649489
--- /dev/null
+++ b/lib/arm64/asm/arm-smccc.h
@@ -0,0 +1,6 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+#include "../../arm/asm/arm-smccc.h"
-- 
2.17.1



* [RFC kvm-unit-tests 03/27] arm: realm: Add RSI interface header
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 01/27] lib/string: include stddef.h for size_t Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 02/27] arm: Expand SMCCC arguments and return values Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 04/27] arm: Make physical address mask dynamic Joey Gouly
                     ` (23 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

Add the definitions for the Realm Service Interface (RSI). RSI calls are a way
for the Realm to communicate with the RMM and request information/services.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 lib/arm64/asm/smc-rsi.h | 139 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 139 insertions(+)
 create mode 100644 lib/arm64/asm/smc-rsi.h

diff --git a/lib/arm64/asm/smc-rsi.h b/lib/arm64/asm/smc-rsi.h
new file mode 100644
index 00000000..cd05e9c6
--- /dev/null
+++ b/lib/arm64/asm/smc-rsi.h
@@ -0,0 +1,139 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+#ifndef __SMC_RSI_H_
+#define __SMC_RSI_H_
+
+/*
+ * This file describes the Realm Services Interface (RSI) Application Binary
+ * Interface (ABI) for SMC calls made from within the Realm to the RMM and
+ * serviced by the RMM.
+ */
+
+#define SMC_RSI_CALL_BASE		0xC4000000
+
+/*
+ * The major version number of the RSI implementation.  Increase this whenever
+ * the binary format or semantics of the SMC calls change.
+ */
+#define RSI_ABI_VERSION_MAJOR		12
+
+/*
+ * The minor version number of the RSI implementation.  Increase this when
+ * a bug is fixed, or a feature is added without breaking binary compatibility.
+ */
+#define RSI_ABI_VERSION_MINOR		0
+
+#define RSI_ABI_VERSION			((RSI_ABI_VERSION_MAJOR << 16) | \
+					 RSI_ABI_VERSION_MINOR)
+
+#define RSI_ABI_VERSION_GET_MAJOR(_version) ((_version) >> 16)
+#define RSI_ABI_VERSION_GET_MINOR(_version) ((_version) & 0xFFFF)
+
+#define RSI_SUCCESS			0
+#define RSI_ERROR_INPUT			1
+#define RSI_ERROR_STATE			2
+#define RSI_INCOMPLETE			3
+#define RSI_ERROR_MEMORY		4
+
+#define SMC_RSI_FID(_x)			(SMC_RSI_CALL_BASE + (_x))
+
+#define SMC_RSI_ABI_VERSION			SMC_RSI_FID(0x190)
+
+/*
+ * arg1 == The IPA of token buffer
+ * arg2 == Challenge value, bytes:  0 -  7
+ * arg3 == Challenge value, bytes:  8 - 15
+ * arg4 == Challenge value, bytes: 16 - 23
+ * arg5 == Challenge value, bytes: 24 - 31
+ * arg6 == Challenge value, bytes: 32 - 39
+ * arg7 == Challenge value, bytes: 40 - 47
+ * arg8 == Challenge value, bytes: 48 - 55
+ * arg9 == Challenge value, bytes: 56 - 63
+ * ret0 == Status / error
+ */
+#define SMC_RSI_ATTEST_TOKEN_INIT	SMC_RSI_FID(0x194)
+
+/*
+ * arg1 == The IPA of token buffer
+ * ret0 == Status / error
+ * ret1 == Size of completed token in bytes
+ */
+#define SMC_RSI_ATTEST_TOKEN_CONTINUE	SMC_RSI_FID(0x195)
+
+/*
+ * arg1  == Index (1..4), which measurement (REM) to extend
+ * arg2  == Size of realm measurement in bytes, max 64 bytes
+ * arg3  == Measurement value, bytes:  0 -  7
+ * arg4  == Measurement value, bytes:  8 - 15
+ * arg5  == Measurement value, bytes: 16 - 23
+ * arg6  == Measurement value, bytes: 24 - 31
+ * arg7  == Measurement value, bytes: 32 - 39
+ * arg8  == Measurement value, bytes: 40 - 47
+ * arg9  == Measurement value, bytes: 48 - 55
+ * arg10 == Measurement value, bytes: 56 - 63
+ * ret0  == Status / error
+ */
+#define SMC_RSI_MEASUREMENT_EXTEND	SMC_RSI_FID(0x193)
+
+/*
+ * arg1 == Index (0..4), which measurement (RIM or REM) to read
+ * ret0 == Status / error
+ * ret1 == Measurement value, bytes:  0 -  7
+ * ret2 == Measurement value, bytes:  8 - 15
+ * ret3 == Measurement value, bytes: 16 - 23
+ * ret4 == Measurement value, bytes: 24 - 31
+ * ret5 == Measurement value, bytes: 32 - 39
+ * ret6 == Measurement value, bytes: 40 - 47
+ * ret7 == Measurement value, bytes: 48 - 55
+ * ret8 == Measurement value, bytes: 56 - 63
+ */
+#define SMC_RSI_MEASUREMENT_READ	SMC_RSI_FID(0x192)
+
+#ifndef __ASSEMBLY__
+
+struct rsi_realm_config {
+	union {
+		struct {
+			unsigned long ipa_width; /* Width of IPA in bits */
+		};
+		unsigned char __reserved0[0x1000];
+	};
+	/* Offset 0x1000 */
+};
+
+#endif /* __ASSEMBLY__ */
+
+/*
+ * arg0 == struct rsi_realm_config addr
+ */
+#define SMC_RSI_REALM_CONFIG		SMC_RSI_FID(0x196)
+
+/*
+ * arg0 == IPA address of target region
+ * arg1 == size of target region in bytes
+ * arg2 == RIPAS value
+ * ret0 == Status / error
+ * ret1 == Top of modified IPA range
+ */
+#define SMC_RSI_IPA_STATE_SET		SMC_RSI_FID(0x197)
+
+#define RSI_HOST_CALL_NR_GPRS		31
+
+#ifndef __ASSEMBLY__
+
+struct rsi_host_call {
+	unsigned int imm;
+	unsigned long gprs[RSI_HOST_CALL_NR_GPRS];
+};
+
+#endif /* __ASSEMBLY__ */
+
+/*
+ * arg0 == struct rsi_host_call addr
+ */
+#define SMC_RSI_HOST_CALL		SMC_RSI_FID(0x199)
+
+#endif /* __SMC_RSI_H_ */
-- 
2.17.1



* [RFC kvm-unit-tests 04/27] arm: Make physical address mask dynamic
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (2 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 03/27] arm: realm: Add RSI interface header Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 05/27] arm: Introduce NS_SHARED PTE attribute Joey Gouly
                     ` (22 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

We are about to add Realm support, where the physical address width may only
be known at runtime, via RSI. Make the physical address mask dynamic, so that
it can be adjusted to the limit for the Realm. This will be required for
making pages shared, as we introduce the "sharing" attribute as the top bit of
the IPA.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 lib/arm/mmu.c                 | 2 ++
 lib/arm/setup.c               | 1 +
 lib/arm64/asm/pgtable-hwdef.h | 6 ------
 lib/arm64/asm/pgtable.h       | 9 +++++++++
 4 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index e1a72fe4..acaf5614 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -22,6 +22,8 @@
 
 pgd_t *mmu_idmap;
 
+unsigned long phys_mask_shift = 48;
+
 /* CPU 0 starts with disabled MMU */
 static cpumask_t mmu_enabled_cpumask;
 
diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index bcdf0d78..81052a3d 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -22,6 +22,7 @@
 #include <asm/thread_info.h>
 #include <asm/setup.h>
 #include <asm/page.h>
+#include <asm/pgtable.h>
 #include <asm/processor.h>
 #include <asm/smp.h>
 #include <asm/timer.h>
diff --git a/lib/arm64/asm/pgtable-hwdef.h b/lib/arm64/asm/pgtable-hwdef.h
index 8c41fe12..ac95550b 100644
--- a/lib/arm64/asm/pgtable-hwdef.h
+++ b/lib/arm64/asm/pgtable-hwdef.h
@@ -115,12 +115,6 @@
 #define PTE_ATTRINDX(t)		(_AT(pteval_t, (t)) << 2)
 #define PTE_ATTRINDX_MASK	(_AT(pteval_t, 7) << 2)
 
-/*
- * Highest possible physical address supported.
- */
-#define PHYS_MASK_SHIFT		(48)
-#define PHYS_MASK		((UL(1) << PHYS_MASK_SHIFT) - 1)
-
 /*
  * TCR flags.
  */
diff --git a/lib/arm64/asm/pgtable.h b/lib/arm64/asm/pgtable.h
index bfb8a993..22ce64f0 100644
--- a/lib/arm64/asm/pgtable.h
+++ b/lib/arm64/asm/pgtable.h
@@ -21,6 +21,15 @@
 
 #include <linux/compiler.h>
 
+extern unsigned long prot_ns_shared;
+/*
+ * Highest possible physical address supported.
+ */
+extern unsigned long phys_mask_shift;
+#define PHYS_MASK_SHIFT		(phys_mask_shift)
+#define PHYS_MASK		((UL(1) << PHYS_MASK_SHIFT) - 1)
+
+
 /*
  * We can convert va <=> pa page table addresses with simple casts
  * because we always allocate their pages with alloc_page(), and
-- 
2.17.1



* [RFC kvm-unit-tests 05/27] arm: Introduce NS_SHARED PTE attribute
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (3 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 04/27] arm: Make physical address mask dynamic Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 06/27] arm: Move io_init after vm initialization Joey Gouly
                     ` (21 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

Introduce a new attribute to indicate that a mapping is "shared" with the
host. This will be used by Realms to share pages with the host. For normal
VMs, this is always 0.

For Realms, this is dynamic, depending on the IPA width: the top bit of the
IPA is treated as the "NS_SHARED" attribute, making the VM access the
unprotected alias of the IPA.

By default, apply the NS_SHARED attribute to all I/O.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 lib/arm/mmu.c           | 5 ++++-
 lib/arm64/asm/pgtable.h | 6 ++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index acaf5614..6f1f42f5 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -22,6 +22,8 @@
 
 pgd_t *mmu_idmap;
 
+/* Used by Realms, depends on IPA size */
+unsigned long prot_ns_shared = 0;
 unsigned long phys_mask_shift = 48;
 
 /* CPU 0 starts with disabled MMU */
@@ -194,7 +196,8 @@ void __iomem *__ioremap(phys_addr_t phys_addr, size_t size)
 {
 	phys_addr_t paddr_aligned = phys_addr & PAGE_MASK;
 	phys_addr_t paddr_end = PAGE_ALIGN(phys_addr + size);
-	pgprot_t prot = __pgprot(PTE_UNCACHED | PTE_USER | PTE_UXN | PTE_PXN);
+	pgprot_t prot = __pgprot(PTE_UNCACHED | PTE_USER | PTE_UXN |
+				 PTE_PXN | PTE_NS_SHARED);
 	pgd_t *pgtable;
 
 	assert(sizeof(long) == 8 || !(phys_addr >> 32));
diff --git a/lib/arm64/asm/pgtable.h b/lib/arm64/asm/pgtable.h
index 22ce64f0..5b9f40b0 100644
--- a/lib/arm64/asm/pgtable.h
+++ b/lib/arm64/asm/pgtable.h
@@ -22,6 +22,12 @@
 #include <linux/compiler.h>
 
 extern unsigned long prot_ns_shared;
+/*
+ * The Non-secure shared bit for Realms is actually part of the output
+ * address; however, it is modeled as a PTE attribute.
+ */
+#define PTE_NS_SHARED		(prot_ns_shared)
+
 /*
  * Highest possible physical address supported.
  */
-- 
2.17.1



* [RFC kvm-unit-tests 06/27] arm: Move io_init after vm initialization
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (4 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 05/27] arm: Introduce NS_SHARED PTE attribute Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 07/27] arm: realm: Make uart available before MMU is enabled Joey Gouly
                     ` (20 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

From: Jean-Philippe Brucker <jean-philippe@linaro.org>

To create shared pages, the NS_SHARED bit must be written into the idmap.
Before VM initialization, the idmap hasn't necessarily been created. To write
to shared pages, the access must be made to an IPA with the NS_SHARED bit set.
When the stage-1 MMU is enabled, that bit is set in the PTE; but when the
stage-1 MMU is disabled, the Realm must write to the IPA with NS_SHARED
directly.

To avoid changing the whole virtio infrastructure to support pre-MMU accesses
in a Realm, move the I/O initialization after MMU enablement.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 lib/arm/setup.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index 81052a3d..65d98e97 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -274,9 +274,6 @@ void setup(const void *fdt, phys_addr_t freemem_start)
 	/* cpu_init must be called before thread_info_init */
 	thread_info_init(current_thread_info(), 0);
 
-	/* mem_init must be called before io_init */
-	io_init();
-
 	timer_save_state();
 
 	ret = dt_get_bootargs(&bootargs);
@@ -292,4 +289,7 @@ void setup(const void *fdt, phys_addr_t freemem_start)
 
 	if (!(auxinfo.flags & AUXINFO_MMU_OFF))
 		setup_vm();
+
+	/* mem_init and setup_vm must be called before io_init */
+	io_init();
 }
-- 
2.17.1



* [RFC kvm-unit-tests 07/27] arm: realm: Make uart available before MMU is enabled
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (5 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 06/27] arm: Move io_init after vm initialization Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 08/27] arm: realm: Realm initialisation Joey Gouly
                     ` (19 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

A Realm must access any emulated I/O mappings with the PTE_NS_SHARED bit set.
This is modelled as a PTE attribute, but is actually part of the address.

So, when the MMU is disabled, the "physical address" must have this bit set.
We access the UART early, before the MMU is enabled, so make sure the UART is
always accessed with the bit set.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 lib/arm/io.c            | 24 +++++++++++++++++++++++-
 lib/arm64/asm/pgtable.h |  5 +++++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/lib/arm/io.c b/lib/arm/io.c
index 343e1082..f7c6c771 100644
--- a/lib/arm/io.c
+++ b/lib/arm/io.c
@@ -15,6 +15,8 @@
 #include <asm/psci.h>
 #include <asm/spinlock.h>
 #include <asm/io.h>
+#include <asm/mmu-api.h>
+#include <asm/pgtable.h>
 
 #include "io.h"
 
@@ -29,6 +31,24 @@ static struct spinlock uart_lock;
 #define UART_EARLY_BASE (u8 *)(unsigned long)CONFIG_UART_EARLY_BASE
 static volatile u8 *uart0_base = UART_EARLY_BASE;
 
+static inline volatile u8 *get_uart_base(void)
+{
+	/*
+	 * The address of the UART base may be different
+	 * based on whether we are running with/without
+	 * MMU enabled.
+	 *
+	 * For realms, we must force to use the shared physical
+	 * alias with MMU disabled, to make sure the I/O can
+	 * be emulated.
+	 * When the MMU is turned ON, the mappings are created
+	 * appropriately.
+	 */
+	if (mmu_enabled())
+		return uart0_base;
+	return (u8 *)arm_shared_phys_alias((void *)uart0_base);
+}
+
 static void uart0_init(void)
 {
 	/*
@@ -81,9 +101,11 @@ void io_init(void)
 
 void puts(const char *s)
 {
+	volatile u8 *uart_base = get_uart_base();
+
 	spin_lock(&uart_lock);
 	while (*s)
-		writeb(*s++, uart0_base);
+		writeb(*s++, uart_base);
 	spin_unlock(&uart_lock);
 }
 
diff --git a/lib/arm64/asm/pgtable.h b/lib/arm64/asm/pgtable.h
index 5b9f40b0..871c03e9 100644
--- a/lib/arm64/asm/pgtable.h
+++ b/lib/arm64/asm/pgtable.h
@@ -28,6 +28,11 @@ extern unsigned long prot_ns_shared;
 */
 #define PTE_NS_SHARED		(prot_ns_shared)
 
+static inline unsigned long arm_shared_phys_alias(void *addr)
+{
+	return ((unsigned long)addr | PTE_NS_SHARED);
+}
+
 /*
  * Highest possible physical address supported.
  */
-- 
2.17.1



* [RFC kvm-unit-tests 08/27] arm: realm: Realm initialisation
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (6 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 07/27] arm: realm: Make uart available before MMU is enabled Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 09/27] arm: realm: Add support for changing the state of memory Joey Gouly
                     ` (18 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

During boot, run a check for the presence of the RMM. If we are a Realm,
detect the Realm configuration using RSI and initialise the key parameters.

Also expose a helper to indicate whether we are running inside a Realm.

Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 arm/Makefile.arm64        |  1 +
 lib/arm/asm/rsi.h         | 16 ++++++++++
 lib/arm/setup.c           |  3 ++
 lib/arm64/asm/processor.h |  8 +++++
 lib/arm64/asm/rsi.h       | 36 +++++++++++++++++++++
 lib/arm64/rsi.c           | 67 +++++++++++++++++++++++++++++++++++++++
 6 files changed, 131 insertions(+)
 create mode 100644 lib/arm/asm/rsi.h
 create mode 100644 lib/arm64/asm/rsi.h
 create mode 100644 lib/arm64/rsi.c

diff --git a/arm/Makefile.arm64 b/arm/Makefile.arm64
index 42e18e77..ab557f84 100644
--- a/arm/Makefile.arm64
+++ b/arm/Makefile.arm64
@@ -24,6 +24,7 @@ cstart.o = $(TEST_DIR)/cstart64.o
 cflatobjs += lib/arm64/processor.o
 cflatobjs += lib/arm64/spinlock.o
 cflatobjs += lib/arm64/gic-v3-its.o lib/arm64/gic-v3-its-cmd.o
+cflatobjs += lib/arm64/rsi.o
 
 OBJDIRS += lib/arm64
 
diff --git a/lib/arm/asm/rsi.h b/lib/arm/asm/rsi.h
new file mode 100644
index 00000000..d1f72c25
--- /dev/null
+++ b/lib/arm/asm/rsi.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+#ifndef __ASMARM_RSI_H_
+#define __ASMARM_RSI_H_
+
+#include <stdbool.h>
+
+static inline bool is_realm(void)
+{
+	return false;
+}
+
+#endif /* __ASMARM_RSI_H_ */
diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index 65d98e97..36d4d826 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -24,6 +24,7 @@
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/processor.h>
+#include <asm/rsi.h>
 #include <asm/smp.h>
 #include <asm/timer.h>
 #include <asm/psci.h>
@@ -244,6 +245,8 @@ void setup(const void *fdt, phys_addr_t freemem_start)
 	u32 fdt_size;
 	int ret;
 
+	arm_rsi_init();
+
 	assert(sizeof(long) == 8 || freemem_start < (3ul << 30));
 	freemem = (void *)(unsigned long)freemem_start;
 
diff --git a/lib/arm64/asm/processor.h b/lib/arm64/asm/processor.h
index 1c73ba32..320ebaef 100644
--- a/lib/arm64/asm/processor.h
+++ b/lib/arm64/asm/processor.h
@@ -114,6 +114,14 @@ static inline unsigned long get_id_aa64mmfr0_el1(void)
 #define ID_AA64MMFR0_TGRAN64_SUPPORTED	0x0
 #define ID_AA64MMFR0_TGRAN16_SUPPORTED	0x1
 
+static inline unsigned long get_id_aa64pfr0_el1(void)
+{
+	return read_sysreg(id_aa64pfr0_el1);
+}
+
+#define ID_AA64PFR0_EL1_EL3	(0xf << 12)
+#define ID_AA64PFR0_EL1_EL3_NI	(0x0 << 12)
+
 static inline bool system_supports_granule(size_t granule)
 {
 	u32 shift;
diff --git a/lib/arm64/asm/rsi.h b/lib/arm64/asm/rsi.h
new file mode 100644
index 00000000..8b9b91b2
--- /dev/null
+++ b/lib/arm64/asm/rsi.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+#ifndef __ASMARM64_RSI_H_
+#define __ASMARM64_RSI_H_
+
+#include <stdbool.h>
+
+#include <asm/arm-smccc.h>
+#include <asm/io.h>
+#include <asm/smc-rsi.h>
+
+#define RSI_GRANULE_SIZE	SZ_4K
+
+extern bool rsi_present;
+
+void arm_rsi_init(void);
+
+int rsi_invoke(unsigned int function_id, unsigned long arg0,
+	       unsigned long arg1, unsigned long arg2,
+	       unsigned long arg3, unsigned long arg4,
+	       unsigned long arg5, unsigned long arg6,
+	       unsigned long arg7, unsigned long arg8,
+	       unsigned long arg9, unsigned long arg10,
+	       struct smccc_result *result);
+
+int rsi_get_version(void);
+
+static inline bool is_realm(void)
+{
+	return rsi_present;
+}
+
+#endif /* __ASMARM64_RSI_H_ */
diff --git a/lib/arm64/rsi.c b/lib/arm64/rsi.c
new file mode 100644
index 00000000..23a4e963
--- /dev/null
+++ b/lib/arm64/rsi.c
@@ -0,0 +1,67 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+#include <libcflat.h>
+
+#include <asm/pgtable.h>
+#include <asm/processor.h>
+#include <asm/rsi.h>
+
+bool rsi_present;
+
+int rsi_invoke(unsigned int function_id, unsigned long arg0,
+	       unsigned long arg1, unsigned long arg2,
+	       unsigned long arg3, unsigned long arg4,
+	       unsigned long arg5, unsigned long arg6,
+	       unsigned long arg7, unsigned long arg8,
+	       unsigned long arg9, unsigned long arg10,
+	       struct smccc_result *result)
+{
+	return arm_smccc_smc(function_id, arg0, arg1, arg2, arg3, arg4, arg5,
+			     arg6, arg7, arg8, arg9, arg10, result);
+}
+
+struct rsi_realm_config __attribute__((aligned(RSI_GRANULE_SIZE))) config;
+
+static unsigned long rsi_get_realm_config(struct rsi_realm_config *cfg)
+{
+	struct smccc_result res;
+
+	rsi_invoke(SMC_RSI_REALM_CONFIG, __virt_to_phys((unsigned long)cfg),
+		   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, &res);
+
+	return res.r0;
+}
+
+int rsi_get_version(void)
+{
+	struct smccc_result res = {};
+	int ret;
+
+	if ((get_id_aa64pfr0_el1() & ID_AA64PFR0_EL1_EL3) == ID_AA64PFR0_EL1_EL3_NI)
+		return -1;
+
+	ret = rsi_invoke(SMC_RSI_ABI_VERSION, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+		         &res);
+	if (ret)
+		return ret;
+
+	return res.r0;
+}
+
+void arm_rsi_init(void)
+{
+	if (rsi_get_version() != RSI_ABI_VERSION)
+		return;
+
+	if (rsi_get_realm_config(&config))
+		return;
+
+	rsi_present = true;
+
+	phys_mask_shift = (config.ipa_width - 1);
+	/* Set the upper bit of the IPA as the NS_SHARED pte attribute */
+	prot_ns_shared = (1UL << phys_mask_shift);
+}
-- 
2.17.1



* [RFC kvm-unit-tests 09/27] arm: realm: Add support for changing the state of memory
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (7 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 08/27] arm: realm: Realm initialisation Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 10/27] arm: realm: Set RIPAS state for RAM Joey Gouly
                     ` (17 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

For a Realm, the guest physical address (in reality the IPA/GPA of the VM)
has an associated state (Realm IPA State, RIPAS), which is one of:
   RIPAS_RAM
   RIPAS_EMPTY

The state of the physical address determines certain behaviours, e.g., any
access to a RIPAS_EMPTY PA will generate a Synchronous External Abort back to
the Realm, from the RMM.

Any "PA" that represents RAM for the Realm must be set to RIPAS_RAM before
an access is made. When the initial image (e.g., test, DTB) of a Realm is
loaded, the hypervisor/VMM can transition the state of the loaded "area" to
RIPAS_RAM. The rest of the "RAM" must be transitioned by the test payload
before any access is made.

Similarly, a Realm could set an "IPA" to RIPAS_EMPTY when it is about to use
the "unprotected" alias of the IPA. This is a hint for the host to reclaim
the page backing the protected "IPA".

This patch adds supporting helpers for setting the IPA state from the Realm.
These will be used later.

Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 lib/arm/mmu.c       |  1 +
 lib/arm64/asm/rsi.h |  8 ++++++++
 lib/arm64/rsi.c     | 44 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 53 insertions(+)

diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index 6f1f42f5..2b5a7141 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -12,6 +12,7 @@
 #include <asm/setup.h>
 #include <asm/page.h>
 #include <asm/io.h>
+#include <asm/rsi.h>
 
 #include "alloc_page.h"
 #include "vmalloc.h"
diff --git a/lib/arm64/asm/rsi.h b/lib/arm64/asm/rsi.h
index 8b9b91b2..c8179341 100644
--- a/lib/arm64/asm/rsi.h
+++ b/lib/arm64/asm/rsi.h
@@ -33,4 +33,12 @@ static inline bool is_realm(void)
 	return rsi_present;
 }
 
+enum ripas_t {
+	RIPAS_EMPTY,
+	RIPAS_RAM,
+};
+
+void arm_set_memory_protected(unsigned long va, size_t size);
+void arm_set_memory_shared(unsigned long va, size_t size);
+
 #endif /* __ASMARM64_RSI_H_ */
diff --git a/lib/arm64/rsi.c b/lib/arm64/rsi.c
index 23a4e963..08c77889 100644
--- a/lib/arm64/rsi.c
+++ b/lib/arm64/rsi.c
@@ -65,3 +65,47 @@ void arm_rsi_init(void)
 	/* Set the upper bit of the IPA as the NS_SHARED pte attribute */
 	prot_ns_shared = (1UL << phys_mask_shift);
 }
+
+static unsigned rsi_set_addr_range_state(unsigned long start, unsigned long size,
+					 enum ripas_t state, unsigned long *top)
+{
+	struct smccc_result res;
+
+	rsi_invoke(SMC_RSI_IPA_STATE_SET, start, size, state, 0, 0, 0, 0, 0, 0, 0, 0, &res);
+	*top = res.r1;
+	return res.r0;
+}
+
+static void arm_set_memory_state(unsigned long start,
+				 unsigned long size,
+				 unsigned int ripas)
+{
+	int ret;
+	unsigned long end, top;
+	unsigned long old_start = start;
+
+	if (!is_realm())
+		return;
+
+	start = ALIGN_DOWN(start, RSI_GRANULE_SIZE);
+	if (start != old_start)
+		size += old_start - start;
+	end = ALIGN(start + size, RSI_GRANULE_SIZE);
+	while (start != end) {
+		ret = rsi_set_addr_range_state(start, (end - start),
+					       ripas, &top);
+		assert(!ret);
+		assert(top <= end);
+		start = top;
+	}
+}
+
+void arm_set_memory_protected(unsigned long start, unsigned long size)
+{
+	arm_set_memory_state(start, size, RIPAS_RAM);
+}
+
+void arm_set_memory_shared(unsigned long start, unsigned long size)
+{
+	arm_set_memory_state(start, size, RIPAS_EMPTY);
+}
-- 
2.17.1



* [RFC kvm-unit-tests 10/27] arm: realm: Set RIPAS state for RAM
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (8 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 09/27] arm: realm: Add support for changing the state of memory Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 11/27] arm: realm: Early memory setup Joey Gouly
                     ` (16 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

From: Suzuki K Poulose <suzuki.poulose@arm.com>

A Realm must ensure that the "RAM" region is set to RIPAS_RAM before any
access is made. This patch makes sure that all memory blocks are marked as
RIPAS_RAM. Also, before we relocate the "FDT" and "initrd", make sure the
target location is marked too, since this happens before we parse the memory
blocks.

It is OK to perform this operation on a given IPA multiple times, so we don't
exclude the initial image areas from the "target" list.

Also, this operation doesn't yet require the host to commit physical memory
to back the IPAs. That can be done on demand via fault handling.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 lib/arm/setup.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index 36d4d826..7b3dc0b8 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -181,6 +181,7 @@ static void mem_init(phys_addr_t freemem_start)
 	while (r && r->end != mem.end)
 		r = mem_region_find(r->end);
 	assert(r);
+	arm_set_memory_protected(r->start, r->end - r->start);
 
 	/* Ensure our selected freemem range is somewhere in our full range */
 	assert(freemem_start >= mem.start && freemem->end <= mem.end);
@@ -252,6 +253,11 @@ void setup(const void *fdt, phys_addr_t freemem_start)
 
 	/* Move the FDT to the base of free memory */
 	fdt_size = fdt_totalsize(fdt);
+	/*
+	 * Before we touch the memory @freemem, make sure it
+	 * is set to protected for Realms.
+	 */
+	arm_set_memory_protected((unsigned long)freemem, fdt_size);
 	ret = fdt_move(fdt, freemem, fdt_size);
 	assert(ret == 0);
 	ret = dt_init(freemem);
@@ -263,6 +269,7 @@ void setup(const void *fdt, phys_addr_t freemem_start)
 	assert(ret == 0 || ret == -FDT_ERR_NOTFOUND);
 	if (ret == 0) {
 		initrd = freemem;
+		arm_set_memory_protected((unsigned long)initrd, initrd_size);
 		memmove(initrd, tmp, initrd_size);
 		freemem += initrd_size;
 	}
-- 
2.17.1



* [RFC kvm-unit-tests 11/27] arm: realm: Early memory setup
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (9 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 10/27] arm: realm: Set RIPAS state for RAM Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 12/27] arm: realm: Add RSI version test Joey Gouly
                     ` (15 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

From: Suzuki K Poulose <suzuki.poulose@arm.com>

A Realm must mark areas of memory as RIPAS_RAM before an access is made.

The binary image is loaded by the VMM and thus that area is converted.
However, the file image may not cover the tail portion of the "memory" image
(e.g., BSS, stack, etc.). Convert the area touched by the early boot code to
RAM before it is accessed in early assembly code.

Once we land in the C code, we take care of converting the entire RAM region
to RIPAS_RAM.

Please note that this operation doesn't require the host to commit memory to
the Realm.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Co-developed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Co-developed-by: Joey Gouly <joey.gouly@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 arm/cstart64.S | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/arm/cstart64.S b/arm/cstart64.S
index b689b132..b0861594 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -14,6 +14,7 @@
 #include <asm/pgtable-hwdef.h>
 #include <asm/thread_info.h>
 #include <asm/sysreg.h>
+#include <asm/smc-rsi.h>
 
 .macro zero_range, tmp1, tmp2
 9998:	cmp	\tmp1, \tmp2
@@ -61,6 +62,7 @@ start:
 	b	1b
 
 1:
+	bl	__early_mem_setup
 	/* zero BSS */
 	adrp	x4, bss
 	add	x4, x4, :lo12:bss
@@ -170,6 +172,76 @@ arm_smccc_hvc:
 arm_smccc_smc:
 	do_smccc_call smc
 
+__early_mem_setup:
+	/* Preserve x0 - x3 */
+	mov	x5, x0
+	mov	x6, x1
+	mov	x7, x2
+	mov	x8, x3
+
+	/*
+	 * Check for EL3, otherwise an SMC instruction
+	 * will cause an UNDEFINED exception.
+	 */
+	mrs	x9, ID_AA64PFR0_EL1
+	lsr	x9, x9, #12
+	and	x9, x9, 0b11
+	cbnz	x9, 1f
+	ret
+
+1:
+	/*
+	 * Are we a realm? Request the RSI ABI version.
+	 * If KVM is catching SMCs, it returns an error in x0 (~0UL)
+	 */
+	ldr	x0, =SMC_RSI_ABI_VERSION
+	smc	#0
+
+	ldr	x1, =RSI_ABI_VERSION
+	cmp	x0, x1
+	bne	3f
+
+	/*
+	 * For realms, we must mark area from bss
+	 * to the end of stack as memory before it is
+	 * accessed, as they are not populated as part
+	 * of the initial image. As such we can run
+	 * this unconditionally irrespective of whether
+	 * we are a normal VM or Realm.
+	 *
+	 * x1 = bss_start.
+	 */
+	adrp	x1, bss
+
+	/* x9 = end of stack */
+	adrp	x9, (stacktop + PAGE_SIZE)
+2:
+	/* calculate the size as (end - start) */
+	sub	x2, x9, x1
+
+	/* x3 = RIPAS_RAM */
+	mov	x3, #1
+
+	/* x0 = SMC_RSI_IPA_STATE_SET */
+	movz	x0, :abs_g2_s:SMC_RSI_IPA_STATE_SET
+	movk	x0, :abs_g1_nc:SMC_RSI_IPA_STATE_SET
+	movk	x0, :abs_g0_nc:SMC_RSI_IPA_STATE_SET
+
+	/* Run the RSI request */
+	smc	#0
+
+	/* halt if there is an error */
+	cbnz x0, halt
+
+	cmp x1, x9
+	bne 2b
+3:
+	mov	x3, x8
+	mov	x2, x7
+	mov	x1, x6
+	mov	x0, x5
+	ret
+
 get_mmu_off:
 	adrp	x0, auxinfo
 	ldr	x0, [x0, :lo12:auxinfo + 8]
-- 
2.17.1



* [RFC kvm-unit-tests 12/27] arm: realm: Add RSI version test
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (10 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 11/27] arm: realm: Early memory setup Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 13/27] arm: selftest: realm: skip pabt test when running in a realm Joey Gouly
                     ` (14 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

Add a basic test for checking the RSI version command.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 arm/Makefile.arm64 |  1 +
 arm/realm-rsi.c    | 49 ++++++++++++++++++++++++++++++++++++++++++++++
 arm/unittests.cfg  |  7 +++++++
 3 files changed, 57 insertions(+)
 create mode 100644 arm/realm-rsi.c

diff --git a/arm/Makefile.arm64 b/arm/Makefile.arm64
index ab557f84..eed77d3a 100644
--- a/arm/Makefile.arm64
+++ b/arm/Makefile.arm64
@@ -33,6 +33,7 @@ tests = $(TEST_DIR)/timer.flat
 tests += $(TEST_DIR)/micro-bench.flat
 tests += $(TEST_DIR)/cache.flat
 tests += $(TEST_DIR)/debug.flat
+tests += $(TEST_DIR)/realm-rsi.flat
 
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
diff --git a/arm/realm-rsi.c b/arm/realm-rsi.c
new file mode 100644
index 00000000..d793f305
--- /dev/null
+++ b/arm/realm-rsi.c
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+
+#include <libcflat.h>
+#include <asm/io.h>
+#include <asm/page.h>
+#include <asm/processor.h>
+#include <asm/psci.h>
+#include <alloc_page.h>
+#include <asm/rsi.h>
+#include <asm/pgtable.h>
+#include <asm/processor.h>
+
+static void rsi_test_version(void)
+{
+	int version;
+
+	report_prefix_push("version");
+
+	version = rsi_get_version();
+	if (version < 0) {
+		report(false, "SMC_RSI_ABI_VERSION failed (%d)", version);
+		return;
+	}
+
+	report(version == RSI_ABI_VERSION, "RSI ABI version %u.%u (expected: %u.%u)",
+	       RSI_ABI_VERSION_GET_MAJOR(version),
+	       RSI_ABI_VERSION_GET_MINOR(version),
+	       RSI_ABI_VERSION_GET_MAJOR(RSI_ABI_VERSION),
+	       RSI_ABI_VERSION_GET_MINOR(RSI_ABI_VERSION));
+	report_prefix_pop();
+}
+
+int main(int argc, char **argv)
+{
+	report_prefix_push("rsi");
+
+	if (!is_realm()) {
+		report_skip("Not a realm, skipping tests");
+		goto exit;
+	}
+
+	rsi_test_version();
+exit:
+	return report_summary();
+}
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index 5e67b558..ce1b5ad9 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -275,3 +275,10 @@ file = debug.flat
 arch = arm64
 extra_params = -append 'ss-migration'
 groups = debug migration
+
+# Realm RSI ABI test
+[realm-rsi]
+file = realm-rsi.flat
+groups = nodefault realms
+accel = kvm
+arch = arm64
-- 
2.17.1



* [RFC kvm-unit-tests 13/27] arm: selftest: realm: skip pabt test when running in a realm
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (11 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 12/27] arm: realm: Add RSI version test Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 14/27] arm: realm: add hvc and RSI_HOST_CALL tests Joey Gouly
                     ` (13 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

From: Alexandru Elisei <alexandru.elisei@arm.com>

The realm manager treats instruction aborts as fatal errors, so skip this
test.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 arm/selftest.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arm/selftest.c b/arm/selftest.c
index 6f825add..174f2ebc 100644
--- a/arm/selftest.c
+++ b/arm/selftest.c
@@ -18,6 +18,7 @@
 #include <asm/smp.h>
 #include <asm/mmu.h>
 #include <asm/barrier.h>
+#include <asm/rsi.h>
 
 static cpumask_t ready, valid;
 
@@ -392,11 +393,17 @@ static void check_vectors(void *arg __unused)
 					  user_psci_system_off);
 #endif
 	} else {
+		if (is_realm()) {
+			report_skip("pabt test not supported in a realm");
+			goto out;
+		}
+
 		if (!check_pabt_init())
 			report_skip("Couldn't guess an invalid physical address");
 		else
 			report(check_pabt(), "pabt");
 	}
+out:
 	exit(report_summary());
 }
 
-- 
2.17.1



* [RFC kvm-unit-tests 14/27] arm: realm: add hvc and RSI_HOST_CALL tests
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (12 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 13/27] arm: selftest: realm: skip pabt test when running in a realm Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 15/27] arm: realm: Add test for FPU/SIMD context save/restore Joey Gouly
                     ` (12 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel,
	Gareth Stockwell

From: Gareth Stockwell <gareth.stockwell@arm.com>

Test that an HVC instruction in a Realm is turned into an Unknown exception.

Test that RSI_HOST_CALL passes through to the Hypervisor.

Signed-off-by: Gareth Stockwell <gareth.stockwell@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 arm/realm-rsi.c   | 110 +++++++++++++++++++++++++++++++++++++++++++++-
 arm/unittests.cfg |  15 +++++++
 2 files changed, 124 insertions(+), 1 deletion(-)

diff --git a/arm/realm-rsi.c b/arm/realm-rsi.c
index d793f305..8a7e9622 100644
--- a/arm/realm-rsi.c
+++ b/arm/realm-rsi.c
@@ -14,6 +14,96 @@
 #include <asm/pgtable.h>
 #include <asm/processor.h>
 
+#define FID_SMCCC_VERSION	0x80000000
+#define FID_INVALID		0xc5000041
+
+#define SMCCC_VERSION_1_1	0x10001
+#define SMCCC_SUCCESS		0
+#define SMCCC_NOT_SUPPORTED	-1
+
+static bool unknown_taken;
+
+static void unknown_handler(struct pt_regs *regs, unsigned int esr)
+{
+	report_info("unknown_handler: esr=0x%x", esr);
+	unknown_taken = true;
+}
+
+static void hvc_call(unsigned int fid)
+{
+	struct smccc_result res;
+
+	unknown_taken = false;
+	arm_smccc_hvc(fid, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, &res);
+
+	if (unknown_taken) {
+		report(true, "FID=0x%x caused Unknown exception", fid);
+	} else {
+		report(false, "FID=0x%x did not cause Unknown exception", fid);
+		report_info("x0:  0x%lx", res.r0);
+		report_info("x1:  0x%lx", res.r1);
+		report_info("x2:  0x%lx", res.r2);
+		report_info("x3:  0x%lx", res.r3);
+		report_info("x4:  0x%lx", res.r4);
+		report_info("x5:  0x%lx", res.r5);
+		report_info("x6:  0x%lx", res.r6);
+		report_info("x7:  0x%lx", res.r7);
+	}
+}
+
+static void rsi_test_hvc(void)
+{
+	report_prefix_push("hvc");
+
+	/* Test that HVC causes Undefined exception, regardless of FID */
+	install_exception_handler(EL1H_SYNC, ESR_EL1_EC_UNKNOWN, unknown_handler);
+	hvc_call(FID_SMCCC_VERSION);
+	hvc_call(FID_INVALID);
+	install_exception_handler(EL1H_SYNC, ESR_EL1_EC_UNKNOWN, NULL);
+
+	report_prefix_pop();
+}
+
+static void host_call(unsigned int fid, unsigned long expected_x0)
+{
+	struct smccc_result res;
+	struct rsi_host_call __attribute__((aligned(256))) host_call_data = { 0 };
+
+	host_call_data.gprs[0] = fid;
+
+	arm_smccc_smc(SMC_RSI_HOST_CALL, virt_to_phys(&host_call_data),
+		       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, &res);
+
+	if (res.r0) {
+		report(false, "RSI_HOST_CALL returned 0x%lx", res.r0);
+	} else {
+		if (host_call_data.gprs[0] == expected_x0) {
+			report(true, "FID=0x%x x0=0x%lx",
+				fid, host_call_data.gprs[0]);
+		} else {
+			report(false, "FID=0x%x x0=0x%lx expected=0x%lx",
+				fid, host_call_data.gprs[0], expected_x0);
+			report_info("x1:  0x%lx", host_call_data.gprs[1]);
+			report_info("x2:  0x%lx", host_call_data.gprs[2]);
+			report_info("x3:  0x%lx", host_call_data.gprs[3]);
+			report_info("x4:  0x%lx", host_call_data.gprs[4]);
+			report_info("x5:  0x%lx", host_call_data.gprs[5]);
+			report_info("x6:  0x%lx", host_call_data.gprs[6]);
+		}
+	}
+}
+
+static void rsi_test_host_call(void)
+{
+	report_prefix_push("host_call");
+
+	/* Test that host calls return expected values */
+	host_call(FID_SMCCC_VERSION, SMCCC_VERSION_1_1);
+	host_call(FID_INVALID, SMCCC_NOT_SUPPORTED);
+
+	report_prefix_pop();
+}
+
 static void rsi_test_version(void)
 {
 	int version;
@@ -36,6 +126,8 @@ static void rsi_test_version(void)
 
 int main(int argc, char **argv)
 {
+	int i;
+
 	report_prefix_push("rsi");
 
 	if (!is_realm()) {
@@ -43,7 +135,23 @@ int main(int argc, char **argv)
 		goto exit;
 	}
 
-	rsi_test_version();
+	if (argc < 2) {
+		rsi_test_version();
+		rsi_test_host_call();
+		rsi_test_hvc();
+	} else {
+		for (i = 1; i < argc; i++) {
+			if (strcmp(argv[i], "version") == 0) {
+				rsi_test_version();
+			} else if (strcmp(argv[i], "hvc") == 0) {
+				rsi_test_hvc();
+			} else if (strcmp(argv[i], "host_call") == 0) {
+				rsi_test_host_call();
+			} else {
+				report_abort("Unknown subtest '%s'", argv[i]);
+			}
+		}
+	}
 exit:
 	return report_summary();
 }
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index ce1b5ad9..3cdb1a98 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -280,5 +280,20 @@ groups = debug migration
 [realm-rsi]
 file = realm-rsi.flat
 groups = nodefault realms
+extra_params = -append 'version'
+accel = kvm
+arch = arm64
+
+[realm-host-call]
+file = realm-rsi.flat
+groups = nodefault realms
+extra_params = -append 'host_call'
+accel = kvm
+arch = arm64
+
+[realm-hvc]
+file = realm-rsi.flat
+groups = nodefault realms
+extra_params = -append 'hvc'
 accel = kvm
 arch = arm64
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvm-unit-tests 15/27] arm: realm: Add test for FPU/SIMD context save/restore
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (13 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 14/27] arm: realm: add hvc and RSI_HOST_CALL tests Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 16/27] arm: realm: Add tests for in realm SEA Joey Gouly
                     ` (11 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel,
	Subhasish Ghosh

From: Subhasish Ghosh <subhasish.ghosh@arm.com>

Test that the FPU/SIMD registers are saved and restored correctly when
context switching VCPUs.

In order to test FPU/SIMD functionality, we need to make sure that
kvm-unit-tests itself doesn't generate code that uses the FPU/SIMD
registers, as that might interfere with the test results. Thus, compile
the tests with -mgeneral-regs-only.
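
As a quick illustration of what the flag buys us (a sketch assuming a
GCC-style toolchain; the probe file and the echoed message are made up
for the example), any accidental floating-point code generation now
fails at build time:

```shell
# Hypothetical probe: a function that forces FP register usage.
cat > probe.c <<'EOF'
float scale(float x) { return x * 2.0f; }
EOF

# With -mgeneral-regs-only the compiler refuses to touch FP/SIMD
# registers, so this compilation is expected to fail.
gcc -c -mgeneral-regs-only probe.c 2>/dev/null || echo "FP code rejected"
```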

Signed-off-by: Subhasish Ghosh <subhasish.ghosh@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 arm/Makefile.arm64  |   1 +
 arm/Makefile.common |   1 +
 arm/realm-fpu.c     | 242 ++++++++++++++++++++++++++++++++++++++++++++
 arm/unittests.cfg   |   8 ++
 4 files changed, 252 insertions(+)
 create mode 100644 arm/realm-fpu.c

diff --git a/arm/Makefile.arm64 b/arm/Makefile.arm64
index eed77d3a..90ec6815 100644
--- a/arm/Makefile.arm64
+++ b/arm/Makefile.arm64
@@ -34,6 +34,7 @@ tests += $(TEST_DIR)/micro-bench.flat
 tests += $(TEST_DIR)/cache.flat
 tests += $(TEST_DIR)/debug.flat
 tests += $(TEST_DIR)/realm-rsi.flat
+tests += $(TEST_DIR)/realm-fpu.flat
 
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
diff --git a/arm/Makefile.common b/arm/Makefile.common
index 1bbec64f..b339b62d 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -25,6 +25,7 @@ CFLAGS += -std=gnu99
 CFLAGS += -ffreestanding
 CFLAGS += -O2
 CFLAGS += -I $(SRCDIR)/lib -I $(SRCDIR)/lib/libfdt -I lib
+CFLAGS += -mgeneral-regs-only
 
 # We want to keep intermediate files
 .PRECIOUS: %.elf %.o
diff --git a/arm/realm-fpu.c b/arm/realm-fpu.c
new file mode 100644
index 00000000..35cfdf09
--- /dev/null
+++ b/arm/realm-fpu.c
@@ -0,0 +1,242 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+
+#include <libcflat.h>
+#include <asm/smp.h>
+#include <stdlib.h>
+
+#include <asm/rsi.h>
+
+#define CPU0_ID			0
+#define CPU1_ID			(CPU0_ID + 1)
+#define CPUS_MAX		(CPU1_ID + 1)
+#define RMM_FPU_QREG_MAX	32
+#define RMM_FPU_RESULT_PASS	(-1U)
+
+#define fpu_reg_read(val)				\
+({							\
+	uint64_t *__val = (val);			\
+	asm volatile("stp q0, q1, [%0], #32\n\t"	\
+		     "stp q2, q3, [%0], #32\n\t"	\
+		     "stp q4, q5, [%0], #32\n\t"	\
+		     "stp q6, q7, [%0], #32\n\t"	\
+		     "stp q8, q9, [%0], #32\n\t"	\
+		     "stp q10, q11, [%0], #32\n\t"	\
+		     "stp q12, q13, [%0], #32\n\t"	\
+		     "stp q14, q15, [%0], #32\n\t"	\
+		     "stp q16, q17, [%0], #32\n\t"	\
+		     "stp q18, q19, [%0], #32\n\t"	\
+		     "stp q20, q21, [%0], #32\n\t"	\
+		     "stp q22, q23, [%0], #32\n\t"	\
+		     "stp q24, q25, [%0], #32\n\t"	\
+		     "stp q26, q27, [%0], #32\n\t"	\
+		     "stp q28, q29, [%0], #32\n\t"	\
+		     "stp q30, q31, [%0], #32\n\t"	\
+		     : "+r" (__val)			\
+		     :					\
+		     : "q0", "q1", "q2", "q3",		\
+			"q4", "q5", "q6", "q7",		\
+			"q8", "q9", "q10", "q11",	\
+			"q12", "q13", "q14",		\
+			"q15", "q16", "q17",		\
+			"q18", "q19", "q20",		\
+			"q21", "q22", "q23",		\
+			"q24", "q25", "q26",		\
+			"q27", "q28", "q29",		\
+			"q30", "q31", "memory");	\
+})
+
+#define fpu_reg_write(val)			\
+do {						\
+	uint64_t *__val = (val);		\
+	asm volatile("ldp q0, q1, [%0]\n\t"	\
+		     "ldp q2, q3, [%0]\n\t"	\
+		     "ldp q4, q5, [%0]\n\t"	\
+		     "ldp q6, q7, [%0]\n\t"	\
+		     "ldp q8, q9, [%0]\n\t"	\
+		     "ldp q10, q11, [%0]\n\t"	\
+		     "ldp q12, q13, [%0]\n\t"	\
+		     "ldp q14, q15, [%0]\n\t"	\
+		     "ldp q16, q17, [%0]\n\t"	\
+		     "ldp q18, q19, [%0]\n\t"	\
+		     "ldp q20, q21, [%0]\n\t"	\
+		     "ldp q22, q23, [%0]\n\t"	\
+		     "ldp q24, q25, [%0]\n\t"	\
+		     "ldp q26, q27, [%0]\n\t"	\
+		     "ldp q28, q29, [%0]\n\t"	\
+		     "ldp q30, q31, [%0]\n\t"	\
+		     :				\
+		     : "r" (__val)		\
+		     : "q0", "q1", "q2", "q3",  \
+			"q4", "q5", "q6", "q7", \
+			"q8", "q9", "q10", "q11",\
+			"q12", "q13", "q14",	\
+			"q15", "q16", "q17",	\
+			"q18", "q19", "q20",	\
+			"q21", "q22", "q23",	\
+			"q24", "q25", "q26",	\
+			"q27", "q28", "q29",	\
+			"q30", "q31", "memory");\
+} while (0)
+
+static void nr_cpu_check(int nr)
+{
+	if (nr_cpus < nr)
+		report_abort("At least %d cpus required", nr);
+}
+/**
+ * @brief check if the FPU/SIMD register contents are the same as
+ * the input data provided.
+ */
+static uint32_t __realm_fpuregs_testall(uint64_t *indata)
+{
+	/* 128b aligned array to read data into */
+	uint64_t outdata[RMM_FPU_QREG_MAX * 2]
+			 __attribute__((aligned(sizeof(__uint128_t)))) = {
+			[0 ... ((RMM_FPU_QREG_MAX * 2) - 1)] = 0 };
+	uint8_t regcnt	= 0;
+	uint32_t result	= 0;
+
+	if (indata == NULL)
+		report_abort("invalid data pointer received");
+
+	/* read data from FPU registers */
+	fpu_reg_read(outdata);
+
+	/* check if the data is the same */
+	for (regcnt = 0; regcnt < (RMM_FPU_QREG_MAX * 2); regcnt += 2) {
+		if ((outdata[regcnt] != indata[regcnt % 4]) ||
+			(outdata[regcnt+1] != indata[(regcnt+1) % 4])) {
+			report_info(
+			"fpu/simd save/restore failed for reg: q%d expected: %lx_%lx received: %lx_%lx\n",
+			regcnt / 2, indata[(regcnt+1) % 4],
+			indata[regcnt % 4], outdata[regcnt+1],
+			outdata[regcnt]);
+		} else {
+			/* set the corresponding bit in the result
+			 * mask for each register that matched
+			 */
+			result |= (1 << (regcnt / 2));
+		}
+	}
+
+	return result;
+}
+
+/**
+ * @brief writes randomly sampled data into the FPU/SIMD registers.
+ */
+static void __realm_fpuregs_writeall_random(uint64_t **indata)
+{
+
+	/* allocate 128b aligned memory */
+	*indata = memalign(sizeof(__uint128_t), sizeof(uint64_t) * 4);
+
+	/* populate the memory with sampled data from a counter */
+	(*indata)[0] = get_cntvct();
+	(*indata)[1] = get_cntvct();
+	(*indata)[2] = get_cntvct();
+	(*indata)[3] = get_cntvct();
+
+	/* write data into FPU registers */
+	fpu_reg_write(*indata);
+}
+
+static void realm_fpuregs_writeall_run(void *data)
+{
+
+	uint64_t **indata	= (uint64_t **)data;
+
+	__realm_fpuregs_writeall_random(indata);
+}
+
+static void realm_fpuregs_testall_run(void *data)
+{
+
+	uint64_t *indata	= (uint64_t *)data;
+	uint32_t result		= 0;
+
+	result = __realm_fpuregs_testall(indata);
+	report((result == RMM_FPU_RESULT_PASS),
+	       "fpu/simd register save/restore mask: 0x%x", result);
+}
+
+/**
+ * @brief This test uses two VCPU to test FPU/SIMD save/restore
+ * @details REC1 (vcpu1) writes random data into FPU/SIMD
+ * registers, REC0 (vcpu0) corrupts/overwrites the data and finally
+ * REC1 checks if the data remains unchanged in its context.
+ */
+static void realm_fpuregs_context_switch_cpu1(void)
+{
+	int target		= CPU1_ID;
+	uint64_t *indata_remote	= NULL;
+	uint64_t *indata_local	= NULL;
+
+	/* write data from REC1/VCPU1 */
+	on_cpu(target, realm_fpuregs_writeall_run, &indata_remote);
+
+	/* Overwrite from REC0/VCPU0 */
+	__realm_fpuregs_writeall_random(&indata_local);
+
+	/* check data consistency */
+	on_cpu(target, realm_fpuregs_testall_run, indata_remote);
+
+	free(indata_remote);
+	free(indata_local);
+}
+
+/**
+ * @brief This test uses two VCPU to test FPU/SIMD save/restore
+ * @details REC0 (vcpu0) writes random data into FPU/SIMD
+ * registers, REC1 (vcpu1) corrupts/overwrites the data and finally
+ * REC0 checks if the data remains unchanged in its context.
+ */
+static void realm_fpuregs_context_switch_cpu0(void)
+{
+
+	int target		= CPU1_ID;
+	uint64_t *indata_local	= NULL;
+	uint64_t *indata_remote	= NULL;
+	uint32_t result		= 0;
+
+	/* write data from REC0/VCPU0 */
+	__realm_fpuregs_writeall_random(&indata_local);
+
+	/* Overwrite from REC1/VCPU1 */
+	on_cpu(target, realm_fpuregs_writeall_run, &indata_remote);
+
+	/* check data consistency */
+	result = __realm_fpuregs_testall(indata_local);
+	report((result == RMM_FPU_RESULT_PASS),
+	       "fpu/simd register save/restore mask: 0x%x", result);
+
+	free(indata_remote);
+	free(indata_local);
+}
+/**
+ * checks if during realm context switch, FPU/SIMD registers
+ * are saved/restored.
+ */
+static void realm_fpuregs_context_switch(void)
+{
+
+	realm_fpuregs_context_switch_cpu0();
+	realm_fpuregs_context_switch_cpu1();
+}
+
+int main(int argc, char **argv)
+{
+	report_prefix_pushf("realm-fpu");
+
+	if (!is_realm())
+		report_skip("Not running in Realm world, skipping");
+
+	nr_cpu_check(CPUS_MAX);
+	realm_fpuregs_context_switch();
+
+	return report_summary();
+}
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index 3cdb1a98..a60dc6a9 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -297,3 +297,11 @@ groups = nodefault realms
 extra_params = -append 'hvc'
 accel = kvm
 arch = arm64
+
+# Realm FPU/SIMD test
+[realm-fpu-context]
+file = realm-fpu.flat
+smp = 2
+groups = nodefault realms
+accel = kvm
+arch = arm64
-- 
2.17.1



* [RFC kvm-unit-tests 16/27] arm: realm: Add tests for in realm SEA
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (14 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 15/27] arm: realm: Add test for FPU/SIMD context save/restore Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 17/27] lib/alloc_page: Add shared page allocation support Joey Gouly
                     ` (10 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel,
	Djordje Kovacevic

From: Djordje Kovacevic <djordje.kovacevic@arm.com>

The RMM/Host can inject Synchronous External Aborts (SEAs) into the Realm
for various reasons.

The RMM injects an SEA for:
  * Instruction/Data fetch from an IPA that is in the RIPAS_EMPTY state
  * Instruction fetch from an Unprotected IPA.

Trigger these conditions from within the Realm and verify that the
SEAs are received.

Signed-off-by: Djordje Kovacevic <djordje.kovacevic@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 arm/Makefile.arm64 |   1 +
 arm/realm-sea.c    | 143 +++++++++++++++++++++++++++++++++++++++++++++
 arm/unittests.cfg  |   6 ++
 3 files changed, 150 insertions(+)
 create mode 100644 arm/realm-sea.c

diff --git a/arm/Makefile.arm64 b/arm/Makefile.arm64
index 90ec6815..8448af36 100644
--- a/arm/Makefile.arm64
+++ b/arm/Makefile.arm64
@@ -35,6 +35,7 @@ tests += $(TEST_DIR)/cache.flat
 tests += $(TEST_DIR)/debug.flat
 tests += $(TEST_DIR)/realm-rsi.flat
 tests += $(TEST_DIR)/realm-fpu.flat
+tests += $(TEST_DIR)/realm-sea.flat
 
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
diff --git a/arm/realm-sea.c b/arm/realm-sea.c
new file mode 100644
index 00000000..5ef3e2a4
--- /dev/null
+++ b/arm/realm-sea.c
@@ -0,0 +1,143 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+#include <libcflat.h>
+#include <vmalloc.h>
+#include <asm/ptrace.h>
+#include <asm/thread_info.h>
+#include <asm/mmu.h>
+#include <asm/rsi.h>
+#include <linux/compiler.h>
+#include <alloc_page.h>
+#include <asm/pgtable.h>
+
+typedef void (*empty_fn)(void);
+
+static bool test_passed;
+
+/*
+ * The virtual address of the page that the test accesses in order to
+ * cause an I/DAbort with I/DFSC = Synchronous External Abort.
+ */
+static void* target_page_va;
+
+/*
+ * Ensure that @va is an executable location from EL1:
+ * - SCTLR_EL1.WXN must be off.
+ * - Disable the access from EL0 (controlled by AP[1] in PTE).
+ */
+static void enable_instruction_fetch(void* va)
+{
+	unsigned long sctlr = read_sysreg(sctlr_el1);
+	if (sctlr & SCTLR_EL1_WXN) {
+		sctlr &= ~SCTLR_EL1_WXN;
+		write_sysreg(sctlr, sctlr_el1);
+		isb();
+		flush_tlb_all();
+	}
+
+	mmu_clear_user(current_thread_info()->pgtable, (u64)va);
+}
+
+static void data_abort_handler(struct pt_regs *regs, unsigned int esr)
+{
+	if ((esr & ESR_EL1_FSC_MASK) == ESR_EL1_FSC_EXTABT)
+		test_passed = true;
+
+	report_info("esr = %x", esr);
+	/*
+	 * Advance the PC to complete the test.
+	 */
+	regs->pc += 4;
+}
+
+static void data_access_to_empty(void)
+{
+	test_passed = false;
+	target_page_va = alloc_page();
+	phys_addr_t empty_ipa = virt_to_phys(target_page_va);
+
+	arm_set_memory_shared(empty_ipa, SZ_4K);
+
+	install_exception_handler(EL1H_SYNC, ESR_EL1_EC_DABT_EL1, data_abort_handler);
+	READ_ONCE(((char*)target_page_va)[0x55]);
+	install_exception_handler(EL1H_SYNC, ESR_EL1_EC_DABT_EL1, NULL);
+
+	report(test_passed, " ");
+}
+
+static void instruction_abort_handler(struct pt_regs *regs, unsigned int esr)
+{
+	if (((esr & ESR_EL1_FSC_MASK) == ESR_EL1_FSC_EXTABT) &&
+	     (regs->pc == (u64)target_page_va))
+		test_passed = true;
+
+	report_info("esr = %x", esr);
+	/*
+	 * Simulate the RET instruction to complete the test.
+	 */
+	regs->pc = regs->regs[30];
+}
+
+static void instr_fetch_from_empty(void)
+{
+	phys_addr_t empty_ipa;
+
+	test_passed = false;
+	target_page_va = alloc_page();
+	enable_instruction_fetch(target_page_va);
+
+	empty_ipa = virt_to_phys((void*)target_page_va);
+
+	arm_set_memory_shared(empty_ipa, SZ_4K);
+
+	install_exception_handler(EL1H_SYNC, ESR_EL1_EC_IABT_EL1, instruction_abort_handler);
+	/*
+	 * This should cause the IAbort with IFSC = SEA
+	 */
+	((empty_fn)target_page_va)();
+	install_exception_handler(EL1H_SYNC, ESR_EL1_EC_IABT_EL1, NULL);
+
+	report(test_passed, " ");
+}
+
+static void instr_fetch_from_unprotected(void)
+{
+	test_passed = false;
+	/*
+	 * The test will attempt to execute an instruction from the start of
+	 * the unprotected IPA space.
+	 */
+	target_page_va = vmap(PTE_NS_SHARED, SZ_4K);
+	enable_instruction_fetch(target_page_va);
+
+	install_exception_handler(EL1H_SYNC, ESR_EL1_EC_IABT_EL1, instruction_abort_handler);
+	/*
+	 * This should cause the IAbort with IFSC = SEA
+	 */
+	((empty_fn)target_page_va)();
+	install_exception_handler(EL1H_SYNC, ESR_EL1_EC_IABT_EL1, NULL);
+
+	report(test_passed, " ");
+}
+
+int main(int argc, char **argv)
+{
+	report_prefix_push("in_realm_sea");
+
+	report_prefix_push("data_access_to_empty");
+	data_access_to_empty();
+	report_prefix_pop();
+
+	report_prefix_push("instr_fetch_from_empty");
+	instr_fetch_from_empty();
+	report_prefix_pop();
+
+	report_prefix_push("instr_fetch_from_unprotected");
+	instr_fetch_from_unprotected();
+	report_prefix_pop();
+
+	return report_summary();
+}
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index a60dc6a9..bc2354c7 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -305,3 +305,9 @@ smp = 2
 groups = nodefault realms
 accel = kvm
 arch = arm64
+
+[realm-sea]
+file = realm-sea.flat
+groups = nodefault realms
+accel = kvm
+arch = arm64
-- 
2.17.1



* [RFC kvm-unit-tests 17/27] lib/alloc_page: Add shared page allocation support
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (15 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 16/27] arm: realm: Add tests for in realm SEA Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:40   ` [RFC kvm-unit-tests 18/27] arm: gic-v3-its: Use shared pages wherever needed Joey Gouly
                     ` (9 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

Add support for allocating pages that can be shared with the host, in
other words, decrypted pages. This is achieved by adding hooks for
marking a memory region as "encrypted" or "decrypted", which can be
overridden by architecture-specific backends.

Also add a new flag, FLAG_SHARED, for allocating shared pages.

The page allocation/free routines get "_shared_" variants too.
These will later be used for Realm support and tests.

Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 lib/alloc_page.c | 34 +++++++++++++++++++++++++++++++---
 lib/alloc_page.h | 24 ++++++++++++++++++++++++
 2 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/lib/alloc_page.c b/lib/alloc_page.c
index 84f01e11..8b811b15 100644
--- a/lib/alloc_page.c
+++ b/lib/alloc_page.c
@@ -53,6 +53,20 @@ static struct mem_area areas[MAX_AREAS];
 /* Mask of initialized areas */
 static unsigned int areas_mask;
 /* Protects areas and areas mask */
+
+#ifndef set_memory_encrypted
+static inline void set_memory_encrypted(unsigned long mem, unsigned long size)
+{
+}
+#endif
+
+#ifndef set_memory_decrypted
+static inline void set_memory_decrypted(unsigned long mem, unsigned long size)
+{
+}
+#endif
+
+
 static struct spinlock lock;
 
 bool page_alloc_initialized(void)
@@ -263,7 +277,7 @@ static bool coalesce(struct mem_area *a, u8 order, pfn_t pfn, pfn_t pfn2)
  * - no pages in the memory block were already free
  * - no pages in the memory block are special
  */
-static void _free_pages(void *mem)
+static void _free_pages(void *mem, u32 flags)
 {
 	pfn_t pfn2, pfn = virt_to_pfn(mem);
 	struct mem_area *a = NULL;
@@ -281,6 +295,9 @@ static void _free_pages(void *mem)
 	p = pfn - a->base;
 	order = a->page_states[p] & ORDER_MASK;
 
+	if (flags & FLAG_SHARED)
+		set_memory_encrypted((unsigned long)mem, BIT(order) * PAGE_SIZE);
+
 	/* ensure that the first page is allocated and not special */
 	assert(IS_ALLOCATED(a->page_states[p]));
 	/* ensure that the order has a sane value */
@@ -320,7 +337,14 @@ static void _free_pages(void *mem)
 void free_pages(void *mem)
 {
 	spin_lock(&lock);
-	_free_pages(mem);
+	_free_pages(mem, 0);
+	spin_unlock(&lock);
+}
+
+void free_pages_shared(void *mem)
+{
+	spin_lock(&lock);
+	_free_pages(mem, FLAG_SHARED);
 	spin_unlock(&lock);
 }
 
@@ -353,7 +377,7 @@ static void _unreserve_one_page(pfn_t pfn)
 	i = pfn - a->base;
 	assert(a->page_states[i] == STATUS_SPECIAL);
 	a->page_states[i] = STATUS_ALLOCATED;
-	_free_pages(pfn_to_virt(pfn));
+	_free_pages(pfn_to_virt(pfn), 0);
 }
 
 int reserve_pages(phys_addr_t addr, size_t n)
@@ -401,6 +425,10 @@ static void *page_memalign_order_flags(u8 al, u8 ord, u32 flags)
 		if (area & BIT(i))
 			res = page_memalign_order(areas + i, al, ord, fresh);
 	spin_unlock(&lock);
+
+	if (res && (flags & FLAG_SHARED))
+		set_memory_decrypted((unsigned long)res, BIT(ord) * PAGE_SIZE);
+
 	if (res && !(flags & FLAG_DONTZERO))
 		memset(res, 0, BIT(ord) * PAGE_SIZE);
 	return res;
diff --git a/lib/alloc_page.h b/lib/alloc_page.h
index 060e0418..847a7fda 100644
--- a/lib/alloc_page.h
+++ b/lib/alloc_page.h
@@ -21,6 +21,7 @@
 
 #define FLAG_DONTZERO	0x10000
 #define FLAG_FRESH	0x20000
+#define FLAG_SHARED	0x40000
 
 /* Returns true if the page allocator has been initialized */
 bool page_alloc_initialized(void);
@@ -121,4 +122,27 @@ int reserve_pages(phys_addr_t addr, size_t npages);
  */
 void unreserve_pages(phys_addr_t addr, size_t npages);
 
+/* Shared page operations */
+static inline void *alloc_pages_shared(unsigned long order)
+{
+	return alloc_pages_flags(order, FLAG_SHARED);
+}
+
+static inline void *alloc_page_shared(void)
+{
+	return alloc_pages_shared(0);
+}
+
+void free_pages_shared(void *mem);
+
+static inline void free_page_shared(void *page)
+{
+	free_pages_shared(page);
+}
+
+static inline void free_pages_shared_by_order(void *mem, unsigned long order)
+{
+	free_pages_shared(mem);
+}
+
 #endif
-- 
2.17.1



* [RFC kvm-unit-tests 18/27] arm: gic-v3-its: Use shared pages wherever needed
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (16 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 17/27] lib/alloc_page: Add shared page allocation support Joey Gouly
@ 2023-01-27 11:40   ` Joey Gouly
  2023-01-27 11:41   ` [RFC kvm-unit-tests 19/27] arm: realm: Enable memory encryption Joey Gouly
                     ` (8 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:40 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

From: Suzuki K Poulose <suzuki.poulose@arm.com>

The GICv3 ITS is emulated by the host, so the tables it accesses must be
shared with the host. Make sure these allocations use shared pages.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 lib/arm/gic-v3.c       | 6 ++++--
 lib/arm64/gic-v3-its.c | 6 +++---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/lib/arm/gic-v3.c b/lib/arm/gic-v3.c
index 2f7870ab..813cd5a6 100644
--- a/lib/arm/gic-v3.c
+++ b/lib/arm/gic-v3.c
@@ -171,7 +171,9 @@ void gicv3_lpi_alloc_tables(void)
 	u64 prop_val;
 	int cpu;
 
-	gicv3_data.lpi_prop = alloc_pages(order);
+	assert(gicv3_redist_base());
+
+	gicv3_data.lpi_prop = alloc_pages_shared(order);
 
 	/* ID bits = 13, ie. up to 14b LPI INTID */
 	prop_val = (u64)(virt_to_phys(gicv3_data.lpi_prop)) | 13;
@@ -186,7 +188,7 @@ void gicv3_lpi_alloc_tables(void)
 
 		writeq(prop_val, ptr + GICR_PROPBASER);
 
-		gicv3_data.lpi_pend[cpu] = alloc_pages(order);
+		gicv3_data.lpi_pend[cpu] = alloc_pages_shared(order);
 		pend_val = (u64)(virt_to_phys(gicv3_data.lpi_pend[cpu]));
 		writeq(pend_val, ptr + GICR_PENDBASER);
 	}
diff --git a/lib/arm64/gic-v3-its.c b/lib/arm64/gic-v3-its.c
index 2c69cfda..07dbeb81 100644
--- a/lib/arm64/gic-v3-its.c
+++ b/lib/arm64/gic-v3-its.c
@@ -54,7 +54,7 @@ static void its_baser_alloc_table(struct its_baser *baser, size_t size)
 	void *reg_addr = gicv3_its_base() + GITS_BASER + baser->index * 8;
 	u64 val = readq(reg_addr);
 
-	baser->table_addr = alloc_pages(order);
+	baser->table_addr = alloc_pages_shared(order);
 
 	val |= virt_to_phys(baser->table_addr) | GITS_BASER_VALID;
 
@@ -70,7 +70,7 @@ static void its_cmd_queue_init(void)
 	unsigned long order = get_order(SZ_64K >> PAGE_SHIFT);
 	u64 cbaser;
 
-	its_data.cmd_base = alloc_pages(order);
+	its_data.cmd_base = alloc_pages_shared(order);
 
 	cbaser = virt_to_phys(its_data.cmd_base) | (SZ_64K / SZ_4K - 1) | GITS_CBASER_VALID;
 
@@ -123,7 +123,7 @@ struct its_device *its_create_device(u32 device_id, int nr_ites)
 	new->nr_ites = nr_ites;
 
 	n = (its_data.typer.ite_size * nr_ites) >> PAGE_SHIFT;
-	new->itt = alloc_pages(get_order(n));
+	new->itt = alloc_pages_shared(get_order(n));
 
 	its_data.nr_devices++;
 	return new;
-- 
2.17.1



* [RFC kvm-unit-tests 19/27] arm: realm: Enable memory encryption
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (17 preceding siblings ...)
  2023-01-27 11:40   ` [RFC kvm-unit-tests 18/27] arm: gic-v3-its: Use shared pages wherever needed Joey Gouly
@ 2023-01-27 11:41   ` Joey Gouly
  2023-01-27 11:41   ` [RFC kvm-unit-tests 20/27] qcbor: Add QCBOR as a submodule Joey Gouly
                     ` (7 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:41 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

From: Suzuki K Poulose <suzuki.poulose@arm.com>

Enable memory encryption support for Realms.

When a page is "decrypted", we set the RIPAS to EMPTY, hinting to the hypervisor
that it could reclaim the page backing the IPA. The page table entry is also
updated with the PTE_NS_SHARED attribute, which in effect turns the IPA into
its unprotected alias.

Similarly for "encryption" we mark the IPA back to RIPAS_RAM and clear the
PTE_NS_SHARED attribute.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 lib/arm/mmu.c      | 65 ++++++++++++++++++++++++++++++++++++++++++++--
 lib/arm64/asm/io.h |  6 +++++
 2 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index 2b5a7141..d4fbe56a 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -22,6 +22,7 @@
 #include <linux/compiler.h>
 
 pgd_t *mmu_idmap;
+unsigned long idmap_end;
 
 /* Used by Realms, depends on IPA size */
 unsigned long prot_ns_shared = 0;
@@ -30,6 +31,11 @@ unsigned long phys_mask_shift = 48;
 /* CPU 0 starts with disabled MMU */
 static cpumask_t mmu_enabled_cpumask;
 
+static bool is_idmap_address(phys_addr_t pa)
+{
+	return pa < idmap_end;
+}
+
 bool mmu_enabled(void)
 {
 	/*
@@ -92,12 +98,17 @@ static pteval_t *get_pte(pgd_t *pgtable, uintptr_t vaddr)
 	return &pte_val(*pte);
 }
 
+static void set_pte(uintptr_t vaddr, pteval_t *p_pte, pteval_t pte)
+{
+	WRITE_ONCE(*p_pte, pte);
+	flush_tlb_page(vaddr);
+}
+
 static pteval_t *install_pte(pgd_t *pgtable, uintptr_t vaddr, pteval_t pte)
 {
 	pteval_t *p_pte = get_pte(pgtable, vaddr);
 
-	WRITE_ONCE(*p_pte, pte);
-	flush_tlb_page(vaddr);
+	set_pte(vaddr, p_pte, pte);
 	return p_pte;
 }
 
@@ -122,6 +133,39 @@ phys_addr_t virt_to_pte_phys(pgd_t *pgtable, void *mem)
 		+ ((ulong)mem & (PAGE_SIZE - 1));
 }
 
+/*
+ * __idmap_set_range_prot - Apply permissions to the given idmap range.
+ */
+static void __idmap_set_range_prot(unsigned long virt_offset, size_t size, pgprot_t prot)
+{
+	pteval_t *ptep;
+	pteval_t default_prot = PTE_TYPE_PAGE | PTE_AF | PTE_SHARED;
+
+	while (size > 0) {
+		pteval_t pte = virt_offset | default_prot | pgprot_val(prot);
+
+		if (!is_idmap_address(virt_offset))
+			break;
+		/* Break before make : Clear the PTE entry first */
+		ptep = install_pte(mmu_idmap, (uintptr_t)virt_offset, 0);
+		/* Now apply the changes */
+		set_pte((uintptr_t)virt_offset, ptep, pte);
+
+		size -= PAGE_SIZE;
+		virt_offset += PAGE_SIZE;
+	}
+}
+
+static void idmap_set_range_shared(unsigned long virt_offset, size_t size)
+{
+	return __idmap_set_range_prot(virt_offset, size, __pgprot(PTE_WBWA | PTE_USER | PTE_NS_SHARED));
+}
+
+static void idmap_set_range_protected(unsigned long virt_offset, size_t size)
+{
+	__idmap_set_range_prot(virt_offset, size, __pgprot(PTE_WBWA | PTE_USER));
+}
+
 void mmu_set_range_ptes(pgd_t *pgtable, uintptr_t virt_offset,
 			phys_addr_t phys_start, phys_addr_t phys_end,
 			pgprot_t prot)
@@ -190,6 +234,7 @@ void *setup_mmu(phys_addr_t phys_end, void *unused)
 	}
 
 	mmu_enable(mmu_idmap);
+	idmap_end = phys_end;
 	return mmu_idmap;
 }
 
@@ -278,3 +323,19 @@ void mmu_clear_user(pgd_t *pgtable, unsigned long vaddr)
 		flush_tlb_page(vaddr);
 	}
 }
+
+void set_memory_encrypted(unsigned long va, size_t size)
+{
+	if (is_realm()) {
+		arm_set_memory_protected(__virt_to_phys(va), size);
+		idmap_set_range_protected(va, size);
+	}
+}
+
+void set_memory_decrypted(unsigned long va, size_t size)
+{
+	if (is_realm()) {
+		arm_set_memory_shared(__virt_to_phys(va), size);
+		idmap_set_range_shared(va, size);
+	}
+}
diff --git a/lib/arm64/asm/io.h b/lib/arm64/asm/io.h
index be19f471..3f71254d 100644
--- a/lib/arm64/asm/io.h
+++ b/lib/arm64/asm/io.h
@@ -89,6 +89,12 @@ static inline void *phys_to_virt(phys_addr_t x)
 	return (void *)__phys_to_virt(x);
 }
 
+extern void set_memory_decrypted(unsigned long va, size_t size);
+#define set_memory_decrypted		set_memory_decrypted
+
+extern void set_memory_encrypted(unsigned long va, size_t size);
+#define set_memory_encrypted	set_memory_encrypted
+
 #include <asm-generic/io.h>
 
 #endif /* _ASMARM64_IO_H_ */
-- 
2.17.1



* [RFC kvm-unit-tests 20/27] qcbor: Add QCBOR as a submodule
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (18 preceding siblings ...)
  2023-01-27 11:41   ` [RFC kvm-unit-tests 19/27] arm: realm: Enable memory encryption Joey Gouly
@ 2023-01-27 11:41   ` Joey Gouly
  2023-01-27 11:41   ` [RFC kvm-unit-tests 21/27] arm: Add build steps for QCBOR library Joey Gouly
                     ` (6 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:41 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

From: Suzuki K Poulose <suzuki.poulose@arm.com>

Add the QCBOR library as a submodule. It will later be used
for parsing arm64 Realm attestation tokens. The repository is
available at:

	https://github.com/laurencelundblade/QCBOR tag v1.0

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 .gitmodules | 3 +++
 lib/qcbor   | 1 +
 2 files changed, 4 insertions(+)
 create mode 100644 .gitmodules
 create mode 160000 lib/qcbor

diff --git a/.gitmodules b/.gitmodules
new file mode 100644
index 00000000..29fdbc5d
--- /dev/null
+++ b/.gitmodules
@@ -0,0 +1,3 @@
+[submodule "lib/qcbor"]
+	path = lib/qcbor
+	url = https://github.com/laurencelundblade/QCBOR.git
diff --git a/lib/qcbor b/lib/qcbor
new file mode 160000
index 00000000..56b17bf9
--- /dev/null
+++ b/lib/qcbor
@@ -0,0 +1 @@
+Subproject commit 56b17bf9f74096774944bcac0829adcd887d391e
-- 
2.17.1



* [RFC kvm-unit-tests 21/27] arm: Add build steps for QCBOR library
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (19 preceding siblings ...)
  2023-01-27 11:41   ` [RFC kvm-unit-tests 20/27] qcbor: Add QCBOR as a submodule Joey Gouly
@ 2023-01-27 11:41   ` Joey Gouly
  2023-01-27 11:41   ` [RFC kvm-unit-tests 22/27] arm: Add a library to verify tokens using the " Joey Gouly
                     ` (5 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:41 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

From: Suzuki K Poulose <suzuki.poulose@arm.com>

The QCBOR library will be used for Realm attestation.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 arm/Makefile.arm64 | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arm/Makefile.arm64 b/arm/Makefile.arm64
index 8448af36..8d450de9 100644
--- a/arm/Makefile.arm64
+++ b/arm/Makefile.arm64
@@ -9,6 +9,8 @@ ldarch = elf64-littleaarch64
 arch_LDFLAGS = -pie -n
 arch_LDFLAGS += -z notext
 CFLAGS += -mstrict-align
+CFLAGS += -I $(SRCDIR)/lib/qcbor/inc
+CFLAGS += -DQCBOR_DISABLE_FLOAT_HW_USE -DQCBOR_DISABLE_PREFERRED_FLOAT -DUSEFULBUF_DISABLE_ALL_FLOAT
 
 mno_outline_atomics := $(call cc-option, -mno-outline-atomics, "")
 CFLAGS += $(mno_outline_atomics)
@@ -25,6 +27,7 @@ cflatobjs += lib/arm64/processor.o
 cflatobjs += lib/arm64/spinlock.o
 cflatobjs += lib/arm64/gic-v3-its.o lib/arm64/gic-v3-its-cmd.o
 cflatobjs += lib/arm64/rsi.o
+cflatobjs += lib/qcbor/src/qcbor_decode.o lib/qcbor/src/UsefulBuf.o
 
 OBJDIRS += lib/arm64
 
@@ -40,4 +43,5 @@ tests += $(TEST_DIR)/realm-sea.flat
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
 arch_clean: arm_clean
-	$(RM) lib/arm64/.*.d
+	$(RM) lib/arm64/.*.d		\
+	      lib/qcbor/src/.*.d
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvm-unit-tests 22/27] arm: Add a library to verify tokens using the QCBOR library
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (20 preceding siblings ...)
  2023-01-27 11:41   ` [RFC kvm-unit-tests 21/27] arm: Add build steps for QCBOR library Joey Gouly
@ 2023-01-27 11:41   ` Joey Gouly
  2023-01-27 11:41   ` [RFC kvm-unit-tests 23/27] arm: realm: add RSI interface for attestation measurements Joey Gouly
                     ` (4 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:41 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel,
	Mate Toth-Pal

From: Mate Toth-Pal <mate.toth-pal@arm.com>

Add a library wrapper around QCBOR for parsing Arm CCA attestation
tokens.

Signed-off-by: Mate Toth-Pal <mate.toth-pal@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 arm/Makefile.arm64                  |   7 +-
 lib/token_verifier/attest_defines.h |  50 +++
 lib/token_verifier/token_dumper.c   | 158 ++++++++
 lib/token_verifier/token_dumper.h   |  15 +
 lib/token_verifier/token_verifier.c | 591 ++++++++++++++++++++++++++++
 lib/token_verifier/token_verifier.h |  77 ++++
 6 files changed, 897 insertions(+), 1 deletion(-)
 create mode 100644 lib/token_verifier/attest_defines.h
 create mode 100644 lib/token_verifier/token_dumper.c
 create mode 100644 lib/token_verifier/token_dumper.h
 create mode 100644 lib/token_verifier/token_verifier.c
 create mode 100644 lib/token_verifier/token_verifier.h

diff --git a/arm/Makefile.arm64 b/arm/Makefile.arm64
index 8d450de9..f57d0a95 100644
--- a/arm/Makefile.arm64
+++ b/arm/Makefile.arm64
@@ -11,6 +11,7 @@ arch_LDFLAGS += -z notext
 CFLAGS += -mstrict-align
 CFLAGS += -I $(SRCDIR)/lib/qcbor/inc
 CFLAGS += -DQCBOR_DISABLE_FLOAT_HW_USE -DQCBOR_DISABLE_PREFERRED_FLOAT -DUSEFULBUF_DISABLE_ALL_FLOAT
+CFLAGS += -I $(SRCDIR)/lib/token_verifier
 
 mno_outline_atomics := $(call cc-option, -mno-outline-atomics, "")
 CFLAGS += $(mno_outline_atomics)
@@ -28,6 +29,9 @@ cflatobjs += lib/arm64/spinlock.o
 cflatobjs += lib/arm64/gic-v3-its.o lib/arm64/gic-v3-its-cmd.o
 cflatobjs += lib/arm64/rsi.o
 cflatobjs += lib/qcbor/src/qcbor_decode.o lib/qcbor/src/UsefulBuf.o
+cflatobjs += lib/token_verifier/token_verifier.o
+cflatobjs += lib/token_verifier/token_dumper.o
+
 
 OBJDIRS += lib/arm64
 
@@ -44,4 +48,5 @@ include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
 arch_clean: arm_clean
 	$(RM) lib/arm64/.*.d		\
-	      lib/qcbor/src/.*.d
+	      lib/qcbor/src/.*.d	\
+	      lib/token_verifier/.*.d
diff --git a/lib/token_verifier/attest_defines.h b/lib/token_verifier/attest_defines.h
new file mode 100644
index 00000000..daf51c5f
--- /dev/null
+++ b/lib/token_verifier/attest_defines.h
@@ -0,0 +1,50 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+
+#ifndef __ATTEST_DEFINES_H__
+#define __ATTEST_DEFINES_H__
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define TAG_COSE_SIGN1                       (18)
+#define TAG_CCA_TOKEN                       (399)
+
+#define CCA_PLAT_TOKEN                    (44234)    /* 0xACCA */
+#define CCA_REALM_DELEGATED_TOKEN         (44241)
+
+/* CCA Platform Attestation Token */
+#define CCA_PLAT_CHALLENGE                   (10)    /* EAT nonce */
+#define CCA_PLAT_INSTANCE_ID                (256)    /* EAT ueid */
+#define CCA_PLAT_PROFILE                    (265)    /* EAT profile */
+#define CCA_PLAT_SECURITY_LIFECYCLE        (2395)
+#define CCA_PLAT_IMPLEMENTATION_ID         (2396)
+#define CCA_PLAT_SW_COMPONENTS             (2399)
+#define CCA_PLAT_VERIFICATION_SERVICE      (2400)
+#define CCA_PLAT_CONFIGURATION             (2401)
+#define CCA_PLAT_HASH_ALGO_ID              (2402)
+
+/* CCA Realm Delegated Attestation Token */
+#define CCA_REALM_CHALLENGE                  (10)    /* EAT nonce */
+#define CCA_REALM_PERSONALIZATION_VALUE   (44235)
+#define CCA_REALM_HASH_ALGO_ID            (44236)
+#define CCA_REALM_PUB_KEY                 (44237)
+#define CCA_REALM_INITIAL_MEASUREMENT     (44238)
+#define CCA_REALM_EXTENSIBLE_MEASUREMENTS (44239)
+#define CCA_REALM_PUB_KEY_HASH_ALGO_ID    (44240)
+
+/* Software components */
+#define CCA_SW_COMP_MEASUREMENT_VALUE         (2)
+#define CCA_SW_COMP_VERSION                   (4)
+#define CCA_SW_COMP_SIGNER_ID                 (5)
+#define CCA_SW_COMP_HASH_ALGORITHM            (6)
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __ATTEST_DEFINES_H__ */
diff --git a/lib/token_verifier/token_dumper.c b/lib/token_verifier/token_dumper.c
new file mode 100644
index 00000000..15f17956
--- /dev/null
+++ b/lib/token_verifier/token_dumper.c
@@ -0,0 +1,158 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+
+#include <stdio.h>
+#include <inttypes.h>
+#include "attest_defines.h"
+#include "token_dumper.h"
+
+#define COLUMN_WIDTH "20"
+
+void print_raw_token(const char *token, size_t size)
+{
+	int i;
+	unsigned char byte;
+
+	printf("\r\nCopy paste token to www.cbor.me\r\n");
+	for (i = 0; i < size; ++i) {
+		byte = token[i];
+		/* Print each byte as 0xNN; eight bytes per line, the
+		 * line break is handled below.
+		 */
+		printf("0x%02x ", byte);
+		if (((i + 1) % 8) == 0)
+			printf("\r\n");
+	}
+	printf("\r\n");
+}
+
+static void print_indent(int indent_level)
+{
+	int i;
+
+	for (i = 0; i < indent_level; ++i) {
+		printf("  ");
+	}
+}
+
+static void print_byte_string(const char *name, int index,
+			      struct q_useful_buf_c buf)
+{
+	int i;
+
+	printf("%-"COLUMN_WIDTH"s (#%d) = [", name, index);
+	for (i = 0; i < buf.len; ++i) {
+		printf("%02x", ((uint8_t *)buf.ptr)[i]);
+	}
+	printf("]\r\n");
+}
+
+static void print_text(const char *name, int index, struct q_useful_buf_c buf)
+{
+	int i;
+
+	printf("%-"COLUMN_WIDTH"s (#%d) = \"", name, index);
+	for (i = 0; i < buf.len; ++i) {
+		printf("%c", ((uint8_t *)buf.ptr)[i]);
+	}
+	printf("\"\r\n");
+}
+
+static void print_claim(struct claim_t *claim, int indent_level)
+{
+	print_indent(indent_level);
+	if (claim->present) {
+		switch (claim->type) {
+		case CLAIM_INT64:
+			printf("%-"COLUMN_WIDTH"s (#%" PRId64 ") = %" PRId64
+				"\r\n", claim->title, claim->key,
+				claim->int_data);
+			break;
+		case CLAIM_BOOL:
+			printf("%-"COLUMN_WIDTH"s (#%" PRId64 ") = %s\r\n",
+				claim->title, claim->key,
+				claim->bool_data ? "true" : "false");
+			break;
+		case CLAIM_BSTR:
+			print_byte_string(claim->title, claim->key,
+				claim->buffer_data);
+			break;
+		case CLAIM_TEXT:
+			print_text(claim->title, claim->key,
+				claim->buffer_data);
+			break;
+		default:
+			printf("* Internal error at %s:%d.\r\n", __FILE__,
+				(int)__LINE__);
+			break;
+		}
+	} else {
+		printf("* Missing%s claim with key: %" PRId64 " (%s)\r\n",
+			claim->mandatory ? " mandatory" : "",
+			claim->key, claim->title);
+	}
+}
+
+static void print_cose_sign1_wrapper(const char *token_type,
+				     struct claim_t *cose_sign1_wrapper)
+{
+	printf("\r\n== %s Token cose header:\r\n", token_type);
+	print_claim(cose_sign1_wrapper + 0, 0);
+	/* Don't print wrapped token bytestring */
+	print_claim(cose_sign1_wrapper + 2, 0);
+	printf("== End of %s Token cose header\r\n\r\n", token_type);
+}
+
+void print_token(struct attestation_claims *claims)
+{
+	int i;
+
+	print_cose_sign1_wrapper("Realm", claims->realm_cose_sign1_wrapper);
+
+	printf("\r\n== Realm Token:\r\n");
+	/* Print the ordinary realm claims; the realm extensible
+	 * measurements are printed in detail below.
+	 */
+	for (i = 0; i < CLAIM_COUNT_REALM_TOKEN; ++i) {
+		struct claim_t *claim = claims->realm_token_claims + i;
+
+		print_claim(claim, 0);
+	}
+
+	printf("%-"COLUMN_WIDTH"s (#%d)\r\n", "Realm measurements",
+		CCA_REALM_EXTENSIBLE_MEASUREMENTS);
+	for (i = 0; i < CLAIM_COUNT_REALM_EXTENSIBLE_MEASUREMENTS; ++i) {
+		struct claim_t *claim = claims->realm_measurement_claims + i;
+
+		print_claim(claim, 1);
+	}
+	printf("== End of Realm Token.\r\n");
+
+	print_cose_sign1_wrapper("Platform", claims->plat_cose_sign1_wrapper);
+
+	printf("\r\n== Platform Token:\r\n");
+	for (i = 0; i < CLAIM_COUNT_PLATFORM_TOKEN; ++i) {
+		struct claim_t *claim = claims->plat_token_claims + i;
+
+		print_claim(claim, 0);
+	}
+	printf("== End of Platform Token\r\n\r\n");
+
+	printf("\r\n== Platform Token SW components:\r\n");
+
+	for (i = 0; i < MAX_SW_COMPONENT_COUNT; ++i) {
+		struct sw_component_t *component =
+			claims->sw_component_claims + i;
+
+		if (component->present) {
+			printf("  SW component #%d:\r\n", i);
+			for (int j = 0; j < CLAIM_COUNT_SW_COMPONENT; ++j) {
+				print_claim(component->claims + j, 2);
+			}
+		}
+	}
+	printf("== End of Platform Token SW components\r\n\r\n");
+}
diff --git a/lib/token_verifier/token_dumper.h b/lib/token_verifier/token_dumper.h
new file mode 100644
index 00000000..96cc0744
--- /dev/null
+++ b/lib/token_verifier/token_dumper.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+
+#ifndef __TOKEN_DUMPER_H__
+#define __TOKEN_DUMPER_H__
+
+#include "token_verifier.h"
+
+void print_raw_token(const char *token, size_t size);
+void print_token(struct attestation_claims *claims);
+
+#endif /* __TOKEN_DUMPER_H__ */
diff --git a/lib/token_verifier/token_verifier.c b/lib/token_verifier/token_verifier.c
new file mode 100644
index 00000000..ba2a89f6
--- /dev/null
+++ b/lib/token_verifier/token_verifier.c
@@ -0,0 +1,591 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+
+#include <libcflat.h>
+#include <inttypes.h>
+#include <qcbor/qcbor_decode.h>
+#include <qcbor/qcbor_spiffy_decode.h>
+#include "attest_defines.h"
+#include "token_verifier.h"
+#include "token_dumper.h"
+
+#define SHA256_SIZE 32
+#define SHA512_SIZE 64
+
+#define RETURN_ON_DECODE_ERROR(p_context) \
+	do { \
+		QCBORError ret; \
+		ret = QCBORDecode_GetError(p_context); \
+		if (ret != QCBOR_SUCCESS) { \
+			printf("QCBOR decode failed with error at %s:%d." \
+				" err = %d\r\n", \
+				__FILE__, (int)__LINE__, (int)ret); \
+			return TOKEN_VERIFICATION_ERR_QCBOR(ret); \
+		} \
+	} while (0)
+
+static void init_claim(struct claim_t *claim,
+		       bool mandatory, enum claim_data_type type,
+		       int64_t key, const char *title, bool present)
+{
+	claim->mandatory = mandatory;
+	claim->type = type;
+	claim->key = key;
+	claim->title = title;
+	claim->present = present;
+}
+
+static int init_cose_wrapper_claim(struct claim_t *cose_sign1_wrapper)
+{
+	struct claim_t *c;
+
+	/* The cose wrapper looks like the following:
+	 *  - Protected header (bytestring).
+	 *  - Unprotected header: a map that might contain 0 items. It is
+	 *    not in the list below, but is handled directly in the
+	 *    verify_token_cose_sign1_wrapping function.
+	 *  - Payload: the wrapped token (bytestring). The content is
+	 *    passed on to verify_platform_token or verify_realm_token.
+	 *  - Signature.
+	 */
+	c = cose_sign1_wrapper;
+	/* This structure is in an array, so the key is not used */
+	init_claim(c++, true, CLAIM_BSTR, 0, "Protected header",  false);
+	init_claim(c++, true, CLAIM_BSTR, 0, "Platform token payload", false);
+	init_claim(c++, true, CLAIM_BSTR, 0, "Signature",  false);
+	if (c > cose_sign1_wrapper + CLAIM_COUNT_COSE_SIGN1_WRAPPER) {
+		return TOKEN_VERIFICATION_ERR_INIT_ERROR;
+	}
+	return 0;
+}
+
+static int init_claims(struct attestation_claims *attest_claims)
+{
+	int i;
+	int ret;
+	struct claim_t *c;
+	/* TODO: All the buffer overwrite checks are happening too late.
+	 * Either remove, or find a better way.
+	 */
+	c = attest_claims->realm_token_claims;
+	init_claim(c++, true, CLAIM_BSTR, CCA_REALM_CHALLENGE,             "Realm challenge",                false);
+	init_claim(c++, true, CLAIM_BSTR, CCA_REALM_PERSONALIZATION_VALUE, "Realm personalization value",    false);
+	init_claim(c++, true, CLAIM_TEXT, CCA_REALM_HASH_ALGO_ID,          "Realm hash algo id",             false);
+	init_claim(c++, true, CLAIM_TEXT, CCA_REALM_PUB_KEY_HASH_ALGO_ID,  "Realm public key hash algo id",  false);
+	init_claim(c++, true, CLAIM_BSTR, CCA_REALM_PUB_KEY,               "Realm signing public key",       false);
+	init_claim(c++, true, CLAIM_BSTR, CCA_REALM_INITIAL_MEASUREMENT,   "Realm initial measurement",      false);
+	/* Realm extensible measurements are not present here as they are
+	 * encoded as a CBOR array, and it is handled specially in
+	 * verify_realm_token().
+	 */
+	if (c > attest_claims->realm_token_claims + CLAIM_COUNT_REALM_TOKEN) {
+		return TOKEN_VERIFICATION_ERR_INIT_ERROR;
+	}
+
+	ret = init_cose_wrapper_claim(attest_claims->realm_cose_sign1_wrapper);
+	if (ret != 0) {
+		return ret;
+	}
+	ret = init_cose_wrapper_claim(attest_claims->plat_cose_sign1_wrapper);
+	if (ret != 0) {
+		return ret;
+	}
+
+	c = attest_claims->plat_token_claims;
+	init_claim(c++, true,  CLAIM_BSTR,  CCA_PLAT_CHALLENGE,            "Challenge",            false);
+	init_claim(c++, false, CLAIM_TEXT,  CCA_PLAT_VERIFICATION_SERVICE, "Verification service", false);
+	init_claim(c++, true,  CLAIM_TEXT,  CCA_PLAT_PROFILE,              "Profile",              false);
+	init_claim(c++, true,  CLAIM_BSTR,  CCA_PLAT_INSTANCE_ID,          "Instance ID",          false);
+	init_claim(c++, true,  CLAIM_BSTR,  CCA_PLAT_IMPLEMENTATION_ID,    "Implementation ID",    false);
+	init_claim(c++, true,  CLAIM_INT64, CCA_PLAT_SECURITY_LIFECYCLE,   "Lifecycle",            false);
+	init_claim(c++, true,  CLAIM_BSTR,  CCA_PLAT_CONFIGURATION,        "Configuration",        false);
+	init_claim(c++, true,  CLAIM_TEXT,  CCA_PLAT_HASH_ALGO_ID,         "Platform hash algo",   false);
+	if (c > attest_claims->plat_token_claims +
+		CLAIM_COUNT_PLATFORM_TOKEN) {
+		return TOKEN_VERIFICATION_ERR_INIT_ERROR;
+	}
+
+	for (i = 0; i < CLAIM_COUNT_REALM_EXTENSIBLE_MEASUREMENTS; ++i) {
+		c = attest_claims->realm_measurement_claims + i;
+		init_claim(c, true, CLAIM_BSTR, i,
+			"Realm extensible measurements", false);
+	}
+
+	for (i = 0; i < MAX_SW_COMPONENT_COUNT; ++i) {
+		struct sw_component_t *component =
+			attest_claims->sw_component_claims + i;
+
+		component->present = false;
+		c = component->claims;
+		init_claim(c++, false, CLAIM_TEXT, CCA_SW_COMP_HASH_ALGORITHM,    "Hash algo.",  false);
+		init_claim(c++, true,  CLAIM_BSTR, CCA_SW_COMP_MEASUREMENT_VALUE, "Meas. val.", false);
+		init_claim(c++, false, CLAIM_TEXT, CCA_SW_COMP_VERSION,           "Version",    false);
+		init_claim(c++, true,  CLAIM_BSTR, CCA_SW_COMP_SIGNER_ID,         "Signer ID",  false);
+		if (c > component->claims + CLAIM_COUNT_SW_COMPONENT) {
+			return TOKEN_VERIFICATION_ERR_INIT_ERROR;
+		}
+	}
+	return TOKEN_VERIFICATION_ERR_SUCCESS;
+}
+
+static int handle_claim_decode_error(const struct claim_t *claim,
+				     QCBORError err)
+{
+	if (err == QCBOR_ERR_LABEL_NOT_FOUND) {
+		if (claim->mandatory) {
+			printf("Mandatory claim with key %" PRId64 " (%s) is "
+				"missing from token.\r\n", claim->key,
+				claim->title);
+			return TOKEN_VERIFICATION_ERR_MISSING_MANDATORY_CLAIM;
+		}
+	} else {
+		printf("Decode failed with error at %s:%d. err = %d key = %"
+			PRId64 " (%s).\r\n",  __FILE__, (int)__LINE__, err,
+			claim->key, claim->title);
+		return TOKEN_VERIFICATION_ERR_QCBOR(err);
+	}
+	return TOKEN_VERIFICATION_ERR_SUCCESS;
+}
+
+/* Consume claims from a map.
+ *
+ * This function iterates on the array 'claims', and looks up items with the
+ * specified keys. If a claim flagged as mandatory is not found, an error is
+ * returned. The function doesn't check for extra items, so if the map contains
+ * items whose keys are not in the claims array, no error is reported.
+ *
+ * The map needs to be 'entered' before calling this function, and be 'exited'
+ * after it returns.
+ */
+static int get_claims_from_map(QCBORDecodeContext *p_context,
+			       struct claim_t *claims,
+			       size_t num_of_claims)
+{
+	QCBORError err;
+	int token_verification_error;
+	int i;
+
+	for (i = 0; i < num_of_claims; ++i) {
+		struct claim_t *claim = claims + i;
+
+		switch (claim->type) {
+		case CLAIM_INT64:
+			QCBORDecode_GetInt64InMapN(p_context, claim->key,
+				&(claim->int_data));
+			break;
+		case CLAIM_BOOL:
+			QCBORDecode_GetBoolInMapN(p_context, claim->key,
+				&(claim->bool_data));
+			break;
+		case CLAIM_BSTR:
+			QCBORDecode_GetByteStringInMapN(p_context, claim->key,
+				&(claim->buffer_data));
+			break;
+		case CLAIM_TEXT:
+			QCBORDecode_GetTextStringInMapN(p_context, claim->key,
+				&(claim->buffer_data));
+			break;
+		default:
+			printf("Internal error at %s:%d.\r\n",
+				__FILE__, (int)__LINE__);
+			return TOKEN_VERIFICATION_ERR_INTERNAL_ERROR;
+		}
+		err = QCBORDecode_GetAndResetError(p_context);
+		if (err == QCBOR_SUCCESS) {
+			claim->present = true;
+		} else {
+			token_verification_error =
+				handle_claim_decode_error(claim, err);
+			if (token_verification_error !=
+				TOKEN_VERIFICATION_ERR_SUCCESS) {
+				return token_verification_error;
+			}
+		}
+	}
+	return TOKEN_VERIFICATION_ERR_SUCCESS;
+}
+
+/* Consume a single claim from an array and from the top level.
+ *
+ * The claim's 'key' and 'mandatory' attributes are not used in this function.
+ * The claim is considered mandatory.
+ */
+static int get_claim(QCBORDecodeContext *p_context, struct claim_t *claim)
+{
+	QCBORError err;
+
+	switch (claim->type) {
+	case CLAIM_INT64:
+		QCBORDecode_GetInt64(p_context, &(claim->int_data));
+		break;
+	case CLAIM_BOOL:
+		QCBORDecode_GetBool(p_context, &(claim->bool_data));
+		break;
+	case CLAIM_BSTR:
+		QCBORDecode_GetByteString(p_context, &(claim->buffer_data));
+		break;
+	case CLAIM_TEXT:
+		QCBORDecode_GetTextString(p_context, &(claim->buffer_data));
+		break;
+	default:
+		printf("Internal error at %s:%d.\r\n",
+			__FILE__, (int)__LINE__);
+		return TOKEN_VERIFICATION_ERR_INTERNAL_ERROR;
+	}
+	err = QCBORDecode_GetAndResetError(p_context);
+	if (err == QCBOR_SUCCESS) {
+		claim->present = true;
+		return TOKEN_VERIFICATION_ERR_SUCCESS;
+	}
+	printf("Decode failed with error at %s:%d. err = %d claim: \"%s\".\r\n",
+		__FILE__, (int)__LINE__, err, claim->title);
+	return TOKEN_VERIFICATION_ERR_QCBOR(err);
+}
+
+/* Consume claims from an array and from the top level.
+ *
+ * This function iterates on the array 'claims', and gets an item for each
+ * element. If the CBOR input runs out of items before reaching the
+ * end of the 'claims' array, an error is returned.
+ *
+ * The claim's 'key' and 'mandatory' attributes are not used in this function.
+ * All the elements are considered mandatory.
+ */
+static int get_claims(QCBORDecodeContext *p_context, struct claim_t *claims,
+		      size_t num_of_claims)
+{
+	QCBORError err;
+	int i;
+
+	for (i = 0; i < num_of_claims; ++i) {
+		struct claim_t *claim = claims + i;
+
+		err = get_claim(p_context, claim);
+		if (err != TOKEN_VERIFICATION_ERR_SUCCESS) {
+			return err;
+		}
+	}
+	return TOKEN_VERIFICATION_ERR_SUCCESS;
+}
+
+static int verify_platform_token(struct q_useful_buf_c buf,
+				 struct attestation_claims *attest_claims)
+{
+	QCBORDecodeContext context;
+	int err;
+	int label, index;
+
+	QCBORDecode_Init(&context, buf, QCBOR_DECODE_MODE_NORMAL);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	QCBORDecode_EnterMap(&context, NULL);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	err = get_claims_from_map(&context,
+		attest_claims->plat_token_claims,
+		CLAIM_COUNT_PLATFORM_TOKEN);
+	if (err != TOKEN_VERIFICATION_ERR_SUCCESS) {
+		return err;
+	}
+
+	label = CCA_PLAT_SW_COMPONENTS;
+	QCBORDecode_EnterArrayFromMapN(&context, label);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	index = 0;
+	while (1) {
+		QCBORDecode_EnterMap(&context, NULL);
+		if (QCBORDecode_GetError(&context) == QCBOR_ERR_NO_MORE_ITEMS) {
+			/* This is OK. We just reached the end of the array.
+			 * Break from the loop.
+			 */
+			break;
+		}
+
+		if (index >= MAX_SW_COMPONENT_COUNT) {
+			printf("Not enough slots in sw_component_claims.\r\n");
+			printf("Increase MAX_SW_COMPONENT_COUNT in %s.\r\n",
+				__FILE__);
+			return TOKEN_VERIFICATION_ERR_INTERNAL_ERROR;
+		}
+
+		err = get_claims_from_map(&context,
+			attest_claims->sw_component_claims[index].claims,
+			CLAIM_COUNT_SW_COMPONENT);
+		if (err != TOKEN_VERIFICATION_ERR_SUCCESS) {
+			return err;
+		}
+		attest_claims->sw_component_claims[index].present = true;
+
+		QCBORDecode_ExitMap(&context);
+		RETURN_ON_DECODE_ERROR(&context);
+
+		++index;
+	}
+	/* We only get here if the decode error code was a
+	 * QCBOR_ERR_NO_MORE_ITEMS which is expected when the end of an array is
+	 * reached. In this case the processing must be continued, so clear the
+	 * error.
+	 */
+	QCBORDecode_GetAndResetError(&context);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	QCBORDecode_ExitArray(&context);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	QCBORDecode_ExitMap(&context);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	QCBORDecode_Finish(&context);
+
+	return TOKEN_VERIFICATION_ERR_SUCCESS;
+}
+
+static bool verify_length_of_measurement(size_t len)
+{
+	size_t allowed_lengths[] = {SHA256_SIZE, SHA512_SIZE};
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(allowed_lengths); ++i) {
+		if (len == allowed_lengths[i])
+			return true;
+	}
+
+	return false;
+}
+
+static int verify_realm_token(struct q_useful_buf_c buf,
+			     struct attestation_claims *attest_claims)
+{
+	QCBORDecodeContext context;
+	int err;
+	int i;
+
+	QCBORDecode_Init(&context, buf, QCBOR_DECODE_MODE_NORMAL);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	QCBORDecode_EnterMap(&context, NULL);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	err = get_claims_from_map(&context, attest_claims->realm_token_claims,
+		CLAIM_COUNT_REALM_TOKEN);
+	if (err != TOKEN_VERIFICATION_ERR_SUCCESS) {
+		return err;
+	}
+
+	/* Now get the realm extensible measurements */
+	QCBORDecode_EnterArrayFromMapN(&context,
+					CCA_REALM_EXTENSIBLE_MEASUREMENTS);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	err = get_claims(&context,
+		attest_claims->realm_measurement_claims,
+		CLAIM_COUNT_REALM_EXTENSIBLE_MEASUREMENTS);
+	if (err != TOKEN_VERIFICATION_ERR_SUCCESS) {
+		return err;
+	}
+
+	for (i = 0; i < CLAIM_COUNT_REALM_EXTENSIBLE_MEASUREMENTS; ++i) {
+		struct claim_t *claims =
+			attest_claims->realm_measurement_claims;
+		struct q_useful_buf_c buf = claims[i].buffer_data;
+
+		if (!verify_length_of_measurement(buf.len)) {
+			return TOKEN_VERIFICATION_ERR_INVALID_CLAIM_LEN;
+		}
+	}
+
+	QCBORDecode_ExitArray(&context);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	QCBORDecode_ExitMap(&context);
+	QCBORDecode_Finish(&context);
+
+	return TOKEN_VERIFICATION_ERR_SUCCESS;
+}
+
+/* Returns a pointer to the wrapped token in 'token_payload' and the
+ * claims of the wrapper in 'cose_sign1_wrapper'.
+ */
+static int verify_token_cose_sign1_wrapping(
+				  struct q_useful_buf_c token,
+				  struct q_useful_buf_c *token_payload,
+				  struct claim_t *cose_sign1_wrapper)
+{
+	QCBORDecodeContext context;
+	QCBORItem item;
+	int err;
+
+	QCBORDecode_Init(&context, token, QCBOR_DECODE_MODE_NORMAL);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	/* Check COSE tag. */
+	QCBORDecode_PeekNext(&context, &item);
+	if (!QCBORDecode_IsTagged(&context, &item,
+		TAG_COSE_SIGN1)) {
+		return TOKEN_VERIFICATION_ERR_INVALID_COSE_TAG;
+	}
+
+	QCBORDecode_EnterArray(&context, NULL);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	/* Protected header */
+	err = get_claim(&context, cose_sign1_wrapper);
+	if (err != TOKEN_VERIFICATION_ERR_SUCCESS) {
+		return err;
+	}
+
+	/* Unprotected header. The map is always present, but may contain 0
+	 * items.
+	 */
+	QCBORDecode_EnterMap(&context, NULL);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	/* Skip the content for now. */
+
+	QCBORDecode_ExitMap(&context);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	/* Payload */
+	err = get_claim(&context, cose_sign1_wrapper + 1);
+	if (err != TOKEN_VERIFICATION_ERR_SUCCESS) {
+		return err;
+	}
+
+	/* Signature */
+	err = get_claim(&context, cose_sign1_wrapper + 2);
+	if (err != TOKEN_VERIFICATION_ERR_SUCCESS) {
+		return err;
+	}
+
+	QCBORDecode_ExitArray(&context);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	*token_payload = cose_sign1_wrapper[1].buffer_data;
+
+	return TOKEN_VERIFICATION_ERR_SUCCESS;
+}
+
+static int verify_cca_token(struct q_useful_buf_c  token,
+			    struct q_useful_buf_c *platform_token,
+			    struct q_useful_buf_c *realm_token)
+{
+	QCBORDecodeContext context;
+	QCBORItem item;
+	QCBORError err;
+
+	QCBORDecode_Init(&context, token, QCBOR_DECODE_MODE_NORMAL);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	/* ================== Check CCA_TOKEN tag =========================== */
+	QCBORDecode_PeekNext(&context, &item);
+	if (!QCBORDecode_IsTagged(&context, &item, TAG_CCA_TOKEN)) {
+		return TOKEN_VERIFICATION_ERR_INVALID_COSE_TAG;
+	}
+
+	/* ================== Get the platform token ======================== */
+	QCBORDecode_EnterMap(&context, NULL);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	/*
+	 * The first element is the CCA platform token, which is a
+	 * COSE_Sign1_Tagged object wrapped in a byte string.
+	 */
+	QCBORDecode_GetByteStringInMapN(&context, CCA_PLAT_TOKEN,
+					platform_token);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	/* ================== Get the realm token =========================== */
+	/*
+	 * The second element is the delegated realm token, which is a
+	 * COSE_Sign1_Tagged object wrapped in a byte string.
+	 */
+	QCBORDecode_GetByteStringInMapN(&context, CCA_REALM_DELEGATED_TOKEN,
+					realm_token);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	QCBORDecode_ExitMap(&context);
+	RETURN_ON_DECODE_ERROR(&context);
+
+	/* Finish decoding the top-level wrapper */
+	err = QCBORDecode_Finish(&context);
+	if (err != QCBOR_SUCCESS) {
+		printf("QCBOR decode failed with error at %s:%d. err = %d\r\n",
+			__FILE__, (int)__LINE__, (int)err);
+		return TOKEN_VERIFICATION_ERR_QCBOR(err);
+	}
+
+	return TOKEN_VERIFICATION_ERR_SUCCESS;
+}
+
+/*
+ * This function expects two COSE_Sign1_Tagged objects wrapped in a tagged map:
+ *
+ * cca-token = #6.44234(cca-token-map) ; 44234 = 0xACCA
+ *
+ * cca-platform-token = COSE_Sign1_Tagged
+ * cca-realm-delegated-token = COSE_Sign1_Tagged
+ *
+ * cca-token-map = {
+ *   0 => cca-platform-token
+ *   1 => cca-realm-delegated-token
+ * }
+ *
+ * COSE_Sign1_Tagged = #6.18(COSE_Sign1)
+ */
+int verify_token(const char *token, size_t size,
+		 struct attestation_claims *attest_claims)
+{
+	/* TODO: do signature check */
+	/* TODO: Add tag check on tokens */
+	struct q_useful_buf_c buf = {token, size};
+	int ret;
+	struct q_useful_buf_c realm_token;
+	struct q_useful_buf_c realm_token_payload;
+	struct q_useful_buf_c platform_token;
+	struct q_useful_buf_c platform_token_payload;
+
+	ret = init_claims(attest_claims);
+	if (ret != TOKEN_VERIFICATION_ERR_SUCCESS) {
+		return ret;
+	}
+
+	/* Verify top-level token map and extract the two sub-tokens */
+	ret = verify_cca_token(buf, &platform_token, &realm_token);
+	if (ret != TOKEN_VERIFICATION_ERR_SUCCESS) {
+		return ret;
+	}
+
+	/* Verify the COSE_Sign1 wrapper of the realm token */
+	ret = verify_token_cose_sign1_wrapping(realm_token,
+		&realm_token_payload,
+		attest_claims->realm_cose_sign1_wrapper);
+	if (ret != TOKEN_VERIFICATION_ERR_SUCCESS) {
+		return ret;
+	}
+	/* Verify the payload of the realm token */
+	ret = verify_realm_token(realm_token_payload, attest_claims);
+	if (ret != TOKEN_VERIFICATION_ERR_SUCCESS) {
+		return ret;
+	}
+
+	/* Verify the COSE_Sign1 wrapper of the platform token */
+	ret = verify_token_cose_sign1_wrapping(platform_token,
+		&platform_token_payload,
+		attest_claims->plat_cose_sign1_wrapper);
+	if (ret != TOKEN_VERIFICATION_ERR_SUCCESS) {
+		return ret;
+	}
+	/* Verify the payload of the platform token */
+	ret = verify_platform_token(platform_token_payload, attest_claims);
+	if (ret != TOKEN_VERIFICATION_ERR_SUCCESS) {
+		return ret;
+	}
+
+	return TOKEN_VERIFICATION_ERR_SUCCESS;
+}
+
diff --git a/lib/token_verifier/token_verifier.h b/lib/token_verifier/token_verifier.h
new file mode 100644
index 00000000..ec3ab9c9
--- /dev/null
+++ b/lib/token_verifier/token_verifier.h
@@ -0,0 +1,77 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+
+#ifndef __TOKEN_VERIFIER_H__
+#define __TOKEN_VERIFIER_H__
+
+#include <qcbor/qcbor_decode.h>
+
+#define TOKEN_VERIFICATION_ERR_SUCCESS                 0
+#define TOKEN_VERIFICATION_ERR_INIT_ERROR              1
+#define TOKEN_VERIFICATION_ERR_MISSING_MANDATORY_CLAIM 2
+#define TOKEN_VERIFICATION_ERR_INVALID_COSE_TAG        3
+#define TOKEN_VERIFICATION_ERR_INVALID_CLAIM_LEN       4
+#define TOKEN_VERIFICATION_ERR_INTERNAL_ERROR          5
+#define TOKEN_VERIFICATION_ERR_QCBOR(qcbor_err)        (1000 + qcbor_err)
+
+/* Number of realm extensible measurements (REM) */
+#define REM_COUNT 4
+
+#define MAX_SW_COMPONENT_COUNT 16
+
+#define CLAIM_COUNT_REALM_TOKEN 6
+#define CLAIM_COUNT_COSE_SIGN1_WRAPPER 3
+#define CLAIM_COUNT_PLATFORM_TOKEN 8
+#define CLAIM_COUNT_REALM_EXTENSIBLE_MEASUREMENTS REM_COUNT
+#define CLAIM_COUNT_SW_COMPONENT 4
+
+/* This tells how the data should be interpreted in the claim_t struct; it is
+ * not necessarily the same as the item's major type in the token.
+ */
+enum claim_data_type {
+	CLAIM_INT64,
+	CLAIM_BOOL,
+	CLAIM_BSTR,
+	CLAIM_TEXT,
+};
+
+struct claim_t {
+	/* 'static' */
+	bool mandatory;
+	enum claim_data_type type;
+	int64_t key;
+	const char *title;
+
+	/* filled during verification */
+	bool present;
+	union {
+		int64_t int_data;
+		bool bool_data;
+		/* Used for text and bytestream as well */
+		/* TODO: Add expected length check as well? */
+		struct q_useful_buf_c buffer_data;
+	};
+};
+
+struct sw_component_t {
+	bool present;
+	struct claim_t claims[CLAIM_COUNT_SW_COMPONENT];
+};
+
+struct attestation_claims {
+	struct claim_t realm_cose_sign1_wrapper[CLAIM_COUNT_COSE_SIGN1_WRAPPER];
+	struct claim_t realm_token_claims[CLAIM_COUNT_REALM_TOKEN];
+	struct claim_t realm_measurement_claims[CLAIM_COUNT_REALM_EXTENSIBLE_MEASUREMENTS];
+	struct claim_t plat_cose_sign1_wrapper[CLAIM_COUNT_COSE_SIGN1_WRAPPER];
+	struct claim_t plat_token_claims[CLAIM_COUNT_PLATFORM_TOKEN];
+	struct sw_component_t sw_component_claims[MAX_SW_COMPONENT_COUNT];
+};
+
+/* Returns TOKEN_VERIFICATION_ERR* */
+int verify_token(const char *token, size_t size,
+	struct attestation_claims *attest_claims);
+
+#endif /* __TOKEN_VERIFIER_H__ */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvm-unit-tests 23/27] arm: realm: add RSI interface for attestation measurements
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (21 preceding siblings ...)
  2023-01-27 11:41   ` [RFC kvm-unit-tests 22/27] arm: Add a library to verify tokens using the " Joey Gouly
@ 2023-01-27 11:41   ` Joey Gouly
  2023-01-27 11:41   ` [RFC kvm-unit-tests 24/27] arm: realm: Add helpers to decode RSI return codes Joey Gouly
                     ` (3 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:41 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

From: Suzuki K Poulose <suzuki.poulose@arm.com>

Add wrappers for the attestation and measurement related RSI calls.
These will later be used in the test cases.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 lib/arm64/asm/rsi.h |  7 +++++++
 lib/arm64/rsi.c     | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/lib/arm64/asm/rsi.h b/lib/arm64/asm/rsi.h
index c8179341..50bab993 100644
--- a/lib/arm64/asm/rsi.h
+++ b/lib/arm64/asm/rsi.h
@@ -27,6 +27,13 @@ int rsi_invoke(unsigned int function_id, unsigned long arg0,
 	       struct smccc_result *result);
 
 int rsi_get_version(void);
+void rsi_attest_token_init(phys_addr_t addr, unsigned long *challenge,
+			   struct smccc_result *res);
+void rsi_attest_token_continue(phys_addr_t addr, struct smccc_result *res);
+void rsi_extend_measurement(unsigned int index, unsigned long size,
+			    unsigned long *measurement,
+			    struct smccc_result *res);
+void rsi_read_measurement(unsigned int index, struct smccc_result *res);
 
 static inline bool is_realm(void)
 {
diff --git a/lib/arm64/rsi.c b/lib/arm64/rsi.c
index 08c77889..63d0620a 100644
--- a/lib/arm64/rsi.c
+++ b/lib/arm64/rsi.c
@@ -66,6 +66,38 @@ void arm_rsi_init(void)
 	prot_ns_shared = (1UL << phys_mask_shift);
 }
 
+void rsi_attest_token_init(phys_addr_t addr, unsigned long *challenge,
+			   struct smccc_result *res)
+{
+	rsi_invoke(SMC_RSI_ATTEST_TOKEN_INIT, addr,
+		   challenge[0], challenge[1], challenge[2],
+		   challenge[3], challenge[4], challenge[5],
+		   challenge[6], challenge[7], 0, 0, res);
+}
+
+void rsi_attest_token_continue(phys_addr_t addr, struct smccc_result *res)
+{
+	rsi_invoke(SMC_RSI_ATTEST_TOKEN_CONTINUE, addr,
+		   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, res);
+}
+
+void rsi_extend_measurement(unsigned int index, unsigned long size,
+			    unsigned long *measurement, struct smccc_result *res)
+{
+	rsi_invoke(SMC_RSI_MEASUREMENT_EXTEND, index, size,
+		   measurement[0], measurement[1],
+		   measurement[2], measurement[3],
+		   measurement[4], measurement[5],
+		   measurement[6], measurement[7],
+		   0, res);
+}
+
+void rsi_read_measurement(unsigned int index, struct smccc_result *res)
+{
+	rsi_invoke(SMC_RSI_MEASUREMENT_READ, index, 0,
+		   0, 0, 0, 0, 0, 0, 0, 0, 0, res);
+}
+
 static unsigned rsi_set_addr_range_state(unsigned long start, unsigned long size,
 					 enum ripas_t state, unsigned long *top)
 {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvm-unit-tests 24/27] arm: realm: Add helpers to decode RSI return codes
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (22 preceding siblings ...)
  2023-01-27 11:41   ` [RFC kvm-unit-tests 23/27] arm: realm: add RSI interface for attestation measurements Joey Gouly
@ 2023-01-27 11:41   ` Joey Gouly
  2023-01-27 11:41   ` [RFC kvm-unit-tests 25/27] arm: realm: Add Realm attestation tests Joey Gouly
                     ` (2 subsequent siblings)
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:41 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

From: Suzuki K Poulose <suzuki.poulose@arm.com>

The RMM encodes an error code and an index in the result of an
operation. Add helpers to decode this information for use with the
attestation tests.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 lib/arm64/asm/rsi.h | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/lib/arm64/asm/rsi.h b/lib/arm64/asm/rsi.h
index 50bab993..1d01a929 100644
--- a/lib/arm64/asm/rsi.h
+++ b/lib/arm64/asm/rsi.h
@@ -16,6 +16,39 @@
 
 extern bool rsi_present;
 
+/*
+ * Logical representation of return code returned by RMM commands.
+ * Each failure mode of a given command should return a unique return code, so
+ * that the caller can use it to unambiguously identify the failure mode.  To
+ * avoid having a very large list of enumerated values, the return code is
+ * composed of a status which identifies the category of the error (for example,
+ * an address was misaligned), and an index which disambiguates between multiple
+ * similar failure modes (for example, a command may take multiple addresses as
+ * its input; the index identifies _which_ of them was misaligned.)
+ */
+typedef unsigned int status_t;
+typedef struct {
+	status_t status;
+	unsigned int index;
+} return_code_t;
+
+/*
+ * Convenience function for creating a return_code_t.
+ */
+static inline return_code_t make_return_code(unsigned int status,
+					     unsigned int index)
+{
+	return (return_code_t) {status, index};
+}
+
+/*
+ * Unpacks a return code.
+ */
+static inline return_code_t unpack_return_code(unsigned long error_code)
+{
+	return make_return_code(error_code & 0xff, error_code >> 8);
+}
+
 void arm_rsi_init(void);
 
 int rsi_invoke(unsigned int function_id, unsigned long arg0,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvm-unit-tests 25/27] arm: realm: Add Realm attestation tests
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (23 preceding siblings ...)
  2023-01-27 11:41   ` [RFC kvm-unit-tests 24/27] arm: realm: Add helpers to decode RSI return codes Joey Gouly
@ 2023-01-27 11:41   ` Joey Gouly
  2023-01-27 11:41   ` [RFC kvm-unit-tests 26/27] arm: realm: Add a test for shared memory Joey Gouly
  2023-01-27 11:41   ` [RFC kvm-unit-tests 27/27] NOT-FOR-MERGING: add run-realm-tests Joey Gouly
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:41 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel,
	Mate Toth-Pal

From: Mate Toth-Pal <mate.toth-pal@arm.com>

Add tests for the attestation and measurement related RSI calls.

Signed-off-by: Mate Toth-Pal <mate.toth-pal@arm.com>
Co-developed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[ Rewrote the test cases, keeping the core testing data/logic ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 arm/Makefile.arm64 |    1 +
 arm/realm-attest.c | 1125 ++++++++++++++++++++++++++++++++++++++++++++
 arm/unittests.cfg  |   50 ++
 lib/libcflat.h     |    1 +
 4 files changed, 1177 insertions(+)
 create mode 100644 arm/realm-attest.c

diff --git a/arm/Makefile.arm64 b/arm/Makefile.arm64
index f57d0a95..0a0c4f2c 100644
--- a/arm/Makefile.arm64
+++ b/arm/Makefile.arm64
@@ -41,6 +41,7 @@ tests += $(TEST_DIR)/micro-bench.flat
 tests += $(TEST_DIR)/cache.flat
 tests += $(TEST_DIR)/debug.flat
 tests += $(TEST_DIR)/realm-rsi.flat
+tests += $(TEST_DIR)/realm-attest.flat
 tests += $(TEST_DIR)/realm-fpu.flat
 tests += $(TEST_DIR)/realm-sea.flat
 
diff --git a/arm/realm-attest.c b/arm/realm-attest.c
new file mode 100644
index 00000000..6c357fb5
--- /dev/null
+++ b/arm/realm-attest.c
@@ -0,0 +1,1125 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+#include <libcflat.h>
+
+#include <attest_defines.h>
+#include <alloc.h>
+#include <stdlib.h>
+#include <token_dumper.h>
+#include <token_verifier.h>
+
+#include <asm/io.h>
+#include <asm/page.h>
+#include <asm/rsi.h>
+#include <asm/setup.h>
+#include <asm/smp.h>
+
+
+#define SHA256_SIZE	32
+
+struct challenge {
+	unsigned long words[8];
+};
+
+struct measurement {
+	unsigned long words[8];
+};
+
+static char __attribute__((aligned(SZ_2M))) __attribute__((section(".data")))
+			block_buf_data[SZ_2M * 2];
+
+static char __attribute__((aligned(SZ_2M))) __attribute__((section(".bss")))
+			block_buf_bss[SZ_2M];
+
+static char __attribute__((aligned(SZ_4K))) __attribute__((section(".data")))
+			page_buf_data[SZ_4K];
+
+static char __attribute__((aligned(SZ_4K))) __attribute__((section(".bss")))
+			page_buf_bss[SZ_4K];
+
+/* Page aligned offset within the block mapped buffer */
+#define BLOCK_BUF_OFFSET	(SZ_8K)
+
+static inline void debug_print_raw_token(void *buf, size_t size)
+{
+#ifdef PRINT_RAW_TOKEN
+	print_raw_token(buf, size);
+#endif
+}
+
+static inline void debug_print_token(struct attestation_claims *claim)
+{
+#ifdef PRINT_TOKEN
+	print_token(claim);
+#endif
+}
+
+static bool claims_verify_token(char *token, size_t token_size,
+				struct attestation_claims *claims,
+				bool report_success)
+{
+	int verify_rc = verify_token(token, token_size, claims);
+	int cpu = smp_processor_id();
+
+	if (verify_rc == TOKEN_VERIFICATION_ERR_SUCCESS) {
+		if (report_success)
+			report(true, "CPU%d: Verification of token passed", cpu);
+		return true;
+	}
+
+	report(false,
+	       "CPU%d: Verification of token failed with error code %d",
+	       cpu, verify_rc);
+
+	return false;
+}
+
+static inline void attest_token_init(phys_addr_t addr,
+				     struct challenge *ch,
+				     struct smccc_result *res)
+{
+	rsi_attest_token_init(addr, &ch->words[0], res);
+}
+
+
+static inline void attest_token_continue(phys_addr_t addr,
+					 struct smccc_result *res)
+{
+	rsi_attest_token_continue(addr, res);
+}
+
+static inline void attest_token_complete(phys_addr_t addr,
+					 struct smccc_result *res)
+{
+	do {
+		attest_token_continue(addr, res);
+	} while (res->r0 == RSI_INCOMPLETE);
+}
+
+static void get_attest_token(phys_addr_t ipa,
+			     struct challenge *ch,
+			     struct smccc_result *res)
+{
+	attest_token_init(ipa, ch, res);
+
+	if (res->r0)
+		return;
+	attest_token_complete(ipa, res);
+}
+
+/*
+ * __get_attest_token_claims: Get an attestation token and verify the claims.
+ * If @claims is not NULL, the token is parsed and @claims is populated.
+ * All failures are reported. Success is only reported if @report_success is
+ * true.
+ * Returns whether the calls and the verification succeed.
+ */
+static bool __get_attest_token_claims(void *buf, struct challenge *ch,
+				      struct attestation_claims *claims,
+				      size_t *token_size, bool report_success)
+{
+	struct smccc_result result;
+	struct attestation_claims local_claims;
+	struct attestation_claims *claimsp;
+	bool rc = false;
+
+	/* Use the local_claims if claims is not supplied */
+	claimsp = claims ? : &local_claims;
+
+	get_attest_token(virt_to_phys(buf), ch, &result);
+	if (result.r0) {
+		report(false, "Get attestation token failed with %ld", result.r0);
+		return rc;
+	}
+
+	if (report_success)
+		report(true, "Get attestation token");
+
+	/* Update token_size if necessary */
+	if (token_size)
+		*token_size = result.r1;
+
+	return claims_verify_token(buf, result.r1, claimsp, report_success);
+}
+
+static bool get_attest_token_claims(void *buf, struct challenge *ch,
+				    struct attestation_claims *claims,
+				    size_t *token_size)
+{
+	return __get_attest_token_claims(buf, ch, claims, token_size, false);
+}
+
+static void get_verify_attest_token(void *buf, struct challenge *ch,
+				    const char *desc)
+{
+	report_prefix_push(desc);
+	__get_attest_token_claims(buf, ch, NULL, NULL, true);
+	report_prefix_pop();
+}
+
+static void get_verify_attest_token_verbose(void *buf,
+					    struct challenge *ch,
+					    const char *desc)
+{
+	size_t token_size;
+	struct attestation_claims claims;
+
+	report_prefix_push(desc);
+	if (__get_attest_token_claims(buf, ch, &claims, &token_size, true)) {
+		debug_print_raw_token(buf, token_size);
+		debug_print_token(&claims);
+	}
+	report_prefix_pop();
+}
+
+static void test_get_attest_token(void)
+{
+	char stack_buf[SZ_4K]__attribute__((aligned(SZ_4K)));
+	char *heap_buf;
+	struct challenge ch;
+
+	memset(&ch, 0xAB, sizeof(ch));
+
+	/* Heap buffer */
+	heap_buf = memalign(SZ_4K, SZ_4K);
+	if (heap_buf) {
+		get_verify_attest_token(heap_buf, &ch, "heap buffer");
+		free(heap_buf);
+	} else {
+		report_skip("heap buffer: Failed to allocate");
+	}
+
+	/* Stack buffer */
+	get_verify_attest_token(stack_buf, &ch, "stack buffer");
+	/* Page aligned buffer .data segment */
+	get_verify_attest_token(page_buf_data, &ch, ".data segment buffer");
+	/* Page aligned buffer .bss segment */
+	get_verify_attest_token(page_buf_bss, &ch, ".bss segment buffer");
+	/* Block mapped buffer in .data segment */
+	get_verify_attest_token(&block_buf_data[BLOCK_BUF_OFFSET], &ch,
+				"block mapped .data segment buffer");
+	/* Block mapped buffer in .bss segment */
+	get_verify_attest_token_verbose(&block_buf_bss[BLOCK_BUF_OFFSET],
+					 &ch, "block mapped .bss segment buffer");
+}
+
+static void get_attest_token_check_fail(phys_addr_t ipa,
+					struct challenge *ch,
+					return_code_t exp,
+					const char *buf_desc)
+{
+	struct smccc_result result;
+	return_code_t rc;
+
+	report_prefix_push(buf_desc);
+	get_attest_token(ipa, ch, &result);
+	rc = unpack_return_code(result.r0);
+	if (rc.status != exp.status) {
+		report(false, "Get attestation token "
+			      "status (%d), expected (%d)",
+			      rc.status, exp.status);
+	} else {
+		report(true, "Get attestation token fails as expected");
+	}
+	report_prefix_pop();
+}
+
+static void test_get_attest_token_bad_input(void)
+{
+	struct challenge ch;
+	return_code_t exp;
+
+	memset(page_buf_data, 0, sizeof(page_buf_data));
+	memset(&ch, 0xAB, sizeof(ch));
+	exp = make_return_code(RSI_ERROR_INPUT, 0);
+	get_attest_token_check_fail(virt_to_phys(page_buf_data + 0x100),
+				    &ch, exp, "unaligned buffer");
+	get_attest_token_check_fail(__phys_end + SZ_512M, &ch, exp,
+				    "buffer outside PAR");
+}
+
+static void test_get_attest_token_abi_misuse(void)
+{
+	struct smccc_result result;
+	struct challenge ch;
+	phys_addr_t ipa = virt_to_phys(page_buf_data);
+	return_code_t rc;
+
+	memset(&ch, 0xAB, sizeof(ch));
+
+	/*
+	 * Test case 1 - Missing call to RSI_ATTEST_TOKEN_INIT
+	 *
+	 * step1. Execute a successful test to reset the state machine.
+	 */
+	report_prefix_push("miss token init");
+	get_attest_token(ipa, &ch, &result);
+	if (result.r0) {
+		report(false, "Get attestation token failed %ld", result.r0);
+		report_prefix_pop(); /* miss token init */
+		return;
+	}
+	/*
+	 * step2. Execute RSI_ATTEST_TOKEN_CONTINUE without an RSI_ATTEST_TOKEN_INIT.
+	 * 	  Expect an error == RSI_ERROR_STATE
+	 */
+	attest_token_continue(ipa, &result);
+	rc = unpack_return_code(result.r0);
+	if (rc.status != RSI_ERROR_STATE) {
+		report(false, "Unexpected result (%d, %d) vs (%d) expected",
+		       rc.status, rc.index, RSI_ERROR_STATE);
+		report_prefix_pop(); /* miss token init */
+		return;
+	}
+
+	report(true, "Fails as expected");
+	report_prefix_pop(); /* miss token init */
+
+	/*
+	 * Test case 2 - Calling with inconsistent input.
+	 * step1. Issue RSI_ATTEST_TOKEN_INIT.
+	 * step2. Modify the challenge and issue RSI_ATTEST_TOKEN_CONTINUE.
+	 * Test: Expect error == (RSI_ERROR_INPUT, 0)
+	 */
+	report_prefix_push("inconsistent input");
+	attest_token_init(ipa, &ch, &result);
+	rc = unpack_return_code(result.r0);
+	if (result.r0) {
+		report(false, "RSI_ATTEST_TOKEN_INIT failed unexpectedly (%d, %d)",
+		       rc.status, rc.index);
+		report_prefix_pop(); /* inconsistent input */
+		return;
+	}
+
+	/*
+	 * Corrupt the IPA input to ATTEST_TOKEN_CONTINUE.
+	 */
+	attest_token_continue(ipa ^ 0x1UL, &result);
+	rc = unpack_return_code(result.r0);
+	if (rc.status != RSI_ERROR_INPUT) {
+		report(false, "Attest token continue: unexpected result"
+			       " (%d) vs expected (%d)",
+			      rc.status, RSI_ERROR_INPUT);
+	}
+
+	report_prefix_pop(); /* inconsistent input */
+
+	/*
+	 * Test case 3
+	 * step1. Complete the token attestation with proper values.
+	 *        Failures in test case 2 should not affect the completion.
+	 */
+	report_prefix_push("valid input after inconsistent input");
+	attest_token_complete(ipa, &result);
+	rc = unpack_return_code(result.r0);
+	if (result.r0) {
+		report(false, "Attest token continue failed with (%d, %d)",
+			rc.status, rc.index);
+		return;
+	} else {
+		report(true, "Attest token continue complete");
+	}
+	report_prefix_pop(); /* Valid input after inconsistent input */
+}
+
+static void test_get_attest_token_abi_abort_req(void)
+{
+	int i;
+	char *p;
+	size_t size;
+	struct attestation_claims claims;
+	struct smccc_result result;
+	struct challenge ch;
+	char stack_buf[SZ_4K] __attribute__((aligned(SZ_4K))) = { 0 };
+	phys_addr_t addr = virt_to_phys(stack_buf);
+
+	/* Set the initial challenge, which will be aborted */
+	memset(&ch, 0xAB, sizeof(ch));
+	attest_token_init(addr, &ch, &result);
+	if (result.r0) {
+		report(false, "Attest token init failed %ld", result.r0);
+		return;
+	}
+
+	/* Execute a few cycles, but not let it complete */
+	for (i = 0; i < 3; i++) {
+		attest_token_continue(addr, &result);
+		if (result.r0 != RSI_INCOMPLETE) {
+			if (result.r0)
+				report(false, "Attest token continue: unexpected "
+				       "failure %ld", result.r0);
+			else
+				report_skip("Attest token finished at iteration %d",
+					    i + 1);
+			return;
+		}
+	}
+
+	/* Issue a fresh Attest Token request with updated challenge */
+	memset(&ch, 0xEE, sizeof(ch));
+	get_attest_token(addr, &ch, &result);
+	if (result.r0) {
+		report(false, "Attest Token failed %ld", result.r0);
+		return;
+	}
+	claims_verify_token(stack_buf, result.r1, &claims, false);
+
+	/*
+	 * TODO: Index of claim in the array depends on the init sequence
+	 * in token_verifier.c: init_claim()
+	 */
+	p = (char*)claims.realm_token_claims[0].buffer_data.ptr;
+	size = claims.realm_token_claims[0].buffer_data.len;
+
+	/* Verify that token contains the updated challenge. */
+	if (size != sizeof(ch)) {
+		report(false, "Attestation token: abort request: "
+				"claim size mismatch: %zu", size);
+		return;
+	}
+	if (memcmp(p, &ch, size)) {
+		report(false, "Attestation token: abort request: "
+			      "claim value mismatch");
+		return;
+	}
+	report(true, "Aborting ongoing request");
+}
+
+static void run_rsi_attest_tests(void)
+{
+	report_prefix_push("attest");
+
+	test_get_attest_token();
+
+	report_prefix_push("bad input");
+	test_get_attest_token_bad_input();
+	report_prefix_pop();
+
+	report_prefix_push("ABI misuse");
+	test_get_attest_token_abi_misuse();
+	report_prefix_pop();
+
+	report_prefix_push("ABI Abort");
+	test_get_attest_token_abi_abort_req();
+	report_prefix_pop();
+
+	report_prefix_pop(); /* attest */
+}
+
+static void run_get_token_times(void *data)
+{
+	char buf[SZ_4K] __attribute__((aligned(SZ_4K)));
+	struct challenge ch;
+	struct attestation_claims claims;
+	unsigned long runs = ((size_t)data);
+	int i, j;
+	int cpu = smp_processor_id();
+
+	report_info("CPU%d: Running get token test %ld times", cpu, runs);
+	for (i = 0; i < runs; i++) {
+		uint8_t pattern = (cpu << 4) | (i & 0xf);
+		size_t token_size;
+		struct claim_t *claim;
+
+		memset(buf, 0, sizeof(buf));
+		memset(&ch, pattern, sizeof(ch));
+
+		if (!get_attest_token_claims(buf, &ch, &claims, &token_size))
+			return;
+		claim = claims.realm_token_claims;
+		if (claim->key != CCA_REALM_CHALLENGE ||
+		    claim->buffer_data.len != sizeof(ch)) {
+			report(false, "Invalid challenge size in parsed token:"
+				      " %zu (expected %zu)",
+				      claim->buffer_data.len, sizeof(ch));
+			return;
+		}
+
+		for (j = 0; j < sizeof(ch); j++) {
+			uint8_t byte = ((uint8_t *)claim->buffer_data.ptr)[j];
+			if (byte != pattern) {
+				report(false, "Invalid byte in challenge[%d]: "
+					       " %02x (expected %02x)",
+					       j, byte, pattern);
+				return;
+			}
+		}
+	}
+	report(true, "CPU%d: Completed runs", cpu);
+}
+
+static void run_rsi_attest_smp_test(void)
+{
+	unsigned long runs = 100;
+
+	report_prefix_push("attest_smp");
+	on_cpus(run_get_token_times, (void *)runs);
+	report_prefix_pop();
+}
+
+/*
+ * There are 7 slots for measurements. The first is reserved for initial
+ * content measurement. The rest are meant to store runtime measurements.
+ * Runtime measurements are extended (concatenated and hashed). Reading
+ * them back separately is unsupported. They can be queried in an
+ * attestation token.
+ *
+ * Measurement size is 64 bytes maximum, to accommodate a SHA-512 hash.
+ */
+
+static void measurement_extend(int idx, struct measurement *m, size_t size,
+			       struct smccc_result *res)
+{
+	rsi_extend_measurement(idx, size, &m->words[0], res);
+}
+
+static void test_extend_measurement(void)
+{
+	struct smccc_result result;
+	struct measurement m;
+	return_code_t rc;
+	int idx;
+
+	memset(&m, 0xEE, sizeof(m));
+	/*
+	 * Store runtime measurements in all possible slots.
+	 */
+	for (idx = 1; idx <= REM_COUNT; idx++) {
+		measurement_extend(idx, &m, sizeof(m.words), &result);
+		rc = unpack_return_code(result.r0);
+		report(!rc.status, "Extend measurement idx: %d (%d, %d)",
+			idx, rc.status, rc.index);
+	}
+}
+
+static void test_extend_measurement_bad_index(struct measurement *m)
+{
+	struct smccc_result result;
+	return_code_t rc;
+	int indices[] = { 0, REM_COUNT + 1 };
+	const char *idx_descs[] = { "reserved", "out-of-bounds" };
+	int i;
+
+	report_prefix_push("index");
+	for (i = 0; i < ARRAY_SIZE(indices); i++) {
+		report_prefix_push(idx_descs[i]);
+		measurement_extend(indices[i], m, sizeof(m->words), &result);
+		rc = unpack_return_code(result.r0);
+
+		if (rc.status != RSI_ERROR_INPUT)
+			report(false, "Extend measurement index: "
+				      "actual (%d) vs expected (%d)",
+				      rc.status, RSI_ERROR_INPUT);
+		else
+			report(true, "Extend measurement index fails as expected");
+		report_prefix_pop(); /* idx_descs[i] */
+	}
+	report_prefix_pop(); /* index */
+}
+
+static void test_extend_measurement_bad_size(struct measurement *m)
+{
+	struct smccc_result result;
+	return_code_t rc;
+
+	report_prefix_push("size");
+	rsi_extend_measurement(1, 65, &m->words[0], &result);
+	rc = unpack_return_code(result.r0);
+	if (rc.status != RSI_ERROR_INPUT)
+		report(false, "Measurement extend "
+			      "actual (%d) vs expected (%d)",
+			      rc.status, RSI_ERROR_INPUT);
+	else
+		report(true, "Extend measurement fails as expected");
+	report_prefix_pop(); /* size */
+}
+
+static void test_extend_measurement_bad_input(void)
+{
+	struct measurement m;
+
+	report_prefix_push("bad input");
+	memset(&m, 0xEE, sizeof(m));
+	test_extend_measurement_bad_index(&m);
+	test_extend_measurement_bad_size(&m);
+	report_prefix_pop(); /* bad input */
+}
+
+static void run_rsi_extend_tests(void)
+{
+	report_prefix_push("extend");
+	test_extend_measurement();
+	test_extend_measurement_bad_input();
+	report_prefix_pop(); /* extend */
+}
+
+/*
+ * cpu_extend_run - Parameters for the extend measurement SMP run.
+ * @idx		- Pointer to the measurement slot index
+ * @m		- Measurement data
+ * @size	- Size of the measurement data; @rc holds the call's result
+ */
+struct cpu_extend_run {
+	int *idx;
+	struct measurement *m;
+	size_t size;
+	unsigned long rc;
+};
+
+/*
+ * Each CPU uses its CPU number to index into an array of extend
+ * measurement parameters. At the moment only 2 CPUs are
+ * supported.
+ */
+static void cpu_run_extend_measurement(void *data)
+{
+	struct smccc_result result;
+	struct cpu_extend_run *run;
+	int me = smp_processor_id();
+
+	assert(me >= 0);
+
+	/* Tests for only 2 CPUs */
+	if (me > 1)
+		return;
+	run = (struct cpu_extend_run *)data + me;
+	rsi_extend_measurement(*run->idx, run->size, &run->m->words[0], &result);
+	run->rc = result.r0;
+	if (result.r0 != 0)
+		report(false, "CPU%d: Extend measurement failed for slot %d",
+		       me, *run->idx);
+}
+
+static bool claims_uses_sha256_algo(struct attestation_claims *claims)
+{
+	struct claim_t *claim = claims->realm_token_claims + 2; /* CCA_REALM_HASH_ALGO_ID */
+
+	/* claim->buffer_data.ptr is not NUL-terminated, so use memcmp */
+	return !memcmp(claim->buffer_data.ptr, "sha-256", strlen("sha-256"));
+}
+
+static void test_rsi_extend_smp(void)
+{
+	int slot, m_idx;
+	struct measurement m[2];
+	struct challenge ch;
+	struct attestation_claims claims;
+	size_t token_size;
+
+	/*
+	 * Measurements to extend with
+	 *
+	 * Run		CPU0 data	CPU1 data
+	 *   1:		[31 - 0]	[55 - 24]
+	 *   2:		[39 - 8]	[63 - 32]
+	 *   3:		[47 - 16]	[71 - 40]
+	 */
+	char measure_bytes[] = {
+		0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11,
+		0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11,
+		0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11,
+		0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11,
+		0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+		0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
+		0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
+		0x18, 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F,
+		0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27
+	};
+
+	/*
+	 * The expected measurement values. Each element in the array contains
+	 * a possible extended measurement value. (Multiple values are possible
+	 * as the extend function might be called in any order by the cores.)
+	 * The array contains results for all the possible orders. The number of
+	 * possibilities can be calculated as described here:
+	 * https://math.stackexchange.com/q/1065374
+	 */
+	struct extend_smp_expected {
+		const char *sequence;
+		char measurement[SHA256_SIZE];
+	} expected[] = {
+		{
+			"[ cpu0#0 cpu0#1 cpu0#2 cpu1#0 cpu1#1 cpu1#2 ]",
+			{
+				0xB1, 0xBE, 0x04, 0x25, 0xBB, 0xBC, 0x04, 0x9F,
+				0x98, 0x4F, 0xFB, 0xDE, 0xAA, 0x00, 0xC9, 0xBC,
+				0x41, 0x43, 0xDB, 0x16, 0xBB, 0x2A, 0x5F, 0x4B,
+				0x8B, 0x36, 0xAE, 0x3F, 0xFE, 0x24, 0x23, 0xA4
+			},
+		},
+		{
+			"[ cpu0#0 cpu0#1 cpu1#0 cpu0#2 cpu1#1 cpu1#2 ]",
+			{
+				0x99, 0x00, 0x5E, 0xB7, 0xF8, 0x84, 0xA3, 0x99,
+				0x7E, 0x12, 0xDE, 0xD1, 0x5B, 0xA7, 0x07, 0xF4,
+				0x24, 0x3E, 0x77, 0xED, 0x60, 0xC0, 0xBD, 0x43,
+				0x3B, 0x60, 0x7E, 0x38, 0xDD, 0x58, 0xC7, 0x46
+			},
+		},
+		{
+			"[ cpu0#0 cpu0#1 cpu1#0 cpu1#1 cpu0#2 cpu1#2 ]",
+			{
+				0x0B, 0x5E, 0x31, 0x69, 0xAC, 0xAF, 0xA0, 0x8B,
+				0x4F, 0x90, 0xD1, 0x86, 0xCC, 0x8E, 0x11, 0x42,
+				0x0B, 0x74, 0x49, 0x6C, 0xA1, 0x27, 0x1B, 0x7C,
+				0x52, 0x77, 0x7F, 0x2F, 0x53, 0x2F, 0x9A, 0xC1
+			},
+		},
+		{
+			"[ cpu0#0 cpu0#1 cpu1#0 cpu1#1 cpu1#2 cpu0#2 ]",
+			{
+				0x99, 0xDE, 0xF8, 0x02, 0x27, 0xE9, 0x6F, 0x6F,
+				0xA6, 0x55, 0xFC, 0x56, 0xCC, 0x7A, 0xFC, 0xEF,
+				0x2F, 0x0C, 0x45, 0x3E, 0x01, 0xE0, 0x4B, 0xA1,
+				0x60, 0x96, 0xEE, 0xB1, 0x4A, 0x25, 0x86, 0x89},
+		},
+		{
+			"[ cpu0#0 cpu1#0 cpu0#1 cpu0#2 cpu1#1 cpu1#2 ]",
+			{	0x88, 0x40, 0x05, 0xF5, 0xA6, 0x95, 0xC1, 0xC7,
+				0xD3, 0x69, 0x16, 0x82, 0x0D, 0x79, 0xC1, 0x5B,
+				0x4A, 0x48, 0xCA, 0x7F, 0xA5, 0xF3, 0x77, 0x37,
+				0xBE, 0x0D, 0xAC, 0x2E, 0x42, 0x3E, 0x03, 0x37
+			},
+		},
+		{
+			"[ cpu0#0 cpu1#0 cpu0#1 cpu1#1 cpu0#2 cpu1#2 ]",
+			{
+				0x68, 0x32, 0xC6, 0xAF, 0x8C, 0x86, 0x77, 0x09,
+				0x4A, 0xB9, 0xA1, 0x9E, 0xBB, 0x2B, 0x42, 0x35,
+				0xF8, 0xDE, 0x9A, 0x98, 0x37, 0x7B, 0x3E, 0x82,
+				0x59, 0x0B, 0xC4, 0xAD, 0x1D, 0x01, 0x28, 0xCA
+			},
+		},
+		{
+			"[ cpu0#0 cpu1#0 cpu0#1 cpu1#1 cpu1#2 cpu0#2 ]",
+			{
+				0xF5, 0x96, 0x77, 0x68, 0xD9, 0x6A, 0xA2, 0xFC,
+				0x08, 0x8C, 0xF5, 0xA9, 0x6B, 0xE7, 0x1E, 0x20,
+				0x35, 0xC1, 0x92, 0xCE, 0xBC, 0x3A, 0x75, 0xEA,
+				0xB4, 0xEB, 0x17, 0xE5, 0x77, 0x50, 0x85, 0x40
+			},
+
+		},
+		{
+			"[ cpu0#0 cpu1#0 cpu1#1 cpu0#1 cpu0#2 cpu1#2 ]",
+			{
+				0x4E, 0xA2, 0xD2, 0x79, 0x55, 0x75, 0xCB, 0x86,
+				0x87, 0x34, 0x35, 0xE7, 0x75, 0xDF, 0xD5, 0x59,
+				0x58, 0xDE, 0x74, 0x35, 0x68, 0x2B, 0xDC, 0xC8,
+				0x85, 0x72, 0x97, 0xBE, 0x58, 0xB1, 0x1E, 0xA7
+			},
+
+		},
+		{
+			"[ cpu0#0 cpu1#0 cpu1#1 cpu0#1 cpu1#2 cpu0#2 ]",
+			{
+				0xD1, 0xC2, 0xC8, 0x08, 0x00, 0x64, 0xB8, 0x1F,
+				0xA0, 0xA5, 0x32, 0x20, 0xAA, 0x08, 0xC0, 0x48,
+				0xDB, 0xB1, 0xED, 0xE7, 0xAF, 0x18, 0x2F, 0x7F,
+				0x3C, 0xB8, 0x58, 0x83, 0xEC, 0xF9, 0x38, 0xFD
+			},
+
+		},
+		{
+			"[ cpu0#0 cpu1#0 cpu1#1 cpu1#2 cpu0#1 cpu0#2 ]",
+			{
+				0xD1, 0xB8, 0x31, 0x98, 0x8E, 0xF2, 0xE7, 0xF5,
+				0xBB, 0xD1, 0xE1, 0xC7, 0x3E, 0xB7, 0xA9, 0x18,
+				0x3B, 0xCC, 0x58, 0x98, 0xED, 0x22, 0x1E, 0xE2,
+				0x04, 0x76, 0xA1, 0xB9, 0x92, 0x54, 0xB5, 0x5B
+			},
+
+		},
+		{
+			"[ cpu1#0 cpu0#0 cpu0#1 cpu0#2 cpu1#1 cpu1#2 ]",
+			{
+				0xAB, 0x50, 0x2A, 0x68, 0x28, 0x35, 0x16, 0xA9,
+				0xDE, 0x26, 0x77, 0xAA, 0x99, 0x29, 0x0E, 0x9C,
+				0x67, 0x41, 0x64, 0x28, 0x6E, 0xFF, 0x54, 0x33,
+				0xE5, 0x29, 0xC4, 0xA5, 0x98, 0x40, 0x7E, 0xC9
+			},
+
+		},
+		{
+			"[ cpu1#0 cpu0#0 cpu0#1 cpu1#1 cpu0#2 cpu1#2 ]",
+			{
+				0xA3, 0x4D, 0xB0, 0x28, 0xAB, 0x01, 0x56, 0xBB,
+				0x7D, 0xE5, 0x0E, 0x86, 0x26, 0xBB, 0xBB, 0xDE,
+				0x58, 0x91, 0x88, 0xBB, 0x9F, 0x6A, 0x58, 0x78,
+				0x30, 0x2C, 0x22, 0x2E, 0x85, 0x7F, 0x87, 0xF6
+			},
+
+		},
+		{
+			"[ cpu1#0 cpu0#0 cpu0#1 cpu1#1 cpu1#2 cpu0#2 ]",
+			{
+				0x1A, 0x2E, 0xD2, 0xC2, 0x0C, 0xBD, 0x30, 0xDA,
+				0x4F, 0x37, 0x6B, 0x90, 0xE3, 0x67, 0xFE, 0x61,
+				0x4F, 0x30, 0xBB, 0x29, 0xBC, 0xAA, 0x6E, 0xC5,
+				0x60, 0x6E, 0x13, 0x6B, 0x33, 0x3D, 0xC0, 0x11
+			},
+
+		},
+		{
+			"[ cpu1#0 cpu0#0 cpu1#1 cpu0#1 cpu0#2 cpu1#2 ]",
+			{
+				0x8F, 0xEA, 0xD1, 0x80, 0xE0, 0xBE, 0x27, 0xF7,
+				0x8D, 0x19, 0xBF, 0x65, 0xBE, 0x92, 0x83, 0x7C,
+				0x61, 0x8F, 0xC5, 0x8D, 0x0F, 0xAD, 0x89, 0x1E,
+				0xAE, 0x0A, 0x75, 0xAC, 0x3E, 0x5F, 0xD5, 0x31
+			},
+
+		},
+		{
+			"[ cpu1#0 cpu0#0 cpu1#1 cpu0#1 cpu1#2 cpu0#2 ]",
+			{
+				0x0F, 0x7B, 0xEE, 0xA5, 0x9A, 0xCD, 0xED, 0x8D,
+				0x5A, 0x52, 0xFF, 0xD6, 0x30, 0xF4, 0xD9, 0xE9,
+				0xF4, 0xC1, 0x1A, 0x0C, 0x86, 0x2B, 0x96, 0x2C,
+				0x0E, 0x2D, 0x1A, 0x2A, 0xFE, 0xE6, 0x7C, 0xAD
+			},
+
+		},
+		{
+			"[ cpu1#0 cpu0#0 cpu1#1 cpu1#2 cpu0#1 cpu0#2 ]",
+			{
+				0x4A, 0xBA, 0xFF, 0x0B, 0x0B, 0x06, 0xD1, 0xCE,
+				0x95, 0x91, 0x70, 0x68, 0x20, 0xD6, 0xF2, 0x23,
+				0xC5, 0x6A, 0x63, 0x2B, 0xCA, 0xDF, 0x37, 0xB5,
+				0x0B, 0xDC, 0x64, 0x6A, 0xA3, 0xC9, 0x8F, 0x1E
+			},
+
+		},
+		{
+			"[ cpu1#0 cpu1#1 cpu0#0 cpu0#1 cpu0#2 cpu1#2 ]",
+			{
+				0x3D, 0xB1, 0xE1, 0xBD, 0x85, 0x2C, 0xA0, 0x04,
+				0xE6, 0x43, 0xE8, 0x82, 0xC3, 0x77, 0xF3, 0xCE,
+				0x4D, 0x62, 0x2C, 0xF4, 0x65, 0xF6, 0x29, 0x5F,
+				0x17, 0xDA, 0xD5, 0x79, 0x55, 0xE2, 0x3D, 0x0C
+			},
+
+		},
+		{
+			"[ cpu1#0 cpu1#1 cpu0#0 cpu0#1 cpu1#2 cpu0#2 ]",
+			{
+				0x5B, 0xFE, 0x29, 0xA4, 0xDA, 0x9F, 0xE7, 0x13,
+				0x5F, 0xA2, 0xCE, 0x53, 0x40, 0xC0, 0x38, 0xBC,
+				0x10, 0x7A, 0xF0, 0x29, 0x3C, 0xD6, 0xAF, 0x8A,
+				0x03, 0x40, 0xED, 0xE1, 0xFD, 0x46, 0xB7, 0x06
+			},
+
+		},
+		{
+			"[ cpu1#0 cpu1#1 cpu0#0 cpu1#2 cpu0#1 cpu0#2 ]",
+			{
+				0x66, 0x20, 0xA7, 0xBE, 0xED, 0x90, 0x0A, 0x14,
+				0x95, 0x7A, 0x93, 0x47, 0x1E, 0xA8, 0xDD, 0x6E,
+				0x25, 0xCB, 0x73, 0x18, 0x77, 0x77, 0x91, 0xE9,
+				0xCA, 0x17, 0x26, 0x16, 0xAA, 0xC9, 0x34, 0x7A
+			},
+
+		},
+		{
+			"[ cpu1#0 cpu1#1 cpu1#2 cpu0#0 cpu0#1 cpu0#2 ]",
+			{
+				0x4D, 0xF6, 0xC7, 0x74, 0x37, 0x66, 0x4C, 0x6A,
+				0x40, 0x32, 0x94, 0x01, 0x17, 0xA2, 0xE6, 0x3D,
+				0xA8, 0x00, 0x3E, 0xB7, 0x89, 0x24, 0xF4, 0x04,
+				0x14, 0xA8, 0xA1, 0xD1, 0xCD, 0x5B, 0xC3, 0x60
+			},
+
+		},
+	};
+
+	struct cpu_extend_run cpus[2] = {
+		/* CPU0 */
+		{ .idx = &slot, .m = &m[0], .size = SHA256_SIZE },
+		/* CPU1 */
+		{ .idx = &slot, .m = &m[1], .size = SHA256_SIZE },
+	};
+
+	for (slot = 1; slot <= REM_COUNT; slot++) {
+		for (m_idx = 0; m_idx < 3; m_idx++) {
+			memcpy(m[0].words, &measure_bytes[m_idx * 8], SHA256_SIZE);
+			memcpy(m[1].words, &measure_bytes[24 + m_idx * 8], SHA256_SIZE);
+			on_cpus(cpu_run_extend_measurement, (void *)&cpus[0]);
+			if (cpus[0].rc || cpus[1].rc)
+				return;
+		}
+	}
+
+	/* Get the token and parse the claims */
+	memset(page_buf_data, 0, sizeof(page_buf_data));
+	memset(&ch, 0xAB, sizeof(ch));
+	if (!get_attest_token_claims(page_buf_data, &ch, &claims, &token_size))
+		return;
+
+	/*
+	 * The hard-coded test data assumes the sha-256 algorithm; skip the
+	 * measurement value comparison if the realm hash algo is different.
+	 */
+	if (!claims_uses_sha256_algo(&claims)) {
+		report_skip("Hash algo is different than sha-256,"
+			    " skip measurement value comparison");
+		return;
+	}
+
+	for (slot = 0; slot < REM_COUNT; slot++) {
+		struct claim_t *claim = &claims.realm_measurement_claims[slot];
+		const char *data = claim->buffer_data.ptr;
+		const size_t len = claim->buffer_data.len;
+
+		if (len != SHA256_SIZE) {
+			report(false, "Realm measurement size mismatch "
+				      "%zu vs %d (expected)", len, SHA256_SIZE);
+			continue;
+		}
+
+		for (m_idx = 0; m_idx < ARRAY_SIZE(expected); m_idx++) {
+			struct extend_smp_expected *em = &expected[m_idx];
+
+			if (memcmp(data, em->measurement, SHA256_SIZE) == 0) {
+				report(true, "Hash found for slot %d: %s",
+					      slot, em->sequence);
+				break;
+			}
+		}
+
+		if (m_idx == ARRAY_SIZE(expected))
+			report(false, "Measurement doesn't match any expected "
+				      "sequence for slot %d", slot);
+	}
+}
+
+static void run_rsi_extend_smp_tests(void)
+{
+	report_prefix_push("extend_smp");
+	test_rsi_extend_smp();
+	report_prefix_pop();
+}
+
+static void test_rsi_extend_and_attest(void)
+{
+	struct challenge ch;
+	struct measurement m;
+	struct attestation_claims claims;
+	size_t token_size;
+	int i, j;
+
+	char measure_bytes[] = {
+		0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, /*slot 1*/
+		0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, /*slot 2*/
+		0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, /*slot 3*/
+		0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, /*slot 4*/
+		0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, /*slot 5*/
+		0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, /*slot 6*/
+		0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
+		0x18, 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F,
+		0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27
+	};
+
+	/* The following expectations assume extending with SHA256 */
+	char expected_measurements[][SHA256_SIZE] = {
+		{
+			0x88, 0x78, 0xb1, 0x5a, 0x7d, 0x6a, 0x3a, 0x4f,
+			0x46, 0x4e, 0x8f, 0x9f, 0x42, 0x59, 0x1d, 0xbc,
+			0x0c, 0xf4, 0xbe, 0xde, 0xa0, 0xec, 0x30, 0x90,
+			0x03, 0xd2, 0xb2, 0xee, 0x53, 0x65, 0x5e, 0xf8
+		},
+		{
+			0x58, 0x32, 0x3b, 0xdf, 0x7a, 0x91, 0xf6, 0x8e,
+			0x80, 0xc7, 0xc8, 0x7f, 0xda, 0x1e, 0x22, 0x6c,
+			0x8b, 0xe7, 0xee, 0xa9, 0xef, 0x64, 0xa5, 0x21,
+			0xdb, 0x2c, 0x09, 0xa7, 0xd7, 0x01, 0x92, 0x05
+		},
+		{
+			0x66, 0xe3, 0x3b, 0x99, 0x49, 0x4d, 0xf4, 0xdd,
+			0xbc, 0x7a, 0x61, 0x7a, 0xa1, 0x56, 0x7b, 0xf8,
+			0x96, 0x3f, 0x0a, 0xf3, 0x1e, 0xab, 0xdd, 0x16,
+			0x37, 0xb0, 0xfb, 0xe0, 0x71, 0x82, 0x66, 0xce
+		},
+		{
+			0x97, 0x5e, 0x9f, 0x64, 0x79, 0x90, 0xa1, 0x51,
+			0xd2, 0x5b, 0x73, 0x75, 0x50, 0x94, 0xeb, 0x54,
+			0x90, 0xbb, 0x1e, 0xf8, 0x3b, 0x2c, 0xb8, 0x3b,
+			0x6f, 0x24, 0xf3, 0x86, 0x07, 0xe0, 0x58, 0x13
+		},
+		{
+			0x68, 0x99, 0x86, 0x64, 0x9b, 0xeb, 0xa2, 0xe4,
+			0x4d, 0x07, 0xbb, 0xb3, 0xa1, 0xd9, 0x2d, 0x07,
+			0x76, 0x7f, 0x86, 0x19, 0xb8, 0x5f, 0x14, 0x48,
+			0x1f, 0x38, 0x4b, 0x87, 0x51, 0xdc, 0x10, 0x31
+		},
+		{
+			0xee, 0x8f, 0xb3, 0xe9, 0xc8, 0xa5, 0xbe, 0x4f,
+			0x12, 0x90, 0x4a, 0x52, 0xb9, 0xc8, 0x62, 0xd1,
+			0x8a, 0x44, 0x31, 0xf7, 0x56, 0x7d, 0x96, 0xda,
+			0x97, 0x7a, 0x9e, 0x96, 0xae, 0x6a, 0x78, 0x43
+		},
+	};
+	int times_to_extend[] = {1, 2, 3, 4, 5, 6};
+
+	memset(page_buf_data, 0, sizeof(page_buf_data));
+	memset(&ch, 0xAB, sizeof(ch));
+	if (!__get_attest_token_claims(page_buf_data, &ch,
+				       &claims, &token_size, true))
+		return;
+
+	for (i = 0; i < REM_COUNT; i++) {
+		struct claim_t c = claims.realm_measurement_claims[i];
+		for (j = 0; j < c.buffer_data.len; j++) {
+			if (((char *)c.buffer_data.ptr)[j])
+				break;
+		}
+		/* A non-zero byte was found, fail the check below */
+		if (j != c.buffer_data.len)
+			break;
+	}
+
+	report((i == REM_COUNT), "Initial measurements must be 0");
+
+	/* Extend the possible measurements (i.e., 1 to REM_COUNT) */
+	for (i = 1; i <= REM_COUNT; i++) {
+		memcpy(&m.words[0], &measure_bytes[(i - 1) * 8], SHA256_SIZE);
+		for (j = 0; j < times_to_extend[i - 1]; j++) {
+			struct smccc_result r;
+
+			measurement_extend(i, &m, SHA256_SIZE, &r);
+			if (r.r0) {
+				report(false, "Extend measurement slot %d, iteration %d "
+					      "failed with %ld", i, j, r.r0);
+				return;
+			}
+		}
+	}
+	report(true, "Extend measurement for all slots completed");
+
+	/* Get the attestation token again */
+	if (!__get_attest_token_claims(page_buf_data, &ch,
+				       &claims, &token_size, true))
+		return;
+
+	/*
+	 * The hard-coded test data assumes the sha-256 algorithm; skip the
+	 * measurement value comparison if the realm hash algo is different.
+	 */
+	if (!claims_uses_sha256_algo(&claims))
+		return;
+
+	/* Verify the extended measurements */
+	for (i = 0; i < REM_COUNT; i++) {
+		const char *exp = expected_measurements[i];
+		const char *actual = claims.realm_measurement_claims[i].buffer_data.ptr;
+		const size_t len = claims.realm_measurement_claims[i].buffer_data.len;
+
+		if (len != SHA256_SIZE) {
+			report(false, "Realm measurement: slot: %d, unexpected size "
+				      "actual %zu vs %d expected", i, len,
+				      SHA256_SIZE);
+			return;
+		}
+		if (memcmp(exp, actual, len)) {
+			report(false, "Measurement doesn't match for slot %d", i);
+			printf("Expected:\n");
+			for (j = 0; j < len; j++)
+				printf("0x%02x ", (unsigned char)exp[j]);
+			printf("\nActual:\n");
+			for (j = 0; j < len; j++)
+				printf("0x%02x ", (unsigned char)actual[j]);
+			printf("\n");
+		} else {
+			report(true, "Extended measurement match expected for "
+				     "slot %d", i);
+
+		}
+	}
+}
+
+static void run_rsi_extend_and_attest_tests(void)
+{
+	report_prefix_push("extend_and_attest");
+	test_rsi_extend_and_attest();
+	report_prefix_pop();
+}
+
+#define MEASUREMENT_MAX_SIZE_LONGS	8
+
+static void test_read_measurement(void)
+{
+	struct smccc_result result;
+	return_code_t rc;
+	unsigned long *m;
+	int i, j;
+
+	/*
+	 * We must be able to read all measurements:
+	 * 0 (the initial read-only measurement) and
+	 * the realm extendable ones, 1 to REM_COUNT.
+	 */
+	for (i = 0; i <= REM_COUNT; i++) {
+		rsi_read_measurement(i, &result);
+		rc = unpack_return_code(result.r0);
+		if (rc.status) {
+			report(false, "Read measurement failed for slot %d with "
+				      "(%d, %d)", i, rc.status, rc.index);
+			return;
+		}
+		m = &result.r1;
+		printf("Read measurement slot:%d, Hash = ", i);
+		for (j = 0; j < MEASUREMENT_MAX_SIZE_LONGS; j++)
+			printf("%016lx", __builtin_bswap64(*m++));
+		printf("\n");
+		report(true, "Read Measurement Slot: %d", i);
+	}
+}
+
+static void test_read_measurement_bad_input(void)
+{
+	struct smccc_result result;
+	return_code_t rc;
+
+	report_prefix_push("out-of-range index");
+	rsi_read_measurement(REM_COUNT + 1, &result);
+	rc = unpack_return_code(result.r0);
+	if (rc.status != RSI_ERROR_INPUT) {
+		report(false, "Read measurement fails, "
+			      "expected (%d), got (%d)",
+			      RSI_ERROR_INPUT, rc.status);
+	} else {
+		report(true, "Read measurement fails as expected");
+	}
+	report_prefix_pop(); /* out-of-range index */
+}
+
+static void run_rsi_read_measurement_tests(void)
+{
+	report_prefix_push("measurement");
+	test_read_measurement();
+	test_read_measurement_bad_input();
+	report_prefix_pop();
+}
+
+int main(int argc, char **argv)
+{
+	int i;
+	report_prefix_push("attestation");
+
+	for (i = 1; i < argc; i++) {
+		if (strcmp(argv[i], "attest") == 0)
+			run_rsi_attest_tests();
+		else if (strcmp(argv[i], "attest_smp") == 0)
+			run_rsi_attest_smp_test();
+		else if (strcmp(argv[i], "extend") == 0)
+			run_rsi_extend_tests();
+		else if (strcmp(argv[i], "extend_smp") == 0)
+			run_rsi_extend_smp_tests();
+		else if (strcmp(argv[i], "extend_and_attest") == 0)
+			run_rsi_extend_and_attest_tests();
+		else if (strcmp(argv[i], "measurement") == 0)
+			run_rsi_read_measurement_tests();
+		else
+			report_info("Unknown subtest '%s'", argv[i]);
+	}
+	return report_summary();
+}
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index bc2354c7..5e9e1cbd 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -311,3 +311,53 @@ file = realm-sea.flat
 groups = nodefault realms
 accel = kvm
 arch = arm64
+
+# Realm Attestation related tests
+[realm-attest]
+file = realm-attest.flat
+groups = nodefault realms
+smp = 1
+extra_params = -m 32 -append 'attest'
+accel = kvm
+arch = arm64
+
+[realm-attest-smp]
+file = realm-attest.flat
+groups = nodefault realms
+smp = 2
+extra_params = -m 32 -append 'attest_smp'
+accel = kvm
+arch = arm64
+
+[realm-extend]
+file = realm-attest.flat
+groups = nodefault realms
+smp = 1
+extra_params = -m 32 -append 'extend'
+accel = kvm
+arch = arm64
+
+[realm-extend-smp]
+file = realm-attest.flat
+groups = nodefault realms
+smp = 2
+extra_params = -m 32 -append 'extend_smp'
+accel = kvm
+arch = arm64
+
+[realm-extend-and-attest]
+file = realm-attest.flat
+groups = nodefault realms
+smp = 1
+extra_params = -m 32 -append 'extend_and_attest'
+accel = kvm
+arch = arm64
+
+
+[realm-measurement]
+file = realm-attest.flat
+groups = nodefault realms
+smp = 1
+extra_params = -m 32 -append 'measurement'
+accel = kvm
+arch = arm64
diff --git a/lib/libcflat.h b/lib/libcflat.h
index c1fd31ff..893fee6f 100644
--- a/lib/libcflat.h
+++ b/lib/libcflat.h
@@ -163,6 +163,7 @@ extern void setup_vm(void);
 #define SZ_64K			(1 << 16)
 #define SZ_1M			(1 << 20)
 #define SZ_2M			(1 << 21)
+#define SZ_512M			(1 << 29)
 #define SZ_1G			(1 << 30)
 #define SZ_2G			(1ul << 31)
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvm-unit-tests 26/27] arm: realm: Add a test for shared memory
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (24 preceding siblings ...)
  2023-01-27 11:41   ` [RFC kvm-unit-tests 25/27] arm: realm: Add Realm attestation tests Joey Gouly
@ 2023-01-27 11:41   ` Joey Gouly
  2023-01-27 11:41   ` [RFC kvm-unit-tests 27/27] NOT-FOR-MERGING: add run-realm-tests Joey Gouly
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:41 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

From: Suzuki K Poulose <suzuki.poulose@arm.com>

Do some basic tests that trigger marking a memory region as
RIPAS_EMPTY and accessing the shared memory. Also, convert it back
to RAM and make sure the contents are scrubbed.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 arm/Makefile.arm64    |  1 +
 arm/realm-ns-memory.c | 86 +++++++++++++++++++++++++++++++++++++++++++
 arm/unittests.cfg     |  8 ++++
 3 files changed, 95 insertions(+)
 create mode 100644 arm/realm-ns-memory.c

diff --git a/arm/Makefile.arm64 b/arm/Makefile.arm64
index 0a0c4f2c..9b41e841 100644
--- a/arm/Makefile.arm64
+++ b/arm/Makefile.arm64
@@ -44,6 +44,7 @@ tests += $(TEST_DIR)/realm-rsi.flat
 tests += $(TEST_DIR)/realm-attest.flat
 tests += $(TEST_DIR)/realm-fpu.flat
 tests += $(TEST_DIR)/realm-sea.flat
+tests += $(TEST_DIR)/realm-ns-memory.flat
 
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
diff --git a/arm/realm-ns-memory.c b/arm/realm-ns-memory.c
new file mode 100644
index 00000000..8360c371
--- /dev/null
+++ b/arm/realm-ns-memory.c
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Arm Limited.
+ * All rights reserved.
+ */
+
+#include <asm/io.h>
+#include <alloc_page.h>
+#include <bitops.h>
+
+#define GRANULE_SIZE	0x1000
+#define BUF_SIZE	(PAGE_SIZE * 2)
+#define BUF_PAGES	(BUF_SIZE / PAGE_SIZE)
+#define BUF_GRANULES	(BUF_SIZE / GRANULE_SIZE)
+
+static char __attribute__((aligned(PAGE_SIZE))) buffer[BUF_SIZE];
+
+static void static_shared_buffer_test(void)
+{
+	int i;
+
+	set_memory_decrypted((unsigned long)buffer, sizeof(buffer));
+	for (i = 0; i < sizeof(buffer); i += GRANULE_SIZE)
+		buffer[i] = (char)i;
+
+	/*
+	 * Verify the content of the NS buffer
+	 */
+	for (i = 0; i < sizeof(buffer); i += GRANULE_SIZE) {
+		if (buffer[i] != (char)i) {
+			report(false, "Failed to set Non Secure memory");
+			return;
+		}
+	}
+
+	/* Convert the buffer back to protected... */
+	set_memory_encrypted((unsigned long)buffer, sizeof(buffer));
+	/* .. and check if the contents were destroyed */
+	for (i = 0; i < sizeof(buffer); i += GRANULE_SIZE) {
+		if (buffer[i] != 0) {
+			report(false, "Failed to scrub protected memory");
+			return;
+		}
+	}
+
+	report(true, "Conversion of protected memory to shared and back");
+}
+
+static void dynamic_shared_buffer_test(void)
+{
+	char *ns_buffer;
+	int i;
+	int order = get_order(BUF_PAGES);
+
+	ns_buffer = alloc_pages_shared(order);
+	assert(ns_buffer);
+	for (i = 0; i < BUF_SIZE; i += GRANULE_SIZE)
+		ns_buffer[i] = (char)i;
+
+	/*
+	 * Verify the content of the NS buffer
+	 */
+	for (i = 0; i < BUF_SIZE; i += GRANULE_SIZE) {
+		if (ns_buffer[i] != (char)i) {
+			report(false, "Failed to set Non Secure memory");
+			return;
+		}
+	}
+	free_pages_shared(ns_buffer);
+	report(true, "Dynamic allocation and free of shared memory");
+}
+
+static void ns_test(void)
+{
+	static_shared_buffer_test();
+	dynamic_shared_buffer_test();
+}
+
+int main(int argc, char **argv)
+{
+	report_prefix_pushf("ns-memory");
+	ns_test();
+	report_prefix_pop();
+
+	return report_summary();
+}
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index 5e9e1cbd..8173ccfe 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -361,3 +361,11 @@ smp = 1
 extra_params = -m 32 -append 'measurement'
 accel = kvm
 arch = arm64
+
+[realm-ns-memory]
+file = realm-ns-memory.flat
+groups = nodefault realms
+smp = 1
+extra_params = -m 32
+accel = kvm
+arch = arm64
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* [RFC kvm-unit-tests 27/27] NOT-FOR-MERGING: add run-realm-tests
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
                     ` (25 preceding siblings ...)
  2023-01-27 11:41   ` [RFC kvm-unit-tests 26/27] arm: realm: Add a test for shared memory Joey Gouly
@ 2023-01-27 11:41   ` Joey Gouly
  26 siblings, 0 replies; 190+ messages in thread
From: Joey Gouly @ 2023-01-27 11:41 UTC (permalink / raw)
  To: Andrew Jones, kvmarm, kvm
  Cc: joey.gouly, Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Suzuki K Poulose, Thomas Huth, Will Deacon, Zenghui Yu,
	linux-coco, kvmarm, linux-arm-kernel, linux-kernel

From: Alexandru Elisei <alexandru.elisei@arm.com>

Until we add support for KVMTOOL to run the tests using the
scripts, provide a temporary script to run all the Realm tests.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Joey Gouly <joey.gouly@arm.com>
---
 arm/run-realm-tests | 56 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)
 create mode 100755 arm/run-realm-tests

diff --git a/arm/run-realm-tests b/arm/run-realm-tests
new file mode 100755
index 00000000..39f431d5
--- /dev/null
+++ b/arm/run-realm-tests
@@ -0,0 +1,56 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0-only
+# Copyright (C) 2023, Arm Ltd
+# All rights reserved
+#
+
+TASKSET=${TASKSET:-taskset}
+LKVM=${LKVM:-lkvm}
+ARGS="--realm --irqchip=gicv3 --console=serial --network mode=none --nodefaults"
+
+TESTDIR="."
+while getopts "d:" option; do
+	case "${option}" in
+		d) TESTDIR=${OPTARG};;
+		?)
+			exit 1
+			;;
+	esac
+done
+if [ ! -d "${TESTDIR}" ]; then
+	echo "Invalid directory: ${TESTDIR}"
+	exit 1
+fi
+
+run_tests() {
+	DIR="$1"
+
+	$LKVM run $ARGS -c 2 -m 16 -k $DIR/selftest.flat -p "setup smp=2 mem=16"
+	$LKVM run $ARGS -c 1 -m 16 -k $DIR/selftest.flat -p "vectors-kernel"
+	$LKVM run $ARGS -c 1 -m 16 -k $DIR/selftest.flat -p "vectors-user"
+	$LKVM run $ARGS -c 4 -m 32 -k $DIR/selftest.flat -p "smp"
+
+	$LKVM run $ARGS -c 1 -m 32 -k $DIR/realm-ns-memory.flat
+
+	$LKVM run $ARGS -c 4 -m 32 -k $DIR/psci.flat
+
+	$LKVM run $ARGS -c 4 -m 32 -k $DIR/gic.flat -p "ipi"
+	$LKVM run $ARGS -c 4 -m 32 -k $DIR/gic.flat -p "active"
+
+	$LKVM run $ARGS -c 1 -m 16 -k $DIR/timer.flat
+
+	$LKVM run $ARGS -c 1 -m 16 -k $DIR/realm-rsi.flat -p "version"
+	$LKVM run $ARGS -c 1 -m 16 -k $DIR/realm-rsi.flat -p "host_call hvc"
+	$LKVM run $ARGS -c 1 -m 16 -k $DIR/realm-sea.flat
+
+	$LKVM run $ARGS -c 1 -m 24 -k $DIR/realm-attest.flat -p "attest"
+	$LKVM run $ARGS -c 2 -m 24 -k $DIR/realm-attest.flat -p "attest_smp"
+	$LKVM run $ARGS -c 1 -m 24 -k $DIR/realm-attest.flat -p "extend"
+	$LKVM run $ARGS -c 2 -m 24 -k $DIR/realm-attest.flat -p "extend_smp"
+	$LKVM run $ARGS -c 1 -m 24 -k $DIR/realm-attest.flat -p "extend_and_attest"
+	$LKVM run $ARGS -c 1 -m 24 -k $DIR/realm-attest.flat -p "measurement"
+
+	$TASKSET -c 0 $LKVM run $ARGS -c 4 -m 32 -k $DIR/realm-fpu.flat
+}
+
+run_tests "${TESTDIR}"
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 190+ messages in thread

* Re: [RFC kvmtool 04/31] Add --nocompat option to disable compat warnings
  2023-01-27 11:39   ` [RFC kvmtool 04/31] Add --nocompat option to disable compat warnings Suzuki K Poulose
@ 2023-01-27 12:19     ` Alexandru Elisei
  0 siblings, 0 replies; 190+ messages in thread
From: Alexandru Elisei @ 2023-01-27 12:19 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: kvm, kvmarm, Andrew Jones, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, linux-coco, kvmarm,
	linux-arm-kernel, linux-kernel

Hi,

On Fri, Jan 27, 2023 at 11:39:05AM +0000, Suzuki K Poulose wrote:
> From: Alexandru Elisei <alexandru.elisei@arm.com>
> 
> Commit e66942073035 ("kvm tools: Guest kernel compatability") added the
> functionality that enables devices to print a warning message if the device
> hasn't been initialized by the time the VM is destroyed. The purpose of
> these messages is to let the user know if the kernel hasn't been built with
> the correct Kconfig options to take advantage of the said devices (all
> using virtio).
> 
> Since then, kvmtool has evolved and now supports loading different payloads
> (like firmware images), and having those warnings even when it is entirely
> intentional for the payload not to touch the devices can be confusing for
> the user and makes the output unnecessarily verbose in those cases.
> 
> Add the --nocompat option to disable the warnings; the warnings are still
> enabled by default.

I had a conversation with Will regarding this some time ago, we settled on
a different approach, by using --loglevel=<level>, similar to what Linux
does.  I'll put that on my TODO list and try to send patches soon-ish.

Thanks,
Alex

> 
> Reported-by: Christoffer Dall <christoffer.dall@arm.com>
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  builtin-run.c            | 5 ++++-
>  guest_compat.c           | 1 +
>  include/kvm/kvm-config.h | 1 +
>  3 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin-run.c b/builtin-run.c
> index bb7e6e8d..f8edfb3f 100644
> --- a/builtin-run.c
> +++ b/builtin-run.c
> @@ -183,6 +183,8 @@ static int mem_parser(const struct option *opt, const char *arg, int unset)
>  	OPT_BOOLEAN('\0', "nodefaults", &(cfg)->nodefaults, "Disable"   \
>  			" implicit configuration that cannot be"	\
>  			" disabled otherwise"),				\
> +	OPT_BOOLEAN('\0', "nocompat", &(cfg)->nocompat, "Disable"	\
> +			" compat warnings"),				\
>  	OPT_CALLBACK('\0', "9p", NULL, "dir_to_share,tag_name",		\
>  		     "Enable virtio 9p to share files between host and"	\
>  		     " guest", virtio_9p_rootdir_parser, kvm),		\
> @@ -797,7 +799,8 @@ static int kvm_cmd_run_work(struct kvm *kvm)
>  
>  static void kvm_cmd_run_exit(struct kvm *kvm, int guest_ret)
>  {
> -	compat__print_all_messages();
> +	if (!kvm->cfg.nocompat)
> +		compat__print_all_messages();
>  
>  	init_list__exit(kvm);
>  
> diff --git a/guest_compat.c b/guest_compat.c
> index fd4704b2..a413c12c 100644
> --- a/guest_compat.c
> +++ b/guest_compat.c
> @@ -88,6 +88,7 @@ int compat__print_all_messages(void)
>  
>  		printf("\n  # KVM compatibility warning.\n\t%s\n\t%s\n",
>  			msg->title, msg->desc);
> +		printf("\tTo stop seeing this warning, use the --nocompat option.\n");
>  
>  		list_del(&msg->list);
>  		compat__free(msg);
> diff --git a/include/kvm/kvm-config.h b/include/kvm/kvm-config.h
> index 368e6c7d..88df7cc2 100644
> --- a/include/kvm/kvm-config.h
> +++ b/include/kvm/kvm-config.h
> @@ -30,6 +30,7 @@ struct kvm_config {
>  	u64 vsock_cid;
>  	bool virtio_rng;
>  	bool nodefaults;
> +	bool nocompat;
>  	int active_console;
>  	int debug_iodelay;
>  	int nrcpus;
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-01-27 11:22 [RFC] Support for Arm CCA VMs on Linux Suzuki K Poulose
                   ` (3 preceding siblings ...)
  2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
@ 2023-01-27 15:26 ` Jean-Philippe Brucker
  2023-02-28 23:35   ` Itaru Kitayama
  2023-02-10 16:51 ` Ryan Roberts
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 190+ messages in thread
From: Jean-Philippe Brucker @ 2023-01-27 15:26 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-coco, linux-kernel, kvm, kvmarm, linux-arm-kernel,
	Alexandru Elisei, Andrew Jones, Catalin Marinas, Chao Peng,
	Christoffer Dall, Fuad Tabba, James Morse, Joey Gouly,
	Marc Zyngier, Mark Rutland, Oliver Upton, Paolo Bonzini,
	Quentin Perret, Sean Christopherson, Steven Price, Thomas Huth,
	Will Deacon, Zenghui Yu, kvmarm

On Fri, Jan 27, 2023 at 11:22:48AM +0000, Suzuki K Poulose wrote:
> We are happy to announce the early RFC version of the Arm
> Confidential Compute Architecture (CCA) support for the Linux
> stack. The intention is to seek early feedback in the following areas:
>  * KVM integration of the Arm CCA
>  * KVM UABI for managing the Realms, seeking to generalise the operations
>    wherever possible with other Confidential Compute solutions.

A prototype for launching Realm VMs with QEMU is available at:
https://lore.kernel.org/qemu-devel/20230127150727.612594-1-jean-philippe@linaro.org/

Thanks,
Jean


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC kvm-unit-tests 01/27] lib/string: include stddef.h for size_t
  2023-01-27 11:40   ` [RFC kvm-unit-tests 01/27] lib/string: include stddef.h for size_t Joey Gouly
@ 2023-01-31 14:43     ` Thomas Huth
  0 siblings, 0 replies; 190+ messages in thread
From: Thomas Huth @ 2023-01-31 14:43 UTC (permalink / raw)
  To: Joey Gouly, Andrew Jones, kvmarm, kvm
  Cc: Alexandru Elisei, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Marc Zyngier, Mark Rutland, Oliver Upton,
	Paolo Bonzini, Quentin Perret, Steven Price, Suzuki K Poulose,
	Will Deacon, Zenghui Yu, linux-coco, kvmarm, linux-arm-kernel,
	linux-kernel

On 27/01/2023 12.40, Joey Gouly wrote:
> Don't implicitly rely on this header being included.
> 
> Signed-off-by: Joey Gouly <joey.gouly@arm.com>
> ---
>   lib/string.h | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/lib/string.h b/lib/string.h
> index b07763ea..758dca8a 100644
> --- a/lib/string.h
> +++ b/lib/string.h
> @@ -7,6 +7,8 @@
>   #ifndef _STRING_H_
>   #define _STRING_H_
>   
> +#include <stddef.h>  /* For size_t */
> +
>   extern size_t strlen(const char *buf);
>   extern size_t strnlen(const char *buf, size_t maxlen);
>   extern char *strcat(char *dest, const char *src);

Reviewed-by: Thomas Huth <thuth@redhat.com>


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms
  2023-01-27 11:29   ` [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms Steven Price
@ 2023-02-07 12:25     ` Jean-Philippe Brucker
  2023-02-07 12:55       ` Suzuki K Poulose
  2023-02-13 16:10     ` Zhi Wang
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 190+ messages in thread
From: Jean-Philippe Brucker @ 2023-02-07 12:25 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, Jan 27, 2023 at 11:29:10AM +0000, Steven Price wrote:
> +static int kvm_rme_config_realm(struct kvm *kvm, struct kvm_enable_cap *cap)
> +{
> +	struct kvm_cap_arm_rme_config_item cfg;
> +	struct realm *realm = &kvm->arch.realm;
> +	int r = 0;
> +
> +	if (kvm_realm_state(kvm) != REALM_STATE_NONE)
> +		return -EBUSY;

This should also check kvm_is_realm() (otherwise we dereference a NULL
realm).

I was wondering about fuzzing the API to find more of this kind of issue,
but don't know anything about it. Is there a recommended way to fuzz KVM?

Thanks,
Jean


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms
  2023-02-07 12:25     ` Jean-Philippe Brucker
@ 2023-02-07 12:55       ` Suzuki K Poulose
  0 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-02-07 12:55 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Zenghui Yu, linux-arm-kernel,
	linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
	Fuad Tabba, linux-coco

On 07/02/2023 12:25, Jean-Philippe Brucker wrote:
> On Fri, Jan 27, 2023 at 11:29:10AM +0000, Steven Price wrote:
>> +static int kvm_rme_config_realm(struct kvm *kvm, struct kvm_enable_cap *cap)
>> +{
>> +	struct kvm_cap_arm_rme_config_item cfg;
>> +	struct realm *realm = &kvm->arch.realm;
>> +	int r = 0;
>> +
>> +	if (kvm_realm_state(kvm) != REALM_STATE_NONE)
>> +		return -EBUSY;
> 
> This should also check kvm_is_realm() (otherwise we dereference a NULL
> realm).

Correct, I think this should be done way up in the stack at:

kvm_vm_ioctl_enable_cap() for KVM_CAP_ARM_RME.

> 
> I was wondering about fuzzing the API to find more of this kind of issue,
> but don't know anything about it. Is there a recommended way to fuzz KVM?

Not sure either. kselftests is one possible way to drive these tests, at
least for unit-testing the new ABIs. This is something we plan to add.
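
As a user-space illustration of the ordering being discussed (this is a
simplified mock, not the actual KVM code -- the struct and function names
below are assumptions for the sketch), the point is that the kvm_is_realm()
guard runs at the top of the stack, before any RME-specific code can
dereference realm state:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Simplified mock of the VM state; not the real struct kvm. */
struct kvm_mock {
	bool is_realm;
	int realm_state;		/* 0 == REALM_STATE_NONE */
};

static bool kvm_is_realm(struct kvm_mock *kvm)
{
	return kvm->is_realm;
}

/* Stands in for kvm_rme_config_realm(): assumes a realm VM. */
static int rme_config_realm(struct kvm_mock *kvm)
{
	if (kvm->realm_state != 0)
		return -EBUSY;
	return 0;
}

/*
 * Models the suggested fix: reject KVM_CAP_ARM_RME for non-realm VMs
 * in the enable-cap path, before any RME-specific code runs.
 */
static int enable_cap_arm_rme(struct kvm_mock *kvm)
{
	if (!kvm_is_realm(kvm))
		return -EINVAL;
	return rme_config_realm(kvm);
}
```

The ordering is the only thing this demonstrates: a non-realm VM gets
-EINVAL up front and never reaches the realm configuration code.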

Thanks for catching this.

Suzuki



> Thanks,
> Jean
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-01-27 11:22 [RFC] Support for Arm CCA VMs on Linux Suzuki K Poulose
                   ` (4 preceding siblings ...)
  2023-01-27 15:26 ` [RFC] Support for Arm CCA VMs on Linux Jean-Philippe Brucker
@ 2023-02-10 16:51 ` Ryan Roberts
  2023-02-10 22:53   ` Itaru Kitayama
  2023-02-14 17:13 ` Dr. David Alan Gilbert
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 190+ messages in thread
From: Ryan Roberts @ 2023-02-10 16:51 UTC (permalink / raw)
  To: Suzuki K Poulose, linux-coco, linux-kernel, kvm, kvmarm,
	linux-arm-kernel
  Cc: Alexandru Elisei, Andrew Jones, Catalin Marinas, Chao Peng,
	Christoffer Dall, Fuad Tabba, James Morse, Jean-Philippe Brucker,
	Joey Gouly, Marc Zyngier, Mark Rutland, Oliver Upton,
	Paolo Bonzini, Quentin Perret, Sean Christopherson, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

On 27/01/2023 11:22, Suzuki K Poulose wrote:
> [...]

> Running the stack
> ====================
> 
> To run/test the stack, you would need the following components :
> 
> 1) FVP Base AEM RevC model with FEAT_RME support [4]
> 2) TF-A firmware for EL3 [5]
> 3) TF-A RMM for R-EL2 [3]
> 4) Linux Kernel [6]
> 5) kvmtool [7]
> 6) kvm-unit-tests [8]
> 
> Instructions for building the firmware components and running the model are
> available here [9]. Once, the host kernel is booted, a Realm can be launched by
> invoking the `lkvm` commad as follows:
> 
>  $ lkvm run --realm 				 \
> 	 --measurement-algo=["sha256", "sha512"] \
> 	 --disable-sve				 \
> 	 <normal-vm-options>
> 
> Where:
>  * --measurement-algo (Optional) specifies the algorithm selected for creating the
>    initial measurements by the RMM for this Realm (defaults to sha256).
>  * GICv3 is mandatory for the Realms.
>  * SVE is not yet supported in the TF-RMM, and thus must be disabled using
>    --disable-sve
> 
> You may also run the kvm-unit-tests inside the Realm world, using the similar
> options as above.

Building all of these components and configuring the FVP correctly can be quite
tricky, so I thought I would plug a tool we have called Shrinkwrap, which can
simplify all of this.

The tool accepts a yaml input configuration that describes how a set of
components should be built and packaged, and how the FVP should be configured
and booted. And by default, it uses a Docker container on its backend, which
contains all the required tools, including the FVP. You can optionally use
Podman or have it run on your native system if you prefer. It supports both
x86_64 and aarch64. And you can even run it in --dry-run mode to see the set of
shell commands that would have been executed.

It comes with two CCA configs out-of-the-box; cca-3world.yaml builds TF-A, RMM,
Linux (for both host and guest), kvmtool and kvm-unit-tests. cca-4world.yaml
adds Hafnium and some demo SPs for the secure world (although since Hafnium
requires x86_64 to build, cca-4world.yaml doesn't currently work on an aarch64
build host).

See the documentation [1] and repository [2] for more info.

Brief instructions to get you up and running:

  # Install shrinkwrap. (I assume you have Docker installed):
  sudo pip3 install pyyaml termcolor tuxmake
  git clone https://git.gitlab.arm.com/tooling/shrinkwrap.git
  export PATH=$PWD/shrinkwrap/shrinkwrap:$PATH

  # If running Python < 3.9:
  sudo pip3 install graphlib-backport

  # Build all the CCA components:
  shrinkwrap build cca-3world.yaml [--dry-run]

  # Run the stack in the FVP:
  shrinkwrap run cca-3world.yaml -r ROOTFS=<my_rootfs.ext4> [--dry-run]

By default, building is done at ~/.shrinkwrap/build/cca-3world and the package
is created at ~/.shrinkwrap/package/cca-3world (this can be changed with
envvars).

The 'run' command will boot TF-A, RMM and host Linux kernel in the FVP, and
mount the provided rootfs. You will likely want to have copied the userspace
pieces into the rootfs before running, so you can create realms:

- ~/.shrinkwrap/package/cca-3world/Image (kernel with RMI and RSI support)
- ~/.shrinkwrap/package/cca-3world/lkvm (kvmtool able to launch realms)
- ~/.shrinkwrap/package/cca-3world/kvm-unit-tests.tgz (built kvm-unit-tests)

Once the FVP is booted to a shell, you can do something like this to launch a
Linux guest in a realm:

  lkvm run --realm --disable-sve -c 1 -m 256 -k Image

[1] https://shrinkwrap.docs.arm.com
[2] https://gitlab.arm.com/tooling/shrinkwrap


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-02-10 16:51 ` Ryan Roberts
@ 2023-02-10 22:53   ` Itaru Kitayama
  2023-02-17  8:02     ` Itaru Kitayama
  0 siblings, 1 reply; 190+ messages in thread
From: Itaru Kitayama @ 2023-02-10 22:53 UTC (permalink / raw)
  To: Ryan Roberts
  Cc: Suzuki K Poulose, linux-coco, linux-kernel, kvm, kvmarm,
	linux-arm-kernel, Alexandru Elisei, Andrew Jones,
	Catalin Marinas, Chao Peng, Christoffer Dall, Fuad Tabba,
	James Morse, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Sean Christopherson, Steven Price, Thomas Huth, Will Deacon,
	Zenghui Yu, kvmarm

On Sat, Feb 11, 2023 at 1:56 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 27/01/2023 11:22, Suzuki K Poulose wrote:
> > [...]
>
> > Running the stack
> > ====================
> >
> > To run/test the stack, you would need the following components :
> >
> > 1) FVP Base AEM RevC model with FEAT_RME support [4]
> > 2) TF-A firmware for EL3 [5]
> > 3) TF-A RMM for R-EL2 [3]
> > 4) Linux Kernel [6]
> > 5) kvmtool [7]
> > 6) kvm-unit-tests [8]
> >
> > Instructions for building the firmware components and running the model are
> > available here [9]. Once the host kernel is booted, a Realm can be launched by
> > invoking the `lkvm` command as follows:
> >
> >  $ lkvm run --realm                            \
> >        --measurement-algo=["sha256", "sha512"] \
> >        --disable-sve                           \
> >        <normal-vm-options>
> >
> > Where:
> >  * --measurement-algo (optional) selects the algorithm the RMM uses to create
> >    the initial measurements for this Realm (defaults to sha256).
> >  * GICv3 is mandatory for the Realms.
> >  * SVE is not yet supported in the TF-RMM, and thus must be disabled using
> >    --disable-sve
> >
> > You may also run the kvm-unit-tests inside the Realm world, using similar
> > options to the above.
>
> Building all of these components and configuring the FVP correctly can be quite
> tricky, so I thought I would plug a tool we have called Shrinkwrap, which can
> simplify all of this.
>
> The tool accepts a yaml input configuration that describes how a set of
> components should be built and packaged, and how the FVP should be configured
> and booted. By default, it uses a Docker container as its backend, which
> contains all the required tools, including the FVP. You can optionally use
> Podman or have it run on your native system if you prefer. It supports both
> x86_64 and aarch64. And you can even run it in --dry-run mode to see the set of
> shell commands that would have been executed.
>
> It comes with two CCA configs out of the box: cca-3world.yaml builds TF-A, RMM,
> Linux (for both host and guest), kvmtool and kvm-unit-tests. cca-4world.yaml
> adds Hafnium and some demo SPs for the secure world (although since Hafnium
> requires x86_64 to build, cca-4world.yaml doesn't currently work on an aarch64
> build host).
>
> See the documentation [1] and repository [2] for more info.
>
> Brief instructions to get you up and running:
>
>   # Install shrinkwrap. (I assume you have Docker installed):
>   sudo pip3 install pyyaml termcolor tuxmake
>   git clone https://git.gitlab.arm.com/tooling/shrinkwrap.git
>   export PATH=$PWD/shrinkwrap/shrinkwrap:$PATH
>
>   # If running Python < 3.9:
>   sudo pip3 install graphlib-backport
>
>   # Build all the CCA components:
>   shrinkwrap build cca-3world.yaml [--dry-run]

This has been working on my Multipass instance on M1, thanks for the tool.

Thanks,
Itaru.

>
>   # Run the stack in the FVP:
>   shrinkwrap run cca-3world.yaml -r ROOTFS=<my_rootfs.ext4> [--dry-run]
>
> By default, building is done at ~/.shrinkwrap/build/cca-3world and the package
> is created at ~/.shrinkwrap/package/cca-3world (this can be changed with
> envvars).
>
> The 'run' command will boot TF-A, RMM and host Linux kernel in the FVP, and
> mount the provided rootfs. You will likely want to have copied the userspace
> pieces into the rootfs before running, so you can create realms:
>
> - ~/.shrinkwrap/package/cca-3world/Image (kernel with RMI and RSI support)
> - ~/.shrinkwrap/package/cca-3world/lkvm (kvmtool able to launch realms)
> - ~/.shrinkwrap/package/cca-3world/kvm-unit-tests.tgz (built kvm-unit-tests)
>
> Once the FVP is booted to a shell, you can do something like this to launch a
> Linux guest in a realm:
>
>   lkvm run --realm --disable-sve -c 1 -m 256 -k Image
>
> [1] https://shrinkwrap.docs.arm.com
> [2] https://gitlab.arm.com/tooling/shrinkwrap
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: [RFC PATCH 04/28] arm64: RME: Check for RME support at KVM init
  2023-01-27 11:29   ` [RFC PATCH 04/28] arm64: RME: Check for RME support at KVM init Steven Price
@ 2023-02-13 15:48     ` Zhi Wang
  2023-02-13 15:59       ` Steven Price
  2023-02-13 15:55     ` Zhi Wang
  2024-03-18  7:17     ` Ganapatrao Kulkarni
  2 siblings, 1 reply; 190+ messages in thread
From: Zhi Wang @ 2023-02-13 15:48 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 27 Jan 2023 11:29:08 +0000
Steven Price <steven.price@arm.com> wrote:

> Query the RMI version number and check if it is a compatible version. A
> static key is also provided to signal that a supported RMM is available.
> 
> Functions are provided to query if a VM or VCPU is a realm (or rec)
> which currently will always return false.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/kvm_emulate.h | 17 ++++++++++
>  arch/arm64/include/asm/kvm_host.h    |  4 +++
>  arch/arm64/include/asm/kvm_rme.h     | 22 +++++++++++++
>  arch/arm64/include/asm/virt.h        |  1 +
>  arch/arm64/kvm/Makefile              |  3 +-
>  arch/arm64/kvm/arm.c                 |  8 +++++
>  arch/arm64/kvm/rme.c                 | 49 ++++++++++++++++++++++++++++
>  7 files changed, 103 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/include/asm/kvm_rme.h
>  create mode 100644 arch/arm64/kvm/rme.c
> 
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 9bdba47f7e14..5a2b7229e83f 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -490,4 +490,21 @@ static inline bool vcpu_has_feature(struct kvm_vcpu *vcpu, int feature)
>  	return test_bit(feature, vcpu->arch.features);
>  }
>  
> +static inline bool kvm_is_realm(struct kvm *kvm)
> +{
> +	if (static_branch_unlikely(&kvm_rme_is_available))
> +		return kvm->arch.is_realm;
> +	return false;
> +}
> +
> +static inline enum realm_state kvm_realm_state(struct kvm *kvm)
> +{
> +	return READ_ONCE(kvm->arch.realm.state);
> +}
> +
> +static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
> +{
> +	return false;
> +}
> +
>  #endif /* __ARM64_KVM_EMULATE_H__ */
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 35a159d131b5..04347c3a8c6b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -26,6 +26,7 @@
>  #include <asm/fpsimd.h>
>  #include <asm/kvm.h>
>  #include <asm/kvm_asm.h>
> +#include <asm/kvm_rme.h>
>  
>  #define __KVM_HAVE_ARCH_INTC_INITIALIZED
>  
> @@ -240,6 +241,9 @@ struct kvm_arch {
>  	 * the associated pKVM instance in the hypervisor.
>  	 */
>  	struct kvm_protected_vm pkvm;
> +
> +	bool is_realm;
               ^
It would be better to put more comments here; that really helps the review.

I was looking for the user of this member to see when it is set. It seems
it is not set in this patch. It would have been nice to have a quick answer
from the comments.
> +	struct realm realm;
>  };
>  
>  struct kvm_vcpu_fault_info {
> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> new file mode 100644
> index 000000000000..c26bc2c6770d
> --- /dev/null
> +++ b/arch/arm64/include/asm/kvm_rme.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2023 ARM Ltd.
> + */
> +
> +#ifndef __ASM_KVM_RME_H
> +#define __ASM_KVM_RME_H
> +
> +enum realm_state {
> +	REALM_STATE_NONE,
> +	REALM_STATE_NEW,
> +	REALM_STATE_ACTIVE,
> +	REALM_STATE_DYING
> +};
> +
> +struct realm {
> +	enum realm_state state;
> +};
> +
> +int kvm_init_rme(void);
> +
> +#endif
> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> index 4eb601e7de50..be1383e26626 100644
> --- a/arch/arm64/include/asm/virt.h
> +++ b/arch/arm64/include/asm/virt.h
> @@ -80,6 +80,7 @@ void __hyp_set_vectors(phys_addr_t phys_vector_base);
>  void __hyp_reset_vectors(void);
>  
>  DECLARE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);
> +DECLARE_STATIC_KEY_FALSE(kvm_rme_is_available);
>  
>  /* Reports the availability of HYP mode */
>  static inline bool is_hyp_mode_available(void)
> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> index 5e33c2d4645a..d2f0400c50da 100644
> --- a/arch/arm64/kvm/Makefile
> +++ b/arch/arm64/kvm/Makefile
> @@ -20,7 +20,8 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
>  	 vgic/vgic-v3.o vgic/vgic-v4.o \
>  	 vgic/vgic-mmio.o vgic/vgic-mmio-v2.o \
>  	 vgic/vgic-mmio-v3.o vgic/vgic-kvm-device.o \
> -	 vgic/vgic-its.o vgic/vgic-debug.o
> +	 vgic/vgic-its.o vgic/vgic-debug.o \
> +	 rme.o
>  
>  kvm-$(CONFIG_HW_PERF_EVENTS)  += pmu-emul.o pmu.o
>  
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 9c5573bc4614..d97b39d042ab 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -38,6 +38,7 @@
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_mmu.h>
>  #include <asm/kvm_pkvm.h>
> +#include <asm/kvm_rme.h>
>  #include <asm/kvm_emulate.h>
>  #include <asm/sections.h>
>  
> @@ -47,6 +48,7 @@
>  
>  static enum kvm_mode kvm_mode = KVM_MODE_DEFAULT;
>  DEFINE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);
> +DEFINE_STATIC_KEY_FALSE(kvm_rme_is_available);
>  
>  DECLARE_KVM_HYP_PER_CPU(unsigned long, kvm_hyp_vector);
>  
> @@ -2213,6 +2215,12 @@ int kvm_arch_init(void *opaque)
>  
>  	in_hyp_mode = is_kernel_in_hyp_mode();
>  
> +	if (in_hyp_mode) {
> +		err = kvm_init_rme();
> +		if (err)
> +			return err;
> +	}
> +
>  	if (cpus_have_final_cap(ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE) ||
>  	    cpus_have_final_cap(ARM64_WORKAROUND_1508412))
>  		kvm_info("Guests without required CPU erratum workarounds can deadlock system!\n" \
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> new file mode 100644
> index 000000000000..f6b587bc116e
> --- /dev/null
> +++ b/arch/arm64/kvm/rme.c
> @@ -0,0 +1,49 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2023 ARM Ltd.
> + */
> +
> +#include <linux/kvm_host.h>
> +
> +#include <asm/rmi_cmds.h>
> +#include <asm/virt.h>
> +
> +static int rmi_check_version(void)
> +{
> +	struct arm_smccc_res res;
> +	int version_major, version_minor;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_VERSION, &res);
> +
> +	if (res.a0 == SMCCC_RET_NOT_SUPPORTED)
> +		return -ENXIO;
> +
> +	version_major = RMI_ABI_VERSION_GET_MAJOR(res.a0);
> +	version_minor = RMI_ABI_VERSION_GET_MINOR(res.a0);
> +
> +	if (version_major != RMI_ABI_MAJOR_VERSION) {
> +		kvm_err("Unsupported RMI ABI (version %d.%d) we support %d\n",
> +			version_major, version_minor,
> +			RMI_ABI_MAJOR_VERSION);
> +		return -ENXIO;
> +	}
> +
> +	kvm_info("RMI ABI version %d.%d\n", version_major, version_minor);
> +
> +	return 0;
> +}
> +
> +int kvm_init_rme(void)
> +{
> +	if (PAGE_SIZE != SZ_4K)
> +		/* Only 4k page size on the host is supported */
> +		return 0;
> +
> +	if (rmi_check_version())
> +		/* Continue without realm support */
> +		return 0;
> +
> +	/* Future patch will enable static branch kvm_rme_is_available */
> +
> +	return 0;
> +}



* Re: [RFC PATCH 04/28] arm64: RME: Check for RME support at KVM init
  2023-01-27 11:29   ` [RFC PATCH 04/28] arm64: RME: Check for RME support at KVM init Steven Price
  2023-02-13 15:48     ` Zhi Wang
@ 2023-02-13 15:55     ` Zhi Wang
  2024-03-18  7:17     ` Ganapatrao Kulkarni
  2 siblings, 0 replies; 190+ messages in thread
From: Zhi Wang @ 2023-02-13 15:55 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 27 Jan 2023 11:29:08 +0000
Steven Price <steven.price@arm.com> wrote:

> Query the RMI version number and check if it is a compatible version. A
> static key is also provided to signal that a supported RMM is available.
> 
> Functions are provided to query if a VM or VCPU is a realm (or rec)
> which currently will always return false.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/kvm_emulate.h | 17 ++++++++++
>  arch/arm64/include/asm/kvm_host.h    |  4 +++
>  arch/arm64/include/asm/kvm_rme.h     | 22 +++++++++++++
>  arch/arm64/include/asm/virt.h        |  1 +
>  arch/arm64/kvm/Makefile              |  3 +-
>  arch/arm64/kvm/arm.c                 |  8 +++++
>  arch/arm64/kvm/rme.c                 | 49 ++++++++++++++++++++++++++++
>  7 files changed, 103 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/include/asm/kvm_rme.h
>  create mode 100644 arch/arm64/kvm/rme.c
> 
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 9bdba47f7e14..5a2b7229e83f 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -490,4 +490,21 @@ static inline bool vcpu_has_feature(struct kvm_vcpu *vcpu, int feature)
>  	return test_bit(feature, vcpu->arch.features);
>  }
>  
> +static inline bool kvm_is_realm(struct kvm *kvm)
> +{
> +	if (static_branch_unlikely(&kvm_rme_is_available))
> +		return kvm->arch.is_realm;
> +	return false;
> +}
> +
> +static inline enum realm_state kvm_realm_state(struct kvm *kvm)
> +{
> +	return READ_ONCE(kvm->arch.realm.state);
> +}
> +
> +static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
> +{
> +	return false;
> +}
> +
>  #endif /* __ARM64_KVM_EMULATE_H__ */
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 35a159d131b5..04347c3a8c6b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -26,6 +26,7 @@
>  #include <asm/fpsimd.h>
>  #include <asm/kvm.h>
>  #include <asm/kvm_asm.h>
> +#include <asm/kvm_rme.h>
>  
>  #define __KVM_HAVE_ARCH_INTC_INITIALIZED
>  
> @@ -240,6 +241,9 @@ struct kvm_arch {
>  	 * the associated pKVM instance in the hypervisor.
>  	 */
>  	struct kvm_protected_vm pkvm;
> +
> +	bool is_realm;
> +	struct realm realm;
>  };
>  
>  struct kvm_vcpu_fault_info {
> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> new file mode 100644
> index 000000000000..c26bc2c6770d
> --- /dev/null
> +++ b/arch/arm64/include/asm/kvm_rme.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2023 ARM Ltd.
> + */
> +
> +#ifndef __ASM_KVM_RME_H
> +#define __ASM_KVM_RME_H
> +
> +enum realm_state {
> +	REALM_STATE_NONE,
> +	REALM_STATE_NEW,
> +	REALM_STATE_ACTIVE,
> +	REALM_STATE_DYING
> +};
> +

By the way, it would be better to add more comments here introducing the
states.

> +struct realm {
> +	enum realm_state state;
> +};
> +
> +int kvm_init_rme(void);
> +
> +#endif
> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> index 4eb601e7de50..be1383e26626 100644
> --- a/arch/arm64/include/asm/virt.h
> +++ b/arch/arm64/include/asm/virt.h
> @@ -80,6 +80,7 @@ void __hyp_set_vectors(phys_addr_t phys_vector_base);
>  void __hyp_reset_vectors(void);
>  
>  DECLARE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);
> +DECLARE_STATIC_KEY_FALSE(kvm_rme_is_available);
>  
>  /* Reports the availability of HYP mode */
>  static inline bool is_hyp_mode_available(void)
> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> index 5e33c2d4645a..d2f0400c50da 100644
> --- a/arch/arm64/kvm/Makefile
> +++ b/arch/arm64/kvm/Makefile
> @@ -20,7 +20,8 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
>  	 vgic/vgic-v3.o vgic/vgic-v4.o \
>  	 vgic/vgic-mmio.o vgic/vgic-mmio-v2.o \
>  	 vgic/vgic-mmio-v3.o vgic/vgic-kvm-device.o \
> -	 vgic/vgic-its.o vgic/vgic-debug.o
> +	 vgic/vgic-its.o vgic/vgic-debug.o \
> +	 rme.o
>  
>  kvm-$(CONFIG_HW_PERF_EVENTS)  += pmu-emul.o pmu.o
>  
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 9c5573bc4614..d97b39d042ab 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -38,6 +38,7 @@
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_mmu.h>
>  #include <asm/kvm_pkvm.h>
> +#include <asm/kvm_rme.h>
>  #include <asm/kvm_emulate.h>
>  #include <asm/sections.h>
>  
> @@ -47,6 +48,7 @@
>  
>  static enum kvm_mode kvm_mode = KVM_MODE_DEFAULT;
>  DEFINE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);
> +DEFINE_STATIC_KEY_FALSE(kvm_rme_is_available);
>  
>  DECLARE_KVM_HYP_PER_CPU(unsigned long, kvm_hyp_vector);
>  
> @@ -2213,6 +2215,12 @@ int kvm_arch_init(void *opaque)
>  
>  	in_hyp_mode = is_kernel_in_hyp_mode();
>  
> +	if (in_hyp_mode) {
> +		err = kvm_init_rme();
> +		if (err)
> +			return err;
> +	}
> +
>  	if (cpus_have_final_cap(ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE) ||
>  	    cpus_have_final_cap(ARM64_WORKAROUND_1508412))
>  		kvm_info("Guests without required CPU erratum workarounds can deadlock system!\n" \
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> new file mode 100644
> index 000000000000..f6b587bc116e
> --- /dev/null
> +++ b/arch/arm64/kvm/rme.c
> @@ -0,0 +1,49 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2023 ARM Ltd.
> + */
> +
> +#include <linux/kvm_host.h>
> +
> +#include <asm/rmi_cmds.h>
> +#include <asm/virt.h>
> +
> +static int rmi_check_version(void)
> +{
> +	struct arm_smccc_res res;
> +	int version_major, version_minor;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_VERSION, &res);
> +
> +	if (res.a0 == SMCCC_RET_NOT_SUPPORTED)
> +		return -ENXIO;
> +
> +	version_major = RMI_ABI_VERSION_GET_MAJOR(res.a0);
> +	version_minor = RMI_ABI_VERSION_GET_MINOR(res.a0);
> +
> +	if (version_major != RMI_ABI_MAJOR_VERSION) {
> +		kvm_err("Unsupported RMI ABI (version %d.%d) we support %d\n",
> +			version_major, version_minor,
> +			RMI_ABI_MAJOR_VERSION);
> +		return -ENXIO;
> +	}
> +
> +	kvm_info("RMI ABI version %d.%d\n", version_major, version_minor);
> +
> +	return 0;
> +}
> +
> +int kvm_init_rme(void)
> +{
> +	if (PAGE_SIZE != SZ_4K)
> +		/* Only 4k page size on the host is supported */
> +		return 0;
> +
> +	if (rmi_check_version())
> +		/* Continue without realm support */
> +		return 0;
> +
> +	/* Future patch will enable static branch kvm_rme_is_available */
> +
> +	return 0;
> +}



* Re: [RFC PATCH 04/28] arm64: RME: Check for RME support at KVM init
  2023-02-13 15:48     ` Zhi Wang
@ 2023-02-13 15:59       ` Steven Price
  2023-03-04 12:07         ` Zhi Wang
  0 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-02-13 15:59 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On 13/02/2023 15:48, Zhi Wang wrote:
> On Fri, 27 Jan 2023 11:29:08 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> Query the RMI version number and check if it is a compatible version. A
>> static key is also provided to signal that a supported RMM is available.
>>
>> Functions are provided to query if a VM or VCPU is a realm (or rec)
>> which currently will always return false.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  arch/arm64/include/asm/kvm_emulate.h | 17 ++++++++++
>>  arch/arm64/include/asm/kvm_host.h    |  4 +++
>>  arch/arm64/include/asm/kvm_rme.h     | 22 +++++++++++++
>>  arch/arm64/include/asm/virt.h        |  1 +
>>  arch/arm64/kvm/Makefile              |  3 +-
>>  arch/arm64/kvm/arm.c                 |  8 +++++
>>  arch/arm64/kvm/rme.c                 | 49 ++++++++++++++++++++++++++++
>>  7 files changed, 103 insertions(+), 1 deletion(-)
>>  create mode 100644 arch/arm64/include/asm/kvm_rme.h
>>  create mode 100644 arch/arm64/kvm/rme.c
>>
>> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
>> index 9bdba47f7e14..5a2b7229e83f 100644
>> --- a/arch/arm64/include/asm/kvm_emulate.h
>> +++ b/arch/arm64/include/asm/kvm_emulate.h
>> @@ -490,4 +490,21 @@ static inline bool vcpu_has_feature(struct kvm_vcpu *vcpu, int feature)
>>  	return test_bit(feature, vcpu->arch.features);
>>  }
>>  
>> +static inline bool kvm_is_realm(struct kvm *kvm)
>> +{
>> +	if (static_branch_unlikely(&kvm_rme_is_available))
>> +		return kvm->arch.is_realm;
>> +	return false;
>> +}
>> +
>> +static inline enum realm_state kvm_realm_state(struct kvm *kvm)
>> +{
>> +	return READ_ONCE(kvm->arch.realm.state);
>> +}
>> +
>> +static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
>> +{
>> +	return false;
>> +}
>> +
>>  #endif /* __ARM64_KVM_EMULATE_H__ */
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 35a159d131b5..04347c3a8c6b 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -26,6 +26,7 @@
>>  #include <asm/fpsimd.h>
>>  #include <asm/kvm.h>
>>  #include <asm/kvm_asm.h>
>> +#include <asm/kvm_rme.h>
>>  
>>  #define __KVM_HAVE_ARCH_INTC_INITIALIZED
>>  
>> @@ -240,6 +241,9 @@ struct kvm_arch {
>>  	 * the associated pKVM instance in the hypervisor.
>>  	 */
>>  	struct kvm_protected_vm pkvm;
>> +
>> +	bool is_realm;
>                ^
> It would be better to put more comments which really helps on the review.

Thanks for the feedback - I had thought "is realm" was fairly
self-documenting, but perhaps I've just spent too much time with this code.

> I was looking for the user of this member to see when it is set. It seems
> it is not set in this patch. It would have been nice to have a quick answer
> from the comments.

The usage is in the kvm_is_realm() function, which is used in several of
the later patches as a way to detect whether a KVM guest is a realm guest.

I think the main issue is that I've got the patches in the wrong order.
Patch 7 "arm64: kvm: Allow passing machine type in KVM creation" should
probably be before this one, then I could add the assignment of is_realm
into this patch (potentially splitting out the is_realm parts into
another patch).

Thanks,

Steve



* Re: [RFC PATCH 05/28] arm64: RME: Define the user ABI
  2023-01-27 11:29   ` [RFC PATCH 05/28] arm64: RME: Define the user ABI Steven Price
@ 2023-02-13 16:04     ` Zhi Wang
  2023-03-01 11:54       ` Steven Price
  0 siblings, 1 reply; 190+ messages in thread
From: Zhi Wang @ 2023-02-13 16:04 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 27 Jan 2023 11:29:09 +0000
Steven Price <steven.price@arm.com> wrote:

> There is one (multiplexed) CAP which can be used to create, populate and
> then activate the realm.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  Documentation/virt/kvm/api.rst    |  1 +
>  arch/arm64/include/uapi/asm/kvm.h | 63 +++++++++++++++++++++++++++++++
>  include/uapi/linux/kvm.h          |  2 +
>  3 files changed, 66 insertions(+)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 0dd5d8733dd5..f1a59d6fb7fc 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -4965,6 +4965,7 @@ Recognised values for feature:
>  
>    =====      ===========================================
>    arm64      KVM_ARM_VCPU_SVE (requires KVM_CAP_ARM_SVE)
> +  arm64      KVM_ARM_VCPU_REC (requires KVM_CAP_ARM_RME)
>    =====      ===========================================
>  
>  Finalizes the configuration of the specified vcpu feature.
> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> index a7a857f1784d..fcc0b8dce29b 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -109,6 +109,7 @@ struct kvm_regs {
>  #define KVM_ARM_VCPU_SVE		4 /* enable SVE for this CPU */
>  #define KVM_ARM_VCPU_PTRAUTH_ADDRESS	5 /* VCPU uses address authentication */
>  #define KVM_ARM_VCPU_PTRAUTH_GENERIC	6 /* VCPU uses generic authentication */
> +#define KVM_ARM_VCPU_REC		7 /* VCPU REC state as part of Realm */
>  
>  struct kvm_vcpu_init {
>  	__u32 target;
> @@ -401,6 +402,68 @@ enum {
>  #define   KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES	3
>  #define   KVM_DEV_ARM_ITS_CTRL_RESET		4
>  
> +/* KVM_CAP_ARM_RME on VM fd */
> +#define KVM_CAP_ARM_RME_CONFIG_REALM		0
> +#define KVM_CAP_ARM_RME_CREATE_RD		1
> +#define KVM_CAP_ARM_RME_INIT_IPA_REALM		2
> +#define KVM_CAP_ARM_RME_POPULATE_REALM		3
> +#define KVM_CAP_ARM_RME_ACTIVATE_REALM		4
> +

It is a little bit confusing here. These seem more like 'commands', not caps.
Will leave more comments after reviewing the later patches.
 
> +#define KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256		0
> +#define KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA512		1
> +
> +#define KVM_CAP_ARM_RME_RPV_SIZE 64
> +
> +/* List of configuration items accepted for KVM_CAP_ARM_RME_CONFIG_REALM */
> +#define KVM_CAP_ARM_RME_CFG_RPV			0
> +#define KVM_CAP_ARM_RME_CFG_HASH_ALGO		1
> +#define KVM_CAP_ARM_RME_CFG_SVE			2
> +#define KVM_CAP_ARM_RME_CFG_DBG			3
> +#define KVM_CAP_ARM_RME_CFG_PMU			4
> +
> +struct kvm_cap_arm_rme_config_item {
> +	__u32 cfg;
> +	union {
> +		/* cfg == KVM_CAP_ARM_RME_CFG_RPV */
> +		struct {
> +			__u8	rpv[KVM_CAP_ARM_RME_RPV_SIZE];
> +		};
> +
> +		/* cfg == KVM_CAP_ARM_RME_CFG_HASH_ALGO */
> +		struct {
> +			__u32	hash_algo;
> +		};
> +
> +		/* cfg == KVM_CAP_ARM_RME_CFG_SVE */
> +		struct {
> +			__u32	sve_vq;
> +		};
> +
> +		/* cfg == KVM_CAP_ARM_RME_CFG_DBG */
> +		struct {
> +			__u32	num_brps;
> +			__u32	num_wrps;
> +		};
> +
> +		/* cfg == KVM_CAP_ARM_RME_CFG_PMU */
> +		struct {
> +			__u32	num_pmu_cntrs;
> +		};
> +		/* Fix the size of the union */
> +		__u8	reserved[256];
> +	};
> +};
> +
> +struct kvm_cap_arm_rme_populate_realm_args {
> +	__u64 populate_ipa_base;
> +	__u64 populate_ipa_size;
> +};
> +
> +struct kvm_cap_arm_rme_init_ipa_args {
> +	__u64 init_ipa_base;
> +	__u64 init_ipa_size;
> +};
> +
>  /* Device Control API on vcpu fd */
>  #define KVM_ARM_VCPU_PMU_V3_CTRL	0
>  #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 20522d4ba1e0..fec1909e8b73 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1176,6 +1176,8 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
>  #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
>  
> +#define KVM_CAP_ARM_RME 300 // FIXME: Large number to prevent conflicts
> +
>  #ifdef KVM_CAP_IRQ_ROUTING
>  
>  struct kvm_irq_routing_irqchip {



* Re: [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms
  2023-01-27 11:29   ` [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms Steven Price
  2023-02-07 12:25     ` Jean-Philippe Brucker
@ 2023-02-13 16:10     ` Zhi Wang
  2023-03-01 11:55       ` Steven Price
  2023-03-06 19:10     ` Zhi Wang
  2024-03-18  7:40     ` Ganapatrao Kulkarni
  3 siblings, 1 reply; 190+ messages in thread
From: Zhi Wang @ 2023-02-13 16:10 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 27 Jan 2023 11:29:10 +0000
Steven Price <steven.price@arm.com> wrote:

> Add the KVM_CAP_ARM_RME_CREATE_RD ioctl to create a realm. This involves
> delegating pages to the RMM to hold the Realm Descriptor (RD) and for
> the base level of the Realm Translation Tables (RTT). A VMID also need
> to be picked, since the RMM has a separate VMID address space a
> dedicated allocator is added for this purpose.
> 
> KVM_CAP_ARM_RME_CONFIG_REALM is provided to allow configuring the realm
> before it is created.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/kvm_rme.h |  14 ++
>  arch/arm64/kvm/arm.c             |  19 ++
>  arch/arm64/kvm/mmu.c             |   6 +
>  arch/arm64/kvm/reset.c           |  33 +++
>  arch/arm64/kvm/rme.c             | 357 +++++++++++++++++++++++++++++++
>  5 files changed, 429 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> index c26bc2c6770d..055a22accc08 100644
> --- a/arch/arm64/include/asm/kvm_rme.h
> +++ b/arch/arm64/include/asm/kvm_rme.h
> @@ -6,6 +6,8 @@
>  #ifndef __ASM_KVM_RME_H
>  #define __ASM_KVM_RME_H
>  
> +#include <uapi/linux/kvm.h>
> +
>  enum realm_state {
>  	REALM_STATE_NONE,
>  	REALM_STATE_NEW,
> @@ -15,8 +17,20 @@ enum realm_state {
>  
>  struct realm {
>  	enum realm_state state;
> +
> +	void *rd;
> +	struct realm_params *params;
> +
> +	unsigned long num_aux;
> +	unsigned int vmid;
> +	unsigned int ia_bits;
>  };
>  

Maybe more comments for this structure?

>  int kvm_init_rme(void);
> +u32 kvm_realm_ipa_limit(void);
> +
> +int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
> +int kvm_init_realm_vm(struct kvm *kvm);
> +void kvm_destroy_realm(struct kvm *kvm);
>  
>  #endif
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index d97b39d042ab..50f54a63732a 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -103,6 +103,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  		r = 0;
>  		set_bit(KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED, &kvm->arch.flags);
>  		break;
> +	case KVM_CAP_ARM_RME:
> +		if (!static_branch_unlikely(&kvm_rme_is_available))
> +			return -EINVAL;
> +		mutex_lock(&kvm->lock);
> +		r = kvm_realm_enable_cap(kvm, cap);
> +		mutex_unlock(&kvm->lock);
> +		break;
>  	default:
>  		r = -EINVAL;
>  		break;
> @@ -172,6 +179,13 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	 */
>  	kvm->arch.dfr0_pmuver.imp = kvm_arm_pmu_get_pmuver_limit();
>  
> +	/* Initialise the realm bits after the generic bits are enabled */
> +	if (kvm_is_realm(kvm)) {
> +		ret = kvm_init_realm_vm(kvm);
> +		if (ret)
> +			goto err_free_cpumask;
> +	}
> +
>  	return 0;
>  
>  err_free_cpumask:
> @@ -204,6 +218,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>  	kvm_destroy_vcpus(kvm);
>  
>  	kvm_unshare_hyp(kvm, kvm + 1);
> +
> +	kvm_destroy_realm(kvm);
>  }
>  
>  int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> @@ -300,6 +316,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_ARM_PTRAUTH_GENERIC:
>  		r = system_has_full_ptr_auth();
>  		break;
> +	case KVM_CAP_ARM_RME:
> +		r = static_key_enabled(&kvm_rme_is_available);
> +		break;
>  	default:
>  		r = 0;
>  	}
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 31d7fa4c7c14..d0f707767d05 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -840,6 +840,12 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
>  	struct kvm_pgtable *pgt = NULL;
>  
>  	write_lock(&kvm->mmu_lock);
> +	if (kvm_is_realm(kvm) &&
> +	    kvm_realm_state(kvm) != REALM_STATE_DYING) {
> +		/* TODO: teardown rtts */
> +		write_unlock(&kvm->mmu_lock);
> +		return;
> +	}
>  	pgt = mmu->pgt;
>  	if (pgt) {
>  		mmu->pgd_phys = 0;
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index e0267f672b8a..c165df174737 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -395,3 +395,36 @@ int kvm_set_ipa_limit(void)
>  
>  	return 0;
>  }
> +
> +int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
> +{
> +	u64 mmfr0, mmfr1;
> +	u32 phys_shift;
> +	u32 ipa_limit = kvm_ipa_limit;
> +
> +	if (kvm_is_realm(kvm))
> +		ipa_limit = kvm_realm_ipa_limit();
> +
> +	if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
> +		return -EINVAL;
> +
> +	phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type);
> +	if (phys_shift) {
> +		if (phys_shift > ipa_limit ||
> +		    phys_shift < ARM64_MIN_PARANGE_BITS)
> +			return -EINVAL;
> +	} else {
> +		phys_shift = KVM_PHYS_SHIFT;
> +		if (phys_shift > ipa_limit) {
> +			pr_warn_once("%s using unsupported default IPA limit, upgrade your VMM\n",
> +				     current->comm);
> +			return -EINVAL;
> +		}
> +	}
> +
> +	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
> +	mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> +	kvm->arch.vtcr = kvm_get_vtcr(mmfr0, mmfr1, phys_shift);
> +
> +	return 0;
> +}
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> index f6b587bc116e..9f8c5a91b8fc 100644
> --- a/arch/arm64/kvm/rme.c
> +++ b/arch/arm64/kvm/rme.c
> @@ -5,9 +5,49 @@
>  
>  #include <linux/kvm_host.h>
>  
> +#include <asm/kvm_emulate.h>
> +#include <asm/kvm_mmu.h>
>  #include <asm/rmi_cmds.h>
>  #include <asm/virt.h>
>  
> +/************ FIXME: Copied from kvm/hyp/pgtable.c **********/
> +#include <asm/kvm_pgtable.h>
> +
> +struct kvm_pgtable_walk_data {
> +	struct kvm_pgtable		*pgt;
> +	struct kvm_pgtable_walker	*walker;
> +
> +	u64				addr;
> +	u64				end;
> +};
> +
> +static u32 __kvm_pgd_page_idx(struct kvm_pgtable *pgt, u64 addr)
> +{
> +	u64 shift = kvm_granule_shift(pgt->start_level - 1); /* May underflow */
> +	u64 mask = BIT(pgt->ia_bits) - 1;
> +
> +	return (addr & mask) >> shift;
> +}
> +
> +static u32 kvm_pgd_pages(u32 ia_bits, u32 start_level)
> +{
> +	struct kvm_pgtable pgt = {
> +		.ia_bits	= ia_bits,
> +		.start_level	= start_level,
> +	};
> +
> +	return __kvm_pgd_page_idx(&pgt, -1ULL) + 1;
> +}
> +
> +/******************/
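As a side note for review, the copied helpers above reduce to simple index
arithmetic. A stand-alone user-space sketch (hypothetical names `pgd_pages`
and `GRANULE_SHIFT`, assuming a 4K granule and ia_bits < 64) shows how many
concatenated top-level table pages are needed:

```c
#include <assert.h>
#include <stdint.h>

/* (12 - 3) * (4 - level) + 3, as in ARM64_HW_PGTABLE_LEVEL_SHIFT() */
#define GRANULE_SHIFT(l) ((12 - 3) * (4 - (l)) + 3)

static uint32_t pgd_pages(uint32_t ia_bits, int start_level)
{
	/* shift of the level above start_level; may reach 48 for level -1 */
	int shift = GRANULE_SHIFT(start_level - 1);
	uint64_t mask = (1ULL << ia_bits) - 1; /* assumes ia_bits < 64 */

	/* index of the last possible address, plus one */
	return (uint32_t)((mask >> shift) + 1);
}
```

E.g. 48-bit IPA starting at level 0 needs a single PGD page, while 40-bit
IPA starting at level 1 needs two concatenated pages.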
> +
> +static unsigned long rmm_feat_reg0;
> +
> +static bool rme_supports(unsigned long feature)
> +{
> +	return !!u64_get_bits(rmm_feat_reg0, feature);
> +}
> +
>  static int rmi_check_version(void)
>  {
>  	struct arm_smccc_res res;
> @@ -33,8 +73,319 @@ static int rmi_check_version(void)
>  	return 0;
>  }
>  
> +static unsigned long create_realm_feat_reg0(struct kvm *kvm)
> +{
> +	unsigned long ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
> +	u64 feat_reg0 = 0;
> +
> +	int num_bps = u64_get_bits(rmm_feat_reg0,
> +				   RMI_FEATURE_REGISTER_0_NUM_BPS);
> +	int num_wps = u64_get_bits(rmm_feat_reg0,
> +				   RMI_FEATURE_REGISTER_0_NUM_WPS);
> +
> +	feat_reg0 |= u64_encode_bits(ia_bits, RMI_FEATURE_REGISTER_0_S2SZ);
> +	feat_reg0 |= u64_encode_bits(num_bps, RMI_FEATURE_REGISTER_0_NUM_BPS);
> +	feat_reg0 |= u64_encode_bits(num_wps, RMI_FEATURE_REGISTER_0_NUM_WPS);
> +
> +	return feat_reg0;
> +}
> +
> +u32 kvm_realm_ipa_limit(void)
> +{
> +	return u64_get_bits(rmm_feat_reg0, RMI_FEATURE_REGISTER_0_S2SZ);
> +}
> +
> +static u32 get_start_level(struct kvm *kvm)
> +{
> +	long sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, kvm->arch.vtcr);
> +
> +	return VTCR_EL2_TGRAN_SL0_BASE - sl0;
> +}
> +
> +static int realm_create_rd(struct kvm *kvm)
> +{
> +	struct realm *realm = &kvm->arch.realm;
> +	struct realm_params *params = realm->params;
> +	void *rd = NULL;
> +	phys_addr_t rd_phys, params_phys;
> +	struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
> +	unsigned int pgd_sz;
> +	int i, r;
> +
> +	if (WARN_ON(realm->rd) || WARN_ON(!realm->params))
> +		return -EEXIST;
> +
> +	rd = (void *)__get_free_page(GFP_KERNEL);
> +	if (!rd)
> +		return -ENOMEM;
> +
> +	rd_phys = virt_to_phys(rd);
> +	if (rmi_granule_delegate(rd_phys)) {
> +		r = -ENXIO;
> +		goto out;
> +	}
> +
> +	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
> +	for (i = 0; i < pgd_sz; i++) {
> +		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
> +
> +		if (rmi_granule_delegate(pgd_phys)) {
> +			r = -ENXIO;
> +			goto out_undelegate_tables;
> +		}
> +	}
> +
> +	params->rtt_level_start = get_start_level(kvm);
> +	params->rtt_num_start = pgd_sz;
> +	params->rtt_base = kvm->arch.mmu.pgd_phys;
> +	params->vmid = realm->vmid;
> +
> +	params_phys = virt_to_phys(params);
> +
> +	if (rmi_realm_create(rd_phys, params_phys)) {
> +		r = -ENXIO;
> +		goto out_undelegate_tables;
> +	}
> +
> +	realm->rd = rd;
> +	realm->ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
> +
> +	if (WARN_ON(rmi_rec_aux_count(rd_phys, &realm->num_aux))) {
> +		WARN_ON(rmi_realm_destroy(rd_phys));
> +		goto out_undelegate_tables;
> +	}
> +
> +	return 0;
> +
> +out_undelegate_tables:
> +	while (--i >= 0) {
> +		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
> +
> +		WARN_ON(rmi_granule_undelegate(pgd_phys));
> +	}
> +	WARN_ON(rmi_granule_undelegate(rd_phys));
> +out:
> +	free_page((unsigned long)rd);
> +	return r;
> +}
> +

Just curious: wouldn't it be better to use an IDR, as this is ID allocation?
There have been efforts before to convert bitmap-based allocators to IDR.

> +/* Protects access to rme_vmid_bitmap */
> +static DEFINE_SPINLOCK(rme_vmid_lock);
> +static unsigned long *rme_vmid_bitmap;
> +
> +static int rme_vmid_init(void)
> +{
> +	unsigned int vmid_count = 1 << kvm_get_vmid_bits();
> +
> +	rme_vmid_bitmap = bitmap_zalloc(vmid_count, GFP_KERNEL);
> +	if (!rme_vmid_bitmap) {
> +		kvm_err("%s: Couldn't allocate rme vmid bitmap\n", __func__);
> +		return -ENOMEM;
> +	}
> +
> +	return 0;
> +}
> +
> +static int rme_vmid_reserve(void)
> +{
> +	int ret;
> +	unsigned int vmid_count = 1 << kvm_get_vmid_bits();
> +
> +	spin_lock(&rme_vmid_lock);
> +	ret = bitmap_find_free_region(rme_vmid_bitmap, vmid_count, 0);
> +	spin_unlock(&rme_vmid_lock);
> +
> +	return ret;
> +}
> +
> +static void rme_vmid_release(unsigned int vmid)
> +{
> +	spin_lock(&rme_vmid_lock);
> +	bitmap_release_region(rme_vmid_bitmap, vmid, 0);
> +	spin_unlock(&rme_vmid_lock);
> +}
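For reference, the reserve/release pair above is a first-fit bitmap
allocator. Modelled in plain user-space C (hypothetical names, no locking,
fixed 8-bit VMID space for brevity) its behaviour is:

```c
#include <assert.h>
#include <limits.h>

/* Sketch of the VMID bitmap allocator: find the first clear bit,
 * set it, and return its index; release clears the bit again. */
#define VMID_COUNT 256

static unsigned char vmid_bitmap[VMID_COUNT / CHAR_BIT];

static int vmid_reserve(void)
{
	for (int i = 0; i < VMID_COUNT; i++) {
		if (!(vmid_bitmap[i / CHAR_BIT] & (1u << (i % CHAR_BIT)))) {
			vmid_bitmap[i / CHAR_BIT] |= 1u << (i % CHAR_BIT);
			return i;
		}
	}
	return -1; /* no free VMID, analogous to an errno return */
}

static void vmid_release(int vmid)
{
	vmid_bitmap[vmid / CHAR_BIT] &= ~(1u << (vmid % CHAR_BIT));
}
```

A released VMID is immediately reusable, which is why the RMM's separate
VMID space needs its own allocator rather than the host's ASID-style one.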
> +
> +static int kvm_create_realm(struct kvm *kvm)
> +{
> +	struct realm *realm = &kvm->arch.realm;
> +	int ret;
> +
> +	if (!kvm_is_realm(kvm) || kvm_realm_state(kvm) != REALM_STATE_NONE)
> +		return -EEXIST;
> +
> +	ret = rme_vmid_reserve();
> +	if (ret < 0)
> +		return ret;
> +	realm->vmid = ret;
> +
> +	ret = realm_create_rd(kvm);
> +	if (ret) {
> +		rme_vmid_release(realm->vmid);
> +		return ret;
> +	}
> +
> +	WRITE_ONCE(realm->state, REALM_STATE_NEW);
> +
> +	/* The realm is up, free the parameters.  */
> +	free_page((unsigned long)realm->params);
> +	realm->params = NULL;
> +
> +	return 0;
> +}
> +
> +static int config_realm_hash_algo(struct realm *realm,
> +				  struct kvm_cap_arm_rme_config_item *cfg)
> +{
> +	switch (cfg->hash_algo) {
> +	case KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256:
> +		if (!rme_supports(RMI_FEATURE_REGISTER_0_HASH_SHA_256))
> +			return -EINVAL;
> +		break;
> +	case KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA512:
> +		if (!rme_supports(RMI_FEATURE_REGISTER_0_HASH_SHA_512))
> +			return -EINVAL;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +	realm->params->measurement_algo = cfg->hash_algo;
> +	return 0;
> +}
> +
> +static int config_realm_sve(struct realm *realm,
> +			    struct kvm_cap_arm_rme_config_item *cfg)
> +{
> +	u64 features_0 = realm->params->features_0;
> +	int max_sve_vq = u64_get_bits(rmm_feat_reg0,
> +				      RMI_FEATURE_REGISTER_0_SVE_VL);
> +
> +	if (!rme_supports(RMI_FEATURE_REGISTER_0_SVE_EN))
> +		return -EINVAL;
> +
> +	if (cfg->sve_vq > max_sve_vq)
> +		return -EINVAL;
> +
> +	features_0 &= ~(RMI_FEATURE_REGISTER_0_SVE_EN |
> +			RMI_FEATURE_REGISTER_0_SVE_VL);
> +	features_0 |= u64_encode_bits(1, RMI_FEATURE_REGISTER_0_SVE_EN);
> +	features_0 |= u64_encode_bits(cfg->sve_vq,
> +				      RMI_FEATURE_REGISTER_0_SVE_VL);
> +
> +	realm->params->features_0 = features_0;
> +	return 0;
> +}
> +
> +static int kvm_rme_config_realm(struct kvm *kvm, struct kvm_enable_cap *cap)
> +{
> +	struct kvm_cap_arm_rme_config_item cfg;
> +	struct realm *realm = &kvm->arch.realm;
> +	int r = 0;
> +
> +	if (kvm_realm_state(kvm) != REALM_STATE_NONE)
> +		return -EBUSY;
> +
> +	if (copy_from_user(&cfg, (void __user *)cap->args[1], sizeof(cfg)))
> +		return -EFAULT;
> +
> +	switch (cfg.cfg) {
> +	case KVM_CAP_ARM_RME_CFG_RPV:
> +		memcpy(&realm->params->rpv, &cfg.rpv, sizeof(cfg.rpv));
> +		break;
> +	case KVM_CAP_ARM_RME_CFG_HASH_ALGO:
> +		r = config_realm_hash_algo(realm, &cfg);
> +		break;
> +	case KVM_CAP_ARM_RME_CFG_SVE:
> +		r = config_realm_sve(realm, &cfg);
> +		break;
> +	default:
> +		r = -EINVAL;
> +	}
> +
> +	return r;
> +}
> +
> +int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
> +{
> +	int r = 0;
> +
> +	switch (cap->args[0]) {
> +	case KVM_CAP_ARM_RME_CONFIG_REALM:
> +		r = kvm_rme_config_realm(kvm, cap);
> +		break;
> +	case KVM_CAP_ARM_RME_CREATE_RD:
> +		if (kvm->created_vcpus) {
> +			r = -EBUSY;
> +			break;
> +		}
> +
> +		r = kvm_create_realm(kvm);
> +		break;
> +	default:
> +		r = -EINVAL;
> +		break;
> +	}
> +
> +	return r;
> +}
> +
> +void kvm_destroy_realm(struct kvm *kvm)
> +{
> +	struct realm *realm = &kvm->arch.realm;
> +	struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
> +	unsigned int pgd_sz;
> +	int i;
> +
> +	if (realm->params) {
> +		free_page((unsigned long)realm->params);
> +		realm->params = NULL;
> +	}
> +
> +	if (kvm_realm_state(kvm) == REALM_STATE_NONE)
> +		return;
> +
> +	WRITE_ONCE(realm->state, REALM_STATE_DYING);
> +
> +	rme_vmid_release(realm->vmid);
> +
> +	if (realm->rd) {
> +		phys_addr_t rd_phys = virt_to_phys(realm->rd);
> +
> +		if (WARN_ON(rmi_realm_destroy(rd_phys)))
> +			return;
> +		if (WARN_ON(rmi_granule_undelegate(rd_phys)))
> +			return;
> +		free_page((unsigned long)realm->rd);
> +		realm->rd = NULL;
> +	}
> +
> +	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
> +	for (i = 0; i < pgd_sz; i++) {
> +		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
> +
> +		if (WARN_ON(rmi_granule_undelegate(pgd_phys)))
> +			return;
> +	}
> +
> +	kvm_free_stage2_pgd(&kvm->arch.mmu);
> +}
> +
> +int kvm_init_realm_vm(struct kvm *kvm)
> +{
> +	struct realm_params *params;
> +
> +	params = (struct realm_params *)get_zeroed_page(GFP_KERNEL);
> +	if (!params)
> +		return -ENOMEM;
> +
> +	params->features_0 = create_realm_feat_reg0(kvm);
> +	kvm->arch.realm.params = params;
> +	return 0;
> +}
> +
>  int kvm_init_rme(void)
>  {
> +	int ret;
> +
>  	if (PAGE_SIZE != SZ_4K)
>  		/* Only 4k page size on the host is supported */
>  		return 0;
> @@ -43,6 +394,12 @@ int kvm_init_rme(void)
>  		/* Continue without realm support */
>  		return 0;
>  
> +	ret = rme_vmid_init();
> +	if (ret)
> +		return ret;
> +
> +	WARN_ON(rmi_features(0, &rmm_feat_reg0));
> +
>  	/* Future patch will enable static branch kvm_rme_is_available */
>  
>  	return 0;


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 07/28] arm64: kvm: Allow passing machine type in KVM creation
  2023-01-27 11:29   ` [RFC PATCH 07/28] arm64: kvm: Allow passing machine type in KVM creation Steven Price
@ 2023-02-13 16:35     ` Zhi Wang
  2023-03-01 11:55       ` Steven Price
  0 siblings, 1 reply; 190+ messages in thread
From: Zhi Wang @ 2023-02-13 16:35 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 27 Jan 2023 11:29:11 +0000
Steven Price <steven.price@arm.com> wrote:

> Previously machine type was used purely for specifying the physical
> address size of the guest. Reserve the higher bits to specify an ARM
> specific machine type and declare a new type 'KVM_VM_TYPE_ARM_REALM'
> used to create a realm guest.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/kvm/arm.c     | 13 +++++++++++++
>  arch/arm64/kvm/mmu.c     |  3 ---
>  arch/arm64/kvm/reset.c   |  3 ---
>  include/uapi/linux/kvm.h | 19 +++++++++++++++----
>  4 files changed, 28 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 50f54a63732a..badd775547b8 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -147,6 +147,19 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  {
>  	int ret;
>  
> +	if (type & ~(KVM_VM_TYPE_ARM_MASK | KVM_VM_TYPE_ARM_IPA_SIZE_MASK))
> +		return -EINVAL;
> +
> +	switch (type & KVM_VM_TYPE_ARM_MASK) {
> +	case KVM_VM_TYPE_ARM_NORMAL:
> +		break;
> +	case KVM_VM_TYPE_ARM_REALM:
> +		kvm->arch.is_realm = true;

Would it be better to let this call fail when !kvm_rme_is_available? It is
strange to be able to create a VM of REALM type on a system that doesn't
support RME.

> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
>  	ret = kvm_share_hyp(kvm, kvm + 1);
>  	if (ret)
>  		return ret;
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index d0f707767d05..22c00274884a 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -709,9 +709,6 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t
>  	u64 mmfr0, mmfr1;
>  	u32 phys_shift;
>  
> -	if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
> -		return -EINVAL;
> -
>  	phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type);
>  	if (is_protected_kvm_enabled()) {
>  		phys_shift = kvm_ipa_limit;
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index c165df174737..9e71d69e051f 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -405,9 +405,6 @@ int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
>  	if (kvm_is_realm(kvm))
>  		ipa_limit = kvm_realm_ipa_limit();
>  
> -	if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
> -		return -EINVAL;
> -
>  	phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type);
>  	if (phys_shift) {
>  		if (phys_shift > ipa_limit ||
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index fec1909e8b73..bcfc4d58dc19 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -898,14 +898,25 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_S390_SIE_PAGE_OFFSET 1
>  
>  /*
> - * On arm64, machine type can be used to request the physical
> - * address size for the VM. Bits[7-0] are reserved for the guest
> - * PA size shift (i.e, log2(PA_Size)). For backward compatibility,
> - * value 0 implies the default IPA size, 40bits.
> + * On arm64, machine type can be used to request both the machine type and
> + * the physical address size for the VM.
> + *
> + * Bits[11-8] are reserved for the ARM specific machine type.
> + *
> + * Bits[7-0] are reserved for the guest PA size shift (i.e, log2(PA_Size)).
> + * For backward compatibility, value 0 implies the default IPA size, 40bits.
>   */
> +#define KVM_VM_TYPE_ARM_SHIFT		8
> +#define KVM_VM_TYPE_ARM_MASK		(0xfULL << KVM_VM_TYPE_ARM_SHIFT)
> +#define KVM_VM_TYPE_ARM(_type)		\
> +	(((_type) << KVM_VM_TYPE_ARM_SHIFT) & KVM_VM_TYPE_ARM_MASK)
> +#define KVM_VM_TYPE_ARM_NORMAL		KVM_VM_TYPE_ARM(0)
> +#define KVM_VM_TYPE_ARM_REALM		KVM_VM_TYPE_ARM(1)
> +
>  #define KVM_VM_TYPE_ARM_IPA_SIZE_MASK	0xffULL
>  #define KVM_VM_TYPE_ARM_IPA_SIZE(x)		\
>  	((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
> +
>  /*
>   * ioctls for /dev/kvm fds:
>   */
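To illustrate the resulting UABI from a VMM's point of view, the new
machine-type bits compose with the existing IPA-size field like this
(a user-space sketch; `realm_vm_type` is a hypothetical helper, constants
copied from the hunk above):

```c
#include <assert.h>
#include <stdint.h>

#define KVM_VM_TYPE_ARM_SHIFT		8
#define KVM_VM_TYPE_ARM_MASK		(0xfULL << KVM_VM_TYPE_ARM_SHIFT)
#define KVM_VM_TYPE_ARM(t)		\
	(((uint64_t)(t) << KVM_VM_TYPE_ARM_SHIFT) & KVM_VM_TYPE_ARM_MASK)
#define KVM_VM_TYPE_ARM_REALM		KVM_VM_TYPE_ARM(1)
#define KVM_VM_TYPE_ARM_IPA_SIZE_MASK	0xffULL

/* Build the 'type' argument a VMM would pass to KVM_CREATE_VM for a
 * realm guest; ipa_bits == 0 keeps the backward-compatible 40-bit default. */
static uint64_t realm_vm_type(unsigned int ipa_bits)
{
	return KVM_VM_TYPE_ARM_REALM |
	       (ipa_bits & KVM_VM_TYPE_ARM_IPA_SIZE_MASK);
}
```

Because the machine type lives in bits [11:8], a legacy VMM passing only an
IPA size (bits [7:0]) still gets KVM_VM_TYPE_ARM_NORMAL.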



* Re: [RFC PATCH 03/28] arm64: RME: Add wrappers for RMI calls
  2023-01-27 11:29   ` [RFC PATCH 03/28] arm64: RME: Add wrappers for RMI calls Steven Price
@ 2023-02-13 16:43     ` Zhi Wang
  2024-03-18  7:03     ` Ganapatrao Kulkarni
  1 sibling, 0 replies; 190+ messages in thread
From: Zhi Wang @ 2023-02-13 16:43 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 27 Jan 2023 11:29:07 +0000
Steven Price <steven.price@arm.com> wrote:

> The wrappers make the call sites easier to read and deal with the
> boilerplate of handling the error codes from the RMM.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/rmi_cmds.h | 259 ++++++++++++++++++++++++++++++
>  1 file changed, 259 insertions(+)
>  create mode 100644 arch/arm64/include/asm/rmi_cmds.h
> 
> diff --git a/arch/arm64/include/asm/rmi_cmds.h b/arch/arm64/include/asm/rmi_cmds.h
> new file mode 100644
> index 000000000000..d5468ee46f35
> --- /dev/null
> +++ b/arch/arm64/include/asm/rmi_cmds.h
> @@ -0,0 +1,259 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2023 ARM Ltd.
> + */
> +
> +#ifndef __ASM_RMI_CMDS_H
> +#define __ASM_RMI_CMDS_H
> +
> +#include <linux/arm-smccc.h>
> +
> +#include <asm/rmi_smc.h>
> +
> +struct rtt_entry {
> +	unsigned long walk_level;
> +	unsigned long desc;
> +	int state;
> +	bool ripas;
> +};
> +

It would be nice to have some documentation for the following wrappers,
e.g. the meaning of the return values. That would be quite helpful in the
later patch review.

> +static inline int rmi_data_create(unsigned long data, unsigned long rd,
> +				  unsigned long map_addr, unsigned long src,
> +				  unsigned long flags)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_DATA_CREATE, data, rd, map_addr, src,
> +			     flags, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_data_create_unknown(unsigned long data,
> +					  unsigned long rd,
> +					  unsigned long map_addr)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_DATA_CREATE_UNKNOWN, data, rd, map_addr,
> +			     &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_data_destroy(unsigned long rd, unsigned long map_addr)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_DATA_DESTROY, rd, map_addr, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_features(unsigned long index, unsigned long *out)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_FEATURES, index, &res);
> +
> +	*out = res.a1;
> +	return res.a0;
> +}
> +
> +static inline int rmi_granule_delegate(unsigned long phys)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_GRANULE_DELEGATE, phys, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_granule_undelegate(unsigned long phys)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_GRANULE_UNDELEGATE, phys, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_psci_complete(unsigned long calling_rec,
> +				    unsigned long target_rec)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_PSCI_COMPLETE, calling_rec, target_rec,
> +			     &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_realm_activate(unsigned long rd)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_REALM_ACTIVATE, rd, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_realm_create(unsigned long rd, unsigned long params_ptr)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_REALM_CREATE, rd, params_ptr, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_realm_destroy(unsigned long rd)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_REALM_DESTROY, rd, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rec_aux_count(unsigned long rd, unsigned long *aux_count)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_REC_AUX_COUNT, rd, &res);
> +
> +	*aux_count = res.a1;
> +	return res.a0;
> +}
> +
> +static inline int rmi_rec_create(unsigned long rec, unsigned long rd,
> +				 unsigned long params_ptr)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_REC_CREATE, rec, rd, params_ptr, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rec_destroy(unsigned long rec)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_REC_DESTROY, rec, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rec_enter(unsigned long rec, unsigned long run_ptr)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_REC_ENTER, rec, run_ptr, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rtt_create(unsigned long rtt, unsigned long rd,
> +				 unsigned long map_addr, unsigned long level)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_RTT_CREATE, rtt, rd, map_addr, level,
> +			     &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rtt_destroy(unsigned long rtt, unsigned long rd,
> +				  unsigned long map_addr, unsigned long level)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_RTT_DESTROY, rtt, rd, map_addr, level,
> +			     &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rtt_fold(unsigned long rtt, unsigned long rd,
> +			       unsigned long map_addr, unsigned long level)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_RTT_FOLD, rtt, rd, map_addr, level, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rtt_init_ripas(unsigned long rd, unsigned long map_addr,
> +				     unsigned long level)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_RTT_INIT_RIPAS, rd, map_addr, level, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rtt_map_unprotected(unsigned long rd,
> +					  unsigned long map_addr,
> +					  unsigned long level,
> +					  unsigned long desc)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_RTT_MAP_UNPROTECTED, rd, map_addr, level,
> +			     desc, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rtt_read_entry(unsigned long rd, unsigned long map_addr,
> +				     unsigned long level, struct rtt_entry *rtt)
> +{
> +	struct arm_smccc_1_2_regs regs = {
> +		SMC_RMI_RTT_READ_ENTRY,
> +		rd, map_addr, level
> +	};
> +
> +	arm_smccc_1_2_smc(&regs, &regs);
> +
> +	rtt->walk_level = regs.a1;
> +	rtt->state = regs.a2 & 0xFF;
> +	rtt->desc = regs.a3;
> +	rtt->ripas = regs.a4 & 1;
> +
> +	return regs.a0;
> +}
> +
> +static inline int rmi_rtt_set_ripas(unsigned long rd, unsigned long rec,
> +				    unsigned long map_addr, unsigned long level,
> +				    unsigned long ripas)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_RTT_SET_RIPAS, rd, rec, map_addr, level,
> +			     ripas, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rtt_unmap_unprotected(unsigned long rd,
> +					    unsigned long map_addr,
> +					    unsigned long level)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_RTT_UNMAP_UNPROTECTED, rd, map_addr,
> +			     level, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline phys_addr_t rmi_rtt_get_phys(struct rtt_entry *rtt)
> +{
> +	return rtt->desc & GENMASK(47, 12);
> +}
> +
> +#endif
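On the last helper: rmi_rtt_get_phys() masks the descriptor with
GENMASK(47, 12) to keep only the output-address bits. A stand-alone model
of that extraction (with `GENMASK_U64` spelled out by hand, since the
kernel macro isn't available in user space, and `rtt_desc_to_phys` a
hypothetical name):

```c
#include <assert.h>
#include <stdint.h>

/* Bits [h:l] set, all others clear — equivalent to the kernel's GENMASK_ULL */
#define GENMASK_U64(h, l) \
	(((~0ULL) >> (63 - (h))) & ~((1ULL << (l)) - 1))

/* Keep the 4K-aligned physical address in bits [47:12] of an RTT
 * descriptor, dropping the low attribute bits and any high bits. */
static uint64_t rtt_desc_to_phys(uint64_t desc)
{
	return desc & GENMASK_U64(47, 12);
}
```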



* Re: [RFC PATCH 08/28] arm64: RME: Keep a spare page delegated to the RMM
  2023-01-27 11:29   ` [RFC PATCH 08/28] arm64: RME: Keep a spare page delegated to the RMM Steven Price
@ 2023-02-13 16:47     ` Zhi Wang
  2023-03-01 11:55       ` Steven Price
  0 siblings, 1 reply; 190+ messages in thread
From: Zhi Wang @ 2023-02-13 16:47 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 27 Jan 2023 11:29:12 +0000
Steven Price <steven.price@arm.com> wrote:

> Pages can only be populated/destroyed on the RMM at the 4KB granule;
> this requires creating the full depth of RTTs. However if the pages are
> going to be combined into a 2MB huge page the last RTT is only
> temporarily needed. Similarly when freeing memory the huge page must be
> temporarily split, requiring temporary usage of the full depth of RTTs.
> 
> To avoid needing to perform a temporary allocation and delegation of a
> page for this purpose we keep a spare delegated page around. In
> particular this avoids the need for memory allocation while destroying
> the realm guest.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/kvm_rme.h | 3 +++
>  arch/arm64/kvm/rme.c             | 6 ++++++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> index 055a22accc08..a6318af3ed11 100644
> --- a/arch/arm64/include/asm/kvm_rme.h
> +++ b/arch/arm64/include/asm/kvm_rme.h
> @@ -21,6 +21,9 @@ struct realm {
>  	void *rd;
>  	struct realm_params *params;
>  
> +	/* A spare already delegated page */
> +	phys_addr_t spare_page;
> +
>  	unsigned long num_aux;
>  	unsigned int vmid;
>  	unsigned int ia_bits;
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> index 9f8c5a91b8fc..0c9d70e4d9e6 100644
> --- a/arch/arm64/kvm/rme.c
> +++ b/arch/arm64/kvm/rme.c
> @@ -148,6 +148,7 @@ static int realm_create_rd(struct kvm *kvm)
>  	}
>  
>  	realm->rd = rd;
> +	realm->spare_page = PHYS_ADDR_MAX;
>  	realm->ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
>  
>  	if (WARN_ON(rmi_rec_aux_count(rd_phys, &realm->num_aux))) {
> @@ -357,6 +358,11 @@ void kvm_destroy_realm(struct kvm *kvm)
>  		free_page((unsigned long)realm->rd);
>  		realm->rd = NULL;
>  	}
> +	if (realm->spare_page != PHYS_ADDR_MAX) {
> +		if (!WARN_ON(rmi_granule_undelegate(realm->spare_page)))
> +			free_page((unsigned long)phys_to_virt(realm->spare_page));

Will the page be leaked (not usable by either the host or realms) if the
undelegate fails? If so, it would be better to at least add a comment.

> +		realm->spare_page = PHYS_ADDR_MAX;
> +	}
>  
>  	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
>  	for (i = 0; i < pgd_sz; i++) {



* Re: [RFC PATCH 09/28] arm64: RME: RTT handling
  2023-01-27 11:29   ` [RFC PATCH 09/28] arm64: RME: RTT handling Steven Price
@ 2023-02-13 17:44     ` Zhi Wang
  2023-03-03 14:04       ` Steven Price
  2024-03-18 11:01     ` Ganapatrao Kulkarni
  1 sibling, 1 reply; 190+ messages in thread
From: Zhi Wang @ 2023-02-13 17:44 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 27 Jan 2023 11:29:13 +0000
Steven Price <steven.price@arm.com> wrote:

> The RMM owns the stage 2 page tables for a realm, and KVM must request
> that the RMM creates/destroys entries as necessary. The physical pages
> to store the page tables are delegated to the realm as required, and can
> be undelegated when no longer used.
> 

The commit message is only a brief introduction to RTT handling. As this
patch is mostly about RTT teardown, it would be better to expand the
description. Also, maybe refine the title to reflect what this patch is
actually doing.

> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/kvm_rme.h |  19 +++++
>  arch/arm64/kvm/mmu.c             |   7 +-
>  arch/arm64/kvm/rme.c             | 139 +++++++++++++++++++++++++++++++
>  3 files changed, 162 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> index a6318af3ed11..eea5118dfa8a 100644
> --- a/arch/arm64/include/asm/kvm_rme.h
> +++ b/arch/arm64/include/asm/kvm_rme.h
> @@ -35,5 +35,24 @@ u32 kvm_realm_ipa_limit(void);
>  int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
>  int kvm_init_realm_vm(struct kvm *kvm);
>  void kvm_destroy_realm(struct kvm *kvm);
> +void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level);
> +
> +#define RME_RTT_BLOCK_LEVEL	2
> +#define RME_RTT_MAX_LEVEL	3
> +
> +#define RME_PAGE_SHIFT		12
> +#define RME_PAGE_SIZE		BIT(RME_PAGE_SHIFT)
> +/* See ARM64_HW_PGTABLE_LEVEL_SHIFT() */
> +#define RME_RTT_LEVEL_SHIFT(l)	\
> +	((RME_PAGE_SHIFT - 3) * (4 - (l)) + 3)
> +#define RME_L2_BLOCK_SIZE	BIT(RME_RTT_LEVEL_SHIFT(2))
> +
> +static inline unsigned long rme_rtt_level_mapsize(int level)
> +{
> +	if (WARN_ON(level > RME_RTT_MAX_LEVEL))
> +		return RME_PAGE_SIZE;
> +
> +	return (1UL << RME_RTT_LEVEL_SHIFT(level));
> +}
>  
>  #endif
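The level-shift arithmetic added above can be checked stand-alone: with a
4K granule each RTT level resolves PAGE_SHIFT - 3 = 9 bits of address,
giving a 2MB block at level 2 and a 4KB page at level 3. A user-space
sketch (hypothetical `rtt_level_mapsize`, constants copied from the hunk):

```c
#include <assert.h>
#include <stdint.h>

#define RME_PAGE_SHIFT		12
/* See ARM64_HW_PGTABLE_LEVEL_SHIFT() */
#define RME_RTT_LEVEL_SHIFT(l)	\
	((RME_PAGE_SHIFT - 3) * (4 - (l)) + 3)

/* Size of the IPA range mapped by one entry at the given RTT level */
static uint64_t rtt_level_mapsize(int level)
{
	return 1ULL << RME_RTT_LEVEL_SHIFT(level);
}
```

This is why RME_RTT_BLOCK_LEVEL is 2: it is the deepest level at which a
whole block (2MB) can be mapped by a single entry.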
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 22c00274884a..f29558c5dcbc 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -834,16 +834,17 @@ void stage2_unmap_vm(struct kvm *kvm)
>  void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
>  {
>  	struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
> -	struct kvm_pgtable *pgt = NULL;
> +	struct kvm_pgtable *pgt;
>  
>  	write_lock(&kvm->mmu_lock);
> +	pgt = mmu->pgt;
>  	if (kvm_is_realm(kvm) &&
>  	    kvm_realm_state(kvm) != REALM_STATE_DYING) {
> -		/* TODO: teardown rtts */
>  		write_unlock(&kvm->mmu_lock);
> +		kvm_realm_destroy_rtts(&kvm->arch.realm, pgt->ia_bits,
> +				       pgt->start_level);
>  		return;
>  	}
> -	pgt = mmu->pgt;
>  	if (pgt) {
>  		mmu->pgd_phys = 0;
>  		mmu->pgt = NULL;
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> index 0c9d70e4d9e6..f7b0e5a779f8 100644
> --- a/arch/arm64/kvm/rme.c
> +++ b/arch/arm64/kvm/rme.c
> @@ -73,6 +73,28 @@ static int rmi_check_version(void)
>  	return 0;
>  }
>  
> +static void realm_destroy_undelegate_range(struct realm *realm,
> +					   unsigned long ipa,
> +					   unsigned long addr,
> +					   ssize_t size)
> +{
> +	unsigned long rd = virt_to_phys(realm->rd);
> +	int ret;
> +
> +	while (size > 0) {
> +		ret = rmi_data_destroy(rd, ipa);
> +		WARN_ON(ret);
> +		ret = rmi_granule_undelegate(addr);
> +
As the return value is not documented, what will happen if a page
undelegate fails? Is the page leaked? Some explanation is required here.
> +		if (ret)
> +			get_page(phys_to_page(addr));
> +
> +		addr += PAGE_SIZE;
> +		ipa += PAGE_SIZE;
> +		size -= PAGE_SIZE;
> +	}
> +}
> +
>  static unsigned long create_realm_feat_reg0(struct kvm *kvm)
>  {
>  	unsigned long ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
> @@ -170,6 +192,123 @@ static int realm_create_rd(struct kvm *kvm)
>  	return r;
>  }
>  
> +static int realm_rtt_destroy(struct realm *realm, unsigned long addr,
> +			     int level, phys_addr_t rtt_granule)
> +{
> +	addr = ALIGN_DOWN(addr, rme_rtt_level_mapsize(level - 1));
> +	return rmi_rtt_destroy(rtt_granule, virt_to_phys(realm->rd), addr,
> +			level);
> +}
> +
> +static int realm_destroy_free_rtt(struct realm *realm, unsigned long addr,
> +				  int level, phys_addr_t rtt_granule)
> +{
> +	if (realm_rtt_destroy(realm, addr, level, rtt_granule))
> +		return -ENXIO;
> +	if (!WARN_ON(rmi_granule_undelegate(rtt_granule)))
> +		put_page(phys_to_page(rtt_granule));
> +
> +	return 0;
> +}
> +
> +static int realm_rtt_create(struct realm *realm,
> +			    unsigned long addr,
> +			    int level,
> +			    phys_addr_t phys)
> +{
> +	addr = ALIGN_DOWN(addr, rme_rtt_level_mapsize(level - 1));
> +	return rmi_rtt_create(phys, virt_to_phys(realm->rd), addr, level);
> +}
> +
> +static int realm_tear_down_rtt_range(struct realm *realm, int level,
> +				     unsigned long start, unsigned long end)
> +{
> +	phys_addr_t rd = virt_to_phys(realm->rd);
> +	ssize_t map_size = rme_rtt_level_mapsize(level);
> +	unsigned long addr, next_addr;
> +	bool failed = false;
> +
> +	for (addr = start; addr < end; addr = next_addr) {
> +		phys_addr_t rtt_addr, tmp_rtt;
> +		struct rtt_entry rtt;
> +		unsigned long end_addr;
> +
> +		next_addr = ALIGN(addr + 1, map_size);
> +
> +		end_addr = min(next_addr, end);
> +
> +		if (rmi_rtt_read_entry(rd, ALIGN_DOWN(addr, map_size),
> +				       level, &rtt)) {
> +			failed = true;
> +			continue;
> +		}
> +
> +		rtt_addr = rmi_rtt_get_phys(&rtt);
> +		WARN_ON(level != rtt.walk_level);
> +
> +		switch (rtt.state) {
> +		case RMI_UNASSIGNED:
> +		case RMI_DESTROYED:
> +			break;
> +		case RMI_TABLE:
> +			if (realm_tear_down_rtt_range(realm, level + 1,
> +						      addr, end_addr)) {
> +				failed = true;
> +				break;
> +			}
> +			if (IS_ALIGNED(addr, map_size) &&
> +			    next_addr <= end &&
> +			    realm_destroy_free_rtt(realm, addr, level + 1,
> +						   rtt_addr))
> +				failed = true;
> +			break;
> +		case RMI_ASSIGNED:
> +			WARN_ON(!rtt_addr);
> +			/*
> +			 * If there is a block mapping, break it now, using the
> +			 * spare_page. We are sure to have a valid delegated
> +			 * page at spare_page before we enter here, otherwise
> +			 * WARN once, which will be followed by further
> +			 * warnings.
> +			 */
> +			tmp_rtt = realm->spare_page;
> +			if (level == 2 &&
> +			    !WARN_ON_ONCE(tmp_rtt == PHYS_ADDR_MAX) &&
> +			    realm_rtt_create(realm, addr,
> +					     RME_RTT_MAX_LEVEL, tmp_rtt)) {
> +				WARN_ON(1);
> +				failed = true;
> +				break;
> +			}
> +			realm_destroy_undelegate_range(realm, addr,
> +						       rtt_addr, map_size);
> +			/*
> +			 * Collapse the last level table and make the spare page
> +			 * reusable again.
> +			 */
> +			if (level == 2 &&
> +			    realm_rtt_destroy(realm, addr, RME_RTT_MAX_LEVEL,
> +					      tmp_rtt))
> +				failed = true;
> +			break;
> +		case RMI_VALID_NS:
> +			WARN_ON(rmi_rtt_unmap_unprotected(rd, addr, level));
> +			break;
> +		default:
> +			WARN_ON(1);
> +			failed = true;
> +			break;
> +		}
> +	}
> +
> +	return failed ? -EINVAL : 0;
> +}
> +
> +void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level)
> +{
> +	realm_tear_down_rtt_range(realm, start_level, 0, (1UL << ia_bits));
> +}
> +
>  /* Protects access to rme_vmid_bitmap */
>  static DEFINE_SPINLOCK(rme_vmid_lock);
>  static unsigned long *rme_vmid_bitmap;



* Re: [RFC PATCH 10/28] arm64: RME: Allocate/free RECs to match vCPUs
  2023-01-27 11:29   ` [RFC PATCH 10/28] arm64: RME: Allocate/free RECs to match vCPUs Steven Price
@ 2023-02-13 18:08     ` Zhi Wang
  2023-03-03 14:05       ` Steven Price
  0 siblings, 1 reply; 190+ messages in thread
From: Zhi Wang @ 2023-02-13 18:08 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 27 Jan 2023 11:29:14 +0000
Steven Price <steven.price@arm.com> wrote:

> The RMM maintains a data structure known as the Realm Execution Context
> (or REC). It is similar to struct kvm_vcpu and tracks the state of the
> virtual CPUs. KVM must delegate memory and request the structures are
> created when vCPUs are created, and suitably tear down on destruction.
> 

It would be better to add some pointers to the spec here; that really saves
time for reviewers.

> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/kvm_emulate.h |   2 +
>  arch/arm64/include/asm/kvm_host.h    |   3 +
>  arch/arm64/include/asm/kvm_rme.h     |  10 ++
>  arch/arm64/kvm/arm.c                 |   1 +
>  arch/arm64/kvm/reset.c               |  11 ++
>  arch/arm64/kvm/rme.c                 | 144 +++++++++++++++++++++++++++
>  6 files changed, 171 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 5a2b7229e83f..285e62914ca4 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -504,6 +504,8 @@ static inline enum realm_state kvm_realm_state(struct kvm *kvm)
>  
>  static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
>  {
> +	if (static_branch_unlikely(&kvm_rme_is_available))
> +		return vcpu->arch.rec.mpidr != INVALID_HWID;
>  	return false;
>  }
>  
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 04347c3a8c6b..ef497b718cdb 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -505,6 +505,9 @@ struct kvm_vcpu_arch {
>  		u64 last_steal;
>  		gpa_t base;
>  	} steal;
> +
> +	/* Realm meta data */
> +	struct rec rec;

I think the name of the data structure "rec" needs a prefix; it is too common
and might conflict with private data structures in other modules. Maybe
rme_rec or realm_rec?
>  };
>  
>  /*
> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> index eea5118dfa8a..4b219ebe1400 100644
> --- a/arch/arm64/include/asm/kvm_rme.h
> +++ b/arch/arm64/include/asm/kvm_rme.h
> @@ -6,6 +6,7 @@
>  #ifndef __ASM_KVM_RME_H
>  #define __ASM_KVM_RME_H
>  
> +#include <asm/rmi_smc.h>
>  #include <uapi/linux/kvm.h>
>  
>  enum realm_state {
> @@ -29,6 +30,13 @@ struct realm {
>  	unsigned int ia_bits;
>  };
>  
> +struct rec {
> +	unsigned long mpidr;
> +	void *rec_page;
> +	struct page *aux_pages[REC_PARAMS_AUX_GRANULES];
> +	struct rec_run *run;
> +};
> +

It would be better to add comments for the members above, or pointers to the
spec; that saves a lot of time for review.

>  int kvm_init_rme(void);
>  u32 kvm_realm_ipa_limit(void);
>  
> @@ -36,6 +44,8 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
>  int kvm_init_realm_vm(struct kvm *kvm);
>  void kvm_destroy_realm(struct kvm *kvm);
>  void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level);
> +int kvm_create_rec(struct kvm_vcpu *vcpu);
> +void kvm_destroy_rec(struct kvm_vcpu *vcpu);
>  
>  #define RME_RTT_BLOCK_LEVEL	2
>  #define RME_RTT_MAX_LEVEL	3
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index badd775547b8..52affed2f3cf 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -373,6 +373,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>  	/* Force users to call KVM_ARM_VCPU_INIT */
>  	vcpu->arch.target = -1;
>  	bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
> +	vcpu->arch.rec.mpidr = INVALID_HWID;
>  
>  	vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
>  
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index 9e71d69e051f..0c84392a4bf2 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -135,6 +135,11 @@ int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature)
>  			return -EPERM;
>  
>  		return kvm_vcpu_finalize_sve(vcpu);
> +	case KVM_ARM_VCPU_REC:
> +		if (!kvm_is_realm(vcpu->kvm))
> +			return -EINVAL;
> +
> +		return kvm_create_rec(vcpu);
>  	}
>  
>  	return -EINVAL;
> @@ -145,6 +150,11 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu)
>  	if (vcpu_has_sve(vcpu) && !kvm_arm_vcpu_sve_finalized(vcpu))
>  		return false;
>  
> +	if (kvm_is_realm(vcpu->kvm) &&
> +	    !(vcpu_is_rec(vcpu) &&
> +	      READ_ONCE(vcpu->kvm->arch.realm.state) == REALM_STATE_ACTIVE))
> +		return false;

This is why it would be better to introduce the realm states in the earlier
patches, so that people can fully understand the state machine by this stage.

> +
>  	return true;
>  }
>  
> @@ -157,6 +167,7 @@ void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu)
>  	if (sve_state)
>  		kvm_unshare_hyp(sve_state, sve_state + vcpu_sve_state_size(vcpu));
>  	kfree(sve_state);
> +	kvm_destroy_rec(vcpu);
>  }
>  
>  static void kvm_vcpu_reset_sve(struct kvm_vcpu *vcpu)
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> index f7b0e5a779f8..d79ed889ca4d 100644
> --- a/arch/arm64/kvm/rme.c
> +++ b/arch/arm64/kvm/rme.c
> @@ -514,6 +514,150 @@ void kvm_destroy_realm(struct kvm *kvm)
>  	kvm_free_stage2_pgd(&kvm->arch.mmu);
>  }
>  
> +static void free_rec_aux(struct page **aux_pages,
> +			 unsigned int num_aux)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < num_aux; i++) {
> +		phys_addr_t aux_page_phys = page_to_phys(aux_pages[i]);
> +
> +		if (WARN_ON(rmi_granule_undelegate(aux_page_phys)))
> +			continue;
> +
> +		__free_page(aux_pages[i]);
> +	}
> +}
> +
> +static int alloc_rec_aux(struct page **aux_pages,
> +			 u64 *aux_phys_pages,
> +			 unsigned int num_aux)
> +{
> +	int ret;
> +	unsigned int i;
> +
> +	for (i = 0; i < num_aux; i++) {
> +		struct page *aux_page;
> +		phys_addr_t aux_page_phys;
> +
> +		aux_page = alloc_page(GFP_KERNEL);
> +		if (!aux_page) {
> +			ret = -ENOMEM;
> +			goto out_err;
> +		}
> +		aux_page_phys = page_to_phys(aux_page);
> +		if (rmi_granule_delegate(aux_page_phys)) {
> +			__free_page(aux_page);
> +			ret = -ENXIO;
> +			goto out_err;
> +		}
> +		aux_pages[i] = aux_page;
> +		aux_phys_pages[i] = aux_page_phys;
> +	}
> +
> +	return 0;
> +out_err:
> +	free_rec_aux(aux_pages, i);
> +	return ret;
> +}
> +
> +int kvm_create_rec(struct kvm_vcpu *vcpu)
> +{
> +	struct user_pt_regs *vcpu_regs = vcpu_gp_regs(vcpu);
> +	unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu);
> +	struct realm *realm = &vcpu->kvm->arch.realm;
> +	struct rec *rec = &vcpu->arch.rec;
> +	unsigned long rec_page_phys;
> +	struct rec_params *params;
> +	int r, i;
> +
> +	if (kvm_realm_state(vcpu->kvm) != REALM_STATE_NEW)
> +		return -ENOENT;
> +
> +	/*
> +	 * The RMM will report PSCI v1.0 to Realms and the KVM_ARM_VCPU_PSCI_0_2
> +	 * flag covers v0.2 and onwards.
> +	 */
> +	if (!test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
> +		return -EINVAL;
> +
> +	BUILD_BUG_ON(sizeof(*params) > PAGE_SIZE);
> +	BUILD_BUG_ON(sizeof(*rec->run) > PAGE_SIZE);
> +
> +	params = (struct rec_params *)get_zeroed_page(GFP_KERNEL);
> +	rec->rec_page = (void *)__get_free_page(GFP_KERNEL);
> +	rec->run = (void *)get_zeroed_page(GFP_KERNEL);
> +	if (!params || !rec->rec_page || !rec->run) {
> +		r = -ENOMEM;
> +		goto out_free_pages;
> +	}
> +
> +	for (i = 0; i < ARRAY_SIZE(params->gprs); i++)
> +		params->gprs[i] = vcpu_regs->regs[i];
> +
> +	params->pc = vcpu_regs->pc;
> +
> +	if (vcpu->vcpu_id == 0)
> +		params->flags |= REC_PARAMS_FLAG_RUNNABLE;
> +
> +	rec_page_phys = virt_to_phys(rec->rec_page);
> +
> +	if (rmi_granule_delegate(rec_page_phys)) {
> +		r = -ENXIO;
> +		goto out_free_pages;
> +	}
> +

Wouldn't it be better to extend alloc_rec_aux() to allocate and delegate the
REC page above as well? That would save some allocations and
rmi_granule_delegate() calls.

> +	r = alloc_rec_aux(rec->aux_pages, params->aux, realm->num_aux);
> +	if (r)
> +		goto out_undelegate_rmm_rec;
> +
> +	params->num_rec_aux = realm->num_aux;
> +	params->mpidr = mpidr;
> +
> +	if (rmi_rec_create(rec_page_phys,
> +			   virt_to_phys(realm->rd),
> +			   virt_to_phys(params))) {
> +		r = -ENXIO;
> +		goto out_free_rec_aux;
> +	}
> +
> +	rec->mpidr = mpidr;
> +
> +	free_page((unsigned long)params);
> +	return 0;
> +
> +out_free_rec_aux:
> +	free_rec_aux(rec->aux_pages, realm->num_aux);
> +out_undelegate_rmm_rec:
> +	if (WARN_ON(rmi_granule_undelegate(rec_page_phys)))
> +		rec->rec_page = NULL;
> +out_free_pages:
> +	free_page((unsigned long)rec->run);
> +	free_page((unsigned long)rec->rec_page);
> +	free_page((unsigned long)params);
> +	return r;
> +}
> +
> +void kvm_destroy_rec(struct kvm_vcpu *vcpu)
> +{
> +	struct realm *realm = &vcpu->kvm->arch.realm;
> +	struct rec *rec = &vcpu->arch.rec;
> +	unsigned long rec_page_phys;
> +
> +	if (!vcpu_is_rec(vcpu))
> +		return;
> +
> +	rec_page_phys = virt_to_phys(rec->rec_page);
> +
> +	if (WARN_ON(rmi_rec_destroy(rec_page_phys)))
> +		return;
> +	if (WARN_ON(rmi_granule_undelegate(rec_page_phys)))
> +		return;
> +

The two early returns above feel off. What is the reason for skipping the
page undelegates below?

> +	free_rec_aux(rec->aux_pages, realm->num_aux);
> +	free_page((unsigned long)rec->rec_page);
> +}
> +
>  int kvm_init_realm_vm(struct kvm *kvm)
>  {
>  	struct realm_params *params;



* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-01-27 11:22 [RFC] Support for Arm CCA VMs on Linux Suzuki K Poulose
                   ` (5 preceding siblings ...)
  2023-02-10 16:51 ` Ryan Roberts
@ 2023-02-14 17:13 ` Dr. David Alan Gilbert
  2023-03-01  9:58   ` Suzuki K Poulose
  2023-07-14 13:46 ` Jonathan Cameron
  2023-10-02 12:43 ` Suzuki K Poulose
  8 siblings, 1 reply; 190+ messages in thread
From: Dr. David Alan Gilbert @ 2023-02-14 17:13 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-coco, linux-kernel, kvm, kvmarm, linux-arm-kernel,
	Alexandru Elisei, Andrew Jones, Catalin Marinas, Chao Peng,
	Christoffer Dall, Fuad Tabba, James Morse, Jean-Philippe Brucker,
	Joey Gouly, Marc Zyngier, Mark Rutland, Oliver Upton,
	Paolo Bonzini, Quentin Perret, Sean Christopherson, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

* Suzuki K Poulose (suzuki.poulose@arm.com) wrote:
> We are happy to announce the early RFC version of the Arm
> Confidential Compute Architecture (CCA) support for the Linux
> stack. The intention is to seek early feedback in the following areas:
>  * KVM integration of the Arm CCA
>  * KVM UABI for managing the Realms, seeking to generalise the operations
>    wherever possible with other Confidential Compute solutions.
>    Note: This version doesn't support Guest Private memory, which will be added
>    later (see below).
>  * Linux Guest support for Realms
> 
> Arm CCA Introduction
> =====================
> 
> The Arm CCA is a reference software architecture and implementation that builds
> on the Realm Management Extension (RME), enabling the execution of Virtual
> machines, while preventing access by more privileged software, such as the hypervisor.
> The Arm CCA allows the hypervisor to control the VM, but removes its right to
> access the code, register state or data used by the VM.
> More information on the architecture is available here[0].
> 
>     Arm CCA Reference Software Architecture
> 
>         Realm World    ||    Normal World   ||  Secure World  ||
>                        ||        |          ||                ||
>  EL0 x-------x         || x----x | x------x ||                ||
>      | Realm |         || |    | | |      | ||                ||
>      |       |         || | VM | | |      | ||                ||
>  ----|  VM*  |---------||-|    |---|      |-||----------------||
>      |       |         || |    | | |  H   | ||                ||
>  EL1 x-------x         || x----x | |      | ||                ||
>          ^             ||        | |  o   | ||                ||
>          |             ||        | |      | ||                ||
>  ------- R*------------------------|  s  -|---------------------
>          S             ||          |      | ||                ||
>          I             ||          |  t   | ||                ||
>          |             ||          |      | ||                || 
>          v             ||          x------x ||                ||
>  EL2    RMM*           ||              ^    ||                ||
>          ^             ||              |    ||                ||
>  ========|=============================|========================
>          |                             | SMC
>          x--------- *RMI* -------------x
> 
>  EL3                   Root World
>                        EL3 Firmware
>  ===============================================================
> Where :
>  RMM - Realm Management Monitor
>  RMI - Realm Management Interface
>  RSI - Realm Service Interface
>  SMC - Secure Monitor Call

Hi,
  It's nice to see this full stack posted - thanks!

Are there any pointers to information on attestation and similar
measurement things?  In particular, are there any plans for a vTPM
for Realms - if there were, it would make life easy for us, since we
can share some user space stuff with other CoCo systems.

Dave

> RME introduces a new security state "Realm world", in addition to the
> traditional Secure and Non-Secure states. The Arm CCA defines a new component,
> Realm Management Monitor (RMM) that runs at R-EL2. This is a standard piece of
> firmware, verified, installed and loaded by the EL3 firmware (e.g, TF-A), at
> system boot.
> 
> The RMM provides standard interfaces - Realm Management Interface (RMI) - to the
> Normal world hypervisor to manage the VMs running in the Realm world (also called
> Realms in short). These are exposed via SMC and are routed through the EL3
> firmware.
> The RMI interface includes:
>   - Move a physical page from the Normal world to the Realm world
>   - Creating a Realm with requested parameters, tracked via Realm Descriptor (RD)
>   - Creating VCPUs aka Realm Execution Context (REC), with initial register state.
>   - Create stage2 translation table at any level.
>   - Load initial images into Realm Memory from normal world memory
>   - Schedule RECs (vCPUs) and handle exits
>   - Inject virtual interrupts into the Realm
>   - Service stage2 runtime faults with pages (provided by host, scrubbed by RMM).
>   - Create "shared" mappings that can be accessed by VMM/Hyp.
>   - Reclaim the memory allocated for the RAM and RTTs (Realm Translation Tables)
> 
> However v1.0 of RMM specifications doesn't support:
>  - Paging protected memory of a Realm VM. Thus the pages backing the protected
>    memory region must be pinned.
>  - Live migration of Realms.
>  - Trusted Device assignment.
>  - Physical interrupt backed Virtual interrupts for Realms
> 
> RMM also provides certain services to the Realms via SMC, called Realm Service
> Interface (RSI). These include:
>  - Realm Guest Configuration.
>  - Attestation & Measurement services
>  - Managing the state of an Intermediate Physical Address (IPA aka GPA) page.
>  - Host Call service (Communication with the Normal world Hypervisor)
> 
> The specifications for the RMM software is currently at *v1.0-Beta2* and the
> latest version is available here [1].
> 
> The Trusted Firmware foundation has an implementation of the RMM - TF-RMM -
> available here [3].
> 
> Implementation
> =================
> 
> This version of the stack is based on the RMM specification v1.0-Beta0[2], with
> following exceptions :
>   - TF-RMM/KVM currently doesn't support the optional features of PMU,
>      SVE and Self-hosted debug (coming soon).
>   - The RSI_HOST_CALL structure alignment requirement is reduced to match
>      RMM v1.0 Beta1
>   - RMI/RSI version numbers do not match the RMM spec. This will be
>     resolved once the spec/implementation is complete, across TF-RMM+Linux stack.
> 
> We plan to update the stack to support the latest version of the RMMv1.0 spec
> in the coming revisions.
> 
> This release includes the following components :
> 
>  a) Linux Kernel
>      i) Host / KVM support - Support for driving the Realms via RMI. This is
>      dependent on running in the Kernel at EL2 (aka VHE mode). Also provides
>      UABI for VMMs to manage the Realm VMs. The support is restricted to 4K page
>      size, matching the Stage2 granule supported by RMM. The VMM is responsible
>      for making sure the guest memory is locked.
> 
>        TODO: Guest Private memory[10] integration - We have been following the
>        series and support will be added once it is merged upstream.
>      
>      ii) Guest support - Support for a Linux Kernel to run in the Realm VM at
>      Realm-EL1, using RSI services. This includes virtio support (virtio-v1.0
>      only). All I/O are treated as non-secure/shared.
>  
>  c) kvmtool - VMM changes required to manage Realm VMs. No guest private memory
>     as mentioned above.
>  d) kvm-unit-tests - Support for running in Realms along with additional tests
>     for RSI ABI.
> 
> Running the stack
> ====================
> 
> To run/test the stack, you would need the following components :
> 
> 1) FVP Base AEM RevC model with FEAT_RME support [4]
> 2) TF-A firmware for EL3 [5]
> 3) TF-A RMM for R-EL2 [3]
> 4) Linux Kernel [6]
> 5) kvmtool [7]
> 6) kvm-unit-tests [8]
> 
> Instructions for building the firmware components and running the model are
> available here [9]. Once the host kernel is booted, a Realm can be launched by
> invoking the `lkvm` command as follows:
> 
>  $ lkvm run --realm 				 \
> 	 --measurement-algo=["sha256", "sha512"] \
> 	 --disable-sve				 \
> 	 <normal-vm-options>
> 
> Where:
>  * --measurement-algo (Optional) specifies the algorithm selected for creating the
>    initial measurements by the RMM for this Realm (defaults to sha256).
>  * GICv3 is mandatory for the Realms.
>  * SVE is not yet supported in the TF-RMM, and thus must be disabled using
>    --disable-sve
> 
> You may also run the kvm-unit-tests inside the Realm world, using the similar
> options as above.
> 
> 
> Links
> ============
> 
> [0] Arm CCA Landing page (See Key Resources section for various documentations)
>     https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture
> 
> [1] RMM Specification Latest
>     https://developer.arm.com/documentation/den0137/latest
> 
> [2] RMM v1.0-Beta0 specification
>     https://developer.arm.com/documentation/den0137/1-0bet0/
> 
> [3] Trusted Firmware RMM - TF-RMM
>     https://www.trustedfirmware.org/projects/tf-rmm/
>     GIT: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
> 
> [4] FVP Base RevC AEM Model (available on x86_64 / Arm64 Linux)
>     https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms
> 
> [5] Trusted Firmware for A class
>     https://www.trustedfirmware.org/projects/tf-a/
> 
> [6] Linux kernel support for Arm-CCA
>     https://gitlab.arm.com/linux-arm/linux-cca
>     Host Support branch:	cca-host/rfc-v1
>     Guest Support branch:	cca-guest/rfc-v1
> 
> [7] kvmtool support for Arm CCA
>     https://gitlab.arm.com/linux-arm/kvmtool-cca cca/rfc-v1
> 
> [8] kvm-unit-tests support for Arm CCA
>     https://gitlab.arm.com/linux-arm/kvm-unit-tests-cca  cca/rfc-v1
> 
> [9] Instructions for Building Firmware components and running the model, see
>     section 4.19.2 "Building and running TF-A with RME"
>     https://trustedfirmware-a.readthedocs.io/en/latest/components/realm-management-extension.html#building-and-running-tf-a-with-rme
> 
> [10] fd based Guest Private memory for KVM
>    https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng@linux.intel.com
> 
> Cc: Alexandru Elisei <alexandru.elisei@arm.com>
> Cc: Andrew Jones <andrew.jones@linux.dev>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Chao Peng <chao.p.peng@linux.intel.com>
> Cc: Christoffer Dall <christoffer.dall@arm.com>
> Cc: Fuad Tabba <tabba@google.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Joey Gouly <Joey.Gouly@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Quentin Perret <qperret@google.com>
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Zenghui Yu <yuzenghui@huawei.com>
> To: linux-coco@lists.linux.dev
> To: kvmarm@lists.linux.dev
> Cc: kvmarm@lists.cs.columbia.edu
> Cc: linux-arm-kernel@lists.infradead.org
> To: linux-kernel@vger.kernel.org
> To: kvm@vger.kernel.org
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-02-10 22:53   ` Itaru Kitayama
@ 2023-02-17  8:02     ` Itaru Kitayama
  2023-02-20 10:51       ` Ryan Roberts
  0 siblings, 1 reply; 190+ messages in thread
From: Itaru Kitayama @ 2023-02-17  8:02 UTC (permalink / raw)
  To: Ryan Roberts
  Cc: Suzuki K Poulose, linux-coco, linux-kernel, kvm, kvmarm,
	linux-arm-kernel, Alexandru Elisei, Andrew Jones,
	Catalin Marinas, Chao Peng, Christoffer Dall, Fuad Tabba,
	James Morse, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Sean Christopherson, Steven Price, Thomas Huth, Will Deacon,
	Zenghui Yu, kvmarm

On Sat, Feb 11, 2023 at 7:53 AM Itaru Kitayama <itaru.kitayama@gmail.com> wrote:
>
> On Sat, Feb 11, 2023 at 1:56 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
> >
> > On 27/01/2023 11:22, Suzuki K Poulose wrote:
> > > [...]
> >
> > > Running the stack
> > > ====================
> > >
> > > To run/test the stack, you would need the following components :
> > >
> > > 1) FVP Base AEM RevC model with FEAT_RME support [4]
> > > 2) TF-A firmware for EL3 [5]
> > > 3) TF-A RMM for R-EL2 [3]
> > > 4) Linux Kernel [6]
> > > 5) kvmtool [7]
> > > 6) kvm-unit-tests [8]
> > >
> > > Instructions for building the firmware components and running the model are
> > > available here [9]. Once the host kernel is booted, a Realm can be launched by
> > > invoking the `lkvm` command as follows:
> > >
> > >  $ lkvm run --realm                            \
> > >        --measurement-algo=["sha256", "sha512"] \
> > >        --disable-sve                           \
> > >        <normal-vm-options>
> > >
> > > Where:
> > >  * --measurement-algo (Optional) specifies the algorithm selected for creating the
> > >    initial measurements by the RMM for this Realm (defaults to sha256).
> > >  * GICv3 is mandatory for the Realms.
> > >  * SVE is not yet supported in the TF-RMM, and thus must be disabled using
> > >    --disable-sve
> > >
> > > You may also run the kvm-unit-tests inside the Realm world, using the similar
> > > options as above.
> >
> > Building all of these components and configuring the FVP correctly can be quite
> > tricky, so I thought I would plug a tool we have called Shrinkwrap, which can
> > simplify all of this.
> >
> > The tool accepts a yaml input configuration that describes how a set of
> > components should be built and packaged, and how the FVP should be configured
> > and booted. And by default, it uses a Docker container on its backend, which
> > contains all the required tools, including the FVP. You can optionally use
> > Podman or have it run on your native system if you prefer. It supports both
> > x86_64 and aarch64. And you can even run it in --dry-run mode to see the set of
> > shell commands that would have been executed.
> >
> > It comes with two CCA configs out-of-the-box; cca-3world.yaml builds TF-A, RMM,
> > Linux (for both host and guest), kvmtool and kvm-unit-tests. cca-4world.yaml
> > adds Hafnium and some demo SPs for the secure world (although since Hafnium
> > requires x86_64 to build, cca-4world.yaml doesn't currently work on an aarch64
> > build host).
> >
> > See the documentation [1] and repository [2] for more info.
> >
> > Brief instructions to get you up and running:
> >
> >   # Install shrinkwrap. (I assume you have Docker installed):
> >   sudo pip3 install pyyaml termcolor tuxmake
> >   git clone https://git.gitlab.arm.com/tooling/shrinkwrap.git
> >   export PATH=$PWD/shrinkwrap/shrinkwrap:$PATH
> >
> >   # If running Python < 3.9:
> >   sudo pip3 install graphlib-backport
> >
> >   # Build all the CCA components:
> >   shrinkwrap build cca-3world.yaml [--dry-run]
>
> This has been working on my Multipass instance on M1, thanks for the tool.
>
> Thanks,
> Itaru.

It took a while, but I've just booted an Ubuntu 22.10 disk image
with the cca-3world.yaml config on M1.

Thanks,
Itaru.

>
> >
> >   # Run the stack in the FVP:
> >   shrinkwrap run cca-3world.yaml -r ROOTFS=<my_rootfs.ext4> [--dry-run]
> >
> > By default, building is done at ~/.shrinkwrap/build/cca-3world and the package
> > is created at ~/.shrinkwrap/package/cca-3world (this can be changed with
> > envvars).
> >
> > The 'run' command will boot TF-A, RMM and host Linux kernel in the FVP, and
> > mount the provided rootfs. You will likely want to have copied the userspace
> > pieces into the rootfs before running, so you can create realms:
> >
> > - ~/.shrinkwrap/package/cca-3world/Image (kernel with RMI and RSI support)
> > - ~/.shrinkwrap/package/cca-3world/lkvm (kvmtool able to launch realms)
> > - ~/.shrinkwrap/package/cca-3world/kvm-unit-tests.tgz (built kvm-unit-tests)
> >
> > Once the FVP is booted to a shell, you can do something like this to launch a
> > Linux guest in a realm:
> >
> >   lkvm run --realm --disable-sve -c 1 -m 256 -k Image
> >
> > [1] https://shrinkwrap.docs.arm.com
> > [2] https://gitlab.arm.com/tooling/shrinkwrap
> >
> >


* Re: [RFC PATCH 13/28] arm64: RME: Allow VMM to set RIPAS
  2023-01-27 11:29   ` [RFC PATCH 13/28] arm64: RME: Allow VMM to set RIPAS Steven Price
@ 2023-02-17 13:07     ` Zhi Wang
  2023-03-03 14:05       ` Steven Price
  0 siblings, 1 reply; 190+ messages in thread
From: Zhi Wang @ 2023-02-17 13:07 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 27 Jan 2023 11:29:17 +0000
Steven Price <steven.price@arm.com> wrote:

> Each page within the protected region of the realm guest can be marked
> as either RAM or EMPTY. Allow the VMM to control this before the guest
> has started and provide the equivalent functions to change this (with
> the guest's approval) at runtime.
> 

The above only states the purpose of this patch. It would be better to have
one more paragraph describing what the patch actually does (building the RTTs
and setting the IPA state in them) and explaining anything that might confuse
people, for example the spare page.

The spare page is really confusing. When reading __alloc_delegated_page(),
it looks like a mechanism to cache a delegated page for the realm, but later,
in the teardown path, it looks like a workaround. What happens if the
allocation of the spare page fails in the RTT tear-down path?

I understand this must be a temporary solution. It would be really nice to
have a big picture or some basic introduction to the future plan. 

> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/kvm_rme.h |   4 +
>  arch/arm64/kvm/rme.c             | 288 +++++++++++++++++++++++++++++++
>  2 files changed, 292 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> index 4b219ebe1400..3e75cedaad18 100644
> --- a/arch/arm64/include/asm/kvm_rme.h
> +++ b/arch/arm64/include/asm/kvm_rme.h
> @@ -47,6 +47,10 @@ void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level);
>  int kvm_create_rec(struct kvm_vcpu *vcpu);
>  void kvm_destroy_rec(struct kvm_vcpu *vcpu);
>  
> +int realm_set_ipa_state(struct kvm_vcpu *vcpu,
> +			unsigned long addr, unsigned long end,
> +			unsigned long ripas);
> +
>  #define RME_RTT_BLOCK_LEVEL	2
>  #define RME_RTT_MAX_LEVEL	3
>  
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> index d79ed889ca4d..b3ea79189839 100644
> --- a/arch/arm64/kvm/rme.c
> +++ b/arch/arm64/kvm/rme.c
> @@ -73,6 +73,58 @@ static int rmi_check_version(void)
>  	return 0;
>  }
>  
> +static phys_addr_t __alloc_delegated_page(struct realm *realm,
> +					  struct kvm_mmu_memory_cache *mc, gfp_t flags)
> +{
> +	phys_addr_t phys = PHYS_ADDR_MAX;
> +	void *virt;
> +
> +	if (realm->spare_page != PHYS_ADDR_MAX) {
> +		swap(realm->spare_page, phys);
> +		goto out;
> +	}
> +
> +	if (mc)
> +		virt = kvm_mmu_memory_cache_alloc(mc);
> +	else
> +		virt = (void *)__get_free_page(flags);
> +
> +	if (!virt)
> +		goto out;
> +
> +	phys = virt_to_phys(virt);
> +
> +	if (rmi_granule_delegate(phys)) {
> +		free_page((unsigned long)virt);
> +
> +		phys = PHYS_ADDR_MAX;
> +	}
> +
> +out:
> +	return phys;
> +}
> +
> +static phys_addr_t alloc_delegated_page(struct realm *realm,
> +					struct kvm_mmu_memory_cache *mc)
> +{
> +	return __alloc_delegated_page(realm, mc, GFP_KERNEL);
> +}
> +
> +static void free_delegated_page(struct realm *realm, phys_addr_t phys)
> +{
> +	if (realm->spare_page == PHYS_ADDR_MAX) {
> +		realm->spare_page = phys;
> +		return;
> +	}
> +
> +	if (WARN_ON(rmi_granule_undelegate(phys))) {
> +		/* Undelegate failed: leak the page */
> +		return;
> +	}
> +
> +	free_page((unsigned long)phys_to_virt(phys));
> +}
> +
>  static void realm_destroy_undelegate_range(struct realm *realm,
>  					   unsigned long ipa,
>  					   unsigned long addr,
> @@ -220,6 +272,30 @@ static int realm_rtt_create(struct realm *realm,
>  	return rmi_rtt_create(phys, virt_to_phys(realm->rd), addr, level);
>  }
>  
> +static int realm_create_rtt_levels(struct realm *realm,
> +				   unsigned long ipa,
> +				   int level,
> +				   int max_level,
> +				   struct kvm_mmu_memory_cache *mc)
> +{
> +	if (WARN_ON(level == max_level))
> +		return 0;
> +
> +	while (level++ < max_level) {
> +		phys_addr_t rtt = alloc_delegated_page(realm, mc);
> +
> +		if (rtt == PHYS_ADDR_MAX)
> +			return -ENOMEM;
> +
> +		if (realm_rtt_create(realm, ipa, level, rtt)) {
> +			free_delegated_page(realm, rtt);
> +			return -ENXIO;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  static int realm_tear_down_rtt_range(struct realm *realm, int level,
>  				     unsigned long start, unsigned long end)
>  {
> @@ -309,6 +385,206 @@ void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level)
>  	realm_tear_down_rtt_range(realm, start_level, 0, (1UL << ia_bits));
>  }
>  
> +void kvm_realm_unmap_range(struct kvm *kvm, unsigned long ipa, u64 size)
> +{
> +	u32 ia_bits = kvm->arch.mmu.pgt->ia_bits;
> +	u32 start_level = kvm->arch.mmu.pgt->start_level;
> +	unsigned long end = ipa + size;
> +	struct realm *realm = &kvm->arch.realm;
> +	phys_addr_t tmp_rtt = PHYS_ADDR_MAX;
> +
> +	if (end > (1UL << ia_bits))
> +		end = 1UL << ia_bits;
> +	/*
> +	 * Make sure we have a spare delegated page for tearing down the
> +	 * block mappings. We must use Atomic allocations as we are called
> +	 * with kvm->mmu_lock held.
> +	 */
> +	if (realm->spare_page == PHYS_ADDR_MAX) {
> +		tmp_rtt = __alloc_delegated_page(realm, NULL, GFP_ATOMIC);
> +		/*
> +		 * We don't have to check the status here, as we may not
> +		 * have a block level mapping. Delay any error to the point
> +		 * where we need it.
> +		 */
> +		realm->spare_page = tmp_rtt;
> +	}
> +
> +	realm_tear_down_rtt_range(&kvm->arch.realm, start_level, ipa, end);
> +
> +	/* Free up the atomic page, if there were any */
> +	if (tmp_rtt != PHYS_ADDR_MAX) {
> +		free_delegated_page(realm, tmp_rtt);
> +		/*
> +		 * Update the spare_page after we have freed the
> +		 * above page to make sure it doesn't get cached
> +		 * in spare_page.
> +		 * We should re-write this part and always have
> +		 * a dedicated page for handling block mappings.
> +		 */
> +		realm->spare_page = PHYS_ADDR_MAX;
> +	}
> +}
> +
> +static int set_ipa_state(struct kvm_vcpu *vcpu,
> +			 unsigned long ipa,
> +			 unsigned long end,
> +			 int level,
> +			 unsigned long ripas)
> +{
> +	struct kvm *kvm = vcpu->kvm;
> +	struct realm *realm = &kvm->arch.realm;
> +	struct rec *rec = &vcpu->arch.rec;
> +	phys_addr_t rd_phys = virt_to_phys(realm->rd);
> +	phys_addr_t rec_phys = virt_to_phys(rec->rec_page);
> +	unsigned long map_size = rme_rtt_level_mapsize(level);
> +	int ret;
> +
> +	while (ipa < end) {
> +		ret = rmi_rtt_set_ripas(rd_phys, rec_phys, ipa, level, ripas);
> +
> +		if (!ret) {
> +			if (!ripas)
> +				kvm_realm_unmap_range(kvm, ipa, map_size);
> +		} else if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> +			int walk_level = RMI_RETURN_INDEX(ret);
> +
> +			if (walk_level < level) {
> +				ret = realm_create_rtt_levels(realm, ipa,
> +							      walk_level,
> +							      level, NULL);
> +				if (ret)
> +					return ret;
> +				continue;
> +			}
> +
> +			if (WARN_ON(level >= RME_RTT_MAX_LEVEL))
> +				return -EINVAL;
> +
> +			/* Recurse one level lower */
> +			ret = set_ipa_state(vcpu, ipa, ipa + map_size,
> +					    level + 1, ripas);
> +			if (ret)
> +				return ret;
> +		} else {
> +			WARN(1, "Unexpected error in %s: %#x\n", __func__,
> +			     ret);
> +			return -EINVAL;
> +		}
> +		ipa += map_size;
> +	}
> +
> +	return 0;
> +}
> +
> +static int realm_init_ipa_state(struct realm *realm,
> +				unsigned long ipa,
> +				unsigned long end,
> +				int level)
> +{
> +	unsigned long map_size = rme_rtt_level_mapsize(level);
> +	phys_addr_t rd_phys = virt_to_phys(realm->rd);
> +	int ret;
> +
> +	while (ipa < end) {
> +		ret = rmi_rtt_init_ripas(rd_phys, ipa, level);
> +
> +		if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> +			int cur_level = RMI_RETURN_INDEX(ret);
> +
> +			if (cur_level < level) {
> +				ret = realm_create_rtt_levels(realm, ipa,
> +							      cur_level,
> +							      level, NULL);
> +				if (ret)
> +					return ret;
> +				/* Retry with the RTT levels in place */
> +				continue;
> +			}
> +
> +			/* There's an entry at a lower level, recurse */
> +			if (WARN_ON(level >= RME_RTT_MAX_LEVEL))
> +				return -EINVAL;
> +
> +			realm_init_ipa_state(realm, ipa, ipa + map_size,
> +					     level + 1);
> +		} else if (WARN_ON(ret)) {
> +			return -ENXIO;
> +		}
> +
> +		ipa += map_size;
> +	}
> +
> +	return 0;
> +}
> +
> +static int find_map_level(struct kvm *kvm, unsigned long start, unsigned long end)
> +{
> +	int level = RME_RTT_MAX_LEVEL;
> +
> +	while (level > get_start_level(kvm) + 1) {
> +		unsigned long map_size = rme_rtt_level_mapsize(level - 1);
> +
> +		if (!IS_ALIGNED(start, map_size) ||
> +		    (start + map_size) > end)
> +			break;
> +
> +		level--;
> +	}
> +
> +	return level;
> +}
> +
> +int realm_set_ipa_state(struct kvm_vcpu *vcpu,
> +			unsigned long addr, unsigned long end,
> +			unsigned long ripas)
> +{
> +	int ret = 0;
> +
> +	while (addr < end) {
> +		int level = find_map_level(vcpu->kvm, addr, end);
> +		unsigned long map_size = rme_rtt_level_mapsize(level);
> +
> +		ret = set_ipa_state(vcpu, addr, addr + map_size, level, ripas);
> +		if (ret)
> +			break;
> +
> +		addr += map_size;
> +	}
> +
> +	return ret;
> +}
> +
> +static int kvm_init_ipa_range_realm(struct kvm *kvm,
> +				    struct kvm_cap_arm_rme_init_ipa_args *args)
> +{
> +	int ret = 0;
> +	gpa_t addr, end;
> +	struct realm *realm = &kvm->arch.realm;
> +
> +	addr = args->init_ipa_base;
> +	end = addr + args->init_ipa_size;
> +
> +	if (end < addr)
> +		return -EINVAL;
> +
> +	if (kvm_realm_state(kvm) != REALM_STATE_NEW)
> +		return -EBUSY;
> +
> +	while (addr < end) {
> +		int level = find_map_level(kvm, addr, end);
> +		unsigned long map_size = rme_rtt_level_mapsize(level);
> +
> +		ret = realm_init_ipa_state(realm, addr, addr + map_size, level);
> +		if (ret)
> +			break;
> +
> +		addr += map_size;
> +	}
> +
> +	return ret;
> +}
> +
>  /* Protects access to rme_vmid_bitmap */
>  static DEFINE_SPINLOCK(rme_vmid_lock);
>  static unsigned long *rme_vmid_bitmap;
> @@ -460,6 +736,18 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>  
>  		r = kvm_create_realm(kvm);
>  		break;
> +	case KVM_CAP_ARM_RME_INIT_IPA_REALM: {
> +		struct kvm_cap_arm_rme_init_ipa_args args;
> +		void __user *argp = u64_to_user_ptr(cap->args[1]);
> +
> +		if (copy_from_user(&args, argp, sizeof(args))) {
> +			r = -EFAULT;
> +			break;
> +		}
> +
> +		r = kvm_init_ipa_range_realm(kvm, &args);
> +		break;
> +	}
>  	default:
>  		r = -EINVAL;
>  		break;



* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-02-17  8:02     ` Itaru Kitayama
@ 2023-02-20 10:51       ` Ryan Roberts
  0 siblings, 0 replies; 190+ messages in thread
From: Ryan Roberts @ 2023-02-20 10:51 UTC (permalink / raw)
  To: Itaru Kitayama
  Cc: Suzuki K Poulose, linux-coco, linux-kernel, kvm, kvmarm,
	linux-arm-kernel, Alexandru Elisei, Andrew Jones,
	Catalin Marinas, Chao Peng, Christoffer Dall, Fuad Tabba,
	James Morse, Jean-Philippe Brucker, Joey Gouly, Marc Zyngier,
	Mark Rutland, Oliver Upton, Paolo Bonzini, Quentin Perret,
	Sean Christopherson, Steven Price, Thomas Huth, Will Deacon,
	Zenghui Yu, kvmarm

On 17/02/2023 08:02, Itaru Kitayama wrote:
> On Sat, Feb 11, 2023 at 7:53 AM Itaru Kitayama <itaru.kitayama@gmail.com> wrote:
>>
>> On Sat, Feb 11, 2023 at 1:56 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>>
>>> On 27/01/2023 11:22, Suzuki K Poulose wrote:
>>>> [...]
>>>
>>>> Running the stack
>>>> ====================
>>>>
>>>> To run/test the stack, you would need the following components :
>>>>
>>>> 1) FVP Base AEM RevC model with FEAT_RME support [4]
>>>> 2) TF-A firmware for EL3 [5]
>>>> 3) TF-A RMM for R-EL2 [3]
>>>> 4) Linux Kernel [6]
>>>> 5) kvmtool [7]
>>>> 6) kvm-unit-tests [8]
>>>>
>>>> Instructions for building the firmware components and running the model are
>>>> available here [9]. Once the host kernel is booted, a Realm can be launched by
>>>> invoking the `lkvm` command as follows:
>>>>
>>>>  $ lkvm run --realm                            \
>>>>        --measurement-algo=["sha256", "sha512"] \
>>>>        --disable-sve                           \
>>>>        <normal-vm-options>
>>>>
>>>> Where:
>>>>  * --measurement-algo (Optional) specifies the algorithm selected for creating the
>>>>    initial measurements by the RMM for this Realm (defaults to sha256).
>>>>  * GICv3 is mandatory for the Realms.
>>>>  * SVE is not yet supported in the TF-RMM, and thus must be disabled using
>>>>    --disable-sve
>>>>
>>>> You may also run the kvm-unit-tests inside the Realm world, using similar
>>>> options to those above.
>>>
>>> Building all of these components and configuring the FVP correctly can be quite
>>> tricky, so I thought I would plug a tool we have called Shrinkwrap, which can
>>> simplify all of this.
>>>
>>> The tool accepts a yaml input configuration that describes how a set of
>>> components should be built and packaged, and how the FVP should be configured
>>> and booted. And by default, it uses a Docker container on its backend, which
>>> contains all the required tools, including the FVP. You can optionally use
>>> Podman or have it run on your native system if you prefer. It supports both
>>> x86_64 and aarch64. And you can even run it in --dry-run mode to see the set of
>>> shell commands that would have been executed.
>>>
>>> It comes with two CCA configs out-of-the-box; cca-3world.yaml builds TF-A, RMM,
>>> Linux (for both host and guest), kvmtool and kvm-unit-tests. cca-4world.yaml
>>> adds Hafnium and some demo SPs for the secure world (although since Hafnium
>>> requires x86_64 to build, cca-4world.yaml doesn't currently work on an aarch64
>>> build host).
>>>
>>> See the documentation [1] and repository [2] for more info.
>>>
>>> Brief instructions to get you up and running:
>>>
>>>   # Install shrinkwrap. (I assume you have Docker installed):
>>>   sudo pip3 install pyyaml termcolor tuxmake
>>>   git clone https://git.gitlab.arm.com/tooling/shrinkwrap.git
>>>   export PATH=$PWD/shrinkwrap/shrinkwrap:$PATH
>>>
>>>   # If running Python < 3.9:
>>>   sudo pip3 install graphlib-backport
>>>
>>>   # Build all the CCA components:
>>>   shrinkwrap build cca-3world.yaml [--dry-run]
>>
>> This has been working on my Multipass instance on M1, thanks for the tool.
>>
>> Thanks,
>> Itaru.
> 
> It took a while, but I've just booted an Ubuntu 22.10 disk image
> with the cca-3world.yaml config on M1.

That's good to hear - if you have any feedback (or patches ;-)) for Shrinkwrap
that would improve the experience, do let me know!

> 
> Thanks,
> Itaru.
> 
>>
>>>
>>>   # Run the stack in the FVP:
>>>   shrinkwrap run cca-3world.yaml -r ROOTFS=<my_rootfs.ext4> [--dry-run]
>>>
>>> By default, building is done at ~/.shrinkwrap/build/cca-3world and the package
>>> is created at ~/.shrinkwrap/package/cca-3world (this can be changed with
>>> envvars).
>>>
>>> The 'run' command will boot TF-A, RMM and host Linux kernel in the FVP, and
>>> mount the provided rootfs. You will likely want to have copied the userspace
>>> pieces into the rootfs before running, so you can create realms:
>>>
>>> - ~/.shrinkwrap/package/cca-3world/Image (kernel with RMI and RSI support)
>>> - ~/.shrinkwrap/package/cca-3world/lkvm (kvmtool able to launch realms)
>>> - ~/.shrinkwrap/package/cca-3world/kvm-unit-tests.tgz (built kvm-unit-tests)
>>>
>>> Once the FVP is booted to a shell, you can do something like this to launch a
>>> Linux guest in a realm:
>>>
>>>   lkvm run --realm --disable-sve -c 1 -m 256 -k Image
>>>
>>> [1] https://shrinkwrap.docs.arm.com
>>> [2] https://gitlab.arm.com/tooling/shrinkwrap
>>>
>>>



* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-01-27 15:26 ` [RFC] Support for Arm CCA VMs on Linux Jean-Philippe Brucker
@ 2023-02-28 23:35   ` Itaru Kitayama
  2023-03-01  9:20     ` Jean-Philippe Brucker
  0 siblings, 1 reply; 190+ messages in thread
From: Itaru Kitayama @ 2023-02-28 23:35 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Suzuki K Poulose, linux-coco, linux-kernel, kvm, kvmarm,
	linux-arm-kernel, Alexandru Elisei, Andrew Jones,
	Catalin Marinas, Chao Peng, Christoffer Dall, Fuad Tabba,
	James Morse, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Sean Christopherson,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

On Sat, Jan 28, 2023 at 12:30 AM Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> On Fri, Jan 27, 2023 at 11:22:48AM +0000, Suzuki K Poulose wrote:
> > We are happy to announce the early RFC version of the Arm
> > Confidential Compute Architecture (CCA) support for the Linux
> > stack. The intention is to seek early feedback in the following areas:
> >  * KVM integration of the Arm CCA
> >  * KVM UABI for managing the Realms, seeking to generalise the operations
> >    wherever possible with other Confidential Compute solutions.
>
> A prototype for launching Realm VMs with QEMU is available at:
> https://lore.kernel.org/qemu-devel/20230127150727.612594-1-jean-philippe@linaro.org/
>
> Thanks,
> Jean

Hi Jean,
I've tried your series in a Realm on a CCA host, but the KVM arch init
emits an "Invalid argument" error and terminates.
I configured it with only the aarch64-softmmu target and built it; are there
any other steps I should worry about?

Itaru.

>
>


* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-02-28 23:35   ` Itaru Kitayama
@ 2023-03-01  9:20     ` Jean-Philippe Brucker
  2023-03-01 22:12       ` Itaru Kitayama
  0 siblings, 1 reply; 190+ messages in thread
From: Jean-Philippe Brucker @ 2023-03-01  9:20 UTC (permalink / raw)
  To: Itaru Kitayama
  Cc: Suzuki K Poulose, linux-coco, linux-kernel, kvm, kvmarm,
	linux-arm-kernel, Alexandru Elisei, Andrew Jones,
	Catalin Marinas, Chao Peng, Christoffer Dall, Fuad Tabba,
	James Morse, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Sean Christopherson,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

Hi Itaru,

On Wed, Mar 01, 2023 at 08:35:05AM +0900, Itaru Kitayama wrote:
> Hi Jean,
> I've tried your series in Real on CCA Host, but the KVM arch init
> emits an Invalid argument error and terminates.

Do you know which call returns this error?  Normally the RMEGuest support
should print more detailed errors. Are you able to launch normal guests
(without the rme-guest object and confidential-guest-support machine
parameter)?  Could you give the complete QEMU command-line?

> I configure it with the aarch64-softmmu target only and built, any
> other steps I should worry?

No, that should be enough

Thanks,
Jean


* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-02-14 17:13 ` Dr. David Alan Gilbert
@ 2023-03-01  9:58   ` Suzuki K Poulose
  2023-03-02 16:46     ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 190+ messages in thread
From: Suzuki K Poulose @ 2023-03-01  9:58 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: linux-coco, linux-kernel, kvm, kvmarm, linux-arm-kernel,
	Alexandru Elisei, Andrew Jones, Catalin Marinas, Chao Peng,
	Christoffer Dall, Fuad Tabba, James Morse, Jean-Philippe Brucker,
	Joey Gouly, Marc Zyngier, Mark Rutland, Oliver Upton,
	Paolo Bonzini, Quentin Perret, Sean Christopherson, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

Hi Dave

Thanks for your response, and apologies for the delay. Responses inline.

On 14/02/2023 17:13, Dr. David Alan Gilbert wrote:
> * Suzuki K Poulose (suzuki.poulose@arm.com) wrote:
>> We are happy to announce the early RFC version of the Arm
>> Confidential Compute Architecture (CCA) support for the Linux
>> stack. The intention is to seek early feedback in the following areas:
>>   * KVM integration of the Arm CCA
>>   * KVM UABI for managing the Realms, seeking to generalise the operations
>>     wherever possible with other Confidential Compute solutions.
>>     Note: This version doesn't support Guest Private memory, which will be added
>>     later (see below).
>>   * Linux Guest support for Realms
>>
>> Arm CCA Introduction
>> =====================
>>
>> The Arm CCA is a reference software architecture and implementation that builds
>> on the Realm Management Extension (RME), enabling the execution of virtual
>> machines while preventing access by more privileged software, such as the hypervisor.
>> The Arm CCA allows the hypervisor to control the VM, but removes its right to
>> access the code, register state or data used by the VM.
>> More information on the architecture is available here[0].
>>
>>      Arm CCA Reference Software Architecture
>>
>>          Realm World    ||    Normal World   ||  Secure World  ||
>>                         ||        |          ||                ||
>>   EL0 x-------x         || x----x | x------x ||                ||
>>       | Realm |         || |    | | |      | ||                ||
>>       |       |         || | VM | | |      | ||                ||
>>   ----|  VM*  |---------||-|    |---|      |-||----------------||
>>       |       |         || |    | | |  H   | ||                ||
>>   EL1 x-------x         || x----x | |      | ||                ||
>>           ^             ||        | |  o   | ||                ||
>>           |             ||        | |      | ||                ||
>>   ------- R*------------------------|  s  -|---------------------
>>           S             ||          |      | ||                ||
>>           I             ||          |  t   | ||                ||
>>           |             ||          |      | ||                ||
>>           v             ||          x------x ||                ||
>>   EL2    RMM*           ||              ^    ||                ||
>>           ^             ||              |    ||                ||
>>   ========|=============================|========================
>>           |                             | SMC
>>           x--------- *RMI* -------------x
>>
>>   EL3                   Root World
>>                         EL3 Firmware
>>   ===============================================================
>> Where :
>>   RMM - Realm Management Monitor
>>   RMI - Realm Management Interface
>>   RSI - Realm Service Interface
>>   SMC - Secure Monitor Call
> 
> Hi,
>    It's nice to see this full stack posted - thanks!
> 
> Are there any pointers to information on attestation and similar
> measurement things?  In particular, are there any plans for a vTPM

The RMM v1.0 provides attestation and measurement services to the Realm,
via Realm Service Interface (RSI) calls. However, there is no support
for partitioning the Realm VM with v1.0. This is currently under
development and should be available in the near future.

With that in place, a vTPM could reside in one partition of the Realm VM,
alongside the OS in another. Does that answer your question?

Kind regards
Suzuki


> for Realms - if there were, it would make life easy for us, since we
> can share some user space stuff with other CoCo systems.
> 
> Dave
> 
>> RME introduces a new security state "Realm world", in addition to the
>> traditional Secure and Non-Secure states. The Arm CCA defines a new component,
>> Realm Management Monitor (RMM) that runs at R-EL2. This is a standard piece of
>> firmware, verified, installed and loaded by the EL3 firmware (e.g, TF-A), at
>> system boot.
>>
>> The RMM provides standard interfaces - Realm Management Interface (RMI) - to the
>> Normal world hypervisor to manage the VMs running in the Realm world (also called
>> Realms for short). These are exposed via SMC and are routed through the EL3
>> firmware.
>> The RMI interface includes:
>>    - Move a physical page from the Normal world to the Realm world
>>    - Creating a Realm with requested parameters, tracked via Realm Descriptor (RD)
>>    - Creating VCPUs aka Realm Execution Context (REC), with initial register state.
>>    - Create stage2 translation table at any level.
>>    - Load initial images into Realm Memory from normal world memory
>>    - Schedule RECs (vCPUs) and handle exits
>>    - Inject virtual interrupts into the Realm
>>    - Service stage2 runtime faults with pages (provided by host, scrubbed by RMM).
>>    - Create "shared" mappings that can be accessed by VMM/Hyp.
>>    - Reclaim the memory allocated for the RAM and RTTs (Realm Translation Tables)
>>
>> However v1.0 of RMM specifications doesn't support:
>>   - Paging protected memory of a Realm VM. Thus the pages backing the protected
>>     memory region must be pinned.
>>   - Live migration of Realms.
>>   - Trusted Device assignment.
>>   - Physical interrupt backed Virtual interrupts for Realms
>>
>> RMM also provides certain services to the Realms via SMC, called Realm Service
>> Interface (RSI). These include:
>>   - Realm Guest Configuration.
>>   - Attestation & Measurement services
>>   - Managing the state of an Intermediate Physical Address (IPA aka GPA) page.
>>   - Host Call service (Communication with the Normal world Hypervisor)
>>
>> The specification for the RMM software is currently at *v1.0-Beta2* and the
>> latest version is available here [1].
>>
>> The Trusted Firmware foundation has an implementation of the RMM - TF-RMM -
>> available here [3].
>>
>> Implementation
>> =================
>>
>> This version of the stack is based on the RMM specification v1.0-Beta0[2], with
>> following exceptions :
>>    - TF-RMM/KVM currently doesn't support the optional features of PMU,
>>       SVE and Self-hosted debug (coming soon).
>>    - The RSI_HOST_CALL structure alignment requirement is reduced to match
>>       RMM v1.0 Beta1
>>    - RMI/RSI version numbers do not match the RMM spec. This will be
>>      resolved once the spec/implementation is complete, across TF-RMM+Linux stack.
>>
>> We plan to update the stack to support the latest version of the RMMv1.0 spec
>> in the coming revisions.
>>
>> This release includes the following components :
>>
>>   a) Linux Kernel
>>       i) Host / KVM support - Support for driving the Realms via RMI. This is
>>       dependent on the kernel running at EL2 (aka VHE mode). Also provides
>>       UABI for VMMs to manage the Realm VMs. The support is restricted to 4K page
>>       size, matching the Stage2 granule supported by RMM. The VMM is responsible
>>       for making sure the guest memory is locked.
>>
>>         TODO: Guest Private memory[10] integration - We have been following the
>>         series and support will be added once it is merged upstream.
>>       
>>       ii) Guest support - Support for a Linux Kernel to run in the Realm VM at
>>       Realm-EL1, using RSI services. This includes virtio support (virtio-v1.0
>>       only). All I/O are treated as non-secure/shared.
>>   
>>   c) kvmtool - VMM changes required to manage Realm VMs. No guest private memory
>>      as mentioned above.
>>   d) kvm-unit-tests - Support for running in Realms along with additional tests
>>      for RSI ABI.
>>
>> Running the stack
>> ====================
>>
>> To run/test the stack, you would need the following components :
>>
>> 1) FVP Base AEM RevC model with FEAT_RME support [4]
>> 2) TF-A firmware for EL3 [5]
>> 3) TF-A RMM for R-EL2 [3]
>> 4) Linux Kernel [6]
>> 5) kvmtool [7]
>> 6) kvm-unit-tests [8]
>>
>> Instructions for building the firmware components and running the model are
>> available here [9]. Once the host kernel is booted, a Realm can be launched by
>> invoking the `lkvm` command as follows:
>>
>>   $ lkvm run --realm                            \
>>        --measurement-algo=["sha256", "sha512"] \
>>        --disable-sve                           \
>>        <normal-vm-options>
>>
>> Where:
>>   * --measurement-algo (Optional) specifies the algorithm selected for creating the
>>     initial measurements by the RMM for this Realm (defaults to sha256).
>>   * GICv3 is mandatory for the Realms.
>>   * SVE is not yet supported in the TF-RMM, and thus must be disabled using
>>     --disable-sve
>>
>> You may also run the kvm-unit-tests inside the Realm world, using similar
>> options to those above.
>>
>>
>> Links
>> ============
>>
>> [0] Arm CCA Landing page (See Key Resources section for various documentations)
>>      https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture
>>
>> [1] RMM Specification Latest
>>      https://developer.arm.com/documentation/den0137/latest
>>
>> [2] RMM v1.0-Beta0 specification
>>      https://developer.arm.com/documentation/den0137/1-0bet0/
>>
>> [3] Trusted Firmware RMM - TF-RMM
>>      https://www.trustedfirmware.org/projects/tf-rmm/
>>      GIT: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
>>
>> [4] FVP Base RevC AEM Model (available on x86_64 / Arm64 Linux)
>>      https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms
>>
>> [5] Trusted Firmware for A class
>>      https://www.trustedfirmware.org/projects/tf-a/
>>
>> [6] Linux kernel support for Arm-CCA
>>      https://gitlab.arm.com/linux-arm/linux-cca
>>      Host Support branch:	cca-host/rfc-v1
>>      Guest Support branch:	cca-guest/rfc-v1
>>
>> [7] kvmtool support for Arm CCA
>>      https://gitlab.arm.com/linux-arm/kvmtool-cca cca/rfc-v1
>>
>> [8] kvm-unit-tests support for Arm CCA
>>      https://gitlab.arm.com/linux-arm/kvm-unit-tests-cca  cca/rfc-v1
>>
>> [9] Instructions for Building Firmware components and running the model, see
>>      section 4.19.2 "Building and running TF-A with RME"
>>      https://trustedfirmware-a.readthedocs.io/en/latest/components/realm-management-extension.html#building-and-running-tf-a-with-rme
>>
>> [10] fd based Guest Private memory for KVM
>>     https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng@linux.intel.com
>>
>> Cc: Alexandru Elisei <alexandru.elisei@arm.com>
>> Cc: Andrew Jones <andrew.jones@linux.dev>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Chao Peng <chao.p.peng@linux.intel.com>
>> Cc: Christoffer Dall <christoffer.dall@arm.com>
>> Cc: Fuad Tabba <tabba@google.com>
>> Cc: James Morse <james.morse@arm.com>
>> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
>> Cc: Joey Gouly <Joey.Gouly@arm.com>
>> Cc: Marc Zyngier <maz@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Oliver Upton <oliver.upton@linux.dev>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Quentin Perret <qperret@google.com>
>> Cc: Sean Christopherson <seanjc@google.com>
>> Cc: Steven Price <steven.price@arm.com>
>> Cc: Thomas Huth <thuth@redhat.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Zenghui Yu <yuzenghui@huawei.com>
>> To: linux-coco@lists.linux.dev
>> To: kvmarm@lists.linux.dev
>> Cc: kvmarm@lists.cs.columbia.edu
>> Cc: linux-arm-kernel@lists.infradead.org
>> To: linux-kernel@vger.kernel.org
>> To: kvm@vger.kernel.org
>>



* Re: [RFC PATCH 05/28] arm64: RME: Define the user ABI
  2023-02-13 16:04     ` Zhi Wang
@ 2023-03-01 11:54       ` Steven Price
  2023-03-01 20:21         ` Zhi Wang
  0 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-03-01 11:54 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On 13/02/2023 16:04, Zhi Wang wrote:
> On Fri, 27 Jan 2023 11:29:09 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> There is one (multiplexed) CAP which can be used to create, populate and
>> then activate the realm.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  Documentation/virt/kvm/api.rst    |  1 +
>>  arch/arm64/include/uapi/asm/kvm.h | 63 +++++++++++++++++++++++++++++++
>>  include/uapi/linux/kvm.h          |  2 +
>>  3 files changed, 66 insertions(+)
>>
>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>> index 0dd5d8733dd5..f1a59d6fb7fc 100644
>> --- a/Documentation/virt/kvm/api.rst
>> +++ b/Documentation/virt/kvm/api.rst
>> @@ -4965,6 +4965,7 @@ Recognised values for feature:
>>  
>>    =====      ===========================================
>>    arm64      KVM_ARM_VCPU_SVE (requires KVM_CAP_ARM_SVE)
>> +  arm64      KVM_ARM_VCPU_REC (requires KVM_CAP_ARM_RME)
>>    =====      ===========================================
>>  
>>  Finalizes the configuration of the specified vcpu feature.
>> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
>> index a7a857f1784d..fcc0b8dce29b 100644
>> --- a/arch/arm64/include/uapi/asm/kvm.h
>> +++ b/arch/arm64/include/uapi/asm/kvm.h
>> @@ -109,6 +109,7 @@ struct kvm_regs {
>>  #define KVM_ARM_VCPU_SVE		4 /* enable SVE for this CPU */
>>  #define KVM_ARM_VCPU_PTRAUTH_ADDRESS	5 /* VCPU uses address authentication */
>>  #define KVM_ARM_VCPU_PTRAUTH_GENERIC	6 /* VCPU uses generic authentication */
>> +#define KVM_ARM_VCPU_REC		7 /* VCPU REC state as part of Realm */
>>  
>>  struct kvm_vcpu_init {
>>  	__u32 target;
>> @@ -401,6 +402,68 @@ enum {
>>  #define   KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES	3
>>  #define   KVM_DEV_ARM_ITS_CTRL_RESET		4
>>  
>> +/* KVM_CAP_ARM_RME on VM fd */
>> +#define KVM_CAP_ARM_RME_CONFIG_REALM		0
>> +#define KVM_CAP_ARM_RME_CREATE_RD		1
>> +#define KVM_CAP_ARM_RME_INIT_IPA_REALM		2
>> +#define KVM_CAP_ARM_RME_POPULATE_REALM		3
>> +#define KVM_CAP_ARM_RME_ACTIVATE_REALM		4
>> +
> 
> It is a little bit confusing here. These seem more like 'commands', not caps.
> Will leave more comments after reviewing the later patches.

Sorry for the slow response. Thank you for your review - I hope to post
a new version of this series (rebased on 6.3-rc1) in the coming weeks
with your comments addressed.

They are indeed commands - and using caps is a bit of a hack. The
benefit here is that all the Realm commands are behind the one
KVM_CAP_ARM_RME.

The options I can see are:

a) What I've got here - (ab)using KVM_ENABLE_CAP to perform commands.

b) Add new ioctls for each of the above stages (so 5 new ioctls on top
of the CAP for discovery). With any future extensions requiring new ioctls.

c) Add a single new multiplexing ioctl (along with the CAP for discovery).

I'm not massively keen on defining a new multiplexing scheme (c), but
equally (b) seems like it's burning through ioctl numbers. Which led me
to stick with (a) which at least keeps the rebasing simple (there's only
the one CAP number which could conflict) and there's already a
multiplexing scheme.

But I'm happy to change if there's consensus a different approach would
be preferable.

Thanks,

Steve

>> +#define KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256		0
>> +#define KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA512		1
>> +
>> +#define KVM_CAP_ARM_RME_RPV_SIZE 64
>> +
>> +/* List of configuration items accepted for KVM_CAP_ARM_RME_CONFIG_REALM */
>> +#define KVM_CAP_ARM_RME_CFG_RPV			0
>> +#define KVM_CAP_ARM_RME_CFG_HASH_ALGO		1
>> +#define KVM_CAP_ARM_RME_CFG_SVE			2
>> +#define KVM_CAP_ARM_RME_CFG_DBG			3
>> +#define KVM_CAP_ARM_RME_CFG_PMU			4
>> +
>> +struct kvm_cap_arm_rme_config_item {
>> +	__u32 cfg;
>> +	union {
>> +		/* cfg == KVM_CAP_ARM_RME_CFG_RPV */
>> +		struct {
>> +			__u8	rpv[KVM_CAP_ARM_RME_RPV_SIZE];
>> +		};
>> +
>> +		/* cfg == KVM_CAP_ARM_RME_CFG_HASH_ALGO */
>> +		struct {
>> +			__u32	hash_algo;
>> +		};
>> +
>> +		/* cfg == KVM_CAP_ARM_RME_CFG_SVE */
>> +		struct {
>> +			__u32	sve_vq;
>> +		};
>> +
>> +		/* cfg == KVM_CAP_ARM_RME_CFG_DBG */
>> +		struct {
>> +			__u32	num_brps;
>> +			__u32	num_wrps;
>> +		};
>> +
>> +		/* cfg == KVM_CAP_ARM_RME_CFG_PMU */
>> +		struct {
>> +			__u32	num_pmu_cntrs;
>> +		};
>> +		/* Fix the size of the union */
>> +		__u8	reserved[256];
>> +	};
>> +};
>> +
>> +struct kvm_cap_arm_rme_populate_realm_args {
>> +	__u64 populate_ipa_base;
>> +	__u64 populate_ipa_size;
>> +};
>> +
>> +struct kvm_cap_arm_rme_init_ipa_args {
>> +	__u64 init_ipa_base;
>> +	__u64 init_ipa_size;
>> +};
>> +
>>  /* Device Control API on vcpu fd */
>>  #define KVM_ARM_VCPU_PMU_V3_CTRL	0
>>  #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 20522d4ba1e0..fec1909e8b73 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -1176,6 +1176,8 @@ struct kvm_ppc_resize_hpt {
>>  #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
>>  #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
>>  
>> +#define KVM_CAP_ARM_RME 300 // FIXME: Large number to prevent conflicts
>> +
>>  #ifdef KVM_CAP_IRQ_ROUTING
>>  
>>  struct kvm_irq_routing_irqchip {
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms
  2023-02-13 16:10     ` Zhi Wang
@ 2023-03-01 11:55       ` Steven Price
  2023-03-01 20:33         ` Zhi Wang
  0 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-03-01 11:55 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On 13/02/2023 16:10, Zhi Wang wrote:
> On Fri, 27 Jan 2023 11:29:10 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> Add the KVM_CAP_ARM_RME_CREATE_RD command to create a realm. This involves
>> delegating pages to the RMM to hold the Realm Descriptor (RD) and for
>> the base level of the Realm Translation Tables (RTT). A VMID also needs
>> to be picked; since the RMM has a separate VMID address space, a
>> dedicated allocator is added for this purpose.
>>
>> KVM_CAP_ARM_RME_CONFIG_REALM is provided to allow configuring the realm
>> before it is created.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  arch/arm64/include/asm/kvm_rme.h |  14 ++
>>  arch/arm64/kvm/arm.c             |  19 ++
>>  arch/arm64/kvm/mmu.c             |   6 +
>>  arch/arm64/kvm/reset.c           |  33 +++
>>  arch/arm64/kvm/rme.c             | 357 +++++++++++++++++++++++++++++++
>>  5 files changed, 429 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
>> index c26bc2c6770d..055a22accc08 100644
>> --- a/arch/arm64/include/asm/kvm_rme.h
>> +++ b/arch/arm64/include/asm/kvm_rme.h
>> @@ -6,6 +6,8 @@
>>  #ifndef __ASM_KVM_RME_H
>>  #define __ASM_KVM_RME_H
>>  
>> +#include <uapi/linux/kvm.h>
>> +
>>  enum realm_state {
>>  	REALM_STATE_NONE,
>>  	REALM_STATE_NEW,
>> @@ -15,8 +17,20 @@ enum realm_state {
>>  
>>  struct realm {
>>  	enum realm_state state;
>> +
>> +	void *rd;
>> +	struct realm_params *params;
>> +
>> +	unsigned long num_aux;
>> +	unsigned int vmid;
>> +	unsigned int ia_bits;
>>  };
>>  
> 
> Maybe more comments for this structure?

Agreed, this series is a bit light on comments. I'll try to improve for v2.

<snip>

> 
> Just curious. Wouldn't it be better to use IDR as this is ID allocation? There
> were some efforts to change the use of bitmap allocation to IDR before.

I'm not sure it makes much difference really. This matches KVM's
vmid_map, but here things are much more simple as there's no support for
the likes of VMID rollover (the number of Realm VMs is just capped at
the number of VMIDs).

IDR provides a lot of functionality we don't need, but equally I don't
think performance or memory usage are really a concern here.

Steve

>> +/* Protects access to rme_vmid_bitmap */
>> +static DEFINE_SPINLOCK(rme_vmid_lock);
>> +static unsigned long *rme_vmid_bitmap;
>> +
>> +static int rme_vmid_init(void)
>> +{
>> +	unsigned int vmid_count = 1 << kvm_get_vmid_bits();
>> +
>> +	rme_vmid_bitmap = bitmap_zalloc(vmid_count, GFP_KERNEL);
>> +	if (!rme_vmid_bitmap) {
>> +		kvm_err("%s: Couldn't allocate rme vmid bitmap\n", __func__);
>> +		return -ENOMEM;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int rme_vmid_reserve(void)
>> +{
>> +	int ret;
>> +	unsigned int vmid_count = 1 << kvm_get_vmid_bits();
>> +
>> +	spin_lock(&rme_vmid_lock);
>> +	ret = bitmap_find_free_region(rme_vmid_bitmap, vmid_count, 0);
>> +	spin_unlock(&rme_vmid_lock);
>> +
>> +	return ret;
>> +}
>> +
>> +static void rme_vmid_release(unsigned int vmid)
>> +{
>> +	spin_lock(&rme_vmid_lock);
>> +	bitmap_release_region(rme_vmid_bitmap, vmid, 0);
>> +	spin_unlock(&rme_vmid_lock);
>> +}
>> +
>> +static int kvm_create_realm(struct kvm *kvm)
>> +{
>> +	struct realm *realm = &kvm->arch.realm;
>> +	int ret;
>> +
>> +	if (!kvm_is_realm(kvm) || kvm_realm_state(kvm) != REALM_STATE_NONE)
>> +		return -EEXIST;
>> +
>> +	ret = rme_vmid_reserve();
>> +	if (ret < 0)
>> +		return ret;
>> +	realm->vmid = ret;
>> +
>> +	ret = realm_create_rd(kvm);
>> +	if (ret) {
>> +		rme_vmid_release(realm->vmid);
>> +		return ret;
>> +	}
>> +
>> +	WRITE_ONCE(realm->state, REALM_STATE_NEW);
>> +
>> +	/* The realm is up, free the parameters.  */
>> +	free_page((unsigned long)realm->params);
>> +	realm->params = NULL;
>> +
>> +	return 0;
>> +}
>> +
>> +static int config_realm_hash_algo(struct realm *realm,
>> +				  struct kvm_cap_arm_rme_config_item *cfg)
>> +{
>> +	switch (cfg->hash_algo) {
>> +	case KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256:
>> +		if (!rme_supports(RMI_FEATURE_REGISTER_0_HASH_SHA_256))
>> +			return -EINVAL;
>> +		break;
>> +	case KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA512:
>> +		if (!rme_supports(RMI_FEATURE_REGISTER_0_HASH_SHA_512))
>> +			return -EINVAL;
>> +		break;
>> +	default:
>> +		return -EINVAL;
>> +	}
>> +	realm->params->measurement_algo = cfg->hash_algo;
>> +	return 0;
>> +}
>> +
>> +static int config_realm_sve(struct realm *realm,
>> +			    struct kvm_cap_arm_rme_config_item *cfg)
>> +{
>> +	u64 features_0 = realm->params->features_0;
>> +	int max_sve_vq = u64_get_bits(rmm_feat_reg0,
>> +				      RMI_FEATURE_REGISTER_0_SVE_VL);
>> +
>> +	if (!rme_supports(RMI_FEATURE_REGISTER_0_SVE_EN))
>> +		return -EINVAL;
>> +
>> +	if (cfg->sve_vq > max_sve_vq)
>> +		return -EINVAL;
>> +
>> +	features_0 &= ~(RMI_FEATURE_REGISTER_0_SVE_EN |
>> +			RMI_FEATURE_REGISTER_0_SVE_VL);
>> +	features_0 |= u64_encode_bits(1, RMI_FEATURE_REGISTER_0_SVE_EN);
>> +	features_0 |= u64_encode_bits(cfg->sve_vq,
>> +				      RMI_FEATURE_REGISTER_0_SVE_VL);
>> +
>> +	realm->params->features_0 = features_0;
>> +	return 0;
>> +}
>> +
>> +static int kvm_rme_config_realm(struct kvm *kvm, struct kvm_enable_cap *cap)
>> +{
>> +	struct kvm_cap_arm_rme_config_item cfg;
>> +	struct realm *realm = &kvm->arch.realm;
>> +	int r = 0;
>> +
>> +	if (kvm_realm_state(kvm) != REALM_STATE_NONE)
>> +		return -EBUSY;
>> +
>> +	if (copy_from_user(&cfg, (void __user *)cap->args[1], sizeof(cfg)))
>> +		return -EFAULT;
>> +
>> +	switch (cfg.cfg) {
>> +	case KVM_CAP_ARM_RME_CFG_RPV:
>> +		memcpy(&realm->params->rpv, &cfg.rpv, sizeof(cfg.rpv));
>> +		break;
>> +	case KVM_CAP_ARM_RME_CFG_HASH_ALGO:
>> +		r = config_realm_hash_algo(realm, &cfg);
>> +		break;
>> +	case KVM_CAP_ARM_RME_CFG_SVE:
>> +		r = config_realm_sve(realm, &cfg);
>> +		break;
>> +	default:
>> +		r = -EINVAL;
>> +	}
>> +
>> +	return r;
>> +}
>> +
>> +int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>> +{
>> +	int r = 0;
>> +
>> +	switch (cap->args[0]) {
>> +	case KVM_CAP_ARM_RME_CONFIG_REALM:
>> +		r = kvm_rme_config_realm(kvm, cap);
>> +		break;
>> +	case KVM_CAP_ARM_RME_CREATE_RD:
>> +		if (kvm->created_vcpus) {
>> +			r = -EBUSY;
>> +			break;
>> +		}
>> +
>> +		r = kvm_create_realm(kvm);
>> +		break;
>> +	default:
>> +		r = -EINVAL;
>> +		break;
>> +	}
>> +
>> +	return r;
>> +}
>> +
>> +void kvm_destroy_realm(struct kvm *kvm)
>> +{
>> +	struct realm *realm = &kvm->arch.realm;
>> +	struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
>> +	unsigned int pgd_sz;
>> +	int i;
>> +
>> +	if (realm->params) {
>> +		free_page((unsigned long)realm->params);
>> +		realm->params = NULL;
>> +	}
>> +
>> +	if (kvm_realm_state(kvm) == REALM_STATE_NONE)
>> +		return;
>> +
>> +	WRITE_ONCE(realm->state, REALM_STATE_DYING);
>> +
>> +	rme_vmid_release(realm->vmid);
>> +
>> +	if (realm->rd) {
>> +		phys_addr_t rd_phys = virt_to_phys(realm->rd);
>> +
>> +		if (WARN_ON(rmi_realm_destroy(rd_phys)))
>> +			return;
>> +		if (WARN_ON(rmi_granule_undelegate(rd_phys)))
>> +			return;
>> +		free_page((unsigned long)realm->rd);
>> +		realm->rd = NULL;
>> +	}
>> +
>> +	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
>> +	for (i = 0; i < pgd_sz; i++) {
>> +		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
>> +
>> +		if (WARN_ON(rmi_granule_undelegate(pgd_phys)))
>> +			return;
>> +	}
>> +
>> +	kvm_free_stage2_pgd(&kvm->arch.mmu);
>> +}
>> +
>> +int kvm_init_realm_vm(struct kvm *kvm)
>> +{
>> +	struct realm_params *params;
>> +
>> +	params = (struct realm_params *)get_zeroed_page(GFP_KERNEL);
>> +	if (!params)
>> +		return -ENOMEM;
>> +
>> +	params->features_0 = create_realm_feat_reg0(kvm);
>> +	kvm->arch.realm.params = params;
>> +	return 0;
>> +}
>> +
>>  int kvm_init_rme(void)
>>  {
>> +	int ret;
>> +
>>  	if (PAGE_SIZE != SZ_4K)
>>  		/* Only 4k page size on the host is supported */
>>  		return 0;
>> @@ -43,6 +394,12 @@ int kvm_init_rme(void)
>>  		/* Continue without realm support */
>>  		return 0;
>>  
>> +	ret = rme_vmid_init();
>> +	if (ret)
>> +		return ret;
>> +
>> +	WARN_ON(rmi_features(0, &rmm_feat_reg0));
>> +
>>  	/* Future patch will enable static branch kvm_rme_is_available */
>>  
>>  	return 0;
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 07/28] arm64: kvm: Allow passing machine type in KVM creation
  2023-02-13 16:35     ` Zhi Wang
@ 2023-03-01 11:55       ` Steven Price
  0 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-03-01 11:55 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On 13/02/2023 16:35, Zhi Wang wrote:
> On Fri, 27 Jan 2023 11:29:11 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> Previously machine type was used purely for specifying the physical
>> address size of the guest. Reserve the higher bits to specify an ARM
>> specific machine type and declare a new type 'KVM_VM_TYPE_ARM_REALM'
>> used to create a realm guest.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  arch/arm64/kvm/arm.c     | 13 +++++++++++++
>>  arch/arm64/kvm/mmu.c     |  3 ---
>>  arch/arm64/kvm/reset.c   |  3 ---
>>  include/uapi/linux/kvm.h | 19 +++++++++++++++----
>>  4 files changed, 28 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index 50f54a63732a..badd775547b8 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -147,6 +147,19 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>>  {
>>  	int ret;
>>  
>> +	if (type & ~(KVM_VM_TYPE_ARM_MASK | KVM_VM_TYPE_ARM_IPA_SIZE_MASK))
>> +		return -EINVAL;
>> +
>> +	switch (type & KVM_VM_TYPE_ARM_MASK) {
>> +	case KVM_VM_TYPE_ARM_NORMAL:
>> +		break;
>> +	case KVM_VM_TYPE_ARM_REALM:
>> +		kvm->arch.is_realm = true;
> 
> Would it be better to let this call fail when !kvm_rme_is_available? It is
> strange to be able to create a VM with REALM type on a system that doesn't
> support RME.

Good point - I'll add a check here.

Thanks,

Steve

>> +		break;
>> +	default:
>> +		return -EINVAL;
>> +	}
>> +
>>  	ret = kvm_share_hyp(kvm, kvm + 1);
>>  	if (ret)
>>  		return ret;
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>> index d0f707767d05..22c00274884a 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -709,9 +709,6 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t
>>  	u64 mmfr0, mmfr1;
>>  	u32 phys_shift;
>>  
>> -	if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
>> -		return -EINVAL;
>> -
>>  	phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type);
>>  	if (is_protected_kvm_enabled()) {
>>  		phys_shift = kvm_ipa_limit;
>> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
>> index c165df174737..9e71d69e051f 100644
>> --- a/arch/arm64/kvm/reset.c
>> +++ b/arch/arm64/kvm/reset.c
>> @@ -405,9 +405,6 @@ int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
>>  	if (kvm_is_realm(kvm))
>>  		ipa_limit = kvm_realm_ipa_limit();
>>  
>> -	if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
>> -		return -EINVAL;
>> -
>>  	phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type);
>>  	if (phys_shift) {
>>  		if (phys_shift > ipa_limit ||
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index fec1909e8b73..bcfc4d58dc19 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -898,14 +898,25 @@ struct kvm_ppc_resize_hpt {
>>  #define KVM_S390_SIE_PAGE_OFFSET 1
>>  
>>  /*
>> - * On arm64, machine type can be used to request the physical
>> - * address size for the VM. Bits[7-0] are reserved for the guest
>> - * PA size shift (i.e, log2(PA_Size)). For backward compatibility,
>> - * value 0 implies the default IPA size, 40bits.
>> + * On arm64, machine type can be used to request both the machine type and
>> + * the physical address size for the VM.
>> + *
>> + * Bits[11-8] are reserved for the ARM specific machine type.
>> + *
>> + * Bits[7-0] are reserved for the guest PA size shift (i.e, log2(PA_Size)).
>> + * For backward compatibility, value 0 implies the default IPA size, 40bits.
>>   */
>> +#define KVM_VM_TYPE_ARM_SHIFT		8
>> +#define KVM_VM_TYPE_ARM_MASK		(0xfULL << KVM_VM_TYPE_ARM_SHIFT)
>> +#define KVM_VM_TYPE_ARM(_type)		\
>> +	(((_type) << KVM_VM_TYPE_ARM_SHIFT) & KVM_VM_TYPE_ARM_MASK)
>> +#define KVM_VM_TYPE_ARM_NORMAL		KVM_VM_TYPE_ARM(0)
>> +#define KVM_VM_TYPE_ARM_REALM		KVM_VM_TYPE_ARM(1)
>> +
>>  #define KVM_VM_TYPE_ARM_IPA_SIZE_MASK	0xffULL
>>  #define KVM_VM_TYPE_ARM_IPA_SIZE(x)		\
>>  	((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
>> +
>>  /*
>>   * ioctls for /dev/kvm fds:
>>   */
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 08/28] arm64: RME: Keep a spare page delegated to the RMM
  2023-02-13 16:47     ` Zhi Wang
@ 2023-03-01 11:55       ` Steven Price
  2023-03-01 20:50         ` Zhi Wang
  0 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-03-01 11:55 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On 13/02/2023 16:47, Zhi Wang wrote:
> On Fri, 27 Jan 2023 11:29:12 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> Pages can only be populated/destroyed on the RMM at the 4KB granule;
>> this requires creating the full depth of RTTs. However, if the pages are
>> going to be combined into a 4MB huge page, the last RTT is only
>> temporarily needed. Similarly, when freeing memory the huge page must be
>> temporarily split, requiring temporary usage of the full depth of RTTs.
>>
>> To avoid needing to perform a temporary allocation and delegation of a
>> page for this purpose we keep a spare delegated page around. In
>> particular this avoids the need for memory allocation while destroying
>> the realm guest.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  arch/arm64/include/asm/kvm_rme.h | 3 +++
>>  arch/arm64/kvm/rme.c             | 6 ++++++
>>  2 files changed, 9 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
>> index 055a22accc08..a6318af3ed11 100644
>> --- a/arch/arm64/include/asm/kvm_rme.h
>> +++ b/arch/arm64/include/asm/kvm_rme.h
>> @@ -21,6 +21,9 @@ struct realm {
>>  	void *rd;
>>  	struct realm_params *params;
>>  
>> +	/* A spare already delegated page */
>> +	phys_addr_t spare_page;
>> +
>>  	unsigned long num_aux;
>>  	unsigned int vmid;
>>  	unsigned int ia_bits;
>> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
>> index 9f8c5a91b8fc..0c9d70e4d9e6 100644
>> --- a/arch/arm64/kvm/rme.c
>> +++ b/arch/arm64/kvm/rme.c
>> @@ -148,6 +148,7 @@ static int realm_create_rd(struct kvm *kvm)
>>  	}
>>  
>>  	realm->rd = rd;
>> +	realm->spare_page = PHYS_ADDR_MAX;
>>  	realm->ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
>>  
>>  	if (WARN_ON(rmi_rec_aux_count(rd_phys, &realm->num_aux))) {
>> @@ -357,6 +358,11 @@ void kvm_destroy_realm(struct kvm *kvm)
>>  		free_page((unsigned long)realm->rd);
>>  		realm->rd = NULL;
>>  	}
>> +	if (realm->spare_page != PHYS_ADDR_MAX) {
>> +		if (!WARN_ON(rmi_granule_undelegate(realm->spare_page)))
>> +			free_page((unsigned long)phys_to_virt(realm->spare_page));
> 
> Will the page be leaked (not usable for host and realms) if the undelegate
> failed? If yes, better at least put a comment.

Yes - I'll add a comment.

In general being unable to undelegate a page points to a programming
error in the host. The only reason the RMM should refuse the request is
if the page is in use by a Realm which the host has configured. So the
WARN() is correct (there's a kernel bug) and the only sensible course of
action is to leak the page and limp on.

Thanks,

Steve

>> +		realm->spare_page = PHYS_ADDR_MAX;
>> +	}
>>  
>>  	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
>>  	for (i = 0; i < pgd_sz; i++) {
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 05/28] arm64: RME: Define the user ABI
  2023-03-01 11:54       ` Steven Price
@ 2023-03-01 20:21         ` Zhi Wang
  0 siblings, 0 replies; 190+ messages in thread
From: Zhi Wang @ 2023-03-01 20:21 UTC (permalink / raw)
  To: Steven Price
  Cc: Zhi Wang, kvm, kvmarm, Catalin Marinas, Marc Zyngier,
	Will Deacon, James Morse, Oliver Upton, Suzuki K Poulose,
	Zenghui Yu, linux-arm-kernel, linux-kernel, Joey Gouly,
	Alexandru Elisei, Christoffer Dall, Fuad Tabba, linux-coco

On Wed, 1 Mar 2023 11:54:34 +0000
Steven Price <steven.price@arm.com> wrote:

> On 13/02/2023 16:04, Zhi Wang wrote:
> > On Fri, 27 Jan 2023 11:29:09 +0000
> > Steven Price <steven.price@arm.com> wrote:
> > 
> >> There is one (multiplexed) CAP which can be used to create, populate and
> >> then activate the realm.
> >>
> >> Signed-off-by: Steven Price <steven.price@arm.com>
> >> ---
> >>  Documentation/virt/kvm/api.rst    |  1 +
> >>  arch/arm64/include/uapi/asm/kvm.h | 63 +++++++++++++++++++++++++++++++
> >>  include/uapi/linux/kvm.h          |  2 +
> >>  3 files changed, 66 insertions(+)
> >>
> >> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> >> index 0dd5d8733dd5..f1a59d6fb7fc 100644
> >> --- a/Documentation/virt/kvm/api.rst
> >> +++ b/Documentation/virt/kvm/api.rst
> >> @@ -4965,6 +4965,7 @@ Recognised values for feature:
> >>  
> >>    =====      ===========================================
> >>    arm64      KVM_ARM_VCPU_SVE (requires KVM_CAP_ARM_SVE)
> >> +  arm64      KVM_ARM_VCPU_REC (requires KVM_CAP_ARM_RME)
> >>    =====      ===========================================
> >>  
> >>  Finalizes the configuration of the specified vcpu feature.
> >> diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
> >> index a7a857f1784d..fcc0b8dce29b 100644
> >> --- a/arch/arm64/include/uapi/asm/kvm.h
> >> +++ b/arch/arm64/include/uapi/asm/kvm.h
> >> @@ -109,6 +109,7 @@ struct kvm_regs {
> >>  #define KVM_ARM_VCPU_SVE		4 /* enable SVE for this CPU */
> >>  #define KVM_ARM_VCPU_PTRAUTH_ADDRESS	5 /* VCPU uses address authentication */
> >>  #define KVM_ARM_VCPU_PTRAUTH_GENERIC	6 /* VCPU uses generic authentication */
> >> +#define KVM_ARM_VCPU_REC		7 /* VCPU REC state as part of Realm */
> >>  
> >>  struct kvm_vcpu_init {
> >>  	__u32 target;
> >> @@ -401,6 +402,68 @@ enum {
> >>  #define   KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES	3
> >>  #define   KVM_DEV_ARM_ITS_CTRL_RESET		4
> >>  
> >> +/* KVM_CAP_ARM_RME on VM fd */
> >> +#define KVM_CAP_ARM_RME_CONFIG_REALM		0
> >> +#define KVM_CAP_ARM_RME_CREATE_RD		1
> >> +#define KVM_CAP_ARM_RME_INIT_IPA_REALM		2
> >> +#define KVM_CAP_ARM_RME_POPULATE_REALM		3
> >> +#define KVM_CAP_ARM_RME_ACTIVATE_REALM		4
> >> +
> > 
> > It is a little bit confusing here. These seem more like 'commands', not caps.
> > Will leave more comments after reviewing the later patches.
> 
> Sorry for the slow response. Thank you for your review - I hope to post
> a new version of this series (rebased on 6.3-rc1) in the coming weeks
> with your comments addressed.
> 

Hi:

No worries. I have spent most of my time recently closing out the review
of the TDX/SNP series, and stopped at patch 16 of this series. I will try
to finish reviewing the rest of this series this week.

I am glad if my efforts help and more reviewers can smoothly jump in
later.
 
> They are indeed commands - and using caps is a bit of a hack. The
> benefit here is that all the Realm commands are behind the one
> KVM_CAP_ARM_RME.
> 
> The options I can see are:
> 
> a) What I've got here - (ab)using KVM_ENABLE_CAP to perform commands.
> 
> b) Add new ioctls for each of the above stages (so 5 new ioctls on top
> of the CAP for discovery). With any future extensions requiring new ioctls.
> 
> c) Add a single new multiplexing ioctl (along with the CAP for discovery).
> 
> I'm not massively keen on defining a new multiplexing scheme (c), but
> equally (b) seems like it's burning through ioctl numbers. Which led me
> to stick with (a) which at least keeps the rebasing simple (there's only
> the one CAP number which could conflict) and there's already a
> multiplexing scheme.
> 
> But I'm happy to change if there's consensus a different approach would
> be preferable.
> 

Let's see if others have different opinions.

My vote goes to b), as it is better to respect "what it is, make it explicit
and clear" when it comes to defining the UABI. The ioctl number space is part
of the UABI; if it is going to run out, IMHO, we need to find another way,
perhaps another fd to group those ioctls, as KVM itself does.

1. a) seems to abuse the usage of the cap, although, for sure, the benefit
is obvious.
2. c) seems to hide the details, which saves ioctl numbers, but it doesn't
actually help much with the complexity and might end up with another bunch
of "command codes".

> Thanks,
> 
> Steve
> 
> >> +#define KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256		0
> >> +#define KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA512		1
> >> +
> >> +#define KVM_CAP_ARM_RME_RPV_SIZE 64
> >> +
> >> +/* List of configuration items accepted for KVM_CAP_ARM_RME_CONFIG_REALM */
> >> +#define KVM_CAP_ARM_RME_CFG_RPV			0
> >> +#define KVM_CAP_ARM_RME_CFG_HASH_ALGO		1
> >> +#define KVM_CAP_ARM_RME_CFG_SVE			2
> >> +#define KVM_CAP_ARM_RME_CFG_DBG			3
> >> +#define KVM_CAP_ARM_RME_CFG_PMU			4
> >> +
> >> +struct kvm_cap_arm_rme_config_item {
> >> +	__u32 cfg;
> >> +	union {
> >> +		/* cfg == KVM_CAP_ARM_RME_CFG_RPV */
> >> +		struct {
> >> +			__u8	rpv[KVM_CAP_ARM_RME_RPV_SIZE];
> >> +		};
> >> +
> >> +		/* cfg == KVM_CAP_ARM_RME_CFG_HASH_ALGO */
> >> +		struct {
> >> +			__u32	hash_algo;
> >> +		};
> >> +
> >> +		/* cfg == KVM_CAP_ARM_RME_CFG_SVE */
> >> +		struct {
> >> +			__u32	sve_vq;
> >> +		};
> >> +
> >> +		/* cfg == KVM_CAP_ARM_RME_CFG_DBG */
> >> +		struct {
> >> +			__u32	num_brps;
> >> +			__u32	num_wrps;
> >> +		};
> >> +
> >> +		/* cfg == KVM_CAP_ARM_RME_CFG_PMU */
> >> +		struct {
> >> +			__u32	num_pmu_cntrs;
> >> +		};
> >> +		/* Fix the size of the union */
> >> +		__u8	reserved[256];
> >> +	};
> >> +};
> >> +
> >> +struct kvm_cap_arm_rme_populate_realm_args {
> >> +	__u64 populate_ipa_base;
> >> +	__u64 populate_ipa_size;
> >> +};
> >> +
> >> +struct kvm_cap_arm_rme_init_ipa_args {
> >> +	__u64 init_ipa_base;
> >> +	__u64 init_ipa_size;
> >> +};
> >> +
> >>  /* Device Control API on vcpu fd */
> >>  #define KVM_ARM_VCPU_PMU_V3_CTRL	0
> >>  #define   KVM_ARM_VCPU_PMU_V3_IRQ	0
> >> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> >> index 20522d4ba1e0..fec1909e8b73 100644
> >> --- a/include/uapi/linux/kvm.h
> >> +++ b/include/uapi/linux/kvm.h
> >> @@ -1176,6 +1176,8 @@ struct kvm_ppc_resize_hpt {
> >>  #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
> >>  #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
> >>  
> >> +#define KVM_CAP_ARM_RME 300 // FIXME: Large number to prevent conflicts
> >> +
> >>  #ifdef KVM_CAP_IRQ_ROUTING
> >>  
> >>  struct kvm_irq_routing_irqchip {
> > 
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms
  2023-03-01 11:55       ` Steven Price
@ 2023-03-01 20:33         ` Zhi Wang
  0 siblings, 0 replies; 190+ messages in thread
From: Zhi Wang @ 2023-03-01 20:33 UTC (permalink / raw)
  To: Steven Price
  Cc: Zhi Wang, kvm, kvmarm, Catalin Marinas, Marc Zyngier,
	Will Deacon, James Morse, Oliver Upton, Suzuki K Poulose,
	Zenghui Yu, linux-arm-kernel, linux-kernel, Joey Gouly,
	Alexandru Elisei, Christoffer Dall, Fuad Tabba, linux-coco

On Wed, 1 Mar 2023 11:55:17 +0000
Steven Price <steven.price@arm.com> wrote:

> On 13/02/2023 16:10, Zhi Wang wrote:
> > On Fri, 27 Jan 2023 11:29:10 +0000
> > Steven Price <steven.price@arm.com> wrote:
> > 
> >> Add the KVM_CAP_ARM_RME_CREATE_RD command to create a realm. This involves
> >> delegating pages to the RMM to hold the Realm Descriptor (RD) and for
> >> the base level of the Realm Translation Tables (RTT). A VMID also needs
> >> to be picked; since the RMM has a separate VMID address space, a
> >> dedicated allocator is added for this purpose.
> >>
> >> KVM_CAP_ARM_RME_CONFIG_REALM is provided to allow configuring the realm
> >> before it is created.
> >>
> >> Signed-off-by: Steven Price <steven.price@arm.com>
> >> ---
> >>  arch/arm64/include/asm/kvm_rme.h |  14 ++
> >>  arch/arm64/kvm/arm.c             |  19 ++
> >>  arch/arm64/kvm/mmu.c             |   6 +
> >>  arch/arm64/kvm/reset.c           |  33 +++
> >>  arch/arm64/kvm/rme.c             | 357 +++++++++++++++++++++++++++++++
> >>  5 files changed, 429 insertions(+)
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> >> index c26bc2c6770d..055a22accc08 100644
> >> --- a/arch/arm64/include/asm/kvm_rme.h
> >> +++ b/arch/arm64/include/asm/kvm_rme.h
> >> @@ -6,6 +6,8 @@
> >>  #ifndef __ASM_KVM_RME_H
> >>  #define __ASM_KVM_RME_H
> >>  
> >> +#include <uapi/linux/kvm.h>
> >> +
> >>  enum realm_state {
> >>  	REALM_STATE_NONE,
> >>  	REALM_STATE_NEW,
> >> @@ -15,8 +17,20 @@ enum realm_state {
> >>  
> >>  struct realm {
> >>  	enum realm_state state;
> >> +
> >> +	void *rd;
> >> +	struct realm_params *params;
> >> +
> >> +	unsigned long num_aux;
> >> +	unsigned int vmid;
> >> +	unsigned int ia_bits;
> >>  };
> >>  
> > 
> > Maybe more comments for this structure?
> 
> Agreed, this series is a bit light on comments. I'll try to improve for v2.
> 
> <snip>
> 
> > 
> > Just curious. Wouldn't it be better to use IDR as this is ID allocation? There
> > were some efforts to change the use of bitmap allocation to IDR before.
> 
> I'm not sure it makes much difference really. This matches KVM's
> vmid_map, but here things are much more simple as there's no support for
> the likes of VMID rollover (the number of Realm VMs is just capped at
> the number of VMIDs).
> 
> IDR provides a lot of functionality we don't need, but equally I don't
> think performance or memory usage are really a concern here.

Agree. I am not opposed to the current approach. I gave this comment because
I vaguely remember there were some patch series to convert bitmap allocation
to IDR in the kernel before, so I think it is better to raise it and reach a
conclusion. It would save some effort for the people who might jump into the
review later.

> 
> Steve
> 
> >> +/* Protects access to rme_vmid_bitmap */
> >> +static DEFINE_SPINLOCK(rme_vmid_lock);
> >> +static unsigned long *rme_vmid_bitmap;
> >> +
> >> +static int rme_vmid_init(void)
> >> +{
> >> +	unsigned int vmid_count = 1 << kvm_get_vmid_bits();
> >> +
> >> +	rme_vmid_bitmap = bitmap_zalloc(vmid_count, GFP_KERNEL);
> >> +	if (!rme_vmid_bitmap) {
> >> +		kvm_err("%s: Couldn't allocate rme vmid bitmap\n", __func__);
> >> +		return -ENOMEM;
> >> +	}
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +static int rme_vmid_reserve(void)
> >> +{
> >> +	int ret;
> >> +	unsigned int vmid_count = 1 << kvm_get_vmid_bits();
> >> +
> >> +	spin_lock(&rme_vmid_lock);
> >> +	ret = bitmap_find_free_region(rme_vmid_bitmap, vmid_count, 0);
> >> +	spin_unlock(&rme_vmid_lock);
> >> +
> >> +	return ret;
> >> +}
> >> +
> >> +static void rme_vmid_release(unsigned int vmid)
> >> +{
> >> +	spin_lock(&rme_vmid_lock);
> >> +	bitmap_release_region(rme_vmid_bitmap, vmid, 0);
> >> +	spin_unlock(&rme_vmid_lock);
> >> +}
> >> +
> >> +static int kvm_create_realm(struct kvm *kvm)
> >> +{
> >> +	struct realm *realm = &kvm->arch.realm;
> >> +	int ret;
> >> +
> >> +	if (!kvm_is_realm(kvm) || kvm_realm_state(kvm) != REALM_STATE_NONE)
> >> +		return -EEXIST;
> >> +
> >> +	ret = rme_vmid_reserve();
> >> +	if (ret < 0)
> >> +		return ret;
> >> +	realm->vmid = ret;
> >> +
> >> +	ret = realm_create_rd(kvm);
> >> +	if (ret) {
> >> +		rme_vmid_release(realm->vmid);
> >> +		return ret;
> >> +	}
> >> +
> >> +	WRITE_ONCE(realm->state, REALM_STATE_NEW);
> >> +
> >> +	/* The realm is up, free the parameters.  */
> >> +	free_page((unsigned long)realm->params);
> >> +	realm->params = NULL;
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +static int config_realm_hash_algo(struct realm *realm,
> >> +				  struct kvm_cap_arm_rme_config_item *cfg)
> >> +{
> >> +	switch (cfg->hash_algo) {
> >> +	case KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256:
> >> +		if (!rme_supports(RMI_FEATURE_REGISTER_0_HASH_SHA_256))
> >> +			return -EINVAL;
> >> +		break;
> >> +	case KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA512:
> >> +		if (!rme_supports(RMI_FEATURE_REGISTER_0_HASH_SHA_512))
> >> +			return -EINVAL;
> >> +		break;
> >> +	default:
> >> +		return -EINVAL;
> >> +	}
> >> +	realm->params->measurement_algo = cfg->hash_algo;
> >> +	return 0;
> >> +}
> >> +
> >> +static int config_realm_sve(struct realm *realm,
> >> +			    struct kvm_cap_arm_rme_config_item *cfg)
> >> +{
> >> +	u64 features_0 = realm->params->features_0;
> >> +	int max_sve_vq = u64_get_bits(rmm_feat_reg0,
> >> +				      RMI_FEATURE_REGISTER_0_SVE_VL);
> >> +
> >> +	if (!rme_supports(RMI_FEATURE_REGISTER_0_SVE_EN))
> >> +		return -EINVAL;
> >> +
> >> +	if (cfg->sve_vq > max_sve_vq)
> >> +		return -EINVAL;
> >> +
> >> +	features_0 &= ~(RMI_FEATURE_REGISTER_0_SVE_EN |
> >> +			RMI_FEATURE_REGISTER_0_SVE_VL);
> >> +	features_0 |= u64_encode_bits(1, RMI_FEATURE_REGISTER_0_SVE_EN);
> >> +	features_0 |= u64_encode_bits(cfg->sve_vq,
> >> +				      RMI_FEATURE_REGISTER_0_SVE_VL);
> >> +
> >> +	realm->params->features_0 = features_0;
> >> +	return 0;
> >> +}
> >> +
> >> +static int kvm_rme_config_realm(struct kvm *kvm, struct kvm_enable_cap *cap)
> >> +{
> >> +	struct kvm_cap_arm_rme_config_item cfg;
> >> +	struct realm *realm = &kvm->arch.realm;
> >> +	int r = 0;
> >> +
> >> +	if (kvm_realm_state(kvm) != REALM_STATE_NONE)
> >> +		return -EBUSY;
> >> +
> >> +	if (copy_from_user(&cfg, (void __user *)cap->args[1], sizeof(cfg)))
> >> +		return -EFAULT;
> >> +
> >> +	switch (cfg.cfg) {
> >> +	case KVM_CAP_ARM_RME_CFG_RPV:
> >> +		memcpy(&realm->params->rpv, &cfg.rpv, sizeof(cfg.rpv));
> >> +		break;
> >> +	case KVM_CAP_ARM_RME_CFG_HASH_ALGO:
> >> +		r = config_realm_hash_algo(realm, &cfg);
> >> +		break;
> >> +	case KVM_CAP_ARM_RME_CFG_SVE:
> >> +		r = config_realm_sve(realm, &cfg);
> >> +		break;
> >> +	default:
> >> +		r = -EINVAL;
> >> +	}
> >> +
> >> +	return r;
> >> +}
> >> +
> >> +int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
> >> +{
> >> +	int r = 0;
> >> +
> >> +	switch (cap->args[0]) {
> >> +	case KVM_CAP_ARM_RME_CONFIG_REALM:
> >> +		r = kvm_rme_config_realm(kvm, cap);
> >> +		break;
> >> +	case KVM_CAP_ARM_RME_CREATE_RD:
> >> +		if (kvm->created_vcpus) {
> >> +			r = -EBUSY;
> >> +			break;
> >> +		}
> >> +
> >> +		r = kvm_create_realm(kvm);
> >> +		break;
> >> +	default:
> >> +		r = -EINVAL;
> >> +		break;
> >> +	}
> >> +
> >> +	return r;
> >> +}
> >> +
> >> +void kvm_destroy_realm(struct kvm *kvm)
> >> +{
> >> +	struct realm *realm = &kvm->arch.realm;
> >> +	struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
> >> +	unsigned int pgd_sz;
> >> +	int i;
> >> +
> >> +	if (realm->params) {
> >> +		free_page((unsigned long)realm->params);
> >> +		realm->params = NULL;
> >> +	}
> >> +
> >> +	if (kvm_realm_state(kvm) == REALM_STATE_NONE)
> >> +		return;
> >> +
> >> +	WRITE_ONCE(realm->state, REALM_STATE_DYING);
> >> +
> >> +	rme_vmid_release(realm->vmid);
> >> +
> >> +	if (realm->rd) {
> >> +		phys_addr_t rd_phys = virt_to_phys(realm->rd);
> >> +
> >> +		if (WARN_ON(rmi_realm_destroy(rd_phys)))
> >> +			return;
> >> +		if (WARN_ON(rmi_granule_undelegate(rd_phys)))
> >> +			return;
> >> +		free_page((unsigned long)realm->rd);
> >> +		realm->rd = NULL;
> >> +	}
> >> +
> >> +	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
> >> +	for (i = 0; i < pgd_sz; i++) {
> >> +		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
> >> +
> >> +		if (WARN_ON(rmi_granule_undelegate(pgd_phys)))
> >> +			return;
> >> +	}
> >> +
> >> +	kvm_free_stage2_pgd(&kvm->arch.mmu);
> >> +}
> >> +
> >> +int kvm_init_realm_vm(struct kvm *kvm)
> >> +{
> >> +	struct realm_params *params;
> >> +
> >> +	params = (struct realm_params *)get_zeroed_page(GFP_KERNEL);
> >> +	if (!params)
> >> +		return -ENOMEM;
> >> +
> >> +	params->features_0 = create_realm_feat_reg0(kvm);
> >> +	kvm->arch.realm.params = params;
> >> +	return 0;
> >> +}
> >> +
> >>  int kvm_init_rme(void)
> >>  {
> >> +	int ret;
> >> +
> >>  	if (PAGE_SIZE != SZ_4K)
> >>  		/* Only 4k page size on the host is supported */
> >>  		return 0;
> >> @@ -43,6 +394,12 @@ int kvm_init_rme(void)
> >>  		/* Continue without realm support */
> >>  		return 0;
> >>  
> >> +	ret = rme_vmid_init();
> >> +	if (ret)
> >> +		return ret;
> >> +
> >> +	WARN_ON(rmi_features(0, &rmm_feat_reg0));
> >> +
> >>  	/* Future patch will enable static branch kvm_rme_is_available */
> >>  
> >>  	return 0;
> > 
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 08/28] arm64: RME: Keep a spare page delegated to the RMM
  2023-03-01 11:55       ` Steven Price
@ 2023-03-01 20:50         ` Zhi Wang
  0 siblings, 0 replies; 190+ messages in thread
From: Zhi Wang @ 2023-03-01 20:50 UTC (permalink / raw)
  To: Steven Price
  Cc: Zhi Wang, kvm, kvmarm, Catalin Marinas, Marc Zyngier,
	Will Deacon, James Morse, Oliver Upton, Suzuki K Poulose,
	Zenghui Yu, linux-arm-kernel, linux-kernel, Joey Gouly,
	Alexandru Elisei, Christoffer Dall, Fuad Tabba, linux-coco

On Wed, 1 Mar 2023 11:55:37 +0000
Steven Price <steven.price@arm.com> wrote:

> On 13/02/2023 16:47, Zhi Wang wrote:
> > On Fri, 27 Jan 2023 11:29:12 +0000
> > Steven Price <steven.price@arm.com> wrote:
> > 
> >> Pages can only be populated/destroyed on the RMM at the 4KB granule;
> >> this requires creating the full depth of RTTs. However if the pages are
> >> going to be combined into a 4MB huge page the last RTT is only
> >> temporarily needed. Similarly when freeing memory the huge page must be
> >> temporarily split, requiring temporary usage of the full depth of RTTs.
> >>
> >> To avoid needing to perform a temporary allocation and delegation of a
> >> page for this purpose we keep a spare delegated page around. In
> >> particular this avoids the need for memory allocation while destroying
> >> the realm guest.
> >>
> >> Signed-off-by: Steven Price <steven.price@arm.com>
> >> ---
> >>  arch/arm64/include/asm/kvm_rme.h | 3 +++
> >>  arch/arm64/kvm/rme.c             | 6 ++++++
> >>  2 files changed, 9 insertions(+)
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> >> index 055a22accc08..a6318af3ed11 100644
> >> --- a/arch/arm64/include/asm/kvm_rme.h
> >> +++ b/arch/arm64/include/asm/kvm_rme.h
> >> @@ -21,6 +21,9 @@ struct realm {
> >>  	void *rd;
> >>  	struct realm_params *params;
> >>  
> >> +	/* A spare already delegated page */
> >> +	phys_addr_t spare_page;
> >> +
> >>  	unsigned long num_aux;
> >>  	unsigned int vmid;
> >>  	unsigned int ia_bits;
> >> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> >> index 9f8c5a91b8fc..0c9d70e4d9e6 100644
> >> --- a/arch/arm64/kvm/rme.c
> >> +++ b/arch/arm64/kvm/rme.c
> >> @@ -148,6 +148,7 @@ static int realm_create_rd(struct kvm *kvm)
> >>  	}
> >>  
> >>  	realm->rd = rd;
> >> +	realm->spare_page = PHYS_ADDR_MAX;
> >>  	realm->ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
> >>  
> >>  	if (WARN_ON(rmi_rec_aux_count(rd_phys, &realm->num_aux))) {
> >> @@ -357,6 +358,11 @@ void kvm_destroy_realm(struct kvm *kvm)
> >>  		free_page((unsigned long)realm->rd);
> >>  		realm->rd = NULL;
> >>  	}
> >> +	if (realm->spare_page != PHYS_ADDR_MAX) {
> >> +		if (!WARN_ON(rmi_granule_undelegate(realm->spare_page)))
> >> +			free_page((unsigned long)phys_to_virt(realm->spare_page));
> > 
> > Will the page be leaked (not usable for host and realms) if the undelegate
> > failed? If yes, better at least put a comment.
> 
> Yes - I'll add a comment.
> 
> In general being unable to undelegate a page points to a programming
> error in the host. The only reason the RMM should refuse the request is
> if the page is in use by a Realm which the host has configured. So the
> WARN() is correct (there's a kernel bug) and the only sensible course of
> action is to leak the page and limp on.
>

It would be nice to add a summary of the above in the patch description.

Having a comment wherever a page is leaked (which in practice means the page
can no longer be reclaimed by the VMM or used for a Realm) is nice. TDX/SNP
have the same problem of leaking pages for obscure reasons.

Leakage like this can accumulate bit by bit on a long-running server, and
KVM will definitely need a generic accounting interface for reporting the
numbers to userspace later. Having an explicit comment at this point really
makes that easier.
 
> Thanks,
> 
> Steve
> 
> >> +		realm->spare_page = PHYS_ADDR_MAX;
> >> +	}
> >>  
> >>  	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
> >>  	for (i = 0; i < pgd_sz; i++) {
> > 
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-03-01  9:20     ` Jean-Philippe Brucker
@ 2023-03-01 22:12       ` Itaru Kitayama
  2023-03-02  9:18         ` Jean-Philippe Brucker
  2023-03-03  9:46         ` Jean-Philippe Brucker
  0 siblings, 2 replies; 190+ messages in thread
From: Itaru Kitayama @ 2023-03-01 22:12 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Suzuki K Poulose, linux-coco, linux-kernel, kvm, kvmarm,
	linux-arm-kernel, Alexandru Elisei, Andrew Jones,
	Catalin Marinas, Chao Peng, Christoffer Dall, Fuad Tabba,
	James Morse, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Sean Christopherson,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

On Wed, Mar 1, 2023 at 6:20 PM Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> Hi Itaru,
>
> On Wed, Mar 01, 2023 at 08:35:05AM +0900, Itaru Kitayama wrote:
> > Hi Jean,
> > I've tried your series in Real on CCA Host, but the KVM arch init
> > emits an Invalid argument error and terminates.
>
> Do you know which call returns this error?  Normally the RMEGuest support
> should print more detailed errors. Are you able to launch normal guests
> (without the rme-guest object and confidential-guest-support machine
> parameter)?  Could you give the complete QEMU command-line?

No, I can't say which. Yes, the CCA-capable QEMU boots if I don't set
RME-related options.

Here's mine (based upon your command-line):
qemu-system-aarch64 -cpu host -accel kvm -machine
virt,gic-version=3,confidential-guest-support=rme0 -smp 2 -m 256M
-nographic -object rme-guest,id=rme0,measurement-algo=sha512 -kernel
Image -initrd rootfs.ext2 -append 'console=ttyAMA0 earlycon'
-overcommit mem-lock=on

Itaru.
>
> > I configure it with the aarch64-softmmu target only and built, any
> > other steps I should worry?
>
> No, that should be enough
>
> Thanks,
> Jean

^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-03-01 22:12       ` Itaru Kitayama
@ 2023-03-02  9:18         ` Jean-Philippe Brucker
  2023-03-03  9:46         ` Jean-Philippe Brucker
  1 sibling, 0 replies; 190+ messages in thread
From: Jean-Philippe Brucker @ 2023-03-02  9:18 UTC (permalink / raw)
  To: Itaru Kitayama
  Cc: Suzuki K Poulose, linux-coco, linux-kernel, kvm, kvmarm,
	linux-arm-kernel, Alexandru Elisei, Andrew Jones,
	Catalin Marinas, Chao Peng, Christoffer Dall, Fuad Tabba,
	James Morse, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Sean Christopherson,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

On Thu, Mar 02, 2023 at 07:12:24AM +0900, Itaru Kitayama wrote:
> On Wed, Mar 1, 2023 at 6:20 PM Jean-Philippe Brucker
> <jean-philippe@linaro.org> wrote:
> >
> > Hi Itaru,
> >
> > On Wed, Mar 01, 2023 at 08:35:05AM +0900, Itaru Kitayama wrote:
> > > Hi Jean,
> > > I've tried your series in Real on CCA Host, but the KVM arch init
> > > emits an Invalid argument error and terminates.
> >
> > Do you know which call returns this error?  Normally the RMEGuest support
> > should print more detailed errors. Are you able to launch normal guests
> > (without the rme-guest object and confidential-guest-support machine
> > parameter)?  Could you give the complete QEMU command-line?
> 
> No, I can't say which. Yes, the CCA-capable QEMU boots if I don't set
> RME-related options.
> 
> Here's mine (based upon your command-line):
> qemu-system-aarch64 -cpu host -accel kvm -machine
> virt,gic-version=3,confidential-guest-support=rme0 -smp 2 -m 256M
> -nographic -object rme-guest,id=rme0,measurement-algo=sha512 -kernel
> Image -initrd rootfs.ext2 -append 'console=ttyAMA0 earlycon'
> -overcommit mem-lock=on

Thank you, this works on my setup so I'm not sure what's wrong. Check that
KVM initialized successfully, with this in the host kernel log:
"[    0.267019] kvm [1]: Using prototype RMM support (version 56.0)"

Next step would be to find out where the EINVAL comes from, with printfs
or GDB. This seems rather specific so I'll email you directly to avoid
filling up everyone's inbox.

Thanks,
Jean

^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC kvmtool 18/31] arm64: Populate initial realm contents
  2023-01-27 11:39   ` [RFC kvmtool 18/31] arm64: Populate initial realm contents Suzuki K Poulose
@ 2023-03-02 14:03     ` Piotr Sawicki
  2023-03-02 14:06       ` Suzuki K Poulose
  0 siblings, 1 reply; 190+ messages in thread
From: Piotr Sawicki @ 2023-03-02 14:03 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Alexandru Elisei, Andrew Jones, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, linux-coco, kvmarm,
	linux-arm-kernel, linux-kernel

Hi,

> From: Alexandru Elisei <alexandru.elisei@arm.com>
> 
> Populate the realm memory with the initial contents, which include
> the device tree blob, the kernel image, and initrd, if specified,
> or the firmware image.
> 
> Populating an image in the realm involves two steps:
>   a) Mark the IPA area as RAM - INIT_IPA_REALM
>   b) Load the contents into the IPA - POPULATE_REALM
> 
> Wherever we know the actual size of an image in memory, we make
> sure the "memory area" is initialised to RAM.
> e.g., Linux kernel image size from the header which includes the bss etc.
> The "file size" on disk for the Linux image is much smaller.
> We mark the region of size Image.header.size as RAM (a), from the kernel
> load address. And load the Image file into the memory (b) above.
> At the moment we only detect the Arm64 Linux Image header format.
> 
> Since we're already touching the code that copies the
> initrd in guest memory, let's do a bit of cleaning and remove a
> useless local variable.
> 
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> [ Make sure the Linux kernel image area is marked as RAM ]
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>   arm/aarch32/include/asm/realm.h |   3 +
>   arm/aarch64/include/asm/realm.h |   3 +
>   arm/aarch64/realm.c             | 112 ++++++++++++++++++++++++++++++++
>   arm/fdt.c                       |   6 ++
>   arm/kvm.c                       |  20 ++++--
>   include/linux/kernel.h          |   1 +
>   6 files changed, 140 insertions(+), 5 deletions(-)
> 
> diff --git a/arm/aarch32/include/asm/realm.h b/arm/aarch32/include/asm/realm.h
> index 5aca6cca..fcff0e55 100644
> --- a/arm/aarch32/include/asm/realm.h
> +++ b/arm/aarch32/include/asm/realm.h
> @@ -6,5 +6,8 @@
>   #include "kvm/kvm.h"
>   
>   static inline void kvm_arm_realm_create_realm_descriptor(struct kvm *kvm) {}
> +static inline void kvm_arm_realm_populate_kernel(struct kvm *kvm) {}
> +static inline void kvm_arm_realm_populate_initrd(struct kvm *kvm) {}
> +static inline void kvm_arm_realm_populate_dtb(struct kvm *kvm) {}
>   
>   #endif /* ! __ASM_REALM_H */
> diff --git a/arm/aarch64/include/asm/realm.h b/arm/aarch64/include/asm/realm.h
> index e176f15f..6e760ac9 100644
> --- a/arm/aarch64/include/asm/realm.h
> +++ b/arm/aarch64/include/asm/realm.h
> @@ -6,5 +6,8 @@
>   #include "kvm/kvm.h"
>   
>   void kvm_arm_realm_create_realm_descriptor(struct kvm *kvm);
> +void kvm_arm_realm_populate_kernel(struct kvm *kvm);
> +void kvm_arm_realm_populate_initrd(struct kvm *kvm);
> +void kvm_arm_realm_populate_dtb(struct kvm *kvm);
>   
>   #endif /* ! __ASM_REALM_H */
> diff --git a/arm/aarch64/realm.c b/arm/aarch64/realm.c
> index fc7f8d6a..eddccece 100644
> --- a/arm/aarch64/realm.c
> +++ b/arm/aarch64/realm.c
> @@ -1,5 +1,7 @@
>   #include "kvm/kvm.h"
>   
> +#include <linux/byteorder.h>
> +#include <asm/image.h>
>   #include <asm/realm.h>
>   
>   
> @@ -80,3 +82,113 @@ void kvm_arm_realm_create_realm_descriptor(struct kvm *kvm)
>   	if (ioctl(kvm->vm_fd, KVM_ENABLE_CAP, &rme_create_rd) < 0)
>   		die_perror("KVM_CAP_RME(KVM_CAP_ARM_RME_CREATE_RD)");
>   }
> +
> +static void realm_init_ipa_range(struct kvm *kvm, u64 start, u64 size)
> +{
> +	struct kvm_cap_arm_rme_init_ipa_args init_ipa_args = {
> +		.init_ipa_base = start,
> +		.init_ipa_size = size
> +	};
> +	struct kvm_enable_cap rme_init_ipa_realm = {
> +		.cap = KVM_CAP_ARM_RME,
> +		.args[0] = KVM_CAP_ARM_RME_INIT_IPA_REALM,
> +		.args[1] = (u64)&init_ipa_args
> +	};
> +
> +	if (ioctl(kvm->vm_fd, KVM_ENABLE_CAP, &rme_init_ipa_realm) < 0)
> +		die("unable to initialise IPA range for Realm %llx - %llx (size %llu)",
> +		    start, start + size, size);
> +
> +}
> +
> +static void __realm_populate(struct kvm *kvm, u64 start, u64 size)
> +{
> +	struct kvm_cap_arm_rme_populate_realm_args populate_args = {
> +		.populate_ipa_base = start,
> +		.populate_ipa_size = size
> +	};
> +	struct kvm_enable_cap rme_populate_realm = {
> +		.cap = KVM_CAP_ARM_RME,
> +		.args[0] = KVM_CAP_ARM_RME_POPULATE_REALM,
> +		.args[1] = (u64)&populate_args
> +	};
> +
> +	if (ioctl(kvm->vm_fd, KVM_ENABLE_CAP, &rme_populate_realm) < 0)
> +		die("unable to populate Realm memory %llx - %llx (size %llu)",
> +		    start, start + size, size);
> +}
> +
> +static void realm_populate(struct kvm *kvm, u64 start, u64 size)
> +{
> +	realm_init_ipa_range(kvm, start, size);
> +	__realm_populate(kvm, start, size);
> +}
> +
> +static bool is_arm64_linux_kernel_image(void *header)
> +{
> +	struct arm64_image_header *hdr = header;
> +
> +	return memcmp(&hdr->magic, ARM64_IMAGE_MAGIC, sizeof(hdr->magic)) == 0;
> +}
> +
> +static ssize_t arm64_linux_kernel_image_size(void *header)
> +{
> +	struct arm64_image_header *hdr = header;
> +
> +	if (is_arm64_linux_kernel_image(header))
> +		return le64_to_cpu(hdr->image_size);
> +	die("Not arm64 Linux kernel Image");
> +}
> +
> +void kvm_arm_realm_populate_kernel(struct kvm *kvm)
> +{
> +	u64 start, end, mem_size;
> +	void *header = guest_flat_to_host(kvm, kvm->arch.kern_guest_start);
> +
> +	start = ALIGN_DOWN(kvm->arch.kern_guest_start, SZ_4K);
> +	end = ALIGN(kvm->arch.kern_guest_start + kvm->arch.kern_size, SZ_4K);
> +
> +	if (is_arm64_linux_kernel_image(header))
> +		mem_size = arm64_linux_kernel_image_size(header);
> +	else
> +		mem_size = end - start;
> +
> +	realm_init_ipa_range(kvm, start, mem_size);
> +	__realm_populate(kvm, start, end - start);
> +}
> +
> +void kvm_arm_realm_populate_initrd(struct kvm *kvm)
> +{
> +	u64 kernel_end, start, end;
> +
> +	kernel_end = ALIGN(kvm->arch.kern_guest_start + kvm->arch.kern_size, SZ_4K);
> +	start = ALIGN_DOWN(kvm->arch.initrd_guest_start, SZ_4K);
> +	/*
> +	 * Because we align the initrd to 4 bytes, it is theoretically possible
> +	 * for the start of the initrd to overlap with the same page where the
> +	 * kernel ends.
> +	 */
> +	if (start < kernel_end)
> +		start = kernel_end;
> +	end = ALIGN(kvm->arch.initrd_guest_start + kvm->arch.initrd_size, SZ_4K);
> +	if (end > start)
> +		realm_populate(kvm, start, end - start);
> +}
> +
> +void kvm_arm_realm_populate_dtb(struct kvm *kvm)
> +{
> +	u64 initrd_end, start, end;
> +
> +	initrd_end = ALIGN(kvm->arch.initrd_guest_start + kvm->arch.initrd_size, SZ_4K);
> +	start = ALIGN_DOWN(kvm->arch.dtb_guest_start, SZ_4K);
> +	/*
> +	 * Same situation as with the initrd, but now it is the DTB which is
> +	 * overlapping with the last page of the initrd, because the initrd is
> +	 * populated first.
> +	 */
> +	if (start < initrd_end)
> +		start = initrd_end;
> +	end = ALIGN(kvm->arch.dtb_guest_start + FDT_MAX_SIZE, SZ_4K);
> +	if (end > start)
> +		realm_populate(kvm, start, end - start);
> +}
> diff --git a/arm/fdt.c b/arm/fdt.c
> index 286ccadf..762a604d 100644
> --- a/arm/fdt.c
> +++ b/arm/fdt.c
> @@ -7,6 +7,8 @@
>   #include "arm-common/gic.h"
>   #include "arm-common/pci.h"
>   
> +#include <asm/realm.h>
> +
>   #include <stdbool.h>
>   
>   #include <linux/byteorder.h>
> @@ -231,6 +233,10 @@ static int setup_fdt(struct kvm *kvm)
>   
>   	if (kvm->cfg.arch.dump_dtb_filename)
>   		dump_fdt(kvm->cfg.arch.dump_dtb_filename, fdt_dest);
> +
> +	if (kvm->cfg.arch.is_realm)
> +		kvm_arm_realm_populate_dtb(kvm);
> +
>   	return 0;
>   }
>   late_init(setup_fdt);
> diff --git a/arm/kvm.c b/arm/kvm.c
> index acb627b2..57c5b5f7 100644
> --- a/arm/kvm.c
> +++ b/arm/kvm.c
> @@ -6,6 +6,7 @@
>   #include "kvm/fdt.h"
>   
>   #include "arm-common/gic.h"
> +#include <asm/realm.h>
>   
>   #include <sys/resource.h>
>   
> @@ -167,6 +168,9 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
>   	pr_debug("Loaded kernel to 0x%llx (%llu bytes)",
>   		 kvm->arch.kern_guest_start, kvm->arch.kern_size);


I've noticed that running the measurement test from the
kvm-unit-tests suite multiple times results in different Realm Initial 
Measurements (RIMs), although the kernel image is always the same.

After a short investigation, I've found that the RIM starts to differ 
while populating the last 4kB chunk of the kernel image.
The issue occurs when the image size is not aligned to the page size (4kB).

After zeroing the unused area of the last chunk, the measurements become 
repeatable.

> +	if (kvm->cfg.arch.is_realm)
> +		kvm_arm_realm_populate_kernel(kvm);
> +
>   	/*
>   	 * Now load backwards from the end of memory so the kernel
>   	 * decompressor has plenty of space to work with. First up is
> @@ -188,7 +192,6 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
>   	/* ... and finally the initrd, if we have one. */
>   	if (fd_initrd != -1) {
>   		struct stat sb;
> -		unsigned long initrd_start;
>   
>   		if (fstat(fd_initrd, &sb))
>   			die_perror("fstat");
> @@ -199,7 +202,6 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
>   		if (pos < kernel_end)
>   			die("initrd overlaps with kernel image.");
>   
> -		initrd_start = guest_addr;
>   		file_size = read_file(fd_initrd, pos, limit - pos);
>   		if (file_size == -1) {
>   			if (errno == ENOMEM)
> @@ -208,11 +210,13 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
>   			die_perror("initrd read");
>   		}
>   
> -		kvm->arch.initrd_guest_start = initrd_start;
> +		kvm->arch.initrd_guest_start = guest_addr;
>   		kvm->arch.initrd_size = file_size;
>   		pr_debug("Loaded initrd to 0x%llx (%llu bytes)",
> -			 kvm->arch.initrd_guest_start,
> -			 kvm->arch.initrd_size);
> +			 kvm->arch.initrd_guest_start, kvm->arch.initrd_size);
> +
> +		if (kvm->cfg.arch.is_realm)
> +			kvm_arm_realm_populate_initrd(kvm);
>   	} else {
>   		kvm->arch.initrd_size = 0;
>   	}
> @@ -269,6 +273,8 @@ bool kvm__load_firmware(struct kvm *kvm, const char *firmware_filename)
>   
>   	/* Kernel isn't loaded by kvm, point start address to firmware */
>   	kvm->arch.kern_guest_start = fw_addr;
> +	kvm->arch.kern_size = fw_sz;
> +
>   	pr_debug("Loaded firmware to 0x%llx (%zd bytes)",
>   		 kvm->arch.kern_guest_start, fw_sz);
>   
> @@ -283,6 +289,10 @@ bool kvm__load_firmware(struct kvm *kvm, const char *firmware_filename)
>   		 kvm->arch.dtb_guest_start,
>   		 kvm->arch.dtb_guest_start + FDT_MAX_SIZE);
>   
> +	if (kvm->cfg.arch.is_realm)
> +		/* We hijack the kernel fields to describe the firmware. */
> +		kvm_arm_realm_populate_kernel(kvm);
> +
>   	return true;
>   }
>   
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index 6c22f1c0..25f19c20 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -9,6 +9,7 @@
>   
>   #define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
>   
> +#define ALIGN_DOWN(x,a)		__ALIGN_MASK(x - (typeof(x))((a) - 1),(typeof(x))(a)-1)
>   #define ALIGN(x,a)		__ALIGN_MASK(x,(typeof(x))(a)-1)
>   #define __ALIGN_MASK(x,mask)	(((x)+(mask))&~(mask))
>   #define IS_ALIGNED(x, a)	(((x) & ((typeof(x))(a) - 1)) == 0)

Kind regards,
Piotr Sawicki

^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC kvmtool 18/31] arm64: Populate initial realm contents
  2023-03-02 14:03     ` Piotr Sawicki
@ 2023-03-02 14:06       ` Suzuki K Poulose
  2023-10-02  9:28         ` Piotr Sawicki
  0 siblings, 1 reply; 190+ messages in thread
From: Suzuki K Poulose @ 2023-03-02 14:06 UTC (permalink / raw)
  To: Piotr Sawicki
  Cc: Alexandru Elisei, Andrew Jones, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, linux-coco, kvmarm,
	linux-arm-kernel, linux-kernel

Hi Piotr

On 02/03/2023 14:03, Piotr Sawicki wrote:
> Hi,
> 
>> From: Alexandru Elisei <alexandru.elisei@arm.com>
>>
>> Populate the realm memory with the initial contents, which include
>> the device tree blob, the kernel image, and initrd, if specified,
>> or the firmware image.
>>
>> Populating an image in the realm involves two steps:
>>   a) Mark the IPA area as RAM - INIT_IPA_REALM
>>   b) Load the contents into the IPA - POPULATE_REALM
>>
>> Wherever we know the actual size of an image in memory, we make
>> sure the "memory area" is initialised to RAM.
>> e.g., Linux kernel image size from the header which includes the bss etc.
>> The "file size" on disk for the Linux image is much smaller.
>> We mark the region of size Image.header.size as RAM (a), from the kernel
>> load address. And load the Image file into the memory (b) above.
>> At the moment we only detect the Arm64 Linux Image header format.
>>
>> Since we're already touching the code that copies the
>> initrd in guest memory, let's do a bit of cleaning and remove a
>> useless local variable.
>>
>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
>> [ Make sure the Linux kernel image area is marked as RAM ]
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>


>> diff --git a/arm/kvm.c b/arm/kvm.c
>> index acb627b2..57c5b5f7 100644
>> --- a/arm/kvm.c
>> +++ b/arm/kvm.c
>> @@ -6,6 +6,7 @@
>>   #include "kvm/fdt.h"
>>   #include "arm-common/gic.h"
>> +#include <asm/realm.h>
>>   #include <sys/resource.h>
>> @@ -167,6 +168,9 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, 
>> int fd_kernel, int fd_initrd,
>>       pr_debug("Loaded kernel to 0x%llx (%llu bytes)",
>>            kvm->arch.kern_guest_start, kvm->arch.kern_size);
> 
> 
> I've noticed that multiple calling of the measurement test from the 
> kvm-unit-tests suite results in different Realm Initial Measurements, 
> although the kernel image is always the same.
> 
> After short investigation, I've found that the RIM starts being 
> different while populating the last 4kB chunk of the kernel image.
> The issue occurs when the image size is not aligned to the page size (4kB).
> 
> After zeroing the unused area of the last chunk, the measurements become 
> repeatable.
> 

That is a good point. We could memset() the remaining bytes of the 4K 
page to 0. I will make this change.

Suzuki


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-03-01  9:58   ` Suzuki K Poulose
@ 2023-03-02 16:46     ` Dr. David Alan Gilbert
  2023-03-02 19:02       ` Suzuki K Poulose
  0 siblings, 1 reply; 190+ messages in thread
From: Dr. David Alan Gilbert @ 2023-03-02 16:46 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-coco, linux-kernel, kvm, kvmarm, linux-arm-kernel,
	Alexandru Elisei, Andrew Jones, Catalin Marinas, Chao Peng,
	Christoffer Dall, Fuad Tabba, James Morse, Jean-Philippe Brucker,
	Joey Gouly, Marc Zyngier, Mark Rutland, Oliver Upton,
	Paolo Bonzini, Quentin Perret, Sean Christopherson, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

* Suzuki K Poulose (suzuki.poulose@arm.com) wrote:
> Hi Dave
> 
> Thanks for your response, and apologies for the delay. Response, in line.
> 
> On 14/02/2023 17:13, Dr. David Alan Gilbert wrote:
> > * Suzuki K Poulose (suzuki.poulose@arm.com) wrote:
> > > We are happy to announce the early RFC version of the Arm
> > > Confidential Compute Architecture (CCA) support for the Linux
> > > stack. The intention is to seek early feedback in the following areas:
> > >   * KVM integration of the Arm CCA
> > >   * KVM UABI for managing the Realms, seeking to generalise the operations
> > >     wherever possible with other Confidential Compute solutions.
> > >     Note: This version doesn't support Guest Private memory, which will be added
> > >     later (see below).
> > >   * Linux Guest support for Realms
> > > 
> > > Arm CCA Introduction
> > > =====================
> > > 
> > > The Arm CCA is a reference software architecture and implementation that builds
> > > on the Realm Management Extension (RME), enabling the execution of Virtual
> > > machines, while preventing access by more privileged software, such as the hypervisor.
> > > The Arm CCA allows the hypervisor to control the VM, but removes its right of
> > > access to the code, register state or data that is used by the VM.
> > > More information on the architecture is available here[0].
> > > 
> > >      Arm CCA Reference Software Architecture
> > > 
> > >          Realm World    ||    Normal World   ||  Secure World  ||
> > >                         ||        |          ||                ||
> > >   EL0 x-------x         || x----x | x------x ||                ||
> > >       | Realm |         || |    | | |      | ||                ||
> > >       |       |         || | VM | | |      | ||                ||
> > >   ----|  VM*  |---------||-|    |---|      |-||----------------||
> > >       |       |         || |    | | |  H   | ||                ||
> > >   EL1 x-------x         || x----x | |      | ||                ||
> > >           ^             ||        | |  o   | ||                ||
> > >           |             ||        | |      | ||                ||
> > >   ------- R*------------------------|  s  -|---------------------
> > >           S             ||          |      | ||                ||
> > >           I             ||          |  t   | ||                ||
> > >           |             ||          |      | ||                ||
> > >           v             ||          x------x ||                ||
> > >   EL2    RMM*           ||              ^    ||                ||
> > >           ^             ||              |    ||                ||
> > >   ========|=============================|========================
> > >           |                             | SMC
> > >           x--------- *RMI* -------------x
> > > 
> > >   EL3                   Root World
> > >                         EL3 Firmware
> > >   ===============================================================
> > > Where :
> > >   RMM - Realm Management Monitor
> > >   RMI - Realm Management Interface
> > >   RSI - Realm Service Interface
> > >   SMC - Secure Monitor Call
> > 
> > Hi,
> >    It's nice to see this full stack posted - thanks!
> > 
> > Are there any pointers to information on attestation and similar
> > measurement things?  In particular, are there any plans for a vTPM
> 
> The RMM v1.0 provides attestation and measurement services to the Realm,
> via Realm Service Interface (RSI) calls.

Can you point me at some docs for that?

> However, there is no support
> for partitioning the Realm VM with v1.0. This is currently under
> development and should be available in the near future.
> 
> With that in place, a vTPM could reside in a partition of the Realm VM along
> side the OS in another. Does that answer your question ?

Possibly; it would be great to be able to use a standard vTPM interface
here rather than have to do anything special.  People already have this
working on AMD SEV-SNP.

Dave

> Kind regards
> Suzuki
> 
> 
> > for Realms - if there were, it would make life easy for us, since we
> > can share some user space stuff with other CoCo systems.
> > 
> > Dave
> > 
> > > RME introduces a new security state "Realm world", in addition to the
> > > traditional Secure and Non-Secure states. The Arm CCA defines a new component,
> > > Realm Management Monitor (RMM) that runs at R-EL2. This is a standard piece of
> > > firmware, verified, installed and loaded by the EL3 firmware (e.g., TF-A) at
> > > system boot.
> > > 
> > > The RMM provides standard interfaces - Realm Management Interface (RMI) - to the
> > > Normal world hypervisor to manage the VMs running in the Realm world (also called
> > > Realms in short). These are exposed via SMC and are routed through the EL3
> > > firmware.
> > > The RMI interface includes:
> > >    - Move a physical page from the Normal world to the Realm world
> > >    - Create a Realm with requested parameters, tracked via a Realm Descriptor (RD)
> > >    - Create VCPUs, aka Realm Execution Contexts (RECs), with initial register state
> > >    - Create stage2 translation tables at any level
> > >    - Load initial images into Realm memory from Normal world memory
> > >    - Schedule RECs (vCPUs) and handle exits
> > >    - Inject virtual interrupts into the Realm
> > >    - Service stage2 runtime faults with pages (provided by the host, scrubbed by the RMM)
> > >    - Create "shared" mappings that can be accessed by the VMM/Hyp
> > >    - Reclaim the memory allocated for RAM and RTTs (Realm Translation Tables)
> > > 
> > > However, v1.0 of the RMM specification doesn't support:
> > >   - Paging protected memory of a Realm VM. Thus the pages backing the protected
> > >     memory region must be pinned.
> > >   - Live migration of Realms.
> > >   - Trusted Device assignment.
> > >   - Virtual interrupts backed by physical interrupts for Realms.
> > > 
> > > RMM also provides certain services to the Realms via SMC, called Realm Service
> > > Interface (RSI). These include:
> > >   - Realm Guest Configuration.
> > >   - Attestation & Measurement services
> > >   - Managing the state of an Intermediate Physical Address (IPA aka GPA) page.
> > >   - Host Call service (Communication with the Normal world Hypervisor)
> > > 
> > > The specification for the RMM software is currently at *v1.0-Beta2* and the
> > > latest version is available here [1].
> > > 
> > > The Trusted Firmware foundation has an implementation of the RMM - TF-RMM -
> > > available here [3].
> > > 
> > > Implementation
> > > =================
> > > 
> > > This version of the stack is based on the RMM specification v1.0-Beta0[2], with
> > > the following exceptions:
> > >    - TF-RMM/KVM currently doesn't support the optional features of PMU,
> > >       SVE and Self-hosted debug (coming soon).
> > >    - The RSI_HOST_CALL structure alignment requirement is reduced to match
> > >       RMM v1.0 Beta1
> > >    - RMI/RSI version numbers do not match the RMM spec. This will be
> > >      resolved once the spec/implementation is complete, across TF-RMM+Linux stack.
> > > 
> > > We plan to update the stack to support the latest version of the RMMv1.0 spec
> > > in the coming revisions.
> > > 
> > > This release includes the following components :
> > > 
> > >   a) Linux Kernel
> > >       i) Host / KVM support - Support for driving the Realms via RMI. This
> > >       depends on the kernel running at EL2 (aka VHE mode). Also provides a
> > >       UABI for VMMs to manage the Realm VMs. The support is restricted to the
> > >       4K page size, matching the Stage2 granule supported by the RMM. The VMM
> > >       is responsible for making sure the guest memory is locked.
> > > 
> > >         TODO: Guest Private memory[10] integration - We have been following the
> > >         series and support will be added once it is merged upstream.
> > >       ii) Guest support - Support for a Linux Kernel to run in the Realm VM at
> > >       Realm-EL1, using RSI services. This includes virtio support (virtio-v1.0
> > >       only). All I/O is treated as non-secure/shared.
> > >   b) kvmtool - VMM changes required to manage Realm VMs. No guest private memory
> > >      as mentioned above.
> > >   c) kvm-unit-tests - Support for running in Realms along with additional tests
> > >      for RSI ABI.
> > > 
> > > Running the stack
> > > ====================
> > > 
> > > To run/test the stack, you would need the following components :
> > > 
> > > 1) FVP Base AEM RevC model with FEAT_RME support [4]
> > > 2) TF-A firmware for EL3 [5]
> > > 3) TF-A RMM for R-EL2 [3]
> > > 4) Linux Kernel [6]
> > > 5) kvmtool [7]
> > > 6) kvm-unit-tests [8]
> > > 
> > > Instructions for building the firmware components and running the model are
> > > available here [9]. Once the host kernel is booted, a Realm can be launched by
> > > invoking the `lkvm` command as follows:
> > > 
> > >   $ lkvm run --realm 				 \
> > > 	 --measurement-algo=["sha256", "sha512"] \
> > > 	 --disable-sve				 \
> > > 	 <normal-vm-options>
> > > 
> > > Where:
> > >   * --measurement-algo (Optional) specifies the algorithm selected for creating the
> > >     initial measurements by the RMM for this Realm (defaults to sha256).
> > >   * GICv3 is mandatory for the Realms.
> > >   * SVE is not yet supported in the TF-RMM, and thus must be disabled using
> > >     --disable-sve
> > > 
> > > You may also run the kvm-unit-tests inside the Realm world, using similar
> > > options to those above.
> > > 
> > > 
> > > Links
> > > ============
> > > 
> > > [0] Arm CCA Landing page (See Key Resources section for various documentations)
> > >      https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture
> > > 
> > > [1] RMM Specification Latest
> > >      https://developer.arm.com/documentation/den0137/latest
> > > 
> > > [2] RMM v1.0-Beta0 specification
> > >      https://developer.arm.com/documentation/den0137/1-0bet0/
> > > 
> > > [3] Trusted Firmware RMM - TF-RMM
> > >      https://www.trustedfirmware.org/projects/tf-rmm/
> > >      GIT: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
> > > 
> > > [4] FVP Base RevC AEM Model (available on x86_64 / Arm64 Linux)
> > >      https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms
> > > 
> > > [5] Trusted Firmware for A class
> > >      https://www.trustedfirmware.org/projects/tf-a/
> > > 
> > > [6] Linux kernel support for Arm-CCA
> > >      https://gitlab.arm.com/linux-arm/linux-cca
> > >      Host Support branch:	cca-host/rfc-v1
> > >      Guest Support branch:	cca-guest/rfc-v1
> > > 
> > > [7] kvmtool support for Arm CCA
> > >      https://gitlab.arm.com/linux-arm/kvmtool-cca cca/rfc-v1
> > > 
> > > [8] kvm-unit-tests support for Arm CCA
> > >      https://gitlab.arm.com/linux-arm/kvm-unit-tests-cca  cca/rfc-v1
> > > 
> > > [9] Instructions for Building Firmware components and running the model, see
> > >      section 4.19.2 "Building and running TF-A with RME"
> > >      https://trustedfirmware-a.readthedocs.io/en/latest/components/realm-management-extension.html#building-and-running-tf-a-with-rme
> > > 
> > > [10] fd based Guest Private memory for KVM
> > >     https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng@linux.intel.com
> > > 
> > > Cc: Alexandru Elisei <alexandru.elisei@arm.com>
> > > Cc: Andrew Jones <andrew.jones@linux.dev>
> > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Chao Peng <chao.p.peng@linux.intel.com>
> > > Cc: Christoffer Dall <christoffer.dall@arm.com>
> > > Cc: Fuad Tabba <tabba@google.com>
> > > Cc: James Morse <james.morse@arm.com>
> > > Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > > Cc: Joey Gouly <Joey.Gouly@arm.com>
> > > Cc: Marc Zyngier <maz@kernel.org>
> > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > Cc: Oliver Upton <oliver.upton@linux.dev>
> > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > Cc: Quentin Perret <qperret@google.com>
> > > Cc: Sean Christopherson <seanjc@google.com>
> > > Cc: Steven Price <steven.price@arm.com>
> > > Cc: Thomas Huth <thuth@redhat.com>
> > > Cc: Will Deacon <will@kernel.org>
> > > Cc: Zenghui Yu <yuzenghui@huawei.com>
> > > To: linux-coco@lists.linux.dev
> > > To: kvmarm@lists.linux.dev
> > > Cc: kvmarm@lists.cs.columbia.edu
> > > Cc: linux-arm-kernel@lists.infradead.org
> > > To: linux-kernel@vger.kernel.org
> > > To: kvm@vger.kernel.org
> > > 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-03-02 16:46     ` Dr. David Alan Gilbert
@ 2023-03-02 19:02       ` Suzuki K Poulose
  0 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-03-02 19:02 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: linux-coco, linux-kernel, kvm, kvmarm, linux-arm-kernel,
	Alexandru Elisei, Andrew Jones, Catalin Marinas, Chao Peng,
	Christoffer Dall, Fuad Tabba, James Morse, Jean-Philippe Brucker,
	Joey Gouly, Marc Zyngier, Mark Rutland, Oliver Upton,
	Paolo Bonzini, Quentin Perret, Sean Christopherson, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, kvmarm, Gareth Stockwell

On 02/03/2023 16:46, Dr. David Alan Gilbert wrote:
> * Suzuki K Poulose (suzuki.poulose@arm.com) wrote:
>> Hi Dave
>>
>> Thanks for your response, and apologies for the delay. Response, in line.
>>
>> On 14/02/2023 17:13, Dr. David Alan Gilbert wrote:
>>> * Suzuki K Poulose (suzuki.poulose@arm.com) wrote:
>>>> We are happy to announce the early RFC version of the Arm
>>>> Confidential Compute Architecture (CCA) support for the Linux
>>>> stack. The intention is to seek early feedback in the following areas:
>>>>    * KVM integration of the Arm CCA
>>>>    * KVM UABI for managing the Realms, seeking to generalise the operations
>>>>      wherever possible with other Confidential Compute solutions.
>>>>      Note: This version doesn't support Guest Private memory, which will be added
>>>>      later (see below).
>>>>    * Linux Guest support for Realms
>>>>
>>>> Arm CCA Introduction
>>>> =====================
>>>>
>>>> The Arm CCA is a reference software architecture and implementation that builds
>>>> on the Realm Management Extension (RME), enabling the execution of Virtual
>>>> machines, while preventing access by more privileged software, such as the
>>>> hypervisor. The Arm CCA allows the hypervisor to control the VM, but removes
>>>> its right to access the code, register state or data used by the VM.
>>>> More information on the architecture is available here[0].
>>>>
>>>>       Arm CCA Reference Software Architecture
>>>>
>>>>           Realm World    ||    Normal World   ||  Secure World  ||
>>>>                          ||        |          ||                ||
>>>>    EL0 x-------x         || x----x | x------x ||                ||
>>>>        | Realm |         || |    | | |      | ||                ||
>>>>        |       |         || | VM | | |      | ||                ||
>>>>    ----|  VM*  |---------||-|    |---|      |-||----------------||
>>>>        |       |         || |    | | |  H   | ||                ||
>>>>    EL1 x-------x         || x----x | |      | ||                ||
>>>>            ^             ||        | |  o   | ||                ||
>>>>            |             ||        | |      | ||                ||
>>>>    ------- R*------------------------|  s  -|---------------------
>>>>            S             ||          |      | ||                ||
>>>>            I             ||          |  t   | ||                ||
>>>>            |             ||          |      | ||                ||
>>>>            v             ||          x------x ||                ||
>>>>    EL2    RMM*           ||              ^    ||                ||
>>>>            ^             ||              |    ||                ||
>>>>    ========|=============================|========================
>>>>            |                             | SMC
>>>>            x--------- *RMI* -------------x
>>>>
>>>>    EL3                   Root World
>>>>                          EL3 Firmware
>>>>    ===============================================================
>>>> Where :
>>>>    RMM - Realm Management Monitor
>>>>    RMI - Realm Management Interface
>>>>    RSI - Realm Service Interface
>>>>    SMC - Secure Monitor Call
>>>
>>> Hi,
>>>     It's nice to see this full stack posted - thanks!
>>>
>>> Are there any pointers to information on attestation and similar
>>> measurement things?  In particular, are there any plans for a vTPM
>>
>> The RMM v1.0 provides attestation and measurement services to the Realm,
>> via Realm Service Interface (RSI) calls.
> 
> Can you point me at some docs for that?
> 

It is part of the RMM specification [1], linked below.
Please see "Chapter A7. Realm Measurement and Attestation"

[1] https://developer.arm.com/documentation/den0137/latest

>> However, there is no support
>> for partitioning the Realm VM with v1.0. This is currently under
>> development and should be available in the near future.
>>
>> With that in place, a vTPM could reside in a partition of the Realm VM along
>> side the OS in another. Does that answer your question ?
> 
> Possibly; it would be great to be able to use a standard vTPM interface
> here rather than have to do anything special.  People already have this
> working on AMD SEV-SNP.

Ok.

> 
> Dave

...

>>>>
>>>> [1] RMM Specification Latest
>>>>       https://developer.arm.com/documentation/den0137/latest


Suzuki



>>>>
>>>> [2] RMM v1.0-Beta0 specification
>>>>       https://developer.arm.com/documentation/den0137/1-0bet0/
>>>>
>>>> [3] Trusted Firmware RMM - TF-RMM
>>>>       https://www.trustedfirmware.org/projects/tf-rmm/
>>>>       GIT: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
>>>>
>>>> [4] FVP Base RevC AEM Model (available on x86_64 / Arm64 Linux)
>>>>       https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms
>>>>
>>>> [5] Trusted Firmware for A class
>>>>       https://www.trustedfirmware.org/projects/tf-a/
>>>>
>>>> [6] Linux kernel support for Arm-CCA
>>>>       https://gitlab.arm.com/linux-arm/linux-cca
>>>>       Host Support branch:	cca-host/rfc-v1
>>>>       Guest Support branch:	cca-guest/rfc-v1
>>>>
>>>> [7] kvmtool support for Arm CCA
>>>>       https://gitlab.arm.com/linux-arm/kvmtool-cca cca/rfc-v1
>>>>
>>>> [8] kvm-unit-tests support for Arm CCA
>>>>       https://gitlab.arm.com/linux-arm/kvm-unit-tests-cca  cca/rfc-v1
>>>>
>>>> [9] Instructions for Building Firmware components and running the model, see
>>>>       section 4.19.2 "Building and running TF-A with RME"
>>>>       https://trustedfirmware-a.readthedocs.io/en/latest/components/realm-management-extension.html#building-and-running-tf-a-with-rme
>>>>
>>>> [10] fd based Guest Private memory for KVM
>>>>      https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng@linux.intel.com
>>>>
>>>> Cc: Alexandru Elisei <alexandru.elisei@arm.com>
>>>> Cc: Andrew Jones <andrew.jones@linux.dev>
>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>> Cc: Chao Peng <chao.p.peng@linux.intel.com>
>>>> Cc: Christoffer Dall <christoffer.dall@arm.com>
>>>> Cc: Fuad Tabba <tabba@google.com>
>>>> Cc: James Morse <james.morse@arm.com>
>>>> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>>> Cc: Joey Gouly <Joey.Gouly@arm.com>
>>>> Cc: Marc Zyngier <maz@kernel.org>
>>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>>> Cc: Oliver Upton <oliver.upton@linux.dev>
>>>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>>>> Cc: Quentin Perret <qperret@google.com>
>>>> Cc: Sean Christopherson <seanjc@google.com>
>>>> Cc: Steven Price <steven.price@arm.com>
>>>> Cc: Thomas Huth <thuth@redhat.com>
>>>> Cc: Will Deacon <will@kernel.org>
>>>> Cc: Zenghui Yu <yuzenghui@huawei.com>
>>>> To: linux-coco@lists.linux.dev
>>>> To: kvmarm@lists.linux.dev
>>>> Cc: kvmarm@lists.cs.columbia.edu
>>>> Cc: linux-arm-kernel@lists.infradead.org
>>>> To: linux-kernel@vger.kernel.org
>>>> To: kvm@vger.kernel.org
>>>>
>>



* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-03-01 22:12       ` Itaru Kitayama
  2023-03-02  9:18         ` Jean-Philippe Brucker
@ 2023-03-03  9:46         ` Jean-Philippe Brucker
  2023-03-03  9:54           ` Suzuki K Poulose
  1 sibling, 1 reply; 190+ messages in thread
From: Jean-Philippe Brucker @ 2023-03-03  9:46 UTC (permalink / raw)
  To: Itaru Kitayama
  Cc: Suzuki K Poulose, linux-coco, linux-kernel, kvm, kvmarm,
	linux-arm-kernel, Alexandru Elisei, Andrew Jones,
	Catalin Marinas, Chao Peng, Christoffer Dall, Fuad Tabba,
	James Morse, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Sean Christopherson,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

On Thu, Mar 02, 2023 at 07:12:24AM +0900, Itaru Kitayama wrote:
> > > I've tried your series in Real on CCA Host, but the KVM arch init
> > > emits an Invalid argument error and terminates.

This was the KVM_SET_ONE_REG for the SVE vector size. During my tests I
didn't enable SVE in the host but shrinkwrap enables more options.

Until we figure out support for SVE, disable it on the QEMU command-line
(similarly to '--disable-sve' needed for kvmtool boot):

	-cpu host,sve=off

Thanks,
Jean


* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-03-03  9:46         ` Jean-Philippe Brucker
@ 2023-03-03  9:54           ` Suzuki K Poulose
  2023-03-03 11:39             ` Jean-Philippe Brucker
  0 siblings, 1 reply; 190+ messages in thread
From: Suzuki K Poulose @ 2023-03-03  9:54 UTC (permalink / raw)
  To: Jean-Philippe Brucker, Itaru Kitayama
  Cc: linux-coco, linux-kernel, kvm, kvmarm, linux-arm-kernel,
	Alexandru Elisei, Andrew Jones, Catalin Marinas, Chao Peng,
	Christoffer Dall, Fuad Tabba, James Morse, Joey Gouly,
	Marc Zyngier, Mark Rutland, Oliver Upton, Paolo Bonzini,
	Quentin Perret, Sean Christopherson, Steven Price, Thomas Huth,
	Will Deacon, Zenghui Yu, kvmarm

On 03/03/2023 09:46, Jean-Philippe Brucker wrote:
> On Thu, Mar 02, 2023 at 07:12:24AM +0900, Itaru Kitayama wrote:
>>>> I've tried your series in Real on CCA Host, but the KVM arch init
>>>> emits an Invalid argument error and terminates.
> 
> This was the KVM_SET_ONE_REG for the SVE vector size. During my tests I
> didn't enable SVE in the host but shrinkwrap enables more options.

Does the Qemu check for SVE capability on /dev/kvm ? For kvmtool, we
changed to using the VM instance and that would prevent using SVE,
until the RMM supports it.

Suzuki

> 
> Until we figure out support for SVE, disable it on the QEMU command-line
> (similarly to '--disable-sve' needed for kvmtool boot):
> 
> 	-cpu host,sve=off
> 
> Thanks,
> Jean



* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-03-03  9:54           ` Suzuki K Poulose
@ 2023-03-03 11:39             ` Jean-Philippe Brucker
  2023-03-03 12:08               ` Andrew Jones
  0 siblings, 1 reply; 190+ messages in thread
From: Jean-Philippe Brucker @ 2023-03-03 11:39 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Jean-Philippe Brucker, Itaru Kitayama, linux-coco, linux-kernel,
	kvm, kvmarm, linux-arm-kernel, Alexandru Elisei, Andrew Jones,
	Catalin Marinas, Chao Peng, Christoffer Dall, Fuad Tabba,
	James Morse, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Sean Christopherson,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

On Fri, Mar 03, 2023 at 09:54:47AM +0000, Suzuki K Poulose wrote:
> On 03/03/2023 09:46, Jean-Philippe Brucker wrote:
> > On Thu, Mar 02, 2023 at 07:12:24AM +0900, Itaru Kitayama wrote:
> > > > > I've tried your series in Real on CCA Host, but the KVM arch init
> > > > > emits an Invalid argument error and terminates.
> > 
> > This was the KVM_SET_ONE_REG for the SVE vector size. During my tests I
> > didn't enable SVE in the host but shrinkwrap enables more options.
> 
> Does the Qemu check for SVE capability on /dev/kvm ? For kvmtool, we
> changed to using the VM instance and that would prevent using SVE,
> until the RMM supports it.

Yes, QEMU does check the SVE cap on /dev/kvm. I can propose changing it or
complementing it with a VM check in my next version, it seems to work
(though I need to double-check the VM fd lifetime). Same goes for
KVM_CAP_STEAL_TIME, which I need to disable explicitly at the moment.

Thanks,
Jean

> 
> Suzuki
> 
> > 
> > Until we figure out support for SVE, disable it on the QEMU command-line
> > (similarly to '--disable-sve' needed for kvmtool boot):
> > 
> > 	-cpu host,sve=off
> > 
> > Thanks,
> > Jean
> 
> 


* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-03-03 11:39             ` Jean-Philippe Brucker
@ 2023-03-03 12:08               ` Andrew Jones
  2023-03-03 12:19                 ` Suzuki K Poulose
  0 siblings, 1 reply; 190+ messages in thread
From: Andrew Jones @ 2023-03-03 12:08 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Suzuki K Poulose, Jean-Philippe Brucker, Itaru Kitayama,
	linux-coco, linux-kernel, kvm, kvmarm, linux-arm-kernel,
	Alexandru Elisei, Catalin Marinas, Chao Peng, Christoffer Dall,
	Fuad Tabba, James Morse, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Sean Christopherson,
	Steven Price, Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

On Fri, Mar 03, 2023 at 11:39:05AM +0000, Jean-Philippe Brucker wrote:
> On Fri, Mar 03, 2023 at 09:54:47AM +0000, Suzuki K Poulose wrote:
> > On 03/03/2023 09:46, Jean-Philippe Brucker wrote:
> > > On Thu, Mar 02, 2023 at 07:12:24AM +0900, Itaru Kitayama wrote:
> > > > > > I've tried your series in Real on CCA Host, but the KVM arch init
> > > > > > emits an Invalid argument error and terminates.
> > > 
> > > This was the KVM_SET_ONE_REG for the SVE vector size. During my tests I
> > > didn't enable SVE in the host but shrinkwrap enables more options.
> > 
> > Does the Qemu check for SVE capability on /dev/kvm ? For kvmtool, we
> > changed to using the VM instance and that would prevent using SVE,
> > until the RMM supports it.
> 
> Yes, QEMU does check the SVE cap on /dev/kvm. I can propose changing it or
> complementing it with a VM check in my next version, it seems to work
> (though I need to double-check the VM fd lifetime). Same goes for
> KVM_CAP_STEAL_TIME, which I need to disable explicitly at the moment.

I'm probably missing something since I haven't looked at this, but I'm
wondering what the "VM instance" check is and why it should be necessary.
Shouldn't KVM only expose capabilities which it can provide? I.e. the
"VM instance" check should be done by KVM and, when it fails, the SVE and
steal-time capabilities should return 0.

Thanks,
drew


* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-03-03 12:08               ` Andrew Jones
@ 2023-03-03 12:19                 ` Suzuki K Poulose
  2023-03-03 13:06                   ` Cornelia Huck
  0 siblings, 1 reply; 190+ messages in thread
From: Suzuki K Poulose @ 2023-03-03 12:19 UTC (permalink / raw)
  To: Andrew Jones, Jean-Philippe Brucker
  Cc: Jean-Philippe Brucker, Itaru Kitayama, linux-coco, linux-kernel,
	kvm, kvmarm, linux-arm-kernel, Alexandru Elisei, Catalin Marinas,
	Chao Peng, Christoffer Dall, Fuad Tabba, James Morse, Joey Gouly,
	Marc Zyngier, Mark Rutland, Oliver Upton, Paolo Bonzini,
	Quentin Perret, Sean Christopherson, Steven Price, Thomas Huth,
	Will Deacon, Zenghui Yu, kvmarm

On 03/03/2023 12:08, Andrew Jones wrote:
> On Fri, Mar 03, 2023 at 11:39:05AM +0000, Jean-Philippe Brucker wrote:
>> On Fri, Mar 03, 2023 at 09:54:47AM +0000, Suzuki K Poulose wrote:
>>> On 03/03/2023 09:46, Jean-Philippe Brucker wrote:
>>>> On Thu, Mar 02, 2023 at 07:12:24AM +0900, Itaru Kitayama wrote:
>>>>>>> I've tried your series in Real on CCA Host, but the KVM arch init
>>>>>>> emits an Invalid argument error and terminates.
>>>>
>>>> This was the KVM_SET_ONE_REG for the SVE vector size. During my tests I
>>>> didn't enable SVE in the host but shrinkwrap enables more options.
>>>
>>> Does the Qemu check for SVE capability on /dev/kvm ? For kvmtool, we
>>> changed to using the VM instance and that would prevent using SVE,
>>> until the RMM supports it.
>>
>> Yes, QEMU does check the SVE cap on /dev/kvm. I can propose changing it or
>> complementing it with a VM check in my next version, it seems to work
>> (though I need to double-check the VM fd lifetime). Same goes for
>> KVM_CAP_STEAL_TIME, which I need to disable explicitly at the moment.
> 
> I'm probably missing something since I haven't looked at this, but I'm
> wondering what the "VM instance" check is and why it should be necessary.

Userspace can check for a KVM_CAP_ on the KVM fd (/dev/kvm) or on a VM fd
(returned by KVM_CREATE_VM).

> Shouldn't KVM only expose capabilities which it can provide? I.e. the

Correct. Given that we now have different "types" of VMs possible on
Arm64 (Normal vs Realm vs pVM), the capabilities of each of these
could differ, and thus we should use the KVM_CAP_ check on the VM fd
(referred to as the "VM instance" above) and not the generic KVM fd.

> "VM instance" check should be done by KVM and, when it fails, the SVE and
> steal-time capabilities should return 0.
> 

Correct.
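
For illustration, that kernel-side behaviour can be sketched roughly as
below. This is a simplified stand-in, not the actual code from the RFC
series: the capability values and the one-field struct kvm are
placeholders, and kvm_is_realm() merely mimics the helper of that name.

```c
/* Simplified stand-in for the per-VM KVM_CHECK_EXTENSION handler.
 * Capability values and struct kvm are placeholders for illustration;
 * kvm_is_realm() mimics the helper introduced by the RFC series. */
#define KVM_CAP_ARM_SVE     170
#define KVM_CAP_STEAL_TIME  187

struct kvm { int is_realm; };

static int kvm_is_realm(struct kvm *kvm)
{
	return kvm->is_realm;
}

/* A Realm VM reports 0 for features the RMM cannot yet provide, so a
 * VMM querying the VM fd never tries to enable them; a normal VM
 * reports host support (assumed present here for the sketch). */
int kvm_vm_check_extension(struct kvm *kvm, long cap)
{
	switch (cap) {
	case KVM_CAP_ARM_SVE:
	case KVM_CAP_STEAL_TIME:
		return kvm_is_realm(kvm) ? 0 : 1;
	default:
		return 0;
	}
}
```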

Suzuki

> Thanks,
> drew



* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-03-03 12:19                 ` Suzuki K Poulose
@ 2023-03-03 13:06                   ` Cornelia Huck
  2023-03-03 13:57                     ` Jean-Philippe Brucker
  0 siblings, 1 reply; 190+ messages in thread
From: Cornelia Huck @ 2023-03-03 13:06 UTC (permalink / raw)
  To: Suzuki K Poulose, Andrew Jones, Jean-Philippe Brucker
  Cc: Jean-Philippe Brucker, Itaru Kitayama, linux-coco, linux-kernel,
	kvm, kvmarm, linux-arm-kernel, Alexandru Elisei, Catalin Marinas,
	Chao Peng, Christoffer Dall, Fuad Tabba, James Morse, Joey Gouly,
	Marc Zyngier, Mark Rutland, Oliver Upton, Paolo Bonzini,
	Quentin Perret, Sean Christopherson, Steven Price, Thomas Huth,
	Will Deacon, Zenghui Yu, kvmarm

On Fri, Mar 03 2023, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:

> On 03/03/2023 12:08, Andrew Jones wrote:
>> On Fri, Mar 03, 2023 at 11:39:05AM +0000, Jean-Philippe Brucker wrote:
>>> On Fri, Mar 03, 2023 at 09:54:47AM +0000, Suzuki K Poulose wrote:
>>>> On 03/03/2023 09:46, Jean-Philippe Brucker wrote:
>>>>> On Thu, Mar 02, 2023 at 07:12:24AM +0900, Itaru Kitayama wrote:
>>>>>>>> I've tried your series in Real on CCA Host, but the KVM arch init
>>>>>>>> emits an Invalid argument error and terminates.
>>>>>
>>>>> This was the KVM_SET_ONE_REG for the SVE vector size. During my tests I
>>>>> didn't enable SVE in the host but shrinkwrap enables more options.
>>>>
>>>> Does the Qemu check for SVE capability on /dev/kvm ? For kvmtool, we
>>>> changed to using the VM instance and that would prevent using SVE,
>>>> until the RMM supports it.
>>>
>>> Yes, QEMU does check the SVE cap on /dev/kvm. I can propose changing it or
>>> complementing it with a VM check in my next version, it seems to work
>>> (though I need to double-check the VM fd lifetime). Same goes for
>>> KVM_CAP_STEAL_TIME, which I need to disable explicitly at the moment.
>> 
>> I'm probably missing something since I haven't looked at this, but I'm
>> wondering what the "VM instance" check is and why it should be necessary.
>
> Userspace can check for a KVM_CAP_ on KVM fd (/dev/kvm) or a VM fd
> (returned via KVM_CREATE_VM).
>
>> Shouldn't KVM only expose capabilities which it can provide? I.e. the
>
> Correct, given now that we have different "types" of VMs possible on
> Arm64, (Normal vs Realm vs pVM), the capabilities of each of these
> could be different and thus we should use the KVM_CAP_ on the VM fd (
> referred to VM instance above) and not the generic KVM fd.

Using the vm ioctl is even encouraged in the doc for
KVM_CHECK_EXTENSION:

"Based on their initialization different VMs may have different capabilities.
It is thus encouraged to use the vm ioctl to query for capabilities"

It would probably make sense to convert QEMU to use the vm ioctl
everywhere (the wrapper falls back to the global version on failure
anyway).
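
A rough sketch of such a wrapper (names are illustrative, not QEMU's
real symbols; the ioctl is abstracted behind a function pointer so the
lookup-order logic is self-contained):

```c
/* Query a capability on the VM fd first and fall back to the global
 * /dev/kvm fd, treating errors as "not supported". check_fn stands in
 * for ioctl(fd, KVM_CHECK_EXTENSION, cap). */
typedef int (*check_fn)(int fd, long cap);

int check_extension(int vm_fd, int kvm_fd, long cap, check_fn check)
{
	int ret = -1;

	if (vm_fd >= 0)
		ret = check(vm_fd, cap);	/* per-VM answer, preferred */
	if (ret < 0 && kvm_fd >= 0)
		ret = check(kvm_fd, cap);	/* fall back to the global fd */
	return ret < 0 ? 0 : ret;
}

/* Stand-in backend for demonstration: the global fd (3) advertises the
 * capability, the Realm VM fd (7) does not, other fds report an error. */
static int fake_check(int fd, long cap)
{
	(void)cap;
	return fd == 3 ? 1 : (fd == 7 ? 0 : -1);
}
```

With this ordering, a Realm VM's answer (0 for SVE or steal-time) masks
the global one, matching the per-VM capability behaviour discussed above.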

>
>> "VM instance" check should be done by KVM and, when it fails, the SVE and
>> steal-time capabilities should return 0.
>> 
>
> Correct.
>
> Suzuki
>
>> Thanks,
>> drew



* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-03-03 13:06                   ` Cornelia Huck
@ 2023-03-03 13:57                     ` Jean-Philippe Brucker
  0 siblings, 0 replies; 190+ messages in thread
From: Jean-Philippe Brucker @ 2023-03-03 13:57 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Suzuki K Poulose, Andrew Jones, Jean-Philippe Brucker,
	Itaru Kitayama, linux-coco, linux-kernel, kvm, kvmarm,
	linux-arm-kernel, Alexandru Elisei, Catalin Marinas, Chao Peng,
	Christoffer Dall, Fuad Tabba, James Morse, Joey Gouly,
	Marc Zyngier, Mark Rutland, Oliver Upton, Paolo Bonzini,
	Quentin Perret, Sean Christopherson, Steven Price, Thomas Huth,
	Will Deacon, Zenghui Yu, kvmarm

On Fri, Mar 03, 2023 at 02:06:07PM +0100, Cornelia Huck wrote:
> On Fri, Mar 03 2023, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> 
> > On 03/03/2023 12:08, Andrew Jones wrote:
> >> On Fri, Mar 03, 2023 at 11:39:05AM +0000, Jean-Philippe Brucker wrote:
> >>> On Fri, Mar 03, 2023 at 09:54:47AM +0000, Suzuki K Poulose wrote:
> >>>> On 03/03/2023 09:46, Jean-Philippe Brucker wrote:
> >>>>> On Thu, Mar 02, 2023 at 07:12:24AM +0900, Itaru Kitayama wrote:
> >>>>>>>> I've tried your series in Real on CCA Host, but the KVM arch init
> >>>>>>>> emits an Invalid argument error and terminates.
> >>>>>
> >>>>> This was the KVM_SET_ONE_REG for the SVE vector size. During my tests I
> >>>>> didn't enable SVE in the host but shrinkwrap enables more options.
> >>>>
> >>>> Does the Qemu check for SVE capability on /dev/kvm ? For kvmtool, we
> >>>> changed to using the VM instance and that would prevent using SVE,
> >>>> until the RMM supports it.
> >>>
> >>> Yes, QEMU does check the SVE cap on /dev/kvm. I can propose changing it or
> >>> complementing it with a VM check in my next version, it seems to work
> >>> (though I need to double-check the VM fd lifetime). Same goes for
> >>> KVM_CAP_STEAL_TIME, which I need to disable explicitly at the moment.
> >> 
> >> I'm probably missing something since I haven't looked at this, but I'm
> >> wondering what the "VM instance" check is and why it should be necessary.
> >
> > Userspace can check for a KVM_CAP_ on KVM fd (/dev/kvm) or a VM fd
> > (returned via KVM_CREATE_VM).
> >
> >> Shouldn't KVM only expose capabilities which it can provide? I.e. the
> >
> > Correct. Given that we now have different "types" of VMs possible on
> > Arm64 (Normal vs Realm vs pVM), the capabilities of each of these
> > could be different, and thus we should check the KVM_CAP_ on the VM fd
> > (referred to as the "VM instance" above) and not on the generic KVM fd.
> 
> Using the vm ioctl is even encouraged in the doc for
> KVM_CHECK_EXTENSION:
> 
> "Based on their initialization different VMs may have different capabilities.
> It is thus encouraged to use the vm ioctl to query for capabilities"
> 
> It would probably make sense to convert QEMU to use the vm ioctl
> everywhere (the wrapper falls back to the global version on failure
> anyway.)

Indeed, I'll see if I can come up with something generic, thanks for the
pointer

Thanks,
Jean

^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 09/28] arm64: RME: RTT handling
  2023-02-13 17:44     ` Zhi Wang
@ 2023-03-03 14:04       ` Steven Price
  2023-03-04 12:32         ` Zhi Wang
  0 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-03-03 14:04 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On 13/02/2023 17:44, Zhi Wang wrote:
> On Fri, 27 Jan 2023 11:29:13 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> The RMM owns the stage 2 page tables for a realm, and KVM must request
>> that the RMM creates/destroys entries as necessary. The physical pages
>> to store the page tables are delegated to the realm as required, and can
>> be undelegated when no longer used.
>>
> 
> This is only a brief introduction to RTT handling. As this patch is mostly
> about RTT teardown, it would be better to add more introduction to this
> patch. Also, maybe refine the title to reflect what this patch is actually
> doing.

You've a definite point that this patch is mostly about RTT teardown.
Technically it also adds the RTT creation path (realm_rtt_create) -
hence the generic patch title.

But I'll definitely expand the commit message to mention the complexity
of tear down which is the bulk of the patch.

>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  arch/arm64/include/asm/kvm_rme.h |  19 +++++
>>  arch/arm64/kvm/mmu.c             |   7 +-
>>  arch/arm64/kvm/rme.c             | 139 +++++++++++++++++++++++++++++++
>>  3 files changed, 162 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
>> index a6318af3ed11..eea5118dfa8a 100644
>> --- a/arch/arm64/include/asm/kvm_rme.h
>> +++ b/arch/arm64/include/asm/kvm_rme.h
>> @@ -35,5 +35,24 @@ u32 kvm_realm_ipa_limit(void);
>>  int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
>>  int kvm_init_realm_vm(struct kvm *kvm);
>>  void kvm_destroy_realm(struct kvm *kvm);
>> +void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level);
>> +
>> +#define RME_RTT_BLOCK_LEVEL	2
>> +#define RME_RTT_MAX_LEVEL	3
>> +
>> +#define RME_PAGE_SHIFT		12
>> +#define RME_PAGE_SIZE		BIT(RME_PAGE_SHIFT)
>> +/* See ARM64_HW_PGTABLE_LEVEL_SHIFT() */
>> +#define RME_RTT_LEVEL_SHIFT(l)	\
>> +	((RME_PAGE_SHIFT - 3) * (4 - (l)) + 3)
>> +#define RME_L2_BLOCK_SIZE	BIT(RME_RTT_LEVEL_SHIFT(2))
>> +
>> +static inline unsigned long rme_rtt_level_mapsize(int level)
>> +{
>> +	if (WARN_ON(level > RME_RTT_MAX_LEVEL))
>> +		return RME_PAGE_SIZE;
>> +
>> +	return (1UL << RME_RTT_LEVEL_SHIFT(level));
>> +}
>>  
>>  #endif
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>> index 22c00274884a..f29558c5dcbc 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -834,16 +834,17 @@ void stage2_unmap_vm(struct kvm *kvm)
>>  void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
>>  {
>>  	struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
>> -	struct kvm_pgtable *pgt = NULL;
>> +	struct kvm_pgtable *pgt;
>>  
>>  	write_lock(&kvm->mmu_lock);
>> +	pgt = mmu->pgt;
>>  	if (kvm_is_realm(kvm) &&
>>  	    kvm_realm_state(kvm) != REALM_STATE_DYING) {
>> -		/* TODO: teardown rtts */
>>  		write_unlock(&kvm->mmu_lock);
>> +		kvm_realm_destroy_rtts(&kvm->arch.realm, pgt->ia_bits,
>> +				       pgt->start_level);
>>  		return;
>>  	}
>> -	pgt = mmu->pgt;
>>  	if (pgt) {
>>  		mmu->pgd_phys = 0;
>>  		mmu->pgt = NULL;
>> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
>> index 0c9d70e4d9e6..f7b0e5a779f8 100644
>> --- a/arch/arm64/kvm/rme.c
>> +++ b/arch/arm64/kvm/rme.c
>> @@ -73,6 +73,28 @@ static int rmi_check_version(void)
>>  	return 0;
>>  }
>>  
>> +static void realm_destroy_undelegate_range(struct realm *realm,
>> +					   unsigned long ipa,
>> +					   unsigned long addr,
>> +					   ssize_t size)
>> +{
>> +	unsigned long rd = virt_to_phys(realm->rd);
>> +	int ret;
>> +
>> +	while (size > 0) {
>> +		ret = rmi_data_destroy(rd, ipa);
>> +		WARN_ON(ret);
>> +		ret = rmi_granule_undelegate(addr);
>> +
> As the return value is not documented, what will happen if a page undelegate
> fails? Is it leaked? Some explanation is required here.

Yes - it's leaked. I'll add a comment to explain the get_page() call.

Thanks,

Steve

>> +		if (ret)
>> +			get_page(phys_to_page(addr));
>> +
>> +		addr += PAGE_SIZE;
>> +		ipa += PAGE_SIZE;
>> +		size -= PAGE_SIZE;
>> +	}
>> +}
>> +
>>  static unsigned long create_realm_feat_reg0(struct kvm *kvm)
>>  {
>>  	unsigned long ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
>> @@ -170,6 +192,123 @@ static int realm_create_rd(struct kvm *kvm)
>>  	return r;
>>  }
>>  
>> +static int realm_rtt_destroy(struct realm *realm, unsigned long addr,
>> +			     int level, phys_addr_t rtt_granule)
>> +{
>> +	addr = ALIGN_DOWN(addr, rme_rtt_level_mapsize(level - 1));
>> +	return rmi_rtt_destroy(rtt_granule, virt_to_phys(realm->rd), addr,
>> +			level);
>> +}
>> +
>> +static int realm_destroy_free_rtt(struct realm *realm, unsigned long addr,
>> +				  int level, phys_addr_t rtt_granule)
>> +{
>> +	if (realm_rtt_destroy(realm, addr, level, rtt_granule))
>> +		return -ENXIO;
>> +	if (!WARN_ON(rmi_granule_undelegate(rtt_granule)))
>> +		put_page(phys_to_page(rtt_granule));
>> +
>> +	return 0;
>> +}
>> +
>> +static int realm_rtt_create(struct realm *realm,
>> +			    unsigned long addr,
>> +			    int level,
>> +			    phys_addr_t phys)
>> +{
>> +	addr = ALIGN_DOWN(addr, rme_rtt_level_mapsize(level - 1));
>> +	return rmi_rtt_create(phys, virt_to_phys(realm->rd), addr, level);
>> +}
>> +
>> +static int realm_tear_down_rtt_range(struct realm *realm, int level,
>> +				     unsigned long start, unsigned long end)
>> +{
>> +	phys_addr_t rd = virt_to_phys(realm->rd);
>> +	ssize_t map_size = rme_rtt_level_mapsize(level);
>> +	unsigned long addr, next_addr;
>> +	bool failed = false;
>> +
>> +	for (addr = start; addr < end; addr = next_addr) {
>> +		phys_addr_t rtt_addr, tmp_rtt;
>> +		struct rtt_entry rtt;
>> +		unsigned long end_addr;
>> +
>> +		next_addr = ALIGN(addr + 1, map_size);
>> +
>> +		end_addr = min(next_addr, end);
>> +
>> +		if (rmi_rtt_read_entry(rd, ALIGN_DOWN(addr, map_size),
>> +				       level, &rtt)) {
>> +			failed = true;
>> +			continue;
>> +		}
>> +
>> +		rtt_addr = rmi_rtt_get_phys(&rtt);
>> +		WARN_ON(level != rtt.walk_level);
>> +
>> +		switch (rtt.state) {
>> +		case RMI_UNASSIGNED:
>> +		case RMI_DESTROYED:
>> +			break;
>> +		case RMI_TABLE:
>> +			if (realm_tear_down_rtt_range(realm, level + 1,
>> +						      addr, end_addr)) {
>> +				failed = true;
>> +				break;
>> +			}
>> +			if (IS_ALIGNED(addr, map_size) &&
>> +			    next_addr <= end &&
>> +			    realm_destroy_free_rtt(realm, addr, level + 1,
>> +						   rtt_addr))
>> +				failed = true;
>> +			break;
>> +		case RMI_ASSIGNED:
>> +			WARN_ON(!rtt_addr);
>> +			/*
>> +			 * If there is a block mapping, break it now, using the
>> +			 * spare_page. We are sure to have a valid delegated
>> +			 * page at spare_page before we enter here, otherwise
>> +			 * WARN once, which will be followed by further
>> +			 * warnings.
>> +			 */
>> +			tmp_rtt = realm->spare_page;
>> +			if (level == 2 &&
>> +			    !WARN_ON_ONCE(tmp_rtt == PHYS_ADDR_MAX) &&
>> +			    realm_rtt_create(realm, addr,
>> +					     RME_RTT_MAX_LEVEL, tmp_rtt)) {
>> +				WARN_ON(1);
>> +				failed = true;
>> +				break;
>> +			}
>> +			realm_destroy_undelegate_range(realm, addr,
>> +						       rtt_addr, map_size);
>> +			/*
>> +			 * Collapse the last level table and make the spare page
>> +			 * reusable again.
>> +			 */
>> +			if (level == 2 &&
>> +			    realm_rtt_destroy(realm, addr, RME_RTT_MAX_LEVEL,
>> +					      tmp_rtt))
>> +				failed = true;
>> +			break;
>> +		case RMI_VALID_NS:
>> +			WARN_ON(rmi_rtt_unmap_unprotected(rd, addr, level));
>> +			break;
>> +		default:
>> +			WARN_ON(1);
>> +			failed = true;
>> +			break;
>> +		}
>> +	}
>> +
>> +	return failed ? -EINVAL : 0;
>> +}
>> +
>> +void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level)
>> +{
>> +	realm_tear_down_rtt_range(realm, start_level, 0, (1UL << ia_bits));
>> +}
>> +
>>  /* Protects access to rme_vmid_bitmap */
>>  static DEFINE_SPINLOCK(rme_vmid_lock);
>>  static unsigned long *rme_vmid_bitmap;
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 10/28] arm64: RME: Allocate/free RECs to match vCPUs
  2023-02-13 18:08     ` Zhi Wang
@ 2023-03-03 14:05       ` Steven Price
  2023-03-04 12:46         ` Zhi Wang
  0 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-03-03 14:05 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On 13/02/2023 18:08, Zhi Wang wrote:
> On Fri, 27 Jan 2023 11:29:14 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> The RMM maintains a data structure known as the Realm Execution Context
>> (or REC). It is similar to struct kvm_vcpu and tracks the state of the
>> virtual CPUs. KVM must delegate memory and request the structures are
>> created when vCPUs are created, and suitably tear down on destruction.
>>
> 
> It would be better to leave some pointers to the spec here. It really saves
> time for reviewers. 

Fair enough. I wasn't sure how often to repeat the link to the spec, but
a few more times wouldn't hurt ;)

>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  arch/arm64/include/asm/kvm_emulate.h |   2 +
>>  arch/arm64/include/asm/kvm_host.h    |   3 +
>>  arch/arm64/include/asm/kvm_rme.h     |  10 ++
>>  arch/arm64/kvm/arm.c                 |   1 +
>>  arch/arm64/kvm/reset.c               |  11 ++
>>  arch/arm64/kvm/rme.c                 | 144 +++++++++++++++++++++++++++
>>  6 files changed, 171 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
>> index 5a2b7229e83f..285e62914ca4 100644
>> --- a/arch/arm64/include/asm/kvm_emulate.h
>> +++ b/arch/arm64/include/asm/kvm_emulate.h
>> @@ -504,6 +504,8 @@ static inline enum realm_state kvm_realm_state(struct kvm *kvm)
>>  
>>  static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
>>  {
>> +	if (static_branch_unlikely(&kvm_rme_is_available))
>> +		return vcpu->arch.rec.mpidr != INVALID_HWID;
>>  	return false;
>>  }
>>  
>> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
>> index 04347c3a8c6b..ef497b718cdb 100644
>> --- a/arch/arm64/include/asm/kvm_host.h
>> +++ b/arch/arm64/include/asm/kvm_host.h
>> @@ -505,6 +505,9 @@ struct kvm_vcpu_arch {
>>  		u64 last_steal;
>>  		gpa_t base;
>>  	} steal;
>> +
>> +	/* Realm meta data */
>> +	struct rec rec;
> 
> I think the name of the data structure "rec" needs a prefix; it is too
> common and might conflict with private data structures in other modules.
> Maybe rme_rec or realm_rec?

struct realm_rec seems like a good choice. I agree 'rec' without context
is somewhat ambiguous.

>>  };
>>  
>>  /*
>> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
>> index eea5118dfa8a..4b219ebe1400 100644
>> --- a/arch/arm64/include/asm/kvm_rme.h
>> +++ b/arch/arm64/include/asm/kvm_rme.h
>> @@ -6,6 +6,7 @@
>>  #ifndef __ASM_KVM_RME_H
>>  #define __ASM_KVM_RME_H
>>  
>> +#include <asm/rmi_smc.h>
>>  #include <uapi/linux/kvm.h>
>>  
>>  enum realm_state {
>> @@ -29,6 +30,13 @@ struct realm {
>>  	unsigned int ia_bits;
>>  };
>>  
>> +struct rec {
>> +	unsigned long mpidr;
>> +	void *rec_page;
>> +	struct page *aux_pages[REC_PARAMS_AUX_GRANULES];
>> +	struct rec_run *run;
>> +};
>> +
> 
> It would be better to add comments for the above members, or pointers to
> the spec; that saves a lot of time for review.

Will add comments.

>>  int kvm_init_rme(void);
>>  u32 kvm_realm_ipa_limit(void);
>>  
>> @@ -36,6 +44,8 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
>>  int kvm_init_realm_vm(struct kvm *kvm);
>>  void kvm_destroy_realm(struct kvm *kvm);
>>  void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level);
>> +int kvm_create_rec(struct kvm_vcpu *vcpu);
>> +void kvm_destroy_rec(struct kvm_vcpu *vcpu);
>>  
>>  #define RME_RTT_BLOCK_LEVEL	2
>>  #define RME_RTT_MAX_LEVEL	3
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index badd775547b8..52affed2f3cf 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -373,6 +373,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>>  	/* Force users to call KVM_ARM_VCPU_INIT */
>>  	vcpu->arch.target = -1;
>>  	bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
>> +	vcpu->arch.rec.mpidr = INVALID_HWID;
>>  
>>  	vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
>>  
>> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
>> index 9e71d69e051f..0c84392a4bf2 100644
>> --- a/arch/arm64/kvm/reset.c
>> +++ b/arch/arm64/kvm/reset.c
>> @@ -135,6 +135,11 @@ int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature)
>>  			return -EPERM;
>>  
>>  		return kvm_vcpu_finalize_sve(vcpu);
>> +	case KVM_ARM_VCPU_REC:
>> +		if (!kvm_is_realm(vcpu->kvm))
>> +			return -EINVAL;
>> +
>> +		return kvm_create_rec(vcpu);
>>  	}
>>  
>>  	return -EINVAL;
>> @@ -145,6 +150,11 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu)
>>  	if (vcpu_has_sve(vcpu) && !kvm_arm_vcpu_sve_finalized(vcpu))
>>  		return false;
>>  
>> +	if (kvm_is_realm(vcpu->kvm) &&
>> +	    !(vcpu_is_rec(vcpu) &&
>> +	      READ_ONCE(vcpu->kvm->arch.realm.state) == REALM_STATE_ACTIVE))
>> +		return false;
> 
> That's why it is better to introduce the realm state in the previous patches so
> that people can really get the idea of the states at this stage.
> 
>> +
>>  	return true;
>>  }
>>  
>> @@ -157,6 +167,7 @@ void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu)
>>  	if (sve_state)
>>  		kvm_unshare_hyp(sve_state, sve_state + vcpu_sve_state_size(vcpu));
>>  	kfree(sve_state);
>> +	kvm_destroy_rec(vcpu);
>>  }
>>  
>>  static void kvm_vcpu_reset_sve(struct kvm_vcpu *vcpu)
>> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
>> index f7b0e5a779f8..d79ed889ca4d 100644
>> --- a/arch/arm64/kvm/rme.c
>> +++ b/arch/arm64/kvm/rme.c
>> @@ -514,6 +514,150 @@ void kvm_destroy_realm(struct kvm *kvm)
>>  	kvm_free_stage2_pgd(&kvm->arch.mmu);
>>  }
>>  
>> +static void free_rec_aux(struct page **aux_pages,
>> +			 unsigned int num_aux)
>> +{
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < num_aux; i++) {
>> +		phys_addr_t aux_page_phys = page_to_phys(aux_pages[i]);
>> +
>> +		if (WARN_ON(rmi_granule_undelegate(aux_page_phys)))
>> +			continue;
>> +
>> +		__free_page(aux_pages[i]);
>> +	}
>> +}
>> +
>> +static int alloc_rec_aux(struct page **aux_pages,
>> +			 u64 *aux_phys_pages,
>> +			 unsigned int num_aux)
>> +{
>> +	int ret;
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < num_aux; i++) {
>> +		struct page *aux_page;
>> +		phys_addr_t aux_page_phys;
>> +
>> +		aux_page = alloc_page(GFP_KERNEL);
>> +		if (!aux_page) {
>> +			ret = -ENOMEM;
>> +			goto out_err;
>> +		}
>> +		aux_page_phys = page_to_phys(aux_page);
>> +		if (rmi_granule_delegate(aux_page_phys)) {
>> +			__free_page(aux_page);
>> +			ret = -ENXIO;
>> +			goto out_err;
>> +		}
>> +		aux_pages[i] = aux_page;
>> +		aux_phys_pages[i] = aux_page_phys;
>> +	}
>> +
>> +	return 0;
>> +out_err:
>> +	free_rec_aux(aux_pages, i);
>> +	return ret;
>> +}
>> +
>> +int kvm_create_rec(struct kvm_vcpu *vcpu)
>> +{
>> +	struct user_pt_regs *vcpu_regs = vcpu_gp_regs(vcpu);
>> +	unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu);
>> +	struct realm *realm = &vcpu->kvm->arch.realm;
>> +	struct rec *rec = &vcpu->arch.rec;
>> +	unsigned long rec_page_phys;
>> +	struct rec_params *params;
>> +	int r, i;
>> +
>> +	if (kvm_realm_state(vcpu->kvm) != REALM_STATE_NEW)
>> +		return -ENOENT;
>> +
>> +	/*
>> +	 * The RMM will report PSCI v1.0 to Realms and the KVM_ARM_VCPU_PSCI_0_2
>> +	 * flag covers v0.2 and onwards.
>> +	 */
>> +	if (!test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
>> +		return -EINVAL;
>> +
>> +	BUILD_BUG_ON(sizeof(*params) > PAGE_SIZE);
>> +	BUILD_BUG_ON(sizeof(*rec->run) > PAGE_SIZE);
>> +
>> +	params = (struct rec_params *)get_zeroed_page(GFP_KERNEL);
>> +	rec->rec_page = (void *)__get_free_page(GFP_KERNEL);
>> +	rec->run = (void *)get_zeroed_page(GFP_KERNEL);
>> +	if (!params || !rec->rec_page || !rec->run) {
>> +		r = -ENOMEM;
>> +		goto out_free_pages;
>> +	}
>> +
>> +	for (i = 0; i < ARRAY_SIZE(params->gprs); i++)
>> +		params->gprs[i] = vcpu_regs->regs[i];
>> +
>> +	params->pc = vcpu_regs->pc;
>> +
>> +	if (vcpu->vcpu_id == 0)
>> +		params->flags |= REC_PARAMS_FLAG_RUNNABLE;
>> +
>> +	rec_page_phys = virt_to_phys(rec->rec_page);
>> +
>> +	if (rmi_granule_delegate(rec_page_phys)) {
>> +		r = -ENXIO;
>> +		goto out_free_pages;
>> +	}
>> +
> 
> Wouldn't it be better to extend alloc_rec_aux() to allocate and delegate
> the pages above? That way you could save some GFP allocations and
> rmi_granule_delegate() calls.

I don't think it's really an improvement. There's only the one
rmi_granule_delegate() call (for the REC page itself). The RecParams and
RecRun pages are not delegated because they are shared with the host. It
would also hide the structure setup within the new
alloc_rec_aux_and_rec() function.

>> +	r = alloc_rec_aux(rec->aux_pages, params->aux, realm->num_aux);
>> +	if (r)
>> +		goto out_undelegate_rmm_rec;
>> +
>> +	params->num_rec_aux = realm->num_aux;
>> +	params->mpidr = mpidr;
>> +
>> +	if (rmi_rec_create(rec_page_phys,
>> +			   virt_to_phys(realm->rd),
>> +			   virt_to_phys(params))) {
>> +		r = -ENXIO;
>> +		goto out_free_rec_aux;
>> +	}
>> +
>> +	rec->mpidr = mpidr;
>> +
>> +	free_page((unsigned long)params);
>> +	return 0;
>> +
>> +out_free_rec_aux:
>> +	free_rec_aux(rec->aux_pages, realm->num_aux);
>> +out_undelegate_rmm_rec:
>> +	if (WARN_ON(rmi_granule_undelegate(rec_page_phys)))
>> +		rec->rec_page = NULL;
>> +out_free_pages:
>> +	free_page((unsigned long)rec->run);
>> +	free_page((unsigned long)rec->rec_page);
>> +	free_page((unsigned long)params);
>> +	return r;
>> +}
>> +
>> +void kvm_destroy_rec(struct kvm_vcpu *vcpu)
>> +{
>> +	struct realm *realm = &vcpu->kvm->arch.realm;
>> +	struct rec *rec = &vcpu->arch.rec;
>> +	unsigned long rec_page_phys;
>> +
>> +	if (!vcpu_is_rec(vcpu))
>> +		return;
>> +
>> +	rec_page_phys = virt_to_phys(rec->rec_page);
>> +
>> +	if (WARN_ON(rmi_rec_destroy(rec_page_phys)))
>> +		return;
>> +	if (WARN_ON(rmi_granule_undelegate(rec_page_phys)))
>> +		return;
>> +
> 
> The two returns above feel off. What is the reason to skip the page
> undelegates below?

The reason is the usual: if we fail to undelegate then the pages have to
be leaked. I'll add some comments. However, it does look like I've got
the order wrong here: if the undelegate fails for rec_page_phys, it's
possible that we might still be able to free the rec_aux pages (although
something has gone terribly wrong for that to be the case).

I'll change the order to:

  /* If the REC destroy fails, leak all pages relating to the REC */
  if (WARN_ON(rmi_rec_destroy(rec_page_phys)))
	return;

  free_rec_aux(rec->aux_pages, realm->num_aux);

  /* If the undelegate fails then leak the REC page */
  if (WARN_ON(rmi_granule_undelegate(rec_page_phys)))
	return;

  free_page((unsigned long)rec->rec_page);

If the rmi_rec_destroy() call has failed then the RMM should prevent the
undelegate so there's little point trying.

Steve

>> +	free_rec_aux(rec->aux_pages, realm->num_aux);
>> +	free_page((unsigned long)rec->rec_page);
>> +}
>> +
>>  int kvm_init_realm_vm(struct kvm *kvm)
>>  {
>>  	struct realm_params *params;
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 13/28] arm64: RME: Allow VMM to set RIPAS
  2023-02-17 13:07     ` Zhi Wang
@ 2023-03-03 14:05       ` Steven Price
  0 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-03-03 14:05 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On 17/02/2023 13:07, Zhi Wang wrote:
> On Fri, 27 Jan 2023 11:29:17 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> Each page within the protected region of the realm guest can be marked
>> as either RAM or EMPTY. Allow the VMM to control this before the guest
>> has started and provide the equivalent functions to change this (with
>> the guest's approval) at runtime.
>>
> 
> The above is just the purpose of this patch. It would be better to have one
> more paragraph describing what this patch does (building RTTs and setting
> the IPA state in the RTT) and explaining anything that might confuse
> people, for example the spare page.

I'll improve the commit message.

> The spare page is really confusing. When reading __alloc_delegated_page(),
> it looks like a mechanism to cache a delegated page for the realm. But later
> in the teardown path, it looks like a workaround. What if the allocation of
> the spare page fails in the RTT teardown path?

Yeah the spare_page is a bit messy. Ultimately the idea is that rather
than having to delegate a page to the RMM temporarily just for breaking
up a block mapping which is going to be freed, we keep one spare for the
purpose. This also reduces the chance of an allocation failure while
trying to free memory.

One area of confusion, and something that might be worth revisiting, is
that the spare_page is also used opportunistically in
realm_create_rtt_levels(). Again this makes sense in the case where a
temporary page is needed when creating a block mapping, but the code
doesn't distinguish between this and just creating RTTs for normal mappings.

This leads to the rather unfortunate situation that it's not actually
possible to rely on there being a spare_page and therefore this is
pre-populated in kvm_realm_unmap_range(), but with a possibility that
allocation failure could occur resulting in the function failing (which
is 'handled' by a WARN_ON).

> I understand this must be a temporary solution. It would be really nice to
> have a big picture or some basic introduction to the future plan. 

Sadly I don't have a "big picture" plan at the moment. I am quite
tempted to split spare_page into two:

 * A 'guaranteed' spare page purely for destroying block mappings. This
would be allocated when the realm is created and only used for the
purpose of tearing down mappings.

 * A temporary spare page used as a minor optimisation during block
mapping creation - rather than immediately freeing the page back when
folding we can hold on to it with the assumption that it's likely to be
useful for creating further mappings in the same call.

However, to be honest, this is all a bit academic because as it stands
block mappings can't really be used. But when we switch over to using
the memfd approach hopefully huge pages can be translated to block mappings.

Steve

>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  arch/arm64/include/asm/kvm_rme.h |   4 +
>>  arch/arm64/kvm/rme.c             | 288 +++++++++++++++++++++++++++++++
>>  2 files changed, 292 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
>> index 4b219ebe1400..3e75cedaad18 100644
>> --- a/arch/arm64/include/asm/kvm_rme.h
>> +++ b/arch/arm64/include/asm/kvm_rme.h
>> @@ -47,6 +47,10 @@ void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level);
>>  int kvm_create_rec(struct kvm_vcpu *vcpu);
>>  void kvm_destroy_rec(struct kvm_vcpu *vcpu);
>>  
>> +int realm_set_ipa_state(struct kvm_vcpu *vcpu,
>> +			unsigned long addr, unsigned long end,
>> +			unsigned long ripas);
>> +
>>  #define RME_RTT_BLOCK_LEVEL	2
>>  #define RME_RTT_MAX_LEVEL	3
>>  
>> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
>> index d79ed889ca4d..b3ea79189839 100644
>> --- a/arch/arm64/kvm/rme.c
>> +++ b/arch/arm64/kvm/rme.c
>> @@ -73,6 +73,58 @@ static int rmi_check_version(void)
>>  	return 0;
>>  }
>>  
>> +static phys_addr_t __alloc_delegated_page(struct realm *realm,
>> +					  struct kvm_mmu_memory_cache *mc, gfp_t flags)
>> +{
>> +	phys_addr_t phys = PHYS_ADDR_MAX;
>> +	void *virt;
>> +
>> +	if (realm->spare_page != PHYS_ADDR_MAX) {
>> +		swap(realm->spare_page, phys);
>> +		goto out;
>> +	}
>> +
>> +	if (mc)
>> +		virt = kvm_mmu_memory_cache_alloc(mc);
>> +	else
>> +		virt = (void *)__get_free_page(flags);
>> +
>> +	if (!virt)
>> +		goto out;
>> +
>> +	phys = virt_to_phys(virt);
>> +
>> +	if (rmi_granule_delegate(phys)) {
>> +		free_page((unsigned long)virt);
>> +
>> +		phys = PHYS_ADDR_MAX;
>> +	}
>> +
>> +out:
>> +	return phys;
>> +}
>> +
>> +static phys_addr_t alloc_delegated_page(struct realm *realm,
>> +					struct kvm_mmu_memory_cache *mc)
>> +{
>> +	return __alloc_delegated_page(realm, mc, GFP_KERNEL);
>> +}
>> +
>> +static void free_delegated_page(struct realm *realm, phys_addr_t phys)
>> +{
>> +	if (realm->spare_page == PHYS_ADDR_MAX) {
>> +		realm->spare_page = phys;
>> +		return;
>> +	}
>> +
>> +	if (WARN_ON(rmi_granule_undelegate(phys))) {
>> +		/* Undelegate failed: leak the page */
>> +		return;
>> +	}
>> +
>> +	free_page((unsigned long)phys_to_virt(phys));
>> +}
>> +
>>  static void realm_destroy_undelegate_range(struct realm *realm,
>>  					   unsigned long ipa,
>>  					   unsigned long addr,
>> @@ -220,6 +272,30 @@ static int realm_rtt_create(struct realm *realm,
>>  	return rmi_rtt_create(phys, virt_to_phys(realm->rd), addr, level);
>>  }
>>  
>> +static int realm_create_rtt_levels(struct realm *realm,
>> +				   unsigned long ipa,
>> +				   int level,
>> +				   int max_level,
>> +				   struct kvm_mmu_memory_cache *mc)
>> +{
>> +	if (WARN_ON(level == max_level))
>> +		return 0;
>> +
>> +	while (level++ < max_level) {
>> +		phys_addr_t rtt = alloc_delegated_page(realm, mc);
>> +
>> +		if (rtt == PHYS_ADDR_MAX)
>> +			return -ENOMEM;
>> +
>> +		if (realm_rtt_create(realm, ipa, level, rtt)) {
>> +			free_delegated_page(realm, rtt);
>> +			return -ENXIO;
>> +		}
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>  static int realm_tear_down_rtt_range(struct realm *realm, int level,
>>  				     unsigned long start, unsigned long end)
>>  {
>> @@ -309,6 +385,206 @@ void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level)
>>  	realm_tear_down_rtt_range(realm, start_level, 0, (1UL << ia_bits));
>>  }
>>  
>> +void kvm_realm_unmap_range(struct kvm *kvm, unsigned long ipa, u64 size)
>> +{
>> +	u32 ia_bits = kvm->arch.mmu.pgt->ia_bits;
>> +	u32 start_level = kvm->arch.mmu.pgt->start_level;
>> +	unsigned long end = ipa + size;
>> +	struct realm *realm = &kvm->arch.realm;
>> +	phys_addr_t tmp_rtt = PHYS_ADDR_MAX;
>> +
>> +	if (end > (1UL << ia_bits))
>> +		end = 1UL << ia_bits;
>> +	/*
>> +	 * Make sure we have a spare delegated page for tearing down the
>> +	 * block mappings. We must use Atomic allocations as we are called
>> +	 * with kvm->mmu_lock held.
>> +	 */
>> +	if (realm->spare_page == PHYS_ADDR_MAX) {
>> +		tmp_rtt = __alloc_delegated_page(realm, NULL, GFP_ATOMIC);
>> +		/*
>> +		 * We don't have to check the status here, as we may not
>> +		 * have a block level mapping. Delay any error to the point
>> +		 * where we need it.
>> +		 */
>> +		realm->spare_page = tmp_rtt;
>> +	}
>> +
>> +	realm_tear_down_rtt_range(&kvm->arch.realm, start_level, ipa, end);
>> +
>> +	/* Free up the atomic page, if there were any */
>> +	if (tmp_rtt != PHYS_ADDR_MAX) {
>> +		free_delegated_page(realm, tmp_rtt);
>> +		/*
>> +		 * Update the spare_page after we have freed the
>> +		 * above page to make sure it doesn't get cached
>> +		 * in spare_page.
>> +		 * We should re-write this part and always have
>> +		 * a dedicated page for handling block mappings.
>> +		 */
>> +		realm->spare_page = PHYS_ADDR_MAX;
>> +	}
>> +}
>> +
>> +static int set_ipa_state(struct kvm_vcpu *vcpu,
>> +			 unsigned long ipa,
>> +			 unsigned long end,
>> +			 int level,
>> +			 unsigned long ripas)
>> +{
>> +	struct kvm *kvm = vcpu->kvm;
>> +	struct realm *realm = &kvm->arch.realm;
>> +	struct rec *rec = &vcpu->arch.rec;
>> +	phys_addr_t rd_phys = virt_to_phys(realm->rd);
>> +	phys_addr_t rec_phys = virt_to_phys(rec->rec_page);
>> +	unsigned long map_size = rme_rtt_level_mapsize(level);
>> +	int ret;
>> +
>> +	while (ipa < end) {
>> +		ret = rmi_rtt_set_ripas(rd_phys, rec_phys, ipa, level, ripas);
>> +
>> +		if (!ret) {
>> +			if (!ripas)
>> +				kvm_realm_unmap_range(kvm, ipa, map_size);
>> +		} else if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
>> +			int walk_level = RMI_RETURN_INDEX(ret);
>> +
>> +			if (walk_level < level) {
>> +				ret = realm_create_rtt_levels(realm, ipa,
>> +							      walk_level,
>> +							      level, NULL);
>> +				if (ret)
>> +					return ret;
>> +				continue;
>> +			}
>> +
>> +			if (WARN_ON(level >= RME_RTT_MAX_LEVEL))
>> +				return -EINVAL;
>> +
>> +			/* Recurse one level lower */
>> +			ret = set_ipa_state(vcpu, ipa, ipa + map_size,
>> +					    level + 1, ripas);
>> +			if (ret)
>> +				return ret;
>> +		} else {
>> +			WARN(1, "Unexpected error in %s: %#x\n", __func__,
>> +			     ret);
>> +			return -EINVAL;
>> +		}
>> +		ipa += map_size;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int realm_init_ipa_state(struct realm *realm,
>> +				unsigned long ipa,
>> +				unsigned long end,
>> +				int level)
>> +{
>> +	unsigned long map_size = rme_rtt_level_mapsize(level);
>> +	phys_addr_t rd_phys = virt_to_phys(realm->rd);
>> +	int ret;
>> +
>> +	while (ipa < end) {
>> +		ret = rmi_rtt_init_ripas(rd_phys, ipa, level);
>> +
>> +		if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
>> +			int cur_level = RMI_RETURN_INDEX(ret);
>> +
>> +			if (cur_level < level) {
>> +				ret = realm_create_rtt_levels(realm, ipa,
>> +							      cur_level,
>> +							      level, NULL);
>> +				if (ret)
>> +					return ret;
>> +				/* Retry with the RTT levels in place */
>> +				continue;
>> +			}
>> +
>> +			/* There's an entry at a lower level, recurse */
>> +			if (WARN_ON(level >= RME_RTT_MAX_LEVEL))
>> +				return -EINVAL;
>> +
>> +			realm_init_ipa_state(realm, ipa, ipa + map_size,
>> +					     level + 1);
>> +		} else if (WARN_ON(ret)) {
>> +			return -ENXIO;
>> +		}
>> +
>> +		ipa += map_size;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int find_map_level(struct kvm *kvm, unsigned long start, unsigned long end)
>> +{
>> +	int level = RME_RTT_MAX_LEVEL;
>> +
>> +	while (level > get_start_level(kvm) + 1) {
>> +		unsigned long map_size = rme_rtt_level_mapsize(level - 1);
>> +
>> +		if (!IS_ALIGNED(start, map_size) ||
>> +		    (start + map_size) > end)
>> +			break;
>> +
>> +		level--;
>> +	}
>> +
>> +	return level;
>> +}
>> +
>> +int realm_set_ipa_state(struct kvm_vcpu *vcpu,
>> +			unsigned long addr, unsigned long end,
>> +			unsigned long ripas)
>> +{
>> +	int ret = 0;
>> +
>> +	while (addr < end) {
>> +		int level = find_map_level(vcpu->kvm, addr, end);
>> +		unsigned long map_size = rme_rtt_level_mapsize(level);
>> +
>> +		ret = set_ipa_state(vcpu, addr, addr + map_size, level, ripas);
>> +		if (ret)
>> +			break;
>> +
>> +		addr += map_size;
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static int kvm_init_ipa_range_realm(struct kvm *kvm,
>> +				    struct kvm_cap_arm_rme_init_ipa_args *args)
>> +{
>> +	int ret = 0;
>> +	gpa_t addr, end;
>> +	struct realm *realm = &kvm->arch.realm;
>> +
>> +	addr = args->init_ipa_base;
>> +	end = addr + args->init_ipa_size;
>> +
>> +	if (end < addr)
>> +		return -EINVAL;
>> +
>> +	if (kvm_realm_state(kvm) != REALM_STATE_NEW)
>> +		return -EBUSY;
>> +
>> +	while (addr < end) {
>> +		int level = find_map_level(kvm, addr, end);
>> +		unsigned long map_size = rme_rtt_level_mapsize(level);
>> +
>> +		ret = realm_init_ipa_state(realm, addr, addr + map_size, level);
>> +		if (ret)
>> +			break;
>> +
>> +		addr += map_size;
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>>  /* Protects access to rme_vmid_bitmap */
>>  static DEFINE_SPINLOCK(rme_vmid_lock);
>>  static unsigned long *rme_vmid_bitmap;
>> @@ -460,6 +736,18 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>>  
>>  		r = kvm_create_realm(kvm);
>>  		break;
>> +	case KVM_CAP_ARM_RME_INIT_IPA_REALM: {
>> +		struct kvm_cap_arm_rme_init_ipa_args args;
>> +		void __user *argp = u64_to_user_ptr(cap->args[1]);
>> +
>> +		if (copy_from_user(&args, argp, sizeof(args))) {
>> +			r = -EFAULT;
>> +			break;
>> +		}
>> +
>> +		r = kvm_init_ipa_range_realm(kvm, &args);
>> +		break;
>> +	}
>>  	default:
>>  		r = -EINVAL;
>>  		break;
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 04/28] arm64: RME: Check for RME support at KVM init
  2023-02-13 15:59       ` Steven Price
@ 2023-03-04 12:07         ` Zhi Wang
  0 siblings, 0 replies; 190+ messages in thread
From: Zhi Wang @ 2023-03-04 12:07 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Mon, 13 Feb 2023 15:59:05 +0000
Steven Price <steven.price@arm.com> wrote:

> On 13/02/2023 15:48, Zhi Wang wrote:
> > On Fri, 27 Jan 2023 11:29:08 +0000
> > Steven Price <steven.price@arm.com> wrote:
> > 
> >> Query the RMI version number and check if it is a compatible version. A
> >> static key is also provided to signal that a supported RMM is available.
> >>
> >> Functions are provided to query if a VM or VCPU is a realm (or rec)
> >> which currently will always return false.
> >>
> >> Signed-off-by: Steven Price <steven.price@arm.com>
> >> ---
> >>  arch/arm64/include/asm/kvm_emulate.h | 17 ++++++++++
> >>  arch/arm64/include/asm/kvm_host.h    |  4 +++
> >>  arch/arm64/include/asm/kvm_rme.h     | 22 +++++++++++++
> >>  arch/arm64/include/asm/virt.h        |  1 +
> >>  arch/arm64/kvm/Makefile              |  3 +-
> >>  arch/arm64/kvm/arm.c                 |  8 +++++
> >>  arch/arm64/kvm/rme.c                 | 49 ++++++++++++++++++++++++++++
> >>  7 files changed, 103 insertions(+), 1 deletion(-)
> >>  create mode 100644 arch/arm64/include/asm/kvm_rme.h
> >>  create mode 100644 arch/arm64/kvm/rme.c
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> >> index 9bdba47f7e14..5a2b7229e83f 100644
> >> --- a/arch/arm64/include/asm/kvm_emulate.h
> >> +++ b/arch/arm64/include/asm/kvm_emulate.h
> >> @@ -490,4 +490,21 @@ static inline bool vcpu_has_feature(struct kvm_vcpu *vcpu, int feature)
> >>  	return test_bit(feature, vcpu->arch.features);
> >>  }
> >>  
> >> +static inline bool kvm_is_realm(struct kvm *kvm)
> >> +{
> >> +	if (static_branch_unlikely(&kvm_rme_is_available))
> >> +		return kvm->arch.is_realm;
> >> +	return false;
> >> +}
> >> +
> >> +static inline enum realm_state kvm_realm_state(struct kvm *kvm)
> >> +{
> >> +	return READ_ONCE(kvm->arch.realm.state);
> >> +}
> >> +
> >> +static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
> >> +{
> >> +	return false;
> >> +}
> >> +
> >>  #endif /* __ARM64_KVM_EMULATE_H__ */
> >> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> >> index 35a159d131b5..04347c3a8c6b 100644
> >> --- a/arch/arm64/include/asm/kvm_host.h
> >> +++ b/arch/arm64/include/asm/kvm_host.h
> >> @@ -26,6 +26,7 @@
> >>  #include <asm/fpsimd.h>
> >>  #include <asm/kvm.h>
> >>  #include <asm/kvm_asm.h>
> >> +#include <asm/kvm_rme.h>
> >>  
> >>  #define __KVM_HAVE_ARCH_INTC_INITIALIZED
> >>  
> >> @@ -240,6 +241,9 @@ struct kvm_arch {
> >>  	 * the associated pKVM instance in the hypervisor.
> >>  	 */
> >>  	struct kvm_protected_vm pkvm;
> >> +
> >> +	bool is_realm;
> >                ^
> > It would be better to put in more comments; that really helps with the review.
> 
> Thanks for the feedback - I had thought "is realm" was fairly
> self-documenting, but perhaps I've just spent too much time with this code.
> 
> > I was looking for the user of this member to see when it is set. It seems
> > it is not set in this patch. It would have been nice to have a quick answer
> > in the comments.
> 
> The usage is in the kvm_is_realm() function which is used in several of
> the later patches as a way to detect this kvm guest is a realm guest.
> 
> I think the main issue is that I've got the patches in the wrong order.
> Patch 7 "arm64: kvm: Allow passing machine type in KVM creation" should
> probably be before this one, then I could add the assignment of is_realm
> into this patch (potentially splitting out the is_realm parts into
> another patch).
> 

I agree the patch order seems to be a problem here. The name is self-documenting,
but if the user of the variable is not in this patch, a reviewer still needs to
jump to the related patch to confirm the variable is used as expected. In that
situation, a comment would help to avoid jumping between patches (sometimes
finding the user of a variable in a patch bundle really slows down
the review, and eventually you have to open a terminal and check
it in the git tree).

> Thanks,
> 
> Steve
> 



* Re: [RFC PATCH 09/28] arm64: RME: RTT handling
  2023-03-03 14:04       ` Steven Price
@ 2023-03-04 12:32         ` Zhi Wang
  0 siblings, 0 replies; 190+ messages in thread
From: Zhi Wang @ 2023-03-04 12:32 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 3 Mar 2023 14:04:56 +0000
Steven Price <steven.price@arm.com> wrote:

> On 13/02/2023 17:44, Zhi Wang wrote:
> > On Fri, 27 Jan 2023 11:29:13 +0000
> > Steven Price <steven.price@arm.com> wrote:
> > 
> >> The RMM owns the stage 2 page tables for a realm, and KVM must request
> >> that the RMM creates/destroys entries as necessary. The physical pages
> >> to store the page tables are delegated to the realm as required, and can
> >> be undelegated when no longer used.
> >>
> > 
> > This is only an introduction to RTT handling. While this patch is mostly about
> > RTT teardown, it would be better to add more introduction to this patch. Also
> > maybe refine the title to reflect what this patch is actually doing.
> 
> You've a definite point that this patch is mostly about RTT teardown.
> Technically it also adds the RTT creation path (realm_rtt_create) -
> hence the generic patch title.
> 

But realm_rtt_create() seems to be used only in realm_tear_down_rtt_range(). That
makes me wonder where the real RTT creation path is.
 
> But I'll definitely expand the commit message to mention the complexity
> of tear down which is the bulk of the patch.

It is also a good place to explain more about the RTT.

> 
> >> Signed-off-by: Steven Price <steven.price@arm.com>
> >> ---
> >>  arch/arm64/include/asm/kvm_rme.h |  19 +++++
> >>  arch/arm64/kvm/mmu.c             |   7 +-
> >>  arch/arm64/kvm/rme.c             | 139 +++++++++++++++++++++++++++++++
> >>  3 files changed, 162 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> >> index a6318af3ed11..eea5118dfa8a 100644
> >> --- a/arch/arm64/include/asm/kvm_rme.h
> >> +++ b/arch/arm64/include/asm/kvm_rme.h
> >> @@ -35,5 +35,24 @@ u32 kvm_realm_ipa_limit(void);
> >>  int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
> >>  int kvm_init_realm_vm(struct kvm *kvm);
> >>  void kvm_destroy_realm(struct kvm *kvm);
> >> +void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level);
> >> +
> >> +#define RME_RTT_BLOCK_LEVEL	2
> >> +#define RME_RTT_MAX_LEVEL	3
> >> +
> >> +#define RME_PAGE_SHIFT		12
> >> +#define RME_PAGE_SIZE		BIT(RME_PAGE_SHIFT)
> >> +/* See ARM64_HW_PGTABLE_LEVEL_SHIFT() */
> >> +#define RME_RTT_LEVEL_SHIFT(l)	\
> >> +	((RME_PAGE_SHIFT - 3) * (4 - (l)) + 3)
> >> +#define RME_L2_BLOCK_SIZE	BIT(RME_RTT_LEVEL_SHIFT(2))
> >> +
> >> +static inline unsigned long rme_rtt_level_mapsize(int level)
> >> +{
> >> +	if (WARN_ON(level > RME_RTT_MAX_LEVEL))
> >> +		return RME_PAGE_SIZE;
> >> +
> >> +	return (1UL << RME_RTT_LEVEL_SHIFT(level));
> >> +}
> >>  
> >>  #endif
> >> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> >> index 22c00274884a..f29558c5dcbc 100644
> >> --- a/arch/arm64/kvm/mmu.c
> >> +++ b/arch/arm64/kvm/mmu.c
> >> @@ -834,16 +834,17 @@ void stage2_unmap_vm(struct kvm *kvm)
> >>  void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
> >>  {
> >>  	struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
> >> -	struct kvm_pgtable *pgt = NULL;
> >> +	struct kvm_pgtable *pgt;
> >>  
> >>  	write_lock(&kvm->mmu_lock);
> >> +	pgt = mmu->pgt;
> >>  	if (kvm_is_realm(kvm) &&
> >>  	    kvm_realm_state(kvm) != REALM_STATE_DYING) {
> >> -		/* TODO: teardown rtts */
> >>  		write_unlock(&kvm->mmu_lock);
> >> +		kvm_realm_destroy_rtts(&kvm->arch.realm, pgt->ia_bits,
> >> +				       pgt->start_level);
> >>  		return;
> >>  	}
> >> -	pgt = mmu->pgt;
> >>  	if (pgt) {
> >>  		mmu->pgd_phys = 0;
> >>  		mmu->pgt = NULL;
> >> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> >> index 0c9d70e4d9e6..f7b0e5a779f8 100644
> >> --- a/arch/arm64/kvm/rme.c
> >> +++ b/arch/arm64/kvm/rme.c
> >> @@ -73,6 +73,28 @@ static int rmi_check_version(void)
> >>  	return 0;
> >>  }
> >>  
> >> +static void realm_destroy_undelegate_range(struct realm *realm,
> >> +					   unsigned long ipa,
> >> +					   unsigned long addr,
> >> +					   ssize_t size)
> >> +{
> >> +	unsigned long rd = virt_to_phys(realm->rd);
> >> +	int ret;
> >> +
> >> +	while (size > 0) {
> >> +		ret = rmi_data_destroy(rd, ipa);
> >> +		WARN_ON(ret);
> >> +		ret = rmi_granule_undelegate(addr);
> >> +
> > As the return value is not documented, what will happen if a page undelegate
> > fails? Is the page leaked? Some explanation is required here.
> 
> Yes - it's leaked. I'll add a comment to explain the get_page() call.
> 
> Thanks,
> 
> Steve
> 
> >> +		if (ret)
> >> +			get_page(phys_to_page(addr));
> >> +
> >> +		addr += PAGE_SIZE;
> >> +		ipa += PAGE_SIZE;
> >> +		size -= PAGE_SIZE;
> >> +	}
> >> +}
> >> +
> >>  static unsigned long create_realm_feat_reg0(struct kvm *kvm)
> >>  {
> >>  	unsigned long ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
> >> @@ -170,6 +192,123 @@ static int realm_create_rd(struct kvm *kvm)
> >>  	return r;
> >>  }
> >>  
> >> +static int realm_rtt_destroy(struct realm *realm, unsigned long addr,
> >> +			     int level, phys_addr_t rtt_granule)
> >> +{
> >> +	addr = ALIGN_DOWN(addr, rme_rtt_level_mapsize(level - 1));
> >> +	return rmi_rtt_destroy(rtt_granule, virt_to_phys(realm->rd), addr,
> >> +			level);
> >> +}
> >> +
> >> +static int realm_destroy_free_rtt(struct realm *realm, unsigned long addr,
> >> +				  int level, phys_addr_t rtt_granule)
> >> +{
> >> +	if (realm_rtt_destroy(realm, addr, level, rtt_granule))
> >> +		return -ENXIO;
> >> +	if (!WARN_ON(rmi_granule_undelegate(rtt_granule)))
> >> +		put_page(phys_to_page(rtt_granule));
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +static int realm_rtt_create(struct realm *realm,
> >> +			    unsigned long addr,
> >> +			    int level,
> >> +			    phys_addr_t phys)
> >> +{
> >> +	addr = ALIGN_DOWN(addr, rme_rtt_level_mapsize(level - 1));
> >> +	return rmi_rtt_create(phys, virt_to_phys(realm->rd), addr, level);
> >> +}
> >> +
> >> +static int realm_tear_down_rtt_range(struct realm *realm, int level,
> >> +				     unsigned long start, unsigned long end)
> >> +{
> >> +	phys_addr_t rd = virt_to_phys(realm->rd);
> >> +	ssize_t map_size = rme_rtt_level_mapsize(level);
> >> +	unsigned long addr, next_addr;
> >> +	bool failed = false;
> >> +
> >> +	for (addr = start; addr < end; addr = next_addr) {
> >> +		phys_addr_t rtt_addr, tmp_rtt;
> >> +		struct rtt_entry rtt;
> >> +		unsigned long end_addr;
> >> +
> >> +		next_addr = ALIGN(addr + 1, map_size);
> >> +
> >> +		end_addr = min(next_addr, end);
> >> +
> >> +		if (rmi_rtt_read_entry(rd, ALIGN_DOWN(addr, map_size),
> >> +				       level, &rtt)) {
> >> +			failed = true;
> >> +			continue;
> >> +		}
> >> +
> >> +		rtt_addr = rmi_rtt_get_phys(&rtt);
> >> +		WARN_ON(level != rtt.walk_level);
> >> +
> >> +		switch (rtt.state) {
> >> +		case RMI_UNASSIGNED:
> >> +		case RMI_DESTROYED:
> >> +			break;
> >> +		case RMI_TABLE:
> >> +			if (realm_tear_down_rtt_range(realm, level + 1,
> >> +						      addr, end_addr)) {
> >> +				failed = true;
> >> +				break;
> >> +			}
> >> +			if (IS_ALIGNED(addr, map_size) &&
> >> +			    next_addr <= end &&
> >> +			    realm_destroy_free_rtt(realm, addr, level + 1,
> >> +						   rtt_addr))
> >> +				failed = true;
> >> +			break;
> >> +		case RMI_ASSIGNED:
> >> +			WARN_ON(!rtt_addr);
> >> +			/*
> >> +			 * If there is a block mapping, break it now, using the
> >> +			 * spare_page. We are sure to have a valid delegated
> >> +			 * page at spare_page before we enter here, otherwise
> >> +			 * WARN once, which will be followed by further
> >> +			 * warnings.
> >> +			 */
> >> +			tmp_rtt = realm->spare_page;
> >> +			if (level == 2 &&
> >> +			    !WARN_ON_ONCE(tmp_rtt == PHYS_ADDR_MAX) &&
> >> +			    realm_rtt_create(realm, addr,
> >> +					     RME_RTT_MAX_LEVEL, tmp_rtt)) {
> >> +				WARN_ON(1);
> >> +				failed = true;
> >> +				break;
> >> +			}
> >> +			realm_destroy_undelegate_range(realm, addr,
> >> +						       rtt_addr, map_size);
> >> +			/*
> >> +			 * Collapse the last level table and make the spare page
> >> +			 * reusable again.
> >> +			 */
> >> +			if (level == 2 &&
> >> +			    realm_rtt_destroy(realm, addr, RME_RTT_MAX_LEVEL,
> >> +					      tmp_rtt))
> >> +				failed = true;
> >> +			break;
> >> +		case RMI_VALID_NS:
> >> +			WARN_ON(rmi_rtt_unmap_unprotected(rd, addr, level));
> >> +			break;
> >> +		default:
> >> +			WARN_ON(1);
> >> +			failed = true;
> >> +			break;
> >> +		}
> >> +	}
> >> +
> >> +	return failed ? -EINVAL : 0;
> >> +}
> >> +
> >> +void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level)
> >> +{
> >> +	realm_tear_down_rtt_range(realm, start_level, 0, (1UL << ia_bits));
> >> +}
> >> +
> >>  /* Protects access to rme_vmid_bitmap */
> >>  static DEFINE_SPINLOCK(rme_vmid_lock);
> >>  static unsigned long *rme_vmid_bitmap;
> > 
> 



* Re: [RFC PATCH 10/28] arm64: RME: Allocate/free RECs to match vCPUs
  2023-03-03 14:05       ` Steven Price
@ 2023-03-04 12:46         ` Zhi Wang
  0 siblings, 0 replies; 190+ messages in thread
From: Zhi Wang @ 2023-03-04 12:46 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 3 Mar 2023 14:05:02 +0000
Steven Price <steven.price@arm.com> wrote:

> On 13/02/2023 18:08, Zhi Wang wrote:
> > On Fri, 27 Jan 2023 11:29:14 +0000
> > Steven Price <steven.price@arm.com> wrote:
> > 
> >> The RMM maintains a data structure known as the Realm Execution Context
> >> (or REC). It is similar to struct kvm_vcpu and tracks the state of the
> >> virtual CPUs. KVM must delegate memory and request the structures are
> >> created when vCPUs are created, and suitably tear down on destruction.
> >>
> > 
> > It would be better to leave some pointers to the spec here. It really saves
> > time for reviewers.
> 
> Fair enough. I wasn't sure how often to repeat the link to the spec, but
> a few more times wouldn't hurt ;)

Based on my review experience, the right time is when a new concept first
appears on the table.

For example, the concept of a REC appears in this patch series for the first
time, so it would be nice to have the following in the comments:

1) A basic summary of the concept. A few sentences help people understand
what it is, what it is used for, and what/when it is required (mostly this would
be helpful in denoting the interaction with existing flows). That would be good
enough for people who don't have time to dig into the concept itself but want to
review its interaction with a component they are familiar with or an area they
are working on.

2) A simple sentence giving the spec name and chapter would be good enough
for people who would like to dig deeper. It is also a nice opportunity to
educate people who are interested in the details of the concept.

> 
> >> Signed-off-by: Steven Price <steven.price@arm.com>
> >> ---
> >>  arch/arm64/include/asm/kvm_emulate.h |   2 +
> >>  arch/arm64/include/asm/kvm_host.h    |   3 +
> >>  arch/arm64/include/asm/kvm_rme.h     |  10 ++
> >>  arch/arm64/kvm/arm.c                 |   1 +
> >>  arch/arm64/kvm/reset.c               |  11 ++
> >>  arch/arm64/kvm/rme.c                 | 144 +++++++++++++++++++++++++++
> >>  6 files changed, 171 insertions(+)
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> >> index 5a2b7229e83f..285e62914ca4 100644
> >> --- a/arch/arm64/include/asm/kvm_emulate.h
> >> +++ b/arch/arm64/include/asm/kvm_emulate.h
> >> @@ -504,6 +504,8 @@ static inline enum realm_state kvm_realm_state(struct kvm *kvm)
> >>  
> >>  static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
> >>  {
> >> +	if (static_branch_unlikely(&kvm_rme_is_available))
> >> +		return vcpu->arch.rec.mpidr != INVALID_HWID;
> >>  	return false;
> >>  }
> >>  
> >> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> >> index 04347c3a8c6b..ef497b718cdb 100644
> >> --- a/arch/arm64/include/asm/kvm_host.h
> >> +++ b/arch/arm64/include/asm/kvm_host.h
> >> @@ -505,6 +505,9 @@ struct kvm_vcpu_arch {
> >>  		u64 last_steal;
> >>  		gpa_t base;
> >>  	} steal;
> >> +
> >> +	/* Realm meta data */
> >> +	struct rec rec;
> > 
> > I think the name of the data structure "rec" needs a prefix; it is too common
> > and might conflict with private data structures in other modules. Maybe
> > rme_rec or realm_rec?
> 
> struct realm_rec seems like a good choice. I agree 'rec' without context
> is somewhat ambiguous.
> 
> >>  };
> >>  
> >>  /*
> >> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> >> index eea5118dfa8a..4b219ebe1400 100644
> >> --- a/arch/arm64/include/asm/kvm_rme.h
> >> +++ b/arch/arm64/include/asm/kvm_rme.h
> >> @@ -6,6 +6,7 @@
> >>  #ifndef __ASM_KVM_RME_H
> >>  #define __ASM_KVM_RME_H
> >>  
> >> +#include <asm/rmi_smc.h>
> >>  #include <uapi/linux/kvm.h>
> >>  
> >>  enum realm_state {
> >> @@ -29,6 +30,13 @@ struct realm {
> >>  	unsigned int ia_bits;
> >>  };
> >>  
> >> +struct rec {
> >> +	unsigned long mpidr;
> >> +	void *rec_page;
> >> +	struct page *aux_pages[REC_PARAMS_AUX_GRANULES];
> >> +	struct rec_run *run;
> >> +};
> >> +
> > 
> > It is better to leave some comments for the above members, or pointers to the
> > spec; that saves a lot of review time.
> 
> Will add comments.
> 
> >>  int kvm_init_rme(void);
> >>  u32 kvm_realm_ipa_limit(void);
> >>  
> >> @@ -36,6 +44,8 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
> >>  int kvm_init_realm_vm(struct kvm *kvm);
> >>  void kvm_destroy_realm(struct kvm *kvm);
> >>  void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level);
> >> +int kvm_create_rec(struct kvm_vcpu *vcpu);
> >> +void kvm_destroy_rec(struct kvm_vcpu *vcpu);
> >>  
> >>  #define RME_RTT_BLOCK_LEVEL	2
> >>  #define RME_RTT_MAX_LEVEL	3
> >> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> >> index badd775547b8..52affed2f3cf 100644
> >> --- a/arch/arm64/kvm/arm.c
> >> +++ b/arch/arm64/kvm/arm.c
> >> @@ -373,6 +373,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
> >>  	/* Force users to call KVM_ARM_VCPU_INIT */
> >>  	vcpu->arch.target = -1;
> >>  	bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
> >> +	vcpu->arch.rec.mpidr = INVALID_HWID;
> >>  
> >>  	vcpu->arch.mmu_page_cache.gfp_zero = __GFP_ZERO;
> >>  
> >> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> >> index 9e71d69e051f..0c84392a4bf2 100644
> >> --- a/arch/arm64/kvm/reset.c
> >> +++ b/arch/arm64/kvm/reset.c
> >> @@ -135,6 +135,11 @@ int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature)
> >>  			return -EPERM;
> >>  
> >>  		return kvm_vcpu_finalize_sve(vcpu);
> >> +	case KVM_ARM_VCPU_REC:
> >> +		if (!kvm_is_realm(vcpu->kvm))
> >> +			return -EINVAL;
> >> +
> >> +		return kvm_create_rec(vcpu);
> >>  	}
> >>  
> >>  	return -EINVAL;
> >> @@ -145,6 +150,11 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu)
> >>  	if (vcpu_has_sve(vcpu) && !kvm_arm_vcpu_sve_finalized(vcpu))
> >>  		return false;
> >>  
> >> +	if (kvm_is_realm(vcpu->kvm) &&
> >> +	    !(vcpu_is_rec(vcpu) &&
> >> +	      READ_ONCE(vcpu->kvm->arch.realm.state) == REALM_STATE_ACTIVE))
> >> +		return false;
> > 
> > That's why it is better to introduce the realm state in the previous patches so
> > that people can really get the idea of the states at this stage.
> > 
> >> +
> >>  	return true;
> >>  }
> >>  
> >> @@ -157,6 +167,7 @@ void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu)
> >>  	if (sve_state)
> >>  		kvm_unshare_hyp(sve_state, sve_state + vcpu_sve_state_size(vcpu));
> >>  	kfree(sve_state);
> >> +	kvm_destroy_rec(vcpu);
> >>  }
> >>  
> >>  static void kvm_vcpu_reset_sve(struct kvm_vcpu *vcpu)
> >> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> >> index f7b0e5a779f8..d79ed889ca4d 100644
> >> --- a/arch/arm64/kvm/rme.c
> >> +++ b/arch/arm64/kvm/rme.c
> >> @@ -514,6 +514,150 @@ void kvm_destroy_realm(struct kvm *kvm)
> >>  	kvm_free_stage2_pgd(&kvm->arch.mmu);
> >>  }
> >>  
> >> +static void free_rec_aux(struct page **aux_pages,
> >> +			 unsigned int num_aux)
> >> +{
> >> +	unsigned int i;
> >> +
> >> +	for (i = 0; i < num_aux; i++) {
> >> +		phys_addr_t aux_page_phys = page_to_phys(aux_pages[i]);
> >> +
> >> +		if (WARN_ON(rmi_granule_undelegate(aux_page_phys)))
> >> +			continue;
> >> +
> >> +		__free_page(aux_pages[i]);
> >> +	}
> >> +}
> >> +
> >> +static int alloc_rec_aux(struct page **aux_pages,
> >> +			 u64 *aux_phys_pages,
> >> +			 unsigned int num_aux)
> >> +{
> >> +	int ret;
> >> +	unsigned int i;
> >> +
> >> +	for (i = 0; i < num_aux; i++) {
> >> +		struct page *aux_page;
> >> +		phys_addr_t aux_page_phys;
> >> +
> >> +		aux_page = alloc_page(GFP_KERNEL);
> >> +		if (!aux_page) {
> >> +			ret = -ENOMEM;
> >> +			goto out_err;
> >> +		}
> >> +		aux_page_phys = page_to_phys(aux_page);
> >> +		if (rmi_granule_delegate(aux_page_phys)) {
> >> +			__free_page(aux_page);
> >> +			ret = -ENXIO;
> >> +			goto out_err;
> >> +		}
> >> +		aux_pages[i] = aux_page;
> >> +		aux_phys_pages[i] = aux_page_phys;
> >> +	}
> >> +
> >> +	return 0;
> >> +out_err:
> >> +	free_rec_aux(aux_pages, i);
> >> +	return ret;
> >> +}
> >> +
> >> +int kvm_create_rec(struct kvm_vcpu *vcpu)
> >> +{
> >> +	struct user_pt_regs *vcpu_regs = vcpu_gp_regs(vcpu);
> >> +	unsigned long mpidr = kvm_vcpu_get_mpidr_aff(vcpu);
> >> +	struct realm *realm = &vcpu->kvm->arch.realm;
> >> +	struct rec *rec = &vcpu->arch.rec;
> >> +	unsigned long rec_page_phys;
> >> +	struct rec_params *params;
> >> +	int r, i;
> >> +
> >> +	if (kvm_realm_state(vcpu->kvm) != REALM_STATE_NEW)
> >> +		return -ENOENT;
> >> +
> >> +	/*
> >> +	 * The RMM will report PSCI v1.0 to Realms and the KVM_ARM_VCPU_PSCI_0_2
> >> +	 * flag covers v0.2 and onwards.
> >> +	 */
> >> +	if (!test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
> >> +		return -EINVAL;
> >> +
> >> +	BUILD_BUG_ON(sizeof(*params) > PAGE_SIZE);
> >> +	BUILD_BUG_ON(sizeof(*rec->run) > PAGE_SIZE);
> >> +
> >> +	params = (struct rec_params *)get_zeroed_page(GFP_KERNEL);
> >> +	rec->rec_page = (void *)__get_free_page(GFP_KERNEL);
> >> +	rec->run = (void *)get_zeroed_page(GFP_KERNEL);
> >> +	if (!params || !rec->rec_page || !rec->run) {
> >> +		r = -ENOMEM;
> >> +		goto out_free_pages;
> >> +	}
> >> +
> >> +	for (i = 0; i < ARRAY_SIZE(params->gprs); i++)
> >> +		params->gprs[i] = vcpu_regs->regs[i];
> >> +
> >> +	params->pc = vcpu_regs->pc;
> >> +
> >> +	if (vcpu->vcpu_id == 0)
> >> +		params->flags |= REC_PARAMS_FLAG_RUNNABLE;
> >> +
> >> +	rec_page_phys = virt_to_phys(rec->rec_page);
> >> +
> >> +	if (rmi_granule_delegate(rec_page_phys)) {
> >> +		r = -ENXIO;
> >> +		goto out_free_pages;
> >> +	}
> >> +
> > 
> > Wouldn't it be better to extend alloc_rec_aux() to allocate and delegate the
> > pages above, so that you can save some allocations and
> > rmi_granule_delegate() calls?
> 
> I don't think it's really an improvement. There's only the one
> rmi_granule_delegate() call (for the REC page itself). The RecParams and
> RecRun pages are not delegated because they are shared with the host. It
> would also hide the structure setup within the new
> alloc_rec_aux_and_rec() function.
> 

That should make it clearer.

I was thinking it would be nice to abstract "allocate + delegate" into a common
function, so that alloc_rec_aux() and kvm_create_rec() could both use it.

> >> +	r = alloc_rec_aux(rec->aux_pages, params->aux, realm->num_aux);
> >> +	if (r)
> >> +		goto out_undelegate_rmm_rec;
> >> +
> >> +	params->num_rec_aux = realm->num_aux;
> >> +	params->mpidr = mpidr;
> >> +
> >> +	if (rmi_rec_create(rec_page_phys,
> >> +			   virt_to_phys(realm->rd),
> >> +			   virt_to_phys(params))) {
> >> +		r = -ENXIO;
> >> +		goto out_free_rec_aux;
> >> +	}
> >> +
> >> +	rec->mpidr = mpidr;
> >> +
> >> +	free_page((unsigned long)params);
> >> +	return 0;
> >> +
> >> +out_free_rec_aux:
> >> +	free_rec_aux(rec->aux_pages, realm->num_aux);
> >> +out_undelegate_rmm_rec:
> >> +	if (WARN_ON(rmi_granule_undelegate(rec_page_phys)))
> >> +		rec->rec_page = NULL;
> >> +out_free_pages:
> >> +	free_page((unsigned long)rec->run);
> >> +	free_page((unsigned long)rec->rec_page);
> >> +	free_page((unsigned long)params);
> >> +	return r;
> >> +}
> >> +
> >> +void kvm_destroy_rec(struct kvm_vcpu *vcpu)
> >> +{
> >> +	struct realm *realm = &vcpu->kvm->arch.realm;
> >> +	struct rec *rec = &vcpu->arch.rec;
> >> +	unsigned long rec_page_phys;
> >> +
> >> +	if (!vcpu_is_rec(vcpu))
> >> +		return;
> >> +
> >> +	rec_page_phys = virt_to_phys(rec->rec_page);
> >> +
> >> +	if (WARN_ON(rmi_rec_destroy(rec_page_phys)))
> >> +		return;
> >> +	if (WARN_ON(rmi_granule_undelegate(rec_page_phys)))
> >> +		return;
> >> +
> > 
> > The two returns above feel off. What is the reason to skip the page
> > undelegates below?
> 
> The reason is the usual one: if we fail to undelegate then the pages have to
> be leaked. I'll add some comments. However, it does look like I've got
> the order wrong here: if the undelegate fails for rec_page_phys it's
> possible that we might still be able to free the rec_aux pages (although
> something has gone terribly wrong for that to be the case).
> 
> I'll change the order to:
> 
>   /* If the REC destroy fails, leak all pages relating to the REC */
>   if (WARN_ON(rmi_rec_destroy(rec_page_phys)))
> 	return;
> 
>   free_rec_aux(rec->aux_pages, realm->num_aux);
> 
>   /* If the undelegate fails then leak the REC page */
>   if (WARN_ON(rmi_granule_undelegate(rec_page_phys)))
> 	return;
> 
>   free_page((unsigned long)rec->rec_page);
> 
> If the rmi_rec_destroy() call has failed then the RMM should prevent the
> undelegate so there's little point trying.
> 
> Steve
> 
> >> +	free_rec_aux(rec->aux_pages, realm->num_aux);
> >> +	free_page((unsigned long)rec->rec_page);
> >> +}
> >> +
> >>  int kvm_init_realm_vm(struct kvm *kvm)
> >>  {
> >>  	struct realm_params *params;
> > 
> 



* Re: [RFC PATCH 15/28] KVM: arm64: Handle realm MMIO emulation
  2023-01-27 11:29   ` [RFC PATCH 15/28] KVM: arm64: Handle realm MMIO emulation Steven Price
@ 2023-03-06 15:37     ` Zhi Wang
  2023-03-10 15:47       ` Steven Price
  0 siblings, 1 reply; 190+ messages in thread
From: Zhi Wang @ 2023-03-06 15:37 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 27 Jan 2023 11:29:19 +0000
Steven Price <steven.price@arm.com> wrote:

> MMIO emulation for a realm cannot be done directly with the VM's
> registers as they are protected from the host. However the RMM interface
> provides a structure member for providing the read/written value and

More details would help the review. I can only see that the emulated MMIO value
from the device model (kvmtool or kvm_io_bus) is put into GPRS[0] of the
RecEntry object, but the rest of the flow is missing.

I guess the RMM copies the value in RecEntry.GPRS[0] to the target GPR in the
guest context in RMI_REC_ENTER when seeing RMI_EMULATED_MMIO. This is the
guest MMIO read path.

How about the MMIO write path? I don't see where RecExit.GPRS[0] is loaded
into a variable and returned to userspace.

> we can transfer this to the appropriate VCPU's register entry and then
> depend on the generic MMIO handling code in KVM.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/kvm/mmio.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
> index 3dd38a151d2a..c4879fa3a8d3 100644
> --- a/arch/arm64/kvm/mmio.c
> +++ b/arch/arm64/kvm/mmio.c
> @@ -6,6 +6,7 @@
>  
>  #include <linux/kvm_host.h>
>  #include <asm/kvm_emulate.h>
> +#include <asm/rmi_smc.h>
>  #include <trace/events/kvm.h>
>  
>  #include "trace.h"
> @@ -109,6 +110,9 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
>  			       &data);
>  		data = vcpu_data_host_to_guest(vcpu, data, len);
>  		vcpu_set_reg(vcpu, kvm_vcpu_dabt_get_rd(vcpu), data);
> +
> +		if (vcpu_is_rec(vcpu))
> +			vcpu->arch.rec.run->entry.gprs[0] = data;

I think the guest context is maintained by the RMM (while KVM can only touch
the Rec{Entry, Exit} objects), so the guest context in the legacy VHE mode is
unused.

If yes, I guess it should be:

if (unlikely(vcpu_is_rec(vcpu)))
	vcpu->arch.rec.run->entry.gprs[0] = data;
else
	vcpu_set_reg(vcpu, kvm_vcpu_dabt_get_rd(vcpu), data);

>  	}
>  
>  	/*
> @@ -179,6 +183,9 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>  	run->mmio.len		= len;
>  	vcpu->mmio_needed	= 1;
>  
> +	if (vcpu_is_rec(vcpu))
> +		vcpu->arch.rec.run->entry.flags |= RMI_EMULATED_MMIO;
> +

Wouldn't it be better to set this in kvm_handle_mmio_return(), where the MMIO
read emulation has definitely succeeded?

>  	if (!ret) {
>  		/* We handled the access successfully in the kernel. */
>  		if (!is_write)



* Re: [RFC PATCH 16/28] arm64: RME: Allow populating initial contents
  2023-01-27 11:29   ` [RFC PATCH 16/28] arm64: RME: Allow populating initial contents Steven Price
@ 2023-03-06 17:34     ` Zhi Wang
  2023-03-10 15:47       ` Steven Price
  0 siblings, 1 reply; 190+ messages in thread
From: Zhi Wang @ 2023-03-06 17:34 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 27 Jan 2023 11:29:20 +0000
Steven Price <steven.price@arm.com> wrote:

> The VMM needs to populate the realm with some data before starting (e.g.
> a kernel and initrd). This is measured by the RMM and used as part of
> the attestation later on.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/kvm/rme.c | 366 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 366 insertions(+)
> 
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> index 16e0bfea98b1..3405b43e1421 100644
> --- a/arch/arm64/kvm/rme.c
> +++ b/arch/arm64/kvm/rme.c
> @@ -4,6 +4,7 @@
>   */
>  
>  #include <linux/kvm_host.h>
> +#include <linux/hugetlb.h>
>  
>  #include <asm/kvm_emulate.h>
>  #include <asm/kvm_mmu.h>
> @@ -426,6 +427,359 @@ void kvm_realm_unmap_range(struct kvm *kvm, unsigned long ipa, u64 size)
>  	}
>  }
>  
> +static int realm_create_protected_data_page(struct realm *realm,
> +					    unsigned long ipa,
> +					    struct page *dst_page,
> +					    struct page *tmp_page)
> +{
> +	phys_addr_t dst_phys, tmp_phys;
> +	int ret;
> +
> +	copy_page(page_address(tmp_page), page_address(dst_page));
> +
> +	dst_phys = page_to_phys(dst_page);
> +	tmp_phys = page_to_phys(tmp_page);
> +
> +	if (rmi_granule_delegate(dst_phys))
> +		return -ENXIO;
> +
> +	ret = rmi_data_create(dst_phys, virt_to_phys(realm->rd), ipa, tmp_phys,
> +			      RMI_MEASURE_CONTENT);
> +
> +	if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> +		/* Create missing RTTs and retry */
> +		int level = RMI_RETURN_INDEX(ret);
> +
> +		ret = realm_create_rtt_levels(realm, ipa, level,
> +					      RME_RTT_MAX_LEVEL, NULL);
> +		if (ret)
> +			goto err;
> +
> +		ret = rmi_data_create(dst_phys, virt_to_phys(realm->rd), ipa,
> +				      tmp_phys, RMI_MEASURE_CONTENT);
> +	}
> +
> +	if (ret)
> +		goto err;
> +
> +	return 0;
> +
> +err:
> +	if (WARN_ON(rmi_granule_undelegate(dst_phys))) {
> +		/* Page can't be returned to NS world so is lost */
> +		get_page(dst_page);
> +	}
> +	return -ENXIO;
> +}
> +
> +static int fold_rtt(phys_addr_t rd, unsigned long addr, int level,
> +		    struct realm *realm)
> +{
> +	struct rtt_entry rtt;
> +	phys_addr_t rtt_addr;
> +
> +	if (rmi_rtt_read_entry(rd, addr, level, &rtt))
> +		return -ENXIO;
> +
> +	if (rtt.state != RMI_TABLE)
> +		return -EINVAL;
> +
> +	rtt_addr = rmi_rtt_get_phys(&rtt);
> +	if (rmi_rtt_fold(rtt_addr, rd, addr, level + 1))
> +		return -ENXIO;
> +
> +	free_delegated_page(realm, rtt_addr);
> +
> +	return 0;
> +}
> +
> +int realm_map_protected(struct realm *realm,
> +			unsigned long hva,
> +			unsigned long base_ipa,
> +			struct page *dst_page,
> +			unsigned long map_size,
> +			struct kvm_mmu_memory_cache *memcache)
> +{
> +	phys_addr_t dst_phys = page_to_phys(dst_page);
> +	phys_addr_t rd = virt_to_phys(realm->rd);
> +	unsigned long phys = dst_phys;
> +	unsigned long ipa = base_ipa;
> +	unsigned long size;
> +	int map_level;
> +	int ret = 0;
> +
> +	if (WARN_ON(!IS_ALIGNED(ipa, map_size)))
> +		return -EINVAL;
> +
> +	switch (map_size) {
> +	case PAGE_SIZE:
> +		map_level = 3;
> +		break;
> +	case RME_L2_BLOCK_SIZE:
> +		map_level = 2;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	if (map_level < RME_RTT_MAX_LEVEL) {
> +		/*
> +		 * A temporary RTT is needed during the map, precreate it,
> +		 * however if there is an error (e.g. missing parent tables)
> +		 * this will be handled below.
> +		 */
> +		realm_create_rtt_levels(realm, ipa, map_level,
> +					RME_RTT_MAX_LEVEL, memcache);
> +	}
> +
> +	for (size = 0; size < map_size; size += PAGE_SIZE) {
> +		if (rmi_granule_delegate(phys)) {
> +			struct rtt_entry rtt;
> +
> +			/*
> +			 * It's possible we raced with another VCPU on the same
> +			 * fault. If the entry exists and matches then exit
> +			 * early and assume the other VCPU will handle the
> +			 * mapping.
> +			 */
> +			if (rmi_rtt_read_entry(rd, ipa, RME_RTT_MAX_LEVEL, &rtt))
> +				goto err;
> +
> +			// FIXME: For a block mapping this could race at level
> +			// 2 or 3...
> +			if (WARN_ON((rtt.walk_level != RME_RTT_MAX_LEVEL ||
> +				     rtt.state != RMI_ASSIGNED ||
> +				     rtt.desc != phys))) {
> +				goto err;
> +			}
> +
> +			return 0;
> +		}
> +
> +		ret = rmi_data_create_unknown(phys, rd, ipa);
> +
> +		if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> +			/* Create missing RTTs and retry */
> +			int level = RMI_RETURN_INDEX(ret);
> +
> +			ret = realm_create_rtt_levels(realm, ipa, level,
> +						      RME_RTT_MAX_LEVEL,
> +						      memcache);
> +			WARN_ON(ret);
> +			if (ret)
> +				goto err_undelegate;
> +
> +			ret = rmi_data_create_unknown(phys, rd, ipa);
> +		}
> +		WARN_ON(ret);
> +
> +		if (ret)
> +			goto err_undelegate;
> +
> +		phys += PAGE_SIZE;
> +		ipa += PAGE_SIZE;
> +	}
> +
> +	if (map_size == RME_L2_BLOCK_SIZE)
> +		ret = fold_rtt(rd, base_ipa, map_level, realm);
> +	if (WARN_ON(ret))
> +		goto err;
> +
> +	return 0;
> +
> +err_undelegate:
> +	if (WARN_ON(rmi_granule_undelegate(phys))) {
> +		/* Page can't be returned to NS world so is lost */
> +		get_page(phys_to_page(phys));
> +	}
> +err:
> +	while (size > 0) {
> +		phys -= PAGE_SIZE;
> +		size -= PAGE_SIZE;
> +		ipa -= PAGE_SIZE;
> +
> +		rmi_data_destroy(rd, ipa);
> +
> +		if (WARN_ON(rmi_granule_undelegate(phys))) {
> +			/* Page can't be returned to NS world so is lost */
> +			get_page(phys_to_page(phys));
> +		}
> +	}
> +	return -ENXIO;
> +}
> +

There seems to be no caller of the function above. It would be better to move
it to the patch that introduces its caller.

> +static int populate_par_region(struct kvm *kvm,
> +			       phys_addr_t ipa_base,
> +			       phys_addr_t ipa_end)
> +{
> +	struct realm *realm = &kvm->arch.realm;
> +	struct kvm_memory_slot *memslot;
> +	gfn_t base_gfn, end_gfn;
> +	int idx;
> +	phys_addr_t ipa;
> +	int ret = 0;
> +	struct page *tmp_page;
> +	phys_addr_t rd = virt_to_phys(realm->rd);
> +
> +	base_gfn = gpa_to_gfn(ipa_base);
> +	end_gfn = gpa_to_gfn(ipa_end);
> +
> +	idx = srcu_read_lock(&kvm->srcu);
> +	memslot = gfn_to_memslot(kvm, base_gfn);
> +	if (!memslot) {
> +		ret = -EFAULT;
> +		goto out;
> +	}
> +
> +	/* We require the region to be contained within a single memslot */
> +	if (memslot->base_gfn + memslot->npages < end_gfn) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
> +	tmp_page = alloc_page(GFP_KERNEL);
> +	if (!tmp_page) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +
> +	mmap_read_lock(current->mm);
> +
> +	ipa = ipa_base;
> +
> +	while (ipa < ipa_end) {
> +		struct vm_area_struct *vma;
> +		unsigned long map_size;
> +		unsigned int vma_shift;
> +		unsigned long offset;
> +		unsigned long hva;
> +		struct page *page;
> +		kvm_pfn_t pfn;
> +		int level;
> +
> +		hva = gfn_to_hva_memslot(memslot, gpa_to_gfn(ipa));
> +		vma = vma_lookup(current->mm, hva);
> +		if (!vma) {
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		if (is_vm_hugetlb_page(vma))
> +			vma_shift = huge_page_shift(hstate_vma(vma));
> +		else
> +			vma_shift = PAGE_SHIFT;
> +
> +		map_size = 1 << vma_shift;
> +
> +		/*
> +		 * FIXME: This causes over mapping, but there's no good
> +		 * solution here with the ABI as it stands
> +		 */
> +		ipa = ALIGN_DOWN(ipa, map_size);
> +
> +		switch (map_size) {
> +		case RME_L2_BLOCK_SIZE:
> +			level = 2;
> +			break;
> +		case PAGE_SIZE:
> +			level = 3;
> +			break;
> +		default:
> +			WARN_ONCE(1, "Unsupport vma_shift %d", vma_shift);
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		pfn = gfn_to_pfn_memslot(memslot, gpa_to_gfn(ipa));
> +
> +		if (is_error_pfn(pfn)) {
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		ret = rmi_rtt_init_ripas(rd, ipa, level);
> +		if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> +			ret = realm_create_rtt_levels(realm, ipa,
> +						      RMI_RETURN_INDEX(ret),
> +						      level, NULL);
> +			if (ret)
> +				break;
> +			ret = rmi_rtt_init_ripas(rd, ipa, level);
> +			if (ret) {
> +				ret = -ENXIO;
> +				break;
> +			}
> +		}
> +
> +		if (level < RME_RTT_MAX_LEVEL) {
> +			/*
> +			 * A temporary RTT is needed during the map, precreate
> +			 * it, however if there is an error (e.g. missing
> +			 * parent tables) this will be handled in the
> +			 * realm_create_protected_data_page() call.
> +			 */
> +			realm_create_rtt_levels(realm, ipa, level,
> +						RME_RTT_MAX_LEVEL, NULL);
> +		}
> +
> +		page = pfn_to_page(pfn);
> +
> +		for (offset = 0; offset < map_size && !ret;
> +		     offset += PAGE_SIZE, page++) {
> +			phys_addr_t page_ipa = ipa + offset;
> +
> +			ret = realm_create_protected_data_page(realm, page_ipa,
> +							       page, tmp_page);
> +		}
> +		if (ret)
> +			goto err_release_pfn;
> +
> +		if (level == 2) {
> +			ret = fold_rtt(rd, ipa, level, realm);
> +			if (ret)
> +				goto err_release_pfn;
> +		}
> +
> +		ipa += map_size;

> +		kvm_set_pfn_accessed(pfn);
> +		kvm_set_pfn_dirty(pfn);

kvm_release_pfn_dirty() already calls kvm_set_pfn_{accessed, dirty}(), so the
two calls above are redundant.

> +		kvm_release_pfn_dirty(pfn);
> +err_release_pfn:
> +		if (ret) {
> +			kvm_release_pfn_clean(pfn);
> +			break;
> +		}
> +	}
> +
> +	mmap_read_unlock(current->mm);
> +	__free_page(tmp_page);
> +
> +out:
> +	srcu_read_unlock(&kvm->srcu, idx);
> +	return ret;
> +}
> +
> +static int kvm_populate_realm(struct kvm *kvm,
> +			      struct kvm_cap_arm_rme_populate_realm_args *args)
> +{
> +	phys_addr_t ipa_base, ipa_end;
> +

Check kvm_is_realm(kvm) here or in kvm_realm_enable_cap().

> +	if (kvm_realm_state(kvm) != REALM_STATE_NEW)
> +		return -EBUSY;

Maybe -EINVAL? The realm hasn't been created yet (RMI_REALM_CREATE has not
been called), so userspace shouldn't reach this path.

> +
> +	if (!IS_ALIGNED(args->populate_ipa_base, PAGE_SIZE) ||
> +	    !IS_ALIGNED(args->populate_ipa_size, PAGE_SIZE))
> +		return -EINVAL;
> +
> +	ipa_base = args->populate_ipa_base;
> +	ipa_end = ipa_base + args->populate_ipa_size;
> +
> +	if (ipa_end < ipa_base)
> +		return -EINVAL;
> +
> +	return populate_par_region(kvm, ipa_base, ipa_end);
> +}
> +
>  static int set_ipa_state(struct kvm_vcpu *vcpu,
>  			 unsigned long ipa,
>  			 unsigned long end,
> @@ -748,6 +1102,18 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>  		r = kvm_init_ipa_range_realm(kvm, &args);
>  		break;
>  	}
> +	case KVM_CAP_ARM_RME_POPULATE_REALM: {
> +		struct kvm_cap_arm_rme_populate_realm_args args;
> +		void __user *argp = u64_to_user_ptr(cap->args[1]);
> +
> +		if (copy_from_user(&args, argp, sizeof(args))) {
> +			r = -EFAULT;
> +			break;
> +		}
> +
> +		r = kvm_populate_realm(kvm, &args);
> +		break;
> +	}
>  	default:
>  		r = -EINVAL;
>  		break;


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 17/28] arm64: RME: Runtime faulting of memory
  2023-01-27 11:29   ` [RFC PATCH 17/28] arm64: RME: Runtime faulting of memory Steven Price
@ 2023-03-06 18:20     ` Zhi Wang
  2023-03-10 15:47       ` Steven Price
  0 siblings, 1 reply; 190+ messages in thread
From: Zhi Wang @ 2023-03-06 18:20 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 27 Jan 2023 11:29:21 +0000
Steven Price <steven.price@arm.com> wrote:

> At runtime if the realm guest accesses memory which hasn't yet been
> mapped then KVM needs to either populate the region or fault the guest.
> 
> For memory in the lower (protected) region of IPA a fresh page is
> provided to the RMM which will zero the contents. For memory in the
> upper (shared) region of IPA, the memory from the memslot is mapped
> into the realm VM non secure.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/kvm_emulate.h | 10 +++++
>  arch/arm64/include/asm/kvm_rme.h     | 12 ++++++
>  arch/arm64/kvm/mmu.c                 | 64 +++++++++++++++++++++++++---
>  arch/arm64/kvm/rme.c                 | 48 +++++++++++++++++++++
>  4 files changed, 128 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 285e62914ca4..3a71b3d2e10a 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -502,6 +502,16 @@ static inline enum realm_state kvm_realm_state(struct kvm *kvm)
>  	return READ_ONCE(kvm->arch.realm.state);
>  }
>  
> +static inline gpa_t kvm_gpa_stolen_bits(struct kvm *kvm)
> +{
> +	if (kvm_is_realm(kvm)) {
> +		struct realm *realm = &kvm->arch.realm;
> +
> +		return BIT(realm->ia_bits - 1);
> +	}
> +	return 0;
> +}
> +

"stolen" seems a little bit vague. Maybe "shared" bit would be better as
SEV-SNP has the C-bit and TDX has the shared bit; it would be nice to align
with the established terminology.

Also, it would be nice to rename gpa_stolen_mask accordingly.

>  static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
>  {
>  	if (static_branch_unlikely(&kvm_rme_is_available))
> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> index 9d1583c44a99..303e4a5e5704 100644
> --- a/arch/arm64/include/asm/kvm_rme.h
> +++ b/arch/arm64/include/asm/kvm_rme.h
> @@ -50,6 +50,18 @@ void kvm_destroy_rec(struct kvm_vcpu *vcpu);
>  int kvm_rec_enter(struct kvm_vcpu *vcpu);
>  int handle_rme_exit(struct kvm_vcpu *vcpu, int rec_run_status);
>  
> +void kvm_realm_unmap_range(struct kvm *kvm, unsigned long ipa, u64 size);
> +int realm_map_protected(struct realm *realm,
> +			unsigned long hva,
> +			unsigned long base_ipa,
> +			struct page *dst_page,
> +			unsigned long map_size,
> +			struct kvm_mmu_memory_cache *memcache);
> +int realm_map_non_secure(struct realm *realm,
> +			 unsigned long ipa,
> +			 struct page *page,
> +			 unsigned long map_size,
> +			 struct kvm_mmu_memory_cache *memcache);
>  int realm_set_ipa_state(struct kvm_vcpu *vcpu,
>  			unsigned long addr, unsigned long end,
>  			unsigned long ripas);
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index f29558c5dcbc..5417c273861b 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -235,8 +235,13 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64
>  
>  	lockdep_assert_held_write(&kvm->mmu_lock);
>  	WARN_ON(size & ~PAGE_MASK);
> -	WARN_ON(stage2_apply_range(kvm, start, end, kvm_pgtable_stage2_unmap,
> -				   may_block));
> +
> +	if (kvm_is_realm(kvm))
> +		kvm_realm_unmap_range(kvm, start, size);
> +	else
> +		WARN_ON(stage2_apply_range(kvm, start, end,
> +					   kvm_pgtable_stage2_unmap,
> +					   may_block));
>  }
>  
>  static void unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size)
> @@ -250,7 +255,11 @@ static void stage2_flush_memslot(struct kvm *kvm,
>  	phys_addr_t addr = memslot->base_gfn << PAGE_SHIFT;
>  	phys_addr_t end = addr + PAGE_SIZE * memslot->npages;
>  
> -	stage2_apply_range_resched(kvm, addr, end, kvm_pgtable_stage2_flush);
> +	if (kvm_is_realm(kvm))
> +		kvm_realm_unmap_range(kvm, addr, end - addr);
> +	else
> +		stage2_apply_range_resched(kvm, addr, end,
> +					   kvm_pgtable_stage2_flush);
>  }
>  
>  /**
> @@ -818,6 +827,10 @@ void stage2_unmap_vm(struct kvm *kvm)
>  	struct kvm_memory_slot *memslot;
>  	int idx, bkt;
>  
> +	/* For realms this is handled by the RMM so nothing to do here */
> +	if (kvm_is_realm(kvm))
> +		return;
> +
>  	idx = srcu_read_lock(&kvm->srcu);
>  	mmap_read_lock(current->mm);
>  	write_lock(&kvm->mmu_lock);
> @@ -840,6 +853,7 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
>  	pgt = mmu->pgt;
>  	if (kvm_is_realm(kvm) &&
>  	    kvm_realm_state(kvm) != REALM_STATE_DYING) {
> +		unmap_stage2_range(mmu, 0, (~0ULL) & PAGE_MASK);
>  		write_unlock(&kvm->mmu_lock);
>  		kvm_realm_destroy_rtts(&kvm->arch.realm, pgt->ia_bits,
>  				       pgt->start_level);
> @@ -1190,6 +1204,24 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
>  	return vma->vm_flags & VM_MTE_ALLOWED;
>  }
>  
> +static int realm_map_ipa(struct kvm *kvm, phys_addr_t ipa, unsigned long hva,
> +			 kvm_pfn_t pfn, unsigned long map_size,
> +			 enum kvm_pgtable_prot prot,
> +			 struct kvm_mmu_memory_cache *memcache)
> +{
> +	struct realm *realm = &kvm->arch.realm;
> +	struct page *page = pfn_to_page(pfn);
> +
> +	if (WARN_ON(!(prot & KVM_PGTABLE_PROT_W)))
> +		return -EFAULT;
> +
> +	if (!realm_is_addr_protected(realm, ipa))
> +		return realm_map_non_secure(realm, ipa, page, map_size,
> +					    memcache);
> +
> +	return realm_map_protected(realm, hva, ipa, page, map_size, memcache);
> +}
> +
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  			  struct kvm_memory_slot *memslot, unsigned long hva,
>  			  unsigned long fault_status)
> @@ -1210,9 +1242,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	unsigned long vma_pagesize, fault_granule;
>  	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
>  	struct kvm_pgtable *pgt;
> +	gpa_t gpa_stolen_mask = kvm_gpa_stolen_bits(vcpu->kvm);
>  
>  	fault_granule = 1UL << ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level);
>  	write_fault = kvm_is_write_fault(vcpu);
> +
> +	/* Realms cannot map read-only */

Out of curiosity, why? It would be nice to have more explanation in the
comment.

> +	if (vcpu_is_rec(vcpu))
> +		write_fault = true;
> +
>  	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
>  	VM_BUG_ON(write_fault && exec_fault);
>  
> @@ -1272,7 +1310,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
>  		fault_ipa &= ~(vma_pagesize - 1);
>  
> -	gfn = fault_ipa >> PAGE_SHIFT;
> +	gfn = (fault_ipa & ~gpa_stolen_mask) >> PAGE_SHIFT;
>  	mmap_read_unlock(current->mm);
>  
>  	/*
> @@ -1345,7 +1383,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	 * If we are not forced to use page mapping, check if we are
>  	 * backed by a THP and thus use block mapping if possible.
>  	 */
> -	if (vma_pagesize == PAGE_SIZE && !(force_pte || device)) {
> +	/* FIXME: We shouldn't need to disable this for realms */
> +	if (vma_pagesize == PAGE_SIZE && !(force_pte || device || kvm_is_realm(kvm))) {

Why do we have to disable this temporarily?

>  		if (fault_status == FSC_PERM && fault_granule > PAGE_SIZE)
>  			vma_pagesize = fault_granule;
>  		else
> @@ -1382,6 +1421,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	 */
>  	if (fault_status == FSC_PERM && vma_pagesize == fault_granule)
>  		ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot);
> +	else if (kvm_is_realm(kvm))
> +		ret = realm_map_ipa(kvm, fault_ipa, hva, pfn, vma_pagesize,
> +				    prot, memcache);
>  	else
>  		ret = kvm_pgtable_stage2_map(pgt, fault_ipa, vma_pagesize,
>  					     __pfn_to_phys(pfn), prot,
> @@ -1437,6 +1479,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>  	struct kvm_memory_slot *memslot;
>  	unsigned long hva;
>  	bool is_iabt, write_fault, writable;
> +	gpa_t gpa_stolen_mask = kvm_gpa_stolen_bits(vcpu->kvm);
>  	gfn_t gfn;
>  	int ret, idx;
>  
> @@ -1491,7 +1534,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>  
>  	idx = srcu_read_lock(&vcpu->kvm->srcu);
>  
> -	gfn = fault_ipa >> PAGE_SHIFT;
> +	gfn = (fault_ipa & ~gpa_stolen_mask) >> PAGE_SHIFT;
>  	memslot = gfn_to_memslot(vcpu->kvm, gfn);
>  	hva = gfn_to_hva_memslot_prot(memslot, gfn, &writable);
>  	write_fault = kvm_is_write_fault(vcpu);
> @@ -1536,6 +1579,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>  		 * of the page size.
>  		 */
>  		fault_ipa |= kvm_vcpu_get_hfar(vcpu) & ((1 << 12) - 1);
> +		fault_ipa &= ~gpa_stolen_mask;
>  		ret = io_mem_abort(vcpu, fault_ipa);
>  		goto out_unlock;
>  	}
> @@ -1617,6 +1661,10 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
>  	if (!kvm->arch.mmu.pgt)
>  		return false;
>

Does the unprotected (shared) region of a realm support aging?
  
> +	/* We don't support aging for Realms */
> +	if (kvm_is_realm(kvm))
> +		return true;
> +
>  	WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
>  
>  	kpte = kvm_pgtable_stage2_mkold(kvm->arch.mmu.pgt,
> @@ -1630,6 +1678,10 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
>  	if (!kvm->arch.mmu.pgt)
>  		return false;
>  
> +	/* We don't support aging for Realms */
> +	if (kvm_is_realm(kvm))
> +		return true;
> +
>  	return kvm_pgtable_stage2_is_young(kvm->arch.mmu.pgt,
>  					   range->start << PAGE_SHIFT);
>  }
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> index 3405b43e1421..3d46191798e5 100644
> --- a/arch/arm64/kvm/rme.c
> +++ b/arch/arm64/kvm/rme.c
> @@ -608,6 +608,54 @@ int realm_map_protected(struct realm *realm,
>  	return -ENXIO;
>  }
>  
> +int realm_map_non_secure(struct realm *realm,
> +			 unsigned long ipa,
> +			 struct page *page,
> +			 unsigned long map_size,
> +			 struct kvm_mmu_memory_cache *memcache)
> +{
> +	phys_addr_t rd = virt_to_phys(realm->rd);
> +	int map_level;
> +	int ret = 0;
> +	unsigned long desc = page_to_phys(page) |
> +			     PTE_S2_MEMATTR(MT_S2_FWB_NORMAL) |
> +			     /* FIXME: Read+Write permissions for now */
Why can't we handle the prot passed in from realm_map_ipa()? Work in progress? :)
> +			     (3 << 6) |
> +			     PTE_SHARED;
> +
> +	if (WARN_ON(!IS_ALIGNED(ipa, map_size)))
> +		return -EINVAL;
> +
> +	switch (map_size) {
> +	case PAGE_SIZE:
> +		map_level = 3;
> +		break;
> +	case RME_L2_BLOCK_SIZE:
> +		map_level = 2;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +
> +	ret = rmi_rtt_map_unprotected(rd, ipa, map_level, desc);
> +
> +	if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> +		/* Create missing RTTs and retry */
> +		int level = RMI_RETURN_INDEX(ret);
> +
> +		ret = realm_create_rtt_levels(realm, ipa, level, map_level,
> +					      memcache);
> +		if (WARN_ON(ret))
> +			return -ENXIO;
> +
> +		ret = rmi_rtt_map_unprotected(rd, ipa, map_level, desc);
> +	}
> +	if (WARN_ON(ret))
> +		return -ENXIO;
> +
> +	return 0;
> +}
> +
>  static int populate_par_region(struct kvm *kvm,
>  			       phys_addr_t ipa_base,
>  			       phys_addr_t ipa_end)


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms
  2023-01-27 11:29   ` [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms Steven Price
  2023-02-07 12:25     ` Jean-Philippe Brucker
  2023-02-13 16:10     ` Zhi Wang
@ 2023-03-06 19:10     ` Zhi Wang
  2023-03-10 15:47       ` Steven Price
  2024-03-18  7:40     ` Ganapatrao Kulkarni
  3 siblings, 1 reply; 190+ messages in thread
From: Zhi Wang @ 2023-03-06 19:10 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 27 Jan 2023 11:29:10 +0000
Steven Price <steven.price@arm.com> wrote:

> Add the KVM_CAP_ARM_RME_CREATE_FD ioctl to create a realm. This involves
> delegating pages to the RMM to hold the Realm Descriptor (RD) and for
> the base level of the Realm Translation Tables (RTT). A VMID also need
> to be picked, since the RMM has a separate VMID address space a
> dedicated allocator is added for this purpose.
> 
> KVM_CAP_ARM_RME_CONFIG_REALM is provided to allow configuring the realm
> before it is created.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>  arch/arm64/include/asm/kvm_rme.h |  14 ++
>  arch/arm64/kvm/arm.c             |  19 ++
>  arch/arm64/kvm/mmu.c             |   6 +
>  arch/arm64/kvm/reset.c           |  33 +++
>  arch/arm64/kvm/rme.c             | 357 +++++++++++++++++++++++++++++++
>  5 files changed, 429 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> index c26bc2c6770d..055a22accc08 100644
> --- a/arch/arm64/include/asm/kvm_rme.h
> +++ b/arch/arm64/include/asm/kvm_rme.h
> @@ -6,6 +6,8 @@
>  #ifndef __ASM_KVM_RME_H
>  #define __ASM_KVM_RME_H
>  
> +#include <uapi/linux/kvm.h>
> +
>  enum realm_state {
>  	REALM_STATE_NONE,
>  	REALM_STATE_NEW,
> @@ -15,8 +17,20 @@ enum realm_state {
>  
>  struct realm {
>  	enum realm_state state;
> +
> +	void *rd;
> +	struct realm_params *params;
> +
> +	unsigned long num_aux;
> +	unsigned int vmid;
> +	unsigned int ia_bits;
>  };
>  
>  int kvm_init_rme(void);
> +u32 kvm_realm_ipa_limit(void);
> +
> +int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
> +int kvm_init_realm_vm(struct kvm *kvm);
> +void kvm_destroy_realm(struct kvm *kvm);
>  
>  #endif
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index d97b39d042ab..50f54a63732a 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -103,6 +103,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  		r = 0;
>  		set_bit(KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED, &kvm->arch.flags);
>  		break;
> +	case KVM_CAP_ARM_RME:
> +		if (!static_branch_unlikely(&kvm_rme_is_available))
> +			return -EINVAL;
> +		mutex_lock(&kvm->lock);
> +		r = kvm_realm_enable_cap(kvm, cap);
> +		mutex_unlock(&kvm->lock);
> +		break;
>  	default:
>  		r = -EINVAL;
>  		break;
> @@ -172,6 +179,13 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	 */
>  	kvm->arch.dfr0_pmuver.imp = kvm_arm_pmu_get_pmuver_limit();
>  
> +	/* Initialise the realm bits after the generic bits are enabled */
> +	if (kvm_is_realm(kvm)) {
> +		ret = kvm_init_realm_vm(kvm);
> +		if (ret)
> +			goto err_free_cpumask;
> +	}
> +
>  	return 0;
>  
>  err_free_cpumask:
> @@ -204,6 +218,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>  	kvm_destroy_vcpus(kvm);
>  
>  	kvm_unshare_hyp(kvm, kvm + 1);
> +
> +	kvm_destroy_realm(kvm);
>  }
>  
>  int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> @@ -300,6 +316,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_ARM_PTRAUTH_GENERIC:
>  		r = system_has_full_ptr_auth();
>  		break;
> +	case KVM_CAP_ARM_RME:
> +		r = static_key_enabled(&kvm_rme_is_available);
> +		break;
>  	default:
>  		r = 0;
>  	}
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 31d7fa4c7c14..d0f707767d05 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -840,6 +840,12 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
>  	struct kvm_pgtable *pgt = NULL;
>  
>  	write_lock(&kvm->mmu_lock);
> +	if (kvm_is_realm(kvm) &&
> +	    kvm_realm_state(kvm) != REALM_STATE_DYING) {
> +		/* TODO: teardown rtts */
> +		write_unlock(&kvm->mmu_lock);
> +		return;
> +	}
>  	pgt = mmu->pgt;
>  	if (pgt) {
>  		mmu->pgd_phys = 0;
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index e0267f672b8a..c165df174737 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -395,3 +395,36 @@ int kvm_set_ipa_limit(void)
>  
>  	return 0;
>  }
> +

The function below doesn't have a user in this patch. Also, it looks like it
was partly copied from kvm_init_stage2_mmu() in arch/arm64/kvm/mmu.c.

> +int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
> +{
> +	u64 mmfr0, mmfr1;
> +	u32 phys_shift;
> +	u32 ipa_limit = kvm_ipa_limit;
> +
> +	if (kvm_is_realm(kvm))
> +		ipa_limit = kvm_realm_ipa_limit();
> +
> +	if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
> +		return -EINVAL;
> +
> +	phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type);
> +	if (phys_shift) {
> +		if (phys_shift > ipa_limit ||
> +		    phys_shift < ARM64_MIN_PARANGE_BITS)
> +			return -EINVAL;
> +	} else {
> +		phys_shift = KVM_PHYS_SHIFT;
> +		if (phys_shift > ipa_limit) {
> +			pr_warn_once("%s using unsupported default IPA limit, upgrade your VMM\n",
> +				     current->comm);
> +			return -EINVAL;
> +		}
> +	}
> +
> +	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
> +	mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> +	kvm->arch.vtcr = kvm_get_vtcr(mmfr0, mmfr1, phys_shift);
> +
> +	return 0;
> +}
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> index f6b587bc116e..9f8c5a91b8fc 100644
> --- a/arch/arm64/kvm/rme.c
> +++ b/arch/arm64/kvm/rme.c
> @@ -5,9 +5,49 @@
>  
>  #include <linux/kvm_host.h>
>  
> +#include <asm/kvm_emulate.h>
> +#include <asm/kvm_mmu.h>
>  #include <asm/rmi_cmds.h>
>  #include <asm/virt.h>
>  
> +/************ FIXME: Copied from kvm/hyp/pgtable.c **********/
> +#include <asm/kvm_pgtable.h>
> +
> +struct kvm_pgtable_walk_data {
> +	struct kvm_pgtable		*pgt;
> +	struct kvm_pgtable_walker	*walker;
> +
> +	u64				addr;
> +	u64				end;
> +};
> +
> +static u32 __kvm_pgd_page_idx(struct kvm_pgtable *pgt, u64 addr)
> +{
> +	u64 shift = kvm_granule_shift(pgt->start_level - 1); /* May underflow */
> +	u64 mask = BIT(pgt->ia_bits) - 1;
> +
> +	return (addr & mask) >> shift;
> +}
> +
> +static u32 kvm_pgd_pages(u32 ia_bits, u32 start_level)
> +{
> +	struct kvm_pgtable pgt = {
> +		.ia_bits	= ia_bits,
> +		.start_level	= start_level,
> +	};
> +
> +	return __kvm_pgd_page_idx(&pgt, -1ULL) + 1;
> +}
> +
> +/******************/
> +
> +static unsigned long rmm_feat_reg0;
> +
> +static bool rme_supports(unsigned long feature)
> +{
> +	return !!u64_get_bits(rmm_feat_reg0, feature);
> +}
> +
>  static int rmi_check_version(void)
>  {
>  	struct arm_smccc_res res;
> @@ -33,8 +73,319 @@ static int rmi_check_version(void)
>  	return 0;
>  }
>  
> +static unsigned long create_realm_feat_reg0(struct kvm *kvm)
> +{
> +	unsigned long ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
> +	u64 feat_reg0 = 0;
> +
> +	int num_bps = u64_get_bits(rmm_feat_reg0,
> +				   RMI_FEATURE_REGISTER_0_NUM_BPS);
> +	int num_wps = u64_get_bits(rmm_feat_reg0,
> +				   RMI_FEATURE_REGISTER_0_NUM_WPS);
> +
> +	feat_reg0 |= u64_encode_bits(ia_bits, RMI_FEATURE_REGISTER_0_S2SZ);
> +	feat_reg0 |= u64_encode_bits(num_bps, RMI_FEATURE_REGISTER_0_NUM_BPS);
> +	feat_reg0 |= u64_encode_bits(num_wps, RMI_FEATURE_REGISTER_0_NUM_WPS);
> +
> +	return feat_reg0;
> +}
> +
> +u32 kvm_realm_ipa_limit(void)
> +{
> +	return u64_get_bits(rmm_feat_reg0, RMI_FEATURE_REGISTER_0_S2SZ);
> +}
> +
> +static u32 get_start_level(struct kvm *kvm)
> +{
> +	long sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, kvm->arch.vtcr);
> +
> +	return VTCR_EL2_TGRAN_SL0_BASE - sl0;
> +}
> +
> +static int realm_create_rd(struct kvm *kvm)
> +{
> +	struct realm *realm = &kvm->arch.realm;
> +	struct realm_params *params = realm->params;
> +	void *rd = NULL;
> +	phys_addr_t rd_phys, params_phys;
> +	struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
> +	unsigned int pgd_sz;
> +	int i, r;
> +
> +	if (WARN_ON(realm->rd) || WARN_ON(!realm->params))
> +		return -EEXIST;
> +
> +	rd = (void *)__get_free_page(GFP_KERNEL);
> +	if (!rd)
> +		return -ENOMEM;
> +
> +	rd_phys = virt_to_phys(rd);
> +	if (rmi_granule_delegate(rd_phys)) {
> +		r = -ENXIO;
> +		goto out;
> +	}
> +
> +	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
> +	for (i = 0; i < pgd_sz; i++) {
> +		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
> +
> +		if (rmi_granule_delegate(pgd_phys)) {
> +			r = -ENXIO;
> +			goto out_undelegate_tables;
> +		}
> +	}
> +
> +	params->rtt_level_start = get_start_level(kvm);
> +	params->rtt_num_start = pgd_sz;
> +	params->rtt_base = kvm->arch.mmu.pgd_phys;
> +	params->vmid = realm->vmid;
> +
> +	params_phys = virt_to_phys(params);
> +
> +	if (rmi_realm_create(rd_phys, params_phys)) {
> +		r = -ENXIO;
> +		goto out_undelegate_tables;
> +	}
> +
> +	realm->rd = rd;
> +	realm->ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
> +
> +	if (WARN_ON(rmi_rec_aux_count(rd_phys, &realm->num_aux))) {
> +		WARN_ON(rmi_realm_destroy(rd_phys));
> +		goto out_undelegate_tables;
> +	}
> +
> +	return 0;
> +
> +out_undelegate_tables:
> +	while (--i >= 0) {
> +		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
> +
> +		WARN_ON(rmi_granule_undelegate(pgd_phys));
> +	}
> +	WARN_ON(rmi_granule_undelegate(rd_phys));
> +out:
> +	free_page((unsigned long)rd);
> +	return r;
> +}
> +
> +/* Protects access to rme_vmid_bitmap */
> +static DEFINE_SPINLOCK(rme_vmid_lock);
> +static unsigned long *rme_vmid_bitmap;
> +
> +static int rme_vmid_init(void)
> +{
> +	unsigned int vmid_count = 1 << kvm_get_vmid_bits();
> +
> +	rme_vmid_bitmap = bitmap_zalloc(vmid_count, GFP_KERNEL);
> +	if (!rme_vmid_bitmap) {
> +		kvm_err("%s: Couldn't allocate rme vmid bitmap\n", __func__);
> +		return -ENOMEM;
> +	}
> +
> +	return 0;
> +}
> +
> +static int rme_vmid_reserve(void)
> +{
> +	int ret;
> +	unsigned int vmid_count = 1 << kvm_get_vmid_bits();
> +
> +	spin_lock(&rme_vmid_lock);
> +	ret = bitmap_find_free_region(rme_vmid_bitmap, vmid_count, 0);
> +	spin_unlock(&rme_vmid_lock);
> +
> +	return ret;
> +}
> +
> +static void rme_vmid_release(unsigned int vmid)
> +{
> +	spin_lock(&rme_vmid_lock);
> +	bitmap_release_region(rme_vmid_bitmap, vmid, 0);
> +	spin_unlock(&rme_vmid_lock);
> +}
> +
> +static int kvm_create_realm(struct kvm *kvm)
> +{
> +	struct realm *realm = &kvm->arch.realm;
> +	int ret;
> +
> +	if (!kvm_is_realm(kvm) || kvm_realm_state(kvm) != REALM_STATE_NONE)
> +		return -EEXIST;
> +
> +	ret = rme_vmid_reserve();
> +	if (ret < 0)
> +		return ret;
> +	realm->vmid = ret;
> +
> +	ret = realm_create_rd(kvm);
> +	if (ret) {
> +		rme_vmid_release(realm->vmid);
> +		return ret;
> +	}
> +
> +	WRITE_ONCE(realm->state, REALM_STATE_NEW);
> +
> +	/* The realm is up, free the parameters.  */
> +	free_page((unsigned long)realm->params);
> +	realm->params = NULL;
> +
> +	return 0;
> +}
> +
> +static int config_realm_hash_algo(struct realm *realm,
> +				  struct kvm_cap_arm_rme_config_item *cfg)
> +{
> +	switch (cfg->hash_algo) {
> +	case KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256:
> +		if (!rme_supports(RMI_FEATURE_REGISTER_0_HASH_SHA_256))
> +			return -EINVAL;
> +		break;
> +	case KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA512:
> +		if (!rme_supports(RMI_FEATURE_REGISTER_0_HASH_SHA_512))
> +			return -EINVAL;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +	realm->params->measurement_algo = cfg->hash_algo;
> +	return 0;
> +}
> +
> +static int config_realm_sve(struct realm *realm,
> +			    struct kvm_cap_arm_rme_config_item *cfg)
> +{
> +	u64 features_0 = realm->params->features_0;
> +	int max_sve_vq = u64_get_bits(rmm_feat_reg0,
> +				      RMI_FEATURE_REGISTER_0_SVE_VL);
> +
> +	if (!rme_supports(RMI_FEATURE_REGISTER_0_SVE_EN))
> +		return -EINVAL;
> +
> +	if (cfg->sve_vq > max_sve_vq)
> +		return -EINVAL;
> +
> +	features_0 &= ~(RMI_FEATURE_REGISTER_0_SVE_EN |
> +			RMI_FEATURE_REGISTER_0_SVE_VL);
> +	features_0 |= u64_encode_bits(1, RMI_FEATURE_REGISTER_0_SVE_EN);
> +	features_0 |= u64_encode_bits(cfg->sve_vq,
> +				      RMI_FEATURE_REGISTER_0_SVE_VL);
> +
> +	realm->params->features_0 = features_0;
> +	return 0;
> +}
> +
> +static int kvm_rme_config_realm(struct kvm *kvm, struct kvm_enable_cap *cap)
> +{
> +	struct kvm_cap_arm_rme_config_item cfg;
> +	struct realm *realm = &kvm->arch.realm;
> +	int r = 0;
> +
> +	if (kvm_realm_state(kvm) != REALM_STATE_NONE)
> +		return -EBUSY;
> +
> +	if (copy_from_user(&cfg, (void __user *)cap->args[1], sizeof(cfg)))
> +		return -EFAULT;
> +
> +	switch (cfg.cfg) {
> +	case KVM_CAP_ARM_RME_CFG_RPV:
> +		memcpy(&realm->params->rpv, &cfg.rpv, sizeof(cfg.rpv));
> +		break;
> +	case KVM_CAP_ARM_RME_CFG_HASH_ALGO:
> +		r = config_realm_hash_algo(realm, &cfg);
> +		break;
> +	case KVM_CAP_ARM_RME_CFG_SVE:
> +		r = config_realm_sve(realm, &cfg);
> +		break;
> +	default:
> +		r = -EINVAL;
> +	}
> +
> +	return r;
> +}
> +
> +int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
> +{
> +	int r = 0;
> +
> +	switch (cap->args[0]) {
> +	case KVM_CAP_ARM_RME_CONFIG_REALM:
> +		r = kvm_rme_config_realm(kvm, cap);
> +		break;
> +	case KVM_CAP_ARM_RME_CREATE_RD:
> +		if (kvm->created_vcpus) {
> +			r = -EBUSY;
> +			break;
> +		}
> +
> +		r = kvm_create_realm(kvm);
> +		break;
> +	default:
> +		r = -EINVAL;
> +		break;
> +	}
> +
> +	return r;
> +}
> +
> +void kvm_destroy_realm(struct kvm *kvm)
> +{
> +	struct realm *realm = &kvm->arch.realm;
> +	struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
> +	unsigned int pgd_sz;
> +	int i;
> +
> +	if (realm->params) {
> +		free_page((unsigned long)realm->params);
> +		realm->params = NULL;
> +	}
> +
> +	if (kvm_realm_state(kvm) == REALM_STATE_NONE)
> +		return;
> +
> +	WRITE_ONCE(realm->state, REALM_STATE_DYING);
> +
> +	rme_vmid_release(realm->vmid);
> +
> +	if (realm->rd) {
> +		phys_addr_t rd_phys = virt_to_phys(realm->rd);
> +
> +		if (WARN_ON(rmi_realm_destroy(rd_phys)))
> +			return;
> +		if (WARN_ON(rmi_granule_undelegate(rd_phys)))
> +			return;
> +		free_page((unsigned long)realm->rd);
> +		realm->rd = NULL;
> +	}
> +
> +	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
> +	for (i = 0; i < pgd_sz; i++) {
> +		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
> +
> +		if (WARN_ON(rmi_granule_undelegate(pgd_phys)))
> +			return;
> +	}
> +
> +	kvm_free_stage2_pgd(&kvm->arch.mmu);
> +}
> +
> +int kvm_init_realm_vm(struct kvm *kvm)
> +{
> +	struct realm_params *params;
> +
> +	params = (struct realm_params *)get_zeroed_page(GFP_KERNEL);
> +	if (!params)
> +		return -ENOMEM;
> +
> +	params->features_0 = create_realm_feat_reg0(kvm);
> +	kvm->arch.realm.params = params;
> +	return 0;
> +}
> +
>  int kvm_init_rme(void)
>  {
> +	int ret;
> +
>  	if (PAGE_SIZE != SZ_4K)
>  		/* Only 4k page size on the host is supported */
>  		return 0;
> @@ -43,6 +394,12 @@ int kvm_init_rme(void)
>  		/* Continue without realm support */
>  		return 0;
>  
> +	ret = rme_vmid_init();
> +	if (ret)
> +		return ret;
> +
> +	WARN_ON(rmi_features(0, &rmm_feat_reg0));
> +
>  	/* Future patch will enable static branch kvm_rme_is_available */
>  
>  	return 0;


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 15/28] KVM: arm64: Handle realm MMIO emulation
  2023-03-06 15:37     ` Zhi Wang
@ 2023-03-10 15:47       ` Steven Price
  2023-03-14 15:44         ` Zhi Wang
  0 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-03-10 15:47 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On 06/03/2023 15:37, Zhi Wang wrote:
> On Fri, 27 Jan 2023 11:29:19 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> MMIO emulation for a realm cannot be done directly with the VM's
>> registers as they are protected from the host. However the RMM interface
>> provides a structure member for providing the read/written value and
> 
> More details would be better for helping the review. I can only see the
> emulated mmio value from the device model (kvmtool or kvm_io_bus) is put into
> the GPRS[0] of the RecEntry object. But the rest of the flow is missing.

The commit message is out of date (sorry about that). A previous version
of the spec had a dedicated member for the read/write value, but this
was changed to just use GPRS[0] as you've spotted. I'll update the text.

> I guess RMM copies the value in the RecEntry.GPRS[0] to the target GPR in the
> guest context in RMI_REC_ENTER when seeing RMI_EMULATED_MMIO. This is for
> the guest MMIO read path.

Yes, when entering the guest after an (emulatable) read data abort, the
value in GPRS[0] is loaded from the RecEntry structure into the
appropriate register for the guest.

> How about the MMIO write path? I don't see where the RecExit.GPRS[0] is loaded
> to a variable and returned to the userspace.

The RMM will populate GPRS[0] with the written value in this case (even
if another register was actually used in the instruction). We then
transfer that to the usual VCPU structure so that the normal fault
handling logic works.

>> we can transfer this to the appropriate VCPU's register entry and then
>> depend on the generic MMIO handling code in KVM.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  arch/arm64/kvm/mmio.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
>> index 3dd38a151d2a..c4879fa3a8d3 100644
>> --- a/arch/arm64/kvm/mmio.c
>> +++ b/arch/arm64/kvm/mmio.c
>> @@ -6,6 +6,7 @@
>>  
>>  #include <linux/kvm_host.h>
>>  #include <asm/kvm_emulate.h>
>> +#include <asm/rmi_smc.h>
>>  #include <trace/events/kvm.h>
>>  
>>  #include "trace.h"
>> @@ -109,6 +110,9 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
>>  			       &data);
>>  		data = vcpu_data_host_to_guest(vcpu, data, len);
>>  		vcpu_set_reg(vcpu, kvm_vcpu_dabt_get_rd(vcpu), data);
>> +
>> +		if (vcpu_is_rec(vcpu))
>> +			vcpu->arch.rec.run->entry.gprs[0] = data;
> 
> I think the guest context is maintained by RMM (while KVM can only touch
> Rec{Entry, Exit} object) so that guest context in the legacy VHE mode is
> unused.
> 
> If yes, I guess here is should be:
> 
> if (unlikely(vcpu_is_rec(vcpu)))
> 	vcpu->arch.rec.run->entry.gprs[0] = data;
> else
> 	vcpu_set_reg(vcpu, kvm_vcpu_dabt_get_rd(vcpu), data);

Correct. Although there's no harm in also updating with vcpu_set_reg(),
I'll make the change because it's clearer.

>>  	}
>>  
>>  	/*
>> @@ -179,6 +183,9 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>>  	run->mmio.len		= len;
>>  	vcpu->mmio_needed	= 1;
>>  
>> +	if (vcpu_is_rec(vcpu))
>> +		vcpu->arch.rec.run->entry.flags |= RMI_EMULATED_MMIO;
>> +
> 
> Wouldn't it be better to set this in the kvm_handle_mmio_return where the MMIO
> read emulation has been surely successful?

Yes, that makes sense - I'll move this.

Thanks,

Steve

>>  	if (!ret) {
>>  		/* We handled the access successfully in the kernel. */
>>  		if (!is_write)
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 16/28] arm64: RME: Allow populating initial contents
  2023-03-06 17:34     ` Zhi Wang
@ 2023-03-10 15:47       ` Steven Price
  2023-03-14 15:31         ` Zhi Wang
  0 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-03-10 15:47 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On 06/03/2023 17:34, Zhi Wang wrote:
> On Fri, 27 Jan 2023 11:29:20 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> The VMM needs to populate the realm with some data before starting (e.g.
>> a kernel and initrd). This is measured by the RMM and used as part of
>> the attestation later on.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  arch/arm64/kvm/rme.c | 366 +++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 366 insertions(+)
>>
>> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
>> index 16e0bfea98b1..3405b43e1421 100644
>> --- a/arch/arm64/kvm/rme.c
>> +++ b/arch/arm64/kvm/rme.c
>> @@ -4,6 +4,7 @@
>>   */
>>  
>>  #include <linux/kvm_host.h>
>> +#include <linux/hugetlb.h>
>>  
>>  #include <asm/kvm_emulate.h>
>>  #include <asm/kvm_mmu.h>
>> @@ -426,6 +427,359 @@ void kvm_realm_unmap_range(struct kvm *kvm, unsigned long ipa, u64 size)
>>  	}
>>  }
>>  
>> +static int realm_create_protected_data_page(struct realm *realm,
>> +					    unsigned long ipa,
>> +					    struct page *dst_page,
>> +					    struct page *tmp_page)
>> +{
>> +	phys_addr_t dst_phys, tmp_phys;
>> +	int ret;
>> +
>> +	copy_page(page_address(tmp_page), page_address(dst_page));
>> +
>> +	dst_phys = page_to_phys(dst_page);
>> +	tmp_phys = page_to_phys(tmp_page);
>> +
>> +	if (rmi_granule_delegate(dst_phys))
>> +		return -ENXIO;
>> +
>> +	ret = rmi_data_create(dst_phys, virt_to_phys(realm->rd), ipa, tmp_phys,
>> +			      RMI_MEASURE_CONTENT);
>> +
>> +	if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
>> +		/* Create missing RTTs and retry */
>> +		int level = RMI_RETURN_INDEX(ret);
>> +
>> +		ret = realm_create_rtt_levels(realm, ipa, level,
>> +					      RME_RTT_MAX_LEVEL, NULL);
>> +		if (ret)
>> +			goto err;
>> +
>> +		ret = rmi_data_create(dst_phys, virt_to_phys(realm->rd), ipa,
>> +				      tmp_phys, RMI_MEASURE_CONTENT);
>> +	}
>> +
>> +	if (ret)
>> +		goto err;
>> +
>> +	return 0;
>> +
>> +err:
>> +	if (WARN_ON(rmi_granule_undelegate(dst_phys))) {
>> +		/* Page can't be returned to NS world so is lost */
>> +		get_page(dst_page);
>> +	}
>> +	return -ENXIO;
>> +}
>> +
>> +static int fold_rtt(phys_addr_t rd, unsigned long addr, int level,
>> +		    struct realm *realm)
>> +{
>> +	struct rtt_entry rtt;
>> +	phys_addr_t rtt_addr;
>> +
>> +	if (rmi_rtt_read_entry(rd, addr, level, &rtt))
>> +		return -ENXIO;
>> +
>> +	if (rtt.state != RMI_TABLE)
>> +		return -EINVAL;
>> +
>> +	rtt_addr = rmi_rtt_get_phys(&rtt);
>> +	if (rmi_rtt_fold(rtt_addr, rd, addr, level + 1))
>> +		return -ENXIO;
>> +
>> +	free_delegated_page(realm, rtt_addr);
>> +
>> +	return 0;
>> +}
>> +
>> +int realm_map_protected(struct realm *realm,
>> +			unsigned long hva,
>> +			unsigned long base_ipa,
>> +			struct page *dst_page,
>> +			unsigned long map_size,
>> +			struct kvm_mmu_memory_cache *memcache)
>> +{
>> +	phys_addr_t dst_phys = page_to_phys(dst_page);
>> +	phys_addr_t rd = virt_to_phys(realm->rd);
>> +	unsigned long phys = dst_phys;
>> +	unsigned long ipa = base_ipa;
>> +	unsigned long size;
>> +	int map_level;
>> +	int ret = 0;
>> +
>> +	if (WARN_ON(!IS_ALIGNED(ipa, map_size)))
>> +		return -EINVAL;
>> +
>> +	switch (map_size) {
>> +	case PAGE_SIZE:
>> +		map_level = 3;
>> +		break;
>> +	case RME_L2_BLOCK_SIZE:
>> +		map_level = 2;
>> +		break;
>> +	default:
>> +		return -EINVAL;
>> +	}
>> +
>> +	if (map_level < RME_RTT_MAX_LEVEL) {
>> +		/*
>> +		 * A temporary RTT is needed during the map, precreate it,
>> +		 * however if there is an error (e.g. missing parent tables)
>> +		 * this will be handled below.
>> +		 */
>> +		realm_create_rtt_levels(realm, ipa, map_level,
>> +					RME_RTT_MAX_LEVEL, memcache);
>> +	}
>> +
>> +	for (size = 0; size < map_size; size += PAGE_SIZE) {
>> +		if (rmi_granule_delegate(phys)) {
>> +			struct rtt_entry rtt;
>> +
>> +			/*
>> +			 * It's possible we raced with another VCPU on the same
>> +			 * fault. If the entry exists and matches then exit
>> +			 * early and assume the other VCPU will handle the
>> +			 * mapping.
>> +			 */
>> +			if (rmi_rtt_read_entry(rd, ipa, RME_RTT_MAX_LEVEL, &rtt))
>> +				goto err;
>> +
>> +			// FIXME: For a block mapping this could race at level
>> +			// 2 or 3...
>> +			if (WARN_ON((rtt.walk_level != RME_RTT_MAX_LEVEL ||
>> +				     rtt.state != RMI_ASSIGNED ||
>> +				     rtt.desc != phys))) {
>> +				goto err;
>> +			}
>> +
>> +			return 0;
>> +		}
>> +
>> +		ret = rmi_data_create_unknown(phys, rd, ipa);
>> +
>> +		if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
>> +			/* Create missing RTTs and retry */
>> +			int level = RMI_RETURN_INDEX(ret);
>> +
>> +			ret = realm_create_rtt_levels(realm, ipa, level,
>> +						      RME_RTT_MAX_LEVEL,
>> +						      memcache);
>> +			WARN_ON(ret);
>> +			if (ret)
>> +				goto err_undelegate;
>> +
>> +			ret = rmi_data_create_unknown(phys, rd, ipa);
>> +		}
>> +		WARN_ON(ret);
>> +
>> +		if (ret)
>> +			goto err_undelegate;
>> +
>> +		phys += PAGE_SIZE;
>> +		ipa += PAGE_SIZE;
>> +	}
>> +
>> +	if (map_size == RME_L2_BLOCK_SIZE)
>> +		ret = fold_rtt(rd, base_ipa, map_level, realm);
>> +	if (WARN_ON(ret))
>> +		goto err;
>> +
>> +	return 0;
>> +
>> +err_undelegate:
>> +	if (WARN_ON(rmi_granule_undelegate(phys))) {
>> +		/* Page can't be returned to NS world so is lost */
>> +		get_page(phys_to_page(phys));
>> +	}
>> +err:
>> +	while (size > 0) {
>> +		phys -= PAGE_SIZE;
>> +		size -= PAGE_SIZE;
>> +		ipa -= PAGE_SIZE;
>> +
>> +		rmi_data_destroy(rd, ipa);
>> +
>> +		if (WARN_ON(rmi_granule_undelegate(phys))) {
>> +			/* Page can't be returned to NS world so is lost */
>> +			get_page(phys_to_page(phys));
>> +		}
>> +	}
>> +	return -ENXIO;
>> +}
>> +
> 
> There seems no caller to the function above. Better move it to the related
> patch.

Indeed this should really be in the next patch - will move as it's very
confusing having it in this patch (sorry about that).

>> +static int populate_par_region(struct kvm *kvm,
>> +			       phys_addr_t ipa_base,
>> +			       phys_addr_t ipa_end)
>> +{
>> +	struct realm *realm = &kvm->arch.realm;
>> +	struct kvm_memory_slot *memslot;
>> +	gfn_t base_gfn, end_gfn;
>> +	int idx;
>> +	phys_addr_t ipa;
>> +	int ret = 0;
>> +	struct page *tmp_page;
>> +	phys_addr_t rd = virt_to_phys(realm->rd);
>> +
>> +	base_gfn = gpa_to_gfn(ipa_base);
>> +	end_gfn = gpa_to_gfn(ipa_end);
>> +
>> +	idx = srcu_read_lock(&kvm->srcu);
>> +	memslot = gfn_to_memslot(kvm, base_gfn);
>> +	if (!memslot) {
>> +		ret = -EFAULT;
>> +		goto out;
>> +	}
>> +
>> +	/* We require the region to be contained within a single memslot */
>> +	if (memslot->base_gfn + memslot->npages < end_gfn) {
>> +		ret = -EINVAL;
>> +		goto out;
>> +	}
>> +
>> +	tmp_page = alloc_page(GFP_KERNEL);
>> +	if (!tmp_page) {
>> +		ret = -ENOMEM;
>> +		goto out;
>> +	}
>> +
>> +	mmap_read_lock(current->mm);
>> +
>> +	ipa = ipa_base;
>> +
>> +	while (ipa < ipa_end) {
>> +		struct vm_area_struct *vma;
>> +		unsigned long map_size;
>> +		unsigned int vma_shift;
>> +		unsigned long offset;
>> +		unsigned long hva;
>> +		struct page *page;
>> +		kvm_pfn_t pfn;
>> +		int level;
>> +
>> +		hva = gfn_to_hva_memslot(memslot, gpa_to_gfn(ipa));
>> +		vma = vma_lookup(current->mm, hva);
>> +		if (!vma) {
>> +			ret = -EFAULT;
>> +			break;
>> +		}
>> +
>> +		if (is_vm_hugetlb_page(vma))
>> +			vma_shift = huge_page_shift(hstate_vma(vma));
>> +		else
>> +			vma_shift = PAGE_SHIFT;
>> +
>> +		map_size = 1 << vma_shift;
>> +
>> +		/*
>> +		 * FIXME: This causes over mapping, but there's no good
>> +		 * solution here with the ABI as it stands
>> +		 */
>> +		ipa = ALIGN_DOWN(ipa, map_size);
>> +
>> +		switch (map_size) {
>> +		case RME_L2_BLOCK_SIZE:
>> +			level = 2;
>> +			break;
>> +		case PAGE_SIZE:
>> +			level = 3;
>> +			break;
>> +		default:
>> +			WARN_ONCE(1, "Unsupport vma_shift %d", vma_shift);
>> +			ret = -EFAULT;
>> +			break;
>> +		}
>> +
>> +		pfn = gfn_to_pfn_memslot(memslot, gpa_to_gfn(ipa));
>> +
>> +		if (is_error_pfn(pfn)) {
>> +			ret = -EFAULT;
>> +			break;
>> +		}
>> +
>> +		ret = rmi_rtt_init_ripas(rd, ipa, level);
>> +		if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
>> +			ret = realm_create_rtt_levels(realm, ipa,
>> +						      RMI_RETURN_INDEX(ret),
>> +						      level, NULL);
>> +			if (ret)
>> +				break;
>> +			ret = rmi_rtt_init_ripas(rd, ipa, level);
>> +			if (ret) {
>> +				ret = -ENXIO;
>> +				break;
>> +			}
>> +		}
>> +
>> +		if (level < RME_RTT_MAX_LEVEL) {
>> +			/*
>> +			 * A temporary RTT is needed during the map, precreate
>> +			 * it, however if there is an error (e.g. missing
>> +			 * parent tables) this will be handled in the
>> +			 * realm_create_protected_data_page() call.
>> +			 */
>> +			realm_create_rtt_levels(realm, ipa, level,
>> +						RME_RTT_MAX_LEVEL, NULL);
>> +		}
>> +
>> +		page = pfn_to_page(pfn);
>> +
>> +		for (offset = 0; offset < map_size && !ret;
>> +		     offset += PAGE_SIZE, page++) {
>> +			phys_addr_t page_ipa = ipa + offset;
>> +
>> +			ret = realm_create_protected_data_page(realm, page_ipa,
>> +							       page, tmp_page);
>> +		}
>> +		if (ret)
>> +			goto err_release_pfn;
>> +
>> +		if (level == 2) {
>> +			ret = fold_rtt(rd, ipa, level, realm);
>> +			if (ret)
>> +				goto err_release_pfn;
>> +		}
>> +
>> +		ipa += map_size;
> 
>> +		kvm_set_pfn_accessed(pfn);
>> +		kvm_set_pfn_dirty(pfn);
> 
> kvm_release_pfn_dirty() has already called kvm_set_pfn_{accessed, dirty}().

Will remove those calls.

>> +		kvm_release_pfn_dirty(pfn);
>> +err_release_pfn:
>> +		if (ret) {
>> +			kvm_release_pfn_clean(pfn);
>> +			break;
>> +		}
>> +	}
>> +
>> +	mmap_read_unlock(current->mm);
>> +	__free_page(tmp_page);
>> +
>> +out:
>> +	srcu_read_unlock(&kvm->srcu, idx);
>> +	return ret;
>> +}
>> +
>> +static int kvm_populate_realm(struct kvm *kvm,
>> +			      struct kvm_cap_arm_rme_populate_realm_args *args)
>> +{
>> +	phys_addr_t ipa_base, ipa_end;
>> +
> 
> Check kvm_is_realm(kvm) here or in the kvm_realm_enable_cap().

I'm going to update kvm_vm_ioctl_enable_cap() to check kvm_is_realm() so
we won't get here.

>> +	if (kvm_realm_state(kvm) != REALM_STATE_NEW)
>> +		return -EBUSY;
> 
> Maybe -EINVAL? The realm hasn't been created (RMI_REALM_CREATE is not called
> yet). The userspace shouldn't reach this path.

Well user space can attempt to populate in the ACTIVE state - which is
where the idea of 'busy' comes from. Admittedly it's a little confusing
when RMI_REALM_CREATE hasn't been called.

I'm not particularly bothered about the return code, but it's useful to
have a different code to -EINVAL as it's not an invalid argument, but
calling at the wrong time. I can't immediately see a better error code
though.

Steve

>> +
>> +	if (!IS_ALIGNED(args->populate_ipa_base, PAGE_SIZE) ||
>> +	    !IS_ALIGNED(args->populate_ipa_size, PAGE_SIZE))
>> +		return -EINVAL;
>> +
>> +	ipa_base = args->populate_ipa_base;
>> +	ipa_end = ipa_base + args->populate_ipa_size;
>> +
>> +	if (ipa_end < ipa_base)
>> +		return -EINVAL;
>> +
>> +	return populate_par_region(kvm, ipa_base, ipa_end);
>> +}
>> +
>>  static int set_ipa_state(struct kvm_vcpu *vcpu,
>>  			 unsigned long ipa,
>>  			 unsigned long end,
>> @@ -748,6 +1102,18 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>>  		r = kvm_init_ipa_range_realm(kvm, &args);
>>  		break;
>>  	}
>> +	case KVM_CAP_ARM_RME_POPULATE_REALM: {
>> +		struct kvm_cap_arm_rme_populate_realm_args args;
>> +		void __user *argp = u64_to_user_ptr(cap->args[1]);
>> +
>> +		if (copy_from_user(&args, argp, sizeof(args))) {
>> +			r = -EFAULT;
>> +			break;
>> +		}
>> +
>> +		r = kvm_populate_realm(kvm, &args);
>> +		break;
>> +	}
>>  	default:
>>  		r = -EINVAL;
>>  		break;
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 17/28] arm64: RME: Runtime faulting of memory
  2023-03-06 18:20     ` Zhi Wang
@ 2023-03-10 15:47       ` Steven Price
  2023-03-14 16:41         ` Zhi Wang
  0 siblings, 1 reply; 190+ messages in thread
From: Steven Price @ 2023-03-10 15:47 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On 06/03/2023 18:20, Zhi Wang wrote:
> On Fri, 27 Jan 2023 11:29:21 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> At runtime if the realm guest accesses memory which hasn't yet been
>> mapped then KVM needs to either populate the region or fault the guest.
>>
>> For memory in the lower (protected) region of IPA a fresh page is
>> provided to the RMM which will zero the contents. For memory in the
>> upper (shared) region of IPA, the memory from the memslot is mapped
>> into the realm VM non secure.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  arch/arm64/include/asm/kvm_emulate.h | 10 +++++
>>  arch/arm64/include/asm/kvm_rme.h     | 12 ++++++
>>  arch/arm64/kvm/mmu.c                 | 64 +++++++++++++++++++++++++---
>>  arch/arm64/kvm/rme.c                 | 48 +++++++++++++++++++++
>>  4 files changed, 128 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
>> index 285e62914ca4..3a71b3d2e10a 100644
>> --- a/arch/arm64/include/asm/kvm_emulate.h
>> +++ b/arch/arm64/include/asm/kvm_emulate.h
>> @@ -502,6 +502,16 @@ static inline enum realm_state kvm_realm_state(struct kvm *kvm)
>>  	return READ_ONCE(kvm->arch.realm.state);
>>  }
>>  
>> +static inline gpa_t kvm_gpa_stolen_bits(struct kvm *kvm)
>> +{
>> +	if (kvm_is_realm(kvm)) {
>> +		struct realm *realm = &kvm->arch.realm;
>> +
>> +		return BIT(realm->ia_bits - 1);
>> +	}
>> +	return 0;
>> +}
>> +
> 
> "stolen" seems a little bit vague. Maybe "shared" bit would be better as
> SEV-SNP has C bit and TDX has shared bit. It would be nice to align with
> the common knowledge.

The Arm CCA term is the "protected" bit[1] - although the bit is
backwards as it's cleared to indicate protected... so not ideal naming! ;)

But it's termed 'stolen' here as it's effectively removed from the set
of valid address bits. And this function is returning a mask of the bits
that are not available as address bits. The naming was meant to be
generic that this could encompass other features that need to reserve
IPA bits.

But it's possible this is too generic and perhaps we should just deal
with a single bit rather than potential masks. Alternatively we could
invert this and return a set of valid bits:

static inline gpa_t kvm_gpa_valid_bits(struct kvm *kvm)
{
	if (kvm_is_realm(kvm)) {
		struct realm *realm = &kvm->arch.realm;

		return ~BIT(realm->ia_bits - 1);
	}
	return ~(gpa_t)0;
}

That would at least match the current usage where the inverse is what we
need.

Do SEV-SNP or TDX have a concept of a mask to apply to addresses from
the guest? Can we steal any existing terms?


[1] Technically the spec only states: "Software in a Realm should treat
the most significant bit of an IPA as a protection attribute." I don't
think the bit is directly referred to in the spec as anything other than
"the most significant bit". Although that in itself is confusing as it
is the most significant *active* bit (i.e. the configured IPA size
changes which bit is used).

> Also, it would be nice to change the name of gpa_stolen_mask accordingly.
> 
>>  static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
>>  {
>>  	if (static_branch_unlikely(&kvm_rme_is_available))
>> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
>> index 9d1583c44a99..303e4a5e5704 100644
>> --- a/arch/arm64/include/asm/kvm_rme.h
>> +++ b/arch/arm64/include/asm/kvm_rme.h
>> @@ -50,6 +50,18 @@ void kvm_destroy_rec(struct kvm_vcpu *vcpu);
>>  int kvm_rec_enter(struct kvm_vcpu *vcpu);
>>  int handle_rme_exit(struct kvm_vcpu *vcpu, int rec_run_status);
>>  
>> +void kvm_realm_unmap_range(struct kvm *kvm, unsigned long ipa, u64 size);
>> +int realm_map_protected(struct realm *realm,
>> +			unsigned long hva,
>> +			unsigned long base_ipa,
>> +			struct page *dst_page,
>> +			unsigned long map_size,
>> +			struct kvm_mmu_memory_cache *memcache);
>> +int realm_map_non_secure(struct realm *realm,
>> +			 unsigned long ipa,
>> +			 struct page *page,
>> +			 unsigned long map_size,
>> +			 struct kvm_mmu_memory_cache *memcache);
>>  int realm_set_ipa_state(struct kvm_vcpu *vcpu,
>>  			unsigned long addr, unsigned long end,
>>  			unsigned long ripas);
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>> index f29558c5dcbc..5417c273861b 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -235,8 +235,13 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64
>>  
>>  	lockdep_assert_held_write(&kvm->mmu_lock);
>>  	WARN_ON(size & ~PAGE_MASK);
>> -	WARN_ON(stage2_apply_range(kvm, start, end, kvm_pgtable_stage2_unmap,
>> -				   may_block));
>> +
>> +	if (kvm_is_realm(kvm))
>> +		kvm_realm_unmap_range(kvm, start, size);
>> +	else
>> +		WARN_ON(stage2_apply_range(kvm, start, end,
>> +					   kvm_pgtable_stage2_unmap,
>> +					   may_block));
>>  }
>>  
>>  static void unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size)
>> @@ -250,7 +255,11 @@ static void stage2_flush_memslot(struct kvm *kvm,
>>  	phys_addr_t addr = memslot->base_gfn << PAGE_SHIFT;
>>  	phys_addr_t end = addr + PAGE_SIZE * memslot->npages;
>>  
>> -	stage2_apply_range_resched(kvm, addr, end, kvm_pgtable_stage2_flush);
>> +	if (kvm_is_realm(kvm))
>> +		kvm_realm_unmap_range(kvm, addr, end - addr);
>> +	else
>> +		stage2_apply_range_resched(kvm, addr, end,
>> +					   kvm_pgtable_stage2_flush);
>>  }
>>  
>>  /**
>> @@ -818,6 +827,10 @@ void stage2_unmap_vm(struct kvm *kvm)
>>  	struct kvm_memory_slot *memslot;
>>  	int idx, bkt;
>>  
>> +	/* For realms this is handled by the RMM so nothing to do here */
>> +	if (kvm_is_realm(kvm))
>> +		return;
>> +
>>  	idx = srcu_read_lock(&kvm->srcu);
>>  	mmap_read_lock(current->mm);
>>  	write_lock(&kvm->mmu_lock);
>> @@ -840,6 +853,7 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
>>  	pgt = mmu->pgt;
>>  	if (kvm_is_realm(kvm) &&
>>  	    kvm_realm_state(kvm) != REALM_STATE_DYING) {
>> +		unmap_stage2_range(mmu, 0, (~0ULL) & PAGE_MASK);
>>  		write_unlock(&kvm->mmu_lock);
>>  		kvm_realm_destroy_rtts(&kvm->arch.realm, pgt->ia_bits,
>>  				       pgt->start_level);
>> @@ -1190,6 +1204,24 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
>>  	return vma->vm_flags & VM_MTE_ALLOWED;
>>  }
>>  
>> +static int realm_map_ipa(struct kvm *kvm, phys_addr_t ipa, unsigned long hva,
>> +			 kvm_pfn_t pfn, unsigned long map_size,
>> +			 enum kvm_pgtable_prot prot,
>> +			 struct kvm_mmu_memory_cache *memcache)
>> +{
>> +	struct realm *realm = &kvm->arch.realm;
>> +	struct page *page = pfn_to_page(pfn);
>> +
>> +	if (WARN_ON(!(prot & KVM_PGTABLE_PROT_W)))
>> +		return -EFAULT;
>> +
>> +	if (!realm_is_addr_protected(realm, ipa))
>> +		return realm_map_non_secure(realm, ipa, page, map_size,
>> +					    memcache);
>> +
>> +	return realm_map_protected(realm, hva, ipa, page, map_size, memcache);
>> +}
>> +
>>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  			  struct kvm_memory_slot *memslot, unsigned long hva,
>>  			  unsigned long fault_status)
>> @@ -1210,9 +1242,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	unsigned long vma_pagesize, fault_granule;
>>  	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
>>  	struct kvm_pgtable *pgt;
>> +	gpa_t gpa_stolen_mask = kvm_gpa_stolen_bits(vcpu->kvm);
>>  
>>  	fault_granule = 1UL << ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level);
>>  	write_fault = kvm_is_write_fault(vcpu);
>> +
>> +	/* Realms cannot map read-only */
> 
> Out of curiosity, why? It would be nice to have more explanation in the
> comment.

The RMM specification doesn't support mapping protected memory read
only. I don't believe there is any reason why it couldn't, but equally I
don't think there any use cases for a guest needing read-only pages so
this just isn't supported by the RMM. Since the page is necessarily
taken away from the host it's fairly irrelevant (from the host's
perspective) whether it is actually read only or not.

However, this is technically wrong for the case of unprotected (shared)
pages - it should be possible to map those read only. But I need to have
a think about how to fix that up.

>> +	if (vcpu_is_rec(vcpu))
>> +		write_fault = true;
>> +
>>  	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
>>  	VM_BUG_ON(write_fault && exec_fault);
>>  
>> @@ -1272,7 +1310,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
>>  		fault_ipa &= ~(vma_pagesize - 1);
>>  
>> -	gfn = fault_ipa >> PAGE_SHIFT;
>> +	gfn = (fault_ipa & ~gpa_stolen_mask) >> PAGE_SHIFT;
>>  	mmap_read_unlock(current->mm);
>>  
>>  	/*
>> @@ -1345,7 +1383,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	 * If we are not forced to use page mapping, check if we are
>>  	 * backed by a THP and thus use block mapping if possible.
>>  	 */
>> -	if (vma_pagesize == PAGE_SIZE && !(force_pte || device)) {
>> +	/* FIXME: We shouldn't need to disable this for realms */
>> +	if (vma_pagesize == PAGE_SIZE && !(force_pte || device || kvm_is_realm(kvm))) {
> 
> Why do we have to disable this temporarily?

The current uABI (not using memfd) has some serious issues regarding
huge page support. KVM normally follows the user space mappings of the
memslot - so if user space has a huge page (transparent or hugetlbfs)
then stage 2 for the guest also gets one.

However realms sometimes require that stage 2 differs. The main
examples are:

 * RIPAS - if part of a huge page is RIPAS_RAM and part RIPAS_EMPTY then
the huge page would have to be split.

 * Initially populated memory: basically the same as above - if the
populated memory doesn't perfectly align with huge pages, then the
head/tail pages would need to be broken up.

Removing this hack allows the huge pages to be created in stage 2, but
it then causes over-mapping of the initial contents; later, when the
VMM (or guest) attempts to change the properties of the misaligned tail,
it gets an error because the pages are already present in stage 2.

The planned solution to all this is to stop following the user space
page tables and instead create huge pages opportunistically from the
memfd that backs the protected range. For now this hack exists to avoid
things "randomly" failing when e.g. the initial kernel image isn't huge
page aligned. In theory it should be possible to make this work with the
current uABI, but it's not worth it when we know we're replacing it.

>>  		if (fault_status == FSC_PERM && fault_granule > PAGE_SIZE)
>>  			vma_pagesize = fault_granule;
>>  		else
>> @@ -1382,6 +1421,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	 */
>>  	if (fault_status == FSC_PERM && vma_pagesize == fault_granule)
>>  		ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot);
>> +	else if (kvm_is_realm(kvm))
>> +		ret = realm_map_ipa(kvm, fault_ipa, hva, pfn, vma_pagesize,
>> +				    prot, memcache);
>>  	else
>>  		ret = kvm_pgtable_stage2_map(pgt, fault_ipa, vma_pagesize,
>>  					     __pfn_to_phys(pfn), prot,
>> @@ -1437,6 +1479,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>>  	struct kvm_memory_slot *memslot;
>>  	unsigned long hva;
>>  	bool is_iabt, write_fault, writable;
>> +	gpa_t gpa_stolen_mask = kvm_gpa_stolen_bits(vcpu->kvm);
>>  	gfn_t gfn;
>>  	int ret, idx;
>>  
>> @@ -1491,7 +1534,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>>  
>>  	idx = srcu_read_lock(&vcpu->kvm->srcu);
>>  
>> -	gfn = fault_ipa >> PAGE_SHIFT;
>> +	gfn = (fault_ipa & ~gpa_stolen_mask) >> PAGE_SHIFT;
>>  	memslot = gfn_to_memslot(vcpu->kvm, gfn);
>>  	hva = gfn_to_hva_memslot_prot(memslot, gfn, &writable);
>>  	write_fault = kvm_is_write_fault(vcpu);
>> @@ -1536,6 +1579,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>>  		 * of the page size.
>>  		 */
>>  		fault_ipa |= kvm_vcpu_get_hfar(vcpu) & ((1 << 12) - 1);
>> +		fault_ipa &= ~gpa_stolen_mask;
>>  		ret = io_mem_abort(vcpu, fault_ipa);
>>  		goto out_unlock;
>>  	}
>> @@ -1617,6 +1661,10 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
>>  	if (!kvm->arch.mmu.pgt)
>>  		return false;
>>
> 
> Does the unprotected (shared) region of a realm support aging?

In theory this should be possible to support by unmapping the NS entry
and handling the fault. But the hardware access flag optimisation isn't
available with the RMM, and the overhead of RMI calls to unmap/map could
be significant.

For now this isn't something we've looked at, but I guess it might be
worth trying out when we have some real hardware to benchmark on.

>> +	/* We don't support aging for Realms */
>> +	if (kvm_is_realm(kvm))
>> +		return true;
>> +
>>  	WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
>>  
>>  	kpte = kvm_pgtable_stage2_mkold(kvm->arch.mmu.pgt,
>> @@ -1630,6 +1678,10 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
>>  	if (!kvm->arch.mmu.pgt)
>>  		return false;
>>  
>> +	/* We don't support aging for Realms */
>> +	if (kvm_is_realm(kvm))
>> +		return true;
>> +
>>  	return kvm_pgtable_stage2_is_young(kvm->arch.mmu.pgt,
>>  					   range->start << PAGE_SHIFT);
>>  }
>> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
>> index 3405b43e1421..3d46191798e5 100644
>> --- a/arch/arm64/kvm/rme.c
>> +++ b/arch/arm64/kvm/rme.c
>> @@ -608,6 +608,54 @@ int realm_map_protected(struct realm *realm,
>>  	return -ENXIO;
>>  }
>>  
>> +int realm_map_non_secure(struct realm *realm,
>> +			 unsigned long ipa,
>> +			 struct page *page,
>> +			 unsigned long map_size,
>> +			 struct kvm_mmu_memory_cache *memcache)
>> +{
>> +	phys_addr_t rd = virt_to_phys(realm->rd);
>> +	int map_level;
>> +	int ret = 0;
>> +	unsigned long desc = page_to_phys(page) |
>> +			     PTE_S2_MEMATTR(MT_S2_FWB_NORMAL) |
>> +			     /* FIXME: Read+Write permissions for now */
> Why can't we handle the prot from the realm_map_ipa()? Working in progress? :)

Yes, work in progress - this comes from the "Realms cannot map
read-only" case in user_mem_abort() above. Since all faults are treated
as write faults, we need to upgrade to read/write here too.

The prot in realm_map_ipa() isn't actually useful currently because we
simply WARN_ON and return if it doesn't have PROT_W. Again this needs to
be fixed! It's on my todo list ;)

Steve

>> +			     (3 << 6) |
>> +			     PTE_SHARED;
>> +
>> +	if (WARN_ON(!IS_ALIGNED(ipa, map_size)))
>> +		return -EINVAL;
>> +
>> +	switch (map_size) {
>> +	case PAGE_SIZE:
>> +		map_level = 3;
>> +		break;
>> +	case RME_L2_BLOCK_SIZE:
>> +		map_level = 2;
>> +		break;
>> +	default:
>> +		return -EINVAL;
>> +	}
>> +
>> +	ret = rmi_rtt_map_unprotected(rd, ipa, map_level, desc);
>> +
>> +	if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
>> +		/* Create missing RTTs and retry */
>> +		int level = RMI_RETURN_INDEX(ret);
>> +
>> +		ret = realm_create_rtt_levels(realm, ipa, level, map_level,
>> +					      memcache);
>> +		if (WARN_ON(ret))
>> +			return -ENXIO;
>> +
>> +		ret = rmi_rtt_map_unprotected(rd, ipa, map_level, desc);
>> +	}
>> +	if (WARN_ON(ret))
>> +		return -ENXIO;
>> +
>> +	return 0;
>> +}
>> +
>>  static int populate_par_region(struct kvm *kvm,
>>  			       phys_addr_t ipa_base,
>>  			       phys_addr_t ipa_end)
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms
  2023-03-06 19:10     ` Zhi Wang
@ 2023-03-10 15:47       ` Steven Price
  0 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-03-10 15:47 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On 06/03/2023 19:10, Zhi Wang wrote:
> On Fri, 27 Jan 2023 11:29:10 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> Add the KVM_CAP_ARM_RME_CREATE_FD ioctl to create a realm. This involves
>> delegating pages to the RMM to hold the Realm Descriptor (RD) and for
>> the base level of the Realm Translation Tables (RTT). A VMID also needs
>> to be picked; since the RMM has a separate VMID address space, a
>> dedicated allocator is added for this purpose.
>>
>> KVM_CAP_ARM_RME_CONFIG_REALM is provided to allow configuring the realm
>> before it is created.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>  arch/arm64/include/asm/kvm_rme.h |  14 ++
>>  arch/arm64/kvm/arm.c             |  19 ++
>>  arch/arm64/kvm/mmu.c             |   6 +
>>  arch/arm64/kvm/reset.c           |  33 +++
>>  arch/arm64/kvm/rme.c             | 357 +++++++++++++++++++++++++++++++
>>  5 files changed, 429 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
>> index c26bc2c6770d..055a22accc08 100644
>> --- a/arch/arm64/include/asm/kvm_rme.h
>> +++ b/arch/arm64/include/asm/kvm_rme.h
>> @@ -6,6 +6,8 @@
>>  #ifndef __ASM_KVM_RME_H
>>  #define __ASM_KVM_RME_H
>>  
>> +#include <uapi/linux/kvm.h>
>> +
>>  enum realm_state {
>>  	REALM_STATE_NONE,
>>  	REALM_STATE_NEW,
>> @@ -15,8 +17,20 @@ enum realm_state {
>>  
>>  struct realm {
>>  	enum realm_state state;
>> +
>> +	void *rd;
>> +	struct realm_params *params;
>> +
>> +	unsigned long num_aux;
>> +	unsigned int vmid;
>> +	unsigned int ia_bits;
>>  };
>>  
>>  int kvm_init_rme(void);
>> +u32 kvm_realm_ipa_limit(void);
>> +
>> +int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
>> +int kvm_init_realm_vm(struct kvm *kvm);
>> +void kvm_destroy_realm(struct kvm *kvm);
>>  
>>  #endif
>> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
>> index d97b39d042ab..50f54a63732a 100644
>> --- a/arch/arm64/kvm/arm.c
>> +++ b/arch/arm64/kvm/arm.c
>> @@ -103,6 +103,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>>  		r = 0;
>>  		set_bit(KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED, &kvm->arch.flags);
>>  		break;
>> +	case KVM_CAP_ARM_RME:
>> +		if (!static_branch_unlikely(&kvm_rme_is_available))
>> +			return -EINVAL;
>> +		mutex_lock(&kvm->lock);
>> +		r = kvm_realm_enable_cap(kvm, cap);
>> +		mutex_unlock(&kvm->lock);
>> +		break;
>>  	default:
>>  		r = -EINVAL;
>>  		break;
>> @@ -172,6 +179,13 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>>  	 */
>>  	kvm->arch.dfr0_pmuver.imp = kvm_arm_pmu_get_pmuver_limit();
>>  
>> +	/* Initialise the realm bits after the generic bits are enabled */
>> +	if (kvm_is_realm(kvm)) {
>> +		ret = kvm_init_realm_vm(kvm);
>> +		if (ret)
>> +			goto err_free_cpumask;
>> +	}
>> +
>>  	return 0;
>>  
>>  err_free_cpumask:
>> @@ -204,6 +218,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>>  	kvm_destroy_vcpus(kvm);
>>  
>>  	kvm_unshare_hyp(kvm, kvm + 1);
>> +
>> +	kvm_destroy_realm(kvm);
>>  }
>>  
>>  int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>> @@ -300,6 +316,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>  	case KVM_CAP_ARM_PTRAUTH_GENERIC:
>>  		r = system_has_full_ptr_auth();
>>  		break;
>> +	case KVM_CAP_ARM_RME:
>> +		r = static_key_enabled(&kvm_rme_is_available);
>> +		break;
>>  	default:
>>  		r = 0;
>>  	}
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>> index 31d7fa4c7c14..d0f707767d05 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -840,6 +840,12 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
>>  	struct kvm_pgtable *pgt = NULL;
>>  
>>  	write_lock(&kvm->mmu_lock);
>> +	if (kvm_is_realm(kvm) &&
>> +	    kvm_realm_state(kvm) != REALM_STATE_DYING) {
>> +		/* TODO: teardown rtts */
>> +		write_unlock(&kvm->mmu_lock);
>> +		return;
>> +	}
>>  	pgt = mmu->pgt;
>>  	if (pgt) {
>>  		mmu->pgd_phys = 0;
>> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
>> index e0267f672b8a..c165df174737 100644
>> --- a/arch/arm64/kvm/reset.c
>> +++ b/arch/arm64/kvm/reset.c
>> @@ -395,3 +395,36 @@ int kvm_set_ipa_limit(void)
>>  
>>  	return 0;
>>  }
>> +
> 
> The below function doesn't have a user in this patch. Also,
> it looks like it is partly copied from kvm_init_stage2_mmu()
> in arch/arm64/kvm/mmu.c.

Good spot ;) Yes, I discovered this - it should have been removed as
it's no longer used. I think this was an error when I was rebasing:
kvm_arm_setup_stage2() was removed in 315775ff7c6d ("KVM: arm64:
Consolidate stage-2 initialisation into a single function").

Steve

>> +int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
>> +{
>> +	u64 mmfr0, mmfr1;
>> +	u32 phys_shift;
>> +	u32 ipa_limit = kvm_ipa_limit;
>> +
>> +	if (kvm_is_realm(kvm))
>> +		ipa_limit = kvm_realm_ipa_limit();
>> +
>> +	if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
>> +		return -EINVAL;
>> +
>> +	phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type);
>> +	if (phys_shift) {
>> +		if (phys_shift > ipa_limit ||
>> +		    phys_shift < ARM64_MIN_PARANGE_BITS)
>> +			return -EINVAL;
>> +	} else {
>> +		phys_shift = KVM_PHYS_SHIFT;
>> +		if (phys_shift > ipa_limit) {
>> +			pr_warn_once("%s using unsupported default IPA limit, upgrade your VMM\n",
>> +				     current->comm);
>> +			return -EINVAL;
>> +		}
>> +	}
>> +
>> +	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
>> +	mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
>> +	kvm->arch.vtcr = kvm_get_vtcr(mmfr0, mmfr1, phys_shift);
>> +
>> +	return 0;
>> +}
>> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
>> index f6b587bc116e..9f8c5a91b8fc 100644
>> --- a/arch/arm64/kvm/rme.c
>> +++ b/arch/arm64/kvm/rme.c
>> @@ -5,9 +5,49 @@
>>  
>>  #include <linux/kvm_host.h>
>>  
>> +#include <asm/kvm_emulate.h>
>> +#include <asm/kvm_mmu.h>
>>  #include <asm/rmi_cmds.h>
>>  #include <asm/virt.h>
>>  
>> +/************ FIXME: Copied from kvm/hyp/pgtable.c **********/
>> +#include <asm/kvm_pgtable.h>
>> +
>> +struct kvm_pgtable_walk_data {
>> +	struct kvm_pgtable		*pgt;
>> +	struct kvm_pgtable_walker	*walker;
>> +
>> +	u64				addr;
>> +	u64				end;
>> +};
>> +
>> +static u32 __kvm_pgd_page_idx(struct kvm_pgtable *pgt, u64 addr)
>> +{
>> +	u64 shift = kvm_granule_shift(pgt->start_level - 1); /* May underflow */
>> +	u64 mask = BIT(pgt->ia_bits) - 1;
>> +
>> +	return (addr & mask) >> shift;
>> +}
>> +
>> +static u32 kvm_pgd_pages(u32 ia_bits, u32 start_level)
>> +{
>> +	struct kvm_pgtable pgt = {
>> +		.ia_bits	= ia_bits,
>> +		.start_level	= start_level,
>> +	};
>> +
>> +	return __kvm_pgd_page_idx(&pgt, -1ULL) + 1;
>> +}
>> +
>> +/******************/
>> +
>> +static unsigned long rmm_feat_reg0;
>> +
>> +static bool rme_supports(unsigned long feature)
>> +{
>> +	return !!u64_get_bits(rmm_feat_reg0, feature);
>> +}
>> +
>>  static int rmi_check_version(void)
>>  {
>>  	struct arm_smccc_res res;
>> @@ -33,8 +73,319 @@ static int rmi_check_version(void)
>>  	return 0;
>>  }
>>  
>> +static unsigned long create_realm_feat_reg0(struct kvm *kvm)
>> +{
>> +	unsigned long ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
>> +	u64 feat_reg0 = 0;
>> +
>> +	int num_bps = u64_get_bits(rmm_feat_reg0,
>> +				   RMI_FEATURE_REGISTER_0_NUM_BPS);
>> +	int num_wps = u64_get_bits(rmm_feat_reg0,
>> +				   RMI_FEATURE_REGISTER_0_NUM_WPS);
>> +
>> +	feat_reg0 |= u64_encode_bits(ia_bits, RMI_FEATURE_REGISTER_0_S2SZ);
>> +	feat_reg0 |= u64_encode_bits(num_bps, RMI_FEATURE_REGISTER_0_NUM_BPS);
>> +	feat_reg0 |= u64_encode_bits(num_wps, RMI_FEATURE_REGISTER_0_NUM_WPS);
>> +
>> +	return feat_reg0;
>> +}
>> +
>> +u32 kvm_realm_ipa_limit(void)
>> +{
>> +	return u64_get_bits(rmm_feat_reg0, RMI_FEATURE_REGISTER_0_S2SZ);
>> +}
>> +
>> +static u32 get_start_level(struct kvm *kvm)
>> +{
>> +	long sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, kvm->arch.vtcr);
>> +
>> +	return VTCR_EL2_TGRAN_SL0_BASE - sl0;
>> +}
>> +
>> +static int realm_create_rd(struct kvm *kvm)
>> +{
>> +	struct realm *realm = &kvm->arch.realm;
>> +	struct realm_params *params = realm->params;
>> +	void *rd = NULL;
>> +	phys_addr_t rd_phys, params_phys;
>> +	struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
>> +	unsigned int pgd_sz;
>> +	int i, r;
>> +
>> +	if (WARN_ON(realm->rd) || WARN_ON(!realm->params))
>> +		return -EEXIST;
>> +
>> +	rd = (void *)__get_free_page(GFP_KERNEL);
>> +	if (!rd)
>> +		return -ENOMEM;
>> +
>> +	rd_phys = virt_to_phys(rd);
>> +	if (rmi_granule_delegate(rd_phys)) {
>> +		r = -ENXIO;
>> +		goto out;
>> +	}
>> +
>> +	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
>> +	for (i = 0; i < pgd_sz; i++) {
>> +		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
>> +
>> +		if (rmi_granule_delegate(pgd_phys)) {
>> +			r = -ENXIO;
>> +			goto out_undelegate_tables;
>> +		}
>> +	}
>> +
>> +	params->rtt_level_start = get_start_level(kvm);
>> +	params->rtt_num_start = pgd_sz;
>> +	params->rtt_base = kvm->arch.mmu.pgd_phys;
>> +	params->vmid = realm->vmid;
>> +
>> +	params_phys = virt_to_phys(params);
>> +
>> +	if (rmi_realm_create(rd_phys, params_phys)) {
>> +		r = -ENXIO;
>> +		goto out_undelegate_tables;
>> +	}
>> +
>> +	realm->rd = rd;
>> +	realm->ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
>> +
>> +	if (WARN_ON(rmi_rec_aux_count(rd_phys, &realm->num_aux))) {
>> +		WARN_ON(rmi_realm_destroy(rd_phys));
>> +		goto out_undelegate_tables;
>> +	}
>> +
>> +	return 0;
>> +
>> +out_undelegate_tables:
>> +	while (--i >= 0) {
>> +		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
>> +
>> +		WARN_ON(rmi_granule_undelegate(pgd_phys));
>> +	}
>> +	WARN_ON(rmi_granule_undelegate(rd_phys));
>> +out:
>> +	free_page((unsigned long)rd);
>> +	return r;
>> +}
>> +
>> +/* Protects access to rme_vmid_bitmap */
>> +static DEFINE_SPINLOCK(rme_vmid_lock);
>> +static unsigned long *rme_vmid_bitmap;
>> +
>> +static int rme_vmid_init(void)
>> +{
>> +	unsigned int vmid_count = 1 << kvm_get_vmid_bits();
>> +
>> +	rme_vmid_bitmap = bitmap_zalloc(vmid_count, GFP_KERNEL);
>> +	if (!rme_vmid_bitmap) {
>> +		kvm_err("%s: Couldn't allocate rme vmid bitmap\n", __func__);
>> +		return -ENOMEM;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int rme_vmid_reserve(void)
>> +{
>> +	int ret;
>> +	unsigned int vmid_count = 1 << kvm_get_vmid_bits();
>> +
>> +	spin_lock(&rme_vmid_lock);
>> +	ret = bitmap_find_free_region(rme_vmid_bitmap, vmid_count, 0);
>> +	spin_unlock(&rme_vmid_lock);
>> +
>> +	return ret;
>> +}
>> +
>> +static void rme_vmid_release(unsigned int vmid)
>> +{
>> +	spin_lock(&rme_vmid_lock);
>> +	bitmap_release_region(rme_vmid_bitmap, vmid, 0);
>> +	spin_unlock(&rme_vmid_lock);
>> +}
>> +
>> +static int kvm_create_realm(struct kvm *kvm)
>> +{
>> +	struct realm *realm = &kvm->arch.realm;
>> +	int ret;
>> +
>> +	if (!kvm_is_realm(kvm) || kvm_realm_state(kvm) != REALM_STATE_NONE)
>> +		return -EEXIST;
>> +
>> +	ret = rme_vmid_reserve();
>> +	if (ret < 0)
>> +		return ret;
>> +	realm->vmid = ret;
>> +
>> +	ret = realm_create_rd(kvm);
>> +	if (ret) {
>> +		rme_vmid_release(realm->vmid);
>> +		return ret;
>> +	}
>> +
>> +	WRITE_ONCE(realm->state, REALM_STATE_NEW);
>> +
>> +	/* The realm is up, free the parameters.  */
>> +	free_page((unsigned long)realm->params);
>> +	realm->params = NULL;
>> +
>> +	return 0;
>> +}
>> +
>> +static int config_realm_hash_algo(struct realm *realm,
>> +				  struct kvm_cap_arm_rme_config_item *cfg)
>> +{
>> +	switch (cfg->hash_algo) {
>> +	case KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256:
>> +		if (!rme_supports(RMI_FEATURE_REGISTER_0_HASH_SHA_256))
>> +			return -EINVAL;
>> +		break;
>> +	case KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA512:
>> +		if (!rme_supports(RMI_FEATURE_REGISTER_0_HASH_SHA_512))
>> +			return -EINVAL;
>> +		break;
>> +	default:
>> +		return -EINVAL;
>> +	}
>> +	realm->params->measurement_algo = cfg->hash_algo;
>> +	return 0;
>> +}
>> +
>> +static int config_realm_sve(struct realm *realm,
>> +			    struct kvm_cap_arm_rme_config_item *cfg)
>> +{
>> +	u64 features_0 = realm->params->features_0;
>> +	int max_sve_vq = u64_get_bits(rmm_feat_reg0,
>> +				      RMI_FEATURE_REGISTER_0_SVE_VL);
>> +
>> +	if (!rme_supports(RMI_FEATURE_REGISTER_0_SVE_EN))
>> +		return -EINVAL;
>> +
>> +	if (cfg->sve_vq > max_sve_vq)
>> +		return -EINVAL;
>> +
>> +	features_0 &= ~(RMI_FEATURE_REGISTER_0_SVE_EN |
>> +			RMI_FEATURE_REGISTER_0_SVE_VL);
>> +	features_0 |= u64_encode_bits(1, RMI_FEATURE_REGISTER_0_SVE_EN);
>> +	features_0 |= u64_encode_bits(cfg->sve_vq,
>> +				      RMI_FEATURE_REGISTER_0_SVE_VL);
>> +
>> +	realm->params->features_0 = features_0;
>> +	return 0;
>> +}
>> +
>> +static int kvm_rme_config_realm(struct kvm *kvm, struct kvm_enable_cap *cap)
>> +{
>> +	struct kvm_cap_arm_rme_config_item cfg;
>> +	struct realm *realm = &kvm->arch.realm;
>> +	int r = 0;
>> +
>> +	if (kvm_realm_state(kvm) != REALM_STATE_NONE)
>> +		return -EBUSY;
>> +
>> +	if (copy_from_user(&cfg, (void __user *)cap->args[1], sizeof(cfg)))
>> +		return -EFAULT;
>> +
>> +	switch (cfg.cfg) {
>> +	case KVM_CAP_ARM_RME_CFG_RPV:
>> +		memcpy(&realm->params->rpv, &cfg.rpv, sizeof(cfg.rpv));
>> +		break;
>> +	case KVM_CAP_ARM_RME_CFG_HASH_ALGO:
>> +		r = config_realm_hash_algo(realm, &cfg);
>> +		break;
>> +	case KVM_CAP_ARM_RME_CFG_SVE:
>> +		r = config_realm_sve(realm, &cfg);
>> +		break;
>> +	default:
>> +		r = -EINVAL;
>> +	}
>> +
>> +	return r;
>> +}
>> +
>> +int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>> +{
>> +	int r = 0;
>> +
>> +	switch (cap->args[0]) {
>> +	case KVM_CAP_ARM_RME_CONFIG_REALM:
>> +		r = kvm_rme_config_realm(kvm, cap);
>> +		break;
>> +	case KVM_CAP_ARM_RME_CREATE_RD:
>> +		if (kvm->created_vcpus) {
>> +			r = -EBUSY;
>> +			break;
>> +		}
>> +
>> +		r = kvm_create_realm(kvm);
>> +		break;
>> +	default:
>> +		r = -EINVAL;
>> +		break;
>> +	}
>> +
>> +	return r;
>> +}
>> +
>> +void kvm_destroy_realm(struct kvm *kvm)
>> +{
>> +	struct realm *realm = &kvm->arch.realm;
>> +	struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
>> +	unsigned int pgd_sz;
>> +	int i;
>> +
>> +	if (realm->params) {
>> +		free_page((unsigned long)realm->params);
>> +		realm->params = NULL;
>> +	}
>> +
>> +	if (kvm_realm_state(kvm) == REALM_STATE_NONE)
>> +		return;
>> +
>> +	WRITE_ONCE(realm->state, REALM_STATE_DYING);
>> +
>> +	rme_vmid_release(realm->vmid);
>> +
>> +	if (realm->rd) {
>> +		phys_addr_t rd_phys = virt_to_phys(realm->rd);
>> +
>> +		if (WARN_ON(rmi_realm_destroy(rd_phys)))
>> +			return;
>> +		if (WARN_ON(rmi_granule_undelegate(rd_phys)))
>> +			return;
>> +		free_page((unsigned long)realm->rd);
>> +		realm->rd = NULL;
>> +	}
>> +
>> +	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
>> +	for (i = 0; i < pgd_sz; i++) {
>> +		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
>> +
>> +		if (WARN_ON(rmi_granule_undelegate(pgd_phys)))
>> +			return;
>> +	}
>> +
>> +	kvm_free_stage2_pgd(&kvm->arch.mmu);
>> +}
>> +
>> +int kvm_init_realm_vm(struct kvm *kvm)
>> +{
>> +	struct realm_params *params;
>> +
>> +	params = (struct realm_params *)get_zeroed_page(GFP_KERNEL);
>> +	if (!params)
>> +		return -ENOMEM;
>> +
>> +	params->features_0 = create_realm_feat_reg0(kvm);
>> +	kvm->arch.realm.params = params;
>> +	return 0;
>> +}
>> +
>>  int kvm_init_rme(void)
>>  {
>> +	int ret;
>> +
>>  	if (PAGE_SIZE != SZ_4K)
>>  		/* Only 4k page size on the host is supported */
>>  		return 0;
>> @@ -43,6 +394,12 @@ int kvm_init_rme(void)
>>  		/* Continue without realm support */
>>  		return 0;
>>  
>> +	ret = rme_vmid_init();
>> +	if (ret)
>> +		return ret;
>> +
>> +	WARN_ON(rmi_features(0, &rmm_feat_reg0));
>> +
>>  	/* Future patch will enable static branch kvm_rme_is_available */
>>  
>>  	return 0;
> 



* Re: [RFC PATCH 16/28] arm64: RME: Allow populating initial contents
  2023-03-10 15:47       ` Steven Price
@ 2023-03-14 15:31         ` Zhi Wang
  2023-03-22 11:51           ` Steven Price
  0 siblings, 1 reply; 190+ messages in thread
From: Zhi Wang @ 2023-03-14 15:31 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 10 Mar 2023 15:47:16 +0000
Steven Price <steven.price@arm.com> wrote:

> On 06/03/2023 17:34, Zhi Wang wrote:
> > On Fri, 27 Jan 2023 11:29:20 +0000
> > Steven Price <steven.price@arm.com> wrote:
> >   
> >> The VMM needs to populate the realm with some data before starting (e.g.
> >> a kernel and initrd). This is measured by the RMM and used as part of
> >> the attestation later on.
> >>
> >> Signed-off-by: Steven Price <steven.price@arm.com>
> >> ---
> >>  arch/arm64/kvm/rme.c | 366 +++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 366 insertions(+)
> >>
> >> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> >> index 16e0bfea98b1..3405b43e1421 100644
> >> --- a/arch/arm64/kvm/rme.c
> >> +++ b/arch/arm64/kvm/rme.c
> >> @@ -4,6 +4,7 @@
> >>   */
> >>  
> >>  #include <linux/kvm_host.h>
> >> +#include <linux/hugetlb.h>
> >>  
> >>  #include <asm/kvm_emulate.h>
> >>  #include <asm/kvm_mmu.h>
> >> @@ -426,6 +427,359 @@ void kvm_realm_unmap_range(struct kvm *kvm, unsigned long ipa, u64 size)
> >>  	}
> >>  }
> >>  
> >> +static int realm_create_protected_data_page(struct realm *realm,
> >> +					    unsigned long ipa,
> >> +					    struct page *dst_page,
> >> +					    struct page *tmp_page)
> >> +{
> >> +	phys_addr_t dst_phys, tmp_phys;
> >> +	int ret;
> >> +
> >> +	copy_page(page_address(tmp_page), page_address(dst_page));
> >> +
> >> +	dst_phys = page_to_phys(dst_page);
> >> +	tmp_phys = page_to_phys(tmp_page);
> >> +
> >> +	if (rmi_granule_delegate(dst_phys))
> >> +		return -ENXIO;
> >> +
> >> +	ret = rmi_data_create(dst_phys, virt_to_phys(realm->rd), ipa, tmp_phys,
> >> +			      RMI_MEASURE_CONTENT);
> >> +
> >> +	if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> >> +		/* Create missing RTTs and retry */
> >> +		int level = RMI_RETURN_INDEX(ret);
> >> +
> >> +		ret = realm_create_rtt_levels(realm, ipa, level,
> >> +					      RME_RTT_MAX_LEVEL, NULL);
> >> +		if (ret)
> >> +			goto err;
> >> +
> >> +		ret = rmi_data_create(dst_phys, virt_to_phys(realm->rd), ipa,
> >> +				      tmp_phys, RMI_MEASURE_CONTENT);
> >> +	}
> >> +
> >> +	if (ret)
> >> +		goto err;
> >> +
> >> +	return 0;
> >> +
> >> +err:
> >> +	if (WARN_ON(rmi_granule_undelegate(dst_phys))) {
> >> +		/* Page can't be returned to NS world so is lost */
> >> +		get_page(dst_page);
> >> +	}
> >> +	return -ENXIO;
> >> +}
> >> +
> >> +static int fold_rtt(phys_addr_t rd, unsigned long addr, int level,
> >> +		    struct realm *realm)
> >> +{
> >> +	struct rtt_entry rtt;
> >> +	phys_addr_t rtt_addr;
> >> +
> >> +	if (rmi_rtt_read_entry(rd, addr, level, &rtt))
> >> +		return -ENXIO;
> >> +
> >> +	if (rtt.state != RMI_TABLE)
> >> +		return -EINVAL;
> >> +
> >> +	rtt_addr = rmi_rtt_get_phys(&rtt);
> >> +	if (rmi_rtt_fold(rtt_addr, rd, addr, level + 1))
> >> +		return -ENXIO;
> >> +
> >> +	free_delegated_page(realm, rtt_addr);
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +int realm_map_protected(struct realm *realm,
> >> +			unsigned long hva,
> >> +			unsigned long base_ipa,
> >> +			struct page *dst_page,
> >> +			unsigned long map_size,
> >> +			struct kvm_mmu_memory_cache *memcache)
> >> +{
> >> +	phys_addr_t dst_phys = page_to_phys(dst_page);
> >> +	phys_addr_t rd = virt_to_phys(realm->rd);
> >> +	unsigned long phys = dst_phys;
> >> +	unsigned long ipa = base_ipa;
> >> +	unsigned long size;
> >> +	int map_level;
> >> +	int ret = 0;
> >> +
> >> +	if (WARN_ON(!IS_ALIGNED(ipa, map_size)))
> >> +		return -EINVAL;
> >> +
> >> +	switch (map_size) {
> >> +	case PAGE_SIZE:
> >> +		map_level = 3;
> >> +		break;
> >> +	case RME_L2_BLOCK_SIZE:
> >> +		map_level = 2;
> >> +		break;
> >> +	default:
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	if (map_level < RME_RTT_MAX_LEVEL) {
> >> +		/*
> >> +		 * A temporary RTT is needed during the map, precreate it,
> >> +		 * however if there is an error (e.g. missing parent tables)
> >> +		 * this will be handled below.
> >> +		 */
> >> +		realm_create_rtt_levels(realm, ipa, map_level,
> >> +					RME_RTT_MAX_LEVEL, memcache);
> >> +	}
> >> +
> >> +	for (size = 0; size < map_size; size += PAGE_SIZE) {
> >> +		if (rmi_granule_delegate(phys)) {
> >> +			struct rtt_entry rtt;
> >> +
> >> +			/*
> >> +			 * It's possible we raced with another VCPU on the same
> >> +			 * fault. If the entry exists and matches then exit
> >> +			 * early and assume the other VCPU will handle the
> >> +			 * mapping.
> >> +			 */
> >> +			if (rmi_rtt_read_entry(rd, ipa, RME_RTT_MAX_LEVEL, &rtt))
> >> +				goto err;
> >> +
> >> +			// FIXME: For a block mapping this could race at level
> >> +			// 2 or 3...
> >> +			if (WARN_ON((rtt.walk_level != RME_RTT_MAX_LEVEL ||
> >> +				     rtt.state != RMI_ASSIGNED ||
> >> +				     rtt.desc != phys))) {
> >> +				goto err;
> >> +			}
> >> +
> >> +			return 0;
> >> +		}
> >> +
> >> +		ret = rmi_data_create_unknown(phys, rd, ipa);
> >> +
> >> +		if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> >> +			/* Create missing RTTs and retry */
> >> +			int level = RMI_RETURN_INDEX(ret);
> >> +
> >> +			ret = realm_create_rtt_levels(realm, ipa, level,
> >> +						      RME_RTT_MAX_LEVEL,
> >> +						      memcache);
> >> +			WARN_ON(ret);
> >> +			if (ret)
> >> +				goto err_undelegate;
> >> +
> >> +			ret = rmi_data_create_unknown(phys, rd, ipa);
> >> +		}
> >> +		WARN_ON(ret);
> >> +
> >> +		if (ret)
> >> +			goto err_undelegate;
> >> +
> >> +		phys += PAGE_SIZE;
> >> +		ipa += PAGE_SIZE;
> >> +	}
> >> +
> >> +	if (map_size == RME_L2_BLOCK_SIZE)
> >> +		ret = fold_rtt(rd, base_ipa, map_level, realm);
> >> +	if (WARN_ON(ret))
> >> +		goto err;
> >> +
> >> +	return 0;
> >> +
> >> +err_undelegate:
> >> +	if (WARN_ON(rmi_granule_undelegate(phys))) {
> >> +		/* Page can't be returned to NS world so is lost */
> >> +		get_page(phys_to_page(phys));
> >> +	}
> >> +err:
> >> +	while (size > 0) {
> >> +		phys -= PAGE_SIZE;
> >> +		size -= PAGE_SIZE;
> >> +		ipa -= PAGE_SIZE;
> >> +
> >> +		rmi_data_destroy(rd, ipa);
> >> +
> >> +		if (WARN_ON(rmi_granule_undelegate(phys))) {
> >> +			/* Page can't be returned to NS world so is lost */
> >> +			get_page(phys_to_page(phys));
> >> +		}
> >> +	}
> >> +	return -ENXIO;
> >> +}
> >> +  
> > 
> > There seems to be no caller of the function above. Better to move it
> > to the related patch.  
> 
> Indeed this should really be in the next patch - will move as it's very
> confusing having it in this patch (sorry about that).
> 
> >> +static int populate_par_region(struct kvm *kvm,
> >> +			       phys_addr_t ipa_base,
> >> +			       phys_addr_t ipa_end)
> >> +{
> >> +	struct realm *realm = &kvm->arch.realm;
> >> +	struct kvm_memory_slot *memslot;
> >> +	gfn_t base_gfn, end_gfn;
> >> +	int idx;
> >> +	phys_addr_t ipa;
> >> +	int ret = 0;
> >> +	struct page *tmp_page;
> >> +	phys_addr_t rd = virt_to_phys(realm->rd);
> >> +
> >> +	base_gfn = gpa_to_gfn(ipa_base);
> >> +	end_gfn = gpa_to_gfn(ipa_end);
> >> +
> >> +	idx = srcu_read_lock(&kvm->srcu);
> >> +	memslot = gfn_to_memslot(kvm, base_gfn);
> >> +	if (!memslot) {
> >> +		ret = -EFAULT;
> >> +		goto out;
> >> +	}
> >> +
> >> +	/* We require the region to be contained within a single memslot */
> >> +	if (memslot->base_gfn + memslot->npages < end_gfn) {
> >> +		ret = -EINVAL;
> >> +		goto out;
> >> +	}
> >> +
> >> +	tmp_page = alloc_page(GFP_KERNEL);
> >> +	if (!tmp_page) {
> >> +		ret = -ENOMEM;
> >> +		goto out;
> >> +	}
> >> +
> >> +	mmap_read_lock(current->mm);
> >> +
> >> +	ipa = ipa_base;
> >> +
> >> +	while (ipa < ipa_end) {
> >> +		struct vm_area_struct *vma;
> >> +		unsigned long map_size;
> >> +		unsigned int vma_shift;
> >> +		unsigned long offset;
> >> +		unsigned long hva;
> >> +		struct page *page;
> >> +		kvm_pfn_t pfn;
> >> +		int level;
> >> +
> >> +		hva = gfn_to_hva_memslot(memslot, gpa_to_gfn(ipa));
> >> +		vma = vma_lookup(current->mm, hva);
> >> +		if (!vma) {
> >> +			ret = -EFAULT;
> >> +			break;
> >> +		}
> >> +
> >> +		if (is_vm_hugetlb_page(vma))
> >> +			vma_shift = huge_page_shift(hstate_vma(vma));
> >> +		else
> >> +			vma_shift = PAGE_SHIFT;
> >> +
> >> +		map_size = 1 << vma_shift;
> >> +
> >> +		/*
> >> +		 * FIXME: This causes over mapping, but there's no good
> >> +		 * solution here with the ABI as it stands
> >> +		 */
> >> +		ipa = ALIGN_DOWN(ipa, map_size);
> >> +
> >> +		switch (map_size) {
> >> +		case RME_L2_BLOCK_SIZE:
> >> +			level = 2;
> >> +			break;
> >> +		case PAGE_SIZE:
> >> +			level = 3;
> >> +			break;
> >> +		default:
> >> +			WARN_ONCE(1, "Unsupport vma_shift %d", vma_shift);
> >> +			ret = -EFAULT;
> >> +			break;
> >> +		}
> >> +
> >> +		pfn = gfn_to_pfn_memslot(memslot, gpa_to_gfn(ipa));
> >> +
> >> +		if (is_error_pfn(pfn)) {
> >> +			ret = -EFAULT;
> >> +			break;
> >> +		}
> >> +
> >> +		ret = rmi_rtt_init_ripas(rd, ipa, level);
> >> +		if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> >> +			ret = realm_create_rtt_levels(realm, ipa,
> >> +						      RMI_RETURN_INDEX(ret),
> >> +						      level, NULL);
> >> +			if (ret)
> >> +				break;
> >> +			ret = rmi_rtt_init_ripas(rd, ipa, level);
> >> +			if (ret) {
> >> +				ret = -ENXIO;
> >> +				break;
> >> +			}
> >> +		}
> >> +
> >> +		if (level < RME_RTT_MAX_LEVEL) {
> >> +			/*
> >> +			 * A temporary RTT is needed during the map, precreate
> >> +			 * it, however if there is an error (e.g. missing
> >> +			 * parent tables) this will be handled in the
> >> +			 * realm_create_protected_data_page() call.
> >> +			 */
> >> +			realm_create_rtt_levels(realm, ipa, level,
> >> +						RME_RTT_MAX_LEVEL, NULL);
> >> +		}
> >> +
> >> +		page = pfn_to_page(pfn);
> >> +
> >> +		for (offset = 0; offset < map_size && !ret;
> >> +		     offset += PAGE_SIZE, page++) {
> >> +			phys_addr_t page_ipa = ipa + offset;
> >> +
> >> +			ret = realm_create_protected_data_page(realm, page_ipa,
> >> +							       page, tmp_page);
> >> +		}
> >> +		if (ret)
> >> +			goto err_release_pfn;
> >> +
> >> +		if (level == 2) {
> >> +			ret = fold_rtt(rd, ipa, level, realm);
> >> +			if (ret)
> >> +				goto err_release_pfn;
> >> +		}
> >> +
> >> +		ipa += map_size;  
> >   
> >> +		kvm_set_pfn_accessed(pfn);
> >> +		kvm_set_pfn_dirty(pfn);  
> > 
> > kvm_release_pfn_dirty() has already called kvm_set_pfn_{accessed, dirty}().  
> 
> Will remove those calls.
> 
> >> +		kvm_release_pfn_dirty(pfn);
> >> +err_release_pfn:
> >> +		if (ret) {
> >> +			kvm_release_pfn_clean(pfn);
> >> +			break;
> >> +		}
> >> +	}
> >> +
> >> +	mmap_read_unlock(current->mm);
> >> +	__free_page(tmp_page);
> >> +
> >> +out:
> >> +	srcu_read_unlock(&kvm->srcu, idx);
> >> +	return ret;
> >> +}
> >> +
> >> +static int kvm_populate_realm(struct kvm *kvm,
> >> +			      struct kvm_cap_arm_rme_populate_realm_args *args)
> >> +{
> >> +	phys_addr_t ipa_base, ipa_end;
> >> +  
> > 
> > Check kvm_is_realm(kvm) here or in the kvm_realm_enable_cap().  
> 
> I'm going to update kvm_vm_ioctl_enable_cap() to check kvm_is_realm() so
> we won't get here.
> 
> >> +	if (kvm_realm_state(kvm) != REALM_STATE_NEW)
> >> +		return -EBUSY;  
> > 
> > Maybe -EINVAL? The realm hasn't been created (RMI_REALM_CREATE is not called
> > yet). The userspace shouldn't reach this path.  
> 
> Well user space can attempt to populate in the ACTIVE state - which is
> where the idea of 'busy' comes from. Admittedly it's a little confusing
> when RMI_REALM_CREATE hasn't been called.
> 
> I'm not particularly bothered about the return code, but it's useful to
> have a different code to -EINVAL as it's not an invalid argument, but
> calling at the wrong time. I can't immediately see a better error code
> though.
> 
The reason why I feel -EBUSY is a little off is that EBUSY usually
indicates something is already initialized and currently running when
another calling path wants to operate on it.

I took a look at the ioctls in arch/arm64/kvm/arm.c. It seems people have
different opinions on the error code for calling an execution path at the
wrong time:

For example:

long kvm_arch_vcpu_ioctl()
...
        case KVM_GET_REG_LIST: {
                struct kvm_reg_list __user *user_list = argp;
                struct kvm_reg_list reg_list;
                unsigned n;

                r = -ENOEXEC;
                if (unlikely(!kvm_vcpu_initialized(vcpu)))
                        break;

                r = -EPERM;
                if (!kvm_arm_vcpu_is_finalized(vcpu))
                        break;

If we have to choose one, I prefer -ENOEXEC, as -EPERM seems stranger. But
personally my vote goes to -EINVAL.

> Steve
> 
> >> +
> >> +	if (!IS_ALIGNED(args->populate_ipa_base, PAGE_SIZE) ||
> >> +	    !IS_ALIGNED(args->populate_ipa_size, PAGE_SIZE))
> >> +		return -EINVAL;
> >> +
> >> +	ipa_base = args->populate_ipa_base;
> >> +	ipa_end = ipa_base + args->populate_ipa_size;
> >> +
> >> +	if (ipa_end < ipa_base)
> >> +		return -EINVAL;
> >> +
> >> +	return populate_par_region(kvm, ipa_base, ipa_end);
> >> +}
> >> +
> >>  static int set_ipa_state(struct kvm_vcpu *vcpu,
> >>  			 unsigned long ipa,
> >>  			 unsigned long end,
> >> @@ -748,6 +1102,18 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
> >>  		r = kvm_init_ipa_range_realm(kvm, &args);
> >>  		break;
> >>  	}
> >> +	case KVM_CAP_ARM_RME_POPULATE_REALM: {
> >> +		struct kvm_cap_arm_rme_populate_realm_args args;
> >> +		void __user *argp = u64_to_user_ptr(cap->args[1]);
> >> +
> >> +		if (copy_from_user(&args, argp, sizeof(args))) {
> >> +			r = -EFAULT;
> >> +			break;
> >> +		}
> >> +
> >> +		r = kvm_populate_realm(kvm, &args);
> >> +		break;
> >> +	}
> >>  	default:
> >>  		r = -EINVAL;
> >>  		break;  
> >   
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 15/28] KVM: arm64: Handle realm MMIO emulation
  2023-03-10 15:47       ` Steven Price
@ 2023-03-14 15:44         ` Zhi Wang
  2023-03-22 11:51           ` Steven Price
  0 siblings, 1 reply; 190+ messages in thread
From: Zhi Wang @ 2023-03-14 15:44 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 10 Mar 2023 15:47:14 +0000
Steven Price <steven.price@arm.com> wrote:

> On 06/03/2023 15:37, Zhi Wang wrote:
> > On Fri, 27 Jan 2023 11:29:19 +0000
> > Steven Price <steven.price@arm.com> wrote:
> >   
> >> MMIO emulation for a realm cannot be done directly with the VM's
> >> registers as they are protected from the host. However the RMM interface
> >> provides a structure member for providing the read/written value and  
> > 
> > More details would be better for helping the review. I can only see the
> > emulated mmio value from the device model (kvmtool or kvm_io_bus) is put into
> > the GPRS[0] of the RecEntry object. But the rest of the flow is missing.  
> 
> The commit message is out of date (sorry about that). A previous version
> of the spec had a dedicated member for the read/write value, but this
> was changed to just use GPRS[0] as you've spotted. I'll update the text.
> 
> > I guess RMM copies the value in the RecEntry.GPRS[0] to the target GPR in the
> > guest context in RMI_REC_ENTER when seeing RMI_EMULATED_MMIO. This is for
> > the guest MMIO read path.  
> 
> Yes, when entering the guest after an (emulatable) read data abort the
> value in GPRS[0] is loaded from the RecEntry structure into the
> appropriate register for the guest.
> 
> > How about the MMIO write path? I don't see where the RecExit.GPRS[0] is loaded
> > to a variable and returned to the userspace.  
> 

-----
> The RMM will populate GPRS[0] with the written value in this case (even
> if another register was actually used in the instruction). We then
> transfer that to the usual VCPU structure so that the normal fault
> handling logic works.
> 
-----

Are these in this patch or another patch?

> >> we can transfer this to the appropriate VCPU's register entry and then
> >> depend on the generic MMIO handling code in KVM.
> >>
> >> Signed-off-by: Steven Price <steven.price@arm.com>
> >> ---
> >>  arch/arm64/kvm/mmio.c | 7 +++++++
> >>  1 file changed, 7 insertions(+)
> >>
> >> diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
> >> index 3dd38a151d2a..c4879fa3a8d3 100644
> >> --- a/arch/arm64/kvm/mmio.c
> >> +++ b/arch/arm64/kvm/mmio.c
> >> @@ -6,6 +6,7 @@
> >>  
> >>  #include <linux/kvm_host.h>
> >>  #include <asm/kvm_emulate.h>
> >> +#include <asm/rmi_smc.h>
> >>  #include <trace/events/kvm.h>
> >>  
> >>  #include "trace.h"
> >> @@ -109,6 +110,9 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
> >>  			       &data);
> >>  		data = vcpu_data_host_to_guest(vcpu, data, len);
> >>  		vcpu_set_reg(vcpu, kvm_vcpu_dabt_get_rd(vcpu), data);
> >> +
> >> +		if (vcpu_is_rec(vcpu))
> >> +			vcpu->arch.rec.run->entry.gprs[0] = data;  
> > 
> > I think the guest context is maintained by RMM (while KVM can only touch
> > Rec{Entry, Exit} object) so that guest context in the legacy VHE mode is
> > unused.
> > 
> > If yes, I guess here is should be:
> > 
> > if (unlikely(vcpu_is_rec(vcpu)))
> > 	vcpu->arch.rec.run->entry.gprs[0] = data;
> > else
> > 	vcpu_set_reg(vcpu, kvm_vcpu_dabt_get_rd(vcpu), data);  
> 
> Correct. Although there's no harm in updating with vcpu_set_reg(). But
> I'll make the change because it's clearer.
> 
> >>  	}
> >>  
> >>  	/*
> >> @@ -179,6 +183,9 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
> >>  	run->mmio.len		= len;
> >>  	vcpu->mmio_needed	= 1;
> >>  
> >> +	if (vcpu_is_rec(vcpu))
> >> +		vcpu->arch.rec.run->entry.flags |= RMI_EMULATED_MMIO;
> >> +  
> > 
> > Wouldn't it be better to set this in the kvm_handle_mmio_return where the MMIO
> > read emulation has been surely successful?  
> 
> Yes, that makes sense - I'll move this.
> 
> Thanks,
> 
> Steve
> 
> >>  	if (!ret) {
> >>  		/* We handled the access successfully in the kernel. */
> >>  		if (!is_write)  
> >   
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 17/28] arm64: RME: Runtime faulting of memory
  2023-03-10 15:47       ` Steven Price
@ 2023-03-14 16:41         ` Zhi Wang
  0 siblings, 0 replies; 190+ messages in thread
From: Zhi Wang @ 2023-03-14 16:41 UTC (permalink / raw)
  To: Steven Price
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On Fri, 10 Mar 2023 15:47:19 +0000
Steven Price <steven.price@arm.com> wrote:

> On 06/03/2023 18:20, Zhi Wang wrote:
> > On Fri, 27 Jan 2023 11:29:21 +0000
> > Steven Price <steven.price@arm.com> wrote:
> >   
> >> At runtime if the realm guest accesses memory which hasn't yet been
> >> mapped then KVM needs to either populate the region or fault the guest.
> >>
> >> For memory in the lower (protected) region of IPA a fresh page is
> >> provided to the RMM which will zero the contents. For memory in the
> >> upper (shared) region of IPA, the memory from the memslot is mapped
> >> into the realm VM non secure.
> >>
> >> Signed-off-by: Steven Price <steven.price@arm.com>
> >> ---
> >>  arch/arm64/include/asm/kvm_emulate.h | 10 +++++
> >>  arch/arm64/include/asm/kvm_rme.h     | 12 ++++++
> >>  arch/arm64/kvm/mmu.c                 | 64 +++++++++++++++++++++++++---
> >>  arch/arm64/kvm/rme.c                 | 48 +++++++++++++++++++++
> >>  4 files changed, 128 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> >> index 285e62914ca4..3a71b3d2e10a 100644
> >> --- a/arch/arm64/include/asm/kvm_emulate.h
> >> +++ b/arch/arm64/include/asm/kvm_emulate.h
> >> @@ -502,6 +502,16 @@ static inline enum realm_state kvm_realm_state(struct kvm *kvm)
> >>  	return READ_ONCE(kvm->arch.realm.state);
> >>  }
> >>  
> >> +static inline gpa_t kvm_gpa_stolen_bits(struct kvm *kvm)
> >> +{
> >> +	if (kvm_is_realm(kvm)) {
> >> +		struct realm *realm = &kvm->arch.realm;
> >> +
> >> +		return BIT(realm->ia_bits - 1);
> >> +	}
> >> +	return 0;
> >> +}
> >> +  
> > 
> > "stolen" seems a little bit vague. Maybe "shared" bit would be better as
> > SEV-SNP has C bit and TDX has shared bit. It would be nice to align with
> > the common knowledge.  
> 
> The Arm CCA term is the "protected" bit[1] - although the bit is
> backwards as it's cleared to indicate protected... so not ideal naming! ;)
> 
> But it's termed 'stolen' here as it's effectively removed from the set
> of valid address bits. And this function is returning a mask of the bits
> that are not available as address bits. The naming was meant to be
> generic that this could encompass other features that need to reserve
> IPA bits.
> 
> But it's possible this is too generic and perhaps we should just deal
> with a single bit rather than potential masks. Alternatively we could
> invert this and return a set of valid bits:
> 
> static inline gpa_t kvm_gpa_valid_bits(struct kvm *kvm)
> {
> 	if (kvm_is_realm(kvm)) {
> 		struct realm *realm = &kvm->arch.realm;
> 
> 		return ~BIT(realm->ia_bits - 1);
> 	}
> 	return ~(gpa_t)0;
> }
> 
> That would at least match the current usage where the inverse is what we
> need.
> 
> So SEV-SNP or TDX have a concept of a mask to apply to addresses from
> the guest? Can we steal any existing terms?
> 

At a general level, they are using "shared"/"private". TDX uses a
function kvm_gfn_shared_mask() to get the mask and three other macros to
apply the mask to a GPA (IPA)[1]. SEV-SNP reuses SME macros, e.g.
__sme_clr(), to apply the mask[2].

Guess we can take them as a reference: use an inline function to get
the protected bit mask, like kvm_ipa_protected_mask(), with the spec text
you pasted in the function's comment. The name echoes the spec
description.

Then add other necessary functions like kvm_gpa_{is,to}_{shared,private},
which apply the mask to a GPA (IPA), to echo the terms used in generic
KVM code. (Guess we can refine that with realm_is_addr_protected().)

[1] https://www.spinics.net/lists/kernel/msg4718104.html
[2] https://lore.kernel.org/lkml/20230220183847.59159-25-michael.roth@amd.com/

> 
> [1] Technically the spec only states: "Software in a Realm should treat
> the most significant bit of an IPA as a protection attribute." I don't
> think the bit is directly referred to in the spec as anything other than
> "the most significant bit". Although that in itself is confusing as it
> is the most significant *active* bit (i.e the configured IPA size
> changes which bit is used).
> 
> > Also, it would be nice to change the name of gpa_stolen_mask accordingly.
> >   
> >>  static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
> >>  {
> >>  	if (static_branch_unlikely(&kvm_rme_is_available))
> >> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> >> index 9d1583c44a99..303e4a5e5704 100644
> >> --- a/arch/arm64/include/asm/kvm_rme.h
> >> +++ b/arch/arm64/include/asm/kvm_rme.h
> >> @@ -50,6 +50,18 @@ void kvm_destroy_rec(struct kvm_vcpu *vcpu);
> >>  int kvm_rec_enter(struct kvm_vcpu *vcpu);
> >>  int handle_rme_exit(struct kvm_vcpu *vcpu, int rec_run_status);
> >>  
> >> +void kvm_realm_unmap_range(struct kvm *kvm, unsigned long ipa, u64 size);
> >> +int realm_map_protected(struct realm *realm,
> >> +			unsigned long hva,
> >> +			unsigned long base_ipa,
> >> +			struct page *dst_page,
> >> +			unsigned long map_size,
> >> +			struct kvm_mmu_memory_cache *memcache);
> >> +int realm_map_non_secure(struct realm *realm,
> >> +			 unsigned long ipa,
> >> +			 struct page *page,
> >> +			 unsigned long map_size,
> >> +			 struct kvm_mmu_memory_cache *memcache);
> >>  int realm_set_ipa_state(struct kvm_vcpu *vcpu,
> >>  			unsigned long addr, unsigned long end,
> >>  			unsigned long ripas);
> >> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> >> index f29558c5dcbc..5417c273861b 100644
> >> --- a/arch/arm64/kvm/mmu.c
> >> +++ b/arch/arm64/kvm/mmu.c
> >> @@ -235,8 +235,13 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64
> >>  
> >>  	lockdep_assert_held_write(&kvm->mmu_lock);
> >>  	WARN_ON(size & ~PAGE_MASK);
> >> -	WARN_ON(stage2_apply_range(kvm, start, end, kvm_pgtable_stage2_unmap,
> >> -				   may_block));
> >> +
> >> +	if (kvm_is_realm(kvm))
> >> +		kvm_realm_unmap_range(kvm, start, size);
> >> +	else
> >> +		WARN_ON(stage2_apply_range(kvm, start, end,
> >> +					   kvm_pgtable_stage2_unmap,
> >> +					   may_block));
> >>  }
> >>  
> >>  static void unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size)
> >> @@ -250,7 +255,11 @@ static void stage2_flush_memslot(struct kvm *kvm,
> >>  	phys_addr_t addr = memslot->base_gfn << PAGE_SHIFT;
> >>  	phys_addr_t end = addr + PAGE_SIZE * memslot->npages;
> >>  
> >> -	stage2_apply_range_resched(kvm, addr, end, kvm_pgtable_stage2_flush);
> >> +	if (kvm_is_realm(kvm))
> >> +		kvm_realm_unmap_range(kvm, addr, end - addr);
> >> +	else
> >> +		stage2_apply_range_resched(kvm, addr, end,
> >> +					   kvm_pgtable_stage2_flush);
> >>  }
> >>  
> >>  /**
> >> @@ -818,6 +827,10 @@ void stage2_unmap_vm(struct kvm *kvm)
> >>  	struct kvm_memory_slot *memslot;
> >>  	int idx, bkt;
> >>  
> >> +	/* For realms this is handled by the RMM so nothing to do here */
> >> +	if (kvm_is_realm(kvm))
> >> +		return;
> >> +
> >>  	idx = srcu_read_lock(&kvm->srcu);
> >>  	mmap_read_lock(current->mm);
> >>  	write_lock(&kvm->mmu_lock);
> >> @@ -840,6 +853,7 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
> >>  	pgt = mmu->pgt;
> >>  	if (kvm_is_realm(kvm) &&
> >>  	    kvm_realm_state(kvm) != REALM_STATE_DYING) {
> >> +		unmap_stage2_range(mmu, 0, (~0ULL) & PAGE_MASK);
> >>  		write_unlock(&kvm->mmu_lock);
> >>  		kvm_realm_destroy_rtts(&kvm->arch.realm, pgt->ia_bits,
> >>  				       pgt->start_level);
> >> @@ -1190,6 +1204,24 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
> >>  	return vma->vm_flags & VM_MTE_ALLOWED;
> >>  }
> >>  
> >> +static int realm_map_ipa(struct kvm *kvm, phys_addr_t ipa, unsigned long hva,
> >> +			 kvm_pfn_t pfn, unsigned long map_size,
> >> +			 enum kvm_pgtable_prot prot,
> >> +			 struct kvm_mmu_memory_cache *memcache)
> >> +{
> >> +	struct realm *realm = &kvm->arch.realm;
> >> +	struct page *page = pfn_to_page(pfn);
> >> +
> >> +	if (WARN_ON(!(prot & KVM_PGTABLE_PROT_W)))
> >> +		return -EFAULT;
> >> +
> >> +	if (!realm_is_addr_protected(realm, ipa))
> >> +		return realm_map_non_secure(realm, ipa, page, map_size,
> >> +					    memcache);
> >> +
> >> +	return realm_map_protected(realm, hva, ipa, page, map_size, memcache);
> >> +}
> >> +
> >>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >>  			  struct kvm_memory_slot *memslot, unsigned long hva,
> >>  			  unsigned long fault_status)
> >> @@ -1210,9 +1242,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >>  	unsigned long vma_pagesize, fault_granule;
> >>  	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
> >>  	struct kvm_pgtable *pgt;
> >> +	gpa_t gpa_stolen_mask = kvm_gpa_stolen_bits(vcpu->kvm);
> >>  
> >>  	fault_granule = 1UL << ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level);
> >>  	write_fault = kvm_is_write_fault(vcpu);
> >> +
> >> +	/* Realms cannot map read-only */  
> > 
> > Out of curiosity, why? It would be nice to have more explanation in the
> > comment.  
> 
> The RMM specification doesn't support mapping protected memory read
> only. I don't believe there is any reason why it couldn't, but equally I
> don't think there any use cases for a guest needing read-only pages so
> this just isn't supported by the RMM. Since the page is necessarily
> taken away from the host it's fairly irrelevant (from the host's
> perspective) whether it is actually read only or not.
> 
> However, this is technically wrong for the case of unprotected (shared)
> pages - it should be possible to map those read only. But I need to have
> a think about how to fix that up.

If the fault IPA carries the protected bit, can't we do something like:

if (vcpu_is_rec(vcpu) && fault_ipa_is_protected)
	write_fault = true

Are there still other gaps?
>
> >> +	if (vcpu_is_rec(vcpu))
> >> +		write_fault = true;
> >> +
> >>  	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
> >>  	VM_BUG_ON(write_fault && exec_fault);
> >>  
> >> @@ -1272,7 +1310,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >>  	if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
> >>  		fault_ipa &= ~(vma_pagesize - 1);
> >>  
> >> -	gfn = fault_ipa >> PAGE_SHIFT;
> >> +	gfn = (fault_ipa & ~gpa_stolen_mask) >> PAGE_SHIFT;
> >>  	mmap_read_unlock(current->mm);
> >>  
> >>  	/*
> >> @@ -1345,7 +1383,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >>  	 * If we are not forced to use page mapping, check if we are
> >>  	 * backed by a THP and thus use block mapping if possible.
> >>  	 */
> >> -	if (vma_pagesize == PAGE_SIZE && !(force_pte || device)) {
> >> +	/* FIXME: We shouldn't need to disable this for realms */
> >> +	if (vma_pagesize == PAGE_SIZE && !(force_pte || device || kvm_is_realm(kvm))) {  
> > 
> > Why do we have to disable this temporarily?  
> 
> The current uABI (not using memfd) has some serious issues regarding
> huge page support. KVM normally follows the user space mappings of the
> memslot - so if user space has a huge page (transparent or hugetlbs)
> then stage 2 for the guest also gets one.
> 
> However realms sometimes require that the stage 2 differs. The main
> examples are:
> 
>  * RIPAS - if part of a huge page is RIPAS_RAM and part RIPAS_EMPTY then
> the huge page would have to be split.
> 
>  * Initially populated memory: basically the same as above - if the
> populated memory doesn't perfectly align with huge pages, then the
> head/tail pages would need to be broken up.
> 
> Removing this hack allows the huge pages to be created in stage 2, but
> then causes overmapping of the initial contents, then later on when the
> VMM (or guest) attempts to change the properties of the misaligned tail
> it gets an error because the pages are already present in stage 2.
> 
> The planned solution to all this is to stop following the user space
> page tables and create huge pages opportunistically from the memfd that
> backs the protected range. For now this hack exists to avoid things
> "randomly" failing when e.g. the initial kernel image isn't huge page
> aligned. In theory it should be possible to make this work with the
> current uABI, but it's not worth it when we know we're replacing it.

I see. Will dig into it and see if any ideas come to mind.
> 
> >>  		if (fault_status == FSC_PERM && fault_granule > PAGE_SIZE)
> >>  			vma_pagesize = fault_granule;
> >>  		else
> >> @@ -1382,6 +1421,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >>  	 */
> >>  	if (fault_status == FSC_PERM && vma_pagesize == fault_granule)
> >>  		ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot);
> >> +	else if (kvm_is_realm(kvm))
> >> +		ret = realm_map_ipa(kvm, fault_ipa, hva, pfn, vma_pagesize,
> >> +				    prot, memcache);
> >>  	else
> >>  		ret = kvm_pgtable_stage2_map(pgt, fault_ipa, vma_pagesize,
> >>  					     __pfn_to_phys(pfn), prot,
> >> @@ -1437,6 +1479,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
> >>  	struct kvm_memory_slot *memslot;
> >>  	unsigned long hva;
> >>  	bool is_iabt, write_fault, writable;
> >> +	gpa_t gpa_stolen_mask = kvm_gpa_stolen_bits(vcpu->kvm);
> >>  	gfn_t gfn;
> >>  	int ret, idx;
> >>  
> >> @@ -1491,7 +1534,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
> >>  
> >>  	idx = srcu_read_lock(&vcpu->kvm->srcu);
> >>  
> >> -	gfn = fault_ipa >> PAGE_SHIFT;
> >> +	gfn = (fault_ipa & ~gpa_stolen_mask) >> PAGE_SHIFT;
> >>  	memslot = gfn_to_memslot(vcpu->kvm, gfn);
> >>  	hva = gfn_to_hva_memslot_prot(memslot, gfn, &writable);
> >>  	write_fault = kvm_is_write_fault(vcpu);
> >> @@ -1536,6 +1579,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
> >>  		 * of the page size.
> >>  		 */
> >>  		fault_ipa |= kvm_vcpu_get_hfar(vcpu) & ((1 << 12) - 1);
> >> +		fault_ipa &= ~gpa_stolen_mask;
> >>  		ret = io_mem_abort(vcpu, fault_ipa);
> >>  		goto out_unlock;
> >>  	}
> >> @@ -1617,6 +1661,10 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> >>  	if (!kvm->arch.mmu.pgt)
> >>  		return false;
> >>  
> > 
> > Does the unprotected (shared) region of a realm support aging?  
> 
> In theory this should be possible to support by unmapping the NS entry
> and handling the fault. But the hardware access flag optimisation isn't
> available with the RMM, and the overhead of RMI calls to unmap/map could
> be significant.
> 
> For now this isn't something we've looked at, but I guess it might be
> worth trying out when we have some real hardware to benchmark on.
> 
> >> +	/* We don't support aging for Realms */
> >> +	if (kvm_is_realm(kvm))
> >> +		return true;
> >> +
> >>  	WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
> >>  
> >>  	kpte = kvm_pgtable_stage2_mkold(kvm->arch.mmu.pgt,
> >> @@ -1630,6 +1678,10 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> >>  	if (!kvm->arch.mmu.pgt)
> >>  		return false;
> >>  
> >> +	/* We don't support aging for Realms */
> >> +	if (kvm_is_realm(kvm))
> >> +		return true;
> >> +
> >>  	return kvm_pgtable_stage2_is_young(kvm->arch.mmu.pgt,
> >>  					   range->start << PAGE_SHIFT);
> >>  }
> >> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> >> index 3405b43e1421..3d46191798e5 100644
> >> --- a/arch/arm64/kvm/rme.c
> >> +++ b/arch/arm64/kvm/rme.c
> >> @@ -608,6 +608,54 @@ int realm_map_protected(struct realm *realm,
> >>  	return -ENXIO;
> >>  }
> >>  
> >> +int realm_map_non_secure(struct realm *realm,
> >> +			 unsigned long ipa,
> >> +			 struct page *page,
> >> +			 unsigned long map_size,
> >> +			 struct kvm_mmu_memory_cache *memcache)
> >> +{
> >> +	phys_addr_t rd = virt_to_phys(realm->rd);
> >> +	int map_level;
> >> +	int ret = 0;
> >> +	unsigned long desc = page_to_phys(page) |
> >> +			     PTE_S2_MEMATTR(MT_S2_FWB_NORMAL) |
> >> +			     /* FIXME: Read+Write permissions for now */  
> > Why can't we handle the prot from the realm_map_ipa()? Work in progress? :)  
> 
> Yes, work in progress - this comes from the "Realms cannot map
> read-only" in user_mem_abort() above. Since all faults are treated as
> write faults we need to upgrade to read/write here too.
> 
> The prot in realm_map_ipa isn't actually useful currently because we
> simply WARN_ON and return if it doesn't have PROT_W. Again this needs to
> be fixed! It's on my todo list ;)
> 
> Steve
> 
> >> +			     (3 << 6) |
> >> +			     PTE_SHARED;
> >> +
> >> +	if (WARN_ON(!IS_ALIGNED(ipa, map_size)))
> >> +		return -EINVAL;
> >> +
> >> +	switch (map_size) {
> >> +	case PAGE_SIZE:
> >> +		map_level = 3;
> >> +		break;
> >> +	case RME_L2_BLOCK_SIZE:
> >> +		map_level = 2;
> >> +		break;
> >> +	default:
> >> +		return -EINVAL;
> >> +	}
> >> +
> >> +	ret = rmi_rtt_map_unprotected(rd, ipa, map_level, desc);
> >> +
> >> +	if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) {
> >> +		/* Create missing RTTs and retry */
> >> +		int level = RMI_RETURN_INDEX(ret);
> >> +
> >> +		ret = realm_create_rtt_levels(realm, ipa, level, map_level,
> >> +					      memcache);
> >> +		if (WARN_ON(ret))
> >> +			return -ENXIO;
> >> +
> >> +		ret = rmi_rtt_map_unprotected(rd, ipa, map_level, desc);
> >> +	}
> >> +	if (WARN_ON(ret))
> >> +		return -ENXIO;
> >> +
> >> +	return 0;
> >> +}
> >> +
> >>  static int populate_par_region(struct kvm *kvm,
> >>  			       phys_addr_t ipa_base,
> >>  			       phys_addr_t ipa_end)  
> >   
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 15/28] KVM: arm64: Handle realm MMIO emulation
  2023-03-14 15:44         ` Zhi Wang
@ 2023-03-22 11:51           ` Steven Price
  0 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-03-22 11:51 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On 14/03/2023 15:44, Zhi Wang wrote:
> On Fri, 10 Mar 2023 15:47:14 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> On 06/03/2023 15:37, Zhi Wang wrote:
>>> On Fri, 27 Jan 2023 11:29:19 +0000
>>> Steven Price <steven.price@arm.com> wrote:
>>>   
>>>> MMIO emulation for a realm cannot be done directly with the VM's
>>>> registers as they are protected from the host. However the RMM interface
>>>> provides a structure member for providing the read/written value and  
>>>
>>> More details would be better for helping the review. I can only see the
>>> emulated mmio value from the device model (kvmtool or kvm_io_bus) is put into
>>> the GPRS[0] of the RecEntry object. But the rest of the flow is missing.  
>>
>> The commit message is out of date (sorry about that). A previous version
>> of the spec had a dedicated member for the read/write value, but this
>> was changed to just use GPRS[0] as you've spotted. I'll update the text.
>>
>>> I guess RMM copies the value in the RecEntry.GPRS[0] to the target GPR in the
>>> guest context in RMI_REC_ENTER when seeing RMI_EMULATED_MMIO. This is for
>>> the guest MMIO read path.  
>>
>> Yes, when entering the guest after an (emulatable) read data abort the
>> value in GPRS[0] is loaded from the RecEntry structure into the
>> appropriate register for the guest.
>>
>>> How about the MMIO write path? I don't see where the RecExit.GPRS[0] is loaded
>>> to a variable and returned to the userspace.  
>>
> 
> -----
>> The RMM will populate GPRS[0] with the written value in this case (even
>> if another register was actually used in the instruction). We then
>> transfer that to the usual VCPU structure so that the normal fault
>> handling logic works.
>>
> -----
> 
> Are these in this patch or another patch?

The RMM (not included in this particular series[1]) sets the first
element of the 'GPRS' array which is in memory shared with the host.

The Linux half to populate the vcpu structure is in the previous patch:

+static int rec_exit_sync_dabt(struct kvm_vcpu *vcpu)
+{
+	struct rec *rec = &vcpu->arch.rec;
+
+	if (kvm_vcpu_dabt_iswrite(vcpu) && kvm_vcpu_dabt_isvalid(vcpu))
+		vcpu_set_reg(vcpu, kvm_vcpu_dabt_get_rd(vcpu),
+			     rec->run->exit.gprs[0]);
+
+	return kvm_handle_guest_abort(vcpu);
+}

I guess it might make sense to pull the 'if' statement out of the
previous patch and into this one to keep all the MMIO code together.

Steve

[1] This Linux code is written against the RMM specification and in
theory could work with any RMM implementation. But the TF RMM is open
source, so I can point you at the assignment in the latest commit here:
https://git.trustedfirmware.org/TF-RMM/tf-rmm.git/tree/runtime/core/exit.c?id=d294b1b05e8d234d32684a982552aa2a17fb9157#n264

>>>> we can transfer this to the appropriate VCPU's register entry and then
>>>> depend on the generic MMIO handling code in KVM.
>>>>
>>>> Signed-off-by: Steven Price <steven.price@arm.com>
>>>> ---
>>>>  arch/arm64/kvm/mmio.c | 7 +++++++
>>>>  1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
>>>> index 3dd38a151d2a..c4879fa3a8d3 100644
>>>> --- a/arch/arm64/kvm/mmio.c
>>>> +++ b/arch/arm64/kvm/mmio.c
>>>> @@ -6,6 +6,7 @@
>>>>  
>>>>  #include <linux/kvm_host.h>
>>>>  #include <asm/kvm_emulate.h>
>>>> +#include <asm/rmi_smc.h>
>>>>  #include <trace/events/kvm.h>
>>>>  
>>>>  #include "trace.h"
>>>> @@ -109,6 +110,9 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
>>>>  			       &data);
>>>>  		data = vcpu_data_host_to_guest(vcpu, data, len);
>>>>  		vcpu_set_reg(vcpu, kvm_vcpu_dabt_get_rd(vcpu), data);
>>>> +
>>>> +		if (vcpu_is_rec(vcpu))
>>>> +			vcpu->arch.rec.run->entry.gprs[0] = data;  
>>>
>>> I think the guest context is maintained by RMM (while KVM can only touch
>>> Rec{Entry, Exit} object) so that guest context in the legacy VHE mode is
>>> unused.
>>>
>>> If yes, I guess here is should be:
>>>
>>> if (unlikely(vcpu_is_rec(vcpu)))
>>> 	vcpu->arch.rec.run->entry.gprs[0] = data;
>>> else
>>> 	vcpu_set_reg(vcpu, kvm_vcpu_dabt_get_rd(vcpu), data);  
>>
>> Correct. Although there's no harm in updating with vcpu_set_reg(). But
>> I'll make the change because it's clearer.
>>
>>>>  	}
>>>>  
>>>>  	/*
>>>> @@ -179,6 +183,9 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>>>>  	run->mmio.len		= len;
>>>>  	vcpu->mmio_needed	= 1;
>>>>  
>>>> +	if (vcpu_is_rec(vcpu))
>>>> +		vcpu->arch.rec.run->entry.flags |= RMI_EMULATED_MMIO;
>>>> +  
>>>
>>> Wouldn't it be better to set this in the kvm_handle_mmio_return where the MMIO
>>> read emulation has been surely successful?  
>>
>> Yes, that makes sense - I'll move this.
>>
>> Thanks,
>>
>> Steve
>>
>>>>  	if (!ret) {
>>>>  		/* We handled the access successfully in the kernel. */
>>>>  		if (!is_write)  
>>>   
>>
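For illustration, the dispatch agreed above can be modelled in a small userspace sketch. The types and names here (`is_rec`, `rec_run`, the `gprs[0]` slot) are simplified stand-ins based on the quoted hunk, not the real KVM structures:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel types in the quoted hunk. */
struct rec_run {
	struct { uint64_t gprs[8]; } entry;  /* RecEntry object shared with the RMM */
};

struct vcpu {
	bool is_rec;          /* models vcpu_is_rec() */
	uint64_t regs[31];    /* normal guest GPR file */
	struct rec_run run;
};

/*
 * For a realm vCPU the guest register file is owned by the RMM, so the
 * result of an emulated MMIO read must be passed back via the RecEntry
 * object; for a normal vCPU, KVM writes the guest register directly.
 */
static void mmio_return(struct vcpu *vcpu, int rd, uint64_t data)
{
	if (vcpu->is_rec)
		vcpu->run.entry.gprs[0] = data;
	else
		vcpu->regs[rd] = data;
}
```

The point of the review comment is exactly this either/or: for a realm vCPU, writing the (unused) legacy register file is harmless but misleading, so the two paths are made explicit.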
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 16/28] arm64: RME: Allow populating initial contents
  2023-03-14 15:31         ` Zhi Wang
@ 2023-03-22 11:51           ` Steven Price
  0 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2023-03-22 11:51 UTC (permalink / raw)
  To: Zhi Wang
  Cc: kvm, kvmarm, Catalin Marinas, Marc Zyngier, Will Deacon,
	James Morse, Oliver Upton, Suzuki K Poulose, Zenghui Yu,
	linux-arm-kernel, linux-kernel, Joey Gouly, Alexandru Elisei,
	Christoffer Dall, Fuad Tabba, linux-coco

On 14/03/2023 15:31, Zhi Wang wrote:
> On Fri, 10 Mar 2023 15:47:16 +0000
> Steven Price <steven.price@arm.com> wrote:
> 
>> On 06/03/2023 17:34, Zhi Wang wrote:
>>> On Fri, 27 Jan 2023 11:29:20 +0000
>>> Steven Price <steven.price@arm.com> wrote:

<snip>

>>>> +	if (kvm_realm_state(kvm) != REALM_STATE_NEW)
>>>> +		return -EBUSY;  
>>>
>>> Maybe -EINVAL? The realm hasn't been created (RMI_REALM_CREATE has not been
>>> called yet). Userspace shouldn't reach this path.  
>>
>> Well user space can attempt to populate in the ACTIVE state - which is
>> where the idea of 'busy' comes from. Admittedly it's a little confusing
>> when RMI_REALM_CREATE hasn't been called.
>>
>> I'm not particularly bothered about the return code, but it's useful to
>> have a different code to -EINVAL as it's not an invalid argument, but
>> calling at the wrong time. I can't immediately see a better error code
>> though.
>>
> The reason why I feel -EBUSY is a little bit off is that EBUSY usually
> indicates something is already initialized and currently running when
> another calling path wants to operate on it.
> 
> I took a look at the ioctls in arch/arm64/kvm/arm.c. It seems people have
> different opinions on which error code to use when an execution path is
> called at the wrong time:
> 
> For example:
> 
> long kvm_arch_vcpu_ioctl()
> ...
>         case KVM_GET_REG_LIST: {
>                 struct kvm_reg_list __user *user_list = argp;
>                 struct kvm_reg_list reg_list;
>                 unsigned n;
> 
>                 r = -ENOEXEC;
>                 if (unlikely(!kvm_vcpu_initialized(vcpu)))
>                         break;
> 
>                 r = -EPERM;
>                 if (!kvm_arm_vcpu_is_finalized(vcpu))
>                         break;
> 
> If we have to choose one, I prefer -ENOEXEC as -EPERM is stranger. But
> personally my vote goes to -EINVAL.

Ok, I think you've convinced me - I'll change to -EINVAL. It is invalid
use of the API and none of the other error codes seem a great fit.

Although I do wish Linux had more descriptive error codes - I often end
up peppering the kernel with a few printks when using a new API to find
out what I'm doing wrong.

Steve

>> Steve
>>
>>>> +
>>>> +	if (!IS_ALIGNED(args->populate_ipa_base, PAGE_SIZE) ||
>>>> +	    !IS_ALIGNED(args->populate_ipa_size, PAGE_SIZE))
>>>> +		return -EINVAL;
>>>> +
>>>> +	ipa_base = args->populate_ipa_base;
>>>> +	ipa_end = ipa_base + args->populate_ipa_size;
>>>> +
>>>> +	if (ipa_end < ipa_base)
>>>> +		return -EINVAL;
>>>> +
>>>> +	return populate_par_region(kvm, ipa_base, ipa_end);
>>>> +}
>>>> +
>>>>  static int set_ipa_state(struct kvm_vcpu *vcpu,
>>>>  			 unsigned long ipa,
>>>>  			 unsigned long end,
>>>> @@ -748,6 +1102,18 @@ int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>>>>  		r = kvm_init_ipa_range_realm(kvm, &args);
>>>>  		break;
>>>>  	}
>>>> +	case KVM_CAP_ARM_RME_POPULATE_REALM: {
>>>> +		struct kvm_cap_arm_rme_populate_realm_args args;
>>>> +		void __user *argp = u64_to_user_ptr(cap->args[1]);
>>>> +
>>>> +		if (copy_from_user(&args, argp, sizeof(args))) {
>>>> +			r = -EFAULT;
>>>> +			break;
>>>> +		}
>>>> +
>>>> +		r = kvm_populate_realm(kvm, &args);
>>>> +		break;
>>>> +	}
>>>>  	default:
>>>>  		r = -EINVAL;
>>>>  		break;  
>>>   
>>
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-01-27 11:22 [RFC] Support for Arm CCA VMs on Linux Suzuki K Poulose
                   ` (6 preceding siblings ...)
  2023-02-14 17:13 ` Dr. David Alan Gilbert
@ 2023-07-14 13:46 ` Jonathan Cameron
  2023-07-14 15:03   ` Suzuki K Poulose
  2023-10-02 12:43 ` Suzuki K Poulose
  8 siblings, 1 reply; 190+ messages in thread
From: Jonathan Cameron @ 2023-07-14 13:46 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-coco, linux-kernel, kvm, kvmarm, linux-arm-kernel,
	Alexandru Elisei, Andrew Jones, Catalin Marinas, Chao Peng,
	Christoffer Dall, Fuad Tabba, James Morse, Jean-Philippe Brucker,
	Joey Gouly, Marc Zyngier, Mark Rutland, Oliver Upton,
	Paolo Bonzini, Quentin Perret, Sean Christopherson, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

On Fri, 27 Jan 2023 11:22:48 +0000
Suzuki K Poulose <suzuki.poulose@arm.com> wrote:


Hi Suzuki,

Looking at this has been on the backlog for a while from our side and we are finally
getting to it.  So before we dive in and given it's been 6 months, I wanted to check
if you expect to post a new version shortly or if there is a rebased tree available?

Jonathan
  
> We are happy to announce the early RFC version of the Arm
> Confidential Compute Architecture (CCA) support for the Linux
> stack. The intention is to seek early feedback in the following areas:
>  * KVM integration of the Arm CCA
>  * KVM UABI for managing the Realms, seeking to generalise the operations
>    wherever possible with other Confidential Compute solutions.
>    Note: This version doesn't support Guest Private memory, which will be added
>    later (see below).
>  * Linux Guest support for Realms
> 
> Arm CCA Introduction
> =====================
> 
> The Arm CCA is a reference software architecture and implementation that builds
> on the Realm Management Extension (RME), enabling the execution of virtual
> machines while preventing access by more privileged software, such as the
> hypervisor. The Arm CCA allows the hypervisor to control the VM, but removes
> its right to access the code, register state or data used by the VM.
> More information on the architecture is available here[0].
> 
>     Arm CCA Reference Software Architecture
> 
>         Realm World    ||    Normal World   ||  Secure World  ||
>                        ||        |          ||                ||
>  EL0 x-------x         || x----x | x------x ||                ||
>      | Realm |         || |    | | |      | ||                ||
>      |       |         || | VM | | |      | ||                ||
>  ----|  VM*  |---------||-|    |---|      |-||----------------||
>      |       |         || |    | | |  H   | ||                ||
>  EL1 x-------x         || x----x | |      | ||                ||
>          ^             ||        | |  o   | ||                ||
>          |             ||        | |      | ||                ||
>  ------- R*------------------------|  s  -|---------------------
>          S             ||          |      | ||                ||
>          I             ||          |  t   | ||                ||
>          |             ||          |      | ||                || 
>          v             ||          x------x ||                ||
>  EL2    RMM*           ||              ^    ||                ||
>          ^             ||              |    ||                ||
>  ========|=============================|========================
>          |                             | SMC
>          x--------- *RMI* -------------x
> 
>  EL3                   Root World
>                        EL3 Firmware
>  ===============================================================
> Where :
>  RMM - Realm Management Monitor
>  RMI - Realm Management Interface
>  RSI - Realm Service Interface
>  SMC - Secure Monitor Call
> 
> RME introduces a new security state "Realm world", in addition to the
> traditional Secure and Non-Secure states. The Arm CCA defines a new component,
> Realm Management Monitor (RMM) that runs at R-EL2. This is a standard piece of
> firmware, verified, installed and loaded by the EL3 firmware (e.g., TF-A), at
> system boot.
> 
> The RMM provides standard interfaces - Realm Management Interface (RMI) - to the
> Normal world hypervisor to manage the VMs running in the Realm world (also called
> Realms for short). These are exposed via SMC and are routed through the EL3
> firmware.
> The RMI interface includes:
>   - Move a physical page from the Normal world to the Realm world
>   - Creating a Realm with requested parameters, tracked via Realm Descriptor (RD)
>   - Creating VCPUs aka Realm Execution Context (REC), with initial register state.
>   - Create stage2 translation table at any level.
>   - Load initial images into Realm Memory from normal world memory
>   - Schedule RECs (vCPUs) and handle exits
>   - Inject virtual interrupts into the Realm
>   - Service stage2 runtime faults with pages (provided by host, scrubbed by RMM).
>   - Create "shared" mappings that can be accessed by VMM/Hyp.
>   - Reclaim the memory allocated for the RAM and RTTs (Realm Translation Tables)
> 
> However, v1.0 of the RMM specification doesn't support:
>  - Paging protected memory of a Realm VM. Thus the pages backing the protected
>    memory region must be pinned.
>  - Live migration of Realms.
>  - Trusted Device assignment.
>  - Physical interrupt backed Virtual interrupts for Realms
> 
> RMM also provides certain services to the Realms via SMC, called Realm Service
> Interface (RSI). These include:
>  - Realm Guest Configuration.
>  - Attestation & Measurement services
>  - Managing the state of an Intermediate Physical Address (IPA aka GPA) page.
>  - Host Call service (Communication with the Normal world Hypervisor)
> 
> The specification for the RMM software is currently at *v1.0-Beta2* and the
> latest version is available here [1].
> 
> The Trusted Firmware foundation has an implementation of the RMM - TF-RMM -
> available here [3].
> 
> Implementation
> =================
> 
> This version of the stack is based on the RMM specification v1.0-Beta0[2], with
> following exceptions :
>   - TF-RMM/KVM currently doesn't support the optional features of PMU,
>      SVE and Self-hosted debug (coming soon).
>   - The RSI_HOST_CALL structure alignment requirement is reduced to match
>      RMM v1.0 Beta1
>   - RMI/RSI version numbers do not match the RMM spec. This will be
>     resolved once the spec/implementation is complete, across TF-RMM+Linux stack.
> 
> We plan to update the stack to support the latest version of the RMMv1.0 spec
> in the coming revisions.
> 
> This release includes the following components :
> 
>  a) Linux Kernel
>      i) Host / KVM support - Support for driving the Realms via RMI. This is
>      dependent on the kernel running at EL2 (aka VHE mode). It also provides
>      UABI for VMMs to manage the Realm VMs. The support is restricted to 4K page
>      size, matching the Stage2 granule supported by RMM. The VMM is responsible
>      for making sure the guest memory is locked.
> 
>        TODO: Guest Private memory[10] integration - We have been following the
>        series and support will be added once it is merged upstream.
>      
>      ii) Guest support - Support for a Linux Kernel to run in the Realm VM at
>      Realm-EL1, using RSI services. This includes virtio support (virtio-v1.0
>      only). All I/O is treated as non-secure/shared.
>  
>  c) kvmtool - VMM changes required to manage Realm VMs. No guest private memory
>     as mentioned above.
>  d) kvm-unit-tests - Support for running in Realms along with additional tests
>     for RSI ABI.
> 
> Running the stack
> ====================
> 
> To run/test the stack, you would need the following components :
> 
> 1) FVP Base AEM RevC model with FEAT_RME support [4]
> 2) TF-A firmware for EL3 [5]
> 3) TF-A RMM for R-EL2 [3]
> 4) Linux Kernel [6]
> 5) kvmtool [7]
> 6) kvm-unit-tests [8]
> 
> Instructions for building the firmware components and running the model are
> available here [9]. Once the host kernel is booted, a Realm can be launched by
> invoking the `lkvm` command as follows:
> 
>  $ lkvm run --realm 				 \
> 	 --measurement-algo=["sha256", "sha512"] \
> 	 --disable-sve				 \
> 	 <normal-vm-options>
> 
> Where:
>  * --measurement-algo (Optional) specifies the algorithm selected for creating the
>    initial measurements by the RMM for this Realm (defaults to sha256).
>  * GICv3 is mandatory for the Realms.
>  * SVE is not yet supported in the TF-RMM, and thus must be disabled using
>    --disable-sve
> 
> You may also run the kvm-unit-tests inside the Realm world, using similar
> options to the above.
> 
> 
> Links
> ============
> 
> [0] Arm CCA Landing page (See Key Resources section for various documentation)
>     https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture
> 
> [1] RMM Specification Latest
>     https://developer.arm.com/documentation/den0137/latest
> 
> [2] RMM v1.0-Beta0 specification
>     https://developer.arm.com/documentation/den0137/1-0bet0/
> 
> [3] Trusted Firmware RMM - TF-RMM
>     https://www.trustedfirmware.org/projects/tf-rmm/
>     GIT: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
> 
> [4] FVP Base RevC AEM Model (available on x86_64 / Arm64 Linux)
>     https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms
> 
> [5] Trusted Firmware for A class
>     https://www.trustedfirmware.org/projects/tf-a/
> 
> [6] Linux kernel support for Arm-CCA
>     https://gitlab.arm.com/linux-arm/linux-cca
>     Host Support branch:	cca-host/rfc-v1
>     Guest Support branch:	cca-guest/rfc-v1
> 
> [7] kvmtool support for Arm CCA
>     https://gitlab.arm.com/linux-arm/kvmtool-cca cca/rfc-v1
> 
> [8] kvm-unit-tests support for Arm CCA
>     https://gitlab.arm.com/linux-arm/kvm-unit-tests-cca  cca/rfc-v1
> 
> [9] Instructions for Building Firmware components and running the model, see
>     section 4.19.2 "Building and running TF-A with RME"
>     https://trustedfirmware-a.readthedocs.io/en/latest/components/realm-management-extension.html#building-and-running-tf-a-with-rme
> 
> [10] fd based Guest Private memory for KVM
>    https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng@linux.intel.com
> 
> Cc: Alexandru Elisei <alexandru.elisei@arm.com>
> Cc: Andrew Jones <andrew.jones@linux.dev>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Chao Peng <chao.p.peng@linux.intel.com>
> Cc: Christoffer Dall <christoffer.dall@arm.com>
> Cc: Fuad Tabba <tabba@google.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Joey Gouly <Joey.Gouly@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Quentin Perret <qperret@google.com>
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Zenghui Yu <yuzenghui@huawei.com>
> To: linux-coco@lists.linux.dev
> To: kvmarm@lists.linux.dev
> Cc: kvmarm@lists.cs.columbia.edu
> Cc: linux-arm-kernel@lists.infradead.org
> To: linux-kernel@vger.kernel.org
> To: kvm@vger.kernel.org
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-07-14 13:46 ` Jonathan Cameron
@ 2023-07-14 15:03   ` Suzuki K Poulose
  2023-07-14 16:28     ` Jonathan Cameron
  0 siblings, 1 reply; 190+ messages in thread
From: Suzuki K Poulose @ 2023-07-14 15:03 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-coco, linux-kernel, kvm, kvmarm, linux-arm-kernel,
	Alexandru Elisei, Andrew Jones, Catalin Marinas, Chao Peng,
	Christoffer Dall, Fuad Tabba, James Morse, Jean-Philippe Brucker,
	Joey Gouly, Marc Zyngier, Mark Rutland, Oliver Upton,
	Paolo Bonzini, Quentin Perret, Sean Christopherson, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

Hi Jonathan

On 14/07/2023 14:46, Jonathan Cameron wrote:
> On Fri, 27 Jan 2023 11:22:48 +0000
> Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> 
> 
> Hi Suzuki,
> 
> Looking at this has been on the backlog for a while from our side and we are finally
> getting to it.  So before we dive in and given it's been 6 months, I wanted to check
> if you expect to post a new version shortly or if there is a rebased tree available?

Thanks for your interest. We have been updating our trees to the latest
RMM specification (v1.0-eac2 now) and also rebasing Linux/KVM on top of
v6.5-rc1. We will post this as soon as we have all the components ready
(and the TF-RMM). At the earliest, this would be around early September.

That said, the revised version will have the following changes :
  - Changes to the Stage2 management
  - Changes to RMM memory management for Realm
  - PMU/SVE support

Otherwise, most of the changes remain the same (e.g., UABI). Happy to
hear feedback on those areas.


Kind regards
Suzuki

> 
> Jonathan
>    
>> We are happy to announce the early RFC version of the Arm
>> Confidential Compute Architecture (CCA) support for the Linux
>> stack. The intention is to seek early feedback in the following areas:
>>   * KVM integration of the Arm CCA
>>   * KVM UABI for managing the Realms, seeking to generalise the operations
>>     wherever possible with other Confidential Compute solutions.
>>     Note: This version doesn't support Guest Private memory, which will be added
>>     later (see below).
>>   * Linux Guest support for Realms
>>
>> Arm CCA Introduction
>> =====================
>>
>> The Arm CCA is a reference software architecture and implementation that builds
>> on the Realm Management Extension (RME), enabling the execution of virtual
>> machines while preventing access by more privileged software, such as the
>> hypervisor. The Arm CCA allows the hypervisor to control the VM, but removes
>> its right to access the code, register state or data used by the VM.
>> More information on the architecture is available here[0].
>>
>>      Arm CCA Reference Software Architecture
>>
>>          Realm World    ||    Normal World   ||  Secure World  ||
>>                         ||        |          ||                ||
>>   EL0 x-------x         || x----x | x------x ||                ||
>>       | Realm |         || |    | | |      | ||                ||
>>       |       |         || | VM | | |      | ||                ||
>>   ----|  VM*  |---------||-|    |---|      |-||----------------||
>>       |       |         || |    | | |  H   | ||                ||
>>   EL1 x-------x         || x----x | |      | ||                ||
>>           ^             ||        | |  o   | ||                ||
>>           |             ||        | |      | ||                ||
>>   ------- R*------------------------|  s  -|---------------------
>>           S             ||          |      | ||                ||
>>           I             ||          |  t   | ||                ||
>>           |             ||          |      | ||                ||
>>           v             ||          x------x ||                ||
>>   EL2    RMM*           ||              ^    ||                ||
>>           ^             ||              |    ||                ||
>>   ========|=============================|========================
>>           |                             | SMC
>>           x--------- *RMI* -------------x
>>
>>   EL3                   Root World
>>                         EL3 Firmware
>>   ===============================================================
>> Where :
>>   RMM - Realm Management Monitor
>>   RMI - Realm Management Interface
>>   RSI - Realm Service Interface
>>   SMC - Secure Monitor Call
>>
>> RME introduces a new security state "Realm world", in addition to the
>> traditional Secure and Non-Secure states. The Arm CCA defines a new component,
>> Realm Management Monitor (RMM) that runs at R-EL2. This is a standard piece of
>> firmware, verified, installed and loaded by the EL3 firmware (e.g., TF-A), at
>> system boot.
>>
>> The RMM provides standard interfaces - Realm Management Interface (RMI) - to the
>> Normal world hypervisor to manage the VMs running in the Realm world (also called
>> Realms for short). These are exposed via SMC and are routed through the EL3
>> firmware.
>> The RMI interface includes:
>>    - Move a physical page from the Normal world to the Realm world
>>    - Creating a Realm with requested parameters, tracked via Realm Descriptor (RD)
>>    - Creating VCPUs aka Realm Execution Context (REC), with initial register state.
>>    - Create stage2 translation table at any level.
>>    - Load initial images into Realm Memory from normal world memory
>>    - Schedule RECs (vCPUs) and handle exits
>>    - Inject virtual interrupts into the Realm
>>    - Service stage2 runtime faults with pages (provided by host, scrubbed by RMM).
>>    - Create "shared" mappings that can be accessed by VMM/Hyp.
>>    - Reclaim the memory allocated for the RAM and RTTs (Realm Translation Tables)
>>
>> However, v1.0 of the RMM specification doesn't support:
>>   - Paging protected memory of a Realm VM. Thus the pages backing the protected
>>     memory region must be pinned.
>>   - Live migration of Realms.
>>   - Trusted Device assignment.
>>   - Physical interrupt backed Virtual interrupts for Realms
>>
>> RMM also provides certain services to the Realms via SMC, called Realm Service
>> Interface (RSI). These include:
>>   - Realm Guest Configuration.
>>   - Attestation & Measurement services
>>   - Managing the state of an Intermediate Physical Address (IPA aka GPA) page.
>>   - Host Call service (Communication with the Normal world Hypervisor)
>>
>> The specification for the RMM software is currently at *v1.0-Beta2* and the
>> latest version is available here [1].
>>
>> The Trusted Firmware foundation has an implementation of the RMM - TF-RMM -
>> available here [3].
>>
>> Implementation
>> =================
>>
>> This version of the stack is based on the RMM specification v1.0-Beta0[2], with
>> following exceptions :
>>    - TF-RMM/KVM currently doesn't support the optional features of PMU,
>>       SVE and Self-hosted debug (coming soon).
>>    - The RSI_HOST_CALL structure alignment requirement is reduced to match
>>       RMM v1.0 Beta1
>>    - RMI/RSI version numbers do not match the RMM spec. This will be
>>      resolved once the spec/implementation is complete, across TF-RMM+Linux stack.
>>
>> We plan to update the stack to support the latest version of the RMMv1.0 spec
>> in the coming revisions.
>>
>> This release includes the following components :
>>
>>   a) Linux Kernel
>>       i) Host / KVM support - Support for driving the Realms via RMI. This is
>>       dependent on the kernel running at EL2 (aka VHE mode). It also provides
>>       UABI for VMMs to manage the Realm VMs. The support is restricted to 4K page
>>       size, matching the Stage2 granule supported by RMM. The VMM is responsible
>>       for making sure the guest memory is locked.
>>
>>         TODO: Guest Private memory[10] integration - We have been following the
>>         series and support will be added once it is merged upstream.
>>       
>>       ii) Guest support - Support for a Linux Kernel to run in the Realm VM at
>>       Realm-EL1, using RSI services. This includes virtio support (virtio-v1.0
>>       only). All I/O is treated as non-secure/shared.
>>   
>>   c) kvmtool - VMM changes required to manage Realm VMs. No guest private memory
>>      as mentioned above.
>>   d) kvm-unit-tests - Support for running in Realms along with additional tests
>>      for RSI ABI.
>>
>> Running the stack
>> ====================
>>
>> To run/test the stack, you would need the following components :
>>
>> 1) FVP Base AEM RevC model with FEAT_RME support [4]
>> 2) TF-A firmware for EL3 [5]
>> 3) TF-A RMM for R-EL2 [3]
>> 4) Linux Kernel [6]
>> 5) kvmtool [7]
>> 6) kvm-unit-tests [8]
>>
>> Instructions for building the firmware components and running the model are
>> available here [9]. Once the host kernel is booted, a Realm can be launched by
>> invoking the `lkvm` command as follows:
>>
>>   $ lkvm run --realm 				 \
>> 	 --measurement-algo=["sha256", "sha512"] \
>> 	 --disable-sve				 \
>> 	 <normal-vm-options>
>>
>> Where:
>>   * --measurement-algo (Optional) specifies the algorithm selected for creating the
>>     initial measurements by the RMM for this Realm (defaults to sha256).
>>   * GICv3 is mandatory for the Realms.
>>   * SVE is not yet supported in the TF-RMM, and thus must be disabled using
>>     --disable-sve
>>
>> You may also run the kvm-unit-tests inside the Realm world, using similar
>> options to the above.
>>
>>
>> Links
>> ============
>>
>> [0] Arm CCA Landing page (See Key Resources section for various documentation)
>>      https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture
>>
>> [1] RMM Specification Latest
>>      https://developer.arm.com/documentation/den0137/latest
>>
>> [2] RMM v1.0-Beta0 specification
>>      https://developer.arm.com/documentation/den0137/1-0bet0/
>>
>> [3] Trusted Firmware RMM - TF-RMM
>>      https://www.trustedfirmware.org/projects/tf-rmm/
>>      GIT: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
>>
>> [4] FVP Base RevC AEM Model (available on x86_64 / Arm64 Linux)
>>      https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms
>>
>> [5] Trusted Firmware for A class
>>      https://www.trustedfirmware.org/projects/tf-a/
>>
>> [6] Linux kernel support for Arm-CCA
>>      https://gitlab.arm.com/linux-arm/linux-cca
>>      Host Support branch:	cca-host/rfc-v1
>>      Guest Support branch:	cca-guest/rfc-v1
>>
>> [7] kvmtool support for Arm CCA
>>      https://gitlab.arm.com/linux-arm/kvmtool-cca cca/rfc-v1
>>
>> [8] kvm-unit-tests support for Arm CCA
>>      https://gitlab.arm.com/linux-arm/kvm-unit-tests-cca  cca/rfc-v1
>>
>> [9] Instructions for Building Firmware components and running the model, see
>>      section 4.19.2 "Building and running TF-A with RME"
>>      https://trustedfirmware-a.readthedocs.io/en/latest/components/realm-management-extension.html#building-and-running-tf-a-with-rme
>>
>> [10] fd based Guest Private memory for KVM
>>     https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng@linux.intel.com
>>
>> Cc: Alexandru Elisei <alexandru.elisei@arm.com>
>> Cc: Andrew Jones <andrew.jones@linux.dev>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Chao Peng <chao.p.peng@linux.intel.com>
>> Cc: Christoffer Dall <christoffer.dall@arm.com>
>> Cc: Fuad Tabba <tabba@google.com>
>> Cc: James Morse <james.morse@arm.com>
>> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
>> Cc: Joey Gouly <Joey.Gouly@arm.com>
>> Cc: Marc Zyngier <maz@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Oliver Upton <oliver.upton@linux.dev>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Quentin Perret <qperret@google.com>
>> Cc: Sean Christopherson <seanjc@google.com>
>> Cc: Steven Price <steven.price@arm.com>
>> Cc: Thomas Huth <thuth@redhat.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Zenghui Yu <yuzenghui@huawei.com>
>> To: linux-coco@lists.linux.dev
>> To: kvmarm@lists.linux.dev
>> Cc: kvmarm@lists.cs.columbia.edu
>> Cc: linux-arm-kernel@lists.infradead.org
>> To: linux-kernel@vger.kernel.org
>> To: kvm@vger.kernel.org
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-07-14 15:03   ` Suzuki K Poulose
@ 2023-07-14 16:28     ` Jonathan Cameron
  2023-07-17  9:40       ` Suzuki K Poulose
  0 siblings, 1 reply; 190+ messages in thread
From: Jonathan Cameron @ 2023-07-14 16:28 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-coco, linux-kernel, kvm, kvmarm, linux-arm-kernel,
	Alexandru Elisei, Andrew Jones, Catalin Marinas, Chao Peng,
	Christoffer Dall, Fuad Tabba, James Morse, Jean-Philippe Brucker,
	Joey Gouly, Marc Zyngier, Mark Rutland, Oliver Upton,
	Paolo Bonzini, Quentin Perret, Sean Christopherson, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

On Fri, 14 Jul 2023 16:03:37 +0100
Suzuki K Poulose <suzuki.poulose@arm.com> wrote:

> Hi Jonathan
> 
> On 14/07/2023 14:46, Jonathan Cameron wrote:
> > On Fri, 27 Jan 2023 11:22:48 +0000
> > Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> > 
> > 
> > Hi Suzuki,
> > 
> > Looking at this has been on the backlog for a while from our side and we are finally
> > getting to it.  So before we dive in and given it's been 6 months, I wanted to check
> > if you expect to post a new version shortly or if there is a rebased tree available?  
> 
> Thanks for your interest. We have been updating our trees to the latest
> RMM specification (v1.0-eac2 now) and also rebasing Linux/KVM on top of
> v6.5-rc1. We will post this as soon as we have all the components ready
> (and the TF-RMM). At the earliest, this would be around early September.
> 
> That said, the revised version will have the following changes :
>   - Changes to the Stage2 management
>   - Changes to RMM memory management for Realm
>   - PMU/SVE support
> 
> Otherwise, most of the changes remain the same (e.g., UABI). Happy to
> hear feedback on those areas.

Hi Suzuki,

Thanks for the update.  If there is any chance of visibility of changes
via a git tree etc that would be great in the meantime.  If not, such is life
and I'll try to wait patiently :) + we'll review the existing code.

Jonathan

> 
> 
> Kind regards
> Suzuki
> 
> > 
> > Jonathan
> >      
> >> We are happy to announce the early RFC version of the Arm
> >> Confidential Compute Architecture (CCA) support for the Linux
> >> stack. The intention is to seek early feedback in the following areas:
> >>   * KVM integration of the Arm CCA
> >>   * KVM UABI for managing the Realms, seeking to generalise the operations
> >>     wherever possible with other Confidential Compute solutions.
> >>     Note: This version doesn't support Guest Private memory, which will be added
> >>     later (see below).
> >>   * Linux Guest support for Realms
> >>
> >> Arm CCA Introduction
> >> =====================
> >>
> >> The Arm CCA is a reference software architecture and implementation that builds
> >> on the Realm Management Extension (RME), enabling the execution of virtual
> >> machines while preventing access by more privileged software, such as the
> >> hypervisor. The Arm CCA allows the hypervisor to control the VM, but removes
> >> its right to access the code, register state or data used by the VM.
> >> More information on the architecture is available here[0].
> >>
> >>      Arm CCA Reference Software Architecture
> >>
> >>          Realm World    ||    Normal World   ||  Secure World  ||
> >>                         ||        |          ||                ||
> >>   EL0 x-------x         || x----x | x------x ||                ||
> >>       | Realm |         || |    | | |      | ||                ||
> >>       |       |         || | VM | | |      | ||                ||
> >>   ----|  VM*  |---------||-|    |---|      |-||----------------||
> >>       |       |         || |    | | |  H   | ||                ||
> >>   EL1 x-------x         || x----x | |      | ||                ||
> >>           ^             ||        | |  o   | ||                ||
> >>           |             ||        | |      | ||                ||
> >>   ------- R*------------------------|  s  -|---------------------
> >>           S             ||          |      | ||                ||
> >>           I             ||          |  t   | ||                ||
> >>           |             ||          |      | ||                ||
> >>           v             ||          x------x ||                ||
> >>   EL2    RMM*           ||              ^    ||                ||
> >>           ^             ||              |    ||                ||
> >>   ========|=============================|========================
> >>           |                             | SMC
> >>           x--------- *RMI* -------------x
> >>
> >>   EL3                   Root World
> >>                         EL3 Firmware
> >>   ===============================================================
> >> Where :
> >>   RMM - Realm Management Monitor
> >>   RMI - Realm Management Interface
> >>   RSI - Realm Service Interface
> >>   SMC - Secure Monitor Call
> >>
> >> RME introduces a new security state "Realm world", in addition to the
> >> traditional Secure and Non-Secure states. The Arm CCA defines a new component,
> >> Realm Management Monitor (RMM) that runs at R-EL2. This is a standard piece of
> >> firmware, verified, installed and loaded by the EL3 firmware (e.g., TF-A), at
> >> system boot.
> >>
> >> The RMM provides standard interfaces - Realm Management Interface (RMI) - to the
> >> Normal world hypervisor to manage the VMs running in the Realm world (called
> >> Realms for short). These are exposed via SMC and are routed through the EL3
> >> firmware.
> >> The RMI interface includes:
> >>    - Move a physical page from the Normal world to the Realm world
> >>    - Creating a Realm with requested parameters, tracked via Realm Descriptor (RD)
> >>    - Creating VCPUs aka Realm Execution Context (REC), with initial register state.
> >>    - Create stage2 translation table at any level.
> >>    - Load initial images into Realm Memory from normal world memory
> >>    - Schedule RECs (vCPUs) and handle exits
> >>    - Inject virtual interrupts into the Realm
> >>    - Service stage2 runtime faults with pages (provided by host, scrubbed by RMM).
> >>    - Create "shared" mappings that can be accessed by VMM/Hyp.
> >>    - Reclaim the memory allocated for the RAM and RTTs (Realm Translation Tables)
> >>
> >> However, v1.0 of the RMM specification doesn't support:
> >>   - Paging protected memory of a Realm VM. Thus the pages backing the protected
> >>     memory region must be pinned.
> >>   - Live migration of Realms.
> >>   - Trusted Device assignment.
> >>   - Physical interrupt backed Virtual interrupts for Realms
> >>
> >> RMM also provides certain services to the Realms via SMC, called Realm Service
> >> Interface (RSI). These include:
> >>   - Realm Guest Configuration.
> >>   - Attestation & Measurement services
> >>   - Managing the state of an Intermediate Physical Address (IPA aka GPA) page.
> >>   - Host Call service (Communication with the Normal world Hypervisor)
> >>
> >> The specification for the RMM software is currently at *v1.0-Beta2* and the
> >> latest version is available here [1].
> >>
> >> The Trusted Firmware foundation has an implementation of the RMM - TF-RMM -
> >> available here [3].
> >>
> >> Implementation
> >> =================
> >>
> >> This version of the stack is based on the RMM specification v1.0-Beta0[2], with
> >> the following exceptions :
> >>    - TF-RMM/KVM currently doesn't support the optional features of PMU,
> >>       SVE and Self-hosted debug (coming soon).
> >>    - The RSI_HOST_CALL structure alignment requirement is reduced to match
> >>       RMM v1.0 Beta1
> >>    - RMI/RSI version numbers do not match the RMM spec. This will be
> >>      resolved once the spec/implementation is complete, across TF-RMM+Linux stack.
> >>
> >> We plan to update the stack to support the latest version of the RMMv1.0 spec
> >> in the coming revisions.
> >>
> >> This release includes the following components :
> >>
> >>   a) Linux Kernel
> >>       i) Host / KVM support - Support for driving the Realms via RMI. This is
> >>       dependent on running in the Kernel at EL2 (aka VHE mode). Also provides
> >>       UABI for VMMs to manage the Realm VMs. The support is restricted to 4K page
> >>       size, matching the Stage2 granule supported by RMM. The VMM is responsible
> >>       for making sure the guest memory is locked.
> >>
> >>         TODO: Guest Private memory[10] integration - We have been following the
> >>         series and support will be added once it is merged upstream.
> >>       
> >>       ii) Guest support - Support for a Linux Kernel to run in the Realm VM at
> >>       Realm-EL1, using RSI services. This includes virtio support (virtio-v1.0
> >> only). All I/O is treated as non-secure/shared.
> >>   
> >>   b) kvmtool - VMM changes required to manage Realm VMs. No guest private memory
> >>      as mentioned above.
> >>   c) kvm-unit-tests - Support for running in Realms along with additional tests
> >>      for RSI ABI.
> >>
> >> Running the stack
> >> ====================
> >>
> >> To run/test the stack, you would need the following components :
> >>
> >> 1) FVP Base AEM RevC model with FEAT_RME support [4]
> >> 2) TF-A firmware for EL3 [5]
> >> 3) TF-A RMM for R-EL2 [3]
> >> 4) Linux Kernel [6]
> >> 5) kvmtool [7]
> >> 6) kvm-unit-tests [8]
> >>
> >> Instructions for building the firmware components and running the model are
> >> available here [9]. Once the host kernel is booted, a Realm can be launched by
> >> invoking the `lkvm` command as follows:
> >>
> >>   $ lkvm run --realm 				 \
> >> 	 --measurement-algo=["sha256", "sha512"] \
> >> 	 --disable-sve				 \
> >> 	 <normal-vm-options>
> >>
> >> Where:
> >>   * --measurement-algo (Optional) specifies the algorithm selected for creating the
> >>     initial measurements by the RMM for this Realm (defaults to sha256).
> >>   * GICv3 is mandatory for the Realms.
> >>   * SVE is not yet supported in the TF-RMM, and thus must be disabled using
> >>     --disable-sve
> >>
> >> You may also run the kvm-unit-tests inside the Realm world, using similar
> >> options to the above.
> >>
> >>
> >> Links
> >> ============
> >>
> >> [0] Arm CCA Landing page (See Key Resources section for various documentations)
> >>      https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture
> >>
> >> [1] RMM Specification Latest
> >>      https://developer.arm.com/documentation/den0137/latest
> >>
> >> [2] RMM v1.0-Beta0 specification
> >>      https://developer.arm.com/documentation/den0137/1-0bet0/
> >>
> >> [3] Trusted Firmware RMM - TF-RMM
> >>      https://www.trustedfirmware.org/projects/tf-rmm/
> >>      GIT: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
> >>
> >> [4] FVP Base RevC AEM Model (available on x86_64 / Arm64 Linux)
> >>      https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms
> >>
> >> [5] Trusted Firmware for A class
> >>      https://www.trustedfirmware.org/projects/tf-a/
> >>
> >> [6] Linux kernel support for Arm-CCA
> >>      https://gitlab.arm.com/linux-arm/linux-cca
> >>      Host Support branch:	cca-host/rfc-v1
> >>      Guest Support branch:	cca-guest/rfc-v1
> >>
> >> [7] kvmtool support for Arm CCA
> >>      https://gitlab.arm.com/linux-arm/kvmtool-cca cca/rfc-v1
> >>
> >> [8] kvm-unit-tests support for Arm CCA
> >>      https://gitlab.arm.com/linux-arm/kvm-unit-tests-cca  cca/rfc-v1
> >>
> >> [9] Instructions for Building Firmware components and running the model, see
> >>      section 4.19.2 "Building and running TF-A with RME"
> >>      https://trustedfirmware-a.readthedocs.io/en/latest/components/realm-management-extension.html#building-and-running-tf-a-with-rme
> >>
> >> [10] fd based Guest Private memory for KVM
> >>     https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng@linux.intel.com
> >>
> >> Cc: Alexandru Elisei <alexandru.elisei@arm.com>
> >> Cc: Andrew Jones <andrew.jones@linux.dev>
> >> Cc: Catalin Marinas <catalin.marinas@arm.com>
> >> Cc: Chao Peng <chao.p.peng@linux.intel.com>
> >> Cc: Christoffer Dall <christoffer.dall@arm.com>
> >> Cc: Fuad Tabba <tabba@google.com>
> >> Cc: James Morse <james.morse@arm.com>
> >> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >> Cc: Joey Gouly <Joey.Gouly@arm.com>
> >> Cc: Marc Zyngier <maz@kernel.org>
> >> Cc: Mark Rutland <mark.rutland@arm.com>
> >> Cc: Oliver Upton <oliver.upton@linux.dev>
> >> Cc: Paolo Bonzini <pbonzini@redhat.com>
> >> Cc: Quentin Perret <qperret@google.com>
> >> Cc: Sean Christopherson <seanjc@google.com>
> >> Cc: Steven Price <steven.price@arm.com>
> >> Cc: Thomas Huth <thuth@redhat.com>
> >> Cc: Will Deacon <will@kernel.org>
> >> Cc: Zenghui Yu <yuzenghui@huawei.com>
> >> To: linux-coco@lists.linux.dev
> >> To: kvmarm@lists.linux.dev
> >> Cc: kvmarm@lists.cs.columbia.edu
> >> Cc: linux-arm-kernel@lists.infradead.org
> >> To: linux-kernel@vger.kernel.org
> >> To: kvm@vger.kernel.org
> >>
> >> _______________________________________________
> >> linux-arm-kernel mailing list
> >> linux-arm-kernel@lists.infradead.org
> >> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel  


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-07-14 16:28     ` Jonathan Cameron
@ 2023-07-17  9:40       ` Suzuki K Poulose
  0 siblings, 0 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2023-07-17  9:40 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-coco, linux-kernel, kvm, kvmarm, linux-arm-kernel,
	Alexandru Elisei, Andrew Jones, Catalin Marinas, Chao Peng,
	Christoffer Dall, Fuad Tabba, James Morse, Jean-Philippe Brucker,
	Joey Gouly, Marc Zyngier, Mark Rutland, Oliver Upton,
	Paolo Bonzini, Quentin Perret, Sean Christopherson, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, kvmarm

On 14/07/2023 17:28, Jonathan Cameron wrote:
> On Fri, 14 Jul 2023 16:03:37 +0100
> Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> 
>> Hi Jonathan
>>
>> On 14/07/2023 14:46, Jonathan Cameron wrote:
>>> On Fri, 27 Jan 2023 11:22:48 +0000
>>> Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>>>
>>>
>>> Hi Suzuki,
>>>
>>> Looking at this has been on the backlog for a while from our side and we are finally
>>> getting to it.  So before we dive in and given it's been 6 months, I wanted to check
>>> if you expect to post a new version shortly or if there is a rebased tree available?
>>
>> Thanks for your interest. We have been updating our trees to the latest
>> RMM specification (v1.0-eac2 now) and also rebasing Linux/KVM on top of
>> v6.5-rc1. We will post this as soon as we have all the components ready
>> (and the TF-RMM). At the earliest, this would be around early September.
>>
>> That said, the revised version will have the following changes :
>>    - Changes to the Stage2 management
>>    - Changes to RMM memory management for Realm
>>    - PMU/SVE support
>>
>> Otherwise, most of the changes remain the same (e.g., UABI). Happy to
>> hear feedback on those areas.
> 
> Hi Suzuki,
> 
> Thanks for the update.  If there is any chance of visibility of changes
> via a git tree etc that would be great in the meantime.  If not, such is life
> and I'll try to wait patiently :) + we'll review the existing code.

I am afraid not yet. Thanks for reviewing the changes :-)

Suzuki


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC kvmtool 18/31] arm64: Populate initial realm contents
  2023-03-02 14:06       ` Suzuki K Poulose
@ 2023-10-02  9:28         ` Piotr Sawicki
  0 siblings, 0 replies; 190+ messages in thread
From: Piotr Sawicki @ 2023-10-02  9:28 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Alexandru Elisei, Andrew Jones, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, linux-coco, kvmarm,
	linux-arm-kernel, linux-kernel



Hi Suzuki

> Hi Piotr
> 
> On 02/03/2023 14:03, Piotr Sawicki wrote:
>> Hi,
>>
>>> From: Alexandru Elisei <alexandru.elisei@arm.com>
>>>
>>> Populate the realm memory with the initial contents, which include
>>> the device tree blob, the kernel image, and initrd, if specified,
>>> or the firmware image.
>>>
>>> Populating an image in the realm involves two steps:
>>>   a) Mark the IPA area as RAM - INIT_IPA_REALM
>>>   b) Load the contents into the IPA - POPULATE_REALM
>>>
>>> Wherever we know the actual size of an image in memory, we make
>>> sure the "memory area" is initialised to RAM.
>>> e.g., Linux kernel image size from the header which includes the bss 
>>> etc.
>>> The "file size" on disk for the Linux image is much smaller.
>>> We mark the region of size Image.header.size as RAM (a), from the kernel
>>> load address. And load the Image file into the memory (b) above.
>>> At the moment we only detect the Arm64 Linux Image header format.
>>>
>>> Since we're already touching the code that copies the
>>> initrd in guest memory, let's do a bit of cleaning and remove a
>>> useless local variable.
>>>
>>> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
>>> [ Make sure the Linux kernel image area is marked as RAM ]
>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> 
> 
>>> diff --git a/arm/kvm.c b/arm/kvm.c
>>> index acb627b2..57c5b5f7 100644
>>> --- a/arm/kvm.c
>>> +++ b/arm/kvm.c
>>> @@ -6,6 +6,7 @@
>>>   #include "kvm/fdt.h"
>>>   #include "arm-common/gic.h"
>>> +#include <asm/realm.h>
>>>   #include <sys/resource.h>
>>> @@ -167,6 +168,9 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, 
>>> int fd_kernel, int fd_initrd,
>>>       pr_debug("Loaded kernel to 0x%llx (%llu bytes)",
>>>            kvm->arch.kern_guest_start, kvm->arch.kern_size);
>>
>>
>> I've noticed that multiple calling of the measurement test from the 
>> kvm-unit-tests suite results in different Realm Initial Measurements, 
>> although the kernel image is always the same.
>>
>> After short investigation, I've found that the RIM starts being 
>> different while populating the last 4kB chunk of the kernel image.
>> The issue occurs when the image size is not aligned to the page size 
>> (4kB).
>>
>> After zeroing the unused area of the last chunk, the measurements 
>> become repeatable.
>>
> 
> That is a good point. We could memset() the remaining bits of the 4K 
> page to 0. I will make this change.
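
The padding discussed above can be sketched roughly as follows (illustrative only; `stage_chunk` is a hypothetical helper, not the actual kvmtool code):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SZ_4K 4096UL

/*
 * Hypothetical helper: stage up to one 4K chunk of an image for
 * POPULATE_REALM. The tail of the final, partially-filled page is
 * zeroed so that repeated runs measure identical page contents and
 * the Realm Initial Measurement stays reproducible.
 */
static size_t stage_chunk(uint8_t *page, const uint8_t *data, size_t len)
{
	size_t copy = len < SZ_4K ? len : SZ_4K;

	memcpy(page, data, copy);
	/* Zero the unused remainder of the last 4K page. */
	memset(page + copy, 0, SZ_4K - copy);
	return copy;
}
```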

It looks like this is somewhat related to the implementation of the 9p 
filesystem (Linux host and/or the FVP emulator).

I'm getting this issue only when the initrd and the guest kernel images 
are located in the shared folder that uses the 9p filesystem. Moving 
those files to the ramdisk (e.g. to the /root folder) and running lkvm 
tool on them resolves the issue.

Kind regards,
Piotr Sawicki



^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture
  2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
                     ` (30 preceding siblings ...)
  2023-01-27 11:39   ` [RFC kvmtool 31/31] arm64: Allow the user to create a realm Suzuki K Poulose
@ 2023-10-02  9:45   ` Piotr Sawicki
  31 siblings, 0 replies; 190+ messages in thread
From: Piotr Sawicki @ 2023-10-02  9:45 UTC (permalink / raw)
  To: Suzuki K Poulose, kvm, kvmarm
  Cc: Alexandru Elisei, Andrew Jones, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Joey Gouly, Marc Zyngier, Mark Rutland,
	Oliver Upton, Paolo Bonzini, Quentin Perret, Steven Price,
	Thomas Huth, Will Deacon, Zenghui Yu, linux-coco, kvmarm,
	linux-arm-kernel, linux-kernel

Hi Suzuki

> This series is an initial version of the support for running VMs under the
> Arm Confidential Compute Architecture. The purpose of the series is to gather
> feedback on the proposed UABI changes for running Confidential VMs with KVM.
> More information on the Arm CCA and instructions for how to get, build and run
> the entire software stack is available here [0].
> 
> A new option, `--realm`, is added to the `run` command to mark the VM as a
> confidential compute VM. This version doesn't use the Guest private memory [1]
> support yet, instead uses normal anonymous/hugetlbfs backed memory. Our aim is
> to switch to the guest private memory for the Realm.
> 
> The host including the kernel and kvmtool, must not access any memory allocated
> to the protected IPA of the Realm.
> 
> The series adds the support for managing the lifecycle of the Realm, which includes:
>     * Configuration
>     * Creation of Realm (RD)
>     * Load initial memory images
>     * Creation of Realm Execution Contexts (RECs aka VCPUs)
>     * Activation of the Realm.
> 
> Patches are split as follows :
> 
> Patches 1 and 2 are fixes to existing code.
> Patch 3 adds a new option --nocompat to disable compat warnings
> Patches 4 - 6 are some preparations for Realm specific changes.
> 
> The remaining patches adds Realm support and using the --realm option is
> enabled in patch 30.
> 
> The v1.0 of the Realm Management Monitor (RMM) specification doesn't support
> paging protected memory of a Realm. Thus all of the memory backing the RAM
> is locked by the VMM.
> 
> Since the IPA space of a Realm is split into Protected and Unprotected, with
> one an alias of the other, the VMM doubles the IPA size for a Realm VM.
> 
> The KVM support for Arm CCA is advertised with a new cap KVM_CAP_ARM_RME.
> A new "VM type" field is defined in the vm_type for CREATE_VM ioctl to indicate
> that a VM is "Realm". Once the VM is created, the life cycle of the Realm is
> managed via KVM_ENABLE_CAP of KVM_CAP_ARM_RME.
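
As a rough sketch of the flow described above (the constants are placeholders standing in for the RFC's proposed values, and the ioctl is injected so the sequence can be shown without /dev/kvm):

```c
#include <assert.h>

/* Placeholder values standing in for the RFC's proposed UABI. */
#define KVM_CREATE_VM_SKETCH   0x01UL
#define KVM_ENABLE_CAP_SKETCH  0x02UL
#define KVM_VM_TYPE_ARM_REALM  (1UL << 8)  /* hypothetical "Realm" vm_type bit */
#define KVM_CAP_ARM_RME_SKETCH 300UL       /* hypothetical capability number */

struct enable_cap_sketch {
	unsigned long cap;
	unsigned long args[4];
};

typedef int (*vm_ioctl_fn)(int fd, unsigned long req, unsigned long arg);

/*
 * Create a VM marked as a Realm, then drive its lifecycle through
 * ENABLE_CAP(KVM_CAP_ARM_RME). Returns the VM fd, or -1 on error.
 */
static int create_realm_vm(vm_ioctl_fn do_ioctl, int kvm_fd,
			   struct enable_cap_sketch *cfg)
{
	int vm_fd = do_ioctl(kvm_fd, KVM_CREATE_VM_SKETCH, KVM_VM_TYPE_ARM_REALM);

	if (vm_fd < 0)
		return -1;

	cfg->cap = KVM_CAP_ARM_RME_SKETCH;
	if (do_ioctl(vm_fd, KVM_ENABLE_CAP_SKETCH, (unsigned long)cfg) < 0)
		return -1;

	return vm_fd;
}

/* Tiny stand-in for ioctl(), used only to exercise the sequence. */
static int calls;
static int mock_ioctl(int fd, unsigned long req, unsigned long arg)
{
	(void)fd; (void)arg;
	calls++;
	return req == KVM_CREATE_VM_SKETCH ? 42 : 0; /* 42: fake vm fd */
}
```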
> 
> Command line options are also added to configure the Realm parameters.
> These include :
>   - Hash algorithm for measurements
>   - Realm personalisation value
>   - SVE vector Length (Optional feature in v1.0 RMM spec. Not yet supported
>     by the TF-RMM. coming soon).
> 
> PMU and self-hosted debug (number of watchpoint/breakpoint registers) are not
> yet supported in the KVM/RMM implementation. This will be added soon.
> 
> The UABI doesn't support discovering the "supported" configuration values. In
> the real world, the Realm configuration affects the initial measurement of the
> Realm, which may be verified by a remote entity. Thus, the VMM is not at
> liberty to make configuration choices based on the "host" capabilities.
> Instead, VMM should launch a Realm with the user requested parameters. If this
> cannot be satisfied, there is no point in running the Realm. We are happy to
> change this if there is interest.
> 
> Special actions are required to load the initial memory images (e.g, kernel,
> firmware, DTB, initrd) in to the Realm memory.
> 
> For VCPUs, we add a new feature KVM_ARM_VCPU_REC, which will be used to control
> the creation of the REC object (via KVM_ARM_VCPU_FINALIZE). This must be done
> after the initial register state of the VCPUs are set.
> RMM imposes an order in which the RECs are created, i.e., they must be created
> in ascending order of MPIDR. For now, this is the responsibility of the
> VMM.
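
The ordering requirement above can be sketched as follows (illustrative; `finalize_rec` stands in for the KVM_ARM_VCPU_FINALIZE call on each vcpu fd):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct vcpu_sketch {
	int fd;          /* vcpu file descriptor */
	uint64_t mpidr;  /* affinity value of this vcpu */
};

static int cmp_mpidr(const void *a, const void *b)
{
	const struct vcpu_sketch *x = a, *y = b;

	return (x->mpidr > y->mpidr) - (x->mpidr < y->mpidr);
}

/*
 * RECs must be created in ascending MPIDR order, so sort the vcpus
 * first and only then issue the (hypothetical) finalize call on each.
 */
static void finalize_recs(struct vcpu_sketch *vcpus, size_t n,
			  void (*finalize_rec)(struct vcpu_sketch *))
{
	size_t i;

	qsort(vcpus, n, sizeof(*vcpus), cmp_mpidr);
	for (i = 0; i < n; i++)
		finalize_rec(&vcpus[i]);
}

/* Records the order in which vcpus were finalized. */
static uint64_t order[8];
static size_t order_n;
static void record_finalize(struct vcpu_sketch *v)
{
	order[order_n++] = v->mpidr;
}
```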
> 
> Once the Realm images are loaded and the VCPUs created, the Realm is
> activated before the first vCPU is run.
> 
> virtio for the Realms enforces VIRTIO_F_ACCESS_PLATFORM flag.
> 
> Also, added support for injecting SEA into the VM for unhandled MMIO.
> 

I wonder if there is a plan to develop a dedicated (stand-alone) tool 
that allows a realm developer to calculate Realm Initial Measurements 
for realms. I mean a tool that can be compiled and run on a Linux PC 
machine.

As you know, the remote attestation mechanism requires a verifier to be 
provisioned with reference values. In this case, a realm verifier should 
have access to the initial reference measurement (RIM) of a realm that 
is intended to be run on a remote Arm CCA platform.

The algorithm that measures the initial state of realms (RIM) is highly 
sensitive to the content of a realm memory and the order of RMI 
operations. This means that not only the content of populated realm 
memory matters but also the implementation of the host components (e.g. 
kvm, kvmtool/qemu). In the case of kvmtool-cca, the layout of memory and the 
content of the DTB highly depend on the provided options (the DTB is generated 
in run-time). Unfortunately, the content of DTB also depends on the 
linking order of object files (the order of DTB generation is imposed by 
__attribute__((constructor)) that is used to register devices). This 
complicates development of a separate tool for calculating RIM, as the 
tool would have to emulate all quirks of the kvmtool.
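
To illustrate why the order matters: the RIM is conceptually a rolling hash, extended once per measured RMI operation with the affected contents and target IPA, roughly as in the toy sketch below (the mixing function is a stand-in, not the RMM's actual SHA-based measurement):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy 64-bit mixer (FNV-1a style), standing in for SHA-256/512. */
static uint64_t mix(uint64_t h, const uint8_t *data, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= data[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

/*
 * Conceptual RIM update: each data-load operation extends the current
 * measurement with the target IPA and the page contents, so both the
 * contents and the order of operations change the final value.
 */
static uint64_t rim_extend(uint64_t rim, uint64_t ipa,
			   const uint8_t *page, size_t len)
{
	rim = mix(rim, (const uint8_t *)&ipa, sizeof(ipa));
	return mix(rim, page, len);
}
```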

One solution for retrieving Realm Initial Measurements seems to be 
running the whole firmware/software (e.g. kvmtool/Linux host/TF-RMM) 
stack on the FVP emulator and gathering the RIM directly from the 
TF-RMM. This would require a realm developer to have access to the whole 
firmware/software stack and the emulator of the CCA platform.

The other solution would require the implementation of a dedicated tool. 
For instance, a sensible approach could be to extend the functionality 
of kvmtool.

Is Arm going to develop a dedicated, stand-alone tool for calculating RIMs?

What is the recommended way of retrieving/calculating RIMs for realms?

Kind regards,
Piotr Sawicki

^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-01-27 11:22 [RFC] Support for Arm CCA VMs on Linux Suzuki K Poulose
                   ` (7 preceding siblings ...)
  2023-07-14 13:46 ` Jonathan Cameron
@ 2023-10-02 12:43 ` Suzuki K Poulose
  2024-01-10  5:40   ` Itaru Kitayama
  8 siblings, 1 reply; 190+ messages in thread
From: Suzuki K Poulose @ 2023-10-02 12:43 UTC (permalink / raw)
  To: linux-coco, linux-kernel, linux-arm-kernel, kvm, kvmarm
  Cc: catalin.marinas, will, maz, steven.price, alexandru.elisei,
	joey.gouly, james.morse, Jonathan.Cameron, dgilbert, jpb,
	oliver.upton, zhi.wang.linux, yuzenghui, salil.mehta,
	Suzuki K Poulose, Andrew Jones, Chao Peng, Christoffer Dall,
	Fuad Tabba, Jonathan Cameron, Jean-Philippe Brucker, Joey Gouly,
	Mark Rutland, Paolo Bonzini, Quentin Perret, Sean Christopherson,
	Thomas Huth

Hi,


> We are happy to announce the early RFC version of the Arm
> Confidential Compute Architecture (CCA) support for the Linux
> stack. The intention is to seek early feedback in the following areas:
>  * KVM integration of the Arm CCA
>  * KVM UABI for managing the Realms, seeking to generalise the operations
>    wherever possible with other Confidential Compute solutions.
>    Note: This version doesn't support Guest Private memory, which will be added
>    later (see below).
>  * Linux Guest support for Realms
>

We have updated the stack for Arm CCA Linux support to RMM-v1.0-EAC2 (see links below).
We are not posting the patches for review yet, as we plan to update our
stack to support the latest RMM-v1.0 specification, which includes some
functional changes to support PSCI monitoring by the VMM along with other
minor changes. All relevant components are updated on a new branch "rmm-v1.0-eac2"
Guest-mem support is not included, but is in progress.

Change log :
 - KVM RMI support updated to v1.0-eac2, with optimisations to stage2 tear down
 - Guest (Linux and kvm-unit-test) support for RSI compliant to v1.0-eac2
 - SVE, PMU support for Realms

kvmtool :
  - Dropped no-compat and switched to --loglevel (merged upstream)
  - Support for SVE, --sve-vl for vector length

> Arm CCA Introduction
> =====================
> 
> The Arm CCA is a reference software architecture and implementation that builds
> on the Realm Management Extension (RME), enabling the execution of Virtual
> machines, while preventing access by more privileged software, such as the
> hypervisor. The Arm CCA allows the hypervisor to control the VM, but removes
> its right to access the code, register state or data that is used by the VM.
> More information on the architecture is available here[0].
> 
>     Arm CCA Reference Software Architecture
> 
>         Realm World    ||    Normal World   ||  Secure World  ||
>                        ||        |          ||                ||
>  EL0 x-------x         || x----x | x------x ||                ||
>      | Realm |         || |    | | |      | ||                ||
>      |       |         || | VM | | |      | ||                ||
>  ----|  VM*  |---------||-|    |---|      |-||----------------||
>      |       |         || |    | | |  H   | ||                ||
>  EL1 x-------x         || x----x | |      | ||                ||
>          ^             ||        | |  o   | ||                ||
>          |             ||        | |      | ||                ||
>  ------- R*------------------------|  s  -|---------------------
>          S             ||          |      | ||                ||
>          I             ||          |  t   | ||                ||
>          |             ||          |      | ||                || 
>          v             ||          x------x ||                ||
>  EL2    RMM*           ||              ^    ||                ||
>          ^             ||              |    ||                ||
>  ========|=============================|========================
>          |                             | SMC
>          x--------- *RMI* -------------x
> 
>  EL3                   Root World
>                        EL3 Firmware
>  ===============================================================
> Where :
>  RMM - Realm Management Monitor
>  RMI - Realm Management Interface
>  RSI - Realm Service Interface
>  SMC - Secure Monitor Call
> 
> RME introduces a new security state "Realm world", in addition to the
> traditional Secure and Non-Secure states. The Arm CCA defines a new component,
> Realm Management Monitor (RMM) that runs at R-EL2. This is a standard piece of
> firmware, verified, installed and loaded by the EL3 firmware (e.g., TF-A), at
> system boot.
> 
> The RMM provides standard interfaces - Realm Management Interface (RMI) - to the
> Normal world hypervisor to manage the VMs running in the Realm world (called
> Realms for short). These are exposed via SMC and are routed through the EL3
> firmware.
> The RMI interface includes:
>   - Move a physical page from the Normal world to the Realm world
>   - Creating a Realm with requested parameters, tracked via Realm Descriptor (RD)
>   - Creating VCPUs aka Realm Execution Context (REC), with initial register state.
>   - Create stage2 translation table at any level.
>   - Load initial images into Realm Memory from normal world memory
>   - Schedule RECs (vCPUs) and handle exits
>   - Inject virtual interrupts into the Realm
>   - Service stage2 runtime faults with pages (provided by host, scrubbed by RMM).
>   - Create "shared" mappings that can be accessed by VMM/Hyp.
>   - Reclaim the memory allocated for the RAM and RTTs (Realm Translation Tables)
> 
> However, v1.0 of the RMM specification doesn't support:
>  - Paging protected memory of a Realm VM. Thus the pages backing the protected
>    memory region must be pinned.
>  - Live migration of Realms.
>  - Trusted Device assignment.
>  - Physical interrupt backed Virtual interrupts for Realms
> 
> RMM also provides certain services to the Realms via SMC, called Realm Service
> Interface (RSI). These include:
>  - Realm Guest Configuration.
>  - Attestation & Measurement services
>  - Managing the state of an Intermediate Physical Address (IPA aka GPA) page.
>  - Host Call service (Communication with the Normal world Hypervisor)
> 
> The specification for the RMM software is currently at *v1.0-Beta2* and the
> latest version is available here [1].
> 
> The Trusted Firmware foundation has an implementation of the RMM - TF-RMM -
> available here [3].
> 
> Implementation
> =================
> 
> This version of the stack is based on the RMM specification v1.0-Beta0[2], with
> the following exceptions :
>   - TF-RMM/KVM currently doesn't support the optional features of PMU,
>      SVE and Self-hosted debug (coming soon).
>   - The RSI_HOST_CALL structure alignment requirement is reduced to match
>      RMM v1.0 Beta1
>   - RMI/RSI version numbers do not match the RMM spec. This will be
>     resolved once the spec/implementation is complete, across TF-RMM+Linux stack.
> 
> We plan to update the stack to support the latest version of the RMMv1.0 spec
> in the coming revisions.
> 
> This release includes the following components :
> 
>  a) Linux Kernel
>      i) Host / KVM support - Support for driving the Realms via RMI. This is
>      dependent on running in the Kernel at EL2 (aka VHE mode). Also provides
>      UABI for VMMs to manage the Realm VMs. The support is restricted to 4K page
>      size, matching the Stage2 granule supported by RMM. The VMM is responsible
>      for making sure the guest memory is locked.
> 
>        TODO: Guest Private memory[10] integration - We have been following the
>        series and support will be added once it is merged upstream.
>      
>      ii) Guest support - Support for a Linux Kernel to run in the Realm VM at
>      Realm-EL1, using RSI services. This includes virtio support (virtio-v1.0
> only). All I/O is treated as non-secure/shared.
>  
>  b) kvmtool - VMM changes required to manage Realm VMs. No guest private memory
>     as mentioned above.
>  c) kvm-unit-tests - Support for running in Realms along with additional tests
>     for RSI ABI.
> 
> Running the stack
> ====================
> 
> To run/test the stack, you would need the following components :
> 
> 1) FVP Base AEM RevC model with FEAT_RME support [4]
> 2) TF-A firmware for EL3 [5]
> 3) TF-A RMM for R-EL2 [3]
> 4) Linux Kernel [6]
> 5) kvmtool [7]
> 6) kvm-unit-tests [8]
> 
> Instructions for building the firmware components and running the model are
> available here [9]. Once the host kernel is booted, a Realm can be launched by
> invoking the `lkvm` command as follows:
> 
>  $ lkvm run --realm 				 \
> 	 --measurement-algo=["sha256", "sha512"] \
> 	 --disable-sve				 \

As noted above, this is no longer required.

> 	 <normal-vm-options>
> 
> Where:
>  * --measurement-algo (Optional) specifies the algorithm selected for creating the
>    initial measurements by the RMM for this Realm (defaults to sha256).
>  * GICv3 is mandatory for the Realms.
>  * SVE is not yet supported in the TF-RMM, and thus must be disabled using
>    --disable-sve
> 
> You may also run the kvm-unit-tests inside the Realm world, using similar
> options to the above.
> 
> 
> Links
> ============
> 
> [0] Arm CCA Landing page (See Key Resources section for various documentations)
>     https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture
> 
> [1] RMM Specification Latest
>     https://developer.arm.com/documentation/den0137/latest
> 
> [2] RMM v1.0-Beta0 specification
>     https://developer.arm.com/documentation/den0137/1-0bet0/

 EAC2 spec: https://developer.arm.com/documentation/den0137/1-0eac2/
> 
> [3] Trusted Firmware RMM - TF-RMM
>     https://www.trustedfirmware.org/projects/tf-rmm/
>     GIT: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
> 
> [4] FVP Base RevC AEM Model (available on x86_64 / Arm64 Linux)
>     https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms
> 
> [5] Trusted Firmware for A class
>     https://www.trustedfirmware.org/projects/tf-a/
> 
> [6] Linux kernel support for Arm-CCA
>     https://gitlab.arm.com/linux-arm/linux-cca
>     Host Support branch:	cca-host/rfc-v1

Update branch : cca-host/rmm-v1.0-eac2

>     Guest Support branch:	cca-guest/rfc-v1

Update branch : cca-guest/rmm-v1.0-eac2

Combined tree for host and guest is also available at: "cca-full/rmm-v1.0-eac2"

> 
> [7] kvmtool support for Arm CCA
>     https://gitlab.arm.com/linux-arm/kvmtool-cca cca/rfc-v1

Update branch : cca/rmm-v1.0-eac2

> 
> [8] kvm-unit-tests support for Arm CCA
>     https://gitlab.arm.com/linux-arm/kvm-unit-tests-cca  cca/rfc-v1
> 

Update branch : cca/rmm-v1.0-eac2


Suzuki

> [9] Instructions for Building Firmware components and running the model, see
>     section 4.19.2 "Building and running TF-A with RME"
>     https://trustedfirmware-a.readthedocs.io/en/latest/components/realm-management-extension.html#building-and-running-tf-a-with-rme
> 
> [10] fd based Guest Private memory for KVM
>    https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng@linux.intel.com





Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Andrew Jones <andrew.jones@linux.dev>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chao Peng <chao.p.peng@linux.intel.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Joey Gouly <Joey.Gouly@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Quentin Perret <qperret@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zenghui Yu <yuzenghui@huawei.com>
To: linux-coco@lists.linux.dev
To: kvmarm@lists.linux.dev
Cc: linux-arm-kernel@lists.infradead.org
To: linux-kernel@vger.kernel.org
To: kvm@vger.kernel.org

^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2023-10-02 12:43 ` Suzuki K Poulose
@ 2024-01-10  5:40   ` Itaru Kitayama
  2024-01-10 11:41     ` Suzuki K Poulose
  0 siblings, 1 reply; 190+ messages in thread
From: Itaru Kitayama @ 2024-01-10  5:40 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-coco, linux-kernel, linux-arm-kernel, kvm, kvmarm,
	catalin.marinas, will, maz, steven.price, alexandru.elisei,
	joey.gouly, james.morse, Jonathan.Cameron, dgilbert, jpb,
	oliver.upton, zhi.wang.linux, yuzenghui, salil.mehta,
	Andrew Jones, Chao Peng, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Mark Rutland, Paolo Bonzini,
	Quentin Perret, Sean Christopherson, Thomas Huth

On Mon, Oct 02, 2023 at 01:43:11PM +0100, Suzuki K Poulose wrote:
> Hi,
> 
> 
> > We are happy to announce the early RFC version of the Arm
> > Confidential Compute Architecture (CCA) support for the Linux
> > stack. The intention is to seek early feedback in the following areas:
> >  * KVM integration of the Arm CCA
> >  * KVM UABI for managing the Realms, seeking to generalise the operations
> >    wherever possible with other Confidential Compute solutions.
> >    Note: This version doesn't support Guest Private memory, which will be added
> >    later (see below).
> >  * Linux Guest support for Realms
> >
> 
> We have updated the stack for Arm CCA Linux support to RMM-v1.0-EAC2 (See links)
> We are not posting the patches for review yet, as we plan to update our
> stack to support the latest RMM-v1.0 specification, which includes some
> functional changes to support PSCI monitoring by the VMM along with other
> minor changes. All relevant components are updated on a new branch "rmm-v1.0-eac2"
> Guest-mem support is not included, but is in progress.
> 
> Change log :
>  - KVM RMI support updated to v1.0-eac2, with optimisations to stage2 tear down
>  - Guest (Linux and kvm-unit-test) support for RSI compliant to v1.0-eac2
>  - SVE, PMU support for Realms
> 
> kvmtool :
>   - Dropped no-compat and switched to --loglevel (merged upstream)
>   - Support for SVE, --sve-vl for vector length
> 
> > Arm CCA Introduction
> > =====================
> > 
> > The Arm CCA is a reference software architecture and implementation that builds
> > on the Realm Management Extension (RME), enabling the execution of virtual
> > machines while preventing access by more privileged software, such as the
> > hypervisor. The Arm CCA allows the hypervisor to control the VM, but removes
> > its right of access to the code, register state or data used by the VM.
> > More information on the architecture is available here[0].
> > 
> >     Arm CCA Reference Software Architecture
> > 
> >         Realm World    ||    Normal World   ||  Secure World  ||
> >                        ||        |          ||                ||
> >  EL0 x-------x         || x----x | x------x ||                ||
> >      | Realm |         || |    | | |      | ||                ||
> >      |       |         || | VM | | |      | ||                ||
> >  ----|  VM*  |---------||-|    |---|      |-||----------------||
> >      |       |         || |    | | |  H   | ||                ||
> >  EL1 x-------x         || x----x | |      | ||                ||
> >          ^             ||        | |  o   | ||                ||
> >          |             ||        | |      | ||                ||
> >  ------- R*------------------------|  s  -|---------------------
> >          S             ||          |      | ||                ||
> >          I             ||          |  t   | ||                ||
> >          |             ||          |      | ||                || 
> >          v             ||          x------x ||                ||
> >  EL2    RMM*           ||              ^    ||                ||
> >          ^             ||              |    ||                ||
> >  ========|=============================|========================
> >          |                             | SMC
> >          x--------- *RMI* -------------x
> > 
> >  EL3                   Root World
> >                        EL3 Firmware
> >  ===============================================================
> > Where :
> >  RMM - Realm Management Monitor
> >  RMI - Realm Management Interface
> >  RSI - Realm Service Interface
> >  SMC - Secure Monitor Call
> > 
> > RME introduces a new security state "Realm world", in addition to the
> > traditional Secure and Non-Secure states. The Arm CCA defines a new component,
> > Realm Management Monitor (RMM) that runs at R-EL2. This is a standard piece of
> > firmware, verified, installed and loaded by the EL3 firmware (e.g., TF-A) at
> > system boot.
> > 
> > The RMM provides standard interfaces - Realm Management Interface (RMI) - to the
> > Normal world hypervisor to manage the VMs running in the Realm world (also
> > called Realms for short). These are exposed via SMC and are routed through
> > the EL3 firmware.
> > The RMI interface includes:
> >   - Move a physical page from the Normal world to the Realm world
> >   - Create a Realm with requested parameters, tracked via a Realm Descriptor (RD)
> >   - Create VCPUs, aka Realm Execution Contexts (RECs), with initial register state
> >   - Create stage2 translation tables at any level
> >   - Load initial images into Realm memory from Normal world memory
> >   - Schedule RECs (vCPUs) and handle exits
> >   - Inject virtual interrupts into the Realm
> >   - Service stage2 runtime faults with pages (provided by host, scrubbed by RMM)
> >   - Create "shared" mappings that can be accessed by VMM/Hyp
> >   - Reclaim the memory allocated for the RAM and RTTs (Realm Translation Tables)
> > 
> > However, v1.0 of the RMM specification doesn't support:
> >  - Paging protected memory of a Realm VM. Thus the pages backing the protected
> >    memory region must be pinned.
> >  - Live migration of Realms.
> >  - Trusted Device assignment.
> >  - Physical interrupt backed Virtual interrupts for Realms
> > 
> > RMM also provides certain services to the Realms via SMC, called Realm Service
> > Interface (RSI). These include:
> >  - Realm Guest Configuration.
> >  - Attestation & Measurement services
> >  - Managing the state of an Intermediate Physical Address (IPA aka GPA) page.
> >  - Host Call service (Communication with the Normal world Hypervisor)
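
Both RMI and RSI are carried over SMC fast calls, so each command above is selected by a function ID encoded per the SMC Calling Convention. A small sketch of that encoding; the bit layout follows SMCCC, and the function numbers 0x150/0x190 are shown only to illustrate the RMI/RSI ranges used by the RMM spec:

```shell
# SMCCC function ID layout: bit 31 = fast call, bit 30 = SMC64 calling
# convention, bits 29:24 = owning entity (4 = Standard Secure Service),
# bits 15:0 = function number.
smccc_fid() {
    local fast=$1 smc64=$2 owner=$3 num=$4
    printf '0x%08X\n' $(( (fast << 31) | (smc64 << 30) | (owner << 24) | num ))
}

# RMI and RSI commands are SMC64 fast calls in the Standard Secure
# Service range; 0x150 and 0x190 illustrate where the RMM spec places
# the RMI and RSI command groups respectively.
smccc_fid 1 1 4 0x150   # -> 0xC4000150 (RMI range)
smccc_fid 1 1 4 0x190   # -> 0xC4000190 (RSI range)
```

The host issues RMI calls from Normal world EL2; a Realm guest issues RSI calls from R-EL1. In both cases the EL3 firmware routes the SMC to the RMM.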
> > 
> > The specification for the RMM software is currently at *v1.0-Beta2* and the
> > latest version is available here [1].
> > 
> > The Trusted Firmware foundation has an implementation of the RMM - TF-RMM -
> > available here [3].
> > 
> > Implementation
> > =================
> > 
> > This version of the stack is based on the RMM specification v1.0-Beta0[2], with
> > following exceptions :
> >   - TF-RMM/KVM currently doesn't support the optional features of PMU,
> >      SVE and Self-hosted debug (coming soon).
> >   - The RSI_HOST_CALL structure alignment requirement is reduced to match
> >      RMM v1.0 Beta1
> >   - RMI/RSI version numbers do not match the RMM spec. This will be
> >     resolved once the spec/implementation is complete, across TF-RMM+Linux stack.
> > 
> > We plan to update the stack to support the latest version of the RMMv1.0 spec
> > in the coming revisions.
> > 
> > This release includes the following components :
> > 
> >  a) Linux Kernel
> >      i) Host / KVM support - Support for driving the Realms via RMI. This is
> >      dependent on running in the Kernel at EL2 (aka VHE mode). Also provides
> >      UABI for VMMs to manage the Realm VMs. The support is restricted to 4K page
> >      size, matching the Stage2 granule supported by RMM. The VMM is responsible
> >      for making sure the guest memory is locked.
> > 
> >        TODO: Guest Private memory[10] integration - We have been following the
> >        series and support will be added once it is merged upstream.
> >      
> >      ii) Guest support - Support for a Linux Kernel to run in the Realm VM at
> >      Realm-EL1, using RSI services. This includes virtio support (virtio-v1.0
> >      only). All I/O is treated as non-secure/shared.
> >  
> >  c) kvmtool - VMM changes required to manage Realm VMs. No guest private memory
> >     as mentioned above.
> >  d) kvm-unit-tests - Support for running in Realms along with additional tests
> >     for RSI ABI.
> > 
> > Running the stack
> > ====================
> > 
> > To run/test the stack, you would need the following components :
> > 
> > 1) FVP Base AEM RevC model with FEAT_RME support [4]
> > 2) TF-A firmware for EL3 [5]
> > 3) TF-A RMM for R-EL2 [3]
> > 4) Linux Kernel [6]
> > 5) kvmtool [7]
> > 6) kvm-unit-tests [8]
> > 
> > Instructions for building the firmware components and running the model are
> > available here [9]. Once the host kernel is booted, a Realm can be launched by
> > invoking the `lkvm` command as follows:
> > 
> >  $ lkvm run --realm 				 \
> > 	 --measurement-algo=["sha256", "sha512"] \
> > 	 --disable-sve				 \
> 
> As noted above, this is no longer required.
> 
> > 	 <normal-vm-options>
> > 
> > Where:
> >  * --measurement-algo (Optional) specifies the algorithm selected for creating the
> >    initial measurements by the RMM for this Realm (defaults to sha256).
> >  * GICv3 is mandatory for the Realms.
> >  * SVE is not yet supported in the TF-RMM, and thus must be disabled using
> >    --disable-sve
> > 
> > You may also run the kvm-unit-tests inside the Realm world, using similar
> > options to the above.
> > 
> > 
> > Links
> > ============
> > 
> > [0] Arm CCA Landing page (See Key Resources section for various documentations)
> >     https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture
> > 
> > [1] RMM Specification Latest
> >     https://developer.arm.com/documentation/den0137/latest
> > 
> > [2] RMM v1.0-Beta0 specification
> >     https://developer.arm.com/documentation/den0137/1-0bet0/
> 
>  EAC2 spec: https://developer.arm.com/documentation/den0137/1-0eac2/
> > 
> > [3] Trusted Firmware RMM - TF-RMM
> >     https://www.trustedfirmware.org/projects/tf-rmm/
> >     GIT: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
> > 
> > [4] FVP Base RevC AEM Model (available on x86_64 / Arm64 Linux)
> >     https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms
> > 
> > [5] Trusted Firmware for A class
> >     https://www.trustedfirmware.org/projects/tf-a/
> > 
> > [6] Linux kernel support for Arm-CCA
> >     https://gitlab.arm.com/linux-arm/linux-cca
> >     Host Support branch:	cca-host/rfc-v1
> 
> Update branch : cca-host/rmm-v1.0-eac2
> 
> >     Guest Support branch:	cca-guest/rfc-v1
> 
> Update branch : cca-guest/rmm-v1.0-eac2
> 
> Combined tree for host and guest is also available at: "cca-full/rmm-v1.0-eac2"
> 
> > 
> > [7] kvmtool support for Arm CCA
> >     https://gitlab.arm.com/linux-arm/kvmtool-cca cca/rfc-v1
> 
> Update branch : cca/rmm-v1.0-eac2
> 
> > 
> > [8] kvm-unit-tests support for Arm CCA
> >     https://gitlab.arm.com/linux-arm/kvm-unit-tests-cca  cca/rfc-v1
> > 
> 
> Update branch : cca/rmm-v1.0-eac2
> 
> 
> Suzuki
> 
> > [9] Instructions for Building Firmware components and running the model, see
> >     section 4.19.2 "Building and running TF-A with RME"
> >     https://trustedfirmware-a.readthedocs.io/en/latest/components/realm-management-extension.html#building-and-running-tf-a-with-rme
> > 
> > [10] fd based Guest Private memory for KVM
> >    https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng@linux.intel.com
> 
> 
> 
> 
> 
> Cc: Alexandru Elisei <alexandru.elisei@arm.com>
> Cc: Andrew Jones <andrew.jones@linux.dev>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Chao Peng <chao.p.peng@linux.intel.com>
> Cc: Christoffer Dall <christoffer.dall@arm.com>
> Cc: Fuad Tabba <tabba@google.com>
> Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Cc: Joey Gouly <Joey.Gouly@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Oliver Upton <oliver.upton@linux.dev>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Quentin Perret <qperret@google.com>
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Thomas Huth <thuth@redhat.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Zenghui Yu <yuzenghui@huawei.com>
> To: linux-coco@lists.linux.dev
> To: kvmarm@lists.linux.dev
> Cc: linux-arm-kernel@lists.infradead.org
> To: linux-kernel@vger.kernel.org
> To: kvm@vger.kernel.org

Suzuki,
Any update to the Arm CCA series (v3?) since last October?

Thanks,
Itaru.

^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2024-01-10  5:40   ` Itaru Kitayama
@ 2024-01-10 11:41     ` Suzuki K Poulose
  2024-01-10 13:44       ` Suzuki K Poulose
  2024-01-12  5:01       ` Itaru Kitayama
  0 siblings, 2 replies; 190+ messages in thread
From: Suzuki K Poulose @ 2024-01-10 11:41 UTC (permalink / raw)
  To: Itaru Kitayama
  Cc: linux-coco, linux-kernel, linux-arm-kernel, kvm, kvmarm,
	catalin.marinas, will, maz, steven.price, alexandru.elisei,
	joey.gouly, james.morse, Jonathan.Cameron, dgilbert, jpb,
	oliver.upton, zhi.wang.linux, yuzenghui, salil.mehta,
	Andrew Jones, Chao Peng, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Mark Rutland, Paolo Bonzini,
	Quentin Perret, Sean Christopherson, Thomas Huth, Ryan Roberts,
	Sami Mujawar

Hi Itaru,

On 10/01/2024 05:40, Itaru Kitayama wrote:
> On Mon, Oct 02, 2023 at 01:43:11PM +0100, Suzuki K Poulose wrote:
>> Hi,
>>
>>
>>> We are happy to announce the early RFC version of the Arm
>>> Confidential Compute Architecture (CCA) support for the Linux
>>> stack. The intention is to seek early feedback in the following areas:
>>>   * KVM integration of the Arm CCA
>>>   * KVM UABI for managing the Realms, seeking to generalise the operations
>>>     wherever possible with other Confidential Compute solutions.
>>>     Note: This version doesn't support Guest Private memory, which will be added
>>>     later (see below).
>>>   * Linux Guest support for Realms
>>>
>>
>> We have updated the stack for Arm CCA Linux support to RMM-v1.0-EAC2 (See links)
>> We are not posting the patches for review yet, as we plan to update our
>> stack to support the latest RMM-v1.0 specification, which includes some
>> functional changes to support PSCI monitoring by the VMM along with other
>> minor changes. All relevant components are updated on a new branch "rmm-v1.0-eac2"
>> Guest-mem support is not included, but is in progress.
>>
>> Change log :
>>   - KVM RMI support updated to v1.0-eac2, with optimisations to stage2 tear down
>>   - Guest (Linux and kvm-unit-test) support for RSI compliant to v1.0-eac2
>>   - SVE, PMU support for Realms
>>
>> kvmtool :
>>    - Dropped no-compat and switched to --loglevel (merged upstream)
>>    - Support for SVE, --sve-vl for vector length
>>
>>> Arm CCA Introduction
>>> =====================
>>>
>>> The Arm CCA is a reference software architecture and implementation that builds
>>> on the Realm Management Extension (RME), enabling the execution of virtual
>>> machines while preventing access by more privileged software, such as the
>>> hypervisor. The Arm CCA allows the hypervisor to control the VM, but removes
>>> its right of access to the code, register state or data used by the VM.
>>> More information on the architecture is available here[0].
>>>
>>>      Arm CCA Reference Software Architecture
>>>
>>>          Realm World    ||    Normal World   ||  Secure World  ||
>>>                         ||        |          ||                ||
>>>   EL0 x-------x         || x----x | x------x ||                ||
>>>       | Realm |         || |    | | |      | ||                ||
>>>       |       |         || | VM | | |      | ||                ||
>>>   ----|  VM*  |---------||-|    |---|      |-||----------------||
>>>       |       |         || |    | | |  H   | ||                ||
>>>   EL1 x-------x         || x----x | |      | ||                ||
>>>           ^             ||        | |  o   | ||                ||
>>>           |             ||        | |      | ||                ||
>>>   ------- R*------------------------|  s  -|---------------------
>>>           S             ||          |      | ||                ||
>>>           I             ||          |  t   | ||                ||
>>>           |             ||          |      | ||                ||
>>>           v             ||          x------x ||                ||
>>>   EL2    RMM*           ||              ^    ||                ||
>>>           ^             ||              |    ||                ||
>>>   ========|=============================|========================
>>>           |                             | SMC
>>>           x--------- *RMI* -------------x
>>>
>>>   EL3                   Root World
>>>                         EL3 Firmware
>>>   ===============================================================
>>> Where :
>>>   RMM - Realm Management Monitor
>>>   RMI - Realm Management Interface
>>>   RSI - Realm Service Interface
>>>   SMC - Secure Monitor Call
>>>
>>> RME introduces a new security state "Realm world", in addition to the
>>> traditional Secure and Non-Secure states. The Arm CCA defines a new component,
>>> Realm Management Monitor (RMM) that runs at R-EL2. This is a standard piece of
>>> firmware, verified, installed and loaded by the EL3 firmware (e.g., TF-A) at
>>> system boot.
>>>
>>> The RMM provides standard interfaces - Realm Management Interface (RMI) - to the
>>> Normal world hypervisor to manage the VMs running in the Realm world (also
>>> called Realms for short). These are exposed via SMC and are routed through
>>> the EL3 firmware.
>>> The RMI interface includes:
>>>    - Move a physical page from the Normal world to the Realm world
>>>    - Create a Realm with requested parameters, tracked via a Realm Descriptor (RD)
>>>    - Create VCPUs, aka Realm Execution Contexts (RECs), with initial register state
>>>    - Create stage2 translation tables at any level
>>>    - Load initial images into Realm memory from Normal world memory
>>>    - Schedule RECs (vCPUs) and handle exits
>>>    - Inject virtual interrupts into the Realm
>>>    - Service stage2 runtime faults with pages (provided by host, scrubbed by RMM)
>>>    - Create "shared" mappings that can be accessed by VMM/Hyp
>>>    - Reclaim the memory allocated for the RAM and RTTs (Realm Translation Tables)
>>>
>>> However, v1.0 of the RMM specification doesn't support:
>>>   - Paging protected memory of a Realm VM. Thus the pages backing the protected
>>>     memory region must be pinned.
>>>   - Live migration of Realms.
>>>   - Trusted Device assignment.
>>>   - Physical interrupt backed Virtual interrupts for Realms
>>>
>>> RMM also provides certain services to the Realms via SMC, called Realm Service
>>> Interface (RSI). These include:
>>>   - Realm Guest Configuration.
>>>   - Attestation & Measurement services
>>>   - Managing the state of an Intermediate Physical Address (IPA aka GPA) page.
>>>   - Host Call service (Communication with the Normal world Hypervisor)
>>>
>>> The specification for the RMM software is currently at *v1.0-Beta2* and the
>>> latest version is available here [1].
>>>
>>> The Trusted Firmware foundation has an implementation of the RMM - TF-RMM -
>>> available here [3].
>>>
>>> Implementation
>>> =================
>>>
>>> This version of the stack is based on the RMM specification v1.0-Beta0[2], with
>>> following exceptions :
>>>    - TF-RMM/KVM currently doesn't support the optional features of PMU,
>>>       SVE and Self-hosted debug (coming soon).
>>>    - The RSI_HOST_CALL structure alignment requirement is reduced to match
>>>       RMM v1.0 Beta1
>>>    - RMI/RSI version numbers do not match the RMM spec. This will be
>>>      resolved once the spec/implementation is complete, across TF-RMM+Linux stack.
>>>
>>> We plan to update the stack to support the latest version of the RMMv1.0 spec
>>> in the coming revisions.
>>>
>>> This release includes the following components :
>>>
>>>   a) Linux Kernel
>>>       i) Host / KVM support - Support for driving the Realms via RMI. This is
>>>       dependent on running in the Kernel at EL2 (aka VHE mode). Also provides
>>>       UABI for VMMs to manage the Realm VMs. The support is restricted to 4K page
>>>       size, matching the Stage2 granule supported by RMM. The VMM is responsible
>>>       for making sure the guest memory is locked.
>>>
>>>         TODO: Guest Private memory[10] integration - We have been following the
>>>         series and support will be added once it is merged upstream.
>>>       
>>>       ii) Guest support - Support for a Linux Kernel to run in the Realm VM at
>>>       Realm-EL1, using RSI services. This includes virtio support (virtio-v1.0
>>>       only). All I/O is treated as non-secure/shared.
>>>   
>>>   c) kvmtool - VMM changes required to manage Realm VMs. No guest private memory
>>>      as mentioned above.
>>>   d) kvm-unit-tests - Support for running in Realms along with additional tests
>>>      for RSI ABI.
>>>
>>> Running the stack
>>> ====================
>>>
>>> To run/test the stack, you would need the following components :
>>>
>>> 1) FVP Base AEM RevC model with FEAT_RME support [4]
>>> 2) TF-A firmware for EL3 [5]
>>> 3) TF-A RMM for R-EL2 [3]
>>> 4) Linux Kernel [6]
>>> 5) kvmtool [7]
>>> 6) kvm-unit-tests [8]
>>>
>>> Instructions for building the firmware components and running the model are
>>> available here [9]. Once the host kernel is booted, a Realm can be launched by
>>> invoking the `lkvm` command as follows:
>>>
>>>   $ lkvm run --realm 				 \
>>> 	 --measurement-algo=["sha256", "sha512"] \
>>> 	 --disable-sve				 \
>>
>> As noted above, this is no longer required.
>>
>>> 	 <normal-vm-options>
>>>
>>> Where:
>>>   * --measurement-algo (Optional) specifies the algorithm selected for creating the
>>>     initial measurements by the RMM for this Realm (defaults to sha256).
>>>   * GICv3 is mandatory for the Realms.
>>>   * SVE is not yet supported in the TF-RMM, and thus must be disabled using
>>>     --disable-sve
>>>
>>> You may also run the kvm-unit-tests inside the Realm world, using similar
>>> options to the above.
>>>
>>>
>>> Links
>>> ============
>>>
>>> [0] Arm CCA Landing page (See Key Resources section for various documentations)
>>>      https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture
>>>
>>> [1] RMM Specification Latest
>>>      https://developer.arm.com/documentation/den0137/latest
>>>
>>> [2] RMM v1.0-Beta0 specification
>>>      https://developer.arm.com/documentation/den0137/1-0bet0/
>>
>>   EAC2 spec: https://developer.arm.com/documentation/den0137/1-0eac2/
>>>
>>> [3] Trusted Firmware RMM - TF-RMM
>>>      https://www.trustedfirmware.org/projects/tf-rmm/
>>>      GIT: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
>>>
>>> [4] FVP Base RevC AEM Model (available on x86_64 / Arm64 Linux)
>>>      https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms
>>>
>>> [5] Trusted Firmware for A class
>>>      https://www.trustedfirmware.org/projects/tf-a/
>>>
>>> [6] Linux kernel support for Arm-CCA
>>>      https://gitlab.arm.com/linux-arm/linux-cca
>>>      Host Support branch:	cca-host/rfc-v1
>>
>> Update branch : cca-host/rmm-v1.0-eac2
>>
>>>      Guest Support branch:	cca-guest/rfc-v1
>>
>> Update branch : cca-guest/rmm-v1.0-eac2
>>
>> Combined tree for host and guest is also available at: "cca-full/rmm-v1.0-eac2"
>>
>>>
>>> [7] kvmtool support for Arm CCA
>>>      https://gitlab.arm.com/linux-arm/kvmtool-cca cca/rfc-v1
>>
>> Update branch : cca/rmm-v1.0-eac2
>>
>>>
>>> [8] kvm-unit-tests support for Arm CCA
>>>      https://gitlab.arm.com/linux-arm/kvm-unit-tests-cca  cca/rfc-v1
>>>
>>
>> Update branch : cca/rmm-v1.0-eac2
>>
>>
>> Suzuki
>>
>>> [9] Instructions for Building Firmware components and running the model, see
>>>      section 4.19.2 "Building and running TF-A with RME"
>>>      https://trustedfirmware-a.readthedocs.io/en/latest/components/realm-management-extension.html#building-and-running-tf-a-with-rme
>>>
>>> [10] fd based Guest Private memory for KVM
>>>     https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng@linux.intel.com
>>
>>
>>
>>
>>
>> Cc: Alexandru Elisei <alexandru.elisei@arm.com>
>> Cc: Andrew Jones <andrew.jones@linux.dev>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Chao Peng <chao.p.peng@linux.intel.com>
>> Cc: Christoffer Dall <christoffer.dall@arm.com>
>> Cc: Fuad Tabba <tabba@google.com>
>> Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
>> Cc: James Morse <james.morse@arm.com>
>> Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
>> Cc: Joey Gouly <Joey.Gouly@arm.com>
>> Cc: Marc Zyngier <maz@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Oliver Upton <oliver.upton@linux.dev>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Quentin Perret <qperret@google.com>
>> Cc: Sean Christopherson <seanjc@google.com>
>> Cc: Steven Price <steven.price@arm.com>
>> Cc: Thomas Huth <thuth@redhat.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Zenghui Yu <yuzenghui@huawei.com>
>> To: linux-coco@lists.linux.dev
>> To: kvmarm@lists.linux.dev
>> Cc: linux-arm-kernel@lists.infradead.org
>> To: linux-kernel@vger.kernel.org
>> To: kvm@vger.kernel.org
> 
> Suzuki,
> Any update to the Arm CCA series (v3?) since last October?

Yes, we now have a version that supports the final RMM-v1.0
specification (RMM-v1.0-EAC5). We also have the UEFI EDK2 firmware
support for Guests in Realm world.

We are planning to post the changes for review in the v6.8-rc cycle. We
are trying to integrate the guest_mem support (available in v6.8-rc1), as
well as to reuse some of the arm64 KVM generic interfaces for configuring
the Realm parameters (e.g., PMU, SVE_VL, etc.).

Here is a version that is missing the items mentioned above, based
on v6.7-rc4, if anyone would like to try.

Also, the easiest way to get the components built and the model kick-started
is to use the shrinkwrap [6] tool with the cca-3world configuration.
The tool pulls all the required software components, builds them (including
buildroot for the rootfs) and can run the model using these built
components.
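
Concretely, that workflow boils down to a couple of shrinkwrap invocations along these lines. This is a sketch based on the shrinkwrap documentation [6]; the exact runtime variables and overrides depend on the cca-3world.yaml config and your local setup:

```shell
# Assumed invocation per the shrinkwrap documentation [6]; adjust
# runtime variables (kernel image, rootfs, etc.) for your setup.
shrinkwrap build cca-3world.yaml    # fetch and build all firmware/kernel components
shrinkwrap run cca-3world.yaml      # boot the FVP model with the built stack
```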



[0] Linux Repo:
       Where: git@git.gitlab.arm.com:linux-arm/linux-cca.git
       KVM Support branch: cca-host/rmm-v1.0-eac5
       Linux Guest branch: cca-guest/rmm-v1.0-eac5
       Full stack branch:  cca-full/rmm-v1.0-eac5

[1] kvmtool Repo:
       Where: git@git.gitlab.arm.com:linux-arm/kvmtool-cca.git
       Branch: cca/rmm-v1.0-eac5

[2] kvm-unit-tests Repo:
       Where: git@git.gitlab.arm.com:linux-arm/kvm-unit-tests-cca.git
       Branch: cca/rmm-v1.0-eac5

[3] UEFI Guest firmware:
       edk2:     https://git.gitlab.arm.com/linux-arm/edk2-cca.git
       revision: 2802_arm_cca_rmm-v1.0-eac5

       edk2-platforms: https://git.gitlab.arm.com/linux-arm/edk2-platforms-cca.git
       revision:       2802_arm_cca_rmm-v1.0-eac5


[4] RMM Repo:
       Where: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
       tag : tf-rmm-v0.4.0

[5] TF-A repo:
       Where: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git
       Tag: v2.10


[6] https://shrinkwrap.docs.arm.com/en/latest/
     config: cca-3world.yaml

Kind regards
Suzuki



^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2024-01-10 11:41     ` Suzuki K Poulose
@ 2024-01-10 13:44       ` Suzuki K Poulose
  2024-01-19  1:26         ` Itaru Kitayama
  2024-01-12  5:01       ` Itaru Kitayama
  1 sibling, 1 reply; 190+ messages in thread
From: Suzuki K Poulose @ 2024-01-10 13:44 UTC (permalink / raw)
  To: Itaru Kitayama
  Cc: linux-coco, linux-kernel, linux-arm-kernel, kvm, kvmarm,
	catalin.marinas, will, maz, steven.price, alexandru.elisei,
	joey.gouly, james.morse, Jonathan.Cameron, dgilbert, jpb,
	oliver.upton, zhi.wang.linux, yuzenghui, salil.mehta,
	Andrew Jones, Chao Peng, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Mark Rutland, Paolo Bonzini,
	Quentin Perret, Sean Christopherson, Thomas Huth, Ryan Roberts,
	Sami Mujawar

On 10/01/2024 11:41, Suzuki K Poulose wrote:
> Hi Itaru,
> 
> On 10/01/2024 05:40, Itaru Kitayama wrote:
>> On Mon, Oct 02, 2023 at 01:43:11PM +0100, Suzuki K Poulose wrote:
>>> Hi,
>>>
>>>

...

>>
>> Suzuki,
>> Any update to the Arm CCA series (v3?) since last October?
> 
> Yes, we now have a version that supports the final RMM-v1.0
> specification (RMM-v1.0-EAC5). We also have the UEFI EDK2 firmware
> support for Guests in Realm world.
> 
> We are planning to post the changes for review in the v6.8-rc cycle. We
> are trying to integrate the guest_mem support (available in v6.8-rc1), as
> well as to reuse some of the arm64 KVM generic interfaces for configuring
> the Realm parameters (e.g., PMU, SVE_VL, etc.).
> 
> Here is a version that is missing the items mentioned above, based
> on v6.7-rc4, if anyone would like to try.
> 
> Also, the easiest way to get the components built and the model kick-started
> is to use the shrinkwrap [6] tool with the cca-3world configuration.
> The tool pulls all the required software components, builds them (including
> buildroot for the rootfs) and can run the model using these built
> components.

Also, please see 'arm/run-realm-tests.sh' in the kvm-unit-tests-cca 
repository for sample command lines to invoke kvmtool to create Realm
VMs.
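
For reference, a Realm launch with kvmtool then looks roughly like the following. Paths and sizes here are placeholders, and the script above carries the authoritative command lines:

```shell
# Hypothetical example invocation; --realm comes from the cca branch of
# kvmtool, the remaining options are standard lkvm flags.
lkvm run --realm                  \
     --measurement-algo=sha256    \
     -c 2 -m 512                  \
     -k Image                     \
     -i rootfs.cpio
```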


> 
> 
> 
> [0] Linux Repo:
>        Where: git@git.gitlab.arm.com:linux-arm/linux-cca.git
>        KVM Support branch: cca-host/rmm-v1.0-eac5
>        Linux Guest branch: cca-guest/rmm-v1.0-eac5
>        Full stack branch:  cca-full/rmm-v1.0-eac5
> 
> [1] kvmtool Repo:
>        Where: git@git.gitlab.arm.com:linux-arm/kvmtool-cca.git
>        Branch: cca/rmm-v1.0-eac5
> 
> [2] kvm-unit-tests Repo:
>        Where: git@git.gitlab.arm.com:linux-arm/kvm-unit-tests-cca.git
>        Branch: cca/rmm-v1.0-eac5
> 
> [3] UEFI Guest firmware:
>        edk2:     https://git.gitlab.arm.com/linux-arm/edk2-cca.git
>        revision: 2802_arm_cca_rmm-v1.0-eac5
> 
>        edk2-platforms: https://git.gitlab.arm.com/linux-arm/edk2-platforms-cca.git
>        revision:       2802_arm_cca_rmm-v1.0-eac5
> 
> 
> [4] RMM Repo:
>        Where: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
>        tag : tf-rmm-v0.4.0
> 
> [5] TF-A repo:
>        Where: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git
>        Tag: v2.10
> 
> 
> [6] https://shrinkwrap.docs.arm.com/en/latest/
>      config: cca-3world.yaml
> 

Suzuki


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC] Support for Arm CCA VMs on Linux
  2024-01-10 11:41     ` Suzuki K Poulose
  2024-01-10 13:44       ` Suzuki K Poulose
@ 2024-01-12  5:01       ` Itaru Kitayama
  1 sibling, 0 replies; 190+ messages in thread
From: Itaru Kitayama @ 2024-01-12  5:01 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-coco, linux-kernel, linux-arm-kernel, kvm, kvmarm,
	catalin.marinas, will, maz, steven.price, alexandru.elisei,
	joey.gouly, james.morse, Jonathan.Cameron, dgilbert, jpb,
	oliver.upton, zhi.wang.linux, yuzenghui, salil.mehta,
	Andrew Jones, Chao Peng, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Mark Rutland, Paolo Bonzini,
	Quentin Perret, Sean Christopherson, Thomas Huth, Ryan Roberts,
	Sami Mujawar

On Wed, Jan 10, 2024 at 11:41:09AM +0000, Suzuki K Poulose wrote:
> Hi Itaru,
> 
> On 10/01/2024 05:40, Itaru Kitayama wrote:
> > On Mon, Oct 02, 2023 at 01:43:11PM +0100, Suzuki K Poulose wrote:
> > > Hi,
> > > 
> > > 
> > > > We are happy to announce the early RFC version of the Arm
> > > > Confidential Compute Architecture (CCA) support for the Linux
> > > > stack. The intention is to seek early feedback in the following areas:
> > > >   * KVM integration of the Arm CCA
> > > >   * KVM UABI for managing the Realms, seeking to generalise the operations
> > > >     wherever possible with other Confidential Compute solutions.
> > > >     Note: This version doesn't support Guest Private memory, which will be added
> > > >     later (see below).
> > > >   * Linux Guest support for Realms
> > > > 
> > > 
> > > We have updated the stack for Arm CCA Linux support to RMM-v1.0-EAC2 (See links)
> > > We are not posting the patches for review yet, as we plan to update our
> > > stack to support the latest RMM-v1.0 specification, which includes some
> > > functional changes to support PSCI monitoring by the VMM along with other
> > > minor changes. All relevant components are updated on a new branch "rmm-v1.0-eac2"
> > > Guest-mem support is not included, but is in progress.
> > > 
> > > Change log :
> > >   - KVM RMI support updated to v1.0-eac2, with optimisations to stage2 tear down
> > >   - Guest (Linux and kvm-unit-test) support for RSI compliant to v1.0-eac2
> > >   - SVE, PMU support for Realms
> > > 
> > > kvmtool :
> > >    - Dropped no-compat and switched to --loglevel (merged upstream)
> > >    - Support for SVE, --sve-vl for vector length
> > > 
> > > > Arm CCA Introduction
> > > > =====================
> > > > 
> > > > The Arm CCA is a reference software architecture and implementation that builds
> > > > on the Realm Management Extension (RME), enabling the execution of Virtual
> > > > machines, while preventing access by more privileged software, such as the hypervisor.
> > > > The Arm CCA allows the hypervisor to control the VM, but removes its right to
> > > > access the code, register state or data used by the VM.
> > > > More information on the architecture is available here[0].
> > > > 
> > > >      Arm CCA Reference Software Architecture
> > > > 
> > > >          Realm World    ||    Normal World   ||  Secure World  ||
> > > >                         ||        |          ||                ||
> > > >   EL0 x-------x         || x----x | x------x ||                ||
> > > >       | Realm |         || |    | | |      | ||                ||
> > > >       |       |         || | VM | | |      | ||                ||
> > > >   ----|  VM*  |---------||-|    |---|      |-||----------------||
> > > >       |       |         || |    | | |  H   | ||                ||
> > > >   EL1 x-------x         || x----x | |      | ||                ||
> > > >           ^             ||        | |  o   | ||                ||
> > > >           |             ||        | |      | ||                ||
> > > >   ------- R*------------------------|  s  -|---------------------
> > > >           S             ||          |      | ||                ||
> > > >           I             ||          |  t   | ||                ||
> > > >           |             ||          |      | ||                ||
> > > >           v             ||          x------x ||                ||
> > > >   EL2    RMM*           ||              ^    ||                ||
> > > >           ^             ||              |    ||                ||
> > > >   ========|=============================|========================
> > > >           |                             | SMC
> > > >           x--------- *RMI* -------------x
> > > > 
> > > >   EL3                   Root World
> > > >                         EL3 Firmware
> > > >   ===============================================================
> > > > Where :
> > > >   RMM - Realm Management Monitor
> > > >   RMI - Realm Management Interface
> > > >   RSI - Realm Service Interface
> > > >   SMC - Secure Monitor Call
> > > > 
> > > > RME introduces a new security state "Realm world", in addition to the
> > > > traditional Secure and Non-Secure states. The Arm CCA defines a new component,
> > > > Realm Management Monitor (RMM) that runs at R-EL2. This is a standard piece of
> > > > firmware, verified, installed and loaded by the EL3 firmware (e.g., TF-A), at
> > > > system boot.
> > > > 
> > > > The RMM provides standard interfaces - Realm Management Interface (RMI) - to the
> > > > Normal world hypervisor to manage the VMs running in the Realm world (also called
> > > > Realms in short). These are exposed via SMC and are routed through the EL3
> > > > firmware.
> > > > The RMI interface includes:
> > > >    - Move a physical page from the Normal world to the Realm world
> > > >    - Creating a Realm with requested parameters, tracked via Realm Descriptor (RD)
> > > >    - Creating VCPUs aka Realm Execution Context (REC), with initial register state.
> > > >    - Create stage2 translation table at any level.
> > > >    - Load initial images into Realm Memory from normal world memory
> > > >    - Schedule RECs (vCPUs) and handle exits
> > > >    - Inject virtual interrupts into the Realm
> > > >    - Service stage2 runtime faults with pages (provided by host, scrubbed by RMM).
> > > >    - Create "shared" mappings that can be accessed by VMM/Hyp.
> > > >    - Reclaim the memory allocated for the RAM and RTTs (Realm Translation Tables)
> > > > 
> > > > However v1.0 of RMM specifications doesn't support:
> > > >   - Paging protected memory of a Realm VM. Thus the pages backing the protected
> > > >     memory region must be pinned.
> > > >   - Live migration of Realms.
> > > >   - Trusted Device assignment.
> > > >   - Physical interrupt backed Virtual interrupts for Realms
> > > > 
> > > > RMM also provides certain services to the Realms via SMC, called Realm Service
> > > > Interface (RSI). These include:
> > > >   - Realm Guest Configuration.
> > > >   - Attestation & Measurement services
> > > >   - Managing the state of an Intermediate Physical Address (IPA aka GPA) page.
> > > >   - Host Call service (Communication with the Normal world Hypervisor)
> > > > 
> > > > The specification for the RMM software is currently at *v1.0-Beta2* and the
> > > > latest version is available here [1].
> > > > 
> > > > The Trusted Firmware foundation has an implementation of the RMM - TF-RMM -
> > > > available here [3].
> > > > 
> > > > Implementation
> > > > =================
> > > > 
> > > > This version of the stack is based on the RMM specification v1.0-Beta0[2], with
> > > > the following exceptions:
> > > >    - TF-RMM/KVM currently doesn't support the optional features of PMU,
> > > >       SVE and Self-hosted debug (coming soon).
> > > >    - The RSI_HOST_CALL structure alignment requirement is reduced to match
> > > >       RMM v1.0 Beta1
> > > >    - RMI/RSI version numbers do not match the RMM spec. This will be
> > > >      resolved once the spec/implementation is complete, across TF-RMM+Linux stack.
> > > > 
> > > > We plan to update the stack to support the latest version of the RMMv1.0 spec
> > > > in the coming revisions.
> > > > 
> > > > This release includes the following components :
> > > > 
> > > >   a) Linux Kernel
> > > >       i) Host / KVM support - Support for driving the Realms via RMI. This is
> > > >       dependent on running in the Kernel at EL2 (aka VHE mode). Also provides
> > > >       UABI for VMMs to manage the Realm VMs. The support is restricted to 4K page
> > > >       size, matching the Stage2 granule supported by RMM. The VMM is responsible
> > > >       for making sure the guest memory is locked.
> > > > 
> > > >         TODO: Guest Private memory[10] integration - We have been following the
> > > >         series and support will be added once it is merged upstream.
> > > >       ii) Guest support - Support for a Linux Kernel to run in the Realm VM at
> > > >       Realm-EL1, using RSI services. This includes virtio support (virtio-v1.0
> > > >       only). All I/O are treated as non-secure/shared.
> > > >   b) kvmtool - VMM changes required to manage Realm VMs. No guest private memory
> > > >      as mentioned above.
> > > >   c) kvm-unit-tests - Support for running in Realms along with additional tests
> > > >      for RSI ABI.
> > > > 
> > > > Running the stack
> > > > ====================
> > > > 
> > > > To run/test the stack, you would need the following components :
> > > > 
> > > > 1) FVP Base AEM RevC model with FEAT_RME support [4]
> > > > 2) TF-A firmware for EL3 [5]
> > > > 3) TF-A RMM for R-EL2 [3]
> > > > 4) Linux Kernel [6]
> > > > 5) kvmtool [7]
> > > > 6) kvm-unit-tests [8]
> > > > 
> > > > Instructions for building the firmware components and running the model are
> > > > available here [9]. Once the host kernel is booted, a Realm can be launched by
> > > > invoking the `lkvm` command as follows:
> > > > 
> > > >   $ lkvm run --realm 				 \
> > > > 	 --measurement-algo=["sha256", "sha512"] \
> > > > 	 --disable-sve				 \
> > > 
> > > As noted above, this is no longer required.
> > > 
> > > > 	 <normal-vm-options>
> > > > 
> > > > Where:
> > > >   * --measurement-algo (Optional) specifies the algorithm selected for creating the
> > > >     initial measurements by the RMM for this Realm (defaults to sha256).
> > > >   * GICv3 is mandatory for the Realms.
> > > >   * SVE is not yet supported in the TF-RMM, and thus must be disabled using
> > > >     --disable-sve
> > > > 
> > > > You may also run the kvm-unit-tests inside the Realm world, using similar
> > > > options to the above.
> > > > 
> > > > 
> > > > Links
> > > > ============
> > > > 
> > > > [0] Arm CCA Landing page (See Key Resources section for various documentations)
> > > >      https://www.arm.com/architecture/security-features/arm-confidential-compute-architecture
> > > > 
> > > > [1] RMM Specification Latest
> > > >      https://developer.arm.com/documentation/den0137/latest
> > > > 
> > > > [2] RMM v1.0-Beta0 specification
> > > >      https://developer.arm.com/documentation/den0137/1-0bet0/
> > > 
> > >   EAC2 spec: https://developer.arm.com/documentation/den0137/1-0eac2/
> > > > 
> > > > [3] Trusted Firmware RMM - TF-RMM
> > > >      https://www.trustedfirmware.org/projects/tf-rmm/
> > > >      GIT: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
> > > > 
> > > > [4] FVP Base RevC AEM Model (available on x86_64 / Arm64 Linux)
> > > >      https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms
> > > > 
> > > > [5] Trusted Firmware for A class
> > > >      https://www.trustedfirmware.org/projects/tf-a/
> > > > 
> > > > [6] Linux kernel support for Arm-CCA
> > > >      https://gitlab.arm.com/linux-arm/linux-cca
> > > >      Host Support branch:	cca-host/rfc-v1
> > > 
> > > Update branch : cca-host/rmm-v1.0-eac2
> > > 
> > > >      Guest Support branch:	cca-guest/rfc-v1
> > > 
> > > Update branch : cca-guest/rmm-v1.0-eac2
> > > 
> > > Combined tree for host and guest is also available at: "cca-full/rmm-v1.0-eac2"
> > > 
> > > > 
> > > > [7] kvmtool support for Arm CCA
> > > >      https://gitlab.arm.com/linux-arm/kvmtool-cca cca/rfc-v1
> > > 
> > > Update branch : cca/rmm-v1.0-eac2
> > > 
> > > > 
> > > > [8] kvm-unit-tests support for Arm CCA
> > > >      https://gitlab.arm.com/linux-arm/kvm-unit-tests-cca  cca/rfc-v1
> > > > 
> > > 
> > > Update branch : cca/rmm-v1.0-eac2
> > > 
> > > 
> > > Suzuki
> > > 
> > > > [9] Instructions for Building Firmware components and running the model, see
> > > >      section 4.19.2 "Building and running TF-A with RME"
> > > >      https://trustedfirmware-a.readthedocs.io/en/latest/components/realm-management-extension.html#building-and-running-tf-a-with-rme
> > > > 
> > > > [10] fd based Guest Private memory for KVM
> > > >     https://lkml.kernel.org/r/20221202061347.1070246-1-chao.p.peng@linux.intel.com
> > > 
> > > 
> > > 
> > > 
> > > 
> > > Cc: Alexandru Elisei <alexandru.elisei@arm.com>
> > > Cc: Andrew Jones <andrew.jones@linux.dev>
> > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Chao Peng <chao.p.peng@linux.intel.com>
> > > Cc: Christoffer Dall <christoffer.dall@arm.com>
> > > Cc: Fuad Tabba <tabba@google.com>
> > > Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
> > > Cc: James Morse <james.morse@arm.com>
> > > Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > > Cc: Joey Gouly <Joey.Gouly@arm.com>
> > > Cc: Marc Zyngier <maz@kernel.org>
> > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > Cc: Oliver Upton <oliver.upton@linux.dev>
> > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > Cc: Quentin Perret <qperret@google.com>
> > > Cc: Sean Christopherson <seanjc@google.com>
> > > Cc: Steven Price <steven.price@arm.com>
> > > Cc: Thomas Huth <thuth@redhat.com>
> > > Cc: Will Deacon <will@kernel.org>
> > > Cc: Zenghui Yu <yuzenghui@huawei.com>
> > > To: linux-coco@lists.linux.dev
> > > To: kvmarm@lists.linux.dev
> > > Cc: linux-arm-kernel@lists.infradead.org
> > > To: linux-kernel@vger.kernel.org
> > > To: kvm@vger.kernel.org
> > 
> > Suzuki,
> > Any update to the Arm CCA series (v3?) since last October?
> 
> Yes, we now have a version that supports the final RMM-v1.0
> specification (RMM-v1.0-EAC5). We also have the UEFI EDK2 firmware
> support for Guests in Realm world.
> 
> We are planning to post the changes for review in the v6.8-rc cycle. We
> are trying to integrate the guest_mem support (available in v6.8-rc1) as
> well as reusing some of the arm64 kvm generic interface for configuring
> the Realm parameters (e.g., PMU, SVE_VL etc).
> 
> Here is a version that is missing the items mentioned above, based
> on v6.7-rc4, if anyone would like to try.
> 
> Also, the easiest way to get the components built and the model kick-started
> is using the shrinkwrap [6] tool with the cca-3world configuration.
> The tool pulls all the required software components, builds them (including
> a buildroot rootfs) and can run a model using the built components.

Hi Suzuki,

This is great news! I've just booted your WIP Linux kernel through
shrinkwrap (cca-3world.yaml) without an issue.
Many thanks to Ryan, who delivered an extremely handy tool to us.

Thanks,
Itaru.

> 
> 
> 
> [0] Linux Repo:
>       Where: git@git.gitlab.arm.com:linux-arm/linux-cca.git
>       KVM Support branch: cca-host/rmm-v1.0-eac5
>       Linux Guest branch: cca-guest/rmm-v1.0-eac5
>       Full stack branch:  cca-full/rmm-v1.0-eac5
> 
> [1] kvmtool Repo:
>       Where: git@git.gitlab.arm.com:linux-arm/kvmtool-cca.git
>       Branch: cca/rmm-v1.0-eac5
> 
> [2] kvm-unit-tests Repo:
>       Where: git@git.gitlab.arm.com:linux-arm/kvm-unit-tests-cca.git
>       Branch: cca/rmm-v1.0-eac5
> 
> [3] UEFI Guest firmware:
>       edk2:     https://git.gitlab.arm.com/linux-arm/edk2-cca.git
>       revision: 2802_arm_cca_rmm-v1.0-eac5
> 
>       edk2-platforms:
> https://git.gitlab.arm.com/linux-arm/edk2-platforms-cca.git
>       revision:       2802_arm_cca_rmm-v1.0-eac5
> 
> 
> [4] RMM Repo:
>       Where: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
>       tag : tf-rmm-v0.4.0
> 
> [5] TF-A repo:
>       Where: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git
>       Tag: v2.10
> 
> 
> [6] https://shrinkwrap.docs.arm.com/en/latest/
>     config: cca-3world.yaml
> 
> Kind regards
> Suzuki
> 
> 


* Re: [RFC] Support for Arm CCA VMs on Linux
  2024-01-10 13:44       ` Suzuki K Poulose
@ 2024-01-19  1:26         ` Itaru Kitayama
  0 siblings, 0 replies; 190+ messages in thread
From: Itaru Kitayama @ 2024-01-19  1:26 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-coco, linux-kernel, linux-arm-kernel, kvm, kvmarm,
	catalin.marinas, will, maz, steven.price, alexandru.elisei,
	joey.gouly, james.morse, Jonathan.Cameron, dgilbert, jpb,
	oliver.upton, zhi.wang.linux, yuzenghui, salil.mehta,
	Andrew Jones, Chao Peng, Christoffer Dall, Fuad Tabba,
	Jean-Philippe Brucker, Mark Rutland, Paolo Bonzini,
	Quentin Perret, Sean Christopherson, Thomas Huth, Ryan Roberts,
	Sami Mujawar

On Wed, Jan 10, 2024 at 01:44:45PM +0000, Suzuki K Poulose wrote:
> On 10/01/2024 11:41, Suzuki K Poulose wrote:
> > Hi Itaru,
> > 
> > On 10/01/2024 05:40, Itaru Kitayama wrote:
> > > On Mon, Oct 02, 2023 at 01:43:11PM +0100, Suzuki K Poulose wrote:
> > > > Hi,
> > > > 
> > > > 
> 
> ...
> 
> > > 
> > > Suzuki,
> > > Any update to the Arm CCA series (v3?) since last October?
> > 
> > Yes, we now have a version that supports the final RMM-v1.0
> > specification (RMM-v1.0-EAC5). We also have the UEFI EDK2 firmware
> > support for Guests in Realm world.
> > 
> > We are planning to post the changes for review in the v6.8-rc cycle. We
> > are trying to integrate the guest_mem support (available in v6.8-rc1) as
> > well as reusing some of the arm64 kvm generic interface for configuring
> > the Realm parameters (e.g., PMU, SVE_VL etc).
> > 
> > Here is a version that is missing the items mentioned above, based
> > on v6.7-rc4, if anyone would like to try.
> > 
> > Also, the easiest way to get the components built and the model kick-started
> > is using the shrinkwrap [6] tool with the cca-3world configuration.
> > The tool pulls all the required software components, builds them (including
> > a buildroot rootfs) and can run a model using the built components.
> 
> Also, please see 'arm/run-realm-tests.sh' in the kvm-unit-tests-cca
> repository for sample command lines to invoke kvmtool to create Realm
> VMs.

Thank you, Suzuki. I have just run the script above, again via
shrinkwrap on the RevC FVP, and the jobs ran fine. I still need to
go through the copious logs.

Itaru.

> 
> 
> > 
> > 
> > 
> > [0] Linux Repo:
> >        Where: git@git.gitlab.arm.com:linux-arm/linux-cca.git
> >        KVM Support branch: cca-host/rmm-v1.0-eac5
> >        Linux Guest branch: cca-guest/rmm-v1.0-eac5
> >        Full stack branch:  cca-full/rmm-v1.0-eac5
> > 
> > [1] kvmtool Repo:
> >        Where: git@git.gitlab.arm.com:linux-arm/kvmtool-cca.git
> >        Branch: cca/rmm-v1.0-eac5
> > 
> > [2] kvm-unit-tests Repo:
> >        Where: git@git.gitlab.arm.com:linux-arm/kvm-unit-tests-cca.git
> >        Branch: cca/rmm-v1.0-eac5
> > 
> > [3] UEFI Guest firmware:
> >        edk2:     https://git.gitlab.arm.com/linux-arm/edk2-cca.git
> >        revision: 2802_arm_cca_rmm-v1.0-eac5
> > 
> >        edk2-platforms:
> > https://git.gitlab.arm.com/linux-arm/edk2-platforms-cca.git
> >        revision:       2802_arm_cca_rmm-v1.0-eac5
> > 
> > 
> > [4] RMM Repo:
> >        Where: https://git.trustedfirmware.org/TF-RMM/tf-rmm.git
> >        tag : tf-rmm-v0.4.0
> > 
> > [5] TF-A repo:
> >        Where: https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git
> >        Tag: v2.10
> > 
> > 
> > [6] https://shrinkwrap.docs.arm.com/en/latest/
> >      config: cca-3world.yaml
> > 
> 
> Suzuki
> 


* Re: [RFC PATCH 03/28] arm64: RME: Add wrappers for RMI calls
  2023-01-27 11:29   ` [RFC PATCH 03/28] arm64: RME: Add wrappers for RMI calls Steven Price
  2023-02-13 16:43     ` Zhi Wang
@ 2024-03-18  7:03     ` Ganapatrao Kulkarni
  2024-03-18 11:22       ` Steven Price
  1 sibling, 1 reply; 190+ messages in thread
From: Ganapatrao Kulkarni @ 2024-03-18  7:03 UTC (permalink / raw)
  To: Steven Price, kvm, kvmarm
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon, James Morse,
	Oliver Upton, Suzuki K Poulose, Zenghui Yu, linux-arm-kernel,
	linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
	Fuad Tabba, linux-coco


Hi Steven,

On 27-01-2023 04:59 pm, Steven Price wrote:
> The wrappers make the call sites easier to read and deal with the
> boilerplate of handling the error codes from the RMM.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>   arch/arm64/include/asm/rmi_cmds.h | 259 ++++++++++++++++++++++++++++++
>   1 file changed, 259 insertions(+)
>   create mode 100644 arch/arm64/include/asm/rmi_cmds.h
> 
> diff --git a/arch/arm64/include/asm/rmi_cmds.h b/arch/arm64/include/asm/rmi_cmds.h
> new file mode 100644
> index 000000000000..d5468ee46f35
> --- /dev/null
> +++ b/arch/arm64/include/asm/rmi_cmds.h
> @@ -0,0 +1,259 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2023 ARM Ltd.
> + */
> +
> +#ifndef __ASM_RMI_CMDS_H
> +#define __ASM_RMI_CMDS_H
> +
> +#include <linux/arm-smccc.h>
> +
> +#include <asm/rmi_smc.h>
> +
> +struct rtt_entry {
> +	unsigned long walk_level;
> +	unsigned long desc;
> +	int state;
> +	bool ripas;
> +};
> +
> +static inline int rmi_data_create(unsigned long data, unsigned long rd,
> +				  unsigned long map_addr, unsigned long src,
> +				  unsigned long flags)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_DATA_CREATE, data, rd, map_addr, src,
> +			     flags, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_data_create_unknown(unsigned long data,
> +					  unsigned long rd,
> +					  unsigned long map_addr)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_DATA_CREATE_UNKNOWN, data, rd, map_addr,
> +			     &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_data_destroy(unsigned long rd, unsigned long map_addr)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_DATA_DESTROY, rd, map_addr, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_features(unsigned long index, unsigned long *out)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_FEATURES, index, &res);
> +
> +	*out = res.a1;
> +	return res.a0;
> +}
> +
> +static inline int rmi_granule_delegate(unsigned long phys)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_GRANULE_DELEGATE, phys, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_granule_undelegate(unsigned long phys)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_GRANULE_UNDELEGATE, phys, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_psci_complete(unsigned long calling_rec,
> +				    unsigned long target_rec)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_PSCI_COMPLETE, calling_rec, target_rec,
> +			     &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_realm_activate(unsigned long rd)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_REALM_ACTIVATE, rd, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_realm_create(unsigned long rd, unsigned long params_ptr)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_REALM_CREATE, rd, params_ptr, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_realm_destroy(unsigned long rd)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_REALM_DESTROY, rd, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rec_aux_count(unsigned long rd, unsigned long *aux_count)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_REC_AUX_COUNT, rd, &res);
> +
> +	*aux_count = res.a1;
> +	return res.a0;
> +}
> +
> +static inline int rmi_rec_create(unsigned long rec, unsigned long rd,
> +				 unsigned long params_ptr)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_REC_CREATE, rec, rd, params_ptr, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rec_destroy(unsigned long rec)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_REC_DESTROY, rec, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rec_enter(unsigned long rec, unsigned long run_ptr)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_REC_ENTER, rec, run_ptr, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rtt_create(unsigned long rtt, unsigned long rd,
> +				 unsigned long map_addr, unsigned long level)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_RTT_CREATE, rtt, rd, map_addr, level,
> +			     &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rtt_destroy(unsigned long rtt, unsigned long rd,
> +				  unsigned long map_addr, unsigned long level)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_RTT_DESTROY, rtt, rd, map_addr, level,
> +			     &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rtt_fold(unsigned long rtt, unsigned long rd,
> +			       unsigned long map_addr, unsigned long level)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_RTT_FOLD, rtt, rd, map_addr, level, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rtt_init_ripas(unsigned long rd, unsigned long map_addr,
> +				     unsigned long level)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_RTT_INIT_RIPAS, rd, map_addr, level, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rtt_map_unprotected(unsigned long rd,
> +					  unsigned long map_addr,
> +					  unsigned long level,
> +					  unsigned long desc)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_RTT_MAP_UNPROTECTED, rd, map_addr, level,
> +			     desc, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rtt_read_entry(unsigned long rd, unsigned long map_addr,
> +				     unsigned long level, struct rtt_entry *rtt)
> +{
> +	struct arm_smccc_1_2_regs regs = {
> +		SMC_RMI_RTT_READ_ENTRY,
> +		rd, map_addr, level
> +	};
> +
> +	arm_smccc_1_2_smc(&regs, &regs);
> +
> +	rtt->walk_level = regs.a1;
> +	rtt->state = regs.a2 & 0xFF;
> +	rtt->desc = regs.a3;
> +	rtt->ripas = regs.a4 & 1;
> +
> +	return regs.a0;
> +}
> +
> +static inline int rmi_rtt_set_ripas(unsigned long rd, unsigned long rec,
> +				    unsigned long map_addr, unsigned long level,
> +				    unsigned long ripas)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_RTT_SET_RIPAS, rd, rec, map_addr, level,
> +			     ripas, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline int rmi_rtt_unmap_unprotected(unsigned long rd,
> +					    unsigned long map_addr,
> +					    unsigned long level)
> +{
> +	struct arm_smccc_res res;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_RTT_UNMAP_UNPROTECTED, rd, map_addr,
> +			     level, &res);
> +
> +	return res.a0;
> +}
> +
> +static inline phys_addr_t rmi_rtt_get_phys(struct rtt_entry *rtt)
> +{
> +	return rtt->desc & GENMASK(47, 12);
> +}
> +
> +#endif

Can we please replace all occurrences of "unsigned long" with u64?
Also, as per the spec, the RTT level is an Int64; can we change the type
accordingly?

Please CC me in future cca patch-sets.
gankulkarni@os.amperecomputing.com


Thanks,
Ganapat


* Re: [RFC PATCH 04/28] arm64: RME: Check for RME support at KVM init
  2023-01-27 11:29   ` [RFC PATCH 04/28] arm64: RME: Check for RME support at KVM init Steven Price
  2023-02-13 15:48     ` Zhi Wang
  2023-02-13 15:55     ` Zhi Wang
@ 2024-03-18  7:17     ` Ganapatrao Kulkarni
  2024-03-18 11:22       ` Steven Price
  2 siblings, 1 reply; 190+ messages in thread
From: Ganapatrao Kulkarni @ 2024-03-18  7:17 UTC (permalink / raw)
  To: Steven Price, kvm, kvmarm
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon, James Morse,
	Oliver Upton, Suzuki K Poulose, Zenghui Yu, linux-arm-kernel,
	linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
	Fuad Tabba, linux-coco



On 27-01-2023 04:59 pm, Steven Price wrote:
> Query the RMI version number and check if it is a compatible version. A
> static key is also provided to signal that a supported RMM is available.
> 
> Functions are provided to query if a VM or VCPU is a realm (or rec)
> which currently will always return false.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>   arch/arm64/include/asm/kvm_emulate.h | 17 ++++++++++
>   arch/arm64/include/asm/kvm_host.h    |  4 +++
>   arch/arm64/include/asm/kvm_rme.h     | 22 +++++++++++++
>   arch/arm64/include/asm/virt.h        |  1 +
>   arch/arm64/kvm/Makefile              |  3 +-
>   arch/arm64/kvm/arm.c                 |  8 +++++
>   arch/arm64/kvm/rme.c                 | 49 ++++++++++++++++++++++++++++
>   7 files changed, 103 insertions(+), 1 deletion(-)
>   create mode 100644 arch/arm64/include/asm/kvm_rme.h
>   create mode 100644 arch/arm64/kvm/rme.c
> 
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index 9bdba47f7e14..5a2b7229e83f 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -490,4 +490,21 @@ static inline bool vcpu_has_feature(struct kvm_vcpu *vcpu, int feature)
>   	return test_bit(feature, vcpu->arch.features);
>   }
>   
> +static inline bool kvm_is_realm(struct kvm *kvm)
> +{
> +	if (static_branch_unlikely(&kvm_rme_is_available))
> +		return kvm->arch.is_realm;
> +	return false;
> +}
> +
> +static inline enum realm_state kvm_realm_state(struct kvm *kvm)
> +{
> +	return READ_ONCE(kvm->arch.realm.state);
> +}
> +
> +static inline bool vcpu_is_rec(struct kvm_vcpu *vcpu)
> +{
> +	return false;
> +}
> +
>   #endif /* __ARM64_KVM_EMULATE_H__ */
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 35a159d131b5..04347c3a8c6b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -26,6 +26,7 @@
>   #include <asm/fpsimd.h>
>   #include <asm/kvm.h>
>   #include <asm/kvm_asm.h>
> +#include <asm/kvm_rme.h>
>   
>   #define __KVM_HAVE_ARCH_INTC_INITIALIZED
>   
> @@ -240,6 +241,9 @@ struct kvm_arch {
>   	 * the associated pKVM instance in the hypervisor.
>   	 */
>   	struct kvm_protected_vm pkvm;
> +
> +	bool is_realm;
> +	struct realm realm;
>   };
>   
>   struct kvm_vcpu_fault_info {
> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> new file mode 100644
> index 000000000000..c26bc2c6770d
> --- /dev/null
> +++ b/arch/arm64/include/asm/kvm_rme.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (C) 2023 ARM Ltd.
> + */
> +
> +#ifndef __ASM_KVM_RME_H
> +#define __ASM_KVM_RME_H
> +
> +enum realm_state {
> +	REALM_STATE_NONE,
> +	REALM_STATE_NEW,
> +	REALM_STATE_ACTIVE,
> +	REALM_STATE_DYING
> +};
> +
> +struct realm {
> +	enum realm_state state;
> +};
> +
> +int kvm_init_rme(void);
> +
> +#endif
> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> index 4eb601e7de50..be1383e26626 100644
> --- a/arch/arm64/include/asm/virt.h
> +++ b/arch/arm64/include/asm/virt.h
> @@ -80,6 +80,7 @@ void __hyp_set_vectors(phys_addr_t phys_vector_base);
>   void __hyp_reset_vectors(void);
>   
>   DECLARE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);
> +DECLARE_STATIC_KEY_FALSE(kvm_rme_is_available);
>   
>   /* Reports the availability of HYP mode */
>   static inline bool is_hyp_mode_available(void)
> diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> index 5e33c2d4645a..d2f0400c50da 100644
> --- a/arch/arm64/kvm/Makefile
> +++ b/arch/arm64/kvm/Makefile
> @@ -20,7 +20,8 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
>   	 vgic/vgic-v3.o vgic/vgic-v4.o \
>   	 vgic/vgic-mmio.o vgic/vgic-mmio-v2.o \
>   	 vgic/vgic-mmio-v3.o vgic/vgic-kvm-device.o \
> -	 vgic/vgic-its.o vgic/vgic-debug.o
> +	 vgic/vgic-its.o vgic/vgic-debug.o \
> +	 rme.o
>   
>   kvm-$(CONFIG_HW_PERF_EVENTS)  += pmu-emul.o pmu.o
>   
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 9c5573bc4614..d97b39d042ab 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -38,6 +38,7 @@
>   #include <asm/kvm_asm.h>
>   #include <asm/kvm_mmu.h>
>   #include <asm/kvm_pkvm.h>
> +#include <asm/kvm_rme.h>
>   #include <asm/kvm_emulate.h>
>   #include <asm/sections.h>
>   
> @@ -47,6 +48,7 @@
>   
>   static enum kvm_mode kvm_mode = KVM_MODE_DEFAULT;
>   DEFINE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);
> +DEFINE_STATIC_KEY_FALSE(kvm_rme_is_available);
>   
>   DECLARE_KVM_HYP_PER_CPU(unsigned long, kvm_hyp_vector);
>   
> @@ -2213,6 +2215,12 @@ int kvm_arch_init(void *opaque)
>   
>   	in_hyp_mode = is_kernel_in_hyp_mode();
>   
> +	if (in_hyp_mode) {
> +		err = kvm_init_rme();
> +		if (err)
> +			return err;
> +	}
> +
>   	if (cpus_have_final_cap(ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE) ||
>   	    cpus_have_final_cap(ARM64_WORKAROUND_1508412))
>   		kvm_info("Guests without required CPU erratum workarounds can deadlock system!\n" \
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> new file mode 100644
> index 000000000000..f6b587bc116e
> --- /dev/null
> +++ b/arch/arm64/kvm/rme.c
> @@ -0,0 +1,49 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2023 ARM Ltd.
> + */
> +
> +#include <linux/kvm_host.h>
> +
> +#include <asm/rmi_cmds.h>
> +#include <asm/virt.h>
> +
> +static int rmi_check_version(void)
> +{
> +	struct arm_smccc_res res;
> +	int version_major, version_minor;
> +
> +	arm_smccc_1_1_invoke(SMC_RMI_VERSION, &res);
> +
> +	if (res.a0 == SMCCC_RET_NOT_SUPPORTED)
> +		return -ENXIO;
> +
> +	version_major = RMI_ABI_VERSION_GET_MAJOR(res.a0);
> +	version_minor = RMI_ABI_VERSION_GET_MINOR(res.a0);
> +
> +	if (version_major != RMI_ABI_MAJOR_VERSION) {
> +		kvm_err("Unsupported RMI ABI (version %d.%d) we support %d\n",

Can we please replace "we support" with "host supports"?
Also, in the patch present in the repo you are using the variable
our_version; can this be changed to host_version?

> +			version_major, version_minor,
> +			RMI_ABI_MAJOR_VERSION);
> +		return -ENXIO;
> +	}
> +
> +	kvm_info("RMI ABI version %d.%d\n", version_major, version_minor);
> +
> +	return 0;
> +}
> +
> +int kvm_init_rme(void)
> +{
> +	if (PAGE_SIZE != SZ_4K)
> +		/* Only 4k page size on the host is supported */
> +		return 0;
> +
> +	if (rmi_check_version())
> +		/* Continue without realm support */
> +		return 0;
> +
> +	/* Future patch will enable static branch kvm_rme_is_available */
> +
> +	return 0;
> +}

Thanks,
Ganapat

^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms
  2023-01-27 11:29   ` [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms Steven Price
                       ` (2 preceding siblings ...)
  2023-03-06 19:10     ` Zhi Wang
@ 2024-03-18  7:40     ` Ganapatrao Kulkarni
  2024-03-18 11:22       ` Steven Price
  3 siblings, 1 reply; 190+ messages in thread
From: Ganapatrao Kulkarni @ 2024-03-18  7:40 UTC (permalink / raw)
  To: Steven Price, kvm, kvmarm
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon, James Morse,
	Oliver Upton, Suzuki K Poulose, Zenghui Yu, linux-arm-kernel,
	linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
	Fuad Tabba, linux-coco



On 27-01-2023 04:59 pm, Steven Price wrote:
> Add the KVM_CAP_ARM_RME_CREATE_FD ioctl to create a realm. This involves
> delegating pages to the RMM to hold the Realm Descriptor (RD) and for
> the base level of the Realm Translation Tables (RTT). A VMID also needs
> to be picked; since the RMM has a separate VMID address space, a
> dedicated allocator is added for this purpose.
> 
> KVM_CAP_ARM_RME_CONFIG_REALM is provided to allow configuring the realm
> before it is created.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>   arch/arm64/include/asm/kvm_rme.h |  14 ++
>   arch/arm64/kvm/arm.c             |  19 ++
>   arch/arm64/kvm/mmu.c             |   6 +
>   arch/arm64/kvm/reset.c           |  33 +++
>   arch/arm64/kvm/rme.c             | 357 +++++++++++++++++++++++++++++++
>   5 files changed, 429 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> index c26bc2c6770d..055a22accc08 100644
> --- a/arch/arm64/include/asm/kvm_rme.h
> +++ b/arch/arm64/include/asm/kvm_rme.h
> @@ -6,6 +6,8 @@
>   #ifndef __ASM_KVM_RME_H
>   #define __ASM_KVM_RME_H
>   
> +#include <uapi/linux/kvm.h>
> +
>   enum realm_state {
>   	REALM_STATE_NONE,
>   	REALM_STATE_NEW,
> @@ -15,8 +17,20 @@ enum realm_state {
>   
>   struct realm {
>   	enum realm_state state;
> +
> +	void *rd;
> +	struct realm_params *params;
> +
> +	unsigned long num_aux;
> +	unsigned int vmid;
> +	unsigned int ia_bits;
>   };
>   
>   int kvm_init_rme(void);
> +u32 kvm_realm_ipa_limit(void);
> +
> +int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
> +int kvm_init_realm_vm(struct kvm *kvm);
> +void kvm_destroy_realm(struct kvm *kvm);
>   
>   #endif
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index d97b39d042ab..50f54a63732a 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -103,6 +103,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>   		r = 0;
>   		set_bit(KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED, &kvm->arch.flags);
>   		break;
> +	case KVM_CAP_ARM_RME:
> +		if (!static_branch_unlikely(&kvm_rme_is_available))
> +			return -EINVAL;
> +		mutex_lock(&kvm->lock);
> +		r = kvm_realm_enable_cap(kvm, cap);
> +		mutex_unlock(&kvm->lock);
> +		break;
>   	default:
>   		r = -EINVAL;
>   		break;
> @@ -172,6 +179,13 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>   	 */
>   	kvm->arch.dfr0_pmuver.imp = kvm_arm_pmu_get_pmuver_limit();
>   
> +	/* Initialise the realm bits after the generic bits are enabled */
> +	if (kvm_is_realm(kvm)) {
> +		ret = kvm_init_realm_vm(kvm);
> +		if (ret)
> +			goto err_free_cpumask;
> +	}
> +
>   	return 0;
>   
>   err_free_cpumask:
> @@ -204,6 +218,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>   	kvm_destroy_vcpus(kvm);
>   
>   	kvm_unshare_hyp(kvm, kvm + 1);
> +
> +	kvm_destroy_realm(kvm);
>   }
>   
>   int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> @@ -300,6 +316,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>   	case KVM_CAP_ARM_PTRAUTH_GENERIC:
>   		r = system_has_full_ptr_auth();
>   		break;
> +	case KVM_CAP_ARM_RME:
> +		r = static_key_enabled(&kvm_rme_is_available);
> +		break;
>   	default:
>   		r = 0;
>   	}
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 31d7fa4c7c14..d0f707767d05 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -840,6 +840,12 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
>   	struct kvm_pgtable *pgt = NULL;
>   
>   	write_lock(&kvm->mmu_lock);
> +	if (kvm_is_realm(kvm) &&
> +	    kvm_realm_state(kvm) != REALM_STATE_DYING) {
> +		/* TODO: teardown rtts */
> +		write_unlock(&kvm->mmu_lock);
> +		return;
> +	}
>   	pgt = mmu->pgt;
>   	if (pgt) {
>   		mmu->pgd_phys = 0;
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index e0267f672b8a..c165df174737 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -395,3 +395,36 @@ int kvm_set_ipa_limit(void)
>   
>   	return 0;
>   }
> +
> +int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type)
> +{
> +	u64 mmfr0, mmfr1;
> +	u32 phys_shift;
> +	u32 ipa_limit = kvm_ipa_limit;
> +
> +	if (kvm_is_realm(kvm))
> +		ipa_limit = kvm_realm_ipa_limit();
> +
> +	if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
> +		return -EINVAL;
> +
> +	phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type);
> +	if (phys_shift) {
> +		if (phys_shift > ipa_limit ||
> +		    phys_shift < ARM64_MIN_PARANGE_BITS)
> +			return -EINVAL;
> +	} else {
> +		phys_shift = KVM_PHYS_SHIFT;
> +		if (phys_shift > ipa_limit) {
> +			pr_warn_once("%s using unsupported default IPA limit, upgrade your VMM\n",
> +				     current->comm);
> +			return -EINVAL;
> +		}
> +	}
> +
> +	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
> +	mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> +	kvm->arch.vtcr = kvm_get_vtcr(mmfr0, mmfr1, phys_shift);
> +
> +	return 0;
> +}
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> index f6b587bc116e..9f8c5a91b8fc 100644
> --- a/arch/arm64/kvm/rme.c
> +++ b/arch/arm64/kvm/rme.c
> @@ -5,9 +5,49 @@
>   
>   #include <linux/kvm_host.h>
>   
> +#include <asm/kvm_emulate.h>
> +#include <asm/kvm_mmu.h>
>   #include <asm/rmi_cmds.h>
>   #include <asm/virt.h>
>   
> +/************ FIXME: Copied from kvm/hyp/pgtable.c **********/
> +#include <asm/kvm_pgtable.h>
> +
> +struct kvm_pgtable_walk_data {
> +	struct kvm_pgtable		*pgt;
> +	struct kvm_pgtable_walker	*walker;
> +
> +	u64				addr;
> +	u64				end;
> +};
> +
> +static u32 __kvm_pgd_page_idx(struct kvm_pgtable *pgt, u64 addr)
> +{
> +	u64 shift = kvm_granule_shift(pgt->start_level - 1); /* May underflow */
> +	u64 mask = BIT(pgt->ia_bits) - 1;
> +
> +	return (addr & mask) >> shift;
> +}
> +
> +static u32 kvm_pgd_pages(u32 ia_bits, u32 start_level)
> +{
> +	struct kvm_pgtable pgt = {
> +		.ia_bits	= ia_bits,
> +		.start_level	= start_level,
> +	};
> +
> +	return __kvm_pgd_page_idx(&pgt, -1ULL) + 1;
> +}
> +
> +/******************/
> +
> +static unsigned long rmm_feat_reg0;
> +
> +static bool rme_supports(unsigned long feature)
> +{
> +	return !!u64_get_bits(rmm_feat_reg0, feature);
> +}
> +
>   static int rmi_check_version(void)
>   {
>   	struct arm_smccc_res res;
> @@ -33,8 +73,319 @@ static int rmi_check_version(void)
>   	return 0;
>   }
>   
> +static unsigned long create_realm_feat_reg0(struct kvm *kvm)
> +{
> +	unsigned long ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
> +	u64 feat_reg0 = 0;
> +
> +	int num_bps = u64_get_bits(rmm_feat_reg0,
> +				   RMI_FEATURE_REGISTER_0_NUM_BPS);
> +	int num_wps = u64_get_bits(rmm_feat_reg0,
> +				   RMI_FEATURE_REGISTER_0_NUM_WPS);
> +
> +	feat_reg0 |= u64_encode_bits(ia_bits, RMI_FEATURE_REGISTER_0_S2SZ);
> +	feat_reg0 |= u64_encode_bits(num_bps, RMI_FEATURE_REGISTER_0_NUM_BPS);
> +	feat_reg0 |= u64_encode_bits(num_wps, RMI_FEATURE_REGISTER_0_NUM_WPS);
> +
> +	return feat_reg0;
> +}
> +
> +u32 kvm_realm_ipa_limit(void)
> +{
> +	return u64_get_bits(rmm_feat_reg0, RMI_FEATURE_REGISTER_0_S2SZ);
> +}
> +
> +static u32 get_start_level(struct kvm *kvm)
> +{
> +	long sl0 = FIELD_GET(VTCR_EL2_SL0_MASK, kvm->arch.vtcr);
> +
> +	return VTCR_EL2_TGRAN_SL0_BASE - sl0;
> +}
> +
> +static int realm_create_rd(struct kvm *kvm)
> +{
> +	struct realm *realm = &kvm->arch.realm;
> +	struct realm_params *params = realm->params;
> +	void *rd = NULL;
> +	phys_addr_t rd_phys, params_phys;
> +	struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
> +	unsigned int pgd_sz;
> +	int i, r;
> +
> +	if (WARN_ON(realm->rd) || WARN_ON(!realm->params))
> +		return -EEXIST;
> +
> +	rd = (void *)__get_free_page(GFP_KERNEL);
> +	if (!rd)
> +		return -ENOMEM;
> +
> +	rd_phys = virt_to_phys(rd);
> +	if (rmi_granule_delegate(rd_phys)) {
> +		r = -ENXIO;
> +		goto out;
> +	}
> +
> +	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
> +	for (i = 0; i < pgd_sz; i++) {
> +		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
> +
> +		if (rmi_granule_delegate(pgd_phys)) {
> +			r = -ENXIO;
> +			goto out_undelegate_tables;
> +		}
> +	}
> +
> +	params->rtt_level_start = get_start_level(kvm);
> +	params->rtt_num_start = pgd_sz;
> +	params->rtt_base = kvm->arch.mmu.pgd_phys;
> +	params->vmid = realm->vmid;
> +
> +	params_phys = virt_to_phys(params);
> +
> +	if (rmi_realm_create(rd_phys, params_phys)) {
> +		r = -ENXIO;
> +		goto out_undelegate_tables;
> +	}
> +
> +	realm->rd = rd;
> +	realm->ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
> +
> +	if (WARN_ON(rmi_rec_aux_count(rd_phys, &realm->num_aux))) {
> +		WARN_ON(rmi_realm_destroy(rd_phys));
> +		goto out_undelegate_tables;
> +	}
> +
> +	return 0;
> +
> +out_undelegate_tables:
> +	while (--i >= 0) {
> +		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
> +
> +		WARN_ON(rmi_granule_undelegate(pgd_phys));
> +	}
> +	WARN_ON(rmi_granule_undelegate(rd_phys));
> +out:
> +	free_page((unsigned long)rd);
> +	return r;
> +}
> +
> +/* Protects access to rme_vmid_bitmap */
> +static DEFINE_SPINLOCK(rme_vmid_lock);
> +static unsigned long *rme_vmid_bitmap;
> +
> +static int rme_vmid_init(void)
> +{
> +	unsigned int vmid_count = 1 << kvm_get_vmid_bits();
> +
> +	rme_vmid_bitmap = bitmap_zalloc(vmid_count, GFP_KERNEL);
> +	if (!rme_vmid_bitmap) {
> +		kvm_err("%s: Couldn't allocate rme vmid bitmap\n", __func__);
> +		return -ENOMEM;
> +	}
> +
> +	return 0;
> +}
> +
> +static int rme_vmid_reserve(void)
> +{
> +	int ret;
> +	unsigned int vmid_count = 1 << kvm_get_vmid_bits();
> +
> +	spin_lock(&rme_vmid_lock);
> +	ret = bitmap_find_free_region(rme_vmid_bitmap, vmid_count, 0);
> +	spin_unlock(&rme_vmid_lock);
> +
> +	return ret;
> +}
> +
> +static void rme_vmid_release(unsigned int vmid)
> +{
> +	spin_lock(&rme_vmid_lock);
> +	bitmap_release_region(rme_vmid_bitmap, vmid, 0);
> +	spin_unlock(&rme_vmid_lock);
> +}
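
The reserve/release behaviour of this dedicated VMID allocator can be
modelled in plain user-space C (a sketch only — the real code uses the
kernel bitmap API and holds rme_vmid_lock around each operation):

```c
#include <assert.h>
#include <limits.h>

/* User-space model of rme_vmid_bitmap: one bit per VMID, reserve finds
 * the first clear bit, release clears it again. VMID_BITS here is an
 * arbitrary illustrative value; the kernel uses kvm_get_vmid_bits(). */
#define VMID_BITS	8
#define VMID_COUNT	(1 << VMID_BITS)
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)

static unsigned long vmid_bitmap[VMID_COUNT / BITS_PER_WORD];

static int vmid_reserve(void)
{
	for (int v = 0; v < VMID_COUNT; v++) {
		unsigned long *w = &vmid_bitmap[v / BITS_PER_WORD];
		unsigned long bit = 1UL << (v % BITS_PER_WORD);

		if (!(*w & bit)) {
			*w |= bit;
			return v;	/* first free VMID */
		}
	}
	return -1;	/* address space exhausted */
}

static void vmid_release(int v)
{
	vmid_bitmap[v / BITS_PER_WORD] &= ~(1UL << (v % BITS_PER_WORD));
}
```

Released VMIDs become reusable immediately, which matches the
reserve-on-create / release-on-destroy pairing in kvm_create_realm()
and kvm_destroy_realm() below.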
> +
> +static int kvm_create_realm(struct kvm *kvm)
> +{
> +	struct realm *realm = &kvm->arch.realm;
> +	int ret;
> +
> +	if (!kvm_is_realm(kvm) || kvm_realm_state(kvm) != REALM_STATE_NONE)
> +		return -EEXIST;
> +
> +	ret = rme_vmid_reserve();
> +	if (ret < 0)
> +		return ret;
> +	realm->vmid = ret;
> +
> +	ret = realm_create_rd(kvm);
> +	if (ret) {
> +		rme_vmid_release(realm->vmid);
> +		return ret;
> +	}
> +
> +	WRITE_ONCE(realm->state, REALM_STATE_NEW);
> +
> +	/* The realm is up, free the parameters.  */
> +	free_page((unsigned long)realm->params);
> +	realm->params = NULL;
> +
> +	return 0;
> +}
> +
> +static int config_realm_hash_algo(struct realm *realm,
> +				  struct kvm_cap_arm_rme_config_item *cfg)
> +{
> +	switch (cfg->hash_algo) {
> +	case KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA256:
> +		if (!rme_supports(RMI_FEATURE_REGISTER_0_HASH_SHA_256))
> +			return -EINVAL;
> +		break;
> +	case KVM_CAP_ARM_RME_MEASUREMENT_ALGO_SHA512:
> +		if (!rme_supports(RMI_FEATURE_REGISTER_0_HASH_SHA_512))
> +			return -EINVAL;
> +		break;
> +	default:
> +		return -EINVAL;
> +	}
> +	realm->params->measurement_algo = cfg->hash_algo;
> +	return 0;
> +}
> +
> +static int config_realm_sve(struct realm *realm,
> +			    struct kvm_cap_arm_rme_config_item *cfg)
> +{
> +	u64 features_0 = realm->params->features_0;
> +	int max_sve_vq = u64_get_bits(rmm_feat_reg0,
> +				      RMI_FEATURE_REGISTER_0_SVE_VL);
> +
> +	if (!rme_supports(RMI_FEATURE_REGISTER_0_SVE_EN))
> +		return -EINVAL;
> +
> +	if (cfg->sve_vq > max_sve_vq)
> +		return -EINVAL;
> +
> +	features_0 &= ~(RMI_FEATURE_REGISTER_0_SVE_EN |
> +			RMI_FEATURE_REGISTER_0_SVE_VL);
> +	features_0 |= u64_encode_bits(1, RMI_FEATURE_REGISTER_0_SVE_EN);
> +	features_0 |= u64_encode_bits(cfg->sve_vq,
> +				      RMI_FEATURE_REGISTER_0_SVE_VL);
> +
> +	realm->params->features_0 = features_0;
> +	return 0;
> +}
> +
> +static int kvm_rme_config_realm(struct kvm *kvm, struct kvm_enable_cap *cap)
> +{
> +	struct kvm_cap_arm_rme_config_item cfg;
> +	struct realm *realm = &kvm->arch.realm;
> +	int r = 0;
> +
> +	if (kvm_realm_state(kvm) != REALM_STATE_NONE)
> +		return -EBUSY;
> +
> +	if (copy_from_user(&cfg, (void __user *)cap->args[1], sizeof(cfg)))
> +		return -EFAULT;
> +
> +	switch (cfg.cfg) {
> +	case KVM_CAP_ARM_RME_CFG_RPV:
> +		memcpy(&realm->params->rpv, &cfg.rpv, sizeof(cfg.rpv));
> +		break;
> +	case KVM_CAP_ARM_RME_CFG_HASH_ALGO:
> +		r = config_realm_hash_algo(realm, &cfg);
> +		break;
> +	case KVM_CAP_ARM_RME_CFG_SVE:
> +		r = config_realm_sve(realm, &cfg);
> +		break;
> +	default:
> +		r = -EINVAL;
> +	}
> +
> +	return r;
> +}
> +
> +int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
> +{
> +	int r = 0;
> +
> +	switch (cap->args[0]) {
> +	case KVM_CAP_ARM_RME_CONFIG_REALM:
> +		r = kvm_rme_config_realm(kvm, cap);
> +		break;
> +	case KVM_CAP_ARM_RME_CREATE_RD:
> +		if (kvm->created_vcpus) {
> +			r = -EBUSY;
> +			break;
> +		}
> +
> +		r = kvm_create_realm(kvm);
> +		break;
> +	default:
> +		r = -EINVAL;
> +		break;
> +	}
> +
> +	return r;
> +}
> +
> +void kvm_destroy_realm(struct kvm *kvm)
> +{
> +	struct realm *realm = &kvm->arch.realm;
> +	struct kvm_pgtable *pgt = kvm->arch.mmu.pgt;
> +	unsigned int pgd_sz;
> +	int i;
> +
> +	if (realm->params) {
> +		free_page((unsigned long)realm->params);
> +		realm->params = NULL;
> +	}
> +
> +	if (kvm_realm_state(kvm) == REALM_STATE_NONE)
> +		return;
> +
> +	WRITE_ONCE(realm->state, REALM_STATE_DYING);
> +
> +	rme_vmid_release(realm->vmid);
> +
> +	if (realm->rd) {
> +		phys_addr_t rd_phys = virt_to_phys(realm->rd);
> +
> +		if (WARN_ON(rmi_realm_destroy(rd_phys)))
> +			return;
> +		if (WARN_ON(rmi_granule_undelegate(rd_phys)))
> +			return;
> +		free_page((unsigned long)realm->rd);
> +		realm->rd = NULL;
> +	}
> +
> +	pgd_sz = kvm_pgd_pages(pgt->ia_bits, pgt->start_level);
> +	for (i = 0; i < pgd_sz; i++) {
> +		phys_addr_t pgd_phys = kvm->arch.mmu.pgd_phys + i * PAGE_SIZE;
> +
> +		if (WARN_ON(rmi_granule_undelegate(pgd_phys)))
> +			return;
> +	}
> +
> +	kvm_free_stage2_pgd(&kvm->arch.mmu);
> +}
> +
> +int kvm_init_realm_vm(struct kvm *kvm)
> +{
> +	struct realm_params *params;
> +
> +	params = (struct realm_params *)get_zeroed_page(GFP_KERNEL);
> +	if (!params)
> +		return -ENOMEM;
> +
> +	params->features_0 = create_realm_feat_reg0(kvm);
> +	kvm->arch.realm.params = params;
> +	return 0;
> +}
> +
>   int kvm_init_rme(void)
>   {
> +	int ret;
> +
>   	if (PAGE_SIZE != SZ_4K)
>   		/* Only 4k page size on the host is supported */
>   		return 0;
> @@ -43,6 +394,12 @@ int kvm_init_rme(void)
>   		/* Continue without realm support */
>   		return 0;
>   
> +	ret = rme_vmid_init();
> +	if (ret)
> +		return ret;
> +
> +	WARN_ON(rmi_features(0, &rmm_feat_reg0));

Why WARN_ON? Would it be good enough to print an err/info message and
keep "kvm_rme_is_available" disabled?

IMO, we should print a message when RME is enabled; otherwise it should
silently return.

> +
>   	/* Future patch will enable static branch kvm_rme_is_available */
>   
>   	return 0;

Thanks,
Ganapat

^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 09/28] arm64: RME: RTT handling
  2023-01-27 11:29   ` [RFC PATCH 09/28] arm64: RME: RTT handling Steven Price
  2023-02-13 17:44     ` Zhi Wang
@ 2024-03-18 11:01     ` Ganapatrao Kulkarni
  2024-03-18 11:25       ` Steven Price
  1 sibling, 1 reply; 190+ messages in thread
From: Ganapatrao Kulkarni @ 2024-03-18 11:01 UTC (permalink / raw)
  To: Steven Price, kvm, kvmarm
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon, James Morse,
	Oliver Upton, Suzuki K Poulose, Zenghui Yu, linux-arm-kernel,
	linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
	Fuad Tabba, linux-coco


On 27-01-2023 04:59 pm, Steven Price wrote:
> The RMM owns the stage 2 page tables for a realm, and KVM must request
> that the RMM creates/destroys entries as necessary. The physical pages
> to store the page tables are delegated to the realm as required, and can
> be undelegated when no longer used.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>   arch/arm64/include/asm/kvm_rme.h |  19 +++++
>   arch/arm64/kvm/mmu.c             |   7 +-
>   arch/arm64/kvm/rme.c             | 139 +++++++++++++++++++++++++++++++
>   3 files changed, 162 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_rme.h b/arch/arm64/include/asm/kvm_rme.h
> index a6318af3ed11..eea5118dfa8a 100644
> --- a/arch/arm64/include/asm/kvm_rme.h
> +++ b/arch/arm64/include/asm/kvm_rme.h
> @@ -35,5 +35,24 @@ u32 kvm_realm_ipa_limit(void);
>   int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
>   int kvm_init_realm_vm(struct kvm *kvm);
>   void kvm_destroy_realm(struct kvm *kvm);
> +void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level);
> +
> +#define RME_RTT_BLOCK_LEVEL	2
> +#define RME_RTT_MAX_LEVEL	3
> +
> +#define RME_PAGE_SHIFT		12
> +#define RME_PAGE_SIZE		BIT(RME_PAGE_SHIFT)

Can we use PAGE_SIZE and PAGE_SHIFT instead of redefining them?
Maybe we can use them to define RME_PAGE_SIZE and RME_PAGE_SHIFT.

> +/* See ARM64_HW_PGTABLE_LEVEL_SHIFT() */
> +#define RME_RTT_LEVEL_SHIFT(l)	\
> +	((RME_PAGE_SHIFT - 3) * (4 - (l)) + 3)

Instead of defining this again, can we define it in terms of
ARM64_HW_PGTABLE_LEVEL_SHIFT?
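
For reference, both macros encode the same arithmetic for 4K granules
(each level down multiplies the mapped size by 512). A standalone
user-space sketch of the per-level map sizes — not kernel code — looks
like this:

```c
#include <assert.h>

/* Same arithmetic as the RFC's RME_RTT_LEVEL_SHIFT() and, for
 * PAGE_SHIFT == 12, ARM64_HW_PGTABLE_LEVEL_SHIFT(): 9 bits of index
 * per level, 12 bits of page offset at the bottom. */
#define RME_PAGE_SHIFT	12
#define RME_RTT_LEVEL_SHIFT(l)	((RME_PAGE_SHIFT - 3) * (4 - (l)) + 3)

static unsigned long rtt_level_mapsize(int level)
{
	return 1UL << RME_RTT_LEVEL_SHIFT(level);
}
/* level 3 -> 4K page, level 2 -> 2M block, level 1 -> 1G, level 0 -> 512G */
```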

> +#define RME_L2_BLOCK_SIZE	BIT(RME_RTT_LEVEL_SHIFT(2))
> +
> +static inline unsigned long rme_rtt_level_mapsize(int level)
> +{
> +	if (WARN_ON(level > RME_RTT_MAX_LEVEL))
> +		return RME_PAGE_SIZE;
> +
> +	return (1UL << RME_RTT_LEVEL_SHIFT(level));
> +}
>   
>   #endif
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 22c00274884a..f29558c5dcbc 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -834,16 +834,17 @@ void stage2_unmap_vm(struct kvm *kvm)
>   void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
>   {
>   	struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
> -	struct kvm_pgtable *pgt = NULL;
> +	struct kvm_pgtable *pgt;
>   
>   	write_lock(&kvm->mmu_lock);
> +	pgt = mmu->pgt;
>   	if (kvm_is_realm(kvm) &&
>   	    kvm_realm_state(kvm) != REALM_STATE_DYING) {
> -		/* TODO: teardown rtts */
>   		write_unlock(&kvm->mmu_lock);
> +		kvm_realm_destroy_rtts(&kvm->arch.realm, pgt->ia_bits,
> +				       pgt->start_level);
>   		return;
>   	}
> -	pgt = mmu->pgt;
>   	if (pgt) {
>   		mmu->pgd_phys = 0;
>   		mmu->pgt = NULL;
> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
> index 0c9d70e4d9e6..f7b0e5a779f8 100644
> --- a/arch/arm64/kvm/rme.c
> +++ b/arch/arm64/kvm/rme.c
> @@ -73,6 +73,28 @@ static int rmi_check_version(void)
>   	return 0;
>   }
>   
> +static void realm_destroy_undelegate_range(struct realm *realm,
> +					   unsigned long ipa,
> +					   unsigned long addr,
> +					   ssize_t size)
> +{
> +	unsigned long rd = virt_to_phys(realm->rd);
> +	int ret;
> +
> +	while (size > 0) {
> +		ret = rmi_data_destroy(rd, ipa);
> +		WARN_ON(ret);
> +		ret = rmi_granule_undelegate(addr);
> +
> +		if (ret)
> +			get_page(phys_to_page(addr));
> +
> +		addr += PAGE_SIZE;
> +		ipa += PAGE_SIZE;
> +		size -= PAGE_SIZE;
> +	}
> +}
> +
>   static unsigned long create_realm_feat_reg0(struct kvm *kvm)
>   {
>   	unsigned long ia_bits = VTCR_EL2_IPA(kvm->arch.vtcr);
> @@ -170,6 +192,123 @@ static int realm_create_rd(struct kvm *kvm)
>   	return r;
>   }
>   
> +static int realm_rtt_destroy(struct realm *realm, unsigned long addr,
> +			     int level, phys_addr_t rtt_granule)
> +{
> +	addr = ALIGN_DOWN(addr, rme_rtt_level_mapsize(level - 1));
> +	return rmi_rtt_destroy(rtt_granule, virt_to_phys(realm->rd), addr,
> +			level);
> +}
> +
> +static int realm_destroy_free_rtt(struct realm *realm, unsigned long addr,
> +				  int level, phys_addr_t rtt_granule)
> +{
> +	if (realm_rtt_destroy(realm, addr, level, rtt_granule))
> +		return -ENXIO;
> +	if (!WARN_ON(rmi_granule_undelegate(rtt_granule)))
> +		put_page(phys_to_page(rtt_granule));
> +
> +	return 0;
> +}
> +
> +static int realm_rtt_create(struct realm *realm,
> +			    unsigned long addr,
> +			    int level,
> +			    phys_addr_t phys)
> +{
> +	addr = ALIGN_DOWN(addr, rme_rtt_level_mapsize(level - 1));
> +	return rmi_rtt_create(phys, virt_to_phys(realm->rd), addr, level);
> +}
> +
> +static int realm_tear_down_rtt_range(struct realm *realm, int level,
> +				     unsigned long start, unsigned long end)
> +{
> +	phys_addr_t rd = virt_to_phys(realm->rd);
> +	ssize_t map_size = rme_rtt_level_mapsize(level);
> +	unsigned long addr, next_addr;
> +	bool failed = false;
> +
> +	for (addr = start; addr < end; addr = next_addr) {
> +		phys_addr_t rtt_addr, tmp_rtt;
> +		struct rtt_entry rtt;
> +		unsigned long end_addr;
> +
> +		next_addr = ALIGN(addr + 1, map_size);
> +
> +		end_addr = min(next_addr, end);
> +
> +		if (rmi_rtt_read_entry(rd, ALIGN_DOWN(addr, map_size),
> +				       level, &rtt)) {
> +			failed = true;
> +			continue;
> +		}
> +
> +		rtt_addr = rmi_rtt_get_phys(&rtt);
> +		WARN_ON(level != rtt.walk_level);
> +
> +		switch (rtt.state) {
> +		case RMI_UNASSIGNED:
> +		case RMI_DESTROYED:
> +			break;
> +		case RMI_TABLE:
> +			if (realm_tear_down_rtt_range(realm, level + 1,
> +						      addr, end_addr)) {
> +				failed = true;
> +				break;
> +			}
> +			if (IS_ALIGNED(addr, map_size) &&
> +			    next_addr <= end &&
> +			    realm_destroy_free_rtt(realm, addr, level + 1,
> +						   rtt_addr))
> +				failed = true;
> +			break;
> +		case RMI_ASSIGNED:
> +			WARN_ON(!rtt_addr);
> +			/*
> +			 * If there is a block mapping, break it now, using the
> +			 * spare_page. We are sure to have a valid delegated
> +			 * page at spare_page before we enter here, otherwise
> +			 * WARN once, which will be followed by further
> +			 * warnings.
> +			 */
> +			tmp_rtt = realm->spare_page;
> +			if (level == 2 &&
> +			    !WARN_ON_ONCE(tmp_rtt == PHYS_ADDR_MAX) &&
> +			    realm_rtt_create(realm, addr,
> +					     RME_RTT_MAX_LEVEL, tmp_rtt)) {
> +				WARN_ON(1);
> +				failed = true;
> +				break;
> +			}
> +			realm_destroy_undelegate_range(realm, addr,
> +						       rtt_addr, map_size);
> +			/*
> +			 * Collapse the last level table and make the spare page
> +			 * reusable again.
> +			 */
> +			if (level == 2 &&
> +			    realm_rtt_destroy(realm, addr, RME_RTT_MAX_LEVEL,
> +					      tmp_rtt))
> +				failed = true;
> +			break;
> +		case RMI_VALID_NS:
> +			WARN_ON(rmi_rtt_unmap_unprotected(rd, addr, level));
> +			break;
> +		default:
> +			WARN_ON(1);
> +			failed = true;
> +			break;
> +		}
> +	}
> +
> +	return failed ? -EINVAL : 0;
> +}
> +
> +void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32 start_level)
> +{
> +	realm_tear_down_rtt_range(realm, start_level, 0, (1UL << ia_bits));
> +}
> +
>   /* Protects access to rme_vmid_bitmap */
>   static DEFINE_SPINLOCK(rme_vmid_lock);
>   static unsigned long *rme_vmid_bitmap;

Thanks,
Ganapat

^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms
  2024-03-18  7:40     ` Ganapatrao Kulkarni
@ 2024-03-18 11:22       ` Steven Price
  0 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2024-03-18 11:22 UTC (permalink / raw)
  To: Ganapatrao Kulkarni, kvm, kvmarm
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon, James Morse,
	Oliver Upton, Suzuki K Poulose, Zenghui Yu, linux-arm-kernel,
	linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
	Fuad Tabba, linux-coco

Thanks for taking a look at this.

On 18/03/2024 07:40, Ganapatrao Kulkarni wrote:
> On 27-01-2023 04:59 pm, Steven Price wrote:
[...]
>>   int kvm_init_rme(void)
>>   {
>> +    int ret;
>> +
>>       if (PAGE_SIZE != SZ_4K)
>>           /* Only 4k page size on the host is supported */
>>           return 0;
>> @@ -43,6 +394,12 @@ int kvm_init_rme(void)
>>           /* Continue without realm support */
>>           return 0;
>>   +    ret = rme_vmid_init();
>> +    if (ret)
>> +        return ret;
>> +
>> +    WARN_ON(rmi_features(0, &rmm_feat_reg0));
> 
> Why WARN_ON? Would it be good enough to print an err/info message and
> keep "kvm_rme_is_available" disabled?

Good point. RMI_FEATURES "does not have any failure conditions" so this
is very much a "should never happen" situation. Assuming the call
gracefully fails, then rmm_feat_reg0 would remain 0, which would in
practice stop realms being created, but this is clearly non-ideal.

I'll fix this up in the next version to do the rmi_features() call
before rme_vmid_init(), that way we can just return early without
setting kvm_rme_is_available in this situation. I'll keep the WARN_ON
because something has gone very wrong if this call fails.
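
The reordering described above might look something like the sketch
below (illustrative only; the rmi_* helpers here are stubs standing in
for the real SMC wrappers, and the static branch is modelled as a plain
flag):

```c
#include <assert.h>

/* Stubs standing in for the RFC's RMI wrappers. */
static int rmi_check_version(void) { return 0; }	/* 0: compatible RMM */
static int rmi_features(unsigned long idx, unsigned long *out)
{
	*out = 0x1234;	/* pretend contents of feature register 0 */
	return 0;
}
static int rme_vmid_init(void) { return 0; }

static unsigned long rmm_feat_reg0;
static int rme_available;	/* models kvm_rme_is_available */

/* Sketch of kvm_init_rme() with rmi_features() moved before
 * rme_vmid_init(): a failure now returns before any allocation and
 * before the static branch could be enabled. The real code would keep
 * a WARN_ON() around rmi_features(), as this "cannot happen". */
static int kvm_init_rme(void)
{
	int ret;

	if (rmi_check_version())
		return 0;	/* continue without realm support */

	if (rmi_features(0, &rmm_feat_reg0))
		return 0;	/* seriously broken system; leave RME off */

	ret = rme_vmid_init();
	if (ret)
		return ret;

	rme_available = 1;	/* static_branch_enable() in the real code */
	return 0;
}
```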

> IMO, we should print a message when RME is enabled; otherwise it should
> silently return.

The rmi_check_version() call already outputs a "RMI ABI version %d.%d"
message - I don't want to be too noisy here. Other than the 'cannot
happen' situations if you see the "RMI ABI" message then
kvm_rme_is_available will be set. And those 'cannot happen' routes will
print their own error message (and point to a seriously broken system).

And obviously in the case of SMC_RMI_VERSION not being supported then we
silently return as this is taken to mean there isn't an RMM.

Thanks,

Steve


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 04/28] arm64: RME: Check for RME support at KVM init
  2024-03-18  7:17     ` Ganapatrao Kulkarni
@ 2024-03-18 11:22       ` Steven Price
  0 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2024-03-18 11:22 UTC (permalink / raw)
  To: Ganapatrao Kulkarni, kvm, kvmarm
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon, James Morse,
	Oliver Upton, Suzuki K Poulose, Zenghui Yu, linux-arm-kernel,
	linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
	Fuad Tabba, linux-coco

On 18/03/2024 07:17, Ganapatrao Kulkarni wrote:
> 
> 
> On 27-01-2023 04:59 pm, Steven Price wrote:
>> Query the RMI version number and check if it is a compatible version. A
>> static key is also provided to signal that a supported RMM is available.
>>
>> Functions are provided to query if a VM or VCPU is a realm (or rec)
>> which currently will always return false.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>   arch/arm64/include/asm/kvm_emulate.h | 17 ++++++++++
>>   arch/arm64/include/asm/kvm_host.h    |  4 +++
>>   arch/arm64/include/asm/kvm_rme.h     | 22 +++++++++++++
>>   arch/arm64/include/asm/virt.h        |  1 +
>>   arch/arm64/kvm/Makefile              |  3 +-
>>   arch/arm64/kvm/arm.c                 |  8 +++++
>>   arch/arm64/kvm/rme.c                 | 49 ++++++++++++++++++++++++++++
>>   7 files changed, 103 insertions(+), 1 deletion(-)
>>   create mode 100644 arch/arm64/include/asm/kvm_rme.h
>>   create mode 100644 arch/arm64/kvm/rme.c
>>

[...]

>> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c
>> new file mode 100644
>> index 000000000000..f6b587bc116e
>> --- /dev/null
>> +++ b/arch/arm64/kvm/rme.c
>> @@ -0,0 +1,49 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (C) 2023 ARM Ltd.
>> + */
>> +
>> +#include <linux/kvm_host.h>
>> +
>> +#include <asm/rmi_cmds.h>
>> +#include <asm/virt.h>
>> +
>> +static int rmi_check_version(void)
>> +{
>> +    struct arm_smccc_res res;
>> +    int version_major, version_minor;
>> +
>> +    arm_smccc_1_1_invoke(SMC_RMI_VERSION, &res);
>> +
>> +    if (res.a0 == SMCCC_RET_NOT_SUPPORTED)
>> +        return -ENXIO;
>> +
>> +    version_major = RMI_ABI_VERSION_GET_MAJOR(res.a0);
>> +    version_minor = RMI_ABI_VERSION_GET_MINOR(res.a0);
>> +
>> +    if (version_major != RMI_ABI_MAJOR_VERSION) {
>> +        kvm_err("Unsupported RMI ABI (version %d.%d) we support %d\n",
> 
> Can we please replace "we support" with "host supports"?
> Also, in the patch present in the repo you are using the variable
> our_version; can this be changed to host_version?

Sure, I do have a bad habit of using "we" - thanks for pointing it out.

Steve

>> +            version_major, version_minor,
>> +            RMI_ABI_MAJOR_VERSION);
>> +        return -ENXIO;
>> +    }
>> +
>> +    kvm_info("RMI ABI version %d.%d\n", version_major, version_minor);
>> +
>> +    return 0;
>> +}
>> +
>> +int kvm_init_rme(void)
>> +{
>> +    if (PAGE_SIZE != SZ_4K)
>> +        /* Only 4k page size on the host is supported */
>> +        return 0;
>> +
>> +    if (rmi_check_version())
>> +        /* Continue without realm support */
>> +        return 0;
>> +
>> +    /* Future patch will enable static branch kvm_rme_is_available */
>> +
>> +    return 0;
>> +}
> 
> Thanks,
> Ganapat


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 03/28] arm64: RME: Add wrappers for RMI calls
  2024-03-18  7:03     ` Ganapatrao Kulkarni
@ 2024-03-18 11:22       ` Steven Price
  0 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2024-03-18 11:22 UTC (permalink / raw)
  To: Ganapatrao Kulkarni, kvm, kvmarm
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon, James Morse,
	Oliver Upton, Suzuki K Poulose, Zenghui Yu, linux-arm-kernel,
	linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
	Fuad Tabba, linux-coco

On 18/03/2024 07:03, Ganapatrao Kulkarni wrote:
> 
> Hi Steven,
> 
> On 27-01-2023 04:59 pm, Steven Price wrote:
>> The wrappers make the call sites easier to read and deal with the
>> boilerplate of handling the error codes from the RMM.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>   arch/arm64/include/asm/rmi_cmds.h | 259 ++++++++++++++++++++++++++++++
>>   1 file changed, 259 insertions(+)
>>   create mode 100644 arch/arm64/include/asm/rmi_cmds.h
>>
>> diff --git a/arch/arm64/include/asm/rmi_cmds.h
>> b/arch/arm64/include/asm/rmi_cmds.h
>> new file mode 100644
>> index 000000000000..d5468ee46f35
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/rmi_cmds.h

[...]

>> +static inline int rmi_rtt_read_entry(unsigned long rd, unsigned long
>> map_addr,
>> +                     unsigned long level, struct rtt_entry *rtt)
>> +{
>> +    struct arm_smccc_1_2_regs regs = {
>> +        SMC_RMI_RTT_READ_ENTRY,
>> +        rd, map_addr, level
>> +    };
>> +
>> +    arm_smccc_1_2_smc(&regs, &regs);
>> +
>> +    rtt->walk_level = regs.a1;
>> +    rtt->state = regs.a2 & 0xFF;
>> +    rtt->desc = regs.a3;
>> +    rtt->ripas = regs.a4 & 1;
>> +
>> +    return regs.a0;
>> +}
>> +
>> +static inline int rmi_rtt_set_ripas(unsigned long rd, unsigned long rec,
>> +                    unsigned long map_addr, unsigned long level,
>> +                    unsigned long ripas)
>> +{
>> +    struct arm_smccc_res res;
>> +
>> +    arm_smccc_1_1_invoke(SMC_RMI_RTT_SET_RIPAS, rd, rec, map_addr,
>> level,
>> +                 ripas, &res);
>> +
>> +    return res.a0;
>> +}
>> +
>> +static inline int rmi_rtt_unmap_unprotected(unsigned long rd,
>> +                        unsigned long map_addr,
>> +                        unsigned long level)
>> +{
>> +    struct arm_smccc_res res;
>> +
>> +    arm_smccc_1_1_invoke(SMC_RMI_RTT_UNMAP_UNPROTECTED, rd, map_addr,
>> +                 level, &res);
>> +
>> +    return res.a0;
>> +}
>> +
>> +static inline phys_addr_t rmi_rtt_get_phys(struct rtt_entry *rtt)
>> +{
>> +    return rtt->desc & GENMASK(47, 12);
>> +}
>> +
>> +#endif
> 
> Can we please replace all occurrences of "unsigned long" with u64?

I'm conflicted here. On the one hand I agree with you - it would be
better to use types that are sized according to the RMM spec. However,
this file is a thin wrapper around the low-level SMC calls, and the
SMCCC interface is a bunch of "unsigned longs" (e.g. look at struct
arm_smccc_1_2_regs).

In particular, using smaller types (e.g. char/u8) could be broken, as
it would potentially permit the compiler to leave 'junk' in the top part
of the register.

So the question becomes whether to stick with the SMCCC interface sizes
(unsigned long) or use our knowledge that it must be a 64 bit platform
(the RMM isn't supported for 32 bit) and therefore use u64. My (mild)
preference is for unsigned long because it makes it obvious how this
relates to the SMCCC interface it's using. It also seems like it would
ease compatibility if (/when?) 128 bit registers become a thing.

> Also, as per the spec, the RTT level is Int64; can we change it accordingly?

Here, however, I agree you've definitely got a point. level should be
signed as (at least in theory) it could be negative.

> Please CC me on future CCA patch sets.
> gankulkarni@os.amperecomputing.com

I will do, thanks for the review.

Steve


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 09/28] arm64: RME: RTT handling
  2024-03-18 11:01     ` Ganapatrao Kulkarni
@ 2024-03-18 11:25       ` Steven Price
  0 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2024-03-18 11:25 UTC (permalink / raw)
  To: Ganapatrao Kulkarni, kvm, kvmarm
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon, James Morse,
	Oliver Upton, Suzuki K Poulose, Zenghui Yu, linux-arm-kernel,
	linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
	Fuad Tabba, linux-coco

On 18/03/2024 11:01, Ganapatrao Kulkarni wrote:
> 
> On 27-01-2023 04:59 pm, Steven Price wrote:
>> The RMM owns the stage 2 page tables for a realm, and KVM must request
>> that the RMM creates/destroys entries as necessary. The physical pages
>> to store the page tables are delegated to the realm as required, and can
>> be undelegated when no longer used.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>   arch/arm64/include/asm/kvm_rme.h |  19 +++++
>>   arch/arm64/kvm/mmu.c             |   7 +-
>>   arch/arm64/kvm/rme.c             | 139 +++++++++++++++++++++++++++++++
>>   3 files changed, 162 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_rme.h
>> b/arch/arm64/include/asm/kvm_rme.h
>> index a6318af3ed11..eea5118dfa8a 100644
>> --- a/arch/arm64/include/asm/kvm_rme.h
>> +++ b/arch/arm64/include/asm/kvm_rme.h
>> @@ -35,5 +35,24 @@ u32 kvm_realm_ipa_limit(void);
>>   int kvm_realm_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap);
>>   int kvm_init_realm_vm(struct kvm *kvm);
>>   void kvm_destroy_realm(struct kvm *kvm);
>> +void kvm_realm_destroy_rtts(struct realm *realm, u32 ia_bits, u32
>> start_level);
>> +
>> +#define RME_RTT_BLOCK_LEVEL    2
>> +#define RME_RTT_MAX_LEVEL    3
>> +
>> +#define RME_PAGE_SHIFT        12
>> +#define RME_PAGE_SIZE        BIT(RME_PAGE_SHIFT)
> 
> Can we use PAGE_SIZE and PAGE_SHIFT instead of redefining?
> Maybe we can use them to define RME_PAGE_SIZE and RME_PAGE_SHIFT.

At the moment the code only supports the host page size matching the
RMM's, but I want to leave open the possibility of the host page size
being larger than the RMM's. In that case PAGE_SHIFT/PAGE_SIZE will not
equal RME_PAGE_SIZE and RME_PAGE_SHIFT. The host will have to create
RME_PAGE_SIZE and RME_PAGE_SHIFT. The host will have to create multiple
RMM RTTs for each host page.

>> +/* See ARM64_HW_PGTABLE_LEVEL_SHIFT() */
>> +#define RME_RTT_LEVEL_SHIFT(l)    \
>> +    ((RME_PAGE_SHIFT - 3) * (4 - (l)) + 3)
> 
> Instead of defining this again, can we use
> ARM64_HW_PGTABLE_LEVEL_SHIFT?

Same as above - ARM64_HW_PGTABLE_LEVEL_SHIFT uses PAGE_SHIFT, but we
want the same calculation using RME_PAGE_SHIFT, which might be different.

Thanks,

Steve


^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 12/28] KVM: arm64: Support timers in realm RECs
  2023-01-27 11:29   ` [RFC PATCH 12/28] KVM: arm64: Support timers in realm RECs Steven Price
@ 2024-03-18 11:28     ` Ganapatrao Kulkarni
  2024-03-18 14:14       ` Steven Price
  0 siblings, 1 reply; 190+ messages in thread
From: Ganapatrao Kulkarni @ 2024-03-18 11:28 UTC (permalink / raw)
  To: Steven Price, kvm, kvmarm
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon, James Morse,
	Oliver Upton, Suzuki K Poulose, Zenghui Yu, linux-arm-kernel,
	linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
	Fuad Tabba, linux-coco



On 27-01-2023 04:59 pm, Steven Price wrote:
> The RMM keeps track of the timer while the realm REC is running, but on
> exit to the normal world KVM is responsible for handling the timers.
> 
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
>   arch/arm64/kvm/arch_timer.c  | 53 ++++++++++++++++++++++++++++++++----
>   include/kvm/arm_arch_timer.h |  2 ++
>   2 files changed, 49 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index bb24a76b4224..d4af9ee58550 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -130,6 +130,11 @@ static void timer_set_offset(struct arch_timer_context *ctxt, u64 offset)
>   {
>   	struct kvm_vcpu *vcpu = ctxt->vcpu;
>   
> +	if (kvm_is_realm(vcpu->kvm)) {
> +		WARN_ON(offset);
> +		return;
> +	}
> +
>   	switch(arch_timer_ctx_index(ctxt)) {
>   	case TIMER_VTIMER:
>   		__vcpu_sys_reg(vcpu, CNTVOFF_EL2) = offset;
> @@ -411,6 +416,21 @@ static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
>   	}
>   }
>   
> +void kvm_realm_timers_update(struct kvm_vcpu *vcpu)
> +{
> +	struct arch_timer_cpu *arch_timer = &vcpu->arch.timer_cpu;
> +	int i;
> +
> +	for (i = 0; i < NR_KVM_TIMERS; i++) {

Do we need to check all the timers? Does the realm/RMM use the hyp timers?

> +		struct arch_timer_context *timer = &arch_timer->timers[i];
> +		bool status = timer_get_ctl(timer) & ARCH_TIMER_CTRL_IT_STAT;
> +		bool level = kvm_timer_irq_can_fire(timer) && status;
> +
> +		if (level != timer->irq.level)
> +			kvm_timer_update_irq(vcpu, level, timer);
> +	}
> +}
> +
>   /* Only called for a fully emulated timer */
>   static void timer_emulate(struct arch_timer_context *ctx)
>   {
> @@ -621,6 +641,11 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
>   	if (unlikely(!timer->enabled))
>   		return;
>   
> +	kvm_timer_unblocking(vcpu);
> +
> +	if (vcpu_is_rec(vcpu))
> +		return;
> +

For a realm, timer->enabled is not set, so the load returns before
reaching this check.

>   	get_timer_map(vcpu, &map);
>   
>   	if (static_branch_likely(&has_gic_active_state)) {
> @@ -633,8 +658,6 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
>   
>   	set_cntvoff(timer_get_offset(map.direct_vtimer));
>   
> -	kvm_timer_unblocking(vcpu);
> -
>   	timer_restore_state(map.direct_vtimer);
>   	if (map.direct_ptimer)
>   		timer_restore_state(map.direct_ptimer);
> @@ -668,6 +691,9 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
>   	if (unlikely(!timer->enabled))
>   		return;
>   
> +	if (vcpu_is_rec(vcpu))
> +		goto out;
> +
>   	get_timer_map(vcpu, &map);
>   
>   	timer_save_state(map.direct_vtimer);
> @@ -686,9 +712,6 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
>   	if (map.emul_ptimer)
>   		soft_timer_cancel(&map.emul_ptimer->hrtimer);
>   
> -	if (kvm_vcpu_is_blocking(vcpu))
> -		kvm_timer_blocking(vcpu);
> -
>   	/*
>   	 * The kernel may decide to run userspace after calling vcpu_put, so
>   	 * we reset cntvoff to 0 to ensure a consistent read between user
> @@ -697,6 +720,11 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
>   	 * virtual offset of zero, so no need to zero CNTVOFF_EL2 register.
>   	 */
>   	set_cntvoff(0);
> +
> +out:
> +	if (kvm_vcpu_is_blocking(vcpu))
> +		kvm_timer_blocking(vcpu);
> +
>   }
>   
>   /*
> @@ -785,12 +813,18 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
>   	struct arch_timer_cpu *timer = vcpu_timer(vcpu);
>   	struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
>   	struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
> +	u64 cntvoff;
>   
>   	vtimer->vcpu = vcpu;
>   	ptimer->vcpu = vcpu;
>   
> +	if (kvm_is_realm(vcpu->kvm))
> +		cntvoff = 0;
> +	else
> +		cntvoff = kvm_phys_timer_read();
> +
>   	/* Synchronize cntvoff across all vtimers of a VM. */
> -	update_vtimer_cntvoff(vcpu, kvm_phys_timer_read());
> +	update_vtimer_cntvoff(vcpu, cntvoff);
>   	timer_set_offset(ptimer, 0);
>   
>   	hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
> @@ -1265,6 +1299,13 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>   		return -EINVAL;
>   	}
>   
> +	/*
> +	 * We don't use mapped IRQs for Realms because the RMI doesn't allow
> +	 * us setting the LR.HW bit in the VGIC.
> +	 */
> +	if (vcpu_is_rec(vcpu))
> +		return 0;
> +
>   	get_timer_map(vcpu, &map);
>   
>   	ret = kvm_vgic_map_phys_irq(vcpu,
> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
> index cd6d8f260eab..158280e15a33 100644
> --- a/include/kvm/arm_arch_timer.h
> +++ b/include/kvm/arm_arch_timer.h
> @@ -76,6 +76,8 @@ int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr);
>   int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr);
>   int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr);
>   
> +void kvm_realm_timers_update(struct kvm_vcpu *vcpu);
> +
>   u64 kvm_phys_timer_read(void);
>   
>   void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu);

Thanks,
Ganapat

^ permalink raw reply	[flat|nested] 190+ messages in thread

* Re: [RFC PATCH 12/28] KVM: arm64: Support timers in realm RECs
  2024-03-18 11:28     ` Ganapatrao Kulkarni
@ 2024-03-18 14:14       ` Steven Price
  0 siblings, 0 replies; 190+ messages in thread
From: Steven Price @ 2024-03-18 14:14 UTC (permalink / raw)
  To: Ganapatrao Kulkarni, kvm, kvmarm
  Cc: Catalin Marinas, Marc Zyngier, Will Deacon, James Morse,
	Oliver Upton, Suzuki K Poulose, Zenghui Yu, linux-arm-kernel,
	linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
	Fuad Tabba, linux-coco

On 18/03/2024 11:28, Ganapatrao Kulkarni wrote:
> 
> 
> On 27-01-2023 04:59 pm, Steven Price wrote:
>> The RMM keeps track of the timer while the realm REC is running, but on
>> exit to the normal world KVM is responsible for handling the timers.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>   arch/arm64/kvm/arch_timer.c  | 53 ++++++++++++++++++++++++++++++++----
>>   include/kvm/arm_arch_timer.h |  2 ++
>>   2 files changed, 49 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
>> index bb24a76b4224..d4af9ee58550 100644
>> --- a/arch/arm64/kvm/arch_timer.c
>> +++ b/arch/arm64/kvm/arch_timer.c
>> @@ -130,6 +130,11 @@ static void timer_set_offset(struct
>> arch_timer_context *ctxt, u64 offset)
>>   {
>>       struct kvm_vcpu *vcpu = ctxt->vcpu;
>>   +    if (kvm_is_realm(vcpu->kvm)) {
>> +        WARN_ON(offset);
>> +        return;
>> +    }
>> +
>>       switch(arch_timer_ctx_index(ctxt)) {
>>       case TIMER_VTIMER:
>>           __vcpu_sys_reg(vcpu, CNTVOFF_EL2) = offset;
>> @@ -411,6 +416,21 @@ static void kvm_timer_update_irq(struct kvm_vcpu
>> *vcpu, bool new_level,
>>       }
>>   }
>>   +void kvm_realm_timers_update(struct kvm_vcpu *vcpu)
>> +{
>> +    struct arch_timer_cpu *arch_timer = &vcpu->arch.timer_cpu;
>> +    int i;
>> +
>> +    for (i = 0; i < NR_KVM_TIMERS; i++) {
> 
> Do we need to check all the timers? Does the realm/RMM use the hyp timers?

Good point, the realm guest can't use the hyp timers, so this should be
NR_KVM_EL0_TIMERS. The hyp timers are used by the host to interrupt
guest execution. I think this code was written before NV support added
the extra timers.

>> +        struct arch_timer_context *timer = &arch_timer->timers[i];
>> +        bool status = timer_get_ctl(timer) & ARCH_TIMER_CTRL_IT_STAT;
>> +        bool level = kvm_timer_irq_can_fire(timer) && status;
>> +
>> +        if (level != timer->irq.level)
>> +            kvm_timer_update_irq(vcpu, level, timer);
>> +    }
>> +}
>> +
>>   /* Only called for a fully emulated timer */
>>   static void timer_emulate(struct arch_timer_context *ctx)
>>   {
>> @@ -621,6 +641,11 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
>>       if (unlikely(!timer->enabled))
>>           return;
>>   +    kvm_timer_unblocking(vcpu);
>> +
>> +    if (vcpu_is_rec(vcpu))
>> +        return;
>> +
> 
> For a realm, timer->enabled is not set, so the load returns before
> reaching this check.

True, this can be simplified. Thanks.

Steve

>>       get_timer_map(vcpu, &map);
>>         if (static_branch_likely(&has_gic_active_state)) {
>> @@ -633,8 +658,6 @@ void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu)
>>         set_cntvoff(timer_get_offset(map.direct_vtimer));
>>   -    kvm_timer_unblocking(vcpu);
>> -
>>       timer_restore_state(map.direct_vtimer);
>>       if (map.direct_ptimer)
>>           timer_restore_state(map.direct_ptimer);
>> @@ -668,6 +691,9 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
>>       if (unlikely(!timer->enabled))
>>           return;
>>   +    if (vcpu_is_rec(vcpu))
>> +        goto out;
>> +
>>       get_timer_map(vcpu, &map);
>>         timer_save_state(map.direct_vtimer);
>> @@ -686,9 +712,6 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
>>       if (map.emul_ptimer)
>>           soft_timer_cancel(&map.emul_ptimer->hrtimer);
>>   -    if (kvm_vcpu_is_blocking(vcpu))
>> -        kvm_timer_blocking(vcpu);
>> -
>>       /*
>>        * The kernel may decide to run userspace after calling
>> vcpu_put, so
>>        * we reset cntvoff to 0 to ensure a consistent read between user
>> @@ -697,6 +720,11 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
>>        * virtual offset of zero, so no need to zero CNTVOFF_EL2 register.
>>        */
>>       set_cntvoff(0);
>> +
>> +out:
>> +    if (kvm_vcpu_is_blocking(vcpu))
>> +        kvm_timer_blocking(vcpu);
>> +
>>   }
>>     /*
>> @@ -785,12 +813,18 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
>>       struct arch_timer_cpu *timer = vcpu_timer(vcpu);
>>       struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
>>       struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
>> +    u64 cntvoff;
>>         vtimer->vcpu = vcpu;
>>       ptimer->vcpu = vcpu;
>>   +    if (kvm_is_realm(vcpu->kvm))
>> +        cntvoff = 0;
>> +    else
>> +        cntvoff = kvm_phys_timer_read();
>> +
>>       /* Synchronize cntvoff across all vtimers of a VM. */
>> -    update_vtimer_cntvoff(vcpu, kvm_phys_timer_read());
>> +    update_vtimer_cntvoff(vcpu, cntvoff);
>>       timer_set_offset(ptimer, 0);
>>         hrtimer_init(&timer->bg_timer, CLOCK_MONOTONIC,
>> HRTIMER_MODE_ABS_HARD);
>> @@ -1265,6 +1299,13 @@ int kvm_timer_enable(struct kvm_vcpu *vcpu)
>>           return -EINVAL;
>>       }
>>   +    /*
>> +     * We don't use mapped IRQs for Realms because the RMI doesn't allow
>> +     * us setting the LR.HW bit in the VGIC.
>> +     */
>> +    if (vcpu_is_rec(vcpu))
>> +        return 0;
>> +
>>       get_timer_map(vcpu, &map);
>>         ret = kvm_vgic_map_phys_irq(vcpu,
>> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
>> index cd6d8f260eab..158280e15a33 100644
>> --- a/include/kvm/arm_arch_timer.h
>> +++ b/include/kvm/arm_arch_timer.h
>> @@ -76,6 +76,8 @@ int kvm_arm_timer_set_attr(struct kvm_vcpu *vcpu,
>> struct kvm_device_attr *attr);
>>   int kvm_arm_timer_get_attr(struct kvm_vcpu *vcpu, struct
>> kvm_device_attr *attr);
>>   int kvm_arm_timer_has_attr(struct kvm_vcpu *vcpu, struct
>> kvm_device_attr *attr);
>>   +void kvm_realm_timers_update(struct kvm_vcpu *vcpu);
>> +
>>   u64 kvm_phys_timer_read(void);
>>     void kvm_timer_vcpu_load(struct kvm_vcpu *vcpu);
> 
> Thanks,
> Ganapat


^ permalink raw reply	[flat|nested] 190+ messages in thread

end of thread, other threads:[~2024-03-18 14:14 UTC | newest]

Thread overview: 190+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-27 11:22 [RFC] Support for Arm CCA VMs on Linux Suzuki K Poulose
2023-01-27 11:27 ` [RFC PATCH 00/14] arm64: Support for running as a guest in Arm CCA Steven Price
2023-01-27 11:27   ` [RFC PATCH 01/14] arm64: remove redundant 'extern' Steven Price
2023-01-27 11:27   ` [RFC PATCH 02/14] arm64: rsi: Add RSI definitions Steven Price
2023-01-27 11:27   ` [RFC PATCH 03/14] arm64: Detect if in a realm and set RIPAS RAM Steven Price
2023-01-27 11:27   ` [RFC PATCH 04/14] arm64: realm: Query IPA size from the RMM Steven Price
2023-01-27 11:27   ` [RFC PATCH 05/14] arm64: Mark all I/O as non-secure shared Steven Price
2023-01-27 11:27   ` [RFC PATCH 06/14] fixmap: Allow architecture overriding set_fixmap_io Steven Price
2023-01-27 11:27   ` [RFC PATCH 07/14] arm64: Override set_fixmap_io Steven Price
2023-01-27 11:27   ` [RFC PATCH 08/14] arm64: Make the PHYS_MASK_SHIFT dynamic Steven Price
2023-01-27 11:27   ` [RFC PATCH 09/14] arm64: Enforce bounce buffers for realm DMA Steven Price
2023-01-27 11:27   ` [RFC PATCH 10/14] arm64: Enable memory encrypt for Realms Steven Price
2023-01-27 11:27   ` [RFC PATCH 11/14] arm64: Force device mappings to be non-secure shared Steven Price
2023-01-27 11:27   ` [RFC PATCH 12/14] efi: arm64: Map Device with Prot Shared Steven Price
2023-01-27 11:27   ` [RFC PATCH 13/14] arm64: realm: Support nonsecure ITS emulation shared Steven Price
2023-01-27 11:27   ` [RFC PATCH 14/14] HACK: Accept prototype RSI version Steven Price
2023-01-27 11:29 ` [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM Steven Price
2023-01-27 11:29   ` [RFC PATCH 01/28] arm64: RME: Handle Granule Protection Faults (GPFs) Steven Price
2023-01-27 11:29   ` [RFC PATCH 02/28] arm64: RME: Add SMC definitions for calling the RMM Steven Price
2023-01-27 11:29   ` [RFC PATCH 03/28] arm64: RME: Add wrappers for RMI calls Steven Price
2023-02-13 16:43     ` Zhi Wang
2024-03-18  7:03     ` Ganapatrao Kulkarni
2024-03-18 11:22       ` Steven Price
2023-01-27 11:29   ` [RFC PATCH 04/28] arm64: RME: Check for RME support at KVM init Steven Price
2023-02-13 15:48     ` Zhi Wang
2023-02-13 15:59       ` Steven Price
2023-03-04 12:07         ` Zhi Wang
2023-02-13 15:55     ` Zhi Wang
2024-03-18  7:17     ` Ganapatrao Kulkarni
2024-03-18 11:22       ` Steven Price
2023-01-27 11:29   ` [RFC PATCH 05/28] arm64: RME: Define the user ABI Steven Price
2023-02-13 16:04     ` Zhi Wang
2023-03-01 11:54       ` Steven Price
2023-03-01 20:21         ` Zhi Wang
2023-01-27 11:29   ` [RFC PATCH 06/28] arm64: RME: ioctls to create and configure realms Steven Price
2023-02-07 12:25     ` Jean-Philippe Brucker
2023-02-07 12:55       ` Suzuki K Poulose
2023-02-13 16:10     ` Zhi Wang
2023-03-01 11:55       ` Steven Price
2023-03-01 20:33         ` Zhi Wang
2023-03-06 19:10     ` Zhi Wang
2023-03-10 15:47       ` Steven Price
2024-03-18  7:40     ` Ganapatrao Kulkarni
2024-03-18 11:22       ` Steven Price
2023-01-27 11:29   ` [RFC PATCH 07/28] arm64: kvm: Allow passing machine type in KVM creation Steven Price
2023-02-13 16:35     ` Zhi Wang
2023-03-01 11:55       ` Steven Price
2023-01-27 11:29   ` [RFC PATCH 08/28] arm64: RME: Keep a spare page delegated to the RMM Steven Price
2023-02-13 16:47     ` Zhi Wang
2023-03-01 11:55       ` Steven Price
2023-03-01 20:50         ` Zhi Wang
2023-01-27 11:29   ` [RFC PATCH 09/28] arm64: RME: RTT handling Steven Price
2023-02-13 17:44     ` Zhi Wang
2023-03-03 14:04       ` Steven Price
2023-03-04 12:32         ` Zhi Wang
2024-03-18 11:01     ` Ganapatrao Kulkarni
2024-03-18 11:25       ` Steven Price
2023-01-27 11:29   ` [RFC PATCH 10/28] arm64: RME: Allocate/free RECs to match vCPUs Steven Price
2023-02-13 18:08     ` Zhi Wang
2023-03-03 14:05       ` Steven Price
2023-03-04 12:46         ` Zhi Wang
2023-01-27 11:29   ` [RFC PATCH 11/28] arm64: RME: Support for the VGIC in realms Steven Price
2023-01-27 11:29   ` [RFC PATCH 12/28] KVM: arm64: Support timers in realm RECs Steven Price
2024-03-18 11:28     ` Ganapatrao Kulkarni
2024-03-18 14:14       ` Steven Price
2023-01-27 11:29   ` [RFC PATCH 13/28] arm64: RME: Allow VMM to set RIPAS Steven Price
2023-02-17 13:07     ` Zhi Wang
2023-03-03 14:05       ` Steven Price
2023-01-27 11:29   ` [RFC PATCH 14/28] arm64: RME: Handle realm enter/exit Steven Price
2023-01-27 11:29   ` [RFC PATCH 15/28] KVM: arm64: Handle realm MMIO emulation Steven Price
2023-03-06 15:37     ` Zhi Wang
2023-03-10 15:47       ` Steven Price
2023-03-14 15:44         ` Zhi Wang
2023-03-22 11:51           ` Steven Price
2023-01-27 11:29   ` [RFC PATCH 16/28] arm64: RME: Allow populating initial contents Steven Price
2023-03-06 17:34     ` Zhi Wang
2023-03-10 15:47       ` Steven Price
2023-03-14 15:31         ` Zhi Wang
2023-03-22 11:51           ` Steven Price
2023-01-27 11:29   ` [RFC PATCH 17/28] arm64: RME: Runtime faulting of memory Steven Price
2023-03-06 18:20     ` Zhi Wang
2023-03-10 15:47       ` Steven Price
2023-03-14 16:41         ` Zhi Wang
2023-01-27 11:29   ` [RFC PATCH 18/28] KVM: arm64: Handle realm VCPU load Steven Price
2023-01-27 11:29   ` [RFC PATCH 19/28] KVM: arm64: Validate register access for a Realm VM Steven Price
2023-01-27 11:29   ` [RFC PATCH 20/28] KVM: arm64: Handle Realm PSCI requests Steven Price
2023-01-27 11:29   ` [RFC PATCH 21/28] KVM: arm64: WARN on injected undef exceptions Steven Price
2023-01-27 11:29   ` [RFC PATCH 22/28] arm64: Don't expose stolen time for realm guests Steven Price
2023-01-27 11:29   ` [RFC PATCH 23/28] KVM: arm64: Allow activating realms Steven Price
2023-01-27 11:29   ` [RFC PATCH 24/28] arm64: rme: allow userspace to inject aborts Steven Price
2023-01-27 11:29   ` [RFC PATCH 25/28] arm64: rme: support RSI_HOST_CALL Steven Price
2023-01-27 11:29   ` [RFC PATCH 26/28] arm64: rme: Allow checking SVE on VM instance Steven Price
2023-01-27 11:29   ` [RFC PATCH 27/28] arm64: RME: Always use 4k pages for realms Steven Price
2023-01-27 11:29   ` [RFC PATCH 28/28] HACK: Accept prototype RMI versions Steven Price
2023-01-27 11:39 ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 01/31] arm64: Disable MTE when CFI flash is emulated Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 02/31] script: update_headers: Ignore missing architectures Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 03/31] hw: cfi flash: Handle errors in memory transitions Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 04/31] Add --nocompat option to disable compat warnings Suzuki K Poulose
2023-01-27 12:19     ` Alexandru Elisei
2023-01-27 11:39   ` [RFC kvmtool 05/31] arm64: Check pvtime support against the KVM instance Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 06/31] arm64: Check SVE capability on the VM instance Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 07/31] arm64: Add option to disable SVE Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 08/31] linux: Update kernel headers for RME support Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 09/31] arm64: Add --realm command line option Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 10/31] arm64: Create a realm virtual machine Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 11/31] arm64: Lock realm RAM in memory Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 12/31] arm64: Create Realm Descriptor Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 13/31] arm64: Add --measurement-algo command line option for a realm Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 14/31] arm64: Add configuration step for Realms Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 15/31] arm64: Add support for Realm Personalisation Value Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 16/31] arm64: Add support for specifying the SVE vector length for Realm Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 17/31] arm: Add kernel size to VM context Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 18/31] arm64: Populate initial realm contents Suzuki K Poulose
2023-03-02 14:03     ` Piotr Sawicki
2023-03-02 14:06       ` Suzuki K Poulose
2023-10-02  9:28         ` Piotr Sawicki
2023-01-27 11:39   ` [RFC kvmtool 19/31] arm64: Don't try to set PSTATE for VCPUs belonging to a realm Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 20/31] arm64: Finalize realm VCPU after reset Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 21/31] init: Add last_{init, exit} list macros Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 22/31] arm64: Activate realm before the first VCPU is run Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 23/31] arm64: Specify SMC as the PSCI conduits for realms Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 24/31] arm64: Don't try to debug a realm Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 25/31] arm64: realm: Double the IPA space Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 26/31] virtio: Add a wrapper for get_host_features Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 27/31] virtio: Add arch specific hook for virtio host flags Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 28/31] arm64: realm: Enforce virtio F_ACCESS_PLATFORM flag Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 29/31] mmio: add arch hook for an unhandled MMIO access Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 30/31] arm64: realm: inject an abort on " Suzuki K Poulose
2023-01-27 11:39   ` [RFC kvmtool 31/31] arm64: Allow the user to create a realm Suzuki K Poulose
2023-10-02  9:45   ` [RFC kvmtool 00/31] arm64: Support for Arm Confidential Compute Architecture Piotr Sawicki
2023-01-27 11:40 ` [RFC kvm-unit-tests 00/27] " Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 01/27] lib/string: include stddef.h for size_t Joey Gouly
2023-01-31 14:43     ` Thomas Huth
2023-01-27 11:40   ` [RFC kvm-unit-tests 02/27] arm: Expand SMCCC arguments and return values Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 03/27] arm: realm: Add RSI interface header Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 04/27] arm: Make physical address mask dynamic Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 05/27] arm: Introduce NS_SHARED PTE attribute Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 06/27] arm: Move io_init after vm initialization Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 07/27] arm: realm: Make uart available before MMU is enabled Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 08/27] arm: realm: Realm initialisation Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 09/27] arm: realm: Add support for changing the state of memory Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 10/27] arm: realm: Set RIPAS state for RAM Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 11/27] arm: realm: Early memory setup Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 12/27] arm: realm: Add RSI version test Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 13/27] arm: selftest: realm: skip pabt test when running in a realm Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 14/27] arm: realm: add hvc and RSI_HOST_CALL tests Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 15/27] arm: realm: Add test for FPU/SIMD context save/restore Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 16/27] arm: realm: Add tests for in realm SEA Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 17/27] lib/alloc_page: Add shared page allocation support Joey Gouly
2023-01-27 11:40   ` [RFC kvm-unit-tests 18/27] arm: gic-v3-its: Use shared pages wherever needed Joey Gouly
2023-01-27 11:41   ` [RFC kvm-unit-tests 19/27] arm: realm: Enable memory encryption Joey Gouly
2023-01-27 11:41   ` [RFC kvm-unit-tests 20/27] qcbor: Add QCBOR as a submodule Joey Gouly
2023-01-27 11:41   ` [RFC kvm-unit-tests 21/27] arm: Add build steps for QCBOR library Joey Gouly
2023-01-27 11:41   ` [RFC kvm-unit-tests 22/27] arm: Add a library to verify tokens using the " Joey Gouly
2023-01-27 11:41   ` [RFC kvm-unit-tests 23/27] arm: realm: add RSI interface for attestation measurements Joey Gouly
2023-01-27 11:41   ` [RFC kvm-unit-tests 24/27] arm: realm: Add helpers to decode RSI return codes Joey Gouly
2023-01-27 11:41   ` [RFC kvm-unit-tests 25/27] arm: realm: Add Realm attestation tests Joey Gouly
2023-01-27 11:41   ` [RFC kvm-unit-tests 26/27] arm: realm: Add a test for shared memory Joey Gouly
2023-01-27 11:41   ` [RFC kvm-unit-tests 27/27] NOT-FOR-MERGING: add run-realm-tests Joey Gouly
2023-01-27 15:26 ` [RFC] Support for Arm CCA VMs on Linux Jean-Philippe Brucker
2023-02-28 23:35   ` Itaru Kitayama
2023-03-01  9:20     ` Jean-Philippe Brucker
2023-03-01 22:12       ` Itaru Kitayama
2023-03-02  9:18         ` Jean-Philippe Brucker
2023-03-03  9:46         ` Jean-Philippe Brucker
2023-03-03  9:54           ` Suzuki K Poulose
2023-03-03 11:39             ` Jean-Philippe Brucker
2023-03-03 12:08               ` Andrew Jones
2023-03-03 12:19                 ` Suzuki K Poulose
2023-03-03 13:06                   ` Cornelia Huck
2023-03-03 13:57                     ` Jean-Philippe Brucker
2023-02-10 16:51 ` Ryan Roberts
2023-02-10 22:53   ` Itaru Kitayama
2023-02-17  8:02     ` Itaru Kitayama
2023-02-20 10:51       ` Ryan Roberts
2023-02-14 17:13 ` Dr. David Alan Gilbert
2023-03-01  9:58   ` Suzuki K Poulose
2023-03-02 16:46     ` Dr. David Alan Gilbert
2023-03-02 19:02       ` Suzuki K Poulose
2023-07-14 13:46 ` Jonathan Cameron
2023-07-14 15:03   ` Suzuki K Poulose
2023-07-14 16:28     ` Jonathan Cameron
2023-07-17  9:40       ` Suzuki K Poulose
2023-10-02 12:43 ` Suzuki K Poulose
2024-01-10  5:40   ` Itaru Kitayama
2024-01-10 11:41     ` Suzuki K Poulose
2024-01-10 13:44       ` Suzuki K Poulose
2024-01-19  1:26         ` Itaru Kitayama
2024-01-12  5:01       ` Itaru Kitayama

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).