A number of hardware platforms are implementing mechanisms whereby the
hypervisor does not have unfettered access to guest memory, in order
to mitigate the security impact of a compromised hypervisor.

AMD's SEV implements this with in-CPU memory encryption, and Intel has
its own memory encryption mechanism. POWER has an upcoming mechanism
to accomplish this in a different way, using a new memory protection
level plus a small trusted ultravisor. s390 also has a protected
execution environment.

The current code (committed or draft) for these features has each
platform's version configured entirely differently. That doesn't seem
ideal for users, or particularly for management layers.

AMD SEV introduces a notionally generic machine option
"memory-encryption", but it doesn't actually cover any cases other
than SEV.

This series is a proposal to at least partially unify configuration
for these mechanisms, by renaming and generalizing AMD's
"memory-encryption" property. It is replaced by a
"securable-guest-memory" property pointing to a platform specific
object which configures and manages the specific details.

Changes since v4:
 * Renamed from "host trust limitation" to "securable guest memory",
   which I think is marginally more descriptive
 * Re-organized initialization, because the previous model called at
   kvm_init didn't work for s390
 * Assorted fixes to the s390 implementation; rudimentary testing
   (gitlab CI) only
Changes since v3:
 * Rebased
 * Added first cut at handling of s390 protected virtualization
Changes since RFCv2:
 * Rebased
 * Removed preliminary SEV cleanups (they've been merged)
 * Changed name to "host trust limitation"
 * Added migration blocker to the PEF code (based on SEV's version)
Changes since RFCv1:
 * Rebased
 * Fixed some errors pointed out by Dave Gilbert

David Gibson (12):
  securable guest memory: Introduce new securable guest memory base
    class
  securable guest memory: Handle memory encryption via interface
  securable guest memory: Move side effect out of
    machine_set_memory_encryption()
  securable guest memory: Rework the "memory-encryption" property
  securable guest memory: Decouple kvm_memcrypt_*() helpers from KVM
  sev: Add Error ** to sev_kvm_init()
  securable guest memory: Introduce sgm "ready" flag
  securable guest memory: Move SEV initialization into arch specific
    code
  spapr: Add PEF based securable guest memory
  spapr: PEF: prevent migration
  securable guest memory: Alter virtio default properties for protected
    guests
  s390: Recognize securable-guest-memory option

Greg Kurz (1):
  qom: Allow optional sugar props

 accel/kvm/kvm-all.c                   |  39 +------
 accel/kvm/sev-stub.c                  |  10 +-
 accel/stubs/kvm-stub.c                |  10 --
 backends/meson.build                  |   1 +
 backends/securable-guest-memory.c     |  30 +++++
 hw/core/machine.c                     |  71 ++++++++++--
 hw/i386/pc_sysfw.c                    |   6 +-
 hw/ppc/meson.build                    |   1 +
 hw/ppc/pef.c                          | 124 +++++++++++++++++++++
 hw/ppc/spapr.c                        |  10 ++
 hw/s390x/pv.c                         |  58 ++++++++++
 include/exec/securable-guest-memory.h |  86 +++++++++++++++
 include/hw/boards.h                   |   2 +-
 include/hw/ppc/pef.h                  |  26 +++++
 include/hw/s390x/pv.h                 |   1 +
 include/qemu/typedefs.h               |   1 +
 include/qom/object.h                  |   3 +-
 include/sysemu/kvm.h                  |  17 ---
 include/sysemu/sev.h                  |   5 +-
 qom/object.c                          |   4 +-
 softmmu/vl.c                          |  16 ++-
 target/i386/kvm.c                     |  12 ++
 target/i386/monitor.c                 |   1 -
 target/i386/sev.c                     | 153 ++++++++++++--------------
 target/ppc/kvm.c                      |  18 ---
 target/ppc/kvm_ppc.h                  |   6 -
 target/s390x/kvm.c                    |   3 +
 27 files changed, 510 insertions(+), 204 deletions(-)
 create mode 100644 backends/securable-guest-memory.c
 create mode 100644 hw/ppc/pef.c
 create mode 100644 include/exec/securable-guest-memory.h
 create mode 100644 include/hw/ppc/pef.h

-- 
2.28.0

From: Greg Kurz <groug@kaod.org>

Global properties have an @optional field, which allows a given
property to be applied to a given type even if one of its subclasses
doesn't support it. This is especially used in the compat code when
dealing with the "disable-modern" and "disable-legacy" properties and
the "virtio-pci" type.

Allow object_register_sugar_prop() to set this field as well.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159738953558.377274.16617742952571083440.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 include/qom/object.h |  3 ++-
 qom/object.c         |  4 +++-
 softmmu/vl.c         | 16 ++++++++++------
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/include/qom/object.h b/include/qom/object.h
index d378f13a11..6721cd312e 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -638,7 +638,8 @@ bool object_apply_global_props(Object *obj, const GPtrArray *props,
                                Error **errp);
 void object_set_machine_compat_props(GPtrArray *compat_props);
 void object_set_accelerator_compat_props(GPtrArray *compat_props);
-void object_register_sugar_prop(const char *driver, const char *prop, const char *value);
+void object_register_sugar_prop(const char *driver, const char *prop,
+                                const char *value, bool optional);
 void object_apply_compat_props(Object *obj);
 
 /**
diff --git a/qom/object.c b/qom/object.c
index 1065355233..62218bb17d 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -442,7 +442,8 @@ static GPtrArray *object_compat_props[3];
  * other than "-global". These are generally used for syntactic
  * sugar and legacy command line options.
  */
-void object_register_sugar_prop(const char *driver, const char *prop, const char *value)
+void object_register_sugar_prop(const char *driver, const char *prop,
+                                const char *value, bool optional)
 {
     GlobalProperty *g;
     if (!object_compat_props[2]) {
@@ -452,6 +453,7 @@ void object_register_sugar_prop(const char *driver, const char *prop, const char
     g->driver = g_strdup(driver);
     g->property = g_strdup(prop);
     g->value = g_strdup(value);
+    g->optional = optional;
     g_ptr_array_add(object_compat_props[2], g);
 }
 
diff --git a/softmmu/vl.c b/softmmu/vl.c
index e6e0ad5a92..cf4a9dc198 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -884,7 +884,7 @@ static void configure_rtc(QemuOpts *opts)
         if (!strcmp(value, "slew")) {
             object_register_sugar_prop("mc146818rtc",
                                        "lost_tick_policy",
-                                       "slew");
+                                       "slew", false);
         } else if (!strcmp(value, "none")) {
             /* discard is default */
         } else {
@@ -2498,12 +2498,14 @@ static int machine_set_property(void *opaque,
         return 0;
     }
     if (g_str_equal(qom_name, "igd-passthru")) {
-        object_register_sugar_prop(ACCEL_CLASS_NAME("xen"), qom_name, value);
+        object_register_sugar_prop(ACCEL_CLASS_NAME("xen"), qom_name, value,
+                                   false);
         return 0;
     }
     if (g_str_equal(qom_name, "kvm-shadow-mem") ||
         g_str_equal(qom_name, "kernel-irqchip")) {
-        object_register_sugar_prop(ACCEL_CLASS_NAME("kvm"), qom_name, value);
+        object_register_sugar_prop(ACCEL_CLASS_NAME("kvm"), qom_name, value,
+                                   false);
         return 0;
     }
 
@@ -3645,7 +3647,8 @@ void qemu_init(int argc, char **argv, char **envp)
                 exit(1);
 #endif
                 warn_report("The -tb-size option is deprecated, use -accel tcg,tb-size instead");
-                object_register_sugar_prop(ACCEL_CLASS_NAME("tcg"), "tb-size", optarg);
+                object_register_sugar_prop(ACCEL_CLASS_NAME("tcg"), "tb-size",
+                                           optarg, false);
                 break;
             case QEMU_OPTION_icount:
                 icount_opts = qemu_opts_parse_noisily(qemu_find_opts("icount"),
@@ -3996,9 +3999,10 @@ void qemu_init(int argc, char **argv, char **envp)
         char *val;
 
         val = g_strdup_printf("%d", current_machine->smp.cpus);
-        object_register_sugar_prop("memory-backend", "prealloc-threads", val);
+        object_register_sugar_prop("memory-backend", "prealloc-threads", val,
+                                   false);
         g_free(val);
-        object_register_sugar_prop("memory-backend", "prealloc", "on");
+        object_register_sugar_prop("memory-backend", "prealloc", "on", false);
     }
 
     /*
-- 
2.28.0

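
For readers less familiar with global properties, the effect of the
@optional field described in the patch above can be sketched in plain C.
This is only a toy model, not QEMU code: ToyGlobalProp, ToyObject and
toy_apply_global() are hypothetical names, and type matching is reduced
to a string compare against the type and its direct parent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a global property entry (not QEMU's struct). */
typedef struct {
    const char *driver;    /* type the property is aimed at */
    const char *property;
    const char *value;
    bool optional;         /* tolerate types that lack the property */
} ToyGlobalProp;

/* Hypothetical object: its type, its parent type, and supported props. */
typedef struct {
    const char *type_name;
    const char *parent_name;
    const char *const *props;
    size_t n_props;
} ToyObject;

static bool toy_has_prop(const ToyObject *obj, const char *name)
{
    for (size_t i = 0; i < obj->n_props; i++) {
        if (strcmp(obj->props[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/*
 * Returns 0 when the property was applied or legitimately skipped,
 * -1 when a non-optional property is missing on a matching object.
 * *applied reports whether the property was actually set.
 */
static int toy_apply_global(const ToyObject *obj, const ToyGlobalProp *g,
                            bool *applied)
{
    assert(obj && g && applied);
    *applied = false;

    /* The property only concerns objects of (or derived from) g->driver */
    if (strcmp(obj->type_name, g->driver) != 0 &&
        strcmp(obj->parent_name, g->driver) != 0) {
        return 0;
    }

    if (!toy_has_prop(obj, g->property)) {
        /* Optional properties are silently skipped, others are errors */
        return g->optional ? 0 : -1;
    }

    *applied = true;
    return 0;
}
```

In this model a subclass without "disable-modern" is an error for a
strict entry but is quietly skipped for an optional one, which is the
behaviour the new bool parameter lets sugar properties request.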
Several architectures have mechanisms which are designed to protect guest memory from interference or eavesdropping by a compromised hypervisor. AMD SEV does this with in-chip memory encryption and Intel's MKTME can do similar things. POWER's Protected Execution Framework (PEF) accomplishes a similar goal using an ultravisor and new memory protection features, instead of encryption. To (partially) unify handling for these, this introduces a new SecurableGuestMemoryState QOM base class. "Securable" is kind of vague, but "secure memory" or "secure guest" seems to be a common theme in the lexicon around these schemes, so it's the best name I've managed to find so far. It's "securable" rather than "secure", because in at least some of the cases it requires the guest to take specific actions in order to protect itself from hypervisor eavesdropping. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- backends/meson.build | 1 + backends/securable-guest-memory.c | 30 +++++++++++++++++ include/exec/securable-guest-memory.h | 46 +++++++++++++++++++++++++++ include/qemu/typedefs.h | 1 + target/i386/sev.c | 3 +- 5 files changed, 80 insertions(+), 1 deletion(-) create mode 100644 backends/securable-guest-memory.c create mode 100644 include/exec/securable-guest-memory.h diff --git a/backends/meson.build b/backends/meson.build index 484456ece7..781594af86 100644 --- a/backends/meson.build +++ b/backends/meson.build @@ -6,6 +6,7 @@ softmmu_ss.add([files( 'rng-builtin.c', 'rng-egd.c', 'rng.c', + 'securable-guest-memory.c', ), numa]) softmmu_ss.add(when: 'CONFIG_POSIX', if_true: files('rng-random.c')) diff --git a/backends/securable-guest-memory.c b/backends/securable-guest-memory.c new file mode 100644 index 0000000000..5bf380fd84 --- /dev/null +++ b/backends/securable-guest-memory.c @@ -0,0 +1,30 @@ +/* + * QEMU Securable Guest Memory interface + * + * Copyright: David Gibson, Red Hat Inc. 
2020 + * + * Authors: + * David Gibson <david@gibson.dropbear.id.au> + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * later. See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" + +#include "exec/securable-guest-memory.h" + +static const TypeInfo securable_guest_memory_info = { + .parent = TYPE_OBJECT, + .name = TYPE_SECURABLE_GUEST_MEMORY, + .class_size = sizeof(SecurableGuestMemoryClass), + .instance_size = sizeof(SecurableGuestMemory), +}; + +static void securable_guest_memory_register_types(void) +{ + type_register_static(&securable_guest_memory_info); +} + +type_init(securable_guest_memory_register_types) diff --git a/include/exec/securable-guest-memory.h b/include/exec/securable-guest-memory.h new file mode 100644 index 0000000000..0d5ecfb681 --- /dev/null +++ b/include/exec/securable-guest-memory.h @@ -0,0 +1,46 @@ +/* + * QEMU Securable Guest Memory interface + * This interface describes the common pieces between various + * schemes for protecting guest memory against a compromised + * hypervisor. This includes memory encryption (AMD's SEV and + * Intel's MKTME) or special protection modes (PEF on POWER, or PV + * on s390x). + * + * Copyright: David Gibson, Red Hat Inc. 2020 + * + * Authors: + * David Gibson <david@gibson.dropbear.id.au> + * + * This work is licensed under the terms of the GNU GPL, version 2 or + * later. See the COPYING file in the top-level directory. 
+ * + */ +#ifndef QEMU_SECURABLE_GUEST_MEMORY_H +#define QEMU_SECURABLE_GUEST_MEMORY_H + +#ifndef CONFIG_USER_ONLY + +#include "qom/object.h" + +#define TYPE_SECURABLE_GUEST_MEMORY "securable-guest-memory" +#define SECURABLE_GUEST_MEMORY(obj) \ + OBJECT_CHECK(SecurableGuestMemory, (obj), \ + TYPE_SECURABLE_GUEST_MEMORY) +#define SECURABLE_GUEST_MEMORY_CLASS(klass) \ + OBJECT_CLASS_CHECK(SecurableGuestMemoryClass, (klass), \ + TYPE_SECURABLE_GUEST_MEMORY) +#define SECURABLE_GUEST_MEMORY_GET_CLASS(obj) \ + OBJECT_GET_CLASS(SecurableGuestMemoryClass, (obj), \ + TYPE_SECURABLE_GUEST_MEMORY) + +struct SecurableGuestMemory { + Object parent; +}; + +typedef struct SecurableGuestMemoryClass { + ObjectClass parent; +} SecurableGuestMemoryClass; + +#endif /* !CONFIG_USER_ONLY */ + +#endif /* QEMU_SECURABLE_GUEST_MEMORY_H */ diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h index 6281eae3b5..79d53746f1 100644 --- a/include/qemu/typedefs.h +++ b/include/qemu/typedefs.h @@ -116,6 +116,7 @@ typedef struct QString QString; typedef struct RAMBlock RAMBlock; typedef struct Range Range; typedef struct SavedIOTLB SavedIOTLB; +typedef struct SecurableGuestMemory SecurableGuestMemory; typedef struct SHPCDevice SHPCDevice; typedef struct SSIBus SSIBus; typedef struct VirtIODevice VirtIODevice; diff --git a/target/i386/sev.c b/target/i386/sev.c index 93c4d60b82..53f00a24cf 100644 --- a/target/i386/sev.c +++ b/target/i386/sev.c @@ -29,6 +29,7 @@ #include "trace.h" #include "migration/blocker.h" #include "qom/object.h" +#include "exec/securable-guest-memory.h" #define TYPE_SEV_GUEST "sev-guest" OBJECT_DECLARE_SIMPLE_TYPE(SevGuestState, SEV_GUEST) @@ -320,7 +321,7 @@ sev_guest_instance_init(Object *obj) /* sev guest info */ static const TypeInfo sev_guest_info = { - .parent = TYPE_OBJECT, + .parent = TYPE_SECURABLE_GUEST_MEMORY, .name = TYPE_SEV_GUEST, .instance_size = sizeof(SevGuestState), .instance_finalize = sev_guest_finalize, -- 2.28.0
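
As a rough illustration of what re-parenting sev-guest under the new
base class buys, here is a toy model of an "is this object a securable
guest memory object?" check. It is a sketch under stated assumptions:
ToyType and toy_type_is_a() are hypothetical and only mimic the idea of
walking a parent chain; they are not the QOM implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type descriptor: a name plus a link to the parent type. */
typedef struct ToyType {
    const char *name;
    const struct ToyType *parent;
} ToyType;

/* Toy hierarchy mirroring the patch: sev-guest derives from the base. */
static const ToyType toy_object = { "object", NULL };
static const ToyType toy_sgm = { "securable-guest-memory", &toy_object };
static const ToyType toy_sev_guest = { "sev-guest", &toy_sgm };
/* An unrelated, made-up type used only for the negative case. */
static const ToyType toy_unrelated = { "toy-unrelated", &toy_object };

/* Walk the parent chain; true if any ancestor (or the type) matches. */
static bool toy_type_is_a(const ToyType *type, const char *name)
{
    assert(name != NULL);
    for (; type != NULL; type = type->parent) {
        if (strcmp(type->name, name) == 0) {
            return true;
        }
    }
    return false;
}
```

With a common ancestor, generic code can accept any platform's object by
testing for the base type, without knowing about SEV specifically.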
At the moment AMD SEV sets a special function pointer, plus an opaque handle in KVMState to let things know how to encrypt guest memory. Now that we have a QOM interface for handling things related to securable guest memory, use a QOM method on that interface, rather than a bare function pointer for this. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> --- accel/kvm/kvm-all.c | 36 +++++--- accel/kvm/sev-stub.c | 9 +- include/exec/securable-guest-memory.h | 2 + include/sysemu/sev.h | 5 +- target/i386/monitor.c | 1 - target/i386/sev.c | 116 ++++++++++---------------- 6 files changed, 77 insertions(+), 92 deletions(-) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index baaa54249d..9e7cea64d6 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -47,6 +47,7 @@ #include "qemu/guest-random.h" #include "sysemu/hw_accel.h" #include "kvm-cpus.h" +#include "exec/securable-guest-memory.h" #include "hw/boards.h" @@ -120,9 +121,8 @@ struct KVMState KVMMemoryListener memory_listener; QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus; - /* memory encryption */ - void *memcrypt_handle; - int (*memcrypt_encrypt_data)(void *handle, uint8_t *ptr, uint64_t len); + /* securable guest memory (e.g. 
by guest memory encryption) */ + SecurableGuestMemory *sgm; /* For "info mtree -f" to tell if an MR is registered in KVM */ int nr_as; @@ -224,7 +224,7 @@ int kvm_get_max_memslots(void) bool kvm_memcrypt_enabled(void) { - if (kvm_state && kvm_state->memcrypt_handle) { + if (kvm_state && kvm_state->sgm) { return true; } @@ -233,10 +233,12 @@ bool kvm_memcrypt_enabled(void) int kvm_memcrypt_encrypt_data(uint8_t *ptr, uint64_t len) { - if (kvm_state->memcrypt_handle && - kvm_state->memcrypt_encrypt_data) { - return kvm_state->memcrypt_encrypt_data(kvm_state->memcrypt_handle, - ptr, len); + SecurableGuestMemory *sgm = kvm_state->sgm; + + if (sgm) { + SecurableGuestMemoryClass *sgmc = SECURABLE_GUEST_MEMORY_GET_CLASS(sgm); + + return sgmc->encrypt_data(sgm, ptr, len); } return 1; @@ -2206,13 +2208,23 @@ static int kvm_init(MachineState *ms) * encryption context. */ if (ms->memory_encryption) { - kvm_state->memcrypt_handle = sev_guest_init(ms->memory_encryption); - if (!kvm_state->memcrypt_handle) { + Object *obj = object_resolve_path_component(object_get_objects_root(), + ms->memory_encryption); + + if (object_dynamic_cast(obj, TYPE_SECURABLE_GUEST_MEMORY)) { + SecurableGuestMemory *sgm = SECURABLE_GUEST_MEMORY(obj); + + /* FIXME handle mechanisms other than SEV */ + ret = sev_kvm_init(sgm); + if (ret < 0) { + goto err; + } + + kvm_state->sgm = sgm; + } else { ret = -1; goto err; } - - kvm_state->memcrypt_encrypt_data = sev_encrypt_data; } ret = kvm_arch_init(ms, s); diff --git a/accel/kvm/sev-stub.c b/accel/kvm/sev-stub.c index 4f97452585..3df3c88eeb 100644 --- a/accel/kvm/sev-stub.c +++ b/accel/kvm/sev-stub.c @@ -15,12 +15,7 @@ #include "qemu-common.h" #include "sysemu/sev.h" -int sev_encrypt_data(void *handle, uint8_t *ptr, uint64_t len) +int sev_kvm_init(SecurableGuestMemory *sgm) { - abort(); -} - -void *sev_guest_init(const char *id) -{ - return NULL; + return -1; } diff --git a/include/exec/securable-guest-memory.h b/include/exec/securable-guest-memory.h index 
0d5ecfb681..4e2ae27040 100644 --- a/include/exec/securable-guest-memory.h +++ b/include/exec/securable-guest-memory.h @@ -39,6 +39,8 @@ struct SecurableGuestMemory { typedef struct SecurableGuestMemoryClass { ObjectClass parent; + + int (*encrypt_data)(SecurableGuestMemory *, uint8_t *, uint64_t); } SecurableGuestMemoryClass; #endif /* !CONFIG_USER_ONLY */ diff --git a/include/sysemu/sev.h b/include/sysemu/sev.h index 98c1ec8d38..36d038a36f 100644 --- a/include/sysemu/sev.h +++ b/include/sysemu/sev.h @@ -15,7 +15,8 @@ #define QEMU_SEV_H #include "sysemu/kvm.h" +#include "exec/securable-guest-memory.h" + +int sev_kvm_init(SecurableGuestMemory *sgm); -void *sev_guest_init(const char *id); -int sev_encrypt_data(void *handle, uint8_t *ptr, uint64_t len); #endif diff --git a/target/i386/monitor.c b/target/i386/monitor.c index 9f9e1c42f4..db6aeaf43a 100644 --- a/target/i386/monitor.c +++ b/target/i386/monitor.c @@ -29,7 +29,6 @@ #include "monitor/hmp.h" #include "qapi/qmp/qdict.h" #include "sysemu/kvm.h" -#include "sysemu/sev.h" #include "qapi/error.h" #include "sev_i386.h" #include "qapi/qapi-commands-misc-target.h" diff --git a/target/i386/sev.c b/target/i386/sev.c index 53f00a24cf..7b8ce590f7 100644 --- a/target/i386/sev.c +++ b/target/i386/sev.c @@ -281,26 +281,6 @@ sev_guest_set_sev_device(Object *obj, const char *value, Error **errp) sev->sev_device = g_strdup(value); } -static void -sev_guest_class_init(ObjectClass *oc, void *data) -{ - object_class_property_add_str(oc, "sev-device", - sev_guest_get_sev_device, - sev_guest_set_sev_device); - object_class_property_set_description(oc, "sev-device", - "SEV device to use"); - object_class_property_add_str(oc, "dh-cert-file", - sev_guest_get_dh_cert_file, - sev_guest_set_dh_cert_file); - object_class_property_set_description(oc, "dh-cert-file", - "guest owners DH certificate (encoded with base64)"); - object_class_property_add_str(oc, "session-file", - sev_guest_get_session_file, - sev_guest_set_session_file); - 
object_class_property_set_description(oc, "session-file", - "guest owners session parameters (encoded with base64)"); -} - static void sev_guest_instance_init(Object *obj) { @@ -319,40 +299,6 @@ sev_guest_instance_init(Object *obj) OBJ_PROP_FLAG_READWRITE); } -/* sev guest info */ -static const TypeInfo sev_guest_info = { - .parent = TYPE_SECURABLE_GUEST_MEMORY, - .name = TYPE_SEV_GUEST, - .instance_size = sizeof(SevGuestState), - .instance_finalize = sev_guest_finalize, - .class_init = sev_guest_class_init, - .instance_init = sev_guest_instance_init, - .interfaces = (InterfaceInfo[]) { - { TYPE_USER_CREATABLE }, - { } - } -}; - -static SevGuestState * -lookup_sev_guest_info(const char *id) -{ - Object *obj; - SevGuestState *info; - - obj = object_resolve_path_component(object_get_objects_root(), id); - if (!obj) { - return NULL; - } - - info = (SevGuestState *) - object_dynamic_cast(obj, TYPE_SEV_GUEST); - if (!info) { - return NULL; - } - - return info; -} - bool sev_enabled(void) { @@ -680,10 +626,9 @@ sev_vm_state_change(void *opaque, int running, RunState state) } } -void * -sev_guest_init(const char *id) +int sev_kvm_init(SecurableGuestMemory *sgm) { - SevGuestState *sev; + SevGuestState *sev = SEV_GUEST(sgm); char *devname; int ret, fw_error; uint32_t ebx; @@ -693,14 +638,7 @@ sev_guest_init(const char *id) ret = ram_block_discard_disable(true); if (ret) { error_report("%s: cannot disable RAM discard", __func__); - return NULL; - } - - sev = lookup_sev_guest_info(id); - if (!sev) { - error_report("%s: '%s' is not a valid '%s' object", - __func__, id, TYPE_SEV_GUEST); - goto err; + return -1; } sev_guest = sev; @@ -764,17 +702,17 @@ sev_guest_init(const char *id) qemu_add_machine_init_done_notifier(&sev_machine_done_notify); qemu_add_vm_change_state_handler(sev_vm_state_change, sev); - return sev; + return 0; err: sev_guest = NULL; ram_block_discard_disable(false); - return NULL; + return -1; } -int -sev_encrypt_data(void *handle, uint8_t *ptr, uint64_t len) 
+static int +sev_encrypt_data(SecurableGuestMemory *opaque, uint8_t *ptr, uint64_t len) { - SevGuestState *sev = handle; + SevGuestState *sev = SEV_GUEST(opaque); assert(sev); @@ -786,6 +724,44 @@ sev_encrypt_data(void *handle, uint8_t *ptr, uint64_t len) return 0; } +static void +sev_guest_class_init(ObjectClass *oc, void *data) +{ + SecurableGuestMemoryClass *sgmc = SECURABLE_GUEST_MEMORY_CLASS(oc); + + object_class_property_add_str(oc, "sev-device", + sev_guest_get_sev_device, + sev_guest_set_sev_device); + object_class_property_set_description(oc, "sev-device", + "SEV device to use"); + object_class_property_add_str(oc, "dh-cert-file", + sev_guest_get_dh_cert_file, + sev_guest_set_dh_cert_file); + object_class_property_set_description(oc, "dh-cert-file", + "guest owners DH certificate (encoded with base64)"); + object_class_property_add_str(oc, "session-file", + sev_guest_get_session_file, + sev_guest_set_session_file); + object_class_property_set_description(oc, "session-file", + "guest owners session parameters (encoded with base64)"); + + sgmc->encrypt_data = sev_encrypt_data; +} + +/* sev guest info */ +static const TypeInfo sev_guest_info = { + .parent = TYPE_SECURABLE_GUEST_MEMORY, + .name = TYPE_SEV_GUEST, + .instance_size = sizeof(SevGuestState), + .instance_finalize = sev_guest_finalize, + .class_init = sev_guest_class_init, + .instance_init = sev_guest_instance_init, + .interfaces = (InterfaceInfo[]) { + { TYPE_USER_CREATABLE }, + { } + } +}; + static void sev_register_types(void) { -- 2.28.0
When the "memory-encryption" property is set, we also disable KSM
merging for the guest, since it won't accomplish anything.

We want that, but doing it in the property set function itself is
theoretically incorrect, in the unlikely event of some configuration
environment that set the property then cleared it again before
constructing the guest.

More importantly, it makes some other cleanups we want more difficult.
So, instead move this logic to machine_run_board_init() conditional on
the final value of the property.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 hw/core/machine.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index d0408049b5..cb0711508d 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -427,14 +427,6 @@ static void machine_set_memory_encryption(Object *obj, const char *value,
 
     g_free(ms->memory_encryption);
     ms->memory_encryption = g_strdup(value);
-
-    /*
-     * With memory encryption, the host can't see the real contents of RAM,
-     * so there's no point in it trying to merge areas.
-     */
-    if (value) {
-        machine_set_mem_merge(obj, false, errp);
-    }
 }
 
 static bool machine_get_nvdimm(Object *obj, Error **errp)
@@ -1131,6 +1123,15 @@ void machine_run_board_init(MachineState *machine)
                          cc->deprecation_note);
     }
 
+    if (machine->memory_encryption) {
+        /*
+         * With memory encryption, the host can't see the real
+         * contents of RAM, so there's no point in it trying to merge
+         * areas.
+         */
+        machine_set_mem_merge(OBJECT(machine), false, &error_abort);
+    }
+
     machine_class->init(machine);
 }
 
-- 
2.28.0

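
The set-then-clear corner case from the commit message above can be
shown with a small stand-alone model. This is a sketch only: ToyMachine
and the toy_* functions are hypothetical simplifications, with a plain
bool standing in for the mem-merge setting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily reduced machine state. */
typedef struct {
    char *memory_encryption; /* id of the encryption object, or NULL */
    bool mem_merge;          /* whether page merging is requested */
} ToyMachine;

static char *toy_dup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);

    assert(copy != NULL);
    memcpy(copy, s, n);
    return copy;
}

/* Setter with no side effect: it only records the latest value. */
static void toy_set_memory_encryption(ToyMachine *m, const char *value)
{
    free(m->memory_encryption);
    m->memory_encryption = value ? toy_dup(value) : NULL;
}

/* The merge decision is taken once, from the final property value. */
static void toy_run_board_init(ToyMachine *m)
{
    if (m->memory_encryption) {
        /* The host can't see real RAM contents, so merging is pointless */
        m->mem_merge = false;
    }
}
```

Had the setter disabled merging itself, setting and then clearing the
property would leave merging off for an unencrypted guest; deciding at
board init time avoids that.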
Currently the "memory-encryption" property is only looked at once we get to kvm_init(). Although protection of guest memory from the hypervisor isn't something that could really ever work with TCG, it's not conceptually tied to the KVM accelerator. In addition, the way the string property is resolved to an object is almost identical to how a QOM link property is handled. So, create a new "securable-guest-memory" link property which sets this QOM interface link directly in the machine. For compatibility we keep the "memory-encryption" property, but now implemented in terms of the new property. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> --- accel/kvm/kvm-all.c | 22 ++++++---------------- hw/core/machine.c | 43 +++++++++++++++++++++++++++++++++++++------ include/hw/boards.h | 2 +- 3 files changed, 44 insertions(+), 23 deletions(-) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index 9e7cea64d6..92a49b328a 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -2207,24 +2207,14 @@ static int kvm_init(MachineState *ms) * if memory encryption object is specified then initialize the memory * encryption context. 
*/ - if (ms->memory_encryption) { - Object *obj = object_resolve_path_component(object_get_objects_root(), - ms->memory_encryption); - - if (object_dynamic_cast(obj, TYPE_SECURABLE_GUEST_MEMORY)) { - SecurableGuestMemory *sgm = SECURABLE_GUEST_MEMORY(obj); - - /* FIXME handle mechanisms other than SEV */ - ret = sev_kvm_init(sgm); - if (ret < 0) { - goto err; - } - - kvm_state->sgm = sgm; - } else { - ret = -1; + if (ms->sgm) { + /* FIXME handle mechanisms other than SEV */ + ret = sev_kvm_init(ms->sgm); + if (ret < 0) { goto err; } + + kvm_state->sgm = ms->sgm; } ret = kvm_arch_init(ms, s); diff --git a/hw/core/machine.c b/hw/core/machine.c index cb0711508d..816ea3ae3e 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -27,6 +27,7 @@ #include "hw/pci/pci.h" #include "hw/mem/nvdimm.h" #include "migration/vmstate.h" +#include "exec/securable-guest-memory.h" GlobalProperty hw_compat_5_1[] = { { "vhost-scsi", "num_queues", "1"}, @@ -417,16 +418,37 @@ static char *machine_get_memory_encryption(Object *obj, Error **errp) { MachineState *ms = MACHINE(obj); - return g_strdup(ms->memory_encryption); + if (ms->sgm) { + return g_strdup(object_get_canonical_path_component(OBJECT(ms->sgm))); + } + + return NULL; } static void machine_set_memory_encryption(Object *obj, const char *value, Error **errp) { - MachineState *ms = MACHINE(obj); + Object *sgm = + object_resolve_path_component(object_get_objects_root(), value); + + if (!sgm) { + error_setg(errp, "No such memory encryption object '%s'", value); + return; + } - g_free(ms->memory_encryption); - ms->memory_encryption = g_strdup(value); + object_property_set_link(obj, "securable-guest-memory", sgm, errp); +} + +static void machine_check_securable_guest_memory(const Object *obj, + const char *name, + Object *new_target, + Error **errp) +{ + /* + * So far the only constraint is that the target has the + * TYPE_SECURABLE_GUEST_MEMORY interface, and that's checked by + * the QOM core + */ } static bool 
machine_get_nvdimm(Object *obj, Error **errp) @@ -833,6 +855,15 @@ static void machine_class_init(ObjectClass *oc, void *data) object_class_property_set_description(oc, "suppress-vmdesc", "Set on to disable self-describing migration"); + object_class_property_add_link(oc, "securable-guest-memory", + TYPE_SECURABLE_GUEST_MEMORY, + offsetof(MachineState, sgm), + machine_check_securable_guest_memory, + OBJ_PROP_LINK_STRONG); + object_class_property_set_description(oc, "securable-guest-memory", + "Set securable guest memory scheme to use"); + + /* For compatibility */ object_class_property_add_str(oc, "memory-encryption", machine_get_memory_encryption, machine_set_memory_encryption); object_class_property_set_description(oc, "memory-encryption", @@ -1123,9 +1154,9 @@ void machine_run_board_init(MachineState *machine) cc->deprecation_note); } - if (machine->memory_encryption) { + if (machine->sgm) { /* - * With memory encryption, the host can't see the real + * With securable guest memory, the host can't see the real * contents of RAM, so there's no point in it trying to merge * areas. */ diff --git a/include/hw/boards.h b/include/hw/boards.h index a49e3a6b44..2ea9790183 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -269,7 +269,7 @@ struct MachineState { bool iommu; bool suppress_vmdesc; bool enable_graphics; - char *memory_encryption; + SecurableGuestMemory *sgm; char *ram_memdev_id; /* * convenience alias to ram_memdev_id backend memory region -- 2.28.0
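
The compatibility arrangement in the patch above, where the legacy
string property is implemented on top of the new link, can be modelled
roughly as follows. Everything here is a hypothetical simplification:
ToyRegistry stands in for the objects root, and a bare pointer stands in
for the QOM link property.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical securable guest memory object, identified by its id. */
typedef struct {
    const char *id;
} ToySgmObject;

/* Hypothetical stand-in for the container of user-created objects. */
typedef struct {
    ToySgmObject *objs;
    size_t n_objs;
} ToyRegistry;

/* Machine with the new link; the old string field is gone. */
typedef struct {
    ToySgmObject *sgm;
} ToyBoard;

static ToySgmObject *toy_resolve(const ToyRegistry *reg, const char *id)
{
    for (size_t i = 0; i < reg->n_objs; i++) {
        if (strcmp(reg->objs[i].id, id) == 0) {
            return &reg->objs[i];
        }
    }
    return NULL;
}

/* Legacy setter: resolve the id, then set the link. -1 if no such id. */
static int toy_set_memory_encryption(ToyBoard *b, const ToyRegistry *reg,
                                     const char *id)
{
    ToySgmObject *obj;

    assert(b && reg && id);
    obj = toy_resolve(reg, id);
    if (!obj) {
        return -1; /* the link is left untouched on failure */
    }
    b->sgm = obj;
    return 0;
}

/* Legacy getter: derive the string from the link, or NULL if unset. */
static const char *toy_get_memory_encryption(const ToyBoard *b)
{
    return b->sgm ? b->sgm->id : NULL;
}
```

The point of the design is that there is a single source of truth (the
link), so the old and new properties cannot disagree.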
The kvm_memcrypt_enabled() and kvm_memcrypt_encrypt_data() helper functions don't conceptually have any connection to KVM (although it's not possible in practice to use them without it). They also rely on looking at the global KVMState. But the same information is available from the machine, and the only existing callers have natural access to the machine state. Therefore, move and rename them to helpers in securable-guest-memory.h, taking an explicit machine parameter. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> --- accel/kvm/kvm-all.c | 27 -------------------- accel/stubs/kvm-stub.c | 10 -------- hw/i386/pc_sysfw.c | 6 +++-- include/exec/securable-guest-memory.h | 36 +++++++++++++++++++++++++++ include/sysemu/kvm.h | 17 ------------- 5 files changed, 40 insertions(+), 56 deletions(-) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index 92a49b328a..c6bd7b9d02 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -121,9 +121,6 @@ struct KVMState KVMMemoryListener memory_listener; QLIST_HEAD(, KVMParkedVcpu) kvm_parked_vcpus; - /* securable guest memory (e.g. 
by guest memory encryption) */ - SecurableGuestMemory *sgm; - /* For "info mtree -f" to tell if an MR is registered in KVM */ int nr_as; struct KVMAs { @@ -222,28 +219,6 @@ int kvm_get_max_memslots(void) return s->nr_slots; } -bool kvm_memcrypt_enabled(void) -{ - if (kvm_state && kvm_state->sgm) { - return true; - } - - return false; -} - -int kvm_memcrypt_encrypt_data(uint8_t *ptr, uint64_t len) -{ - SecurableGuestMemory *sgm = kvm_state->sgm; - - if (sgm) { - SecurableGuestMemoryClass *sgmc = SECURABLE_GUEST_MEMORY_GET_CLASS(sgm); - - return sgmc->encrypt_data(sgm, ptr, len); - } - - return 1; -} - /* Called with KVMMemoryListener.slots_lock held */ static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml) { @@ -2213,8 +2188,6 @@ static int kvm_init(MachineState *ms) if (ret < 0) { goto err; } - - kvm_state->sgm = ms->sgm; } ret = kvm_arch_init(ms, s); diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c index 680e099463..0f17acfac0 100644 --- a/accel/stubs/kvm-stub.c +++ b/accel/stubs/kvm-stub.c @@ -81,16 +81,6 @@ int kvm_on_sigbus(int code, void *addr) return 1; } -bool kvm_memcrypt_enabled(void) -{ - return false; -} - -int kvm_memcrypt_encrypt_data(uint8_t *ptr, uint64_t len) -{ - return 1; -} - #ifndef CONFIG_USER_ONLY int kvm_irqchip_add_msi_route(KVMState *s, int vector, PCIDevice *dev) { diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c index b6c0822fe3..439ac78970 100644 --- a/hw/i386/pc_sysfw.c +++ b/hw/i386/pc_sysfw.c @@ -38,6 +38,7 @@ #include "sysemu/sysemu.h" #include "hw/block/flash.h" #include "sysemu/kvm.h" +#include "exec/securable-guest-memory.h" /* * We don't have a theoretically justifiable exact lower bound on the base @@ -201,10 +202,11 @@ static void pc_system_flash_map(PCMachineState *pcms, pc_isa_bios_init(rom_memory, flash_mem, size); /* Encrypt the pflash boot ROM */ - if (kvm_memcrypt_enabled()) { + if (securable_guest_memory_enabled(MACHINE(pcms))) { flash_ptr = memory_region_get_ram_ptr(flash_mem); flash_size = 
memory_region_size(flash_mem); - ret = kvm_memcrypt_encrypt_data(flash_ptr, flash_size); + ret = securable_guest_memory_encrypt(MACHINE(pcms), + flash_ptr, flash_size); if (ret) { error_report("failed to encrypt pflash rom"); exit(1); diff --git a/include/exec/securable-guest-memory.h b/include/exec/securable-guest-memory.h index 4e2ae27040..7325b504ba 100644 --- a/include/exec/securable-guest-memory.h +++ b/include/exec/securable-guest-memory.h @@ -21,6 +21,7 @@ #ifndef CONFIG_USER_ONLY #include "qom/object.h" +#include "hw/boards.h" #define TYPE_SECURABLE_GUEST_MEMORY "securable-guest-memory" #define SECURABLE_GUEST_MEMORY(obj) \ @@ -43,6 +44,41 @@ typedef struct SecurableGuestMemoryClass { int (*encrypt_data)(SecurableGuestMemory *, uint8_t *, uint64_t); } SecurableGuestMemoryClass; +/** + * securable_guest_memory_enabled - return whether guest memory is protected + * from hypervisor access (with memory + * encryption or otherwise) + * Returns: true guest memory is not directly accessible to qemu + * false guest memory is directly accessible to qemu + */ +static inline bool securable_guest_memory_enabled(MachineState *machine) +{ + return !!machine->sgm; +} + +/** + * securable_guest_memory_encrypt: encrypt the memory range to make + * it guest accessible + * + * Return: 1 failed to encrypt the range + * 0 succesfully encrypted memory region + */ +static inline int securable_guest_memory_encrypt(MachineState *machine, + uint8_t *ptr, uint64_t len) +{ + SecurableGuestMemory *sgm = machine->sgm; + + if (sgm) { + SecurableGuestMemoryClass *sgmc = SECURABLE_GUEST_MEMORY_GET_CLASS(sgm); + + if (sgmc->encrypt_data) { + return sgmc->encrypt_data(sgm, ptr, len); + } + } + + return 1; +} + #endif /* !CONFIG_USER_ONLY */ #endif /* QEMU_SECURABLE_GUEST_MEMORY_H */ diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index bb5d5cf497..0e163c2c9d 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -233,23 +233,6 @@ int kvm_has_intx_set_mask(void); */ bool 
kvm_arm_supports_user_irq(void); -/** - * kvm_memcrypt_enabled - return boolean indicating whether memory encryption - * is enabled - * Returns: 1 memory encryption is enabled - * 0 memory encryption is disabled - */ -bool kvm_memcrypt_enabled(void); - -/** - * kvm_memcrypt_encrypt_data: encrypt the memory range - * - * Return: 1 failed to encrypt the range - * 0 succesfully encrypted memory region - */ -int kvm_memcrypt_encrypt_data(uint8_t *ptr, uint64_t len); - - #ifdef NEED_CPU_H #include "cpu.h" -- 2.28.0
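For anyone skimming the helper semantics in this patch, here is a minimal standalone sketch of the "optional class hook with a failure fallback" shape that securable_guest_memory_encrypt() follows. It does not use QOM; DemoSgm, DemoSgmClass, DemoMachine, demo_sgm_encrypt() and demo_xor_encrypt() are all made-up names for illustration, and the XOR is only a placeholder transform, not encryption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the QOM object and class; not QEMU types. */
typedef struct DemoSgm DemoSgm;

typedef struct DemoSgmClass {
    /* Optional hook: a mechanism may leave this NULL. */
    int (*encrypt_data)(DemoSgm *sgm, uint8_t *ptr, uint64_t len);
} DemoSgmClass;

struct DemoSgm {
    const DemoSgmClass *klass;
};

typedef struct DemoMachine {
    DemoSgm *sgm;   /* NULL when no mechanism was requested */
} DemoMachine;

/* Placeholder transform so the sketch has something observable to do. */
static int demo_xor_encrypt(DemoSgm *sgm, uint8_t *ptr, uint64_t len)
{
    (void)sgm;
    for (uint64_t i = 0; i < len; i++) {
        ptr[i] ^= 0xff;
    }
    return 0;
}

/*
 * Returns 0 on success and 1 on failure.  As in the patch, "no
 * mechanism" and "mechanism without a hook" both count as failure,
 * and the buffer is left untouched in those cases.
 */
static int demo_sgm_encrypt(DemoMachine *machine, uint8_t *ptr, uint64_t len)
{
    DemoSgm *sgm = machine->sgm;

    if (sgm && sgm->klass->encrypt_data) {
        return sgm->klass->encrypt_data(sgm, ptr, len);
    }

    return 1;
}
```

The point of the shape is that callers only ever need the machine pointer, and a mechanism that has nothing to encrypt can simply not provide the hook.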
This allows failures to be reported richly and idiomatically. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> --- accel/kvm/kvm-all.c | 4 +++- accel/kvm/sev-stub.c | 5 +++-- include/sysemu/sev.h | 2 +- target/i386/sev.c | 31 +++++++++++++++---------------- 4 files changed, 22 insertions(+), 20 deletions(-) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index c6bd7b9d02..724e9294d0 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -2183,9 +2183,11 @@ static int kvm_init(MachineState *ms) * encryption context. */ if (ms->sgm) { + Error *local_err = NULL; /* FIXME handle mechanisms other than SEV */ - ret = sev_kvm_init(ms->sgm); + ret = sev_kvm_init(ms->sgm, &local_err); if (ret < 0) { + error_report_err(local_err); goto err; } } diff --git a/accel/kvm/sev-stub.c b/accel/kvm/sev-stub.c index 3df3c88eeb..537c91d9f8 100644 --- a/accel/kvm/sev-stub.c +++ b/accel/kvm/sev-stub.c @@ -15,7 +15,8 @@ #include "qemu-common.h" #include "sysemu/sev.h" -int sev_kvm_init(SecurableGuestMemory *sgm) +int sev_kvm_init(SecurableGuestMemory *sgm, Error **errp) { - return -1; + /* SEV can't be selected if it's not compiled */ + g_assert_not_reached(); } diff --git a/include/sysemu/sev.h b/include/sysemu/sev.h index 36d038a36f..7aa35821f0 100644 --- a/include/sysemu/sev.h +++ b/include/sysemu/sev.h @@ -17,6 +17,6 @@ #include "sysemu/kvm.h" #include "exec/securable-guest-memory.h" -int sev_kvm_init(SecurableGuestMemory *sgm); +int sev_kvm_init(SecurableGuestMemory *sgm, Error **errp); #endif diff --git a/target/i386/sev.c b/target/i386/sev.c index 7b8ce590f7..7333a60dc0 100644 --- a/target/i386/sev.c +++ b/target/i386/sev.c @@ -626,7 +626,7 @@ sev_vm_state_change(void *opaque, int running, RunState state) } } -int sev_kvm_init(SecurableGuestMemory *sgm) +int sev_kvm_init(SecurableGuestMemory *sgm, Error **errp) { SevGuestState *sev = 
SEV_GUEST(sgm); char *devname; @@ -648,14 +648,14 @@ int sev_kvm_init(SecurableGuestMemory *sgm) host_cbitpos = ebx & 0x3f; if (host_cbitpos != sev->cbitpos) { - error_report("%s: cbitpos check failed, host '%d' requested '%d'", - __func__, host_cbitpos, sev->cbitpos); + error_setg(errp, "%s: cbitpos check failed, host '%d' requested '%d'", + __func__, host_cbitpos, sev->cbitpos); goto err; } if (sev->reduced_phys_bits < 1) { - error_report("%s: reduced_phys_bits check failed, it should be >=1," - " requested '%d'", __func__, sev->reduced_phys_bits); + error_setg(errp, "%s: reduced_phys_bits check failed, it should be >=1," + " requested '%d'", __func__, sev->reduced_phys_bits); goto err; } @@ -664,20 +664,19 @@ int sev_kvm_init(SecurableGuestMemory *sgm) devname = object_property_get_str(OBJECT(sev), "sev-device", NULL); sev->sev_fd = open(devname, O_RDWR); if (sev->sev_fd < 0) { - error_report("%s: Failed to open %s '%s'", __func__, - devname, strerror(errno)); - } - g_free(devname); - if (sev->sev_fd < 0) { + error_setg(errp, "%s: Failed to open %s '%s'", __func__, + devname, strerror(errno)); + g_free(devname); goto err; } + g_free(devname); ret = sev_platform_ioctl(sev->sev_fd, SEV_PLATFORM_STATUS, &status, &fw_error); if (ret) { - error_report("%s: failed to get platform status ret=%d " - "fw_error='%d: %s'", __func__, ret, fw_error, - fw_error_to_str(fw_error)); + error_setg(errp, "%s: failed to get platform status ret=%d " + "fw_error='%d: %s'", __func__, ret, fw_error, + fw_error_to_str(fw_error)); goto err; } sev->build_id = status.build; @@ -687,14 +686,14 @@ int sev_kvm_init(SecurableGuestMemory *sgm) trace_kvm_sev_init(); ret = sev_ioctl(sev->sev_fd, KVM_SEV_INIT, NULL, &fw_error); if (ret) { - error_report("%s: failed to initialize ret=%d fw_error=%d '%s'", - __func__, ret, fw_error, fw_error_to_str(fw_error)); + error_setg(errp, "%s: failed to initialize ret=%d fw_error=%d '%s'", + __func__, ret, fw_error, fw_error_to_str(fw_error)); goto err; } ret 
= sev_launch_start(sev); if (ret) { - error_report("%s: failed to create encryption context", __func__); + error_setg(errp, "%s: failed to create encryption context", __func__); goto err; } -- 2.28.0
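As a rough illustration of why passing an Error ** is nicer than calling error_report() in the callee, here is a standalone sketch of the pattern. It deliberately does not use QEMU's real Error API: DemoError, demo_error_set() and demo_init() are hypothetical names invented for this example, and the message text just mirrors the cbitpos check above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an error object; not QEMU's Error. */
typedef struct DemoError {
    char msg[128];
} DemoError;

/* Store a formatted message in *errp, if the caller asked for one. */
static void demo_error_set(DemoError **errp, const char *fmt, ...)
{
    va_list ap;

    if (!errp) {
        return;             /* caller is not interested in details */
    }
    *errp = calloc(1, sizeof(**errp));
    if (!*errp) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf((*errp)->msg, sizeof((*errp)->msg), fmt, ap);
    va_end(ap);
}

/*
 * The callee only describes what went wrong.  Whether that ends up
 * printed, propagated further, or ignored is the caller's decision.
 */
static int demo_init(int host_cbitpos, int requested_cbitpos,
                     DemoError **errp)
{
    if (host_cbitpos != requested_cbitpos) {
        demo_error_set(errp,
                       "cbitpos check failed, host '%d' requested '%d'",
                       host_cbitpos, requested_cbitpos);
        return -1;
    }
    return 0;
}
```

A caller that wants the old behaviour prints the message and frees the object; a caller higher up the stack could instead hand the same pointer to its own caller.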
The platform specific details of mechanisms for implementing securable guest memory may require setup at various points during initialization. Thus, it's not really feasible to have a single sgm initialization hook, but instead each mechanism needs its own initialization calls in arch or machine specific code. However, to make it harder to have a bug where a mechanism isn't properly initialized under some circumstances, we want to have a common place, relatively late in boot, where we verify that sgm has been initialized if it was requested. This patch introduces a ready flag to the SecurableGuestMemory base type to accomplish this, which we verify just before the machine specific initialization function. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- hw/core/machine.c | 8 ++++++++ include/exec/securable-guest-memory.h | 2 ++ target/i386/sev.c | 2 ++ 3 files changed, 12 insertions(+) diff --git a/hw/core/machine.c b/hw/core/machine.c index 816ea3ae3e..a67a27d03c 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -1155,6 +1155,14 @@ void machine_run_board_init(MachineState *machine) } if (machine->sgm) { + /* + * Where securable guest memory is initialized depends on the + * specific mechanism in use. But, we need to make sure it's + * ready by now. If it isn't, that's a bug in the + * implementation of that sgm mechanism. 
+ */ + assert(machine->sgm->ready); + /* * With securable guest memory, the host can't see the real * contents of RAM, so there's no point in it trying to merge diff --git a/include/exec/securable-guest-memory.h b/include/exec/securable-guest-memory.h index 7325b504ba..20cf13777b 100644 --- a/include/exec/securable-guest-memory.h +++ b/include/exec/securable-guest-memory.h @@ -36,6 +36,8 @@ struct SecurableGuestMemory { Object parent; + + bool ready; }; typedef struct SecurableGuestMemoryClass { diff --git a/target/i386/sev.c b/target/i386/sev.c index 7333a60dc0..022ce5fc3a 100644 --- a/target/i386/sev.c +++ b/target/i386/sev.c @@ -701,6 +701,8 @@ int sev_kvm_init(SecurableGuestMemory *sgm, Error **errp) qemu_add_machine_init_done_notifier(&sev_machine_done_notify); qemu_add_vm_change_state_handler(sev_vm_state_change, sev); + sgm->ready = true; + return 0; err: sev_guest = NULL; -- 2.28.0
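To make the intent of the ready flag concrete, here is a small standalone sketch of the same idea outside of QEMU. All names (DemoSgm, DemoMachine, demo_mechanism_init(), demo_sgm_check_ready()) are hypothetical. One deliberate difference from the patch: the common check returns a bool instead of asserting, purely so the sketch can be exercised without aborting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the QEMU structures. */
typedef struct DemoSgm {
    bool ready;
} DemoSgm;

typedef struct DemoMachine {
    DemoSgm *sgm;   /* NULL when no mechanism was requested */
} DemoMachine;

/*
 * Mechanism-specific setup, called from wherever suits that
 * mechanism (arch init, machine init, ...).  It marks the object
 * ready only once everything has succeeded.
 */
static int demo_mechanism_init(DemoSgm *sgm)
{
    if (!sgm) {
        return 0;   /* nothing requested, nothing to do */
    }
    /* ... mechanism-specific work would go here ... */
    sgm->ready = true;
    return 0;
}

/*
 * Common, late check: if a mechanism was requested, some init path
 * must have marked it ready by now.  A false result indicates a bug
 * in that mechanism's initialization, not a user error.
 */
static bool demo_sgm_check_ready(const DemoMachine *machine)
{
    return !machine->sgm || machine->sgm->ready;
}
```

The check costs one line at a single common point, while the init calls themselves stay free to live wherever each platform needs them.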
While we've abstracted some (potential) differences between mechanisms for securing guest memory, the initialization is still specific to SEV. Given that, move it into x86's kvm_arch_init() code, rather than the generic kvm_init() code. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- accel/kvm/kvm-all.c | 14 -------------- target/i386/kvm.c | 12 ++++++++++++ target/i386/sev.c | 7 ++++++- 3 files changed, 18 insertions(+), 15 deletions(-) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index 724e9294d0..1b676da6c2 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -2178,20 +2178,6 @@ static int kvm_init(MachineState *ms) kvm_state = s; - /* - * if memory encryption object is specified then initialize the memory - * encryption context. - */ - if (ms->sgm) { - Error *local_err = NULL; - /* FIXME handle mechanisms other than SEV */ - ret = sev_kvm_init(ms->sgm, &local_err); - if (ret < 0) { - error_report_err(local_err); - goto err; - } - } - ret = kvm_arch_init(ms, s); if (ret < 0) { goto err; diff --git a/target/i386/kvm.c b/target/i386/kvm.c index a2934dda02..8e3617f3cd 100644 --- a/target/i386/kvm.c +++ b/target/i386/kvm.c @@ -42,6 +42,7 @@ #include "hw/i386/intel_iommu.h" #include "hw/i386/x86-iommu.h" #include "hw/i386/e820_memory_layout.h" +#include "sysemu/sev.h" #include "hw/pci/pci.h" #include "hw/pci/msi.h" @@ -2110,6 +2111,17 @@ int kvm_arch_init(MachineState *ms, KVMState *s) uint64_t shadow_mem; int ret; struct utsname utsname; + Error *local_err = NULL; + + /* + * if memory encryption object is specified then initialize the + * memory encryption context (no-op otherwise) + */ + ret = sev_kvm_init(ms->sgm, &local_err); + if (ret < 0) { + error_report_err(local_err); + return ret; + } if (!kvm_check_extension(s, KVM_CAP_IRQ_ROUTING)) { error_report("kvm: KVM_CAP_IRQ_ROUTING not supported by KVM"); diff --git a/target/i386/sev.c b/target/i386/sev.c index 022ce5fc3a..8c19f4aea6 100644 --- a/target/i386/sev.c +++ 
b/target/i386/sev.c @@ -628,13 +628,18 @@ sev_vm_state_change(void *opaque, int running, RunState state) int sev_kvm_init(SecurableGuestMemory *sgm, Error **errp) { - SevGuestState *sev = SEV_GUEST(sgm); + SevGuestState *sev + = (SevGuestState *)object_dynamic_cast(OBJECT(sgm), TYPE_SEV_GUEST); char *devname; int ret, fw_error; uint32_t ebx; uint32_t host_cbitpos; struct sev_user_data_status status = {}; + if (!sev) { + return 0; + } + ret = ram_block_discard_disable(true); if (ret) { error_report("%s: cannot disable RAM discard", __func__); -- 2.28.0
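The "call the init function unconditionally, let it return early if the object isn't ours" shape used by sev_kvm_init() here can be sketched standalone as below. This is only an approximation under stated assumptions: DemoObject, demo_dynamic_cast() and demo_sev_init() are invented names, and the cast only matches an exact type name, whereas QOM's object_dynamic_cast() also accepts subclasses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tagged object; far simpler than a QOM Object. */
typedef struct DemoObject {
    const char *type_name;
    int initialized;
} DemoObject;

/*
 * Returns obj if it has exactly the requested type, NULL otherwise.
 * NULL input is allowed and simply yields NULL.
 */
static DemoObject *demo_dynamic_cast(DemoObject *obj, const char *type_name)
{
    if (obj && strcmp(obj->type_name, type_name) == 0) {
        return obj;
    }
    return NULL;
}

/*
 * Safe to call from arch code without checking first: it is a no-op
 * (returning success) when no object was given or when the object
 * belongs to some other mechanism.
 */
static int demo_sev_init(DemoObject *sgm)
{
    DemoObject *sev = demo_dynamic_cast(sgm, "sev-guest");

    if (!sev) {
        return 0;
    }

    sev->initialized = 1;
    return 0;
}
```

This keeps the type test next to the code that depends on it, so the arch hook doesn't need to know which mechanisms exist.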
Some upcoming POWER machines have a system called PEF (Protected Execution Facility) which uses a small ultravisor to allow guests to run in a way that they can't be eavesdropped on by the hypervisor. The effect is roughly similar to AMD SEV, although the mechanisms are quite different. Most of the work of this is done between the guest, KVM and the ultravisor, with little need for involvement by qemu. However, qemu does need to tell KVM to allow secure VMs. Because the availability of secure mode is a guest visible difference which depends on having the right hardware and firmware, we don't enable this by default. In order to run a secure guest you need to create a "pef-guest" object and set the securable-guest-memory machine property to point to it. Note that this just *allows* secure guests; the architecture of PEF is such that the guest still needs to talk to the ultravisor to enter secure mode. Qemu has no direct way of knowing if the guest is in secure mode, and certainly can't know until well after machine creation time. 
To start a PEF-capable guest, use the command line options: -object pef-guest,id=pef0 -machine securable-guest-memory=pef0 Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Ram Pai <linuxram@us.ibm.com> --- hw/ppc/meson.build | 1 + hw/ppc/pef.c | 115 +++++++++++++++++++++++++++++++++++++++++++ hw/ppc/spapr.c | 10 ++++ include/hw/ppc/pef.h | 26 ++++++++++ target/ppc/kvm.c | 18 ------- target/ppc/kvm_ppc.h | 6 --- 6 files changed, 152 insertions(+), 24 deletions(-) create mode 100644 hw/ppc/pef.c create mode 100644 include/hw/ppc/pef.h diff --git a/hw/ppc/meson.build b/hw/ppc/meson.build index ffa2ec37fa..218631c883 100644 --- a/hw/ppc/meson.build +++ b/hw/ppc/meson.build @@ -27,6 +27,7 @@ ppc_ss.add(when: 'CONFIG_PSERIES', if_true: files( 'spapr_nvdimm.c', 'spapr_rtas_ddw.c', 'spapr_numa.c', + 'pef.c', )) ppc_ss.add(when: 'CONFIG_SPAPR_RNG', if_true: files('spapr_rng.c')) ppc_ss.add(when: ['CONFIG_PSERIES', 'CONFIG_LINUX'], if_true: files( diff --git a/hw/ppc/pef.c b/hw/ppc/pef.c new file mode 100644 index 0000000000..3ae3059cfe --- /dev/null +++ b/hw/ppc/pef.c @@ -0,0 +1,115 @@ +/* + * PEF (Protected Execution Facility) for POWER support + * + * Copyright David Gibson, Redhat Inc. 2020 + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" + +#include "qapi/error.h" +#include "qom/object_interfaces.h" +#include "sysemu/kvm.h" +#include "migration/blocker.h" +#include "exec/securable-guest-memory.h" +#include "hw/ppc/pef.h" + +#define TYPE_PEF_GUEST "pef-guest" +#define PEF_GUEST(obj) \ + OBJECT_CHECK(PefGuestState, (obj), TYPE_PEF_GUEST) + +typedef struct PefGuestState PefGuestState; + +/** + * PefGuestState: + * + * The PefGuestState object is used for creating and managing a PEF + * guest. 
+ * + * # $QEMU \ + * -object pef-guest,id=pef0 \ + * -machine ...,securable-guest-memory=pef0 + */ +struct PefGuestState { + Object parent_obj; +}; + +#ifdef CONFIG_KVM +static int kvmppc_svm_init(Error **errp) +{ + if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) { + error_setg(errp, + "KVM implementation does not support Secure VMs (is an ultravisor running?)"); + return -1; + } else { + int ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_PPC_SECURE_GUEST, 0, 1); + + if (ret < 0) { + error_setg(errp, + "Error enabling PEF with KVM"); + return -1; + } + } + + return 0; +} + +/* + * Don't set error if KVM_PPC_SVM_OFF ioctl is invoked on kernels + * that don't support this ioctl. + */ +void kvmppc_svm_off(Error **errp) +{ + int rc; + + if (!kvm_enabled()) { + return; + } + + rc = kvm_vm_ioctl(KVM_STATE(current_accel()), KVM_PPC_SVM_OFF); + if (rc && rc != -ENOTTY) { + error_setg_errno(errp, -rc, "KVM_PPC_SVM_OFF ioctl failed"); + } +} +#else +static int kvmppc_svm_init(Error **errp) +{ + g_assert_not_reached(); +} +#endif + +int pef_kvm_init(SecurableGuestMemory *sgm, Error **errp) +{ + if (!object_dynamic_cast(OBJECT(sgm), TYPE_PEF_GUEST)) { + return 0; + } + + if (!kvm_enabled()) { + error_setg(errp, "PEF requires KVM"); + return -1; + } + + return kvmppc_svm_init(errp); +} + +static const TypeInfo pef_guest_info = { + .parent = TYPE_OBJECT, + .name = TYPE_PEF_GUEST, + .instance_size = sizeof(PefGuestState), + .interfaces = (InterfaceInfo[]) { + { TYPE_SECURABLE_GUEST_MEMORY }, + { TYPE_USER_CREATABLE }, + { } + } +}; + +static void +pef_register_types(void) +{ + type_register_static(&pef_guest_info); +} + +type_init(pef_register_types); diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 12a012d9dd..d95b60f712 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -82,6 +82,7 @@ #include "hw/ppc/spapr_tpm_proxy.h" #include "hw/ppc/spapr_nvdimm.h" #include "hw/ppc/spapr_numa.h" +#include "hw/ppc/pef.h" #include "monitor/monitor.h" @@ -2665,6 +2666,15 @@ 
static void spapr_machine_init(MachineState *machine) long load_limit, fw_size; char *filename; Error *resize_hpt_err = NULL; + Error *local_err = NULL; + + /* + * if Secure VM (PEF) support is configured, then initialize it + */ + if (pef_kvm_init(machine->sgm, &local_err) < 0) { + error_report_err(local_err); + exit(1); + } msi_nonbroken = true; diff --git a/include/hw/ppc/pef.h b/include/hw/ppc/pef.h new file mode 100644 index 0000000000..63c3475859 --- /dev/null +++ b/include/hw/ppc/pef.h @@ -0,0 +1,26 @@ +/* + * PEF (Protected Execution Facility) for POWER support + * + * Copyright David Gibson, Redhat Inc. 2020 + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef HW_PPC_PEF_H +#define HW_PPC_PEF_H + +int pef_kvm_init(SecurableGuestMemory *sgm, Error **errp); + +#ifdef CONFIG_KVM +void kvmppc_svm_off(Error **errp); +#else +static inline void kvmppc_svm_off(Error **errp) +{ +} +#endif + + +#endif /* HW_PPC_PEF_H */ + diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c index daf690a678..0c5056dd5b 100644 --- a/target/ppc/kvm.c +++ b/target/ppc/kvm.c @@ -2929,21 +2929,3 @@ void kvmppc_set_reg_tb_offset(PowerPCCPU *cpu, int64_t tb_offset) kvm_set_one_reg(cs, KVM_REG_PPC_TB_OFFSET, &tb_offset); } } - -/* - * Don't set error if KVM_PPC_SVM_OFF ioctl is invoked on kernels - * that don't support this ioctl. 
- */ -void kvmppc_svm_off(Error **errp) -{ - int rc; - - if (!kvm_enabled()) { - return; - } - - rc = kvm_vm_ioctl(KVM_STATE(current_accel()), KVM_PPC_SVM_OFF); - if (rc && rc != -ENOTTY) { - error_setg_errno(errp, -rc, "KVM_PPC_SVM_OFF ioctl failed"); - } -} diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h index 73ce2bc951..989f61ace0 100644 --- a/target/ppc/kvm_ppc.h +++ b/target/ppc/kvm_ppc.h @@ -39,7 +39,6 @@ int kvmppc_booke_watchdog_enable(PowerPCCPU *cpu); target_ulong kvmppc_configure_v3_mmu(PowerPCCPU *cpu, bool radix, bool gtse, uint64_t proc_tbl); -void kvmppc_svm_off(Error **errp); #ifndef CONFIG_USER_ONLY bool kvmppc_spapr_use_multitce(void); int kvmppc_spapr_enable_inkernel_multitce(void); @@ -216,11 +215,6 @@ static inline target_ulong kvmppc_configure_v3_mmu(PowerPCCPU *cpu, return 0; } -static inline void kvmppc_svm_off(Error **errp) -{ - return; -} - static inline void kvmppc_set_reg_ppc_online(PowerPCCPU *cpu, unsigned int online) { -- 2.28.0
We haven't yet implemented the fairly involved handshaking that will be needed to migrate PEF protected guests. For now, just use a migration blocker so we get a meaningful error if someone attempts this (this is the same approach used by AMD SEV). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- hw/ppc/pef.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/hw/ppc/pef.c b/hw/ppc/pef.c index 3ae3059cfe..edc3e744ba 100644 --- a/hw/ppc/pef.c +++ b/hw/ppc/pef.c @@ -38,7 +38,11 @@ struct PefGuestState { }; #ifdef CONFIG_KVM +static Error *pef_mig_blocker; + static int kvmppc_svm_init(Error **errp) + +int kvmppc_svm_init(SecurableGuestMemory *sgm, Error **errp) { if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) { error_setg(errp, @@ -54,6 +58,11 @@ static int kvmppc_svm_init(Error **errp) } } + /* add migration blocker */ + error_setg(&pef_mig_blocker, "PEF: Migration is not implemented"); + /* NB: This can fail if --only-migratable is used */ + migrate_add_blocker(pef_mig_blocker, &error_fatal); + return 0; } -- 2.28.0
The default behaviour for virtio devices is not to use the platforms normal DMA paths, but instead to use the fact that it's running in a hypervisor to directly access guest memory. That doesn't work if the guest's memory is protected from hypervisor access, such as with AMD's SEV or POWER's PEF. So, if a securable guest memory mechanism is enabled, then apply the iommu_platform=on option so it will go through normal DMA mechanisms. Those will presumably have some way of marking memory as shared with the hypervisor or hardware so that DMA will work. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> --- hw/core/machine.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/hw/core/machine.c b/hw/core/machine.c index a67a27d03c..d16273d75d 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -28,6 +28,8 @@ #include "hw/mem/nvdimm.h" #include "migration/vmstate.h" #include "exec/securable-guest-memory.h" +#include "hw/virtio/virtio.h" +#include "hw/virtio/virtio-pci.h" GlobalProperty hw_compat_5_1[] = { { "vhost-scsi", "num_queues", "1"}, @@ -1169,6 +1171,17 @@ void machine_run_board_init(MachineState *machine) * areas. */ machine_set_mem_merge(OBJECT(machine), false, &error_abort); + + /* + * Virtio devices can't count on directly accessing guest + * memory, so they need iommu_platform=on to use normal DMA + * mechanisms. That requires also disabling legacy virtio + * support for those virtio pci devices which allow it. + */ + object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy", + "on", true); + object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform", + "on", false); } machine_class->init(machine); -- 2.28.0
At least some s390 cpu models support "Protected Virtualization" (PV), a mechanism to protect guests from eavesdropping by a compromised hypervisor. This is similar in function to other mechanisms like AMD's SEV and POWER's PEF, which are controlled by the "securable-guest-memory" machine option. s390 is a slightly special case, because we already supported PV, simply by using a CPU model with the required feature (S390_FEAT_UNPACK). To integrate this with the option used by other platforms, we implement the following compromise: - When the securable-guest-memory option is set, s390 will recognize it, verify that the CPU can support PV (failing if not) and set virtio default options necessary for encrypted or protected guests, as on other platforms. i.e. if securable-guest-memory is set, we will either create a guest capable of entering PV mode, or fail outright - If securable-guest-memory is not set, guests might still be able to enter PV mode, if the CPU has the right model. This may be a little surprising, but shouldn't actually be harmful. 
To start a guest supporting Protected Virtualization using the new option use the command line arguments: -object s390-pv-guest,id=pv0 -machine securable-guest-memory=pv0 Signed-off-by: David Gibson <david@gibson.dropbear.id.au> --- hw/s390x/pv.c | 58 +++++++++++++++++++++++++++++++++++++++++++ include/hw/s390x/pv.h | 1 + target/s390x/kvm.c | 3 +++ 3 files changed, 62 insertions(+) diff --git a/hw/s390x/pv.c b/hw/s390x/pv.c index ab3a2482aa..9fddc196a3 100644 --- a/hw/s390x/pv.c +++ b/hw/s390x/pv.c @@ -14,8 +14,11 @@ #include <linux/kvm.h> #include "cpu.h" +#include "qapi/error.h" #include "qemu/error-report.h" #include "sysemu/kvm.h" +#include "qom/object_interfaces.h" +#include "exec/securable-guest-memory.h" #include "hw/s390x/ipl.h" #include "hw/s390x/pv.h" @@ -111,3 +114,58 @@ void s390_pv_inject_reset_error(CPUState *cs) /* Report that we are unable to enter protected mode */ env->regs[r1 + 1] = DIAG_308_RC_INVAL_FOR_PV; } + +#define TYPE_S390_PV_GUEST "s390-pv-guest" +#define S390_PV_GUEST(obj) \ + OBJECT_CHECK(S390PVGuestState, (obj), TYPE_S390_PV_GUEST) + +typedef struct S390PVGuestState S390PVGuestState; + +/** + * S390PVGuestState: + * + * The S390PVGuestState object is basically a dummy used to tell the + * securable guest memory system to use s390's PV mechanism. 
+ * + * # $QEMU \ + * -object s390-pv-guest,id=pv0 \ + * -machine ...,securable-guest-memory=pv0 + */ +struct S390PVGuestState { + Object parent_obj; +}; + +int s390_pv_init(SecurableGuestMemory *sgm, Error **errp) +{ + if (!object_dynamic_cast(OBJECT(sgm), TYPE_S390_PV_GUEST)) { + return 0; + } + + if (!s390_has_feat(S390_FEAT_UNPACK)) { + error_setg(errp, + "CPU model does not support Protected Virtualization"); + return -1; + } + + sgm->ready = true; + + return 0; +} + +static const TypeInfo s390_pv_guest_info = { + .parent = TYPE_SECURABLE_GUEST_MEMORY, + .name = TYPE_S390_PV_GUEST, + .instance_size = sizeof(S390PVGuestState), + .interfaces = (InterfaceInfo[]) { + { TYPE_USER_CREATABLE }, + { } + } +}; + +static void +s390_pv_register_types(void) +{ + type_register_static(&s390_pv_guest_info); +} + +type_init(s390_pv_register_types); diff --git a/include/hw/s390x/pv.h b/include/hw/s390x/pv.h index aee758bc2d..4250af699b 100644 --- a/include/hw/s390x/pv.h +++ b/include/hw/s390x/pv.h @@ -43,6 +43,7 @@ void s390_pv_prep_reset(void); int s390_pv_verify(void); void s390_pv_unshare(void); void s390_pv_inject_reset_error(CPUState *cs); +int s390_pv_init(SecurableGuestMemory *sgm, Error **errp); #else /* CONFIG_KVM */ static inline bool s390_is_pv(void) { return false; } static inline int s390_pv_vm_enable(void) { return 0; } diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c index b8385e6b95..3383487463 100644 --- a/target/s390x/kvm.c +++ b/target/s390x/kvm.c @@ -387,6 +387,9 @@ int kvm_arch_init(MachineState *ms, KVMState *s) } kvm_set_max_memslot_size(KVM_SLOT_MAX_BYTES); + + s390_pv_init(ms->sgm, &error_fatal); + return 0; } -- 2.28.0
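The compromise described in the commit message boils down to a two-input decision, which can be written out as a small standalone sketch. DemoPvOutcome and demo_pv_outcome() are hypothetical names used only for this illustration; the real code is s390_pv_init() plus the generic virtio defaults, and this sketch only restates the outcomes listed in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the four cases; not a QEMU type. */
typedef enum DemoPvOutcome {
    DEMO_PV_STARTUP_FAILS,          /* option set, CPU model lacks the feature */
    DEMO_PV_CAPABLE_WITH_DEFAULTS,  /* option set, CPU model has the feature */
    DEMO_PV_CAPABLE_NO_DEFAULTS,    /* option unset, CPU model has the feature */
    DEMO_PV_NOT_CAPABLE             /* option unset, CPU model lacks the feature */
} DemoPvOutcome;

/*
 * option_set:     securable-guest-memory points at an s390-pv-guest
 * cpu_has_unpack: the CPU model includes the unpack feature
 */
static DemoPvOutcome demo_pv_outcome(bool option_set, bool cpu_has_unpack)
{
    if (option_set) {
        /* Either we get a PV-capable guest with virtio defaults, or we fail. */
        return cpu_has_unpack ? DEMO_PV_CAPABLE_WITH_DEFAULTS
                              : DEMO_PV_STARTUP_FAILS;
    }

    /* Without the option nothing is verified and no defaults change. */
    return cpu_has_unpack ? DEMO_PV_CAPABLE_NO_DEFAULTS
                          : DEMO_PV_NOT_CAPABLE;
}
```

The "a little surprising" case from the commit message is DEMO_PV_CAPABLE_NO_DEFAULTS: the guest can still enter PV mode, but the virtio defaults have not been adjusted for it.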
On 04.12.20 06:44, David Gibson wrote:
> A number of hardware platforms are implementing mechanisms whereby the
> hypervisor does not have unfettered access to guest memory, in order
> to mitigate the security impact of a compromised hypervisor.
>
> AMD's SEV implements this with in-cpu memory encryption, and Intel has
> its own memory encryption mechanism. POWER has an upcoming mechanism
> to accomplish this in a different way, using a new memory protection
> level plus a small trusted ultravisor. s390 also has a protected
> execution environment.
>
> The current code (committed or draft) for these features has each
> platform's version configured entirely differently. That doesn't seem
> ideal for users, or particularly for management layers.
>
> AMD SEV introduces a notionally generic machine option
> "machine-encryption", but it doesn't actually cover any cases other
> than SEV.
>
> This series is a proposal to at least partially unify configuration
> for these mechanisms, by renaming and generalizing AMD's
> "memory-encryption" property. It is replaced by a
> "securable-guest-memory" property pointing to a platform specific
Can we do "securable-guest" ?
s390x also protects registers and integrity. memory is only one piece
of the puzzle and what we protect might differ from platform to
platform.
On 04.12.20 06:44, David Gibson wrote:
> The default behaviour for virtio devices is not to use the platforms normal
> DMA paths, but instead to use the fact that it's running in a hypervisor
> to directly access guest memory. That doesn't work if the guest's memory
> is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.
>
> So, if a securable guest memory mechanism is enabled, then apply the
> iommu_platform=on option so it will go through normal DMA mechanisms.
> Those will presumably have some way of marking memory as shared with
> the hypervisor or hardware so that DMA will work.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> hw/core/machine.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index a67a27d03c..d16273d75d 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -28,6 +28,8 @@
> #include "hw/mem/nvdimm.h"
> #include "migration/vmstate.h"
> #include "exec/securable-guest-memory.h"
> +#include "hw/virtio/virtio.h"
> +#include "hw/virtio/virtio-pci.h"
>
> GlobalProperty hw_compat_5_1[] = {
> { "vhost-scsi", "num_queues", "1"},
> @@ -1169,6 +1171,17 @@ void machine_run_board_init(MachineState *machine)
> * areas.
> */
> machine_set_mem_merge(OBJECT(machine), false, &error_abort);
> +
> + /*
> + * Virtio devices can't count on directly accessing guest
> + * memory, so they need iommu_platform=on to use normal DMA
> + * mechanisms. That requires also disabling legacy virtio
> + * support for those virtio pci devices which allow it.
> + */
> + object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy",
> + "on", true);
> + object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform",
> + "on", false);
I have not followed all the history (sorry). Should we also set iommu_platform
for virtio-ccw? Halil?
On Fri, 4 Dec 2020 09:10:36 +0100
Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> On 04.12.20 06:44, David Gibson wrote:
> > The default behaviour for virtio devices is not to use the platforms normal
> > DMA paths, but instead to use the fact that it's running in a hypervisor
> > to directly access guest memory. That doesn't work if the guest's memory
> > is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.
> >
> > So, if a securable guest memory mechanism is enabled, then apply the
> > iommu_platform=on option so it will go through normal DMA mechanisms.
> > Those will presumably have some way of marking memory as shared with
> > the hypervisor or hardware so that DMA will work.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> > hw/core/machine.c | 13 +++++++++++++
> > 1 file changed, 13 insertions(+)
> >
> > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > index a67a27d03c..d16273d75d 100644
> > --- a/hw/core/machine.c
> > +++ b/hw/core/machine.c
> > @@ -28,6 +28,8 @@
> > #include "hw/mem/nvdimm.h"
> > #include "migration/vmstate.h"
> > #include "exec/securable-guest-memory.h"
> > +#include "hw/virtio/virtio.h"
> > +#include "hw/virtio/virtio-pci.h"
> >
> > GlobalProperty hw_compat_5_1[] = {
> > { "vhost-scsi", "num_queues", "1"},
> > @@ -1169,6 +1171,17 @@ void machine_run_board_init(MachineState *machine)
> > * areas.
> > */
> > machine_set_mem_merge(OBJECT(machine), false, &error_abort);
> > +
> > + /*
> > + * Virtio devices can't count on directly accessing guest
> > + * memory, so they need iommu_platform=on to use normal DMA
> > + * mechanisms. That requires also disabling legacy virtio
> > + * support for those virtio pci devices which allow it.
> > + */
> > + object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy",
> > + "on", true);
> > + object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform",
> > + "on", false);
>
> I have not followed all the history (sorry). Should we also set iommu_platform
> for virtio-ccw? Halil?
>
That line should add iommu_platform for all virtio devices, shouldn't
it?
On 04.12.20 09:17, Cornelia Huck wrote:
> On Fri, 4 Dec 2020 09:10:36 +0100
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>
>> On 04.12.20 06:44, David Gibson wrote:
>>> The default behaviour for virtio devices is not to use the platforms normal
>>> DMA paths, but instead to use the fact that it's running in a hypervisor
>>> to directly access guest memory. That doesn't work if the guest's memory
>>> is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.
>>>
>>> So, if a securable guest memory mechanism is enabled, then apply the
>>> iommu_platform=on option so it will go through normal DMA mechanisms.
>>> Those will presumably have some way of marking memory as shared with
>>> the hypervisor or hardware so that DMA will work.
>>>
>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>> ---
>>> hw/core/machine.c | 13 +++++++++++++
>>> 1 file changed, 13 insertions(+)
>>>
>>> diff --git a/hw/core/machine.c b/hw/core/machine.c
>>> index a67a27d03c..d16273d75d 100644
>>> --- a/hw/core/machine.c
>>> +++ b/hw/core/machine.c
>>> @@ -28,6 +28,8 @@
>>> #include "hw/mem/nvdimm.h"
>>> #include "migration/vmstate.h"
>>> #include "exec/securable-guest-memory.h"
>>> +#include "hw/virtio/virtio.h"
>>> +#include "hw/virtio/virtio-pci.h"
>>>
>>> GlobalProperty hw_compat_5_1[] = {
>>> { "vhost-scsi", "num_queues", "1"},
>>> @@ -1169,6 +1171,17 @@ void machine_run_board_init(MachineState *machine)
>>> * areas.
>>> */
>>> machine_set_mem_merge(OBJECT(machine), false, &error_abort);
>>> +
>>> + /*
>>> + * Virtio devices can't count on directly accessing guest
>>> + * memory, so they need iommu_platform=on to use normal DMA
>>> + * mechanisms. That requires also disabling legacy virtio
>>> + * support for those virtio pci devices which allow it.
>>> + */
>>> + object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy",
>>> + "on", true);
>>> + object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform",
>>> + "on", false);
>>
>> I have not followed all the history (sorry). Should we also set iommu_platform
>> for virtio-ccw? Halil?
>>
>
> That line should add iommu_platform for all virtio devices, shouldn't
> it?
Yes, sorry. Was misreading that with the line above.
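For reference, the behaviour being discussed (a default registered against a base type reaching every subtype, with the optional flag deciding what happens when a matching device lacks the property) can be sketched standalone roughly as follows. This is not QOM's implementation; DemoType, DemoResult, demo_type_is_a(), demo_type_has_prop() and demo_apply_default() are hypothetical, and the type hierarchy is left to the caller rather than modelling the real virtio classes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-simplified type description; not QOM's TypeInfo. */
typedef struct DemoType {
    const char *name;
    const struct DemoType *parent;   /* NULL for a root type */
    const char *const *props;        /* NULL-terminated list, may be NULL */
} DemoType;

typedef enum DemoResult {
    DEMO_SKIPPED,   /* default does not concern this device */
    DEMO_APPLIED,   /* default was applied */
    DEMO_ERROR      /* device matches the type but lacks the property */
} DemoResult;

/* True if t is the named type or inherits from it. */
static bool demo_type_is_a(const DemoType *t, const char *name)
{
    for (; t; t = t->parent) {
        if (strcmp(t->name, name) == 0) {
            return true;
        }
    }
    return false;
}

/* True if t or any of its ancestors declares the property. */
static bool demo_type_has_prop(const DemoType *t, const char *prop)
{
    for (; t; t = t->parent) {
        for (size_t i = 0; t->props && t->props[i]; i++) {
            if (strcmp(t->props[i], prop) == 0) {
                return true;
            }
        }
    }
    return false;
}

/*
 * Apply a default registered for type "target" to device type "dev".
 * With optional == true, a matching device that lacks the property is
 * silently skipped instead of being treated as an error.
 */
static DemoResult demo_apply_default(const DemoType *dev, const char *target,
                                     const char *prop, bool optional)
{
    if (!demo_type_is_a(dev, target)) {
        return DEMO_SKIPPED;
    }
    if (!demo_type_has_prop(dev, prop)) {
        return optional ? DEMO_SKIPPED : DEMO_ERROR;
    }
    return DEMO_APPLIED;
}
```

In these terms, a default on the base type applies to every subtype regardless of transport, which matches Cornelia's reading; the optional flag only matters for properties that some subtypes do not have.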
On Fri, Dec 04, 2020 at 04:44:02PM +1100, David Gibson wrote:
> A number of hardware platforms are implementing mechanisms whereby the
> hypervisor does not have unfettered access to guest memory, in order
> to mitigate the security impact of a compromised hypervisor.
>
> AMD's SEV implements this with in-cpu memory encryption, and Intel has
> its own memory encryption mechanism. POWER has an upcoming mechanism
> to accomplish this in a different way, using a new memory protection
> level plus a small trusted ultravisor. s390 also has a protected
> execution environment.
>
> The current code (committed or draft) for these features has each
> platform's version configured entirely differently. That doesn't seem
> ideal for users, or particularly for management layers.
>
> AMD SEV introduces a notionally generic machine option
> "machine-encryption", but it doesn't actually cover any cases other
> than SEV.
>
> This series is a proposal to at least partially unify configuration
> for these mechanisms, by renaming and generalizing AMD's
> "memory-encryption" property. It is replaced by a
> "securable-guest-memory" property pointing to a platform specific
> object which configures and manages the specific details.
There's no docs updated or added in this series.
docs/amd-memory-encryption.txt needs an update at least, and
there ought to be a doc added describing how this series is
to be used for s390/ppc
> accel/kvm/kvm-all.c                   |  39 +------
> accel/kvm/sev-stub.c                  |  10 +-
> accel/stubs/kvm-stub.c                |  10 --
> backends/meson.build                  |   1 +
> backends/securable-guest-memory.c     |  30 +++++
> hw/core/machine.c                     |  71 ++++++++++--
> hw/i386/pc_sysfw.c                    |   6 +-
> hw/ppc/meson.build                    |   1 +
> hw/ppc/pef.c                          | 124 +++++++++++++++++++++
> hw/ppc/spapr.c                        |  10 ++
> hw/s390x/pv.c                         |  58 ++++++++++
> include/exec/securable-guest-memory.h |  86 +++++++++++++++
> include/hw/boards.h                   |   2 +-
> include/hw/ppc/pef.h                  |  26 +++++
> include/hw/s390x/pv.h                 |   1 +
> include/qemu/typedefs.h               |   1 +
> include/qom/object.h                  |   3 +-
> include/sysemu/kvm.h                  |  17 ---
> include/sysemu/sev.h                  |   5 +-
> qom/object.c                          |   4 +-
> softmmu/vl.c                          |  16 ++-
> target/i386/kvm.c                     |  12 ++
> target/i386/monitor.c                 |   1 -
> target/i386/sev.c                     | 153 ++++++++++++--------------
> target/ppc/kvm.c                      |  18 ---
> target/ppc/kvm_ppc.h                  |   6 -
> target/s390x/kvm.c                    |   3 +
> 27 files changed, 510 insertions(+), 204 deletions(-)
> create mode 100644 backends/securable-guest-memory.c
> create mode 100644 hw/ppc/pef.c
> create mode 100644 include/exec/securable-guest-memory.h
> create mode 100644 include/hw/ppc/pef.h
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Fri, 4 Dec 2020 16:44:03 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> From: Greg Kurz <groug@kaod.org>
>
> Global properties have an @optional field, which allows to apply a given
> property to a given type even if one of its subclasses doesn't support
> it. This is especially used in the compat code when dealing with the
> "disable-modern" and "disable-legacy" properties and the "virtio-pci"
> type.
>
> Allow object_register_sugar_prop() to set this field as well.
>
> Signed-off-by: Greg Kurz <groug@kaod.org>
> Message-Id: <159738953558.377274.16617742952571083440.stgit@bahia.lan>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> include/qom/object.h | 3 ++-
> qom/object.c | 4 +++-
> softmmu/vl.c | 16 ++++++++++------
> 3 files changed, 15 insertions(+), 8 deletions(-)
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
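For readers following along, the effect of the @optional field described in the commit message can be sketched with a small standalone model. This is not QEMU code: every name below (ToyType, ToyGlobalProp, toy_apply_global, the sample types) is hypothetical and only illustrates the rule that an optional global property is silently skipped on a subclass that lacks the property, while a non-optional one is reported as an error.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a QOM type: a parent link plus the properties it adds.
 * All names here are hypothetical; this is a model, not QEMU's API. */
typedef struct ToyType {
    const char *name;
    const struct ToyType *parent;
    const char *props[4];          /* NULL-terminated list of property names */
} ToyType;

/* Toy stand-in for a global (sugar) property registration. */
typedef struct ToyGlobalProp {
    const char *driver;            /* type the property is registered against */
    const char *property;
    const char *value;
    bool optional;                 /* skip quietly if the property is absent */
} ToyGlobalProp;

typedef enum { TOY_APPLIED, TOY_SKIPPED, TOY_ERROR } ToyResult;

/* True if type t is the named driver type or one of its descendants. */
static bool toy_is_a(const ToyType *t, const char *driver)
{
    for (; t != NULL; t = t->parent) {
        if (strcmp(t->name, driver) == 0) {
            return true;
        }
    }
    return false;
}

/* True if t or any of its ancestors declares the property. */
static bool toy_has_prop(const ToyType *t, const char *prop)
{
    for (; t != NULL; t = t->parent) {
        for (size_t i = 0; i < 4 && t->props[i] != NULL; i++) {
            if (strcmp(t->props[i], prop) == 0) {
                return true;
            }
        }
    }
    return false;
}

/* Decide what happens when global property g meets an object of type t. */
static ToyResult toy_apply_global(const ToyType *t, const ToyGlobalProp *g)
{
    if (!toy_is_a(t, g->driver)) {
        return TOY_SKIPPED;        /* registration does not target this type */
    }
    if (!toy_has_prop(t, g->property)) {
        return g->optional ? TOY_SKIPPED : TOY_ERROR;
    }
    return TOY_APPLIED;            /* a real implementation would set g->value */
}

/* Sample hierarchy: one subclass has "disable-legacy", the other does not. */
static const ToyType toy_virtio_pci = { "virtio-pci", NULL, { NULL } };
static const ToyType toy_transitional = {
    "toy-transitional-pci", &toy_virtio_pci, { "disable-legacy" }
};
static const ToyType toy_modern_only = {
    "toy-modern-only-pci", &toy_virtio_pci, { NULL }
};
```

In this model, registering "disable-legacy" against "virtio-pci" with optional set is harmless for the modern-only subclass, which is the situation the commit message describes for the compat code.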
On Fri, 4 Dec 2020 09:06:50 +0100
Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> On 04.12.20 06:44, David Gibson wrote:
> > A number of hardware platforms are implementing mechanisms whereby the
> > hypervisor does not have unfettered access to guest memory, in order
> > to mitigate the security impact of a compromised hypervisor.
> >
> > AMD's SEV implements this with in-cpu memory encryption, and Intel has
> > its own memory encryption mechanism. POWER has an upcoming mechanism
> > to accomplish this in a different way, using a new memory protection
> > level plus a small trusted ultravisor. s390 also has a protected
> > execution environment.
> >
> > The current code (committed or draft) for these features has each
> > platform's version configured entirely differently. That doesn't seem
> > ideal for users, or particularly for management layers.
> >
> > AMD SEV introduces a notionally generic machine option
> > "machine-encryption", but it doesn't actually cover any cases other
> > than SEV.
> >
> > This series is a proposal to at least partially unify configuration
> > for these mechanisms, by renaming and generalizing AMD's
> > "memory-encryption" property. It is replaced by a
> > "securable-guest-memory" property pointing to a platform specific
>
> Can we do "securable-guest" ?
> s390x also protects registers and integrity. memory is only one piece
> of the puzzle and what we protect might differ from platform to
> platform.
>
I agree. Even technologies that currently only do memory encryption may
be enhanced with more protections later.
* Cornelia Huck (cohuck@redhat.com) wrote:
> On Fri, 4 Dec 2020 09:06:50 +0100
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>
> > On 04.12.20 06:44, David Gibson wrote:
> > > A number of hardware platforms are implementing mechanisms whereby the
> > > hypervisor does not have unfettered access to guest memory, in order
> > > to mitigate the security impact of a compromised hypervisor.
> > >
> > > AMD's SEV implements this with in-cpu memory encryption, and Intel has
> > > its own memory encryption mechanism. POWER has an upcoming mechanism
> > > to accomplish this in a different way, using a new memory protection
> > > level plus a small trusted ultravisor. s390 also has a protected
> > > execution environment.
> > >
> > > The current code (committed or draft) for these features has each
> > > platform's version configured entirely differently. That doesn't seem
> > > ideal for users, or particularly for management layers.
> > >
> > > AMD SEV introduces a notionally generic machine option
> > > "machine-encryption", but it doesn't actually cover any cases other
> > > than SEV.
> > >
> > > This series is a proposal to at least partially unify configuration
> > > for these mechanisms, by renaming and generalizing AMD's
> > > "memory-encryption" property. It is replaced by a
> > > "securable-guest-memory" property pointing to a platform specific
> >
> > Can we do "securable-guest" ?
> > s390x also protects registers and integrity. memory is only one piece
> > of the puzzle and what we protect might differ from platform to
> > platform.
> >
>
> I agree. Even technologies that currently only do memory encryption may
> be enhanced with more protections later.
There's already SEV-ES patches onlist for this on the SEV side.
<sigh on haggling over the name>
Perhaps 'confidential guest' is actually what we need, since the
marketing folks seem to have started labelling this whole idea
'confidential computing'.
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Fri, 4 Dec 2020 16:44:05 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> At the moment AMD SEV sets a special function pointer, plus an opaque
> handle in KVMState to let things know how to encrypt guest memory.
>
> Now that we have a QOM interface for handling things related to securable
> guest memory, use a QOM method on that interface, rather than a bare
> function pointer for this.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/kvm/kvm-all.c | 36 +++++---
> accel/kvm/sev-stub.c | 9 +-
> include/exec/securable-guest-memory.h | 2 +
> include/sysemu/sev.h | 5 +-
> target/i386/monitor.c | 1 -
> target/i386/sev.c | 116 ++++++++++----------------
> 6 files changed, 77 insertions(+), 92 deletions(-)
>
> @@ -224,7 +224,7 @@ int kvm_get_max_memslots(void)
>
> bool kvm_memcrypt_enabled(void)
> {
> - if (kvm_state && kvm_state->memcrypt_handle) {
> + if (kvm_state && kvm_state->sgm) {
If we want to generalize the concept, maybe check for encrypt_data in
sgm here? There's probably room for different callbacks in the sgm
structure.
> return true;
> }
>
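The suggestion above could look roughly like the following standalone sketch. It is an assumption about the shape of the idea, not the series' actual code: ToySgm, ToySgmOps, ToyKvmState and toy_memcrypt_enabled are hypothetical names, and the only thing taken from the thread is the notion of an optional encrypt_data callback whose presence, rather than the mere presence of an sgm object, decides whether memory encryption is considered enabled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ToySgm ToySgm;

/* Hypothetical per-mechanism callback table; each callback is optional. */
typedef struct ToySgmOps {
    int (*encrypt_data)(ToySgm *sgm, uint8_t *ptr, uint64_t len);
} ToySgmOps;

struct ToySgm {
    const ToySgmOps *ops;
};

/* Hypothetical stand-in for the accelerator state holding the sgm pointer. */
typedef struct ToyKvmState {
    ToySgm *sgm;
} ToyKvmState;

/*
 * Memory encryption counts as enabled only when a securable guest memory
 * object is configured AND it actually provides an encrypt_data callback.
 * A mechanism based purely on access control would leave it NULL.
 */
static bool toy_memcrypt_enabled(const ToyKvmState *s)
{
    return s != NULL
        && s->sgm != NULL
        && s->sgm->ops != NULL
        && s->sgm->ops->encrypt_data != NULL;
}

/* Placeholder callback for the example; it does not encrypt anything. */
static int toy_noop_encrypt(ToySgm *sgm, uint8_t *ptr, uint64_t len)
{
    (void)sgm;
    (void)ptr;
    (void)len;
    return 0;
}

static const ToySgmOps toy_encrypting_ops = { toy_noop_encrypt };
static const ToySgmOps toy_access_control_ops = { NULL };
```

Keeping the check on the callback leaves room for further optional callbacks in the same table without changing what "memcrypt enabled" means for mechanisms that do not encrypt.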
On Fri, 4 Dec 2020 13:07:27 +0000
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Cornelia Huck (cohuck@redhat.com) wrote:
> > On Fri, 4 Dec 2020 09:06:50 +0100
> > Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> >
> > > On 04.12.20 06:44, David Gibson wrote:
> > > > A number of hardware platforms are implementing mechanisms whereby the
> > > > hypervisor does not have unfettered access to guest memory, in order
> > > > to mitigate the security impact of a compromised hypervisor.
> > > >
> > > > AMD's SEV implements this with in-cpu memory encryption, and Intel has
> > > > its own memory encryption mechanism. POWER has an upcoming mechanism
> > > > to accomplish this in a different way, using a new memory protection
> > > > level plus a small trusted ultravisor. s390 also has a protected
> > > > execution environment.
> > > >
> > > > The current code (committed or draft) for these features has each
> > > > platform's version configured entirely differently. That doesn't seem
> > > > ideal for users, or particularly for management layers.
> > > >
> > > > AMD SEV introduces a notionally generic machine option
> > > > "machine-encryption", but it doesn't actually cover any cases other
> > > > than SEV.
> > > >
> > > > This series is a proposal to at least partially unify configuration
> > > > for these mechanisms, by renaming and generalizing AMD's
> > > > "memory-encryption" property. It is replaced by a
> > > > "securable-guest-memory" property pointing to a platform specific
> > >
> > > Can we do "securable-guest" ?
> > > s390x also protects registers and integrity. memory is only one piece
> > > of the puzzle and what we protect might differ from platform to
> > > platform.
> > >
> >
> > I agree. Even technologies that currently only do memory encryption may
> > be enhanced with more protections later.
>
> There's already SEV-ES patches onlist for this on the SEV side.
>
> <sigh on haggling over the name>
>
> Perhaps 'confidential guest' is actually what we need, since the
> marketing folks seem to have started labelling this whole idea
> 'confidential computing'.
It's more like a 'possibly confidential guest', though.
On Fri, Dec 04, 2020 at 01:07:27PM +0000, Dr. David Alan Gilbert wrote:
> * Cornelia Huck (cohuck@redhat.com) wrote:
> > On Fri, 4 Dec 2020 09:06:50 +0100
> > Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> >
> > > On 04.12.20 06:44, David Gibson wrote:
> > > > A number of hardware platforms are implementing mechanisms whereby the
> > > > hypervisor does not have unfettered access to guest memory, in order
> > > > to mitigate the security impact of a compromised hypervisor.
> > > >
> > > > AMD's SEV implements this with in-cpu memory encryption, and Intel has
> > > > its own memory encryption mechanism. POWER has an upcoming mechanism
> > > > to accomplish this in a different way, using a new memory protection
> > > > level plus a small trusted ultravisor. s390 also has a protected
> > > > execution environment.
> > > >
> > > > The current code (committed or draft) for these features has each
> > > > platform's version configured entirely differently. That doesn't seem
> > > > ideal for users, or particularly for management layers.
> > > >
> > > > AMD SEV introduces a notionally generic machine option
> > > > "machine-encryption", but it doesn't actually cover any cases other
> > > > than SEV.
> > > >
> > > > This series is a proposal to at least partially unify configuration
> > > > for these mechanisms, by renaming and generalizing AMD's
> > > > "memory-encryption" property. It is replaced by a
> > > > "securable-guest-memory" property pointing to a platform specific
> > >
> > > Can we do "securable-guest" ?
> > > s390x also protects registers and integrity. memory is only one piece
> > > of the puzzle and what we protect might differ from platform to
> > > platform.
> > >
> >
> > I agree. Even technologies that currently only do memory encryption may
> > be enhanced with more protections later.
>
> There's already SEV-ES patches onlist for this on the SEV side.
>
> <sigh on haggling over the name>
>
> Perhaps 'confidential guest' is actually what we need, since the
> marketing folks seem to have started labelling this whole idea
> 'confidential computing'.
I think we shouldn't worry about the specific name too much, as it
won't be visible much outside QEMU and the internals of the immediate
layer above such as libvirt. What matters much more is that we have
documentation that clearly explains what the different levels of
protection are for each different architecture, and/or generation of
architecture. Mgmt apps / end users need to understand exactly what
kind of unicorns they are being promised for a given configuration.
Regards,
Daniel
On Fri, 4 Dec 2020 13:25:00 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Fri, Dec 04, 2020 at 01:07:27PM +0000, Dr. David Alan Gilbert wrote:
> > * Cornelia Huck (cohuck@redhat.com) wrote:
> > > On Fri, 4 Dec 2020 09:06:50 +0100
> > > Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> > >
> > > > On 04.12.20 06:44, David Gibson wrote:
> > > > > A number of hardware platforms are implementing mechanisms whereby the
> > > > > hypervisor does not have unfettered access to guest memory, in order
> > > > > to mitigate the security impact of a compromised hypervisor.
> > > > >
> > > > > AMD's SEV implements this with in-cpu memory encryption, and Intel has
> > > > > its own memory encryption mechanism. POWER has an upcoming mechanism
> > > > > to accomplish this in a different way, using a new memory protection
> > > > > level plus a small trusted ultravisor. s390 also has a protected
> > > > > execution environment.
> > > > >
> > > > > The current code (committed or draft) for these features has each
> > > > > platform's version configured entirely differently. That doesn't seem
> > > > > ideal for users, or particularly for management layers.
> > > > >
> > > > > AMD SEV introduces a notionally generic machine option
> > > > > "machine-encryption", but it doesn't actually cover any cases other
> > > > > than SEV.
> > > > >
> > > > > This series is a proposal to at least partially unify configuration
> > > > > for these mechanisms, by renaming and generalizing AMD's
> > > > > "memory-encryption" property. It is replaced by a
> > > > > "securable-guest-memory" property pointing to a platform specific
> > > >
> > > > Can we do "securable-guest" ?
> > > > s390x also protects registers and integrity. memory is only one piece
> > > > of the puzzle and what we protect might differ from platform to
> > > > platform.
> > > >
> > >
> > > I agree. Even technologies that currently only do memory encryption may
> > > be enhanced with more protections later.
> >
> > There's already SEV-ES patches onlist for this on the SEV side.
> >
> > <sigh on haggling over the name>
> >
> > Perhaps 'confidential guest' is actually what we need, since the
> > marketing folks seem to have started labelling this whole idea
> > 'confidential computing'.
>
> I think we shouldn't worry about the specific name too much, as it
> won't be visible much outside QEMU and the internals of the immediate
> layer above such as libvirt. What matters much more is that we have
> documentation that clearly explains what the different levels of
> protection are for each different architecture, and/or generation of
> architecture. Mgmt apps / end users need to understand exactly what
> kind of unicorns they are being promised for a given configuration.
>
>
You are probably right, but I still prefer descriptive names over
misleading ones -- it helps with my cognitive process.
Regards,
Halil
On Fri, 4 Dec 2020 09:29:59 +0100
Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>
>
> On 04.12.20 09:17, Cornelia Huck wrote:
> > On Fri, 4 Dec 2020 09:10:36 +0100
> > Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> >
> >> On 04.12.20 06:44, David Gibson wrote:
> >>> The default behaviour for virtio devices is not to use the platform's normal
> >>> DMA paths, but instead to use the fact that it's running in a hypervisor
> >>> to directly access guest memory. That doesn't work if the guest's memory
> >>> is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.
> >>>
> >>> So, if a securable guest memory mechanism is enabled, then apply the
> >>> iommu_platform=on option so it will go through normal DMA mechanisms.
> >>> Those will presumably have some way of marking memory as shared with
> >>> the hypervisor or hardware so that DMA will work.
> >>>
> >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >>> ---
> >>> hw/core/machine.c | 13 +++++++++++++
> >>> 1 file changed, 13 insertions(+)
> >>>
> >>> diff --git a/hw/core/machine.c b/hw/core/machine.c
> >>> index a67a27d03c..d16273d75d 100644
> >>> --- a/hw/core/machine.c
> >>> +++ b/hw/core/machine.c
> >>> @@ -28,6 +28,8 @@
> >>> #include "hw/mem/nvdimm.h"
> >>> #include "migration/vmstate.h"
> >>> #include "exec/securable-guest-memory.h"
> >>> +#include "hw/virtio/virtio.h"
> >>> +#include "hw/virtio/virtio-pci.h"
> >>>
> >>> GlobalProperty hw_compat_5_1[] = {
> >>> { "vhost-scsi", "num_queues", "1"},
> >>> @@ -1169,6 +1171,17 @@ void machine_run_board_init(MachineState *machine)
> >>> * areas.
> >>> */
> >>> machine_set_mem_merge(OBJECT(machine), false, &error_abort);
> >>> +
> >>> + /*
> >>> + * Virtio devices can't count on directly accessing guest
> >>> + * memory, so they need iommu_platform=on to use normal DMA
> >>> + * mechanisms. That requires also disabling legacy virtio
> >>> + * support for those virtio pci devices which allow it.
> >>> + */
> >>> + object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy",
> >>> + "on", true);
> >>> + object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform",
> >>> + "on", false);
> >>
> >> I have not followed all the history (sorry). Should we also set iommu_platform
> >> for virtio-ccw? Halil?
> >>
> >
> > That line should add iommu_platform for all virtio devices, shouldn't
> > it?
>
> Yes, sorry. Was misreading that with the line above.
>
I believe this is the best we can get. In a sense it is still a
pessimization, but it is a big usability improvement compared to having
to set iommu_platform manually.
Regards,
Halil
On Fri, 4 Dec 2020 16:44:14 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> The default behaviour for virtio devices is not to use the platform's normal
> DMA paths, but instead to use the fact that it's running in a hypervisor
> to directly access guest memory. That doesn't work if the guest's memory
> is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.
>
> So, if a securable guest memory mechanism is enabled, then apply the
> iommu_platform=on option so it will go through normal DMA mechanisms.
> Those will presumably have some way of marking memory as shared with
> the hypervisor or hardware so that DMA will work.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> hw/core/machine.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
On Fri, Dec 04, 2020 at 03:43:10PM +0100, Halil Pasic wrote:
> On Fri, 4 Dec 2020 09:29:59 +0100
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>
> > On 04.12.20 09:17, Cornelia Huck wrote:
> > > On Fri, 4 Dec 2020 09:10:36 +0100
> > > Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> > >
> > >> On 04.12.20 06:44, David Gibson wrote:
> > >>> The default behaviour for virtio devices is not to use the platform's normal
> > >>> DMA paths, but instead to use the fact that it's running in a hypervisor
> > >>> to directly access guest memory. That doesn't work if the guest's memory
> > >>> is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.
> > >>>
> > >>> So, if a securable guest memory mechanism is enabled, then apply the
> > >>> iommu_platform=on option so it will go through normal DMA mechanisms.
> > >>> Those will presumably have some way of marking memory as shared with
> > >>> the hypervisor or hardware so that DMA will work.
> > >>>
> > >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > >>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > >>> ---
> > >>> hw/core/machine.c | 13 +++++++++++++
> > >>> 1 file changed, 13 insertions(+)
> > >>>
> > >>> diff --git a/hw/core/machine.c b/hw/core/machine.c
> > >>> index a67a27d03c..d16273d75d 100644
> > >>> --- a/hw/core/machine.c
> > >>> +++ b/hw/core/machine.c
> > >>> @@ -28,6 +28,8 @@
> > >>> #include "hw/mem/nvdimm.h"
> > >>> #include "migration/vmstate.h"
> > >>> #include "exec/securable-guest-memory.h"
> > >>> +#include "hw/virtio/virtio.h"
> > >>> +#include "hw/virtio/virtio-pci.h"
> > >>>
> > >>> GlobalProperty hw_compat_5_1[] = {
> > >>> { "vhost-scsi", "num_queues", "1"},
> > >>> @@ -1169,6 +1171,17 @@ void machine_run_board_init(MachineState *machine)
> > >>> * areas.
> > >>> */
> > >>> machine_set_mem_merge(OBJECT(machine), false, &error_abort);
> > >>> +
> > >>> + /*
> > >>> + * Virtio devices can't count on directly accessing guest
> > >>> + * memory, so they need iommu_platform=on to use normal DMA
> > >>> + * mechanisms. That requires also disabling legacy virtio
> > >>> + * support for those virtio pci devices which allow it.
> > >>> + */
> > >>> + object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy",
> > >>> + "on", true);
> > >>> + object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform",
> > >>> + "on", false);
> > >>
> > >> I have not followed all the history (sorry). Should we also set iommu_platform
> > >> for virtio-ccw? Halil?
> > >>
> > >
> > > That line should add iommu_platform for all virtio devices, shouldn't
> > > it?
> >
> > Yes, sorry. Was misreading that with the line above.
> >
>
> I believe this is the best we can get. In a sense it is still a
> pessimization,
I'm not really clear on what you're getting at here.
> but it is a big usability improvement compared to having
> to set iommu_platform manually.
>
> Regards,
> Halil
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
 | _way_ _around_!
http://www.ozlabs.org/~dgibson
On Fri, Dec 04, 2020 at 02:02:05PM +0100, Cornelia Huck wrote:
> On Fri, 4 Dec 2020 09:06:50 +0100
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>
> > On 04.12.20 06:44, David Gibson wrote:
> > > A number of hardware platforms are implementing mechanisms whereby the
> > > hypervisor does not have unfettered access to guest memory, in order
> > > to mitigate the security impact of a compromised hypervisor.
> > >
> > > AMD's SEV implements this with in-cpu memory encryption, and Intel has
> > > its own memory encryption mechanism. POWER has an upcoming mechanism
> > > to accomplish this in a different way, using a new memory protection
> > > level plus a small trusted ultravisor. s390 also has a protected
> > > execution environment.
> > >
> > > The current code (committed or draft) for these features has each
> > > platform's version configured entirely differently. That doesn't seem
> > > ideal for users, or particularly for management layers.
> > >
> > > AMD SEV introduces a notionally generic machine option
> > > "machine-encryption", but it doesn't actually cover any cases other
> > > than SEV.
> > >
> > > This series is a proposal to at least partially unify configuration
> > > for these mechanisms, by renaming and generalizing AMD's
> > > "memory-encryption" property. It is replaced by a
> > > "securable-guest-memory" property pointing to a platform specific
> >
> > Can we do "securable-guest" ?
> > s390x also protects registers and integrity. memory is only one piece
> > of the puzzle and what we protect might differ from platform to
> > platform.
>
> I agree. Even technologies that currently only do memory encryption may
> be enhanced with more protections later.
That's a good point. I've focused on the memory aspect because that's
what's most immediately relevant to qemu - the fact that we can't
directly access guest memory is something we have to deal with, and
has some uniformity regardless of the details of the protection
scheme.
On Fri, Dec 04, 2020 at 02:12:29PM +0100, Cornelia Huck wrote:
> On Fri, 4 Dec 2020 13:07:27 +0000
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>
> > * Cornelia Huck (cohuck@redhat.com) wrote:
> > > On Fri, 4 Dec 2020 09:06:50 +0100
> > > Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> > >
> > > > On 04.12.20 06:44, David Gibson wrote:
> > > > > A number of hardware platforms are implementing mechanisms whereby the
> > > > > hypervisor does not have unfettered access to guest memory, in order
> > > > > to mitigate the security impact of a compromised hypervisor.
> > > > >
> > > > > AMD's SEV implements this with in-cpu memory encryption, and Intel has
> > > > > its own memory encryption mechanism. POWER has an upcoming mechanism
> > > > > to accomplish this in a different way, using a new memory protection
> > > > > level plus a small trusted ultravisor. s390 also has a protected
> > > > > execution environment.
> > > > >
> > > > > The current code (committed or draft) for these features has each
> > > > > platform's version configured entirely differently. That doesn't seem
> > > > > ideal for users, or particularly for management layers.
> > > > >
> > > > > AMD SEV introduces a notionally generic machine option
> > > > > "machine-encryption", but it doesn't actually cover any cases other
> > > > > than SEV.
> > > > >
> > > > > This series is a proposal to at least partially unify configuration
> > > > > for these mechanisms, by renaming and generalizing AMD's
> > > > > "memory-encryption" property. It is replaced by a
> > > > > "securable-guest-memory" property pointing to a platform specific
> > > >
> > > > Can we do "securable-guest" ?
> > > > s390x also protects registers and integrity. memory is only one piece
> > > > of the puzzle and what we protect might differ from platform to
> > > > platform.
> > > >
> > >
> > > I agree. Even technologies that currently only do memory encryption may
> > > be enhanced with more protections later.
> >
> > There's already SEV-ES patches onlist for this on the SEV side.
> >
> > <sigh on haggling over the name>
> >
> > Perhaps 'confidential guest' is actually what we need, since the
> > marketing folks seem to have started labelling this whole idea
> > 'confidential computing'.
That's not a bad idea, much as I usually hate marketing terms. But it
does seem to be becoming a general term for this style of thing, and
it doesn't overlap too badly with other terms ("secure" and
"protected" are also used for hypervisor-from-guest and
guest-from-guest protection).
> It's more like a 'possibly confidential guest', though.
Hmm. What about "Confidential Guest Facility" or "Confidential Guest
Mechanism"? The implication being that the facility is there, whether
or not the guest actually uses it.
On 08.12.20 02:54, David Gibson wrote:
> On Fri, Dec 04, 2020 at 03:43:10PM +0100, Halil Pasic wrote:
>> On Fri, 4 Dec 2020 09:29:59 +0100
>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>>
>>> On 04.12.20 09:17, Cornelia Huck wrote:
>>>> On Fri, 4 Dec 2020 09:10:36 +0100
>>>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>>>>
>>>>> On 04.12.20 06:44, David Gibson wrote:
>>>>>> The default behaviour for virtio devices is not to use the platform's normal
>>>>>> DMA paths, but instead to use the fact that it's running in a hypervisor
>>>>>> to directly access guest memory. That doesn't work if the guest's memory
>>>>>> is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.
>>>>>>
>>>>>> So, if a securable guest memory mechanism is enabled, then apply the
>>>>>> iommu_platform=on option so it will go through normal DMA mechanisms.
>>>>>> Those will presumably have some way of marking memory as shared with
>>>>>> the hypervisor or hardware so that DMA will work.
>>>>>>
>>>>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>>>>>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>>>>> ---
>>>>>> hw/core/machine.c | 13 +++++++++++++
>>>>>> 1 file changed, 13 insertions(+)
>>>>>>
>>>>>> diff --git a/hw/core/machine.c b/hw/core/machine.c
>>>>>> index a67a27d03c..d16273d75d 100644
>>>>>> --- a/hw/core/machine.c
>>>>>> +++ b/hw/core/machine.c
>>>>>> @@ -28,6 +28,8 @@
>>>>>> #include "hw/mem/nvdimm.h"
>>>>>> #include "migration/vmstate.h"
>>>>>> #include "exec/securable-guest-memory.h"
>>>>>> +#include "hw/virtio/virtio.h"
>>>>>> +#include "hw/virtio/virtio-pci.h"
>>>>>>
>>>>>> GlobalProperty hw_compat_5_1[] = {
>>>>>> { "vhost-scsi", "num_queues", "1"},
>>>>>> @@ -1169,6 +1171,17 @@ void machine_run_board_init(MachineState *machine)
>>>>>> * areas.
>>>>>> */
>>>>>> machine_set_mem_merge(OBJECT(machine), false, &error_abort);
>>>>>> +
>>>>>> + /*
>>>>>> + * Virtio devices can't count on directly accessing guest
>>>>>> + * memory, so they need iommu_platform=on to use normal DMA
>>>>>> + * mechanisms. That requires also disabling legacy virtio
>>>>>> + * support for those virtio pci devices which allow it.
>>>>>> + */
>>>>>> + object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy",
>>>>>> + "on", true);
>>>>>> + object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform",
>>>>>> + "on", false);
>>>>>
>>>>> I have not followed all the history (sorry). Should we also set iommu_platform
>>>>> for virtio-ccw? Halil?
>>>>>
>>>>
>>>> That line should add iommu_platform for all virtio devices, shouldn't
>>>> it?
>>>
>>> Yes, sorry. Was misreading that with the line above.
>>>
>>
>> I believe this is the best we can get. In a sense it is still a
>> pessimization,
>
> I'm not really clear on what you're getting at here.
I think Halil's point is that somebody might come up with a solution where things would
work even without iommu_platform. But as he said, it is still the best setting we can get
to cover all cases.
On Tue, 8 Dec 2020 12:54:03 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> > > >>> + * Virtio devices can't count on directly accessing guest
> > > >>> + * memory, so they need iommu_platform=on to use normal DMA
> > > >>> + * mechanisms. That requires also disabling legacy virtio
> > > >>> + * support for those virtio pci devices which allow it.
> > > >>> + */
> > > >>> + object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy",
> > > >>> + "on", true);
> > > >>> + object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform",
> > > >>> + "on", false);
> > > >>
> > > >> I have not followed all the history (sorry). Should we also set iommu_platform
> > > >> for virtio-ccw? Halil?
> > > >>
> > > >
> > > > That line should add iommu_platform for all virtio devices, shouldn't
> > > > it?
> > >
> > > Yes, sorry. Was misreading that with the line above.
> > >
> >
> > I believe this is the best we can get. In a sense it is still a
> > pessimization,
>
> I'm not really clear on what you're getting at here.
By pessimization, I mean that we are going to indicate
_F_PLATFORM_ACCESS even if it isn't necessary, because the guest never
opted in for confidential/memory protection/memory encryption. We have
discussed this before, and I don't see a better solution that works for
everybody.
Regards,
Halil
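The trade-off described in this message can be written down as a tiny standalone model. It is only an illustration under stated assumptions: ToyMachine, ToyGuest and both helper functions are hypothetical names, and the model reduces the behaviour to two booleans, namely whether the machine was configured with a securable guest memory object and whether the guest actually opted in to protection.

```c
#include <stdbool.h>

/* Hypothetical host-side configuration: was an sgm object given? */
typedef struct ToyMachine {
    bool sgm_configured;
} ToyMachine;

/* Hypothetical guest-side choice, unknown to QEMU when devices are set up. */
typedef struct ToyGuest {
    bool opted_in_to_protection;
} ToyGuest;

/*
 * Default for iommu_platform on virtio devices in this model: it follows
 * the machine configuration only, because the guest's choice is not yet
 * known at device creation time.
 */
static bool toy_iommu_platform_default(const ToyMachine *m)
{
    return m->sgm_configured;
}

/*
 * The "pessimization" case: platform DMA access is advertised although the
 * guest never switched to protected mode and direct access would have worked.
 */
static bool toy_is_pessimized(const ToyMachine *m, const ToyGuest *g)
{
    return toy_iommu_platform_default(m) && !g->opted_in_to_protection;
}
```

The model makes the cost explicit: only the combination of a configured mechanism and a guest that never opts in pays for the safer default.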
On Tue, 8 Dec 2020 13:57:28 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Fri, Dec 04, 2020 at 02:12:29PM +0100, Cornelia Huck wrote:
> > On Fri, 4 Dec 2020 13:07:27 +0000
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >
> > > * Cornelia Huck (cohuck@redhat.com) wrote:
> > > > On Fri, 4 Dec 2020 09:06:50 +0100
> > > > Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> > > >
> > > > > On 04.12.20 06:44, David Gibson wrote:
> > > > > > A number of hardware platforms are implementing mechanisms whereby the
> > > > > > hypervisor does not have unfettered access to guest memory, in order
> > > > > > to mitigate the security impact of a compromised hypervisor.
> > > > > >
> > > > > > AMD's SEV implements this with in-cpu memory encryption, and Intel has
> > > > > > its own memory encryption mechanism. POWER has an upcoming mechanism
> > > > > > to accomplish this in a different way, using a new memory protection
> > > > > > level plus a small trusted ultravisor. s390 also has a protected
> > > > > > execution environment.
> > > > > >
> > > > > > The current code (committed or draft) for these features has each
> > > > > > platform's version configured entirely differently. That doesn't seem
> > > > > > ideal for users, or particularly for management layers.
> > > > > >
> > > > > > AMD SEV introduces a notionally generic machine option
> > > > > > "machine-encryption", but it doesn't actually cover any cases other
> > > > > > than SEV.
> > > > > >
> > > > > > This series is a proposal to at least partially unify configuration
> > > > > > for these mechanisms, by renaming and generalizing AMD's
> > > > > > "memory-encryption" property. It is replaced by a
> > > > > > "securable-guest-memory" property pointing to a platform specific
> > > > >
> > > > > Can we do "securable-guest" ?
> > > > > s390x also protects registers and integrity. memory is only one piece
> > > > > of the puzzle and what we protect might differ from platform to
> > > > > platform.
> > > > >
> > > >
> > > > I agree. Even technologies that currently only do memory encryption may
> > > > be enhanced with more protections later.
> > >
> > > There's already SEV-ES patches onlist for this on the SEV side.
> > >
> > > <sigh on haggling over the name>
> > >
> > > Perhaps 'confidential guest' is actually what we need, since the
> > > marketing folks seem to have started labelling this whole idea
> > > 'confidential computing'.
>
> That's not a bad idea, much as I usually hate marketing terms. But it
> does seem to be becoming a general term for this style of thing, and
> it doesn't overlap too badly with other terms ("secure" and
> "protected" are also used for hypervisor-from-guest and
> guest-from-guest protection).
>
> > It's more like a 'possibly confidential guest', though.
>
> Hmm. What about "Confidential Guest Facility" or "Confidential Guest
> Mechanism"? The implication being that the facility is there, whether
> or not the guest actually uses it.
>
"Confidential Guest Enablement"? The others generally sound fine to me
as well, though; not sure if "Facility" might be a bit confusing, as
that term is already a bit overloaded.
On Tue, 8 Dec 2020 11:28:29 +0100
Halil Pasic <pasic@linux.ibm.com> wrote:
> On Tue, 8 Dec 2020 12:54:03 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > > > >>> + * Virtio devices can't count on directly accessing guest
> > > > >>> + * memory, so they need iommu_platform=on to use normal DMA
> > > > >>> + * mechanisms. That requires also disabling legacy virtio
> > > > >>> + * support for those virtio pci devices which allow it.
> > > > >>> + */
> > > > >>> + object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy",
> > > > >>> + "on", true);
> > > > >>> + object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform",
> > > > >>> + "on", false);
> > > > >>
> > > > >> I have not followed all the history (sorry). Should we also set iommu_platform
> > > > >> for virtio-ccw? Halil?
> > > > >>
> > > > >
> > > > > That line should add iommu_platform for all virtio devices, shouldn't
> > > > > it?
> > > >
> > > > Yes, sorry. Was misreading that with the line above.
> > > >
> > >
> > > I believe this is the best we can get. In a sense it is still a
> > > pessimization,
> >
> > I'm not really clear on what you're getting at here.
>
> By pessimization, I mean that we are going to indicate
> _F_PLATFORM_ACCESS even if it isn't necessary, because the guest never
> opted in for confidential/memory protection/memory encryption. We have
> discussed this before, and I don't see a better solution that works for
> everybody.

If you consider specifying the secure guest option as a way to tell
QEMU to make everything ready for running a secure guest, I'd certainly
consider it necessary. If you do not want to force it, you should not
do the secure guest preparation setup.
On Fri, 4 Dec 2020 16:44:09 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> This allows failures to be reported richly and idiomatically.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/kvm/kvm-all.c | 4 +++-
> accel/kvm/sev-stub.c | 5 +++--
> include/sysemu/sev.h | 2 +-
> target/i386/sev.c | 31 +++++++++++++++----------------
> 4 files changed, 22 insertions(+), 20 deletions(-)
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
On Fri, 4 Dec 2020 16:44:10 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> The platform specific details of mechanisms for implementing securable
> guest memory may require setup at various points during initialization.
> Thus, it's not really feasible to have a single sgm initialization hook,
> but instead each mechanism needs its own initialization calls in arch or
> machine specific code.
>
> However, to make it harder to have a bug where a mechanism isn't properly
> initialized under some circumstances, we want to have a common place,
> relatively late in boot, where we verify that sgm has been initialized if
> it was requested.
>
> This patch introduces a ready flag to the SecurableGuestMemory base type
> to accomplish this, which we verify just before the machine specific
> initialization function.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> hw/core/machine.c | 8 ++++++++
> include/exec/securable-guest-memory.h | 2 ++
> target/i386/sev.c | 2 ++
> 3 files changed, 12 insertions(+)
>
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 816ea3ae3e..a67a27d03c 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -1155,6 +1155,14 @@ void machine_run_board_init(MachineState *machine)
> }
>
> if (machine->sgm) {
> + /*
> + * Where securable guest memory is initialized depends on the
> + * specific mechanism in use. But, we need to make sure it's
> + * ready by now. If it isn't, that's a bug in the
> + * implementation of that sgm mechanism.
> + */
> + assert(machine->sgm->ready);

Under which circumstances might we arrive here with 'ready' not set?

- programming error, setup is happening too late -> assert() seems
  appropriate
- we tried to set it up, but some error happened -> should we rely on
  the setup code to error out first? (i.e. we won't end up here, unless
  there's a programming error, in which case the assert() looks
  fine)

Is there a possible use case for "we could not set it up, but we
support an unsecured guest (as long as it is clear what happens)"?
Likely only for guests that transition themselves, but one could
argue that QEMU should simply be invoked a second time without the
sgm stuff being specified in the error case.

> +
> /*
> * With securable guest memory, the host can't see the real
> * contents of RAM, so there's no point in it trying to merge
On Fri, 4 Dec 2020 16:44:13 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> We haven't yet implemented the fairly involved handshaking that will be
> needed to migrate PEF protected guests. For now, just use a migration
> blocker so we get a meaningful error if someone attempts this (this is the
> same approach used by AMD SEV).
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> hw/ppc/pef.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/hw/ppc/pef.c b/hw/ppc/pef.c
> index 3ae3059cfe..edc3e744ba 100644
> --- a/hw/ppc/pef.c
> +++ b/hw/ppc/pef.c
> @@ -38,7 +38,11 @@ struct PefGuestState {
> };
>
> #ifdef CONFIG_KVM
> +static Error *pef_mig_blocker;
> +
> static int kvmppc_svm_init(Error **errp)

This looks weird?

> +
> +int kvmppc_svm_init(SecurableGuestMemory *sgm, Error **errp)
> {
> if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) {
> error_setg(errp,
> @@ -54,6 +58,11 @@ static int kvmppc_svm_init(Error **errp)
> }
> }
>
> + /* add migration blocker */
> + error_setg(&pef_mig_blocker, "PEF: Migration is not implemented");
> + /* NB: This can fail if --only-migratable is used */
> + migrate_add_blocker(pef_mig_blocker, &error_fatal);

Just so that I understand: is PEF something that is enabled by the host
(and the guest is either secured or doesn't start), or is it using a
model like s390x PV where the guest initiates the transition into
secured mode?

Asking because s390x adds the migration blocker only when the
transition is actually happening (i.e. guests that do not transition
into secure mode remain migratable.) This has the side effect that you
might be able to start a machine with --only-migratable that
transitions into a non-migratable machine via a guest action, if I'm
not mistaken. Without the new object, I don't see a way to block with
--only-migratable; with it, we should be able to do that. Not sure what
the desirable behaviour is here.

> +
> return 0;
> }
>
On Fri, Dec 04, 2020 at 04:44:03PM +1100, David Gibson wrote:
> From: Greg Kurz <groug@kaod.org>
>
> Global properties have an @optional field, which allows to apply a given
> property to a given type even if one of its subclasses doesn't support
> it. This is especially used in the compat code when dealing with the
> "disable-modern" and "disable-legacy" properties and the "virtio-pci"
> type.
>
> Allow object_register_sugar_prop() to set this field as well.
>
> Signed-off-by: Greg Kurz <groug@kaod.org>
> Message-Id: <159738953558.377274.16617742952571083440.stgit@bahia.lan>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
--
Eduardo
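
The @optional behaviour described in the commit message above can be modelled in a few lines of plain C. This is only a sketch, not QEMU code: DevSketch, dev_sketch_has_prop() and dev_sketch_apply_default() are hypothetical names, and the only assumption is the one stated in the commit message, namely that an optional default is skipped for a type that lacks the property, while a non-optional default on such a type is an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device model: a type name plus the properties it supports. */
typedef struct DevSketch {
    const char *type;
    const char **props;     /* NULL-terminated list of property names */
} DevSketch;

/* Returns true if the device exposes a property with the given name. */
bool dev_sketch_has_prop(const DevSketch *dev, const char *name)
{
    assert(dev != NULL && name != NULL);
    for (size_t i = 0; dev->props[i] != NULL; i++) {
        if (strcmp(dev->props[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/*
 * Apply a default property to a device.
 * Returns 0 when the default was applied, or skipped because it is
 * optional and the device lacks the property; returns -1 when a
 * non-optional default cannot be applied. *applied reports which of
 * the two success cases happened.
 */
int dev_sketch_apply_default(const DevSketch *dev, const char *name,
                             bool optional, bool *applied)
{
    *applied = false;
    if (!dev_sketch_has_prop(dev, name)) {
        return optional ? 0 : -1;
    }
    *applied = true;
    return 0;
}
```

In this model a default such as "disable-legacy" can be registered against a broad type and still be harmless for subclasses that do not implement it, which is the situation the commit message describes for virtio-pci.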
On Fri, 4 Dec 2020 16:44:15 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> At least some s390 cpu models support "Protected Virtualization" (PV),
> a mechanism to protect guests from eavesdropping by a compromised
> hypervisor.
>
> This is similar in function to other mechanisms like AMD's SEV and
> POWER's PEF, which are controlled bythe "securable-guest-memory" machine

s/bythe/by the/

> option. s390 is a slightly special case, because we already supported
> PV, simply by using a CPU model with the required feature
> (S390_FEAT_UNPACK).
>
> To integrate this with the option used by other platforms, we
> implement the following compromise:
>
> - When the securable-guest-memory option is set, s390 will recognize it,
> verify that the CPU can support PV (failing if not) and set virtio
> default options necessary for encrypted or protected guests, as on
> other platforms. i.e. if securable-guest-memory is set, we will
> either create a guest capable of entering PV mode, or fail outright

s/outright/outright./

>
> - If securable-guest-memory is not set, guest's might still be able to

s/guest's/guests/

> enter PV mode, if the CPU has the right model. This may be a
> little surprising, but shouldn't actually be harmful.
>
> To start a guest supporting Protected Virtualization using the new
> option use the command line arguments:
> -object s390-pv-guest,id=pv0 -machine securable-guest-memory=pv0
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> hw/s390x/pv.c | 58 +++++++++++++++++++++++++++++++++++++++++++
> include/hw/s390x/pv.h | 1 +
> target/s390x/kvm.c | 3 +++
> 3 files changed, 62 insertions(+)
>

Modulo any naming changes etc., I think this should work for s390. I
don't have the hardware to test this, however, and would appreciate
someone with a PV setup giving this a go.
On Mon, Dec 14, 2020 at 06:00:36PM +0100, Cornelia Huck wrote:
> On Fri, 4 Dec 2020 16:44:10 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > The platform specific details of mechanisms for implementing securable
> > guest memory may require setup at various points during initialization.
> > Thus, it's not really feasible to have a single sgm initialization hook,
> > but instead each mechanism needs its own initialization calls in arch or
> > machine specific code.
> >
> > However, to make it harder to have a bug where a mechanism isn't properly
> > initialized under some circumstances, we want to have a common place,
> > relatively late in boot, where we verify that sgm has been initialized if
> > it was requested.
> >
> > This patch introduces a ready flag to the SecurableGuestMemory base type
> > to accomplish this, which we verify just before the machine specific
> > initialization function.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > hw/core/machine.c | 8 ++++++++
> > include/exec/securable-guest-memory.h | 2 ++
> > target/i386/sev.c | 2 ++
> > 3 files changed, 12 insertions(+)
> >
> > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > index 816ea3ae3e..a67a27d03c 100644
> > --- a/hw/core/machine.c
> > +++ b/hw/core/machine.c
> > @@ -1155,6 +1155,14 @@ void machine_run_board_init(MachineState *machine)
> > }
> >
> > if (machine->sgm) {
> > + /*
> > + * Where securable guest memory is initialized depends on the
> > + * specific mechanism in use. But, we need to make sure it's
> > + * ready by now. If it isn't, that's a bug in the
> > + * implementation of that sgm mechanism.
> > + */
> > + assert(machine->sgm->ready);
>
> Under which circumstances might we arrive here with 'ready' not set?
>
> - programming error, setup is happening too late -> assert() seems
> appropriate

Yes, this is designed to catch programming errors. In particular I'm
concerned about:
  * Re-arranging the init code, and either entirely forgetting the sgm
    setup, or accidentally moving it too late
  * The sgm setup is buried in the machine setup code, conditional on
    various things, and changes mean we no longer either call it or
    (correctly) fail
  * User has specified an sgm scheme designed for a machine type other
    than the one they selected. The arch/machine init code hasn't
    correctly accounted for that possibility and ignores it, instead
    of correctly throwing an error

> - we tried to set it up, but some error happened -> should we rely on
> the setup code to error out first? (i.e. we won't end up here, unless
> there's a programming error, in which case the assert() looks
> fine)

Yes, that's my intention.

> Is there a possible use case for "we could not set it up, but we
> support an unsecured guest (as long as it is clear what happens)"?

I don't think so. My feeling is that if you specify that you want the
feature, qemu needs to either give it to you, or fail, not silently
degrade the features presented to the guest.

> Likely only for guests that transition themselves, but one could
> argue that QEMU should simply be invoked a second time without the
> sgm stuff being specified in the error case.

Right - I think whatever error we give here is likely to be easier to
diagnose than the guest itself throwing an error when it fails to
transition to secure mode (plus we should catch it always, rather than
only if we run a guest which tries to go secure).

-- 
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
 | _way_ _around_!
http://www.ozlabs.org/~dgibson
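
The division of labour discussed in this exchange (setup code reports real failures, the late assert only catches programming errors) can be sketched in standalone C. This is a toy model under stated assumptions, not the patch itself: SgmSketch, MachineSketch, the platform_supported field and all function names are hypothetical stand-ins for the SecurableGuestMemory object and the machine init path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the SecurableGuestMemory base type. */
typedef struct SgmSketch {
    bool ready;                 /* set by the mechanism's own init code */
} SgmSketch;

/* Hypothetical stand-in for the machine state. */
typedef struct MachineSketch {
    SgmSketch *sgm;             /* NULL when the option was not given */
    bool platform_supported;    /* assumed: host can provide the mechanism */
} MachineSketch;

/*
 * Mechanism-specific setup, called from arch or machine code.
 * Returns 0 on success (marking the object ready) and -1 on failure,
 * so the caller can report an error before the late check is reached.
 */
int sgm_sketch_init(MachineSketch *m)
{
    if (m->sgm == NULL) {
        return 0;               /* feature not requested, nothing to do */
    }
    if (!m->platform_supported) {
        return -1;              /* requested but unavailable: fail, don't degrade */
    }
    m->sgm->ready = true;
    return 0;
}

/* The invariant the late assert checks: if requested, it must be ready. */
bool sgm_sketch_invariant_holds(const MachineSketch *m)
{
    return m->sgm == NULL || m->sgm->ready;
}

/* Models the common, late point in machine init. */
int machine_sketch_run_board_init(MachineSketch *m)
{
    if (sgm_sketch_init(m) < 0) {
        return -1;              /* user-visible error path */
    }
    /* Only reachable with ready unset if the init code above has a bug. */
    assert(sgm_sketch_invariant_holds(m));
    return 0;
}
```

In this model a missing host feature surfaces as an ordinary error return, while forgetting to call the init hook at all would trip the assert, matching the intent stated in the reply.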
On Mon, Dec 14, 2020 at 06:22:40PM +0100, Cornelia Huck wrote:
> On Fri, 4 Dec 2020 16:44:13 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > We haven't yet implemented the fairly involved handshaking that will be
> > needed to migrate PEF protected guests. For now, just use a migration
> > blocker so we get a meaningful error if someone attempts this (this is the
> > same approach used by AMD SEV).
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> > hw/ppc/pef.c | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/hw/ppc/pef.c b/hw/ppc/pef.c
> > index 3ae3059cfe..edc3e744ba 100644
> > --- a/hw/ppc/pef.c
> > +++ b/hw/ppc/pef.c
> > @@ -38,7 +38,11 @@ struct PefGuestState {
> > };
> >
> > #ifdef CONFIG_KVM
> > +static Error *pef_mig_blocker;
> > +
> > static int kvmppc_svm_init(Error **errp)
>
> This looks weird?

Oops. Not sure how that made it past even my rudimentary compile
testing.

> > +
> > +int kvmppc_svm_init(SecurableGuestMemory *sgm, Error **errp)
> > {
> > if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) {
> > error_setg(errp,
> > @@ -54,6 +58,11 @@ static int kvmppc_svm_init(Error **errp)
> > }
> > }
> >
> > + /* add migration blocker */
> > + error_setg(&pef_mig_blocker, "PEF: Migration is not implemented");
> > + /* NB: This can fail if --only-migratable is used */
> > + migrate_add_blocker(pef_mig_blocker, &error_fatal);
>
> Just so that I understand: is PEF something that is enabled by the host
> (and the guest is either secured or doesn't start), or is it using a
> model like s390x PV where the guest initiates the transition into
> secured mode?

Like s390x PV it's initiated by the guest.

> Asking because s390x adds the migration blocker only when the
> transition is actually happening (i.e. guests that do not transition
> into secure mode remain migratable.) This has the side effect that you
> might be able to start a machine with --only-migratable that
> transitions into a non-migratable machine via a guest action, if I'm
> not mistaken. Without the new object, I don't see a way to block with
> --only-migratable; with it, we should be able to do that. Not sure what
> the desirable behaviour is here.

Hm, I'm not sure what the best option is here either.

>
> > +
> > return 0;
> > }
> >
>

-- 
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
 | _way_ _around_!
http://www.ozlabs.org/~dgibson
On Tue, Dec 08, 2020 at 01:50:05PM +0100, Cornelia Huck wrote:
> On Tue, 8 Dec 2020 11:28:29 +0100
> Halil Pasic <pasic@linux.ibm.com> wrote:
>
> > On Tue, 8 Dec 2020 12:54:03 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > > > >>> + * Virtio devices can't count on directly accessing guest
> > > > > >>> + * memory, so they need iommu_platform=on to use normal DMA
> > > > > >>> + * mechanisms. That requires also disabling legacy virtio
> > > > > >>> + * support for those virtio pci devices which allow it.
> > > > > >>> + */
> > > > > >>> + object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy",
> > > > > >>> + "on", true);
> > > > > >>> + object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform",
> > > > > >>> + "on", false);
> > > > > >>
> > > > > >> I have not followed all the history (sorry). Should we also set iommu_platform
> > > > > >> for virtio-ccw? Halil?
> > > > > >>
> > > > > >
> > > > > > That line should add iommu_platform for all virtio devices, shouldn't
> > > > > > it?
> > > > >
> > > > > Yes, sorry. Was misreading that with the line above.
> > > > >
> > > >
> > > > I believe this is the best we can get. In a sense it is still a
> > > > pessimization,
> > >
> > > I'm not really clear on what you're getting at here.
> >
> > By pessimization, I mean that we are going to indicate
> > _F_PLATFORM_ACCESS even if it isn't necessary, because the guest never
> > opted in for confidential/memory protection/memory encryption. We have
> > discussed this before, and I don't see a better solution that works for
> > everybody.
>
> If you consider specifying the secure guest option as a way to tell
> QEMU to make everything ready for running a secure guest, I'd certainly
> consider it necessary. If you do not want to force it, you should not
> do the secure guest preparation setup.

Right, that's my feeling as well. I'm also of the opinion that
!F_PLATFORM_ACCESS is kind of a nasty hack that has some other
problems (e.g. it means an L1 can't safely pass the device into an
L2).

-- 
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
 | _way_ _around_!
http://www.ozlabs.org/~dgibson
On Tue, Dec 15, 2020 at 12:45:26PM +0100, Cornelia Huck wrote:
> On Fri, 4 Dec 2020 16:44:15 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > At least some s390 cpu models support "Protected Virtualization" (PV),
> > a mechanism to protect guests from eavesdropping by a compromised
> > hypervisor.
> >
> > This is similar in function to other mechanisms like AMD's SEV and
> > POWER's PEF, which are controlled bythe "securable-guest-memory" machine
>
> s/bythe/by the/
>
> > option. s390 is a slightly special case, because we already supported
> > PV, simply by using a CPU model with the required feature
> > (S390_FEAT_UNPACK).
> >
> > To integrate this with the option used by other platforms, we
> > implement the following compromise:
> >
> > - When the securable-guest-memory option is set, s390 will recognize it,
> > verify that the CPU can support PV (failing if not) and set virtio
> > default options necessary for encrypted or protected guests, as on
> > other platforms. i.e. if securable-guest-memory is set, we will
> > either create a guest capable of entering PV mode, or fail outright
>
> s/outright/outright./
>
> >
> > - If securable-guest-memory is not set, guest's might still be able to
>
> s/guest's/guests/

All those corrected, thanks.

> > enter PV mode, if the CPU has the right model. This may be a
> > little surprising, but shouldn't actually be harmful.
> >
> > To start a guest supporting Protected Virtualization using the new
> > option use the command line arguments:
> > -object s390-pv-guest,id=pv0 -machine securable-guest-memory=pv0
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> > hw/s390x/pv.c | 58 +++++++++++++++++++++++++++++++++++++++++++
> > include/hw/s390x/pv.h | 1 +
> > target/s390x/kvm.c | 3 +++
> > 3 files changed, 62 insertions(+)
> >
>
> Modulo any naming changes etc., I think this should work for s390. I
> don't have the hardware to test this, however, and would appreciate
> someone with a PV setup giving this a go.

Makes sense.

-- 
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
 | _way_ _around_!
http://www.ozlabs.org/~dgibson
On Tue, Dec 08, 2020 at 01:43:08PM +0100, Cornelia Huck wrote:
> On Tue, 8 Dec 2020 13:57:28 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > On Fri, Dec 04, 2020 at 02:12:29PM +0100, Cornelia Huck wrote:
> > > On Fri, 4 Dec 2020 13:07:27 +0000
> > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > >
> > > > * Cornelia Huck (cohuck@redhat.com) wrote:
> > > > > On Fri, 4 Dec 2020 09:06:50 +0100
> > > > > Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> > > > >
> > > > > > On 04.12.20 06:44, David Gibson wrote:
> > > > > > > A number of hardware platforms are implementing mechanisms whereby the
> > > > > > > hypervisor does not have unfettered access to guest memory, in order
> > > > > > > to mitigate the security impact of a compromised hypervisor.
> > > > > > >
> > > > > > > AMD's SEV implements this with in-cpu memory encryption, and Intel has
> > > > > > > its own memory encryption mechanism. POWER has an upcoming mechanism
> > > > > > > to accomplish this in a different way, using a new memory protection
> > > > > > > level plus a small trusted ultravisor. s390 also has a protected
> > > > > > > execution environment.
> > > > > > >
> > > > > > > The current code (committed or draft) for these features has each
> > > > > > > platform's version configured entirely differently. That doesn't seem
> > > > > > > ideal for users, or particularly for management layers.
> > > > > > >
> > > > > > > AMD SEV introduces a notionally generic machine option
> > > > > > > "machine-encryption", but it doesn't actually cover any cases other
> > > > > > > than SEV.
> > > > > > >
> > > > > > > This series is a proposal to at least partially unify configuration
> > > > > > > for these mechanisms, by renaming and generalizing AMD's
> > > > > > > "memory-encryption" property. It is replaced by a
> > > > > > > "securable-guest-memory" property pointing to a platform specific
> > > > > >
> > > > > > Can we do "securable-guest" ?
> > > > > > s390x also protects registers and integrity. memory is only one piece
> > > > > > of the puzzle and what we protect might differ from platform to
> > > > > > platform.
> > > > > >
> > > > >
> > > > > I agree. Even technologies that currently only do memory encryption may
> > > > > be enhanced with more protections later.
> > > >
> > > > There's already SEV-ES patches onlist for this on the SEV side.
> > > >
> > > > <sigh on haggling over the name>
> > > >
> > > > Perhaps 'confidential guest' is actually what we need, since the
> > > > marketing folks seem to have started labelling this whole idea
> > > > 'confidential computing'.
> >
> > That's not a bad idea, much as I usually hate marketing terms. But it
> > does seem to be becoming a general term for this style of thing, and
> > it doesn't overlap too badly with other terms ("secure" and
> > "protected" are also used for hypervisor-from-guest and
> > guest-from-guest protection).
> >
> > > It's more like a 'possibly confidential guest', though.
> >
> > Hmm. What about "Confidential Guest Facility" or "Confidential Guest
> > Mechanism"? The implication being that the facility is there, whether
> > or not the guest actually uses it.
> >
>
> "Confidential Guest Enablement"? The others generally sound fine to me
> as well, though; not sure if "Facility" might be a bit confusing, as
> that term is already a bit overloaded.

Well, "facility" is a bit overloaded, but IMO "enablement" is even
more so. I think I'll go with "confidential guest support" in the
next spin.

-- 
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
 | _way_ _around_!
http://www.ozlabs.org/~dgibson
On Thu, 17 Dec 2020 16:38:20 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Mon, Dec 14, 2020 at 06:00:36PM +0100, Cornelia Huck wrote:
> > On Fri, 4 Dec 2020 16:44:10 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > The platform specific details of mechanisms for implementing securable
> > > guest memory may require setup at various points during initialization.
> > > Thus, it's not really feasible to have a single sgm initialization hook,
> > > but instead each mechanism needs its own initialization calls in arch or
> > > machine specific code.
> > >
> > > However, to make it harder to have a bug where a mechanism isn't properly
> > > initialized under some circumstances, we want to have a common place,
> > > relatively late in boot, where we verify that sgm has been initialized if
> > > it was requested.
> > >
> > > This patch introduces a ready flag to the SecurableGuestMemory base type
> > > to accomplish this, which we verify just before the machine specific
> > > initialization function.
> > >
> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > ---
> > > hw/core/machine.c | 8 ++++++++
> > > include/exec/securable-guest-memory.h | 2 ++
> > > target/i386/sev.c | 2 ++
> > > 3 files changed, 12 insertions(+)
> > >
> > > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > > index 816ea3ae3e..a67a27d03c 100644
> > > --- a/hw/core/machine.c
> > > +++ b/hw/core/machine.c
> > > @@ -1155,6 +1155,14 @@ void machine_run_board_init(MachineState *machine)
> > > }
> > >
> > > if (machine->sgm) {
> > > + /*
> > > + * Where securable guest memory is initialized depends on the
> > > + * specific mechanism in use. But, we need to make sure it's
> > > + * ready by now. If it isn't, that's a bug in the
> > > + * implementation of that sgm mechanism.
> > > + */
> > > + assert(machine->sgm->ready);
> >
> > Under which circumstances might we arrive here with 'ready' not set?
> >
> > - programming error, setup is happening too late -> assert() seems
> > appropriate
>
> Yes, this is designed to catch programming errors. In particular I'm
> concerned about:
> * Re-arranging the init code, and either entirely forgetting the sgm
> setup, or accidentally moving it too late
> * The sgm setup is buried in the machine setup code, conditional on
> various things, and changes mean we no longer either call it or
> (correctly) fail
> * User has specified an sgm scheme designed for a machine type other
> than the one they selected. The arch/machine init code hasn't
> correctly accounted for that possibility and ignores it, instead
> of correctly throwing an error
>
> > - we tried to set it up, but some error happened -> should we rely on
> > the setup code to error out first? (i.e. we won't end up here, unless
> > there's a programming error, in which case the assert() looks
> > fine)
>
> Yes, that's my intention.
>
> > Is there a possible use case for "we could not set it up, but we
> > support an unsecured guest (as long as it is clear what happens)"?
>
> I don't think so. My feeling is that if you specify that you want the
> feature, qemu needs to either give it to you, or fail, not silently
> degrade the features presented to the guest.

Yes, that should align with what QEMU is doing elsewhere.

>
> > Likely only for guests that transition themselves, but one could
> > argue that QEMU should simply be invoked a second time without the
> > sgm stuff being specified in the error case.
>
> Right - I think whatever error we give here is likely to be easier to
> diagnose than the guest itself throwing an error when it fails to
> transition to secure mode (plus we should catch it always, rather than
> only if we run a guest which tries to go secure).

Yes, that makes sense.
On Thu, 17 Dec 2020 16:47:36 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Mon, Dec 14, 2020 at 06:22:40PM +0100, Cornelia Huck wrote:
> > On Fri, 4 Dec 2020 16:44:13 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > We haven't yet implemented the fairly involved handshaking that will be
> > > needed to migrate PEF protected guests. For now, just use a migration
> > > blocker so we get a meaningful error if someone attempts this (this is the
> > > same approach used by AMD SEV).
> > >
> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > ---
> > > hw/ppc/pef.c | 9 +++++++++
> > > 1 file changed, 9 insertions(+)
> > >
> > > diff --git a/hw/ppc/pef.c b/hw/ppc/pef.c
> > > index 3ae3059cfe..edc3e744ba 100644
> > > --- a/hw/ppc/pef.c
> > > +++ b/hw/ppc/pef.c
> > > @@ -38,7 +38,11 @@ struct PefGuestState {
> > > };
> > >
> > > #ifdef CONFIG_KVM
> > > +static Error *pef_mig_blocker;
> > > +
> > > static int kvmppc_svm_init(Error **errp)
> >
> > This looks weird?
>
> Oops. Not sure how that made it past even my rudimentary compile
> testing.
>
> > > +
> > > +int kvmppc_svm_init(SecurableGuestMemory *sgm, Error **errp)
> > > {
> > > if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) {
> > > error_setg(errp,
> > > @@ -54,6 +58,11 @@ static int kvmppc_svm_init(Error **errp)
> > > }
> > > }
> > >
> > > + /* add migration blocker */
> > > + error_setg(&pef_mig_blocker, "PEF: Migration is not implemented");
> > > + /* NB: This can fail if --only-migratable is used */
> > > + migrate_add_blocker(pef_mig_blocker, &error_fatal);
> >
> > Just so that I understand: is PEF something that is enabled by the host
> > (and the guest is either secured or doesn't start), or is it using a
> > model like s390x PV where the guest initiates the transition into
> > secured mode?
>
> Like s390x PV it's initiated by the guest.
>
> > Asking because s390x adds the migration blocker only when the
> > transition is actually happening (i.e. guests that do not transition
> > into secure mode remain migratable.) This has the side effect that you
> > might be able to start a machine with --only-migratable that
> > transitions into a non-migratable machine via a guest action, if I'm
> > not mistaken. Without the new object, I don't see a way to block with
> > --only-migratable; with it, we should be able to do that. Not sure what
> > the desirable behaviour is here.
>
> Hm, I'm not sure what the best option is here either.

If we agree on anything, it should be as consistent across
architectures as possible :)

If we want to add the migration blocker to s390x even before the
guest transitions, it needs to be tied to the new object; if we'd make
it dependent on the cpu feature bit, we'd block migration of all
machines on hardware with SE and a recent kernel.

Is there a convenient point in time when PEF guests transition where
QEMU can add a blocker?

>
> >
> > > +
> > > return 0;
> > > }
> > >
> >
>
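
The two policies being weighed here differ mainly in when a conflict with --only-migratable becomes visible. The following standalone C sketch models that difference. It is a toy model, not QEMU code: MigSketch and all function names are hypothetical, and it assumes only what the patch comment states, namely that registering a blocker can fail when --only-migratable is in effect. What QEMU does after such a failure is deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical migration state: the option plus a count of blockers. */
typedef struct MigSketch {
    bool only_migratable;   /* models --only-migratable */
    int blockers;           /* number of registered blockers */
} MigSketch;

/*
 * Toy counterpart of registering a migration blocker.
 * Returns -1 (and registers nothing) when only_migratable is set,
 * 0 otherwise.
 */
int mig_sketch_add_blocker(MigSketch *mig)
{
    assert(mig != NULL);
    if (mig->only_migratable) {
        return -1;
    }
    mig->blockers++;
    return 0;
}

/*
 * Policy A: the blocker is tied to the configuration object and added
 * at init time. A conflict with only_migratable shows up at startup.
 */
int start_with_blocker_at_init(MigSketch *mig)
{
    return mig_sketch_add_blocker(mig);
}

/*
 * Policy B: startup adds no blocker; the guest stays migratable until
 * it actually transitions into secure mode.
 */
int start_with_blocker_at_transition(MigSketch *mig)
{
    (void)mig;
    return 0;
}

/* Policy B, later: the guest-initiated transition adds the blocker. */
int enter_secure_mode(MigSketch *mig)
{
    return mig_sketch_add_blocker(mig);
}
```

Under policy A the conflict is reported before the guest runs; under policy B the same conflict only appears at the guest-triggered transition, which is the side effect raised in the message above.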
On Thu, 17 Dec 2020 17:21:16 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Dec 08, 2020 at 01:43:08PM +0100, Cornelia Huck wrote:
> > On Tue, 8 Dec 2020 13:57:28 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > On Fri, Dec 04, 2020 at 02:12:29PM +0100, Cornelia Huck wrote:
> > > > On Fri, 4 Dec 2020 13:07:27 +0000
> > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > >
> > > > > * Cornelia Huck (cohuck@redhat.com) wrote:
> > > > > > On Fri, 4 Dec 2020 09:06:50 +0100
> > > > > > Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> > > > > >
> > > > > > > On 04.12.20 06:44, David Gibson wrote:
> > > > > > > > A number of hardware platforms are implementing mechanisms whereby the
> > > > > > > > hypervisor does not have unfettered access to guest memory, in order
> > > > > > > > to mitigate the security impact of a compromised hypervisor.
> > > > > > > >
> > > > > > > > AMD's SEV implements this with in-cpu memory encryption, and Intel has
> > > > > > > > its own memory encryption mechanism. POWER has an upcoming mechanism
> > > > > > > > to accomplish this in a different way, using a new memory protection
> > > > > > > > level plus a small trusted ultravisor. s390 also has a protected
> > > > > > > > execution environment.
> > > > > > > >
> > > > > > > > The current code (committed or draft) for these features has each
> > > > > > > > platform's version configured entirely differently. That doesn't seem
> > > > > > > > ideal for users, or particularly for management layers.
> > > > > > > >
> > > > > > > > AMD SEV introduces a notionally generic machine option
> > > > > > > > "machine-encryption", but it doesn't actually cover any cases other
> > > > > > > > than SEV.
> > > > > > > >
> > > > > > > > This series is a proposal to at least partially unify configuration
> > > > > > > > for these mechanisms, by renaming and generalizing AMD's
> > > > > > > > "memory-encryption" property. It is replaced by a
> > > > > > > > "securable-guest-memory" property pointing to a platform specific
> > > > > > >
> > > > > > > Can we do "securable-guest" ?
> > > > > > > s390x also protects registers and integrity. memory is only one piece
> > > > > > > of the puzzle and what we protect might differ from platform to
> > > > > > > platform.
> > > > > > >
> > > > > >
> > > > > > I agree. Even technologies that currently only do memory encryption may
> > > > > > be enhanced with more protections later.
> > > > >
> > > > > There's already SEV-ES patches onlist for this on the SEV side.
> > > > >
> > > > > <sigh on haggling over the name>
> > > > >
> > > > > Perhaps 'confidential guest' is actually what we need, since the
> > > > > marketing folks seem to have started labelling this whole idea
> > > > > 'confidential computing'.
> > >
> > > That's not a bad idea, much as I usually hate marketing terms. But it
> > > does seem to be becoming a general term for this style of thing, and
> > > it doesn't overlap too badly with other terms ("secure" and
> > > "protected" are also used for hypervisor-from-guest and
> > > guest-from-guest protection).
> > >
> > > > It's more like a 'possibly confidential guest', though.
> > >
> > > Hmm. What about "Confidential Guest Facility" or "Confidential Guest
> > > Mechanism"? The implication being that the facility is there, whether
> > > or not the guest actually uses it.
> > >
> >
> > "Confidential Guest Enablement"? The others generally sound fine to me
> > as well, though; not sure if "Facility" might be a bit confusing, as
> > that term is already a bit overloaded.
>
> Well, "facility" is a bit overloaded, but IMO "enablement" is even
> more so. I think I'll go with "confidential guest support" in the
> next spin.
>

Works for me.
On Thu, 17 Dec 2020 12:38:42 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> On Thu, 17 Dec 2020 16:47:36 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > On Mon, Dec 14, 2020 at 06:22:40PM +0100, Cornelia Huck wrote:
> > > On Fri, 4 Dec 2020 16:44:13 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >
> > > > We haven't yet implemented the fairly involved handshaking that will be
> > > > needed to migrate PEF protected guests. For now, just use a migration
> > > > blocker so we get a meaningful error if someone attempts this (this is the
> > > > same approach used by AMD SEV).
> > > >
> > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > > ---
> > > > hw/ppc/pef.c | 9 +++++++++
> > > > 1 file changed, 9 insertions(+)
> > > >
> > > > diff --git a/hw/ppc/pef.c b/hw/ppc/pef.c
> > > > index 3ae3059cfe..edc3e744ba 100644
> > > > --- a/hw/ppc/pef.c
> > > > +++ b/hw/ppc/pef.c
> > > > @@ -38,7 +38,11 @@ struct PefGuestState {
> > > > };
> > > >
> > > > #ifdef CONFIG_KVM
> > > > +static Error *pef_mig_blocker;
> > > > +
> > > > static int kvmppc_svm_init(Error **errp)
> > >
> > > This looks weird?
> >
> > Oops. Not sure how that made it past even my rudimentary compile
> > testing.
> >
> > > > +
> > > > +int kvmppc_svm_init(SecurableGuestMemory *sgm, Error **errp)
> > > > {
> > > > if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) {
> > > > error_setg(errp,
> > > > @@ -54,6 +58,11 @@ static int kvmppc_svm_init(Error **errp)
> > > > }
> > > > }
> > > >
> > > > + /* add migration blocker */
> > > > + error_setg(&pef_mig_blocker, "PEF: Migration is not implemented");
> > > > + /* NB: This can fail if --only-migratable is used */
> > > > + migrate_add_blocker(pef_mig_blocker, &error_fatal);
> > >
> > > Just so that I understand: is PEF something that is enabled by the host
> > > (and the guest is either secured or doesn't start), or is it using a
> > > model like s390x PV where the guest initiates the transition into
> > > secured mode?
> >
> > Like s390x PV it's initiated by the guest.
> >
> > > Asking because s390x adds the migration blocker only when the
> > > transition is actually happening (i.e. guests that do not transition
> > > into secure mode remain migratable.) This has the side effect that you
> > > might be able to start a machine with --only-migratable that
> > > transitions into a non-migratable machine via a guest action, if I'm
> > > not mistaken. Without the new object, I don't see a way to block with
> > > --only-migratable; with it, we should be able to do that. Not sure what
> > > the desirable behaviour is here.
> >

The purpose of --only-migratable is specifically to prevent the machine
to transition to a non-migrate state IIUC. The guest transition to
secure mode should be nacked in this case.

> > Hm, I'm not sure what the best option is here either.
>
> If we agree on anything, it should be as consistent across
> architectures as possible :)
>
> If we want to add the migration blocker to s390x even before the guest
> transitions, it needs to be tied to the new object; if we'd make it
> dependent on the cpu feature bit, we'd block migration of all machines
> on hardware with SE and a recent kernel.
>
> Is there a convenient point in time when PEF guests transition where
> QEMU can add a blocker?
>
> > > > +
> > > > return 0;
> > > > }
> > > >
On Thu, 17 Dec 2020 15:15:30 +0100
Greg Kurz <groug@kaod.org> wrote:

> On Thu, 17 Dec 2020 12:38:42 +0100
> Cornelia Huck <cohuck@redhat.com> wrote:
>
> > On Thu, 17 Dec 2020 16:47:36 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > On Mon, Dec 14, 2020 at 06:22:40PM +0100, Cornelia Huck wrote:
> > > > On Fri, 4 Dec 2020 16:44:13 +1100
> > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > >
> > > > > We haven't yet implemented the fairly involved handshaking that will be
> > > > > needed to migrate PEF protected guests. For now, just use a migration
> > > > > blocker so we get a meaningful error if someone attempts this (this is the
> > > > > same approach used by AMD SEV).
> > > > >
> > > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > > > ---
> > > > > hw/ppc/pef.c | 9 +++++++++
> > > > > 1 file changed, 9 insertions(+)
> > > > >
> > > > > diff --git a/hw/ppc/pef.c b/hw/ppc/pef.c
> > > > > index 3ae3059cfe..edc3e744ba 100644
> > > > > --- a/hw/ppc/pef.c
> > > > > +++ b/hw/ppc/pef.c
> > > > > @@ -38,7 +38,11 @@ struct PefGuestState {
> > > > > };
> > > > >
> > > > > #ifdef CONFIG_KVM
> > > > > +static Error *pef_mig_blocker;
> > > > > +
> > > > > static int kvmppc_svm_init(Error **errp)
> > > >
> > > > This looks weird?
> > >
> > > Oops. Not sure how that made it past even my rudimentary compile
> > > testing.
> > >
> > > > > +
> > > > > +int kvmppc_svm_init(SecurableGuestMemory *sgm, Error **errp)
> > > > > {
> > > > > if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) {
> > > > > error_setg(errp,
> > > > > @@ -54,6 +58,11 @@ static int kvmppc_svm_init(Error **errp)
> > > > > }
> > > > > }
> > > > >
> > > > > + /* add migration blocker */
> > > > > + error_setg(&pef_mig_blocker, "PEF: Migration is not implemented");
> > > > > + /* NB: This can fail if --only-migratable is used */
> > > > > + migrate_add_blocker(pef_mig_blocker, &error_fatal);
> > > >
> > > > Just so that I understand: is PEF something that is enabled by the host
> > > > (and the guest is either secured or doesn't start), or is it using a
> > > > model like s390x PV where the guest initiates the transition into
> > > > secured mode?
> > >
> > > Like s390x PV it's initiated by the guest.
> > >
> > > > Asking because s390x adds the migration blocker only when the
> > > > transition is actually happening (i.e. guests that do not transition
> > > > into secure mode remain migratable.) This has the side effect that you
> > > > might be able to start a machine with --only-migratable that
> > > > transitions into a non-migratable machine via a guest action, if I'm
> > > > not mistaken. Without the new object, I don't see a way to block with
> > > > --only-migratable; with it, we should be able to do that. Not sure what
> > > > the desirable behaviour is here.
> > >
>
> The purpose of --only-migratable is specifically to prevent the machine
> to transition to a non-migrate state IIUC. The guest transition to
> secure mode should be nacked in this case.

Yes, that's what happens for s390x: The guest tries to transition, QEMU
can't add a migration blocker and fails the instruction used for
transitioning, the guest sees the error.

The drawback is that we see the failure only when we already launched
the machine and the guest tries to transition. If I start QEMU with
--only-migratable, it will refuse to start when non-migratable devices
are configured in the command line, so I see the issue right from the
start. (For s390x, that would possibly mean that we should not even
present the cpu feature bit when only_migratable is set?)

>
> > > Hm, I'm not sure what the best option is here either.
> >
> > If we agree on anything, it should be as consistent across
> > architectures as possible :)
> >
> > If we want to add the migration blocker to s390x even before the guest
> > transitions, it needs to be tied to the new object; if we'd make it
> > dependent on the cpu feature bit, we'd block migration of all machines
> > on hardware with SE and a recent kernel.
> >
> > Is there a convenient point in time when PEF guests transition where
> > QEMU can add a blocker?
> >
> > > > > +
> > > > > return 0;
> > > > > }
> > > > >
* Cornelia Huck (cohuck@redhat.com) wrote:
> On Thu, 17 Dec 2020 15:15:30 +0100
> Greg Kurz <groug@kaod.org> wrote:
>
> > On Thu, 17 Dec 2020 12:38:42 +0100
> > Cornelia Huck <cohuck@redhat.com> wrote:
> >
> > > On Thu, 17 Dec 2020 16:47:36 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >
> > > > On Mon, Dec 14, 2020 at 06:22:40PM +0100, Cornelia Huck wrote:
> > > > > On Fri, 4 Dec 2020 16:44:13 +1100
> > > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > > >
> > > > > > We haven't yet implemented the fairly involved handshaking that will be
> > > > > > needed to migrate PEF protected guests. For now, just use a migration
> > > > > > blocker so we get a meaningful error if someone attempts this (this is the
> > > > > > same approach used by AMD SEV).
> > > > > >
> > > > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > > > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > > > > ---
> > > > > > hw/ppc/pef.c | 9 +++++++++
> > > > > > 1 file changed, 9 insertions(+)
> > > > > >
> > > > > > diff --git a/hw/ppc/pef.c b/hw/ppc/pef.c
> > > > > > index 3ae3059cfe..edc3e744ba 100644
> > > > > > --- a/hw/ppc/pef.c
> > > > > > +++ b/hw/ppc/pef.c
> > > > > > @@ -38,7 +38,11 @@ struct PefGuestState {
> > > > > > };
> > > > > >
> > > > > > #ifdef CONFIG_KVM
> > > > > > +static Error *pef_mig_blocker;
> > > > > > +
> > > > > > static int kvmppc_svm_init(Error **errp)
> > > > >
> > > > > This looks weird?
> > > >
> > > > Oops. Not sure how that made it past even my rudimentary compile
> > > > testing.
> > > >
> > > > > > +
> > > > > > +int kvmppc_svm_init(SecurableGuestMemory *sgm, Error **errp)
> > > > > > {
> > > > > > if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) {
> > > > > > error_setg(errp,
> > > > > > @@ -54,6 +58,11 @@ static int kvmppc_svm_init(Error **errp)
> > > > > > }
> > > > > > }
> > > > > >
> > > > > > + /* add migration blocker */
> > > > > > + error_setg(&pef_mig_blocker, "PEF: Migration is not implemented");
> > > > > > + /* NB: This can fail if --only-migratable is used */
> > > > > > + migrate_add_blocker(pef_mig_blocker, &error_fatal);
> > > > >
> > > > > Just so that I understand: is PEF something that is enabled by the host
> > > > > (and the guest is either secured or doesn't start), or is it using a
> > > > > model like s390x PV where the guest initiates the transition into
> > > > > secured mode?
> > > >
> > > > Like s390x PV it's initiated by the guest.
> > > >
> > > > > Asking because s390x adds the migration blocker only when the
> > > > > transition is actually happening (i.e. guests that do not transition
> > > > > into secure mode remain migratable.) This has the side effect that you
> > > > > might be able to start a machine with --only-migratable that
> > > > > transitions into a non-migratable machine via a guest action, if I'm
> > > > > not mistaken. Without the new object, I don't see a way to block with
> > > > > --only-migratable; with it, we should be able to do that. Not sure what
> > > > > the desirable behaviour is here.
> > > >
> >
> > The purpose of --only-migratable is specifically to prevent the machine
> > to transition to a non-migrate state IIUC. The guest transition to
> > secure mode should be nacked in this case.
>
> Yes, that's what happens for s390x: The guest tries to transition, QEMU
> can't add a migration blocker and fails the instruction used for
> transitioning, the guest sees the error.
>
> The drawback is that we see the failure only when we already launched
> the machine and the guest tries to transition. If I start QEMU with
> --only-migratable, it will refuse to start when non-migratable devices
> are configured in the command line, so I see the issue right from the
> start. (For s390x, that would possibly mean that we should not even
> present the cpu feature bit when only_migratable is set?)

I see --only-migratable as refusing to start if you've enabled anything
that would stop migration. So I'd expect:

  a) Allow the cpu flag to be turned on/off somehow
  b) If you ask for it (-cpu ...,_confidentialcomp or whatever) and
     you've got --only-migratable then you'd fail before startup.

Dave

> >
> > > > Hm, I'm not sure what the best option is here either.
> > >
> > > If we agree on anything, it should be as consistent across
> > > architectures as possible :)
> > >
> > > If we want to add the migration blocker to s390x even before the guest
> > > transitions, it needs to be tied to the new object; if we'd make it
> > > dependent on the cpu feature bit, we'd block migration of all machines
> > > on hardware with SE and a recent kernel.
> > >
> > > Is there a convenient point in time when PEF guests transition where
> > > QEMU can add a blocker?
> > >
> > > > > > +
> > > > > > return 0;
> > > > > > }
> > > > > >
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
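
The two timings discussed above (blocker added at startup versus blocker
added when the guest transitions) can be contrasted with a small
standalone model. This is only a sketch under stated assumptions: it is
not QEMU code, and struct vm_model, model_add_blocker(), model_startup()
and model_guest_transition() are hypothetical names invented for
illustration. The only behaviour taken from the thread is that adding a
migration blocker is refused when --only-migratable is in effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in QEMU. */
struct vm_model {
    bool only_migratable;    /* stands in for --only-migratable */
    bool secure_requested;   /* secure-guest object or cpu flag was asked for */
    bool blocker_installed;  /* a migration blocker is active */
    bool secure;             /* guest has transitioned into secure mode */
};

/*
 * Stand-in for adding a migration blocker. It models just one behaviour
 * from the thread: the request is refused under --only-migratable.
 * Returns 0 on success, -1 on refusal.
 */
static int model_add_blocker(struct vm_model *vm)
{
    assert(vm != NULL);
    if (vm->only_migratable) {
        return -1;
    }
    vm->blocker_installed = true;
    return 0;
}

/* Early policy: reject the incompatible combination before the guest runs. */
static int model_startup(struct vm_model *vm)
{
    return vm->secure_requested ? model_add_blocker(vm) : 0;
}

/* Late policy: add the blocker only when the guest asks to go secure. */
static int model_guest_transition(struct vm_model *vm)
{
    if (model_add_blocker(vm) < 0) {
        return -1;  /* the transition instruction fails; guest sees an error */
    }
    vm->secure = true;
    return 0;
}
```

With only_migratable and secure_requested both set, model_startup()
fails immediately, which is the "fail before startup" outcome. With
only_migratable set but nothing requested up front, startup succeeds and
the failure only shows up later in model_guest_transition(), which is the
drawback described for the current s390x behaviour.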
On Fri, Dec 18, 2020 at 12:41:11PM +0100, Cornelia Huck wrote:
> On Thu, 17 Dec 2020 15:15:30 +0100
> Greg Kurz <groug@kaod.org> wrote:
>
> > On Thu, 17 Dec 2020 12:38:42 +0100
> > Cornelia Huck <cohuck@redhat.com> wrote:
> >
> > > On Thu, 17 Dec 2020 16:47:36 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >
> > > > On Mon, Dec 14, 2020 at 06:22:40PM +0100, Cornelia Huck wrote:
> > > > > On Fri, 4 Dec 2020 16:44:13 +1100
> > > > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > > > >
> > > > > > We haven't yet implemented the fairly involved handshaking that will be
> > > > > > needed to migrate PEF protected guests. For now, just use a migration
> > > > > > blocker so we get a meaningful error if someone attempts this (this is the
> > > > > > same approach used by AMD SEV).
> > > > > >
> > > > > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > > > > > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > > > > ---
> > > > > > hw/ppc/pef.c | 9 +++++++++
> > > > > > 1 file changed, 9 insertions(+)
> > > > > >
> > > > > > diff --git a/hw/ppc/pef.c b/hw/ppc/pef.c
> > > > > > index 3ae3059cfe..edc3e744ba 100644
> > > > > > --- a/hw/ppc/pef.c
> > > > > > +++ b/hw/ppc/pef.c
> > > > > > @@ -38,7 +38,11 @@ struct PefGuestState {
> > > > > > };
> > > > > >
> > > > > > #ifdef CONFIG_KVM
> > > > > > +static Error *pef_mig_blocker;
> > > > > > +
> > > > > > static int kvmppc_svm_init(Error **errp)
> > > > >
> > > > > This looks weird?
> > > >
> > > > Oops. Not sure how that made it past even my rudimentary compile
> > > > testing.
> > > >
> > > > > > +
> > > > > > +int kvmppc_svm_init(SecurableGuestMemory *sgm, Error **errp)
> > > > > > {
> > > > > > if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) {
> > > > > > error_setg(errp,
> > > > > > @@ -54,6 +58,11 @@ static int kvmppc_svm_init(Error **errp)
> > > > > > }
> > > > > > }
> > > > > >
> > > > > > + /* add migration blocker */
> > > > > > + error_setg(&pef_mig_blocker, "PEF: Migration is not implemented");
> > > > > > + /* NB: This can fail if --only-migratable is used */
> > > > > > + migrate_add_blocker(pef_mig_blocker, &error_fatal);
> > > > >
> > > > > Just so that I understand: is PEF something that is enabled by the host
> > > > > (and the guest is either secured or doesn't start), or is it using a
> > > > > model like s390x PV where the guest initiates the transition into
> > > > > secured mode?
> > > >
> > > > Like s390x PV it's initiated by the guest.
> > > >
> > > > > Asking because s390x adds the migration blocker only when the
> > > > > transition is actually happening (i.e. guests that do not transition
> > > > > into secure mode remain migratable.) This has the side effect that you
> > > > > might be able to start a machine with --only-migratable that
> > > > > transitions into a non-migratable machine via a guest action, if I'm
> > > > > not mistaken. Without the new object, I don't see a way to block with
> > > > > --only-migratable; with it, we should be able to do that. Not sure what
> > > > > the desirable behaviour is here.
> > > >
> >
> > The purpose of --only-migratable is specifically to prevent the machine
> > to transition to a non-migrate state IIUC. The guest transition to
> > secure mode should be nacked in this case.
>
> Yes, that's what happens for s390x: The guest tries to transition, QEMU
> can't add a migration blocker and fails the instruction used for
> transitioning, the guest sees the error.
>
> The drawback is that we see the failure only when we already launched
> the machine and the guest tries to transition. If I start QEMU with
> --only-migratable, it will refuse to start when non-migratable devices
> are configured in the command line, so I see the issue right from the
> start. (For s390x, that would possibly mean that we should not even
> present the cpu feature bit when only_migratable is set?)
What happens in s390x, if the guest tries to transition to secure, when
the secure object is NOT configured on the machine?
On PEF systems, the transition fails and the guest is terminated.
My point is -- QEMU will not be able to predict in advance, what the
guest might or might not do, regardless of what devices and objects are
configured in the machine. If the guest does something unexpected, it
has to be terminated.
So one possible design choice is to let the guest know that migration
must be facilitated. It can then decide if it wants to continue as a
normal VM or terminate itself, or take the plunge and switch to secure.
A well behaving guest will not switch to secure.
RP
On Sun, 3 Jan 2021 23:15:50 -0800
Ram Pai <linuxram@us.ibm.com> wrote:

> On Fri, Dec 18, 2020 at 12:41:11PM +0100, Cornelia Huck wrote:
> > On Thu, 17 Dec 2020 15:15:30 +0100
[..]
> > > > > > > +int kvmppc_svm_init(SecurableGuestMemory *sgm, Error **errp)
> > > > > > > {
> > > > > > > if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) {
> > > > > > > error_setg(errp,
> > > > > > > @@ -54,6 +58,11 @@ static int kvmppc_svm_init(Error **errp)
> > > > > > > }
> > > > > > > }
> > > > > > >
> > > > > > > + /* add migration blocker */
> > > > > > > + error_setg(&pef_mig_blocker, "PEF: Migration is not implemented");
> > > > > > > + /* NB: This can fail if --only-migratable is used */
> > > > > > > + migrate_add_blocker(pef_mig_blocker, &error_fatal);
> > > > > >
> > > > > > Just so that I understand: is PEF something that is enabled by the host
> > > > > > (and the guest is either secured or doesn't start), or is it using a
> > > > > > model like s390x PV where the guest initiates the transition into
> > > > > > secured mode?
> > > > >
> > > > > Like s390x PV it's initiated by the guest.
> > > > >
> > > > > > Asking because s390x adds the migration blocker only when the
> > > > > > transition is actually happening (i.e. guests that do not transition
> > > > > > into secure mode remain migratable.) This has the side effect that you
> > > > > > might be able to start a machine with --only-migratable that
> > > > > > transitions into a non-migratable machine via a guest action, if I'm
> > > > > > not mistaken. Without the new object, I don't see a way to block with
> > > > > > --only-migratable; with it, we should be able to do that. Not sure what
> > > > > > the desirable behaviour is here.
> > > > >
> > >
> > > The purpose of --only-migratable is specifically to prevent the machine
> > > to transition to a non-migrate state IIUC. The guest transition to
> > > secure mode should be nacked in this case.
> >
> > Yes, that's what happens for s390x: The guest tries to transition, QEMU
> > can't add a migration blocker and fails the instruction used for
> > transitioning, the guest sees the error.
> >
> > The drawback is that we see the failure only when we already launched
> > the machine and the guest tries to transition. If I start QEMU with
> > --only-migratable, it will refuse to start when non-migratable devices
> > are configured in the command line, so I see the issue right from the
> > start. (For s390x, that would possibly mean that we should not even
> > present the cpu feature bit when only_migratable is set?)
>
> What happens in s390x, if the guest tries to transition to secure, when
> the secure object is NOT configured on the machine?
>

Nothing in particular.

> On PEF systems, the transition fails and the guest is terminated.
>
> My point is -- QEMU will not be able to predict in advance, what the
> guest might or might not do, regardless of what devices and objects are
> configured in the machine. If the guest does something unexpected, it
> has to be terminated.

We can't fail transition to secure when the secure object is not
configured on the machine, because that would break pre-existing
setups. This feature is still to be shipped, but secure execution has
already been shipped, but without migration support.

That's why when you have both the secure object configured, and mandate
migratability, then we can fail. Actually we should fail now, because the
two options are not compatible: you can't have a qemu that is guaranteed
to be migratable, and guaranteed to be able to operate in secure
execution mode today. Failing early, and not on the guests opt-in would
be preferable.

After migration support is added, the combo should be fine, and probably
also the default for secure execution machines.

>
> So one possible design choice is to let the guest know that migration
> must be facilitated. It can then decide if it wants to continue as a
> normal VM or terminate itself, or take the plunge and switch to secure.
> A well behaving guest will not switch to secure.
>

I don't understand this point. Sorry.

Regards,
Halil

[..]
On Mon, Jan 04, 2021 at 01:46:29PM +0100, Halil Pasic wrote:
> On Sun, 3 Jan 2021 23:15:50 -0800
> Ram Pai <linuxram@us.ibm.com> wrote:
>
> > On Fri, Dec 18, 2020 at 12:41:11PM +0100, Cornelia Huck wrote:
> > > On Thu, 17 Dec 2020 15:15:30 +0100
> [..]
> > > > > > > > +int kvmppc_svm_init(SecurableGuestMemory *sgm, Error **errp)
> > > > > > > > {
> > > > > > > > if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) {
> > > > > > > > error_setg(errp,
> > > > > > > > @@ -54,6 +58,11 @@ static int kvmppc_svm_init(Error **errp)
> > > > > > > > }
> > > > > > > > }
> > > > > > > >
> > > > > > > > + /* add migration blocker */
> > > > > > > > + error_setg(&pef_mig_blocker, "PEF: Migration is not implemented");
> > > > > > > > + /* NB: This can fail if --only-migratable is used */
> > > > > > > > + migrate_add_blocker(pef_mig_blocker, &error_fatal);
> > > > > > >
> > > > > > > Just so that I understand: is PEF something that is enabled by the host
> > > > > > > (and the guest is either secured or doesn't start), or is it using a
> > > > > > > model like s390x PV where the guest initiates the transition into
> > > > > > > secured mode?
> > > > > >
> > > > > > Like s390x PV it's initiated by the guest.
> > > > > >
> > > > > > > Asking because s390x adds the migration blocker only when the
> > > > > > > transition is actually happening (i.e. guests that do not transition
> > > > > > > into secure mode remain migratable.) This has the side effect that you
> > > > > > > might be able to start a machine with --only-migratable that
> > > > > > > transitions into a non-migratable machine via a guest action, if I'm
> > > > > > > not mistaken. Without the new object, I don't see a way to block with
> > > > > > > --only-migratable; with it, we should be able to do that. Not sure what
> > > > > > > the desirable behaviour is here.
> > > > > >
> > > >
> > > > The purpose of --only-migratable is specifically to prevent the machine
> > > > to transition to a non-migrate state IIUC. The guest transition to
> > > > secure mode should be nacked in this case.
> > >
> > > Yes, that's what happens for s390x: The guest tries to transition, QEMU
> > > can't add a migration blocker and fails the instruction used for
> > > transitioning, the guest sees the error.
> > >
> > > The drawback is that we see the failure only when we already launched
> > > the machine and the guest tries to transition. If I start QEMU with
> > > --only-migratable, it will refuse to start when non-migratable devices
> > > are configured in the command line, so I see the issue right from the
> > > start. (For s390x, that would possibly mean that we should not even
> > > present the cpu feature bit when only_migratable is set?)
> >
> > What happens in s390x, if the guest tries to transition to secure, when
> > the secure object is NOT configured on the machine?
> >
>
> Nothing in particular.
>
> > On PEF systems, the transition fails and the guest is terminated.
> >
> > My point is -- QEMU will not be able to predict in advance, what the
> > guest might or might not do, regardless of what devices and objects are
> > configured in the machine. If the guest does something unexpected, it
> > has to be terminated.
>
> We can't fail transition to secure when the secure object is not
> configured on the machine, because that would break pre-existing
> setups.

So the instruction to switch-to-secure; which I believe is a ultracall
on S390, will return success even though the switch-to-secure has failed?
Will the guest continue as a normal guest or as a secure guest?

> This feature is still to be shipped, but secure execution has
> already been shipped, but without migration support.
>
> That's why when you have both the secure object configured, and mandate
> migratability, then we can fail. Actually we should fail now, because the
> two options are not compatible: you can't have a qemu that is guaranteed
> to be migratable, and guaranteed to be able to operate in secure
> execution mode today. Failing early, and not on the guests opt-in would
> be preferable.
>
> After migration support is added, the combo should be fine, and probably
> also the default for secure execution machines.
>
> >
> > So one possible design choice is to let the guest know that migration
> > must be facilitated. It can then decide if it wants to continue as a
> > normal VM or terminate itself, or take the plunge and switch to secure.
> > A well behaving guest will not switch to secure.
> >
>
> I don't understand this point. Sorry.

Qemu will present the 'must-support-migrate' and the 'secure-object' capability
to the guest.

The secure-aware guest, has three choices
(a) terminate itself. OR
(b) not call the switch-to-secure ucall, and continue as normal guest. OR
(c) call the switch-to-secure ucall.

Legacy guests which are not aware of secure-object, will continue to do
(b). New Guests which are secure-object aware, will observe that
'must-support-migrate' and 'secure-object' capabilities are
incompatible. Hence will choose (a) or (b), but will never choose
(c).

The main difference between my proposal and the other proposal is...

In my proposal the guest makes the compatibility decision and acts
accordingly. In the other proposal QEMU makes the compatibility
decision and acts accordingly. I argue that QEMU cannot make a good
compatibility decision, because it won't know in advance, if the guest
will or will-not switch-to-secure.

RP
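
The guest-side decision proposed above can be written down as a small
standalone function. This is a sketch under stated assumptions, not an
existing interface: the thread does not say how 'must-support-migrate'
and 'secure-object' would be presented to a guest, so struct guest_view
and guest_decide() are hypothetical, as are the wants_secure and
prefer_terminate fields (the latter is an assumed tie-break between
choices (a) and (b)). The handling of a secure-aware guest that sees no
'secure-object' capability is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; the thread does not define this interface. */
enum guest_choice {
    GUEST_TERMINATE,      /* (a) terminate itself */
    GUEST_STAY_NORMAL,    /* (b) continue as a normal guest */
    GUEST_SWITCH_SECURE   /* (c) call the switch-to-secure ucall */
};

struct guest_view {
    bool secure_aware;       /* guest understands the 'secure-object' capability */
    bool wants_secure;       /* guest was built to run in secure mode */
    bool cap_must_migrate;   /* host advertises 'must-support-migrate' */
    bool cap_secure_object;  /* host advertises 'secure-object' */
    bool prefer_terminate;   /* assumed tie-break between (a) and (b) */
};

static enum guest_choice guest_decide(const struct guest_view *g)
{
    assert(g != NULL);

    /* Legacy guests, or guests with no interest in secure mode, do (b). */
    if (!g->secure_aware || !g->wants_secure) {
        return GUEST_STAY_NORMAL;
    }

    /* Secure mode is available and nothing demands migratability: (c). */
    if (g->cap_secure_object && !g->cap_must_migrate) {
        return GUEST_SWITCH_SECURE;
    }

    /* Incompatible or missing capabilities: (a) or (b), never (c). */
    return g->prefer_terminate ? GUEST_TERMINATE : GUEST_STAY_NORMAL;
}
```

In this model QEMU only advertises the two capabilities; whether the
guest ends up secure is decided entirely by guest_decide(), which is the
difference from the alternative where QEMU rejects the configuration
itself.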
On Mon, 4 Jan 2021 10:40:26 -0800
Ram Pai <linuxram@us.ibm.com> wrote:

> On Mon, Jan 04, 2021 at 01:46:29PM +0100, Halil Pasic wrote:
> > On Sun, 3 Jan 2021 23:15:50 -0800
> > Ram Pai <linuxram@us.ibm.com> wrote:
> >
> > > On Fri, Dec 18, 2020 at 12:41:11PM +0100, Cornelia Huck wrote:
> > > > On Thu, 17 Dec 2020 15:15:30 +0100
> > [..]
> > > > > > > > > +int kvmppc_svm_init(SecurableGuestMemory *sgm, Error **errp)
> > > > > > > > >  {
> > > > > > > > >      if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) {
> > > > > > > > >          error_setg(errp,
> > > > > > > > > @@ -54,6 +58,11 @@ static int kvmppc_svm_init(Error **errp)
> > > > > > > > >          }
> > > > > > > > >      }
> > > > > > > > >
> > > > > > > > > +    /* add migration blocker */
> > > > > > > > > +    error_setg(&pef_mig_blocker, "PEF: Migration is not implemented");
> > > > > > > > > +    /* NB: This can fail if --only-migratable is used */
> > > > > > > > > +    migrate_add_blocker(pef_mig_blocker, &error_fatal);
> > > > > > > >
> > > > > > > > Just so that I understand: is PEF something that is enabled by the host
> > > > > > > > (and the guest is either secured or doesn't start), or is it using a
> > > > > > > > model like s390x PV where the guest initiates the transition into
> > > > > > > > secured mode?
> > > > > > >
> > > > > > > Like s390x PV it's initiated by the guest.
> > > > > > >
> > > > > > > > Asking because s390x adds the migration blocker only when the
> > > > > > > > transition is actually happening (i.e. guests that do not transition
> > > > > > > > into secure mode remain migratable.) This has the side effect that you
> > > > > > > > might be able to start a machine with --only-migratable that
> > > > > > > > transitions into a non-migratable machine via a guest action, if I'm
> > > > > > > > not mistaken. Without the new object, I don't see a way to block with
> > > > > > > > --only-migratable; with it, we should be able to do that. Not sure what
> > > > > > > > the desirable behaviour is here.
> > > > > > >
> > > > >
> > > > > The purpose of --only-migratable is specifically to prevent the machine
> > > > > to transition to a non-migrate state IIUC. The guest transition to
> > > > > secure mode should be nacked in this case.
> > > >
> > > > Yes, that's what happens for s390x: The guest tries to transition, QEMU
> > > > can't add a migration blocker and fails the instruction used for
> > > > transitioning, the guest sees the error.
> > > >
> > > > The drawback is that we see the failure only when we already launched
> > > > the machine and the guest tries to transition. If I start QEMU with
> > > > --only-migratable, it will refuse to start when non-migratable devices
> > > > are configured in the command line, so I see the issue right from the
> > > > start. (For s390x, that would possibly mean that we should not even
> > > > present the cpu feature bit when only_migratable is set?)
> > >
> > > What happens in s390x, if the guest tries to transition to secure, when
> > > the secure object is NOT configured on the machine?
> > >
> >
> > Nothing in particular.
> >
> > > On PEF systems, the transition fails and the guest is terminated.
> > >
> > > My point is -- QEMU will not be able to predict in advance, what the
> > > guest might or might not do, regardless of what devices and objects are
> > > configured in the machine. If the guest does something unexpected, it
> > > has to be terminated.
> >
> > We can't fail transition to secure when the secure object is not
> > configured on the machine, because that would break pre-existing
> > setups.
>
> So the instruction to switch-to-secure; which I believe is an ultracall
> on S390,

Yes it is an ultravisor call.

> will return success even though the switch-to-secure has failed?

No, I don't think so.

> Will the guest continue as a normal guest or as a secure guest?
>

I think the guest will give up. It definitely can't continue as secure
because the conversion to secure failed. And it should not continue as
non-secure because that's not what the user asked for.

I'm not sure you got my point. My point is: we may not break existing
setups when adding new features. Secure execution can work without secure
object today, and what works today shall keep working tomorrow and
beyond.

> > This feature is still to be shipped, but secure execution has
> > already been shipped, but without migration support.
> >
> > That's why when you have both the secure object configured, and mandate
> > migratability, then we can fail. Actually we should fail now, because the
> > two options are not compatible: you can't have a qemu that is guaranteed
> > to be migratable, and guaranteed to be able to operate in secure
> > execution mode today. Failing early, and not on the guest's opt-in would
> > be preferable.
> >
> > After migration support is added, the combo should be fine, and probably
> > also the default for secure execution machines.
> >
> > >
> > > So one possible design choice is to let the guest know that migration
> > > must be facilitated. It can then decide if it wants to continue as a
> > > normal VM or terminate itself, or take the plunge and switch to secure.
> > > A well behaving guest will not switch to secure.
> > >
> >
> > I don't understand this point. Sorry.
>
> Qemu will present the 'must-support-migrate' and the 'secure-object' capability
> to the guest.

How does the qemu present the 'must-support-migrate' and the
'secure-object' capability to the guest on (PPC and especially on s390)? And
please clarify what do you mean by 'secure-object'. I used to believe I
understood, but now I have the feeling I don't understand.

>
> The secure-aware guest, has three choices
>    (a) terminate itself. OR
>    (b) not call the switch-to-secure ucall, and continue as normal guest. OR
>    (c) call the switch-to-secure ucall.
>
> Legacy guests which are not aware of secure-object, will continue to do
> (b).
> New Guests which are secure-object aware, will observe that
> 'must-support-migrate' and 'secure-object' capabilities are
> incompatible. Hence will choose (a) or (b), but will never choose
> (c).
>

The first problem is, IMHO, that you want to expose QEMU internals to the
guest. For the guest, there is no such thing as 'must-support-migrate'
(AFAIK).

The other problem is, that migration and secure are not inherently
incompatible. On s390x it is the property of the current host
implementation, that we can't do migration for secure. But this can
change in the future.

>
>
> The main difference between my proposal and the other proposal is...
>
>   In my proposal the guest makes the compatibility decision and acts
>   accordingly. In the other proposal QEMU makes the compatibility
>   decision and acts accordingly. I argue that QEMU cannot make a good
>   compatibility decision, because it won't know in advance, if the guest
>   will or will-not switch-to-secure.
>

You have a point there when you say that QEMU does not know in advance,
if the guest will or will-not switch-to-secure. I made that argument
regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
was to flip that property on demand when the conversion occurs. David
explained to me that this is not possible for ppc, and that having the
"securable-guest-memory" property (or whatever the name will be)
specified is a strong indication, that the VM is intended to be used as
a secure VM (thus it is OK to hurt the case where the guest does not
try to transition). That argument applies here as well.

But more importantly, as I explained above, the guest does not know if
migration and secure are incompatible or not. So the guest can't make a
good decision.

Regards,
Halil
On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> On Mon, 4 Jan 2021 10:40:26 -0800
> Ram Pai <linuxram@us.ibm.com> wrote:
>
> > On Mon, Jan 04, 2021 at 01:46:29PM +0100, Halil Pasic wrote:
> > > On Sun, 3 Jan 2021 23:15:50 -0800
> > > Ram Pai <linuxram@us.ibm.com> wrote:
> > >
> > > > On Fri, Dec 18, 2020 at 12:41:11PM +0100, Cornelia Huck wrote:
> > > > > On Thu, 17 Dec 2020 15:15:30 +0100
> > > [..]
> > > > > > > > > > +int kvmppc_svm_init(SecurableGuestMemory *sgm, Error **errp)
> > > > > > > > > >  {
> > > > > > > > > >      if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) {
> > > > > > > > > >          error_setg(errp,
> > > > > > > > > > @@ -54,6 +58,11 @@ static int kvmppc_svm_init(Error **errp)
> > > > > > > > > >          }
> > > > > > > > > >      }
> > > > > > > > > >
> > > > > > > > > > +    /* add migration blocker */
> > > > > > > > > > +    error_setg(&pef_mig_blocker, "PEF: Migration is not implemented");
> > > > > > > > > > +    /* NB: This can fail if --only-migratable is used */
> > > > > > > > > > +    migrate_add_blocker(pef_mig_blocker, &error_fatal);
> > > > > > > > >
> > > > > > > > > Just so that I understand: is PEF something that is enabled by the host
> > > > > > > > > (and the guest is either secured or doesn't start), or is it using a
> > > > > > > > > model like s390x PV where the guest initiates the transition into
> > > > > > > > > secured mode?
> > > > > > > >
> > > > > > > > Like s390x PV it's initiated by the guest.
> > > > > > > >
> > > > > > > > > Asking because s390x adds the migration blocker only when the
> > > > > > > > > transition is actually happening (i.e. guests that do not transition
> > > > > > > > > into secure mode remain migratable.) This has the side effect that you
> > > > > > > > > might be able to start a machine with --only-migratable that
> > > > > > > > > transitions into a non-migratable machine via a guest action, if I'm
> > > > > > > > > not mistaken. Without the new object, I don't see a way to block with
> > > > > > > > > --only-migratable; with it, we should be able to do that. Not sure what
> > > > > > > > > the desirable behaviour is here.
> > > > > > > >
> > > > > >
> > > > > > The purpose of --only-migratable is specifically to prevent the machine
> > > > > > to transition to a non-migrate state IIUC. The guest transition to
> > > > > > secure mode should be nacked in this case.
> > > > >
> > > > > Yes, that's what happens for s390x: The guest tries to transition, QEMU
> > > > > can't add a migration blocker and fails the instruction used for
> > > > > transitioning, the guest sees the error.
> > > > >
> > > > > The drawback is that we see the failure only when we already launched
> > > > > the machine and the guest tries to transition. If I start QEMU with
> > > > > --only-migratable, it will refuse to start when non-migratable devices
> > > > > are configured in the command line, so I see the issue right from the
> > > > > start. (For s390x, that would possibly mean that we should not even
> > > > > present the cpu feature bit when only_migratable is set?)
> > > >
> > > > What happens in s390x, if the guest tries to transition to secure, when
> > > > the secure object is NOT configured on the machine?
> > > >
> > >
> > > Nothing in particular.
> > >
> > > > On PEF systems, the transition fails and the guest is terminated.
> > > >
> > > > My point is -- QEMU will not be able to predict in advance, what the
> > > > guest might or might not do, regardless of what devices and objects are
> > > > configured in the machine. If the guest does something unexpected, it
> > > > has to be terminated.
> > >
> > > We can't fail transition to secure when the secure object is not
> > > configured on the machine, because that would break pre-existing
> > > setups.
> >
> > So the instruction to switch-to-secure; which I believe is an ultracall
> > on S390,
>
> Yes it is an ultravisor call.
>
> > will return success even though the switch-to-secure has failed?
>
> No, I don't think so.
>
> > Will the guest continue as a normal guest or as a secure guest?
> >
>
> I think the guest will give up. It definitely can't continue as secure
> because the conversion to secure failed. And it should not continue as
> non-secure because that's not what the user asked for.
>
> I'm not sure you got my point. My point is: we may not break existing
> setups when adding new features. Secure execution can work without secure
> object today, and what works today shall keep working tomorrow and
> beyond.
>
> > > This feature is still to be shipped, but secure execution has
> > > already been shipped, but without migration support.
> > >
> > > That's why when you have both the secure object configured, and mandate
> > > migratability, then we can fail. Actually we should fail now, because the
> > > two options are not compatible: you can't have a qemu that is guaranteed
> > > to be migratable, and guaranteed to be able to operate in secure
> > > execution mode today. Failing early, and not on the guest's opt-in would
> > > be preferable.
> > >
> > > After migration support is added, the combo should be fine, and probably
> > > also the default for secure execution machines.
> > >
> > > >
> > > > So one possible design choice is to let the guest know that migration
> > > > must be facilitated. It can then decide if it wants to continue as a
> > > > normal VM or terminate itself, or take the plunge and switch to secure.
> > > > A well behaving guest will not switch to secure.
> > > >
> > >
> > > I don't understand this point. Sorry.
> >
> > Qemu will present the 'must-support-migrate' and the 'secure-object' capability
> > to the guest.
>
> How does the qemu present the 'must-support-migrate' and the
> 'secure-object' capability to the guest on (PPC and especially on s390)?

This can be modeled with device tree properties on PPC. However, I
figure, my proposal has its own flaws; as admitted below.

> And
> please clarify what do you mean by 'secure-object'. I used to believe I
> understood, but now I have the feeling I don't understand.

It's the feature that enables the machine to be capable of running secure
guests.

>
> >
> > The secure-aware guest, has three choices
> >    (a) terminate itself. OR
> >    (b) not call the switch-to-secure ucall, and continue as normal guest. OR
> >    (c) call the switch-to-secure ucall.
> >
> > Legacy guests which are not aware of secure-object, will continue to do
> > (b).
> > New Guests which are secure-object aware, will observe that
> > 'must-support-migrate' and 'secure-object' capabilities are
> > incompatible. Hence will choose (a) or (b), but will never choose
> > (c).
> >
>
> The first problem is, IMHO, that you want to expose QEMU internals to the
> guest. For the guest, there is no such thing as 'must-support-migrate'
> (AFAIK).

right. good point. The key point is, migration must be transparent to
the guest. And that is where; I realize, my proposal falters.

>
> The other problem is, that migration and secure are not inherently
> incompatible. On s390x it is the property of the current host
> implementation, that we can't do migration for secure. But this can
> change in the future.
>
> >
> >
> > The main difference between my proposal and the other proposal is...
> >
> >   In my proposal the guest makes the compatibility decision and acts
> >   accordingly. In the other proposal QEMU makes the compatibility
> >   decision and acts accordingly. I argue that QEMU cannot make a good
> >   compatibility decision, because it won't know in advance, if the guest
> >   will or will-not switch-to-secure.
> >
>
> You have a point there when you say that QEMU does not know in advance,
> if the guest will or will-not switch-to-secure. I made that argument
> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> was to flip that property on demand when the conversion occurs. David
> explained to me that this is not possible for ppc, and that having the
> "securable-guest-memory" property (or whatever the name will be)
> specified is a strong indication, that the VM is intended to be used as
> a secure VM (thus it is OK to hurt the case where the guest does not
> try to transition). That argument applies here as well.

As suggested by Cornelia Huck, what if QEMU disabled the
"securable-guest-memory" property if 'must-support-migrate' is enabled?
Of course, this has to be done with a big fat warning stating
"securable-guest-memory" feature is disabled on the machine.
Doing so, will continue to support guest that do not try to transition.
Guest that try to transition will fail and terminate themselves.

>
> But more importantly, as I explained above, the guest does not know if
> migration and secure are incompatible or not. So the guest can't make a
> good decision.

Agree.

RP
On Fri, Dec 04, 2020 at 04:44:12PM +1100, David Gibson wrote:
> Some upcoming POWER machines have a system called PEF (Protected
> Execution Facility) which uses a small ultravisor to allow guests to
> run in a way that they can't be eavesdropped on by the hypervisor. The
> effect is roughly similar to AMD SEV, although the mechanisms are
> quite different.
>
> Most of the work of this is done between the guest, KVM and the
> ultravisor, with little need for involvement by qemu. However qemu
> does need to tell KVM to allow secure VMs.
>
> Because the availability of secure mode is a guest visible difference
> which depends on having the right hardware and firmware, we don't
> enable this by default. In order to run a secure guest you need to
> create a "pef-guest" object and set the securable-guest-memory machine
> property to point to it.
>
> Note that this just *allows* secure guests; the architecture of PEF is
> such that the guest still needs to talk to the ultravisor to enter
> secure mode. Qemu has no direct way of knowing if the guest is in
> secure mode, and certainly can't know until well after machine
> creation time.
>
> To start a PEF-capable guest, use the command line options:
> -object pef-guest,id=pef0 -machine securable-guest-memory=pef0
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> Acked-by: Ram Pai <linuxram@us.ibm.com>
> ---
> hw/ppc/meson.build | 1 +
> hw/ppc/pef.c | 115 +++++++++++++++++++++++++++++++++++++++++++
> hw/ppc/spapr.c | 10 ++++
> include/hw/ppc/pef.h | 26 ++++++++++
> target/ppc/kvm.c | 18 -------
> target/ppc/kvm_ppc.h | 6 ---
> 6 files changed, 152 insertions(+), 24 deletions(-)
> create mode 100644 hw/ppc/pef.c
> create mode 100644 include/hw/ppc/pef.h
>
> diff --git a/hw/ppc/meson.build b/hw/ppc/meson.build
> index ffa2ec37fa..218631c883 100644
> --- a/hw/ppc/meson.build
> +++ b/hw/ppc/meson.build
> @@ -27,6 +27,7 @@ ppc_ss.add(when: 'CONFIG_PSERIES', if_true: files(
> 'spapr_nvdimm.c',
> 'spapr_rtas_ddw.c',
> 'spapr_numa.c',
> + 'pef.c',
> ))
> ppc_ss.add(when: 'CONFIG_SPAPR_RNG', if_true: files('spapr_rng.c'))
> ppc_ss.add(when: ['CONFIG_PSERIES', 'CONFIG_LINUX'], if_true: files(
> diff --git a/hw/ppc/pef.c b/hw/ppc/pef.c
> new file mode 100644
> index 0000000000..3ae3059cfe
> --- /dev/null
> +++ b/hw/ppc/pef.c
> @@ -0,0 +1,115 @@
> +/*
> + * PEF (Protected Execution Facility) for POWER support
> + *
> + * Copyright David Gibson, Redhat Inc. 2020
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +
> +#include "qapi/error.h"
> +#include "qom/object_interfaces.h"
> +#include "sysemu/kvm.h"
> +#include "migration/blocker.h"
> +#include "exec/securable-guest-memory.h"
> +#include "hw/ppc/pef.h"
> +
> +#define TYPE_PEF_GUEST "pef-guest"
> +#define PEF_GUEST(obj)                                  \
> +    OBJECT_CHECK(PefGuestState, (obj), TYPE_PEF_GUEST)
> +
> +typedef struct PefGuestState PefGuestState;
> +
> +/**
> + * PefGuestState:
> + *
> + * The PefGuestState object is used for creating and managing a PEF
> + * guest.
> + *
> + * # $QEMU \
> + * -object pef-guest,id=pef0 \
> + * -machine ...,securable-guest-memory=pef0
> + */
> +struct PefGuestState {
> +    Object parent_obj;
> +};
> +
> +#ifdef CONFIG_KVM
> +static int kvmppc_svm_init(Error **errp)
> +{
> +    if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) {
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
KVM defines this macro as KVM_CAP_PPC_SECURE_GUEST. Unless we patch KVM,
we are stuck with KVM_CAP_PPC_SECURE_GUEST.

RP
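
For reference, the capability name can be checked outside of QEMU with a
small standalone probe. This is only a sketch, not code from the series:
the helper name probe_ppc_secure_guest() is made up for illustration, it
queries the system-level /dev/kvm descriptor rather than a VM descriptor,
and the numeric fallback is only there so the snippet still builds against
kernel headers that predate the capability.

```c
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Fallback for older kernel headers (assumption: matches linux/kvm.h). */
#ifndef KVM_CAP_PPC_SECURE_GUEST
#define KVM_CAP_PPC_SECURE_GUEST 181
#endif

/*
 * Hypothetical helper: returns -1 if /dev/kvm cannot be opened or the
 * ioctl fails, 0 if the capability is not reported, and a positive
 * value if the host kernel reports it.
 */
static int probe_ppc_secure_guest(void)
{
    int fd = open("/dev/kvm", O_RDWR);
    int ret;

    if (fd < 0) {
        return -1;
    }
    /* Note the spelling: SECURE, not SECURABLE. */
    ret = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_SECURE_GUEST);
    close(fd);
    return ret;
}
```

On a host without KVM, or on a non-POWER host, the probe is expected to
return -1 or 0; inside QEMU the equivalent check would go through
kvm_check_extension() as in the quoted patch.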
On Tue, Jan 05, 2021 at 03:34:38PM -0800, Ram Pai wrote:
> On Fri, Dec 04, 2020 at 04:44:12PM +1100, David Gibson wrote:
> > Some upcoming POWER machines have a system called PEF (Protected
> > Execution Facility) which uses a small ultravisor to allow guests to
> > run in a way that they can't be eavesdropped on by the hypervisor. The
> > effect is roughly similar to AMD SEV, although the mechanisms are
> > quite different.
> >
> > Most of the work of this is done between the guest, KVM and the
> > ultravisor, with little need for involvement by qemu. However qemu
> > does need to tell KVM to allow secure VMs.
> >
> > Because the availability of secure mode is a guest visible difference
> > which depends on having the right hardware and firmware, we don't
> > enable this by default. In order to run a secure guest you need to
> > create a "pef-guest" object and set the securable-guest-memory machine
> > property to point to it.
> >
> > Note that this just *allows* secure guests; the architecture of PEF is
> > such that the guest still needs to talk to the ultravisor to enter
> > secure mode. Qemu has no direct way of knowing if the guest is in
> > secure mode, and certainly can't know until well after machine
> > creation time.
> >
> > To start a PEF-capable guest, use the command line options:
> >     -object pef-guest,id=pef0 -machine securable-guest-memory=pef0
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > Acked-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  hw/ppc/meson.build | 1 +
> >  hw/ppc/pef.c | 115 +++++++++++++++++++++++++++++++++++++++++++
> >  hw/ppc/spapr.c | 10 ++++
> >  include/hw/ppc/pef.h | 26 ++++++++++
> >  target/ppc/kvm.c | 18 -------
> >  target/ppc/kvm_ppc.h | 6 ---
> >  6 files changed, 152 insertions(+), 24 deletions(-)
> >  create mode 100644 hw/ppc/pef.c
> >  create mode 100644 include/hw/ppc/pef.h
> >
> > diff --git a/hw/ppc/meson.build b/hw/ppc/meson.build
> > index ffa2ec37fa..218631c883 100644
> > --- a/hw/ppc/meson.build
> > +++ b/hw/ppc/meson.build
> > @@ -27,6 +27,7 @@ ppc_ss.add(when: 'CONFIG_PSERIES', if_true: files(
> >    'spapr_nvdimm.c',
> >    'spapr_rtas_ddw.c',
> >    'spapr_numa.c',
> > +  'pef.c',
> >  ))
> >  ppc_ss.add(when: 'CONFIG_SPAPR_RNG', if_true: files('spapr_rng.c'))
> >  ppc_ss.add(when: ['CONFIG_PSERIES', 'CONFIG_LINUX'], if_true: files(
> > diff --git a/hw/ppc/pef.c b/hw/ppc/pef.c
> > new file mode 100644
> > index 0000000000..3ae3059cfe
> > --- /dev/null
> > +++ b/hw/ppc/pef.c
> > @@ -0,0 +1,115 @@
> > +/*
> > + * PEF (Protected Execution Facility) for POWER support
> > + *
> > + * Copyright David Gibson, Redhat Inc. 2020
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +
> > +#include "qapi/error.h"
> > +#include "qom/object_interfaces.h"
> > +#include "sysemu/kvm.h"
> > +#include "migration/blocker.h"
> > +#include "exec/securable-guest-memory.h"
> > +#include "hw/ppc/pef.h"
> > +
> > +#define TYPE_PEF_GUEST "pef-guest"
> > +#define PEF_GUEST(obj)                                  \
> > +    OBJECT_CHECK(PefGuestState, (obj), TYPE_PEF_GUEST)
> > +
> > +typedef struct PefGuestState PefGuestState;
> > +
> > +/**
> > + * PefGuestState:
> > + *
> > + * The PefGuestState object is used for creating and managing a PEF
> > + * guest.
> > + *
> > + * # $QEMU \
> > + *         -object pef-guest,id=pef0 \
> > + *         -machine ...,securable-guest-memory=pef0
> > + */
> > +struct PefGuestState {
> > +    Object parent_obj;
> > +};
> > +
> > +#ifdef CONFIG_KVM
> > +static int kvmppc_svm_init(Error **errp)
> > +{
> > +    if (!kvm_check_extension(kvm_state, KVM_CAP_PPC_SECURABLE_GUEST)) {
>                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> KVM defines this macro as KVM_CAP_PPC_SECURE_GUEST. Unless we patch KVM,
> we are stuck with KVM_CAP_PPC_SECURE_GUEST.

Oops, made an over-zealous search and replace. Fixed now.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
On Fri, Dec 04, 2020 at 02:10:05PM +0100, Cornelia Huck wrote:
> On Fri, 4 Dec 2020 16:44:05 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > At the moment AMD SEV sets a special function pointer, plus an opaque
> > handle in KVMState to let things know how to encrypt guest memory.
> >
> > Now that we have a QOM interface for handling things related to securable
> > guest memory, use a QOM method on that interface, rather than a bare
> > function pointer for this.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> > ---
> >  accel/kvm/kvm-all.c | 36 +++++---
> >  accel/kvm/sev-stub.c | 9 +-
> >  include/exec/securable-guest-memory.h | 2 +
> >  include/sysemu/sev.h | 5 +-
> >  target/i386/monitor.c | 1 -
> >  target/i386/sev.c | 116 ++++++++++----------------
> >  6 files changed, 77 insertions(+), 92 deletions(-)
> >
>
> > @@ -224,7 +224,7 @@ int kvm_get_max_memslots(void)
> >
> >  bool kvm_memcrypt_enabled(void)
> >  {
> > -    if (kvm_state && kvm_state->memcrypt_handle) {
> > +    if (kvm_state && kvm_state->sgm) {
>
> If we want to generalize the concept, maybe check for encrypt_data in
> sgm here? There's probably room for different callbacks in the sgm
> structure.

I don't think it's worth changing here. This gets changed again in
patch 6, I'll adjust to clarify a bit what's going on there.

>
> >          return true;
> >      }
> >
>

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
On Tue, 5 Jan 2021 12:41:25 -0800
Ram Pai <linuxram@us.ibm.com> wrote:

> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > On Mon, 4 Jan 2021 10:40:26 -0800
> > Ram Pai <linuxram@us.ibm.com> wrote:

> > > The main difference between my proposal and the other proposal is...
> > >
> > >   In my proposal the guest makes the compatibility decision and acts
> > >   accordingly. In the other proposal QEMU makes the compatibility
> > >   decision and acts accordingly. I argue that QEMU cannot make a good
> > >   compatibility decision, because it won't know in advance, if the guest
> > >   will or will-not switch-to-secure.
> > >
> >
> > You have a point there when you say that QEMU does not know in advance,
> > if the guest will or will-not switch-to-secure. I made that argument
> > regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > was to flip that property on demand when the conversion occurs. David
> > explained to me that this is not possible for ppc, and that having the
> > "securable-guest-memory" property (or whatever the name will be)
> > specified is a strong indication, that the VM is intended to be used as
> > a secure VM (thus it is OK to hurt the case where the guest does not
> > try to transition). That argument applies here as well.
>
> As suggested by Cornelia Huck, what if QEMU disabled the
> "securable-guest-memory" property if 'must-support-migrate' is enabled?
> Of course, this has to be done with a big fat warning stating
> "securable-guest-memory" feature is disabled on the machine.
> Doing so, will continue to support guest that do not try to transition.
> Guest that try to transition will fail and terminate themselves.

Just to recap the s390x situation:

- We currently offer a cpu feature that indicates secure execution to
  be available to the guest if the host supports it.
- When we introduce the secure object, we still need to support
  previous configurations and continue to offer the cpu feature, even
  if the secure object is not specified.
- As migration is currently not supported for secured guests, we add a
  blocker once the guest actually transitions. That means that
  transition fails if --only-migratable was specified on the command
  line. (Guests not transitioning will obviously not notice anything.)
- With the secure object, we will already fail starting QEMU if
  --only-migratable was specified.

My suggestion is now that we don't even offer the cpu feature if
--only-migratable has been specified. For a guest that does not want to
transition to secure mode, nothing changes; a guest that wants to
transition to secure mode will notice that the feature is not available
and fail appropriately (or ultimately, when the ultravisor call fails).
We'd still fail starting QEMU for the secure object + --only-migratable
combination.

Does that make sense?
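
The policy recapped above can be written down as a small decision model.
This is only an illustrative sketch under stated assumptions: struct
vm_config, check_startup() and offer_cpu_feature() are hypothetical names
that do not exist in QEMU, and the model covers just the two decisions
discussed in the recap (fail at startup, hide the cpu feature).

```c
#include <stdbool.h>

/* Hypothetical model of a VM configuration; not a QEMU structure. */
struct vm_config {
    bool secure_object;    /* securable-guest-memory object configured */
    bool only_migratable;  /* --only-migratable given on the command line */
    bool host_supports_se; /* host offers secure execution */
};

enum start_result { START_OK, START_FAIL };

/*
 * Startup check: the secure object together with --only-migratable is
 * rejected before the guest ever runs.
 */
static enum start_result check_startup(const struct vm_config *cfg)
{
    if (cfg->secure_object && cfg->only_migratable) {
        return START_FAIL;
    }
    return START_OK;
}

/*
 * Suggested refinement: do not even offer the cpu feature when
 * --only-migratable is set. Without that flag the feature stays
 * available even when no secure object is configured, so existing
 * setups keep working.
 */
static bool offer_cpu_feature(const struct vm_config *cfg)
{
    return cfg->host_supports_se && !cfg->only_migratable;
}
```

With this model, a guest started with --only-migratable and no secure
object boots normally but never sees the feature, so it cannot reach the
point where adding the migration blocker would fail.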
On 12/4/20 6:44 AM, David Gibson wrote:
> From: Greg Kurz <groug@kaod.org>
>
> Global properties have an @optional field, which allows a given
> property to be applied to a given type even if one of its subclasses
> doesn't support it. This is especially used in the compat code when
> dealing with the "disable-modern" and "disable-legacy" properties and
> the "virtio-pci" type.
>
> Allow object_register_sugar_prop() to set this field as well.
>
> Signed-off-by: Greg Kurz <groug@kaod.org>
> Message-Id: <159738953558.377274.16617742952571083440.stgit@bahia.lan>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> include/qom/object.h | 3 ++-
> qom/object.c | 4 +++-
> softmmu/vl.c | 16 ++++++++++------
> 3 files changed, 15 insertions(+), 8 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
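
As a rough illustration of what an "optional" global property means, the
toy model below applies a property to an object and silently skips it when
the object does not have that property, but only if the entry is marked
optional. All names here (toy_object, toy_global_prop, toy_apply_global)
are invented for the sketch; it is not the QOM implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy object: a fixed table of property names and current values. */
struct toy_object {
    const char *names[4];
    const char *values[4];
    size_t count;
};

/* Toy global property entry, mirroring the idea of an @optional flag. */
struct toy_global_prop {
    const char *property;
    const char *value;
    bool optional;
};

/*
 * Returns 0 if the property was applied, or if it was missing but
 * optional; returns -1 if a non-optional property is missing.
 */
static int toy_apply_global(struct toy_object *obj,
                            const struct toy_global_prop *gp)
{
    for (size_t i = 0; i < obj->count; i++) {
        if (strcmp(obj->names[i], gp->property) == 0) {
            obj->values[i] = gp->value;
            return 0;
        }
    }
    return gp->optional ? 0 : -1;
}
```

The point of the flag is visible in the last line: a missing property is
an error by default, and only an explicit opt-in turns it into a no-op.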
On 12/4/20 6:44 AM, David Gibson wrote:
> Currently the "memory-encryption" property is only looked at once we
> get to kvm_init(). Although protection of guest memory from the
> hypervisor isn't something that could really ever work with TCG, it's
> not conceptually tied to the KVM accelerator.
>
> In addition, the way the string property is resolved to an object is
> almost identical to how a QOM link property is handled.
>
> So, create a new "securable-guest-memory" link property which sets
> this QOM interface link directly in the machine. For compatibility we
> keep the "memory-encryption" property, but now implemented in terms of
> the new property.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/kvm/kvm-all.c | 22 ++++++----------------
> hw/core/machine.c | 43 +++++++++++++++++++++++++++++++++++++------
> include/hw/boards.h | 2 +-
> 3 files changed, 44 insertions(+), 23 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
On 12/4/20 6:44 AM, David Gibson wrote:
> The kvm_memcrypt_enabled() and kvm_memcrypt_encrypt_data() helper functions
> don't conceptually have any connection to KVM (although it's not possible
> in practice to use them without it).
>
> They also rely on looking at the global KVMState. But the same information
> is available from the machine, and the only existing callers have natural
> access to the machine state.
>
> Therefore, move and rename them to helpers in securable-guest-memory.h,
> taking an explicit machine parameter.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  accel/kvm/kvm-all.c | 27 --------------------
>  accel/stubs/kvm-stub.c | 10 --------
>  hw/i386/pc_sysfw.c | 6 +++--
>  include/exec/securable-guest-memory.h | 36 +++++++++++++++++++++++++++
>  include/sysemu/kvm.h | 17 -------------
>  5 files changed, 40 insertions(+), 56 deletions(-)
...
> +static inline int securable_guest_memory_encrypt(MachineState *machine,
> +                                              uint8_t *ptr, uint64_t len)
> +{
> +    SecurableGuestMemory *sgm = machine->sgm;
> +
> +    if (sgm) {
> +        SecurableGuestMemoryClass *sgmc = SECURABLE_GUEST_MEMORY_GET_CLASS(sgm);
> +
> +        if (sgmc->encrypt_data) {

Can this ever happen? Maybe use assert(sgmc->encrypt_data) instead?

Otherwise:
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> +            return sgmc->encrypt_data(sgm, ptr, len);
> +        }
> +    }
> +
> +    return 1;
> +}
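
To make the two options in this review comment concrete, here is a
standalone sketch contrasting them. It is a simplification, not the QEMU
code: SgmModel folds the object and its class into one hypothetical struct,
and xor_encrypt() is a toy callback used only to make the example
observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the sgm object plus its class. */
typedef struct SgmModel SgmModel;
struct SgmModel {
    int (*encrypt_data)(SgmModel *sgm, uint8_t *ptr, uint64_t len);
};

/* Variant as in the patch: a missing callback is tolerated, returns 1. */
static int encrypt_checked(SgmModel *sgm, uint8_t *ptr, uint64_t len)
{
    if (sgm && sgm->encrypt_data) {
        return sgm->encrypt_data(sgm, ptr, len);
    }
    return 1;
}

/* Variant as suggested in review: a missing callback is a bug. */
static int encrypt_asserted(SgmModel *sgm, uint8_t *ptr, uint64_t len)
{
    if (!sgm) {
        return 1;
    }
    assert(sgm->encrypt_data);
    return sgm->encrypt_data(sgm, ptr, len);
}

/* Toy callback: flips every bit so the effect can be checked. */
static int xor_encrypt(SgmModel *sgm, uint8_t *ptr, uint64_t len)
{
    (void)sgm;
    for (uint64_t i = 0; i < len; i++) {
        ptr[i] ^= 0xff;
    }
    return 0;
}
```

The checked variant suits a design where some platforms legitimately
provide no encrypt callback; the asserted variant documents that every
implementation is expected to provide one.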
On Mon, Jan 11, 2021 at 05:59:14PM +0100, Cornelia Huck wrote:
> On Tue, 5 Jan 2021 12:41:25 -0800
> Ram Pai <linuxram@us.ibm.com> wrote:
>
> > On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > > On Mon, 4 Jan 2021 10:40:26 -0800
> > > Ram Pai <linuxram@us.ibm.com> wrote:
>
> > > > The main difference between my proposal and the other proposal is...
> > > >
> > > >   In my proposal the guest makes the compatibility decision and acts
> > > >   accordingly. In the other proposal QEMU makes the compatibility
> > > >   decision and acts accordingly. I argue that QEMU cannot make a good
> > > >   compatibility decision, because it won't know in advance, if the guest
> > > >   will or will-not switch-to-secure.
> > > >
> > >
> > > You have a point there when you say that QEMU does not know in advance,
> > > if the guest will or will-not switch-to-secure. I made that argument
> > > regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > > was to flip that property on demand when the conversion occurs. David
> > > explained to me that this is not possible for ppc, and that having the
> > > "securable-guest-memory" property (or whatever the name will be)
> > > specified is a strong indication, that the VM is intended to be used as
> > > a secure VM (thus it is OK to hurt the case where the guest does not
> > > try to transition). That argument applies here as well.
> >
> > As suggested by Cornelia Huck, what if QEMU disabled the
> > "securable-guest-memory" property if 'must-support-migrate' is enabled?
> > Of course, this has to be done with a big fat warning stating
> > "securable-guest-memory" feature is disabled on the machine.
> > Doing so, will continue to support guest that do not try to transition.
> > Guest that try to transition will fail and terminate themselves.
>
> Just to recap the s390x situation:
>
> - We currently offer a cpu feature that indicates secure execution to
>   be available to the guest if the host supports it.
> - When we introduce the secure object, we still need to support
>   previous configurations and continue to offer the cpu feature, even
>   if the secure object is not specified.
> - As migration is currently not supported for secured guests, we add a
>   blocker once the guest actually transitions. That means that
>   transition fails if --only-migratable was specified on the command
>   line. (Guests not transitioning will obviously not notice anything.)
> - With the secure object, we will already fail starting QEMU if
>   --only-migratable was specified.
>
> My suggestion is now that we don't even offer the cpu feature if
> --only-migratable has been specified. For a guest that does not want to
> transition to secure mode, nothing changes; a guest that wants to
> transition to secure mode will notice that the feature is not available
> and fail appropriately (or ultimately, when the ultravisor call fails).

On POWER, secure-execution is not **automatically** enabled even when
the host supports it. The feature is enabled only if the secure-object
is configured, and the host supports it.

However the behavior proposed above will be consistent on POWER and
on s390x, when '--only-migratable' is specified and 'secure-object'
is NOT specified.

So I am in agreement till now.

> We'd still fail starting QEMU for the secure object + --only-migratable
> combination.

Why fail? Instead, print a warning and disable the secure-object; which
will disable your cpu-feature. Guests that do not transition to secure,
will continue to operate, and guests that transition to secure, will
fail.

RP
On Fri, Dec 04, 2020 at 09:50:05AM +0000, Daniel P. Berrangé wrote:
> On Fri, Dec 04, 2020 at 04:44:02PM +1100, David Gibson wrote:
> > A number of hardware platforms are implementing mechanisms whereby the
> > hypervisor does not have unfettered access to guest memory, in order
> > to mitigate the security impact of a compromised hypervisor.
> >
> > AMD's SEV implements this with in-cpu memory encryption, and Intel has
> > its own memory encryption mechanism. POWER has an upcoming mechanism
> > to accomplish this in a different way, using a new memory protection
> > level plus a small trusted ultravisor. s390 also has a protected
> > execution environment.
> >
> > The current code (committed or draft) for these features has each
> > platform's version configured entirely differently. That doesn't seem
> > ideal for users, or particularly for management layers.
> >
> > AMD SEV introduces a notionally generic machine option
> > "machine-encryption", but it doesn't actually cover any cases other
> > than SEV.
> >
> > This series is a proposal to at least partially unify configuration
> > for these mechanisms, by renaming and generalizing AMD's
> > "memory-encryption" property. It is replaced by a
> > "securable-guest-memory" property pointing to a platform specific
> > object which configures and manages the specific details.
>
> There's no docs updated or added in this series.
>
> docs/amd-memory-encryption.txt needs an update at least, and
> there ought to be a doc added describing how this series is
> to be used for s390/ppc

Fair point, I've made a bunch of doc updates for the next spin.

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
On Mon, Jan 11, 2021 at 07:13:27PM +0100, Philippe Mathieu-Daudé wrote:
> On 12/4/20 6:44 AM, David Gibson wrote:
> > The kvm_memcrypt_enabled() and kvm_memcrypt_encrypt_data() helper functions
> > don't conceptually have any connection to KVM (although it's not possible
> > in practice to use them without it).
> >
> > They also rely on looking at the global KVMState. But the same information
> > is available from the machine, and the only existing callers have natural
> > access to the machine state.
> >
> > Therefore, move and rename them to helpers in securable-guest-memory.h,
> > taking an explicit machine parameter.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> > ---
> >  accel/kvm/kvm-all.c                   | 27 --------------------
> >  accel/stubs/kvm-stub.c                | 10 --------
> >  hw/i386/pc_sysfw.c                    |  6 +++--
> >  include/exec/securable-guest-memory.h | 36 +++++++++++++++++++++++++++
> >  include/sysemu/kvm.h                  | 17 -------------
> >  5 files changed, 40 insertions(+), 56 deletions(-)
> ...
>
> > +static inline int securable_guest_memory_encrypt(MachineState *machine,
> > +                                                 uint8_t *ptr, uint64_t len)
> > +{
> > +    SecurableGuestMemory *sgm = machine->sgm;
> > +
> > +    if (sgm) {
> > +        SecurableGuestMemoryClass *sgmc = SECURABLE_GUEST_MEMORY_GET_CLASS(sgm);
> > +
> > +        if (sgmc->encrypt_data) {
>
> Can this ever happen? Maybe use assert(sgmc->encrypt_data) instead?

It's made moot by changes in the next spin.

>
> Otherwise:
> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>
> > +            return sgmc->encrypt_data(sgm, ptr, len);
> > +        }
> > +    }
> > +
> > +    return 1;
> > +}
>

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
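The review question in this exchange (tolerate a missing encrypt_data hook, or assert that it exists) can be illustrated with a small standalone sketch. This is not the code from the series, and the author notes it becomes moot in the next spin. SgmModel, SgmClassModel, sgm_encrypt_tolerant(), sgm_encrypt_strict() and xor_encrypt() are hypothetical stand-ins for the QOM types in the patch, and the XOR hook is only a placeholder for real encryption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the QOM object and class in the patch. */
typedef struct SgmModel SgmModel;

typedef struct SgmClassModel {
    /* Optional hook: returns 0 on success, non-zero on failure. */
    int (*encrypt_data)(SgmModel *sgm, uint8_t *ptr, uint64_t len);
} SgmClassModel;

struct SgmModel {
    const SgmClassModel *klass;
};

/*
 * Tolerant variant, shaped like the quoted patch: a missing object or a
 * missing hook both fall through to the generic "return 1".
 */
static int sgm_encrypt_tolerant(SgmModel *sgm, uint8_t *ptr, uint64_t len)
{
    if (sgm && sgm->klass->encrypt_data) {
        return sgm->klass->encrypt_data(sgm, ptr, len);
    }
    return 1;
}

/*
 * Strict variant, shaped like the review suggestion: a configured object
 * without the hook is treated as a programming error.
 */
static int sgm_encrypt_strict(SgmModel *sgm, uint8_t *ptr, uint64_t len)
{
    if (!sgm) {
        return 1;
    }
    assert(sgm->klass->encrypt_data);
    return sgm->klass->encrypt_data(sgm, ptr, len);
}

/* Toy hook: XOR with 0xFF stands in for real encryption. */
static int xor_encrypt(SgmModel *sgm, uint8_t *ptr, uint64_t len)
{
    (void)sgm;
    for (uint64_t i = 0; i < len; i++) {
        ptr[i] ^= 0xFF;
    }
    return 0;
}
```

The two variants differ only when an object is configured but its class lacks the hook: the tolerant one reports a generic failure, the strict one aborts.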
On Fri, Dec 04, 2020 at 02:10:05PM +0100, Cornelia Huck wrote:
> On Fri, 4 Dec 2020 16:44:05 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > At the moment AMD SEV sets a special function pointer, plus an opaque
> > handle in KVMState to let things know how to encrypt guest memory.
> >
> > Now that we have a QOM interface for handling things related to securable
> > guest memory, use a QOM method on that interface, rather than a bare
> > function pointer for this.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> > ---
> >  accel/kvm/kvm-all.c                   |  36 +++++---
> >  accel/kvm/sev-stub.c                  |   9 +-
> >  include/exec/securable-guest-memory.h |   2 +
> >  include/sysemu/sev.h                  |   5 +-
> >  target/i386/monitor.c                 |   1 -
> >  target/i386/sev.c                     | 116 ++++++++++----------------
> >  6 files changed, 77 insertions(+), 92 deletions(-)
> >
>
> > @@ -224,7 +224,7 @@ int kvm_get_max_memslots(void)
> >
> >  bool kvm_memcrypt_enabled(void)
> >  {
> > -    if (kvm_state && kvm_state->memcrypt_handle) {
> > +    if (kvm_state && kvm_state->sgm) {
>
> If we want to generalize the concept, maybe check for encrypt_data in
> sgm here? There's probably room for different callbacks in the sgm
> structure.

Actually, I've realised this isn't even as general as it pretends to
be now, so I've taken a different approach for the next spin.

>
> >          return true;
> >      }
> >
>

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
On Mon, 11 Jan 2021 11:58:30 -0800
Ram Pai <linuxram@us.ibm.com> wrote:

> On Mon, Jan 11, 2021 at 05:59:14PM +0100, Cornelia Huck wrote:
> > On Tue, 5 Jan 2021 12:41:25 -0800
> > Ram Pai <linuxram@us.ibm.com> wrote:
> >
> > > On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > > > On Mon, 4 Jan 2021 10:40:26 -0800
> > > > Ram Pai <linuxram@us.ibm.com> wrote:
> >
> > > > > The main difference between my proposal and the other proposal is...
> > > > >
> > > > > In my proposal the guest makes the compatibility decision and acts
> > > > > accordingly. In the other proposal QEMU makes the compatibility
> > > > > decision and acts accordingly. I argue that QEMU cannot make a good
> > > > > compatibility decision, because it wont know in advance, if the guest
> > > > > will or will-not switch-to-secure.
> > > > >
> > > >
> > > > You have a point there when you say that QEMU does not know in advance,
> > > > if the guest will or will-not switch-to-secure. I made that argument
> > > > regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > > > was to flip that property on demand when the conversion occurs. David
> > > > explained to me that this is not possible for ppc, and that having the
> > > > "securable-guest-memory" property (or whatever the name will be)
> > > > specified is a strong indication, that the VM is intended to be used as
> > > > a secure VM (thus it is OK to hurt the case where the guest does not
> > > > try to transition). That argument applies here as well.
> > >
> > > As suggested by Cornelia Huck, what if QEMU disabled the
> > > "securable-guest-memory" property if 'must-support-migrate' is enabled?
> > > Offcourse; this has to be done with a big fat warning stating
> > > "secure-guest-memory" feature is disabled on the machine.
> > > Doing so, will continue to support guest that do not try to transition.
> > > Guest that try to transition will fail and terminate themselves.
> >
> > Just to recap the s390x situation:
> >
> > - We currently offer a cpu feature that indicates secure execution to
> > be available to the guest if the host supports it.
> > - When we introduce the secure object, we still need to support
> > previous configurations and continue to offer the cpu feature, even
> > if the secure object is not specified.
> > - As migration is currently not supported for secured guests, we add a
> > blocker once the guest actually transitions. That means that
> > transition fails if --only-migratable was specified on the command
> > line. (Guests not transitioning will obviously not notice anything.)
> > - With the secure object, we will already fail starting QEMU if
> > --only-migratable was specified.
> >
> > My suggestion is now that we don't even offer the cpu feature if
> > --only-migratable has been specified. For a guest that does not want to
> > transition to secure mode, nothing changes; a guest that wants to
> > transition to secure mode will notice that the feature is not available
> > and fail appropriately (or ultimately, when the ultravisor call fails).
>
>
> On POWER, secure-execution is not **automatically** enabled even when
> the host supports it. The feature is enabled only if the secure-object
> is configured, and the host supports it.

Yes, the cpu feature on s390x is simply pre-existing.

>
> However the behavior proposed above will be consistent on POWER and
> on s390x, when '--only-migratable' is specified and 'secure-object'
> is NOT specified.
>
> So I am in agreement till now.
>
>
> > We'd still fail starting QEMU for the secure object + --only-migratable
> > combination.
>
> Why fail?
>
> Instead, print a warning and disable the secure-object; which will
> disable your cpu-feature. Guests that do not transition to secure, will
> continue to operate, and guests that transition to secure, will fail.

But that would be consistent with how other non-migratable objects are
handled, no? It's simply a case of incompatible options on the command
line.
On Tue, Jan 12, 2021 at 09:19:43AM +0100, Cornelia Huck wrote:
> On Mon, 11 Jan 2021 11:58:30 -0800
> Ram Pai <linuxram@us.ibm.com> wrote:
>
> > On Mon, Jan 11, 2021 at 05:59:14PM +0100, Cornelia Huck wrote:
> > > On Tue, 5 Jan 2021 12:41:25 -0800
> > > Ram Pai <linuxram@us.ibm.com> wrote:
> > >
> > > > On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > > > > On Mon, 4 Jan 2021 10:40:26 -0800
> > > > > Ram Pai <linuxram@us.ibm.com> wrote:
> > >
> > > > > > The main difference between my proposal and the other proposal is...
> > > > > >
> > > > > > In my proposal the guest makes the compatibility decision and acts
> > > > > > accordingly. In the other proposal QEMU makes the compatibility
> > > > > > decision and acts accordingly. I argue that QEMU cannot make a good
> > > > > > compatibility decision, because it wont know in advance, if the guest
> > > > > > will or will-not switch-to-secure.
> > > > > >
> > > > >
> > > > > You have a point there when you say that QEMU does not know in advance,
> > > > > if the guest will or will-not switch-to-secure. I made that argument
> > > > > regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > > > > was to flip that property on demand when the conversion occurs. David
> > > > > explained to me that this is not possible for ppc, and that having the
> > > > > "securable-guest-memory" property (or whatever the name will be)
> > > > > specified is a strong indication, that the VM is intended to be used as
> > > > > a secure VM (thus it is OK to hurt the case where the guest does not
> > > > > try to transition). That argument applies here as well.
> > > >
> > > > As suggested by Cornelia Huck, what if QEMU disabled the
> > > > "securable-guest-memory" property if 'must-support-migrate' is enabled?
> > > > Offcourse; this has to be done with a big fat warning stating
> > > > "secure-guest-memory" feature is disabled on the machine.
> > > > Doing so, will continue to support guest that do not try to transition.
> > > > Guest that try to transition will fail and terminate themselves.
> > >
> > > Just to recap the s390x situation:
> > >
> > > - We currently offer a cpu feature that indicates secure execution to
> > > be available to the guest if the host supports it.
> > > - When we introduce the secure object, we still need to support
> > > previous configurations and continue to offer the cpu feature, even
> > > if the secure object is not specified.
> > > - As migration is currently not supported for secured guests, we add a
> > > blocker once the guest actually transitions. That means that
> > > transition fails if --only-migratable was specified on the command
> > > line. (Guests not transitioning will obviously not notice anything.)
> > > - With the secure object, we will already fail starting QEMU if
> > > --only-migratable was specified.
> > >
> > > My suggestion is now that we don't even offer the cpu feature if
> > > --only-migratable has been specified. For a guest that does not want to
> > > transition to secure mode, nothing changes; a guest that wants to
> > > transition to secure mode will notice that the feature is not available
> > > and fail appropriately (or ultimately, when the ultravisor call fails).
> >
> >
> > On POWER, secure-execution is not **automatically** enabled even when
> > the host supports it. The feature is enabled only if the secure-object
> > is configured, and the host supports it.
>
> Yes, the cpu feature on s390x is simply pre-existing.
>
> >
> > However the behavior proposed above will be consistent on POWER and
> > on s390x, when '--only-migratable' is specified and 'secure-object'
> > is NOT specified.
> >
> > So I am in agreement till now.
> >
> >
> > > We'd still fail starting QEMU for the secure object + --only-migratable
> > > combination.
> >
> > Why fail?
> >
> > Instead, print a warning and disable the secure-object; which will
> > disable your cpu-feature. Guests that do not transition to secure, will
> > continue to operate, and guests that transition to secure, will fail.
>
> But that would be consistent with how other non-migratable objects are
> handled, no? It's simply a case of incompatible options on the command
> line.
Actually the two options are inherently NOT incompatible. Halil also
mentioned this in one of his replies.

It's just that the current implementation is lacking, which will be fixed
in the near future.

We can design it upfront, with the assumption that both are compatible.
In the short term, disable one, preferably the secure-object, if both
options are specified. In the long term, remove the restriction, when
the implementation is complete.
--
Ram Pai
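The two positions in this sub-thread (refuse to start when both options are given, or warn and drop the secure object) can be sketched as a small standalone policy function. This is only an illustration of the disagreement, not QEMU code: StartupConfig, ConflictPolicy and resolve_conflict() are hypothetical names, and the message texts are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of the two command-line options under discussion. */
typedef struct StartupConfig {
    bool only_migratable;   /* --only-migratable was given */
    bool secure_object;     /* a securable-guest-memory object was given */
} StartupConfig;

typedef enum ConflictPolicy {
    POLICY_FAIL_STARTUP,     /* treat the combination as incompatible */
    POLICY_WARN_AND_DISABLE  /* drop the secure object, with a warning */
} ConflictPolicy;

/*
 * Returns 0 if startup may proceed, -1 if it must be refused.
 * Under POLICY_WARN_AND_DISABLE the secure object is cleared in place.
 */
static int resolve_conflict(StartupConfig *cfg, ConflictPolicy policy)
{
    assert(cfg != NULL);

    if (!(cfg->only_migratable && cfg->secure_object)) {
        return 0; /* no conflict, nothing to decide */
    }
    if (policy == POLICY_FAIL_STARTUP) {
        fprintf(stderr, "error: secure object conflicts with "
                        "--only-migratable\n");
        return -1;
    }
    fprintf(stderr, "warning: secure object disabled because "
                    "--only-migratable was given\n");
    cfg->secure_object = false;
    return 0;
}
```

Under the warn-and-disable policy a guest that never transitions keeps running unchanged, while a guest that tries to transition finds the feature absent, which is the behaviour argued for in the message above.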
On Tue, 12 Jan 2021 10:55:11 -0800
Ram Pai <linuxram@us.ibm.com> wrote:
> On Tue, Jan 12, 2021 at 09:19:43AM +0100, Cornelia Huck wrote:
> > On Mon, 11 Jan 2021 11:58:30 -0800
> > Ram Pai <linuxram@us.ibm.com> wrote:
> >
> > > On Mon, Jan 11, 2021 at 05:59:14PM +0100, Cornelia Huck wrote:
> > > > On Tue, 5 Jan 2021 12:41:25 -0800
> > > > Ram Pai <linuxram@us.ibm.com> wrote:
> > > >
> > > > > On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > > > > > On Mon, 4 Jan 2021 10:40:26 -0800
> > > > > > Ram Pai <linuxram@us.ibm.com> wrote:
> > > >
> > > > > > > The main difference between my proposal and the other proposal is...
> > > > > > >
> > > > > > > In my proposal the guest makes the compatibility decision and acts
> > > > > > > accordingly. In the other proposal QEMU makes the compatibility
> > > > > > > decision and acts accordingly. I argue that QEMU cannot make a good
> > > > > > > compatibility decision, because it wont know in advance, if the guest
> > > > > > > will or will-not switch-to-secure.
> > > > > > >
> > > > > >
> > > > > > You have a point there when you say that QEMU does not know in advance,
> > > > > > if the guest will or will-not switch-to-secure. I made that argument
> > > > > > regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > > > > > was to flip that property on demand when the conversion occurs. David
> > > > > > explained to me that this is not possible for ppc, and that having the
> > > > > > "securable-guest-memory" property (or whatever the name will be)
> > > > > > specified is a strong indication, that the VM is intended to be used as
> > > > > > a secure VM (thus it is OK to hurt the case where the guest does not
> > > > > > try to transition). That argument applies here as well.
> > > > >
> > > > > As suggested by Cornelia Huck, what if QEMU disabled the
> > > > > "securable-guest-memory" property if 'must-support-migrate' is enabled?
> > > > > Offcourse; this has to be done with a big fat warning stating
> > > > > "secure-guest-memory" feature is disabled on the machine.
> > > > > Doing so, will continue to support guest that do not try to transition.
> > > > > Guest that try to transition will fail and terminate themselves.
> > > >
> > > > Just to recap the s390x situation:
> > > >
> > > > - We currently offer a cpu feature that indicates secure execution to
> > > > be available to the guest if the host supports it.
> > > > - When we introduce the secure object, we still need to support
> > > > previous configurations and continue to offer the cpu feature, even
> > > > if the secure object is not specified.
> > > > - As migration is currently not supported for secured guests, we add a
> > > > blocker once the guest actually transitions. That means that
> > > > transition fails if --only-migratable was specified on the command
> > > > line. (Guests not transitioning will obviously not notice anything.)
> > > > - With the secure object, we will already fail starting QEMU if
> > > > --only-migratable was specified.
> > > >
> > > > My suggestion is now that we don't even offer the cpu feature if
> > > > --only-migratable has been specified. For a guest that does not want to
> > > > transition to secure mode, nothing changes; a guest that wants to
> > > > transition to secure mode will notice that the feature is not available
> > > > and fail appropriately (or ultimately, when the ultravisor call fails).
> > >
> > >
> > > On POWER, secure-execution is not **automatically** enabled even when
> > > the host supports it. The feature is enabled only if the secure-object
> > > is configured, and the host supports it.
> >
> > Yes, the cpu feature on s390x is simply pre-existing.
> >
> > >
> > > However the behavior proposed above will be consistent on POWER and
> > > on s390x, when '--only-migratable' is specified and 'secure-object'
> > > is NOT specified.
> > >
> > > So I am in agreement till now.
> > >
> > >
> > > > We'd still fail starting QEMU for the secure object + --only-migratable
> > > > combination.
> > >
> > > Why fail?
> > >
> > > Instead, print a warning and disable the secure-object; which will
> > > disable your cpu-feature. Guests that do not transition to secure, will
> > > continue to operate, and guests that transition to secure, will fail.
> >
> > But that would be consistent with how other non-migratable objects are
> > handled, no? It's simply a case of incompatible options on the command
> > line.
>
> Actually the two options are inherently NOT incompatible. Halil also
> mentioned this in one of his replies.
>
> It's just that the current implementation is lacking, which will be fixed
> in the near future.
>
> We can design it upfront, with the assumption that both are compatible.
> In the short term, disable one, preferably the secure-object, if both
> options are specified. In the long term, remove the restriction, when
> the implementation is complete.
Can't we simply mark the object as non-migratable now, and then remove
that later? I don't see what is so special about it.
* Cornelia Huck (cohuck@redhat.com) wrote:
> On Tue, 5 Jan 2021 12:41:25 -0800
> Ram Pai <linuxram@us.ibm.com> wrote:
>
> > On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > > On Mon, 4 Jan 2021 10:40:26 -0800
> > > Ram Pai <linuxram@us.ibm.com> wrote:
>
> > > > The main difference between my proposal and the other proposal is...
> > > >
> > > > In my proposal the guest makes the compatibility decision and acts
> > > > accordingly. In the other proposal QEMU makes the compatibility
> > > > decision and acts accordingly. I argue that QEMU cannot make a good
> > > > compatibility decision, because it wont know in advance, if the guest
> > > > will or will-not switch-to-secure.
> > > >
> > >
> > > You have a point there when you say that QEMU does not know in advance,
> > > if the guest will or will-not switch-to-secure. I made that argument
> > > regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > > was to flip that property on demand when the conversion occurs. David
> > > explained to me that this is not possible for ppc, and that having the
> > > "securable-guest-memory" property (or whatever the name will be)
> > > specified is a strong indication, that the VM is intended to be used as
> > > a secure VM (thus it is OK to hurt the case where the guest does not
> > > try to transition). That argument applies here as well.
> >
> > As suggested by Cornelia Huck, what if QEMU disabled the
> > "securable-guest-memory" property if 'must-support-migrate' is enabled?
> > Offcourse; this has to be done with a big fat warning stating
> > "secure-guest-memory" feature is disabled on the machine.
> > Doing so, will continue to support guest that do not try to transition.
> > Guest that try to transition will fail and terminate themselves.
>
> Just to recap the s390x situation:
>
> - We currently offer a cpu feature that indicates secure execution to
> be available to the guest if the host supports it.
> - When we introduce the secure object, we still need to support
> previous configurations and continue to offer the cpu feature, even
> if the secure object is not specified.
> - As migration is currently not supported for secured guests, we add a
> blocker once the guest actually transitions. That means that
> transition fails if --only-migratable was specified on the command
> line. (Guests not transitioning will obviously not notice anything.)
> - With the secure object, we will already fail starting QEMU if
> --only-migratable was specified.
>
> My suggestion is now that we don't even offer the cpu feature if
> --only-migratable has been specified. For a guest that does not want to
> transition to secure mode, nothing changes; a guest that wants to
> transition to secure mode will notice that the feature is not available
> and fail appropriately (or ultimately, when the ultravisor call fails).
> We'd still fail starting QEMU for the secure object + --only-migratable
> combination.
>
> Does that make sense?
It's a little unusual; I don't think we have any other cases where
--only-migratable changes the behaviour; I think it normally only stops
you doing something that would have made it unmigratable or causes
an operation that would make it unmigratable to fail.
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On 13.01.21 13:42, Dr. David Alan Gilbert wrote:
> * Cornelia Huck (cohuck@redhat.com) wrote:
>> On Tue, 5 Jan 2021 12:41:25 -0800
>> Ram Pai <linuxram@us.ibm.com> wrote:
>>
>>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
>>>> On Mon, 4 Jan 2021 10:40:26 -0800
>>>> Ram Pai <linuxram@us.ibm.com> wrote:
>>
>>>>> The main difference between my proposal and the other proposal is...
>>>>>
>>>>> In my proposal the guest makes the compatibility decision and acts
>>>>> accordingly. In the other proposal QEMU makes the compatibility
>>>>> decision and acts accordingly. I argue that QEMU cannot make a good
>>>>> compatibility decision, because it wont know in advance, if the guest
>>>>> will or will-not switch-to-secure.
>>>>>
>>>>
>>>> You have a point there when you say that QEMU does not know in advance,
>>>> if the guest will or will-not switch-to-secure. I made that argument
>>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
>>>> was to flip that property on demand when the conversion occurs. David
>>>> explained to me that this is not possible for ppc, and that having the
>>>> "securable-guest-memory" property (or whatever the name will be)
>>>> specified is a strong indication, that the VM is intended to be used as
>>>> a secure VM (thus it is OK to hurt the case where the guest does not
>>>> try to transition). That argument applies here as well.
>>>
>>> As suggested by Cornelia Huck, what if QEMU disabled the
>>> "securable-guest-memory" property if 'must-support-migrate' is enabled?
>>> Offcourse; this has to be done with a big fat warning stating
>>> "secure-guest-memory" feature is disabled on the machine.
>>> Doing so, will continue to support guest that do not try to transition.
>>> Guest that try to transition will fail and terminate themselves.
>>
>> Just to recap the s390x situation:
>>
>> - We currently offer a cpu feature that indicates secure execution to
>> be available to the guest if the host supports it.
>> - When we introduce the secure object, we still need to support
>> previous configurations and continue to offer the cpu feature, even
>> if the secure object is not specified.
>> - As migration is currently not supported for secured guests, we add a
>> blocker once the guest actually transitions. That means that
>> transition fails if --only-migratable was specified on the command
>> line. (Guests not transitioning will obviously not notice anything.)
>> - With the secure object, we will already fail starting QEMU if
>> --only-migratable was specified.
>>
>> My suggestion is now that we don't even offer the cpu feature if
>> --only-migratable has been specified. For a guest that does not want to
>> transition to secure mode, nothing changes; a guest that wants to
>> transition to secure mode will notice that the feature is not available
>> and fail appropriately (or ultimately, when the ultravisor call fails).
>> We'd still fail starting QEMU for the secure object + --only-migratable
>> combination.
>>
>> Does that make sense?
>
> It's a little unusual; I don't think we have any other cases where
> --only-migratable changes the behaviour; I think it normally only stops
> you doing something that would have made it unmigratable or causes
> an operation that would make it unmigratable to fail.
I would like to NOT block this feature with --only-migratable. A guest
can start up unprotected (and then it is migratable). The migration blocker
is really a dynamic aspect during runtime.
* Christian Borntraeger (borntraeger@de.ibm.com) wrote:
>
>
> On 13.01.21 13:42, Dr. David Alan Gilbert wrote:
> > * Cornelia Huck (cohuck@redhat.com) wrote:
> >> On Tue, 5 Jan 2021 12:41:25 -0800
> >> Ram Pai <linuxram@us.ibm.com> wrote:
> >>
> >>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> >>>> On Mon, 4 Jan 2021 10:40:26 -0800
> >>>> Ram Pai <linuxram@us.ibm.com> wrote:
> >>
> >>>>> The main difference between my proposal and the other proposal is...
> >>>>>
> >>>>> In my proposal the guest makes the compatibility decision and acts
> >>>>> accordingly. In the other proposal QEMU makes the compatibility
> >>>>> decision and acts accordingly. I argue that QEMU cannot make a good
> >>>>> compatibility decision, because it wont know in advance, if the guest
> >>>>> will or will-not switch-to-secure.
> >>>>>
> >>>>
> >>>> You have a point there when you say that QEMU does not know in advance,
> >>>> if the guest will or will-not switch-to-secure. I made that argument
> >>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> >>>> was to flip that property on demand when the conversion occurs. David
> >>>> explained to me that this is not possible for ppc, and that having the
> >>>> "securable-guest-memory" property (or whatever the name will be)
> >>>> specified is a strong indication, that the VM is intended to be used as
> >>>> a secure VM (thus it is OK to hurt the case where the guest does not
> >>>> try to transition). That argument applies here as well.
> >>>
> >>> As suggested by Cornelia Huck, what if QEMU disabled the
> >>> "securable-guest-memory" property if 'must-support-migrate' is enabled?
> >>> Offcourse; this has to be done with a big fat warning stating
> >>> "secure-guest-memory" feature is disabled on the machine.
> >>> Doing so, will continue to support guest that do not try to transition.
> >>> Guest that try to transition will fail and terminate themselves.
> >>
> >> Just to recap the s390x situation:
> >>
> >> - We currently offer a cpu feature that indicates secure execution to
> >> be available to the guest if the host supports it.
> >> - When we introduce the secure object, we still need to support
> >> previous configurations and continue to offer the cpu feature, even
> >> if the secure object is not specified.
> >> - As migration is currently not supported for secured guests, we add a
> >> blocker once the guest actually transitions. That means that
> >> transition fails if --only-migratable was specified on the command
> >> line. (Guests not transitioning will obviously not notice anything.)
> >> - With the secure object, we will already fail starting QEMU if
> >> --only-migratable was specified.
> >>
> >> My suggestion is now that we don't even offer the cpu feature if
> >> --only-migratable has been specified. For a guest that does not want to
> >> transition to secure mode, nothing changes; a guest that wants to
> >> transition to secure mode will notice that the feature is not available
> >> and fail appropriately (or ultimately, when the ultravisor call fails).
> >> We'd still fail starting QEMU for the secure object + --only-migratable
> >> combination.
> >>
> >> Does that make sense?
> >
> > It's a little unusual; I don't think we have any other cases where
> > --only-migratable changes the behaviour; I think it normally only stops
> > you doing something that would have made it unmigratable or causes
> > an operation that would make it unmigratable to fail.
>
> I would like to NOT block this feature with --only-migratable. A guest
> can start up unprotected (and then it is migratable). The migration blocker
> is really a dynamic aspect during runtime.
But the point of --only-migratable is to turn things that would have
blocked migration into failures, so that a VM started with
--only-migratable is *always* migratable.
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
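The semantics described in this reply (with --only-migratable, anything that would block migration becomes a failure instead) can be modelled in a few lines. This is a standalone sketch under stated assumptions, not QEMU's migration code: VmModel, model_add_blocker(), model_enter_secure_mode() and model_can_migrate() are hypothetical names, and the blocker reason string is invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a VM's migration-related state. */
typedef struct VmModel {
    bool only_migratable;  /* started with --only-migratable */
    bool secure;           /* guest has transitioned to secure mode */
    const char *blocker;   /* reason migration is blocked, or NULL */
} VmModel;

/*
 * Registering a blocker is refused outright when only_migratable is set,
 * so the VM can never reach an unmigratable state.
 */
static int model_add_blocker(VmModel *vm, const char *reason)
{
    assert(vm != NULL && reason != NULL);

    if (vm->only_migratable) {
        return -1;
    }
    vm->blocker = reason;
    return 0;
}

/*
 * The runtime transition needs a blocker (secure guests are assumed not
 * to be migratable yet), so it fails whenever the blocker is refused.
 */
static int model_enter_secure_mode(VmModel *vm)
{
    if (model_add_blocker(vm, "secure guest cannot be migrated") < 0) {
        return -1;
    }
    vm->secure = true;
    return 0;
}

static bool model_can_migrate(const VmModel *vm)
{
    return vm->blocker == NULL;
}
```

In this model the option never changes what the guest is offered at startup; it only turns the later, dynamic transition into an error, which matches the third bullet of the s390x recap quoted above.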
On 14.01.21 11:36, Dr. David Alan Gilbert wrote:
> * Christian Borntraeger (borntraeger@de.ibm.com) wrote:
>>
>>
>> On 13.01.21 13:42, Dr. David Alan Gilbert wrote:
>>> * Cornelia Huck (cohuck@redhat.com) wrote:
>>>> On Tue, 5 Jan 2021 12:41:25 -0800
>>>> Ram Pai <linuxram@us.ibm.com> wrote:
>>>>
>>>>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
>>>>>> On Mon, 4 Jan 2021 10:40:26 -0800
>>>>>> Ram Pai <linuxram@us.ibm.com> wrote:
>>>>
>>>>>>> The main difference between my proposal and the other proposal is...
>>>>>>>
>>>>>>> In my proposal the guest makes the compatibility decision and acts
>>>>>>> accordingly. In the other proposal QEMU makes the compatibility
>>>>>>> decision and acts accordingly. I argue that QEMU cannot make a good
>>>>>>> compatibility decision, because it wont know in advance, if the guest
>>>>>>> will or will-not switch-to-secure.
>>>>>>>
>>>>>>
>>>>>> You have a point there when you say that QEMU does not know in advance,
>>>>>> if the guest will or will-not switch-to-secure. I made that argument
>>>>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
>>>>>> was to flip that property on demand when the conversion occurs. David
>>>>>> explained to me that this is not possible for ppc, and that having the
>>>>>> "securable-guest-memory" property (or whatever the name will be)
>>>>>> specified is a strong indication, that the VM is intended to be used as
>>>>>> a secure VM (thus it is OK to hurt the case where the guest does not
>>>>>> try to transition). That argument applies here as well.
>>>>>
>>>>> As suggested by Cornelia Huck, what if QEMU disabled the
>>>>> "securable-guest-memory" property if 'must-support-migrate' is enabled?
>>>>> Offcourse; this has to be done with a big fat warning stating
>>>>> "secure-guest-memory" feature is disabled on the machine.
>>>>> Doing so, will continue to support guest that do not try to transition.
>>>>> Guest that try to transition will fail and terminate themselves.
>>>>
>>>> Just to recap the s390x situation:
>>>>
>>>> - We currently offer a cpu feature that indicates secure execution to
>>>> be available to the guest if the host supports it.
>>>> - When we introduce the secure object, we still need to support
>>>> previous configurations and continue to offer the cpu feature, even
>>>> if the secure object is not specified.
>>>> - As migration is currently not supported for secured guests, we add a
>>>> blocker once the guest actually transitions. That means that
>>>> transition fails if --only-migratable was specified on the command
>>>> line. (Guests not transitioning will obviously not notice anything.)
>>>> - With the secure object, we will already fail starting QEMU if
>>>> --only-migratable was specified.
>>>>
>>>> My suggestion is now that we don't even offer the cpu feature if
>>>> --only-migratable has been specified. For a guest that does not want to
>>>> transition to secure mode, nothing changes; a guest that wants to
>>>> transition to secure mode will notice that the feature is not available
>>>> and fail appropriately (or ultimately, when the ultravisor call fails).
>>>> We'd still fail starting QEMU for the secure object + --only-migratable
>>>> combination.
>>>>
>>>> Does that make sense?
>>>
>>> It's a little unusual; I don't think we have any other cases where
>>> --only-migratable changes the behaviour; I think it normally only stops
>>> you doing something that would have made it unmigratable or causes
>>> an operation that would make it unmigratable to fail.
>>
>> I would like to NOT block this feature with --only-migrateable. A guest
>> can startup unprotected (and then is is migrateable). the migration blocker
>> is really a dynamic aspect during runtime.
>
> But the point of --only-migratable is to turn things that would have
> blocked migration into failures, so that a VM started with
> --only-migratable is *always* migratable.
Hmmm, fair enough. How do we do this with host-model? The constructed model
would contain unpack, but then it will fail to start up? Or do we silently
drop unpack in that case? Neither variant feels completely right.
On Thu, 14 Jan 2021 11:52:11 +0100
Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> On 14.01.21 11:36, Dr. David Alan Gilbert wrote:
> > * Christian Borntraeger (borntraeger@de.ibm.com) wrote:
> >>
> >>
> >> On 13.01.21 13:42, Dr. David Alan Gilbert wrote:
> >>> * Cornelia Huck (cohuck@redhat.com) wrote:
> >>>> On Tue, 5 Jan 2021 12:41:25 -0800
> >>>> Ram Pai <linuxram@us.ibm.com> wrote:
> >>>>
> >>>>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> >>>>>> On Mon, 4 Jan 2021 10:40:26 -0800
> >>>>>> Ram Pai <linuxram@us.ibm.com> wrote:
> >>>>
> >>>>>>> The main difference between my proposal and the other proposal is...
> >>>>>>>
> >>>>>>> In my proposal the guest makes the compatibility decision and acts
> >>>>>>> accordingly. In the other proposal QEMU makes the compatibility
> >>>>>>> decision and acts accordingly. I argue that QEMU cannot make a good
> >>>>>>> compatibility decision, because it wont know in advance, if the guest
> >>>>>>> will or will-not switch-to-secure.
> >>>>>>>
> >>>>>>
> >>>>>> You have a point there when you say that QEMU does not know in advance,
> >>>>>> if the guest will or will-not switch-to-secure. I made that argument
> >>>>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> >>>>>> was to flip that property on demand when the conversion occurs. David
> >>>>>> explained to me that this is not possible for ppc, and that having the
> >>>>>> "securable-guest-memory" property (or whatever the name will be)
> >>>>>> specified is a strong indication, that the VM is intended to be used as
> >>>>>> a secure VM (thus it is OK to hurt the case where the guest does not
> >>>>>> try to transition). That argument applies here as well.
> >>>>>
> >>>>> As suggested by Cornelia Huck, what if QEMU disabled the
> >>>>> "securable-guest-memory" property if 'must-support-migrate' is enabled?
> >>>>> Offcourse; this has to be done with a big fat warning stating
> >>>>> "secure-guest-memory" feature is disabled on the machine.
> >>>>> Doing so, will continue to support guest that do not try to transition.
> >>>>> Guest that try to transition will fail and terminate themselves.
> >>>>
> >>>> Just to recap the s390x situation:
> >>>>
> >>>> - We currently offer a cpu feature that indicates secure execution to
> >>>> be available to the guest if the host supports it.
> >>>> - When we introduce the secure object, we still need to support
> >>>> previous configurations and continue to offer the cpu feature, even
> >>>> if the secure object is not specified.
> >>>> - As migration is currently not supported for secured guests, we add a
> >>>> blocker once the guest actually transitions. That means that
> >>>> transition fails if --only-migratable was specified on the command
> >>>> line. (Guests not transitioning will obviously not notice anything.)
> >>>> - With the secure object, we will already fail starting QEMU if
> >>>> --only-migratable was specified.
> >>>>
> >>>> My suggestion is now that we don't even offer the cpu feature if
> >>>> --only-migratable has been specified. For a guest that does not want to
> >>>> transition to secure mode, nothing changes; a guest that wants to
> >>>> transition to secure mode will notice that the feature is not available
> >>>> and fail appropriately (or ultimately, when the ultravisor call fails).
> >>>> We'd still fail starting QEMU for the secure object + --only-migratable
> >>>> combination.
> >>>>
> >>>> Does that make sense?
> >>>
> >>> It's a little unusual; I don't think we have any other cases where
> >>> --only-migratable changes the behaviour; I think it normally only stops
> >>> you doing something that would have made it unmigratable or causes
> >>> an operation that would make it unmigratable to fail.
> >>
> >> I would like to NOT block this feature with --only-migrateable. A guest
> >> can startup unprotected (and then is is migrateable). the migration blocker
> >> is really a dynamic aspect during runtime.
> >
> > But the point of --only-migratable is to turn things that would have
> > blocked migration into failures, so that a VM started with
> > --only-migratable is *always* migratable.
>
> Hmmm, fair enough. How do we do this with host-model? The constructed model
> would contain unpack, but then it will fail to startup? Or do we silently
> drop unpack in that case? Both variants do not feel completely right.
Failing if you explicitly specified unpack feels right, but failing
if you just used the host model feels odd. Removing unpack is also a
bit odd, but I think it is the better option if we want to do anything
about it at all.
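The variant described here can be written down as a small decision function. This is a sketch of one option under discussion, not of any agreed or implemented behaviour; `enum model_source`, `enum unpack_action` and `unpack_policy()` are hypothetical names made up for the illustration.

```c
/* Illustrative model only: the names below are hypothetical, not QEMU's API. */
#include <assert.h>
#include <stdbool.h>

enum model_source {
    MODEL_EXPLICIT,  /* the user listed unpack in the CPU model */
    MODEL_HOST       /* unpack came from expanding the host model */
};

enum unpack_action {
    UNPACK_KEEP,     /* offer the feature to the guest */
    UNPACK_DROP,     /* remove the feature from the model */
    UNPACK_FAIL      /* refuse to start */
};

/* Decide what to do with unpack for a given model source and option. */
static enum unpack_action unpack_policy(enum model_source src,
                                        bool only_migratable)
{
    assert(src == MODEL_EXPLICIT || src == MODEL_HOST);

    if (!only_migratable) {
        return UNPACK_KEEP;
    }
    /* Explicit request that cannot be honoured: fail rather than ignore it. */
    if (src == MODEL_EXPLICIT) {
        return UNPACK_FAIL;
    }
    /* Host model: drop the feature so the guest can never transition. */
    return UNPACK_DROP;
}
```

Note that the `UNPACK_DROP` branch is exactly the case where --only-migratable would change what the guest sees, which is the point of contention in this thread.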
On Mon, Jan 11, 2021 at 11:58:30AM -0800, Ram Pai wrote:
> On Mon, Jan 11, 2021 at 05:59:14PM +0100, Cornelia Huck wrote:
> > On Tue, 5 Jan 2021 12:41:25 -0800
> > Ram Pai <linuxram@us.ibm.com> wrote:
> >
> > > On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > > > On Mon, 4 Jan 2021 10:40:26 -0800
> > > > Ram Pai <linuxram@us.ibm.com> wrote:
> >
> > > > > The main difference between my proposal and the other proposal is...
> > > > >
> > > > > In my proposal the guest makes the compatibility decision and acts
> > > > > accordingly. In the other proposal QEMU makes the compatibility
> > > > > decision and acts accordingly. I argue that QEMU cannot make a good
> > > > > compatibility decision, because it wont know in advance, if the guest
> > > > > will or will-not switch-to-secure.
> > > > >
> > > >
> > > > You have a point there when you say that QEMU does not know in advance,
> > > > if the guest will or will-not switch-to-secure. I made that argument
> > > > regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > > > was to flip that property on demand when the conversion occurs. David
> > > > explained to me that this is not possible for ppc, and that having the
> > > > "securable-guest-memory" property (or whatever the name will be)
> > > > specified is a strong indication, that the VM is intended to be used as
> > > > a secure VM (thus it is OK to hurt the case where the guest does not
> > > > try to transition). That argument applies here as well.
> > >
> > > As suggested by Cornelia Huck, what if QEMU disabled the
> > > "securable-guest-memory" property if 'must-support-migrate' is enabled?
> > > Offcourse; this has to be done with a big fat warning stating
> > > "secure-guest-memory" feature is disabled on the machine.
> > > Doing so, will continue to support guest that do not try to transition.
> > > Guest that try to transition will fail and terminate themselves.
> >
> > Just to recap the s390x situation:
> >
> > - We currently offer a cpu feature that indicates secure execution to
> > be available to the guest if the host supports it.
> > - When we introduce the secure object, we still need to support
> > previous configurations and continue to offer the cpu feature, even
> > if the secure object is not specified.
> > - As migration is currently not supported for secured guests, we add a
> > blocker once the guest actually transitions. That means that
> > transition fails if --only-migratable was specified on the command
> > line. (Guests not transitioning will obviously not notice anything.)
> > - With the secure object, we will already fail starting QEMU if
> > --only-migratable was specified.
> >
> > My suggestion is now that we don't even offer the cpu feature if
> > --only-migratable has been specified. For a guest that does not want to
> > transition to secure mode, nothing changes; a guest that wants to
> > transition to secure mode will notice that the feature is not available
> > and fail appropriately (or ultimately, when the ultravisor call fails).
>
>
> On POWER, secure-execution is not **automatically** enabled even when
> the host supports it. The feature is enabled only if the secure-object
> is configured, and the host supports it.
>
> However the behavior proposed above will be consistent on POWER and
> on s390x, when '--only-migratable' is specified and 'secure-object'
> is NOT specified.
>
> So I am in agreement till now.
>
>
> > We'd still fail starting QEMU for the secure object + --only-migratable
> > combination.
>
> Why fail?
>
> Instead, print a warning and disable the secure-object; which will
> disable your cpu-feature. Guests that do not transition to secure, will
> continue to operate, and guests that transition to secure, will fail.

Ignoring a configuration option that was explicitly requested by the
user/mgmt app is bad practice. If a requested feature combination cannot
be honoured, QEMU must treat that as a fatal error and exit, so that
the mgmt app knows their config is unsupported.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Wed, Jan 13, 2021 at 12:42:26PM +0000, Dr. David Alan Gilbert wrote:
> * Cornelia Huck (cohuck@redhat.com) wrote:
> > On Tue, 5 Jan 2021 12:41:25 -0800
> > Ram Pai <linuxram@us.ibm.com> wrote:
> >
> > > On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > > > On Mon, 4 Jan 2021 10:40:26 -0800
> > > > Ram Pai <linuxram@us.ibm.com> wrote:
> >
> > > > > The main difference between my proposal and the other proposal is...
> > > > >
> > > > > In my proposal the guest makes the compatibility decision and acts
> > > > > accordingly. In the other proposal QEMU makes the compatibility
> > > > > decision and acts accordingly. I argue that QEMU cannot make a good
> > > > > compatibility decision, because it wont know in advance, if the guest
> > > > > will or will-not switch-to-secure.
> > > > >
> > > >
> > > > You have a point there when you say that QEMU does not know in advance,
> > > > if the guest will or will-not switch-to-secure. I made that argument
> > > > regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > > > was to flip that property on demand when the conversion occurs. David
> > > > explained to me that this is not possible for ppc, and that having the
> > > > "securable-guest-memory" property (or whatever the name will be)
> > > > specified is a strong indication, that the VM is intended to be used as
> > > > a secure VM (thus it is OK to hurt the case where the guest does not
> > > > try to transition). That argument applies here as well.
> > >
> > > As suggested by Cornelia Huck, what if QEMU disabled the
> > > "securable-guest-memory" property if 'must-support-migrate' is enabled?
> > > Offcourse; this has to be done with a big fat warning stating
> > > "secure-guest-memory" feature is disabled on the machine.
> > > Doing so, will continue to support guest that do not try to transition.
> > > Guest that try to transition will fail and terminate themselves.
> >
> > Just to recap the s390x situation:
> >
> > - We currently offer a cpu feature that indicates secure execution to
> > be available to the guest if the host supports it.
> > - When we introduce the secure object, we still need to support
> > previous configurations and continue to offer the cpu feature, even
> > if the secure object is not specified.
> > - As migration is currently not supported for secured guests, we add a
> > blocker once the guest actually transitions. That means that
> > transition fails if --only-migratable was specified on the command
> > line. (Guests not transitioning will obviously not notice anything.)
> > - With the secure object, we will already fail starting QEMU if
> > --only-migratable was specified.
> >
> > My suggestion is now that we don't even offer the cpu feature if
> > --only-migratable has been specified. For a guest that does not want to
> > transition to secure mode, nothing changes; a guest that wants to
> > transition to secure mode will notice that the feature is not available
> > and fail appropriately (or ultimately, when the ultravisor call fails).
> > We'd still fail starting QEMU for the secure object + --only-migratable
> > combination.
> >
> > Does that make sense?
>
> It's a little unusual; I don't think we have any other cases where
> --only-migratable changes the behaviour; I think it normally only stops
> you doing something that would have made it unmigratable or causes
> an operation that would make it unmigratable to fail.

I agree, --only-migratable is supposed to be a *behavioural* toggle
for QEMU. It must /not/ have any impact on the guest ABI.

A management application needs to be able to add/remove --only-migratable
at will without changing the exposed guest ABI.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
* Cornelia Huck (cohuck@redhat.com) wrote:
> On Thu, 14 Jan 2021 11:52:11 +0100
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>
> > On 14.01.21 11:36, Dr. David Alan Gilbert wrote:
> > > * Christian Borntraeger (borntraeger@de.ibm.com) wrote:
> > >>
> > >>
> > >> On 13.01.21 13:42, Dr. David Alan Gilbert wrote:
> > >>> * Cornelia Huck (cohuck@redhat.com) wrote:
> > >>>> On Tue, 5 Jan 2021 12:41:25 -0800
> > >>>> Ram Pai <linuxram@us.ibm.com> wrote:
> > >>>>
> > >>>>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > >>>>>> On Mon, 4 Jan 2021 10:40:26 -0800
> > >>>>>> Ram Pai <linuxram@us.ibm.com> wrote:
> > >>>>
> > >>>>>>> The main difference between my proposal and the other proposal is...
> > >>>>>>>
> > >>>>>>> In my proposal the guest makes the compatibility decision and acts
> > >>>>>>> accordingly. In the other proposal QEMU makes the compatibility
> > >>>>>>> decision and acts accordingly. I argue that QEMU cannot make a good
> > >>>>>>> compatibility decision, because it wont know in advance, if the guest
> > >>>>>>> will or will-not switch-to-secure.
> > >>>>>>>
> > >>>>>>
> > >>>>>> You have a point there when you say that QEMU does not know in advance,
> > >>>>>> if the guest will or will-not switch-to-secure. I made that argument
> > >>>>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > >>>>>> was to flip that property on demand when the conversion occurs. David
> > >>>>>> explained to me that this is not possible for ppc, and that having the
> > >>>>>> "securable-guest-memory" property (or whatever the name will be)
> > >>>>>> specified is a strong indication, that the VM is intended to be used as
> > >>>>>> a secure VM (thus it is OK to hurt the case where the guest does not
> > >>>>>> try to transition). That argument applies here as well.
> > >>>>>
> > >>>>> As suggested by Cornelia Huck, what if QEMU disabled the
> > >>>>> "securable-guest-memory" property if 'must-support-migrate' is enabled?
> > >>>>> Offcourse; this has to be done with a big fat warning stating
> > >>>>> "secure-guest-memory" feature is disabled on the machine.
> > >>>>> Doing so, will continue to support guest that do not try to transition.
> > >>>>> Guest that try to transition will fail and terminate themselves.
> > >>>>
> > >>>> Just to recap the s390x situation:
> > >>>>
> > >>>> - We currently offer a cpu feature that indicates secure execution to
> > >>>> be available to the guest if the host supports it.
> > >>>> - When we introduce the secure object, we still need to support
> > >>>> previous configurations and continue to offer the cpu feature, even
> > >>>> if the secure object is not specified.
> > >>>> - As migration is currently not supported for secured guests, we add a
> > >>>> blocker once the guest actually transitions. That means that
> > >>>> transition fails if --only-migratable was specified on the command
> > >>>> line. (Guests not transitioning will obviously not notice anything.)
> > >>>> - With the secure object, we will already fail starting QEMU if
> > >>>> --only-migratable was specified.
> > >>>>
> > >>>> My suggestion is now that we don't even offer the cpu feature if
> > >>>> --only-migratable has been specified. For a guest that does not want to
> > >>>> transition to secure mode, nothing changes; a guest that wants to
> > >>>> transition to secure mode will notice that the feature is not available
> > >>>> and fail appropriately (or ultimately, when the ultravisor call fails).
> > >>>> We'd still fail starting QEMU for the secure object + --only-migratable
> > >>>> combination.
> > >>>>
> > >>>> Does that make sense?
> > >>>
> > >>> It's a little unusual; I don't think we have any other cases where
> > >>> --only-migratable changes the behaviour; I think it normally only stops
> > >>> you doing something that would have made it unmigratable or causes
> > >>> an operation that would make it unmigratable to fail.
> > >>
> > >> I would like to NOT block this feature with --only-migrateable. A guest
> > >> can startup unprotected (and then is is migrateable). the migration blocker
> > >> is really a dynamic aspect during runtime.
> > >
> > > But the point of --only-migratable is to turn things that would have
> > > blocked migration into failures, so that a VM started with
> > > --only-migratable is *always* migratable.
> >
> > Hmmm, fair enough. How do we do this with host-model? The constructed model
> > would contain unpack, but then it will fail to startup? Or do we silently
> > drop unpack in that case? Both variants do not feel completely right.
>
> Failing if you explicitly specified unpacked feels right, but failing
> if you just used the host model feels odd. Removing unpack also is a
> bit odd, but I think the better option if we want to do anything about
> it at all.
'host-model' feels a bit special; but breaking the rule that
only-migratable doesn't change behaviour is weird.
Can you do host,-unpack to make that work explicitly?
But hang on; why is 'unpack' the name of a secure guest facility - is
it really a feature for secure guest or something else?
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On 14.01.21 12:45, Dr. David Alan Gilbert wrote:
> * Cornelia Huck (cohuck@redhat.com) wrote:
>> On Thu, 14 Jan 2021 11:52:11 +0100
>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>>
>>> On 14.01.21 11:36, Dr. David Alan Gilbert wrote:
>>>> * Christian Borntraeger (borntraeger@de.ibm.com) wrote:
>>>>>
>>>>>
>>>>> On 13.01.21 13:42, Dr. David Alan Gilbert wrote:
>>>>>> * Cornelia Huck (cohuck@redhat.com) wrote:
>>>>>>> On Tue, 5 Jan 2021 12:41:25 -0800
>>>>>>> Ram Pai <linuxram@us.ibm.com> wrote:
>>>>>>>
>>>>>>>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
>>>>>>>>> On Mon, 4 Jan 2021 10:40:26 -0800
>>>>>>>>> Ram Pai <linuxram@us.ibm.com> wrote:
>>>>>>>
>>>>>>>>>> The main difference between my proposal and the other proposal is...
>>>>>>>>>>
>>>>>>>>>> In my proposal the guest makes the compatibility decision and acts
>>>>>>>>>> accordingly. In the other proposal QEMU makes the compatibility
>>>>>>>>>> decision and acts accordingly. I argue that QEMU cannot make a good
>>>>>>>>>> compatibility decision, because it wont know in advance, if the guest
>>>>>>>>>> will or will-not switch-to-secure.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> You have a point there when you say that QEMU does not know in advance,
>>>>>>>>> if the guest will or will-not switch-to-secure. I made that argument
>>>>>>>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
>>>>>>>>> was to flip that property on demand when the conversion occurs. David
>>>>>>>>> explained to me that this is not possible for ppc, and that having the
>>>>>>>>> "securable-guest-memory" property (or whatever the name will be)
>>>>>>>>> specified is a strong indication, that the VM is intended to be used as
>>>>>>>>> a secure VM (thus it is OK to hurt the case where the guest does not
>>>>>>>>> try to transition). That argument applies here as well.
>>>>>>>>
>>>>>>>> As suggested by Cornelia Huck, what if QEMU disabled the
>>>>>>>> "securable-guest-memory" property if 'must-support-migrate' is enabled?
>>>>>>>> Offcourse; this has to be done with a big fat warning stating
>>>>>>>> "secure-guest-memory" feature is disabled on the machine.
>>>>>>>> Doing so, will continue to support guest that do not try to transition.
>>>>>>>> Guest that try to transition will fail and terminate themselves.
>>>>>>>
>>>>>>> Just to recap the s390x situation:
>>>>>>>
>>>>>>> - We currently offer a cpu feature that indicates secure execution to
>>>>>>> be available to the guest if the host supports it.
>>>>>>> - When we introduce the secure object, we still need to support
>>>>>>> previous configurations and continue to offer the cpu feature, even
>>>>>>> if the secure object is not specified.
>>>>>>> - As migration is currently not supported for secured guests, we add a
>>>>>>> blocker once the guest actually transitions. That means that
>>>>>>> transition fails if --only-migratable was specified on the command
>>>>>>> line. (Guests not transitioning will obviously not notice anything.)
>>>>>>> - With the secure object, we will already fail starting QEMU if
>>>>>>> --only-migratable was specified.
>>>>>>>
>>>>>>> My suggestion is now that we don't even offer the cpu feature if
>>>>>>> --only-migratable has been specified. For a guest that does not want to
>>>>>>> transition to secure mode, nothing changes; a guest that wants to
>>>>>>> transition to secure mode will notice that the feature is not available
>>>>>>> and fail appropriately (or ultimately, when the ultravisor call fails).
>>>>>>> We'd still fail starting QEMU for the secure object + --only-migratable
>>>>>>> combination.
>>>>>>>
>>>>>>> Does that make sense?
>>>>>>
>>>>>> It's a little unusual; I don't think we have any other cases where
>>>>>> --only-migratable changes the behaviour; I think it normally only stops
>>>>>> you doing something that would have made it unmigratable or causes
>>>>>> an operation that would make it unmigratable to fail.
>>>>>
>>>>> I would like to NOT block this feature with --only-migrateable. A guest
>>>>> can startup unprotected (and then is is migrateable). the migration blocker
>>>>> is really a dynamic aspect during runtime.
>>>>
>>>> But the point of --only-migratable is to turn things that would have
>>>> blocked migration into failures, so that a VM started with
>>>> --only-migratable is *always* migratable.
>>>
>>> Hmmm, fair enough. How do we do this with host-model? The constructed model
>>> would contain unpack, but then it will fail to startup? Or do we silently
>>> drop unpack in that case? Both variants do not feel completely right.
>>
>> Failing if you explicitly specified unpacked feels right, but failing
>> if you just used the host model feels odd. Removing unpack also is a
>> bit odd, but I think the better option if we want to do anything about
>> it at all.
>
> 'host-model' feels a bit special; but breaking the rule that
> only-migratable doesn't change behaviour is weird
> Can you do host,-unpack to make that work explicitly?

I guess that should work. But it means that we need to add logic in libvirt
to disable unpack for host-passthrough and host-model. The next problem is
then that a future version might implement migration of such guests, which
means that libvirt must then stop fencing unpack.

>
> But hang on; why is 'unpack' the name of a secure guest facility - is
> it really a feature for secure guest or something else?

unpack is the name of the function that unpacks and decrypts the encrypted
image. If it is there, then you can switch into the securable guest mode.
On Thu, Jan 14, 2021 at 12:50:12PM +0100, Christian Borntraeger wrote: > > > On 14.01.21 12:45, Dr. David Alan Gilbert wrote: > > * Cornelia Huck (cohuck@redhat.com) wrote: > >> On Thu, 14 Jan 2021 11:52:11 +0100 > >> Christian Borntraeger <borntraeger@de.ibm.com> wrote: > >> > >>> On 14.01.21 11:36, Dr. David Alan Gilbert wrote: > >>>> * Christian Borntraeger (borntraeger@de.ibm.com) wrote: > >>>>> > >>>>> > >>>>> On 13.01.21 13:42, Dr. David Alan Gilbert wrote: > >>>>>> * Cornelia Huck (cohuck@redhat.com) wrote: > >>>>>>> On Tue, 5 Jan 2021 12:41:25 -0800 > >>>>>>> Ram Pai <linuxram@us.ibm.com> wrote: > >>>>>>> > >>>>>>>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote: > >>>>>>>>> On Mon, 4 Jan 2021 10:40:26 -0800 > >>>>>>>>> Ram Pai <linuxram@us.ibm.com> wrote: > >>>>>>> > >>>>>>>>>> The main difference between my proposal and the other proposal is... > >>>>>>>>>> > >>>>>>>>>> In my proposal the guest makes the compatibility decision and acts > >>>>>>>>>> accordingly. In the other proposal QEMU makes the compatibility > >>>>>>>>>> decision and acts accordingly. I argue that QEMU cannot make a good > >>>>>>>>>> compatibility decision, because it wont know in advance, if the guest > >>>>>>>>>> will or will-not switch-to-secure. > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> You have a point there when you say that QEMU does not know in advance, > >>>>>>>>> if the guest will or will-not switch-to-secure. I made that argument > >>>>>>>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea > >>>>>>>>> was to flip that property on demand when the conversion occurs. David > >>>>>>>>> explained to me that this is not possible for ppc, and that having the > >>>>>>>>> "securable-guest-memory" property (or whatever the name will be) > >>>>>>>>> specified is a strong indication, that the VM is intended to be used as > >>>>>>>>> a secure VM (thus it is OK to hurt the case where the guest does not > >>>>>>>>> try to transition). 
That argument applies here as well. > >>>>>>>> > >>>>>>>> As suggested by Cornelia Huck, what if QEMU disabled the > >>>>>>>> "securable-guest-memory" property if 'must-support-migrate' is enabled? > >>>>>>>> Offcourse; this has to be done with a big fat warning stating > >>>>>>>> "secure-guest-memory" feature is disabled on the machine. > >>>>>>>> Doing so, will continue to support guest that do not try to transition. > >>>>>>>> Guest that try to transition will fail and terminate themselves. > >>>>>>> > >>>>>>> Just to recap the s390x situation: > >>>>>>> > >>>>>>> - We currently offer a cpu feature that indicates secure execution to > >>>>>>> be available to the guest if the host supports it. > >>>>>>> - When we introduce the secure object, we still need to support > >>>>>>> previous configurations and continue to offer the cpu feature, even > >>>>>>> if the secure object is not specified. > >>>>>>> - As migration is currently not supported for secured guests, we add a > >>>>>>> blocker once the guest actually transitions. That means that > >>>>>>> transition fails if --only-migratable was specified on the command > >>>>>>> line. (Guests not transitioning will obviously not notice anything.) > >>>>>>> - With the secure object, we will already fail starting QEMU if > >>>>>>> --only-migratable was specified. > >>>>>>> > >>>>>>> My suggestion is now that we don't even offer the cpu feature if > >>>>>>> --only-migratable has been specified. For a guest that does not want to > >>>>>>> transition to secure mode, nothing changes; a guest that wants to > >>>>>>> transition to secure mode will notice that the feature is not available > >>>>>>> and fail appropriately (or ultimately, when the ultravisor call fails). > >>>>>>> We'd still fail starting QEMU for the secure object + --only-migratable > >>>>>>> combination. > >>>>>>> > >>>>>>> Does that make sense? 
> >>>>>> > >>>>>> It's a little unusual; I don't think we have any other cases where > >>>>>> --only-migratable changes the behaviour; I think it normally only stops > >>>>>> you doing something that would have made it unmigratable or causes > >>>>>> an operation that would make it unmigratable to fail. > >>>>> > >>>>> I would like to NOT block this feature with --only-migrateable. A guest > >>>>> can startup unprotected (and then is is migrateable). the migration blocker > >>>>> is really a dynamic aspect during runtime. > >>>> > >>>> But the point of --only-migratable is to turn things that would have > >>>> blocked migration into failures, so that a VM started with > >>>> --only-migratable is *always* migratable. > >>> > >>> Hmmm, fair enough. How do we do this with host-model? The constructed model > >>> would contain unpack, but then it will fail to startup? Or do we silently > >>> drop unpack in that case? Both variants do not feel completely right. > >> > >> Failing if you explicitly specified unpacked feels right, but failing > >> if you just used the host model feels odd. Removing unpack also is a > >> bit odd, but I think the better option if we want to do anything about > >> it at all. > > > > 'host-model' feels a bit special; but breaking the rule that > > only-migratable doesn't change behaviour is weird > > Can you do host,-unpack to make that work explicitly? > > I guess that should work. But it means that we need to add logic in libvirt > to disable unpack for host-passthru and host-model. Next problem is then, > that a future version might implement migration of such guests, which means > that libvirt must then stop fencing unpack. The "host-model" is supposed to always be migratable, so we should fence the feature there. host-passthrough is "undefined" whether it is migratable - it may or may not work, no guarantees made by libvirt. 
Ultimately I think the problem is that there ought to be an explicit
config to enable the feature for s390, as there is for SEV, and will
also presumably be needed for ppc.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Thu, 14 Jan 2021 12:20:48 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Thu, Jan 14, 2021 at 12:50:12PM +0100, Christian Borntraeger wrote:
> >
> >
> > On 14.01.21 12:45, Dr. David Alan Gilbert wrote:
> > > * Cornelia Huck (cohuck@redhat.com) wrote:
> > >> On Thu, 14 Jan 2021 11:52:11 +0100
> > >> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> > >>
> > >>> On 14.01.21 11:36, Dr. David Alan Gilbert wrote:
> > >>>> * Christian Borntraeger (borntraeger@de.ibm.com) wrote:
> > >>>>>
> > >>>>>
> > >>>>> On 13.01.21 13:42, Dr. David Alan Gilbert wrote:
> > >>>>>> * Cornelia Huck (cohuck@redhat.com) wrote:
> > >>>>>>> On Tue, 5 Jan 2021 12:41:25 -0800
> > >>>>>>> Ram Pai <linuxram@us.ibm.com> wrote:
> > >>>>>>>
> > >>>>>>>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > >>>>>>>>> On Mon, 4 Jan 2021 10:40:26 -0800
> > >>>>>>>>> Ram Pai <linuxram@us.ibm.com> wrote:
> > >>>>>>>
> > >>>>>>>>>> The main difference between my proposal and the other proposal is...
> > >>>>>>>>>>
> > >>>>>>>>>> In my proposal the guest makes the compatibility decision and acts
> > >>>>>>>>>> accordingly. In the other proposal QEMU makes the compatibility
> > >>>>>>>>>> decision and acts accordingly. I argue that QEMU cannot make a good
> > >>>>>>>>>> compatibility decision, because it wont know in advance, if the guest
> > >>>>>>>>>> will or will-not switch-to-secure.
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> You have a point there when you say that QEMU does not know in advance,
> > >>>>>>>>> if the guest will or will-not switch-to-secure. I made that argument
> > >>>>>>>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > >>>>>>>>> was to flip that property on demand when the conversion occurs. David
> > >>>>>>>>> explained to me that this is not possible for ppc, and that having the
> > >>>>>>>>> "securable-guest-memory" property (or whatever the name will be)
> > >>>>>>>>> specified is a strong indication, that the VM is intended to be used as
> > >>>>>>>>> a secure VM (thus it is OK to hurt the case where the guest does not
> > >>>>>>>>> try to transition). That argument applies here as well.
> > >>>>>>>>
> > >>>>>>>> As suggested by Cornelia Huck, what if QEMU disabled the
> > >>>>>>>> "securable-guest-memory" property if 'must-support-migrate' is enabled?
> > >>>>>>>> Offcourse; this has to be done with a big fat warning stating
> > >>>>>>>> "secure-guest-memory" feature is disabled on the machine.
> > >>>>>>>> Doing so, will continue to support guest that do not try to transition.
> > >>>>>>>> Guest that try to transition will fail and terminate themselves.
> > >>>>>>>
> > >>>>>>> Just to recap the s390x situation:
> > >>>>>>>
> > >>>>>>> - We currently offer a cpu feature that indicates secure execution to
> > >>>>>>> be available to the guest if the host supports it.
> > >>>>>>> - When we introduce the secure object, we still need to support
> > >>>>>>> previous configurations and continue to offer the cpu feature, even
> > >>>>>>> if the secure object is not specified.
> > >>>>>>> - As migration is currently not supported for secured guests, we add a
> > >>>>>>> blocker once the guest actually transitions. That means that
> > >>>>>>> transition fails if --only-migratable was specified on the command
> > >>>>>>> line. (Guests not transitioning will obviously not notice anything.)
> > >>>>>>> - With the secure object, we will already fail starting QEMU if
> > >>>>>>> --only-migratable was specified.
> > >>>>>>>
> > >>>>>>> My suggestion is now that we don't even offer the cpu feature if
> > >>>>>>> --only-migratable has been specified. For a guest that does not want to
> > >>>>>>> transition to secure mode, nothing changes; a guest that wants to
> > >>>>>>> transition to secure mode will notice that the feature is not available
> > >>>>>>> and fail appropriately (or ultimately, when the ultravisor call fails).
> > >>>>>>> We'd still fail starting QEMU for the secure object + --only-migratable
> > >>>>>>> combination.
> > >>>>>>>
> > >>>>>>> Does that make sense?
> > >>>>>>
> > >>>>>> It's a little unusual; I don't think we have any other cases where
> > >>>>>> --only-migratable changes the behaviour; I think it normally only stops
> > >>>>>> you doing something that would have made it unmigratable or causes
> > >>>>>> an operation that would make it unmigratable to fail.
> > >>>>>
> > >>>>> I would like to NOT block this feature with --only-migrateable. A guest
> > >>>>> can startup unprotected (and then is is migrateable). the migration blocker
> > >>>>> is really a dynamic aspect during runtime.
> > >>>>
> > >>>> But the point of --only-migratable is to turn things that would have
> > >>>> blocked migration into failures, so that a VM started with
> > >>>> --only-migratable is *always* migratable.
> > >>>
> > >>> Hmmm, fair enough. How do we do this with host-model? The constructed model
> > >>> would contain unpack, but then it will fail to startup? Or do we silently
> > >>> drop unpack in that case? Both variants do not feel completely right.
> > >>
> > >> Failing if you explicitly specified unpacked feels right, but failing
> > >> if you just used the host model feels odd. Removing unpack also is a
> > >> bit odd, but I think the better option if we want to do anything about
> > >> it at all.
> > >
> > > 'host-model' feels a bit special; but breaking the rule that
> > > only-migratable doesn't change behaviour is weird
> > > Can you do host,-unpack to make that work explicitly?
> >
> > I guess that should work. But it means that we need to add logic in libvirt
> > to disable unpack for host-passthru and host-model. Next problem is then,
> > that a future version might implement migration of such guests, which means
> > that libvirt must then stop fencing unpack.
>
> The "host-model" is supposed to always be migratable, so we should
> fence the feature there.
>
> host-passthrough is "undefined" whether it is migratable - it may or may
> not work, no guarantees made by libvirt.
>
> Ultimately I think the problem is that there ought to be an explicit
> config to enable the feature for s390, as there is for SEV, and will
> also presumably be needed for ppc.

Yes, an explicit config is what we want; unfortunately, we have to deal
with existing setups as well...

The options I see are
- leave things for existing setups as they are now (i.e. might become
  unmigratable when the guest transitions), and make sure we're doing
  the right thing with the new object
- always make the unpack feature conflict with migration requirements;
  this is a guest-visible change

The first option might be less hairy, all considered?

On 14.01.21 15:04, Cornelia Huck wrote:
> On Thu, 14 Jan 2021 12:20:48 +0000
> Daniel P. Berrangé <berrange@redhat.com> wrote:
>
>> On Thu, Jan 14, 2021 at 12:50:12PM +0100, Christian Borntraeger wrote:
>>>
>>>
>>> On 14.01.21 12:45, Dr. David Alan Gilbert wrote:
>>>> * Cornelia Huck (cohuck@redhat.com) wrote:
>>>>> On Thu, 14 Jan 2021 11:52:11 +0100
>>>>> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
>>>>>
>>>>>> On 14.01.21 11:36, Dr. David Alan Gilbert wrote:
>>>>>>> * Christian Borntraeger (borntraeger@de.ibm.com) wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 13.01.21 13:42, Dr. David Alan Gilbert wrote:
>>>>>>>>> * Cornelia Huck (cohuck@redhat.com) wrote:
>>>>>>>>>> On Tue, 5 Jan 2021 12:41:25 -0800
>>>>>>>>>> Ram Pai <linuxram@us.ibm.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
>>>>>>>>>>>> On Mon, 4 Jan 2021 10:40:26 -0800
>>>>>>>>>>>> Ram Pai <linuxram@us.ibm.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>>> The main difference between my proposal and the other proposal is...
>>>>>>>>>>>>>
>>>>>>>>>>>>> In my proposal the guest makes the compatibility decision and acts
>>>>>>>>>>>>> accordingly. In the other proposal QEMU makes the compatibility
>>>>>>>>>>>>> decision and acts accordingly. I argue that QEMU cannot make a good
>>>>>>>>>>>>> compatibility decision, because it wont know in advance, if the guest
>>>>>>>>>>>>> will or will-not switch-to-secure.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> You have a point there when you say that QEMU does not know in advance,
>>>>>>>>>>>> if the guest will or will-not switch-to-secure. I made that argument
>>>>>>>>>>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
>>>>>>>>>>>> was to flip that property on demand when the conversion occurs. David
>>>>>>>>>>>> explained to me that this is not possible for ppc, and that having the
>>>>>>>>>>>> "securable-guest-memory" property (or whatever the name will be)
>>>>>>>>>>>> specified is a strong indication, that the VM is intended to be used as
>>>>>>>>>>>> a secure VM (thus it is OK to hurt the case where the guest does not
>>>>>>>>>>>> try to transition). That argument applies here as well.
>>>>>>>>>>>
>>>>>>>>>>> As suggested by Cornelia Huck, what if QEMU disabled the
>>>>>>>>>>> "securable-guest-memory" property if 'must-support-migrate' is enabled?
>>>>>>>>>>> Offcourse; this has to be done with a big fat warning stating
>>>>>>>>>>> "secure-guest-memory" feature is disabled on the machine.
>>>>>>>>>>> Doing so, will continue to support guest that do not try to transition.
>>>>>>>>>>> Guest that try to transition will fail and terminate themselves.
>>>>>>>>>>
>>>>>>>>>> Just to recap the s390x situation:
>>>>>>>>>>
>>>>>>>>>> - We currently offer a cpu feature that indicates secure execution to
>>>>>>>>>> be available to the guest if the host supports it.
>>>>>>>>>> - When we introduce the secure object, we still need to support
>>>>>>>>>> previous configurations and continue to offer the cpu feature, even
>>>>>>>>>> if the secure object is not specified.
>>>>>>>>>> - As migration is currently not supported for secured guests, we add a
>>>>>>>>>> blocker once the guest actually transitions. That means that
>>>>>>>>>> transition fails if --only-migratable was specified on the command
>>>>>>>>>> line. (Guests not transitioning will obviously not notice anything.)
>>>>>>>>>> - With the secure object, we will already fail starting QEMU if
>>>>>>>>>> --only-migratable was specified.
>>>>>>>>>>
>>>>>>>>>> My suggestion is now that we don't even offer the cpu feature if
>>>>>>>>>> --only-migratable has been specified. For a guest that does not want to
>>>>>>>>>> transition to secure mode, nothing changes; a guest that wants to
>>>>>>>>>> transition to secure mode will notice that the feature is not available
>>>>>>>>>> and fail appropriately (or ultimately, when the ultravisor call fails).
>>>>>>>>>> We'd still fail starting QEMU for the secure object + --only-migratable
>>>>>>>>>> combination.
>>>>>>>>>>
>>>>>>>>>> Does that make sense?
>>>>>>>>>
>>>>>>>>> It's a little unusual; I don't think we have any other cases where
>>>>>>>>> --only-migratable changes the behaviour; I think it normally only stops
>>>>>>>>> you doing something that would have made it unmigratable or causes
>>>>>>>>> an operation that would make it unmigratable to fail.
>>>>>>>>
>>>>>>>> I would like to NOT block this feature with --only-migrateable. A guest
>>>>>>>> can startup unprotected (and then is is migrateable). the migration blocker
>>>>>>>> is really a dynamic aspect during runtime.
>>>>>>>
>>>>>>> But the point of --only-migratable is to turn things that would have
>>>>>>> blocked migration into failures, so that a VM started with
>>>>>>> --only-migratable is *always* migratable.
>>>>>>
>>>>>> Hmmm, fair enough. How do we do this with host-model? The constructed model
>>>>>> would contain unpack, but then it will fail to startup? Or do we silently
>>>>>> drop unpack in that case? Both variants do not feel completely right.
>>>>>
>>>>> Failing if you explicitly specified unpacked feels right, but failing
>>>>> if you just used the host model feels odd. Removing unpack also is a
>>>>> bit odd, but I think the better option if we want to do anything about
>>>>> it at all.
>>>>
>>>> 'host-model' feels a bit special; but breaking the rule that
>>>> only-migratable doesn't change behaviour is weird
>>>> Can you do host,-unpack to make that work explicitly?
>>>
>>> I guess that should work. But it means that we need to add logic in libvirt
>>> to disable unpack for host-passthru and host-model. Next problem is then,
>>> that a future version might implement migration of such guests, which means
>>> that libvirt must then stop fencing unpack.
>>
>> The "host-model" is supposed to always be migratable, so we should
>> fence the feature there.
>>
>> host-passthrough is "undefined" whether it is migratable - it may or may
>> not work, no guarantees made by libvirt.
>>
>> Ultimately I think the problem is that there ought to be an explicit
>> config to enable the feature for s390, as there is for SEV, and will
>> also presumably be needed for ppc.
>
> Yes, an explicit config is what we want; unfortunately, we have to deal
> with existing setups as well...
>
> The options I see are
> - leave things for existing setups as they are now (i.e. might become
> unmigratable when the guest transitions), and make sure we're doing
> the right thing with the new object
> - always make the unpack feature conflict with migration requirements;
> this is a guest-visible change
>
> The first option might be less hairy, all considered?

What about a libvirt change that removes unpack from the host-model as
soon as --only-migratable is used? When that is in place, QEMU can reject
the combination of --only-migratable + unpack.

On Thu, Jan 14, 2021 at 03:09:01PM +0100, Christian Borntraeger wrote: > > > On 14.01.21 15:04, Cornelia Huck wrote: > > On Thu, 14 Jan 2021 12:20:48 +0000 > > Daniel P. Berrangé <berrange@redhat.com> wrote: > > > >> On Thu, Jan 14, 2021 at 12:50:12PM +0100, Christian Borntraeger wrote: > >>> > >>> > >>> On 14.01.21 12:45, Dr. David Alan Gilbert wrote: > >>>> * Cornelia Huck (cohuck@redhat.com) wrote: > >>>>> On Thu, 14 Jan 2021 11:52:11 +0100 > >>>>> Christian Borntraeger <borntraeger@de.ibm.com> wrote: > >>>>> > >>>>>> On 14.01.21 11:36, Dr. David Alan Gilbert wrote: > >>>>>>> * Christian Borntraeger (borntraeger@de.ibm.com) wrote: > >>>>>>>> > >>>>>>>> > >>>>>>>> On 13.01.21 13:42, Dr. David Alan Gilbert wrote: > >>>>>>>>> * Cornelia Huck (cohuck@redhat.com) wrote: > >>>>>>>>>> On Tue, 5 Jan 2021 12:41:25 -0800 > >>>>>>>>>> Ram Pai <linuxram@us.ibm.com> wrote: > >>>>>>>>>> > >>>>>>>>>>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote: > >>>>>>>>>>>> On Mon, 4 Jan 2021 10:40:26 -0800 > >>>>>>>>>>>> Ram Pai <linuxram@us.ibm.com> wrote: > >>>>>>>>>> > >>>>>>>>>>>>> The main difference between my proposal and the other proposal is... > >>>>>>>>>>>>> > >>>>>>>>>>>>> In my proposal the guest makes the compatibility decision and acts > >>>>>>>>>>>>> accordingly. In the other proposal QEMU makes the compatibility > >>>>>>>>>>>>> decision and acts accordingly. I argue that QEMU cannot make a good > >>>>>>>>>>>>> compatibility decision, because it wont know in advance, if the guest > >>>>>>>>>>>>> will or will-not switch-to-secure. > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> You have a point there when you say that QEMU does not know in advance, > >>>>>>>>>>>> if the guest will or will-not switch-to-secure. I made that argument > >>>>>>>>>>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea > >>>>>>>>>>>> was to flip that property on demand when the conversion occurs. 
David > >>>>>>>>>>>> explained to me that this is not possible for ppc, and that having the > >>>>>>>>>>>> "securable-guest-memory" property (or whatever the name will be) > >>>>>>>>>>>> specified is a strong indication, that the VM is intended to be used as > >>>>>>>>>>>> a secure VM (thus it is OK to hurt the case where the guest does not > >>>>>>>>>>>> try to transition). That argument applies here as well. > >>>>>>>>>>> > >>>>>>>>>>> As suggested by Cornelia Huck, what if QEMU disabled the > >>>>>>>>>>> "securable-guest-memory" property if 'must-support-migrate' is enabled? > >>>>>>>>>>> Offcourse; this has to be done with a big fat warning stating > >>>>>>>>>>> "secure-guest-memory" feature is disabled on the machine. > >>>>>>>>>>> Doing so, will continue to support guest that do not try to transition. > >>>>>>>>>>> Guest that try to transition will fail and terminate themselves. > >>>>>>>>>> > >>>>>>>>>> Just to recap the s390x situation: > >>>>>>>>>> > >>>>>>>>>> - We currently offer a cpu feature that indicates secure execution to > >>>>>>>>>> be available to the guest if the host supports it. > >>>>>>>>>> - When we introduce the secure object, we still need to support > >>>>>>>>>> previous configurations and continue to offer the cpu feature, even > >>>>>>>>>> if the secure object is not specified. > >>>>>>>>>> - As migration is currently not supported for secured guests, we add a > >>>>>>>>>> blocker once the guest actually transitions. That means that > >>>>>>>>>> transition fails if --only-migratable was specified on the command > >>>>>>>>>> line. (Guests not transitioning will obviously not notice anything.) > >>>>>>>>>> - With the secure object, we will already fail starting QEMU if > >>>>>>>>>> --only-migratable was specified. > >>>>>>>>>> > >>>>>>>>>> My suggestion is now that we don't even offer the cpu feature if > >>>>>>>>>> --only-migratable has been specified. 
For a guest that does not want to > >>>>>>>>>> transition to secure mode, nothing changes; a guest that wants to > >>>>>>>>>> transition to secure mode will notice that the feature is not available > >>>>>>>>>> and fail appropriately (or ultimately, when the ultravisor call fails). > >>>>>>>>>> We'd still fail starting QEMU for the secure object + --only-migratable > >>>>>>>>>> combination. > >>>>>>>>>> > >>>>>>>>>> Does that make sense? > >>>>>>>>> > >>>>>>>>> It's a little unusual; I don't think we have any other cases where > >>>>>>>>> --only-migratable changes the behaviour; I think it normally only stops > >>>>>>>>> you doing something that would have made it unmigratable or causes > >>>>>>>>> an operation that would make it unmigratable to fail. > >>>>>>>> > >>>>>>>> I would like to NOT block this feature with --only-migrateable. A guest > >>>>>>>> can startup unprotected (and then is is migrateable). the migration blocker > >>>>>>>> is really a dynamic aspect during runtime. > >>>>>>> > >>>>>>> But the point of --only-migratable is to turn things that would have > >>>>>>> blocked migration into failures, so that a VM started with > >>>>>>> --only-migratable is *always* migratable. > >>>>>> > >>>>>> Hmmm, fair enough. How do we do this with host-model? The constructed model > >>>>>> would contain unpack, but then it will fail to startup? Or do we silently > >>>>>> drop unpack in that case? Both variants do not feel completely right. > >>>>> > >>>>> Failing if you explicitly specified unpacked feels right, but failing > >>>>> if you just used the host model feels odd. Removing unpack also is a > >>>>> bit odd, but I think the better option if we want to do anything about > >>>>> it at all. > >>>> > >>>> 'host-model' feels a bit special; but breaking the rule that > >>>> only-migratable doesn't change behaviour is weird > >>>> Can you do host,-unpack to make that work explicitly? > >>> > >>> I guess that should work. 
> >>> But it means that we need to add logic in libvirt
> >>> to disable unpack for host-passthru and host-model. Next problem is then,
> >>> that a future version might implement migration of such guests, which means
> >>> that libvirt must then stop fencing unpack.
> >>
> >> The "host-model" is supposed to always be migratable, so we should
> >> fence the feature there.
> >>
> >> host-passthrough is "undefined" whether it is migratable - it may or may
> >> not work, no guarantees made by libvirt.
> >>
> >> Ultimately I think the problem is that there ought to be an explicit
> >> config to enable the feature for s390, as there is for SEV, and will
> >> also presumably be needed for ppc.
> >
> > Yes, an explicit config is what we want; unfortunately, we have to deal
> > with existing setups as well...
> >
> > The options I see are
> > - leave things for existing setups as they are now (i.e. might become
> >   unmigratable when the guest transitions), and make sure we're doing
> >   the right thing with the new object
> > - always make the unpack feature conflict with migration requirements;
> >   this is a guest-visible change
> >
> > The first option might be less hairy, all considered?
>
> What about a libvirt change that removes the unpack from the host-model as
> soon as only-migrateable is used. When that is in place, QEMU can reject
> the combination of only-migrateable + unpack.

I think libvirt needs to just unconditionally remove unpack from host-model
regardless, and require an explicit opt in. We can do that in libvirt
without compat problems, because we track the expansion of "host-model"
for existing running guests.

QEMU could introduce a deprecation warning right now, and then turn it into
an error after the deprecation cycle is complete.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 14.01.21 15:15, Daniel P. Berrangé wrote: > On Thu, Jan 14, 2021 at 03:09:01PM +0100, Christian Borntraeger wrote: >> >> >> On 14.01.21 15:04, Cornelia Huck wrote: >>> On Thu, 14 Jan 2021 12:20:48 +0000 >>> Daniel P. Berrangé <berrange@redhat.com> wrote: >>> >>>> On Thu, Jan 14, 2021 at 12:50:12PM +0100, Christian Borntraeger wrote: >>>>> >>>>> >>>>> On 14.01.21 12:45, Dr. David Alan Gilbert wrote: >>>>>> * Cornelia Huck (cohuck@redhat.com) wrote: >>>>>>> On Thu, 14 Jan 2021 11:52:11 +0100 >>>>>>> Christian Borntraeger <borntraeger@de.ibm.com> wrote: >>>>>>> >>>>>>>> On 14.01.21 11:36, Dr. David Alan Gilbert wrote: >>>>>>>>> * Christian Borntraeger (borntraeger@de.ibm.com) wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 13.01.21 13:42, Dr. David Alan Gilbert wrote: >>>>>>>>>>> * Cornelia Huck (cohuck@redhat.com) wrote: >>>>>>>>>>>> On Tue, 5 Jan 2021 12:41:25 -0800 >>>>>>>>>>>> Ram Pai <linuxram@us.ibm.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote: >>>>>>>>>>>>>> On Mon, 4 Jan 2021 10:40:26 -0800 >>>>>>>>>>>>>> Ram Pai <linuxram@us.ibm.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>>>> The main difference between my proposal and the other proposal is... >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In my proposal the guest makes the compatibility decision and acts >>>>>>>>>>>>>>> accordingly. In the other proposal QEMU makes the compatibility >>>>>>>>>>>>>>> decision and acts accordingly. I argue that QEMU cannot make a good >>>>>>>>>>>>>>> compatibility decision, because it wont know in advance, if the guest >>>>>>>>>>>>>>> will or will-not switch-to-secure. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> You have a point there when you say that QEMU does not know in advance, >>>>>>>>>>>>>> if the guest will or will-not switch-to-secure. I made that argument >>>>>>>>>>>>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea >>>>>>>>>>>>>> was to flip that property on demand when the conversion occurs. 
David >>>>>>>>>>>>>> explained to me that this is not possible for ppc, and that having the >>>>>>>>>>>>>> "securable-guest-memory" property (or whatever the name will be) >>>>>>>>>>>>>> specified is a strong indication, that the VM is intended to be used as >>>>>>>>>>>>>> a secure VM (thus it is OK to hurt the case where the guest does not >>>>>>>>>>>>>> try to transition). That argument applies here as well. >>>>>>>>>>>>> >>>>>>>>>>>>> As suggested by Cornelia Huck, what if QEMU disabled the >>>>>>>>>>>>> "securable-guest-memory" property if 'must-support-migrate' is enabled? >>>>>>>>>>>>> Offcourse; this has to be done with a big fat warning stating >>>>>>>>>>>>> "secure-guest-memory" feature is disabled on the machine. >>>>>>>>>>>>> Doing so, will continue to support guest that do not try to transition. >>>>>>>>>>>>> Guest that try to transition will fail and terminate themselves. >>>>>>>>>>>> >>>>>>>>>>>> Just to recap the s390x situation: >>>>>>>>>>>> >>>>>>>>>>>> - We currently offer a cpu feature that indicates secure execution to >>>>>>>>>>>> be available to the guest if the host supports it. >>>>>>>>>>>> - When we introduce the secure object, we still need to support >>>>>>>>>>>> previous configurations and continue to offer the cpu feature, even >>>>>>>>>>>> if the secure object is not specified. >>>>>>>>>>>> - As migration is currently not supported for secured guests, we add a >>>>>>>>>>>> blocker once the guest actually transitions. That means that >>>>>>>>>>>> transition fails if --only-migratable was specified on the command >>>>>>>>>>>> line. (Guests not transitioning will obviously not notice anything.) >>>>>>>>>>>> - With the secure object, we will already fail starting QEMU if >>>>>>>>>>>> --only-migratable was specified. >>>>>>>>>>>> >>>>>>>>>>>> My suggestion is now that we don't even offer the cpu feature if >>>>>>>>>>>> --only-migratable has been specified. 
For a guest that does not want to >>>>>>>>>>>> transition to secure mode, nothing changes; a guest that wants to >>>>>>>>>>>> transition to secure mode will notice that the feature is not available >>>>>>>>>>>> and fail appropriately (or ultimately, when the ultravisor call fails). >>>>>>>>>>>> We'd still fail starting QEMU for the secure object + --only-migratable >>>>>>>>>>>> combination. >>>>>>>>>>>> >>>>>>>>>>>> Does that make sense? >>>>>>>>>>> >>>>>>>>>>> It's a little unusual; I don't think we have any other cases where >>>>>>>>>>> --only-migratable changes the behaviour; I think it normally only stops >>>>>>>>>>> you doing something that would have made it unmigratable or causes >>>>>>>>>>> an operation that would make it unmigratable to fail. >>>>>>>>>> >>>>>>>>>> I would like to NOT block this feature with --only-migrateable. A guest >>>>>>>>>> can startup unprotected (and then is is migrateable). the migration blocker >>>>>>>>>> is really a dynamic aspect during runtime. >>>>>>>>> >>>>>>>>> But the point of --only-migratable is to turn things that would have >>>>>>>>> blocked migration into failures, so that a VM started with >>>>>>>>> --only-migratable is *always* migratable. >>>>>>>> >>>>>>>> Hmmm, fair enough. How do we do this with host-model? The constructed model >>>>>>>> would contain unpack, but then it will fail to startup? Or do we silently >>>>>>>> drop unpack in that case? Both variants do not feel completely right. >>>>>>> >>>>>>> Failing if you explicitly specified unpacked feels right, but failing >>>>>>> if you just used the host model feels odd. Removing unpack also is a >>>>>>> bit odd, but I think the better option if we want to do anything about >>>>>>> it at all. >>>>>> >>>>>> 'host-model' feels a bit special; but breaking the rule that >>>>>> only-migratable doesn't change behaviour is weird >>>>>> Can you do host,-unpack to make that work explicitly? >>>>> >>>>> I guess that should work. 
>>>>> But it means that we need to add logic in libvirt
>>>>> to disable unpack for host-passthru and host-model. Next problem is then,
>>>>> that a future version might implement migration of such guests, which means
>>>>> that libvirt must then stop fencing unpack.
>>>>
>>>> The "host-model" is supposed to always be migratable, so we should
>>>> fence the feature there.
>>>>
>>>> host-passthrough is "undefined" whether it is migratable - it may or may
>>>> not work, no guarantees made by libvirt.
>>>>
>>>> Ultimately I think the problem is that there ought to be an explicit
>>>> config to enable the feature for s390, as there is for SEV, and will
>>>> also presumably be needed for ppc.
>>>
>>> Yes, an explicit config is what we want; unfortunately, we have to deal
>>> with existing setups as well...
>>>
>>> The options I see are
>>> - leave things for existing setups as they are now (i.e. might become
>>>   unmigratable when the guest transitions), and make sure we're doing
>>>   the right thing with the new object
>>> - always make the unpack feature conflict with migration requirements;
>>>   this is a guest-visible change
>>>
>>> The first option might be less hairy, all considered?
>>
>> What about a libvirt change that removes the unpack from the host-model as
>> soon as only-migrateable is used. When that is in place, QEMU can reject
>> the combination of only-migrateable + unpack.
>
> I think libvirt needs to just unconditionally remove unpack from host-model
> regardless, and require an explicit opt in. We can do that in libvirt
> without compat problems, because we track the expansion of "host-model"
> for existing running guests.

This is true for running guests, but not for shutdown and restart.

I would really like to avoid bad (and hard to debug) surprises where a guest boots
fine with libvirt version x and then fails with x+1. So at the beginning
I am fine with libvirt removing "unpack" from the default host model expansion
if the --only-migratable parameter is used. Now I look into libvirt and I
cannot actually find code that uses this parameter. Are there some patches
posted somewhere?

>
> QEMU could introduce a deprecation warning right now, and then turn it into
> an error after the deprecation cycle is complete.
On Thu, Jan 14, 2021 at 04:25:21PM +0100, Christian Borntraeger wrote:
> On 14.01.21 15:15, Daniel P. Berrangé wrote:
> > On Thu, Jan 14, 2021 at 03:09:01PM +0100, Christian Borntraeger wrote:
> >>
> >>
> >> On 14.01.21 15:04, Cornelia Huck wrote:
> >>
> >> What about a libvirt change that removes the unpack from the host-model as
> >> soon as only-migrateable is used. When that is in place, QEMU can reject
> >> the combination of only-migrateable + unpack.
> >
> > I think libvirt needs to just unconditionally remove unpack from host-model
> > regardless, and require an explicit opt in. We can do that in libvirt
> > without compat problems, because we track the expansion of "host-model"
> > for existing running guests.
>
> This is true for running guests, but not for shutdown and restart.
>
> I would really like to avoid bad (and hard to debug) surprises that a guest boots
> fine with libvirt version x and then fail with x+1. So at the beginning
> I am fine with libvirt removing "unpack" from the default host model expansion
> if the --only-migrateable parameter is used. Now I look into libvirt and I
> cannot actually find code that uses this parameter. Are there some patches
> posted somewhere?

Sorry, I should have been clearer that we don't currently use
--only-migratable. I've been talking from the pov of the effects
if we were to introduce it into libvirt.

The way it would work would be for 'virsh start FOO' to start the
guest unconditionally, while 'virsh start --migratable FOO' would
start the same guest config but fail if it used a non-migratable
feature. We need the guest ABI to be the same in both cases.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Thu, Jan 14, 2021 at 11:25:17AM +0000, Daniel P. Berrangé wrote:
> On Wed, Jan 13, 2021 at 12:42:26PM +0000, Dr. David Alan Gilbert wrote:
> > * Cornelia Huck (cohuck@redhat.com) wrote:
> > > On Tue, 5 Jan 2021 12:41:25 -0800
> > > Ram Pai <linuxram@us.ibm.com> wrote:
> > >
> > > > On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > > > > On Mon, 4 Jan 2021 10:40:26 -0800
> > > > > Ram Pai <linuxram@us.ibm.com> wrote:
> > >
> > > > > > The main difference between my proposal and the other proposal is...
> > > > > >
> > > > > > In my proposal the guest makes the compatibility decision and acts
> > > > > > accordingly. In the other proposal QEMU makes the compatibility
> > > > > > decision and acts accordingly. I argue that QEMU cannot make a good
> > > > > > compatibility decision, because it wont know in advance, if the guest
> > > > > > will or will-not switch-to-secure.
> > > > > >
> > > > >
> > > > > You have a point there when you say that QEMU does not know in advance,
> > > > > if the guest will or will-not switch-to-secure. I made that argument
> > > > > regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > > > > was to flip that property on demand when the conversion occurs. David
> > > > > explained to me that this is not possible for ppc, and that having the
> > > > > "securable-guest-memory" property (or whatever the name will be)
> > > > > specified is a strong indication, that the VM is intended to be used as
> > > > > a secure VM (thus it is OK to hurt the case where the guest does not
> > > > > try to transition). That argument applies here as well.
> > > >
> > > > As suggested by Cornelia Huck, what if QEMU disabled the
> > > > "securable-guest-memory" property if 'must-support-migrate' is enabled?
> > > > Offcourse; this has to be done with a big fat warning stating
> > > > "secure-guest-memory" feature is disabled on the machine.
> > > > Doing so, will continue to support guest that do not try to transition.
> > > > Guest that try to transition will fail and terminate themselves.
> > >
> > > Just to recap the s390x situation:
> > >
> > > - We currently offer a cpu feature that indicates secure execution to
> > >   be available to the guest if the host supports it.
> > > - When we introduce the secure object, we still need to support
> > >   previous configurations and continue to offer the cpu feature, even
> > >   if the secure object is not specified.
> > > - As migration is currently not supported for secured guests, we add a
> > >   blocker once the guest actually transitions. That means that
> > >   transition fails if --only-migratable was specified on the command
> > >   line. (Guests not transitioning will obviously not notice anything.)
> > > - With the secure object, we will already fail starting QEMU if
> > >   --only-migratable was specified.
> > >
> > > My suggestion is now that we don't even offer the cpu feature if
> > > --only-migratable has been specified. For a guest that does not want to
> > > transition to secure mode, nothing changes; a guest that wants to
> > > transition to secure mode will notice that the feature is not available
> > > and fail appropriately (or ultimately, when the ultravisor call fails).
> > > We'd still fail starting QEMU for the secure object + --only-migratable
> > > combination.
> > >
> > > Does that make sense?
> >
> > It's a little unusual; I don't think we have any other cases where
> > --only-migratable changes the behaviour; I think it normally only stops
> > you doing something that would have made it unmigratable or causes
> > an operation that would make it unmigratable to fail.
>
> I agree, --only-migratable is supposed to be a *behavioural* toggle
> for QEMU. It must /not/ have any impact on the guest ABI.
>
> A management application needs to be able to add/remove --only-migratable
> at will without changing the exposing guest ABI.

At the qemu level, it sounds like the right thing to do is to fail
outright if all of the below are true:
 1. --only-migratable is specified
 2. -cpu host is specified
 3. unpack isn't explicitly disabled
 4. the host CPU actually does have the unpack facility

That can be changed if & when migration support is added for PV.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
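
For illustration only, the four conditions listed above can be written as a single predicate. This is a minimal sketch, not QEMU code: the struct, its field names and the function name are all hypothetical, and how each input would actually be derived from the command line and the host CPU model is left open.

```c
#include <stdbool.h>

/*
 * Hypothetical inputs for the startup check. None of these names are
 * QEMU identifiers; they only mirror the four conditions in the list.
 */
struct startup_config {
    bool only_migratable;   /* 1. --only-migratable was specified */
    bool cpu_host;          /* 2. -cpu host was specified */
    bool unpack_disabled;   /* 3. unpack was explicitly disabled */
    bool host_has_unpack;   /* 4. host CPU has the unpack facility */
};

/* Returns true when startup should fail outright. */
static bool must_refuse_startup(const struct startup_config *c)
{
    return c->only_migratable &&
           c->cpu_host &&
           !c->unpack_disabled &&
           c->host_has_unpack;
}
```

With all four conditions met the predicate is true; explicitly disabling unpack, or dropping --only-migratable, makes it false again, so the guest ABI itself is never altered by the flag.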
On Thu, Jan 14, 2021 at 10:36:43AM +0000, Dr. David Alan Gilbert wrote:
> * Christian Borntraeger (borntraeger@de.ibm.com) wrote:
> >
> >
> > On 13.01.21 13:42, Dr. David Alan Gilbert wrote:
> > > * Cornelia Huck (cohuck@redhat.com) wrote:
> > >> On Tue, 5 Jan 2021 12:41:25 -0800
> > >> Ram Pai <linuxram@us.ibm.com> wrote:
> > >>
> > >>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > >>>> On Mon, 4 Jan 2021 10:40:26 -0800
> > >>>> Ram Pai <linuxram@us.ibm.com> wrote:
> > >>
> > >>>>> The main difference between my proposal and the other proposal is...
> > >>>>>
> > >>>>> In my proposal the guest makes the compatibility decision and acts
> > >>>>> accordingly. In the other proposal QEMU makes the compatibility
> > >>>>> decision and acts accordingly. I argue that QEMU cannot make a good
> > >>>>> compatibility decision, because it wont know in advance, if the guest
> > >>>>> will or will-not switch-to-secure.
> > >>>>>
> > >>>>
> > >>>> You have a point there when you say that QEMU does not know in advance,
> > >>>> if the guest will or will-not switch-to-secure. I made that argument
> > >>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > >>>> was to flip that property on demand when the conversion occurs. David
> > >>>> explained to me that this is not possible for ppc, and that having the
> > >>>> "securable-guest-memory" property (or whatever the name will be)
> > >>>> specified is a strong indication, that the VM is intended to be used as
> > >>>> a secure VM (thus it is OK to hurt the case where the guest does not
> > >>>> try to transition). That argument applies here as well.
> > >>>
> > >>> As suggested by Cornelia Huck, what if QEMU disabled the
> > >>> "securable-guest-memory" property if 'must-support-migrate' is enabled?
> > >>> Offcourse; this has to be done with a big fat warning stating
> > >>> "secure-guest-memory" feature is disabled on the machine.
> > >>> Doing so, will continue to support guest that do not try to transition.
> > >>> Guest that try to transition will fail and terminate themselves.
> > >>
> > >> Just to recap the s390x situation:
> > >>
> > >> - We currently offer a cpu feature that indicates secure execution to
> > >> be available to the guest if the host supports it.
> > >> - When we introduce the secure object, we still need to support
> > >> previous configurations and continue to offer the cpu feature, even
> > >> if the secure object is not specified.
> > >> - As migration is currently not supported for secured guests, we add a
> > >> blocker once the guest actually transitions. That means that
> > >> transition fails if --only-migratable was specified on the command
> > >> line. (Guests not transitioning will obviously not notice anything.)
> > >> - With the secure object, we will already fail starting QEMU if
> > >> --only-migratable was specified.
> > >>
> > >> My suggestion is now that we don't even offer the cpu feature if
> > >> --only-migratable has been specified. For a guest that does not want to
> > >> transition to secure mode, nothing changes; a guest that wants to
> > >> transition to secure mode will notice that the feature is not available
> > >> and fail appropriately (or ultimately, when the ultravisor call fails).
> > >> We'd still fail starting QEMU for the secure object + --only-migratable
> > >> combination.
> > >>
> > >> Does that make sense?
> > >
> > > It's a little unusual; I don't think we have any other cases where
> > > --only-migratable changes the behaviour; I think it normally only stops
> > > you doing something that would have made it unmigratable or causes
> > > an operation that would make it unmigratable to fail.
> >
> > I would like to NOT block this feature with --only-migrateable. A guest
> > can startup unprotected (and then is is migrateable). the migration blocker
> > is really a dynamic aspect during runtime.
>
> But the point of --only-migratable is to turn things that would have
> blocked migration into failures, so that a VM started with
> --only-migratable is *always* migratable.
I believe the proposed behavior does follow the above rule. A VM
started with --only-migratable will always be migratable. Any attempt
by the guest to do otherwise will either be refused or will terminate
the guest, but it will never let the VM transition to a
non-migratable state.
RP
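
To make the runtime behaviour under discussion concrete, here is a toy model of a migration blocker that is refused when --only-migratable is in effect. It is a sketch under stated assumptions: the struct and both function names are made up for this example and are not QEMU's migration API, and the choice of -EACCES as the error code is an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy VM state; all names here are hypothetical. */
struct toy_vm {
    bool only_migratable;    /* started with --only-migratable */
    bool migration_blocked;  /* a migration blocker is registered */
    bool secure;             /* guest has switched to secure mode */
};

/* Registering a blocker is refused if the VM must stay migratable. */
static int toy_add_migration_blocker(struct toy_vm *vm)
{
    if (vm->only_migratable) {
        return -EACCES;
    }
    vm->migration_blocked = true;
    return 0;
}

/*
 * The switch to secure mode needs the blocker first. If the blocker is
 * refused, the transition fails and the VM stays migratable.
 */
static int toy_switch_to_secure(struct toy_vm *vm)
{
    int ret = toy_add_migration_blocker(vm);

    if (ret < 0) {
        return ret;
    }
    vm->secure = true;
    return 0;
}
```

In this model a guest started with --only-migratable can run unprotected, but its attempt to switch fails and leaves no blocker behind, which is the "never transitions to a non-migratable state" property described above.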
On Wed, Jan 13, 2021 at 09:06:29AM +0100, Cornelia Huck wrote:
> On Tue, 12 Jan 2021 10:55:11 -0800
> Ram Pai <linuxram@us.ibm.com> wrote:
>
> > On Tue, Jan 12, 2021 at 09:19:43AM +0100, Cornelia Huck wrote:
> > > On Mon, 11 Jan 2021 11:58:30 -0800
> > > Ram Pai <linuxram@us.ibm.com> wrote:
> > >
> > > > On Mon, Jan 11, 2021 at 05:59:14PM +0100, Cornelia Huck wrote:
> > > > > On Tue, 5 Jan 2021 12:41:25 -0800
> > > > > Ram Pai <linuxram@us.ibm.com> wrote:
> > > > >
> > > > > > On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > > > > > > On Mon, 4 Jan 2021 10:40:26 -0800
> > > > > > > Ram Pai <linuxram@us.ibm.com> wrote:
> > > > >
> > > > > > > > The main difference between my proposal and the other proposal is...
> > > > > > > >
> > > > > > > > In my proposal the guest makes the compatibility decision and acts
> > > > > > > > accordingly. In the other proposal QEMU makes the compatibility
> > > > > > > > decision and acts accordingly. I argue that QEMU cannot make a good
> > > > > > > > compatibility decision, because it wont know in advance, if the guest
> > > > > > > > will or will-not switch-to-secure.
> > > > > > > >
> > > > > > >
> > > > > > > You have a point there when you say that QEMU does not know in advance,
> > > > > > > if the guest will or will-not switch-to-secure. I made that argument
> > > > > > > regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > > > > > > was to flip that property on demand when the conversion occurs. David
> > > > > > > explained to me that this is not possible for ppc, and that having the
> > > > > > > "securable-guest-memory" property (or whatever the name will be)
> > > > > > > specified is a strong indication, that the VM is intended to be used as
> > > > > > > a secure VM (thus it is OK to hurt the case where the guest does not
> > > > > > > try to transition). That argument applies here as well.
> > > > > >
> > > > > > As suggested by Cornelia Huck, what if QEMU disabled the
> > > > > > "securable-guest-memory" property if 'must-support-migrate' is enabled?
> > > > > > Offcourse; this has to be done with a big fat warning stating
> > > > > > "secure-guest-memory" feature is disabled on the machine.
> > > > > > Doing so, will continue to support guest that do not try to transition.
> > > > > > Guest that try to transition will fail and terminate themselves.
> > > > >
> > > > > Just to recap the s390x situation:
> > > > >
> > > > > - We currently offer a cpu feature that indicates secure execution to
> > > > > be available to the guest if the host supports it.
> > > > > - When we introduce the secure object, we still need to support
> > > > > previous configurations and continue to offer the cpu feature, even
> > > > > if the secure object is not specified.
> > > > > - As migration is currently not supported for secured guests, we add a
> > > > > blocker once the guest actually transitions. That means that
> > > > > transition fails if --only-migratable was specified on the command
> > > > > line. (Guests not transitioning will obviously not notice anything.)
> > > > > - With the secure object, we will already fail starting QEMU if
> > > > > --only-migratable was specified.
> > > > >
> > > > > My suggestion is now that we don't even offer the cpu feature if
> > > > > --only-migratable has been specified. For a guest that does not want to
> > > > > transition to secure mode, nothing changes; a guest that wants to
> > > > > transition to secure mode will notice that the feature is not available
> > > > > and fail appropriately (or ultimately, when the ultravisor call fails).
> > > >
> > > >
> > > > On POWER, secure-execution is not **automatically** enabled even when
> > > > the host supports it. The feature is enabled only if the secure-object
> > > > is configured, and the host supports it.
> > >
> > > Yes, the cpu feature on s390x is simply pre-existing.
> > >
> > > >
> > > > However the behavior proposed above will be consistent on POWER and
> > > > on s390x, when '--only-migratable' is specified and 'secure-object'
> > > > is NOT specified.
> > > >
> > > > So I am in agreement till now.
> > > >
> > > >
> > > > > We'd still fail starting QEMU for the secure object + --only-migratable
> > > > > combination.
> > > >
> > > > Why fail?
> > > >
> > > > Instead, print a warning and disable the secure-object; which will
> > > > disable your cpu-feature. Guests that do not transition to secure, will
> > > > continue to operate, and guests that transition to secure, will fail.
> > >
> > > But that would be consistent with how other non-migratable objects are
> > > handled, no? It's simply a case of incompatible options on the command
> > > line.
> >
> > Actually the two options are inherently NOT incompatible. Halil also
> > mentioned this in one of his replies.
> >
> > Its just that the current implementation is lacking, which will be fixed
> > in the near future.
> >
> > We can design it upfront, with the assumption that they both are compatible.
> > In the short term disable one; preferrably the secure-object, if both
> > options are specified. In the long term, remove the restriction, when
> > the implemetation is complete.
>
> Can't we simply mark the object as non-migratable now, and then remove
> that later? I don't see what is so special about it.
This is fine too.

However, I am told that libvirt assumes the VM is guaranteed to be
migratable if '--only-migratable' is specified. Silently turning off
that option can be bad.
--
Ram Pai
* David Gibson (david@gibson.dropbear.id.au) wrote:
> On Thu, Jan 14, 2021 at 11:25:17AM +0000, Daniel P. Berrangé wrote:
> > On Wed, Jan 13, 2021 at 12:42:26PM +0000, Dr. David Alan Gilbert wrote:
> > > * Cornelia Huck (cohuck@redhat.com) wrote:
> > > > On Tue, 5 Jan 2021 12:41:25 -0800
> > > > Ram Pai <linuxram@us.ibm.com> wrote:
> > > >
> > > > > On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > > > > > On Mon, 4 Jan 2021 10:40:26 -0800
> > > > > > Ram Pai <linuxram@us.ibm.com> wrote:
> > > >
> > > > > > > The main difference between my proposal and the other proposal is...
> > > > > > >
> > > > > > > In my proposal the guest makes the compatibility decision and acts
> > > > > > > accordingly. In the other proposal QEMU makes the compatibility
> > > > > > > decision and acts accordingly. I argue that QEMU cannot make a good
> > > > > > > compatibility decision, because it wont know in advance, if the guest
> > > > > > > will or will-not switch-to-secure.
> > > > > > >
> > > > > >
> > > > > > You have a point there when you say that QEMU does not know in advance,
> > > > > > if the guest will or will-not switch-to-secure. I made that argument
> > > > > > regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > > > > > was to flip that property on demand when the conversion occurs. David
> > > > > > explained to me that this is not possible for ppc, and that having the
> > > > > > "securable-guest-memory" property (or whatever the name will be)
> > > > > > specified is a strong indication, that the VM is intended to be used as
> > > > > > a secure VM (thus it is OK to hurt the case where the guest does not
> > > > > > try to transition). That argument applies here as well.
> > > > >
> > > > > As suggested by Cornelia Huck, what if QEMU disabled the
> > > > > "securable-guest-memory" property if 'must-support-migrate' is enabled?
> > > > > Offcourse; this has to be done with a big fat warning stating
> > > > > "secure-guest-memory" feature is disabled on the machine.
> > > > > Doing so, will continue to support guest that do not try to transition.
> > > > > Guest that try to transition will fail and terminate themselves.
> > > >
> > > > Just to recap the s390x situation:
> > > >
> > > > - We currently offer a cpu feature that indicates secure execution to
> > > >   be available to the guest if the host supports it.
> > > > - When we introduce the secure object, we still need to support
> > > >   previous configurations and continue to offer the cpu feature, even
> > > >   if the secure object is not specified.
> > > > - As migration is currently not supported for secured guests, we add a
> > > >   blocker once the guest actually transitions. That means that
> > > >   transition fails if --only-migratable was specified on the command
> > > >   line. (Guests not transitioning will obviously not notice anything.)
> > > > - With the secure object, we will already fail starting QEMU if
> > > >   --only-migratable was specified.
> > > >
> > > > My suggestion is now that we don't even offer the cpu feature if
> > > > --only-migratable has been specified. For a guest that does not want to
> > > > transition to secure mode, nothing changes; a guest that wants to
> > > > transition to secure mode will notice that the feature is not available
> > > > and fail appropriately (or ultimately, when the ultravisor call fails).
> > > > We'd still fail starting QEMU for the secure object + --only-migratable
> > > > combination.
> > > >
> > > > Does that make sense?
> > >
> > > It's a little unusual; I don't think we have any other cases where
> > > --only-migratable changes the behaviour; I think it normally only stops
> > > you doing something that would have made it unmigratable or causes
> > > an operation that would make it unmigratable to fail.
> >
> > I agree, --only-migratable is supposed to be a *behavioural* toggle
> > for QEMU. It must /not/ have any impact on the guest ABI.
> >
> > A management application needs to be able to add/remove --only-migratable
> > at will without changing the exposing guest ABI.
>
> At the qemu level, it sounds like the right thing to do is to fail
> outright if all of the below are true:
>  1. --only-migratable is specified
>  2. -cpu host is specified
>  3. unpack isn't explicitly disabled
>  4. the host CPU actually does have the unpack facility
>
> That can be changed if & when migration support is added for PV.

That sounds right to me.

Dave

>
> -- 
> David Gibson                    | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
>                                 | _way_ _around_!
> http://www.ozlabs.org/~dgibson

-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Fri, 15 Jan 2021 10:55:14 -0800
Ram Pai <linuxram@us.ibm.com> wrote:
> On Wed, Jan 13, 2021 at 09:06:29AM +0100, Cornelia Huck wrote:
> > On Tue, 12 Jan 2021 10:55:11 -0800
> > Ram Pai <linuxram@us.ibm.com> wrote:
> >
> > > On Tue, Jan 12, 2021 at 09:19:43AM +0100, Cornelia Huck wrote:
> > > > On Mon, 11 Jan 2021 11:58:30 -0800
> > > > Ram Pai <linuxram@us.ibm.com> wrote:
> > > >
> > > > > On Mon, Jan 11, 2021 at 05:59:14PM +0100, Cornelia Huck wrote:
> > > > > > On Tue, 5 Jan 2021 12:41:25 -0800
> > > > > > Ram Pai <linuxram@us.ibm.com> wrote:
> > > > > >
> > > > > > > On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> > > > > > > > On Mon, 4 Jan 2021 10:40:26 -0800
> > > > > > > > Ram Pai <linuxram@us.ibm.com> wrote:
> > > > > >
> > > > > > > > > The main difference between my proposal and the other proposal is...
> > > > > > > > >
> > > > > > > > > In my proposal the guest makes the compatibility decision and acts
> > > > > > > > > accordingly. In the other proposal QEMU makes the compatibility
> > > > > > > > > decision and acts accordingly. I argue that QEMU cannot make a good
> > > > > > > > > compatibility decision, because it wont know in advance, if the guest
> > > > > > > > > will or will-not switch-to-secure.
> > > > > > > > >
> > > > > > > >
> > > > > > > > You have a point there when you say that QEMU does not know in advance,
> > > > > > > > if the guest will or will-not switch-to-secure. I made that argument
> > > > > > > > regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> > > > > > > > was to flip that property on demand when the conversion occurs. David
> > > > > > > > explained to me that this is not possible for ppc, and that having the
> > > > > > > > "securable-guest-memory" property (or whatever the name will be)
> > > > > > > > specified is a strong indication, that the VM is intended to be used as
> > > > > > > > a secure VM (thus it is OK to hurt the case where the guest does not
> > > > > > > > try to transition). That argument applies here as well.
> > > > > > >
> > > > > > > As suggested by Cornelia Huck, what if QEMU disabled the
> > > > > > > "securable-guest-memory" property if 'must-support-migrate' is enabled?
> > > > > > > Offcourse; this has to be done with a big fat warning stating
> > > > > > > "secure-guest-memory" feature is disabled on the machine.
> > > > > > > Doing so, will continue to support guest that do not try to transition.
> > > > > > > Guest that try to transition will fail and terminate themselves.
> > > > > >
> > > > > > Just to recap the s390x situation:
> > > > > >
> > > > > > - We currently offer a cpu feature that indicates secure execution to
> > > > > > be available to the guest if the host supports it.
> > > > > > - When we introduce the secure object, we still need to support
> > > > > > previous configurations and continue to offer the cpu feature, even
> > > > > > if the secure object is not specified.
> > > > > > - As migration is currently not supported for secured guests, we add a
> > > > > > blocker once the guest actually transitions. That means that
> > > > > > transition fails if --only-migratable was specified on the command
> > > > > > line. (Guests not transitioning will obviously not notice anything.)
> > > > > > - With the secure object, we will already fail starting QEMU if
> > > > > > --only-migratable was specified.
> > > > > >
> > > > > > My suggestion is now that we don't even offer the cpu feature if
> > > > > > --only-migratable has been specified. For a guest that does not want to
> > > > > > transition to secure mode, nothing changes; a guest that wants to
> > > > > > transition to secure mode will notice that the feature is not available
> > > > > > and fail appropriately (or ultimately, when the ultravisor call fails).
> > > > >
> > > > >
> > > > > On POWER, secure-execution is not **automatically** enabled even when
> > > > > the host supports it. The feature is enabled only if the secure-object
> > > > > is configured, and the host supports it.
> > > >
> > > > Yes, the cpu feature on s390x is simply pre-existing.
> > > >
> > > > >
> > > > > However the behavior proposed above will be consistent on POWER and
> > > > > on s390x, when '--only-migratable' is specified and 'secure-object'
> > > > > is NOT specified.
> > > > >
> > > > > So I am in agreement till now.
> > > > >
> > > > >
> > > > > > We'd still fail starting QEMU for the secure object + --only-migratable
> > > > > > combination.
> > > > >
> > > > > Why fail?
> > > > >
> > > > > Instead, print a warning and disable the secure-object; which will
> > > > > disable your cpu-feature. Guests that do not transition to secure, will
> > > > > continue to operate, and guests that transition to secure, will fail.
> > > >
> > > > But that would be consistent with how other non-migratable objects are
> > > > handled, no? It's simply a case of incompatible options on the command
> > > > line.
> > >
> > > Actually the two options are inherently NOT incompatible. Halil also
> > > mentioned this in one of his replies.
> > >
> > > Its just that the current implementation is lacking, which will be fixed
> > > in the near future.
> > >
> > > We can design it upfront, with the assumption that they both are compatible.
> > > In the short term disable one; preferrably the secure-object, if both
> > > options are specified. In the long term, remove the restriction, when
> > > the implemetation is complete.
> >
> > Can't we simply mark the object as non-migratable now, and then remove
> > that later? I don't see what is so special about it.
>
> This is fine too.
>
> However I am told that libvirt has some assumptions, where it assumes
> that the VM is guaranteed to be migratable if '--only-migratable' is
> specified. Silently turning off that option can be bad.
>
I meant "later" as in "when support for live migration has been added".
Mucking around with the options does not sound like a good idea.
On 18.01.21 18:39, Dr. David Alan Gilbert wrote:
> * David Gibson (david@gibson.dropbear.id.au) wrote:
>> On Thu, Jan 14, 2021 at 11:25:17AM +0000, Daniel P. Berrangé wrote:
>>> On Wed, Jan 13, 2021 at 12:42:26PM +0000, Dr. David Alan Gilbert wrote:
>>>> * Cornelia Huck (cohuck@redhat.com) wrote:
>>>>> On Tue, 5 Jan 2021 12:41:25 -0800
>>>>> Ram Pai <linuxram@us.ibm.com> wrote:
>>>>>
>>>>>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
>>>>>>> On Mon, 4 Jan 2021 10:40:26 -0800
>>>>>>> Ram Pai <linuxram@us.ibm.com> wrote:
>>>>>
>>>>>>>> The main difference between my proposal and the other proposal is...
>>>>>>>>
>>>>>>>> In my proposal the guest makes the compatibility decision and acts
>>>>>>>> accordingly. In the other proposal QEMU makes the compatibility
>>>>>>>> decision and acts accordingly. I argue that QEMU cannot make a good
>>>>>>>> compatibility decision, because it wont know in advance, if the guest
>>>>>>>> will or will-not switch-to-secure.
>>>>>>>>
>>>>>>>
>>>>>>> You have a point there when you say that QEMU does not know in advance,
>>>>>>> if the guest will or will-not switch-to-secure. I made that argument
>>>>>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
>>>>>>> was to flip that property on demand when the conversion occurs. David
>>>>>>> explained to me that this is not possible for ppc, and that having the
>>>>>>> "securable-guest-memory" property (or whatever the name will be)
>>>>>>> specified is a strong indication, that the VM is intended to be used as
>>>>>>> a secure VM (thus it is OK to hurt the case where the guest does not
>>>>>>> try to transition). That argument applies here as well.
>>>>>>
>>>>>> As suggested by Cornelia Huck, what if QEMU disabled the
>>>>>> "securable-guest-memory" property if 'must-support-migrate' is enabled?
>>>>>> Offcourse; this has to be done with a big fat warning stating
>>>>>> "secure-guest-memory" feature is disabled on the machine.
>>>>>> Doing so, will continue to support guest that do not try to transition.
>>>>>> Guest that try to transition will fail and terminate themselves.
>>>>>
>>>>> Just to recap the s390x situation:
>>>>>
>>>>> - We currently offer a cpu feature that indicates secure execution to
>>>>>   be available to the guest if the host supports it.
>>>>> - When we introduce the secure object, we still need to support
>>>>>   previous configurations and continue to offer the cpu feature, even
>>>>>   if the secure object is not specified.
>>>>> - As migration is currently not supported for secured guests, we add a
>>>>>   blocker once the guest actually transitions. That means that
>>>>>   transition fails if --only-migratable was specified on the command
>>>>>   line. (Guests not transitioning will obviously not notice anything.)
>>>>> - With the secure object, we will already fail starting QEMU if
>>>>>   --only-migratable was specified.
>>>>>
>>>>> My suggestion is now that we don't even offer the cpu feature if
>>>>> --only-migratable has been specified. For a guest that does not want to
>>>>> transition to secure mode, nothing changes; a guest that wants to
>>>>> transition to secure mode will notice that the feature is not available
>>>>> and fail appropriately (or ultimately, when the ultravisor call fails).
>>>>> We'd still fail starting QEMU for the secure object + --only-migratable
>>>>> combination.
>>>>>
>>>>> Does that make sense?
>>>>
>>>> It's a little unusual; I don't think we have any other cases where
>>>> --only-migratable changes the behaviour; I think it normally only stops
>>>> you doing something that would have made it unmigratable or causes
>>>> an operation that would make it unmigratable to fail.
>>>
>>> I agree, --only-migratable is supposed to be a *behavioural* toggle
>>> for QEMU. It must /not/ have any impact on the guest ABI.
>>>
>>> A management application needs to be able to add/remove --only-migratable
>>> at will without changing the exposing guest ABI.
>>
>> At the qemu level, it sounds like the right thing to do is to fail
>> outright if all of the below are true:
>>  1. --only-migratable is specified
>>  2. -cpu host is specified
>>  3. unpack isn't explicitly disabled
>>  4. the host CPU actually does have the unpack facility
>>
>> That can be changed if & when migration support is added for PV.
>
> That sounds right to me.

As startup will fail anyway if the guest CPU model enables unpack but the
host CPU does not support it, this can be simplified: forbid startup in
QEMU if --only-migratable is combined with unpack being active in the
guest CPU model.

This is actually independent of this patch set. Maybe just
something like:

diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index 35179f9dc7ba..3b85ff4e31b2 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -26,6 +26,7 @@
 #include "qapi/qmp/qdict.h"
 #ifndef CONFIG_USER_ONLY
 #include "sysemu/arch_init.h"
+#include "sysemu/sysemu.h"
 #include "hw/pci/pci.h"
 #endif
 #include "qapi/qapi-commands-machine-target.h"
@@ -878,6 +879,11 @@ static void check_compatibility(const S390CPUModel *max_model,
         return;
     }
 
+    if (only_migratable && test_bit(S390_FEAT_UNPACK, model->features)) {
+        error_setg(errp, "The unpack facility is not compatible with "
+                   "the --only-migratable option");
+    }
+
     /* detect the missing features to properly report them */
     bitmap_andnot(missing, model->features, max_model->features, S390_FEAT_MAX);
     if (bitmap_empty(missing, S390_FEAT_MAX)) {
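
As a standalone illustration of why the host-support condition drops out, the simplified check can be modelled with plain bit masks. This is only a sketch: `FEAT_UNPACK`, `check_model` and the returned messages are invented for the example and are not the QEMU code touched by the diff. One further assumption is made loudly here: the sketch returns immediately after reporting the --only-migratable conflict so that only one error is ever reported, which the posted diff does not do.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bit; the position is arbitrary for this sketch. */
#define FEAT_UNPACK (UINT64_C(1) << 0)

/*
 * Returns NULL if the requested model may start, otherwise a short
 * reason. "model" holds the features requested for the guest,
 * "max_model" the features the host can provide.
 */
static const char *check_model(uint64_t model, uint64_t max_model,
                               bool only_migratable)
{
    /* Simplified rule: unpack active in the guest model is enough. */
    if (only_migratable && (model & FEAT_UNPACK)) {
        return "unpack is not compatible with --only-migratable";
    }

    /* A model asking for features the host lacks fails regardless. */
    if (model & ~max_model) {
        return "model requests features the host does not provide";
    }

    return NULL;
}
```

A guest model without unpack starts with or without --only-migratable, while a model requesting unpack on a host that lacks it is rejected by the second test either way, so the separate "host actually has unpack" condition is not needed.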
On Tue, 19 Jan 2021 09:28:22 +0100
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> On 18.01.21 18:39, Dr. David Alan Gilbert wrote:
> > * David Gibson (david@gibson.dropbear.id.au) wrote:
> >> On Thu, Jan 14, 2021 at 11:25:17AM +0000, Daniel P. Berrangé wrote:
> >>> On Wed, Jan 13, 2021 at 12:42:26PM +0000, Dr. David Alan Gilbert wrote:
> >>>> * Cornelia Huck (cohuck@redhat.com) wrote:
> >>>>> On Tue, 5 Jan 2021 12:41:25 -0800
> >>>>> Ram Pai <linuxram@us.ibm.com> wrote:
> >>>>>
> >>>>>> On Tue, Jan 05, 2021 at 11:56:14AM +0100, Halil Pasic wrote:
> >>>>>>> On Mon, 4 Jan 2021 10:40:26 -0800
> >>>>>>> Ram Pai <linuxram@us.ibm.com> wrote:
> >>>>>
> >>>>>>>> The main difference between my proposal and the other proposal is...
> >>>>>>>>
> >>>>>>>> In my proposal the guest makes the compatibility decision and acts
> >>>>>>>> accordingly. In the other proposal QEMU makes the compatibility
> >>>>>>>> decision and acts accordingly. I argue that QEMU cannot make a good
> >>>>>>>> compatibility decision, because it wont know in advance, if the guest
> >>>>>>>> will or will-not switch-to-secure.
> >>>>>>>>
> >>>>>>>
> >>>>>>> You have a point there when you say that QEMU does not know in advance,
> >>>>>>> if the guest will or will-not switch-to-secure. I made that argument
> >>>>>>> regarding VIRTIO_F_ACCESS_PLATFORM (iommu_platform) myself. My idea
> >>>>>>> was to flip that property on demand when the conversion occurs. David
> >>>>>>> explained to me that this is not possible for ppc, and that having the
> >>>>>>> "securable-guest-memory" property (or whatever the name will be)
> >>>>>>> specified is a strong indication, that the VM is intended to be used as
> >>>>>>> a secure VM (thus it is OK to hurt the case where the guest does not
> >>>>>>> try to transition). That argument applies here as well.
> >>>>>>
> >>>>>> As suggested by Cornelia Huck, what if QEMU disabled the
> >>>>>> "securable-guest-memory" property if 'must-support-migrate' is enabled?
> >>>>>> Offcourse; this has to be done with a big fat warning stating
> >>>>>> "secure-guest-memory" feature is disabled on the machine.
> >>>>>> Doing so, will continue to support guest that do not try to transition.
> >>>>>> Guest that try to transition will fail and terminate themselves.
> >>>>>
> >>>>> Just to recap the s390x situation:
> >>>>>
> >>>>> - We currently offer a cpu feature that indicates secure execution to
> >>>>>   be available to the guest if the host supports it.
> >>>>> - When we introduce the secure object, we still need to support
> >>>>>   previous configurations and continue to offer the cpu feature, even
> >>>>>   if the secure object is not specified.
> >>>>> - As migration is currently not supported for secured guests, we add a
> >>>>>   blocker once the guest actually transitions. That means that
> >>>>>   transition fails if --only-migratable was specified on the command
> >>>>>   line. (Guests not transitioning will obviously not notice anything.)
> >>>>> - With the secure object, we will already fail starting QEMU if
> >>>>>   --only-migratable was specified.
> >>>>>
> >>>>> My suggestion is now that we don't even offer the cpu feature if
> >>>>> --only-migratable has been specified. For a guest that does not want to
> >>>>> transition to secure mode, nothing changes; a guest that wants to
> >>>>> transition to secure mode will notice that the feature is not available
> >>>>> and fail appropriately (or ultimately, when the ultravisor call fails).
> >>>>> We'd still fail starting QEMU for the secure object + --only-migratable
> >>>>> combination.
> >>>>>
> >>>>> Does that make sense?
> >>>>
> >>>> It's a little unusual; I don't think we have any other cases where
> >>>> --only-migratable changes the behaviour; I think it normally only stops
> >>>> you doing something that would have made it unmigratable or causes
> >>>> an operation that would make it unmigratable to fail.
> >>>
> >>> I agree, --only-migratable is supposed to be a *behavioural* toggle
> >>> for QEMU. It must /not/ have any impact on the guest ABI.
> >>>
> >>> A management application needs to be able to add/remove --only-migratable
> >>> at will without changing the exposing guest ABI.
> >>
> >> At the qemu level, it sounds like the right thing to do is to fail
> >> outright if all of the below are true:
> >>  1. --only-migratable is specified
> >>  2. -cpu host is specified
> >>  3. unpack isn't explicitly disabled
> >>  4. the host CPU actually does have the unpack facility
> >>
> >> That can be changed if & when migration support is added for PV.
> >
> > That sounds right to me.
>
> as startup will fail anyway if the guest cpu model enables unpack, but the host
> cpu does not support it this can be simplified to forbid startup in qemu if
> --only-migratable is combined with unpack being active in the guest cpu model.
>
> This is actually independent from this patch set.

Yep, I think we should just go ahead and fix this.

> maybe just
> something like
>
> diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
> index 35179f9dc7ba..3b85ff4e31b2 100644
> --- a/target/s390x/cpu_models.c
> +++ b/target/s390x/cpu_models.c
> @@ -26,6 +26,7 @@
>  #include "qapi/qmp/qdict.h"
>  #ifndef CONFIG_USER_ONLY
>  #include "sysemu/arch_init.h"
> +#include "sysemu/sysemu.h"
>  #include "hw/pci/pci.h"
>  #endif
>  #include "qapi/qapi-commands-machine-target.h"
> @@ -878,6 +879,11 @@ static void check_compatibility(const S390CPUModel *max_model,
>          return;
>      }
>
> +    if (only_migratable && test_bit(S390_FEAT_UNPACK, model->features)) {
> +        error_setg(errp, "The unpack facility is not compatible with "
> +                   "the --only-migratable option");
> +    }
> +
>      /* detect the missing features to properly report them */
>      bitmap_andnot(missing, model->features, max_model->features, S390_FEAT_MAX);
>      if (bitmap_empty(missing, S390_FEAT_MAX)) {
>

Want to send this as a proper patch?
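
For anyone following along, the simplified rule (refuse to start when
--only-migratable is combined with unpack being active in the guest cpu
model) can be sketched in isolation. This is a minimal standalone
illustration, not QEMU code: GuestCpuModel and check_only_migratable() are
hypothetical names, and reporting the failure through a returned message
rather than an Error ** is an assumption made to keep the sketch
self-contained.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the guest CPU model. QEMU's real type is
 * S390CPUModel with a feature bitmap; a single flag is enough here.
 */
typedef struct {
    bool unpack; /* true if the unpack facility is active in the guest model */
} GuestCpuModel;

/*
 * Returns NULL if startup may proceed, or a static error message if the
 * combination must be rejected. Whether the host supports unpack is not
 * checked here: per the thread, startup already fails elsewhere when the
 * guest model enables a feature the host lacks.
 */
static const char *check_only_migratable(bool only_migratable,
                                         const GuestCpuModel *model)
{
    if (only_migratable && model->unpack) {
        return "The unpack facility is not compatible with "
               "the --only-migratable option";
    }
    return NULL;
}
```

Note that the guest ABI is untouched in this sketch: the option never
changes which features the model offers, it only decides whether startup
is refused, which is the behaviour Daniel asks for above.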
On Fri, Jan 15, 2021 at 10:55:14AM -0800, Ram Pai wrote:
> On Wed, Jan 13, 2021 at 09:06:29AM +0100, Cornelia Huck wrote:
> > On Tue, 12 Jan 2021 10:55:11 -0800
> > Ram Pai <linuxram@us.ibm.com> wrote:
> >
> > > On Tue, Jan 12, 2021 at 09:19:43AM +0100, Cornelia Huck wrote:
> > > Actually the two options are inherently NOT incompatible. Halil also
> > > mentioned this in one of his replies.
> > >
> > > It's just that the current implementation is lacking, which will be fixed
> > > in the near future.
> > >
> > > We can design it upfront, with the assumption that they both are compatible.
> > > In the short term disable one; preferably the secure-object, if both
> > > options are specified. In the long term, remove the restriction, when
> > > the implementation is complete.
> >
> > Can't we simply mark the object as non-migratable now, and then remove
> > that later? I don't see what is so special about it.
>
> This is fine too.
>
> However I am told that libvirt has some assumptions, where it assumes
> that the VM is guaranteed to be migratable if '--only-migratable' is
> specified. Silently turning off that option can be bad.

To be clear, libvirt does *not* currently use --only-migratable.

What you're describing here is QEMU's own definition of this flag

  $ qemu-system-x86_64 | grep migratable
  -only-migratable allow only migratable devices

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|