linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess
@ 2022-04-21 18:04 Paolo Bonzini
  2022-04-21 18:04 ` [PATCH 1/4] KVM: x86: always initialize system_event.ndata Paolo Bonzini
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Paolo Bonzini @ 2022-04-21 18:04 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: will, maz, apatel, atishp, seanjc, pgonda

The KVM_SYSTEM_EVENT_NDATA_VALID mechanism that was introduced
contextually with KVM_SYSTEM_EVENT_SEV_TERM is not a good match
for ARM and RISC-V, which want to communicate information even
for existing KVM_SYSTEM_EVENT_* constants.  Userspace is not ready
to filter out bit 31 of type, and fails to process the
KVM_EXIT_SYSTEM_EVENT exit.

Therefore, tie the availability of ndata to a system capability;
if the capability is present, ndata is always valid, so patch 1
makes x86 always initialize it.  Then patches 2 and 3 fix
ARM and RISC-V compilation and patch 4 enables the capability.

Only compiled on x86, waiting for acks.

Paolo

Paolo Bonzini (4):
  KVM: x86: always initialize system_event.ndata
  KVM: ARM: replace system_event.flags with ndata and data[0]
  KVM: RISC-V: replace system_event.flags with ndata and data[0]
  KVM: tell userspace that system_event.ndata is valid

 Documentation/virt/kvm/api.rst        | 29 +++++++++++++++------------
 arch/arm64/kvm/psci.c                 |  3 ++-
 arch/riscv/include/asm/kvm_vcpu_sbi.h |  2 +-
 arch/riscv/kvm/vcpu_sbi.c             |  5 +++--
 arch/riscv/kvm/vcpu_sbi_replace.c     |  4 ++--
 arch/riscv/kvm/vcpu_sbi_v01.c         |  2 +-
 arch/x86/kvm/svm/sev.c                |  3 +--
 arch/x86/kvm/x86.c                    |  2 ++
 include/uapi/linux/kvm.h              |  2 +-
 virt/kvm/kvm_main.c                   |  1 +
 10 files changed, 30 insertions(+), 23 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] KVM: x86: always initialize system_event.ndata
  2022-04-21 18:04 [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess Paolo Bonzini
@ 2022-04-21 18:04 ` Paolo Bonzini
  2022-04-22  9:48   ` Marc Zyngier
  2022-04-21 18:04 ` [PATCH 2/4] KVM: ARM: replace system_event.flags with ndata and data[0] Paolo Bonzini
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Paolo Bonzini @ 2022-04-21 18:04 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: will, maz, apatel, atishp, seanjc, pgonda

The KVM_SYSTEM_EVENT_NDATA_VALID mechanism that was introduced
contextually with KVM_SYSTEM_EVENT_SEV_TERM is not a good match
for ARM and RISC-V, which want to communicate information even
for existing KVM_SYSTEM_EVENT_* constants.  Userspace is not ready
to filter out bit 31 of type, and fails to process the
KVM_EXIT_SYSTEM_EVENT exit.

Therefore, tie the availability of ndata to a system capability
(which will be added once all architectures are on board).
Userspace written for released versions of Linux has no reason to
check flags, since it was never written, so it is okay to replace
it with ndata and data[0] (on 32-bit kernels) or with data[0]
(on 64-bit kernels).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/svm/sev.c   | 3 +--
 arch/x86/kvm/x86.c       | 2 ++
 include/uapi/linux/kvm.h | 1 -
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index a93f0d01bb90..51b963ec122b 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2739,8 +2739,7 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
 			reason_set, reason_code);
 
 		vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
-		vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM |
-					       KVM_SYSTEM_EVENT_NDATA_VALID;
+		vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM;
 		vcpu->run->system_event.ndata = 1;
 		vcpu->run->system_event.data[1] = control->ghcb_gpa;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4e7f3a8da16a..517c0228881c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10056,12 +10056,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		if (kvm_check_request(KVM_REQ_HV_CRASH, vcpu)) {
 			vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
 			vcpu->run->system_event.type = KVM_SYSTEM_EVENT_CRASH;
+			vcpu->run->system_event.ndata = 0;
 			r = 0;
 			goto out;
 		}
 		if (kvm_check_request(KVM_REQ_HV_RESET, vcpu)) {
 			vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
 			vcpu->run->system_event.type = KVM_SYSTEM_EVENT_RESET;
+			vcpu->run->system_event.ndata = 0;
 			r = 0;
 			goto out;
 		}
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index dd1d8167e71f..5a57f74b4903 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -445,7 +445,6 @@ struct kvm_run {
 #define KVM_SYSTEM_EVENT_RESET          2
 #define KVM_SYSTEM_EVENT_CRASH          3
 #define KVM_SYSTEM_EVENT_SEV_TERM       4
-#define KVM_SYSTEM_EVENT_NDATA_VALID    (1u << 31)
 			__u32 type;
 			__u32 ndata;
 			__u64 data[16];
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/4] KVM: ARM: replace system_event.flags with ndata and data[0]
  2022-04-21 18:04 [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess Paolo Bonzini
  2022-04-21 18:04 ` [PATCH 1/4] KVM: x86: always initialize system_event.ndata Paolo Bonzini
@ 2022-04-21 18:04 ` Paolo Bonzini
  2022-04-21 18:04 ` [PATCH 3/4] KVM: RISC-V: " Paolo Bonzini
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2022-04-21 18:04 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: will, maz, apatel, atishp, seanjc, pgonda

The flags datum for KVM_EXIT_SYSTEM_EVENT exits has been
removed because it was defined incorrectly; no padding was
introduced between the 32-bit type and the 64-bit flags,
resulting in different definitions for 32-bit and 64-bit
userspace.

The replacement is a pair of fields, ndata and data[],
with ndata saying how many items are valid in the data array.
In the case of ARM that's one, as flags can be simply moved
to data[0].

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/arm64/kvm/psci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index baac2b405f23..708d80e8e60d 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -181,7 +181,8 @@ static void kvm_prepare_system_event(struct kvm_vcpu *vcpu, u32 type, u64 flags)
 
 	memset(&vcpu->run->system_event, 0, sizeof(vcpu->run->system_event));
 	vcpu->run->system_event.type = type;
-	vcpu->run->system_event.flags = flags;
+	vcpu->run->system_event.ndata = 1;
+	vcpu->run->system_event.data[0] = flags;
 	vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
 }
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/4] KVM: RISC-V: replace system_event.flags with ndata and data[0]
  2022-04-21 18:04 [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess Paolo Bonzini
  2022-04-21 18:04 ` [PATCH 1/4] KVM: x86: always initialize system_event.ndata Paolo Bonzini
  2022-04-21 18:04 ` [PATCH 2/4] KVM: ARM: replace system_event.flags with ndata and data[0] Paolo Bonzini
@ 2022-04-21 18:04 ` Paolo Bonzini
  2022-04-21 18:04 ` [PATCH 4/4] KVM: tell userspace that system_event.ndata is valid Paolo Bonzini
  2022-04-22  7:58 ` [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess Oliver Upton
  4 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2022-04-21 18:04 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: will, maz, apatel, atishp, seanjc, pgonda

The flags datum for KVM_EXIT_SYSTEM_EVENT exits has been
removed because it was defined incorrectly; no padding was
introduced between the 32-bit type and the 64-bit flags,
resulting in different definitions for 32-bit and 64-bit
userspace.

The replacement is a pair of fields, ndata and data[],
with ndata saying how many items are valid in the data array.
In the case of RISC-V that's one for the SRST SBI call (with
flags simply moved to data[0]) and zero for the legacy v0.1
call.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/riscv/include/asm/kvm_vcpu_sbi.h | 2 +-
 arch/riscv/kvm/vcpu_sbi.c             | 5 +++--
 arch/riscv/kvm/vcpu_sbi_replace.c     | 4 ++--
 arch/riscv/kvm/vcpu_sbi_v01.c         | 2 +-
 4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
index 83d6d4d2b1df..c4b10c25264a 100644
--- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
+++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
@@ -30,7 +30,7 @@ struct kvm_vcpu_sbi_extension {
 void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run);
 void kvm_riscv_vcpu_sbi_system_reset(struct kvm_vcpu *vcpu,
 				     struct kvm_run *run,
-				     u32 type, u64 flags);
+				     u32 type, bool have_reason, u64 reason);
 const struct kvm_vcpu_sbi_extension *kvm_vcpu_sbi_find_ext(unsigned long extid);
 
 #endif /* __RISCV_KVM_VCPU_SBI_H__ */
diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
index a09ecb97b890..2019ca01b44b 100644
--- a/arch/riscv/kvm/vcpu_sbi.c
+++ b/arch/riscv/kvm/vcpu_sbi.c
@@ -83,7 +83,7 @@ void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run)
 
 void kvm_riscv_vcpu_sbi_system_reset(struct kvm_vcpu *vcpu,
 				     struct kvm_run *run,
-				     u32 type, u64 flags)
+				     u32 type, bool have_reason, u64 reason)
 {
 	unsigned long i;
 	struct kvm_vcpu *tmp;
@@ -94,7 +94,8 @@ void kvm_riscv_vcpu_sbi_system_reset(struct kvm_vcpu *vcpu,
 
 	memset(&run->system_event, 0, sizeof(run->system_event));
 	run->system_event.type = type;
-	run->system_event.flags = flags;
+	run->system_event.ndata = have_reason;
+	run->system_event.data[0] = reason;
 	run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
 }
 
diff --git a/arch/riscv/kvm/vcpu_sbi_replace.c b/arch/riscv/kvm/vcpu_sbi_replace.c
index 0f217365c287..a36414a5f59e 100644
--- a/arch/riscv/kvm/vcpu_sbi_replace.c
+++ b/arch/riscv/kvm/vcpu_sbi_replace.c
@@ -148,14 +148,14 @@ static int kvm_sbi_ext_srst_handler(struct kvm_vcpu *vcpu,
 		case SBI_SRST_RESET_TYPE_SHUTDOWN:
 			kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
 						KVM_SYSTEM_EVENT_SHUTDOWN,
-						reason);
+						true, reason);
 			*exit = true;
 			break;
 		case SBI_SRST_RESET_TYPE_COLD_REBOOT:
 		case SBI_SRST_RESET_TYPE_WARM_REBOOT:
 			kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
 						KVM_SYSTEM_EVENT_RESET,
-						reason);
+						true, reason);
 			*exit = true;
 			break;
 		default:
diff --git a/arch/riscv/kvm/vcpu_sbi_v01.c b/arch/riscv/kvm/vcpu_sbi_v01.c
index da4d6c99c2cf..9598bc6b4c0e 100644
--- a/arch/riscv/kvm/vcpu_sbi_v01.c
+++ b/arch/riscv/kvm/vcpu_sbi_v01.c
@@ -66,7 +66,7 @@ static int kvm_sbi_ext_v01_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		break;
 	case SBI_EXT_0_1_SHUTDOWN:
 		kvm_riscv_vcpu_sbi_system_reset(vcpu, run,
-						KVM_SYSTEM_EVENT_SHUTDOWN, 0);
+						KVM_SYSTEM_EVENT_SHUTDOWN, false, 0);
 		*exit = true;
 		break;
 	case SBI_EXT_0_1_REMOTE_FENCE_I:
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/4] KVM: tell userspace that system_event.ndata is valid
  2022-04-21 18:04 [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess Paolo Bonzini
                   ` (2 preceding siblings ...)
  2022-04-21 18:04 ` [PATCH 3/4] KVM: RISC-V: " Paolo Bonzini
@ 2022-04-21 18:04 ` Paolo Bonzini
  2022-04-22  7:58 ` [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess Oliver Upton
  4 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2022-04-21 18:04 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: will, maz, apatel, atishp, seanjc, pgonda

Now that all architectures are fixed, introduce a new capability
so that userspace knows that it can look at the ndata and data[]
members of run->system_event.  Adjust the documentation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 Documentation/virt/kvm/api.rst | 29 ++++++++++++++++-------------
 include/uapi/linux/kvm.h       |  1 +
 virt/kvm/kvm_main.c            |  1 +
 3 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 72183ae628f7..fe5805ab0d75 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6089,21 +6089,18 @@ should put the acknowledged interrupt vector into the 'epr' field.
   #define KVM_SYSTEM_EVENT_RESET          2
   #define KVM_SYSTEM_EVENT_CRASH          3
   #define KVM_SYSTEM_EVENT_SEV_TERM       4
-  #define KVM_SYSTEM_EVENT_NDATA_VALID    (1u << 31)
 			__u32 type;
                         __u32 ndata;
-			__u64 flags;
                         __u64 data[16];
 		} system_event;
 
 If exit_reason is KVM_EXIT_SYSTEM_EVENT then the vcpu has triggered
 a system-level event using some architecture specific mechanism (hypercall
 or some special instruction). In case of ARM64, this is triggered using
-HVC instruction based PSCI call from the vcpu. The 'type' field describes
-the system-level event type. The 'flags' field describes architecture
-specific flags for the system-level event.
+HVC instruction based PSCI call from the vcpu.
 
-Valid values for bits 30:0 of 'type' are:
+The 'type' field describes the system-level event type.
+Valid values for 'type' are:
 
  - KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
    VM. Userspace is not obliged to honour this, and if it does honour
@@ -6117,16 +6114,23 @@ Valid values for bits 30:0 of 'type' are:
    to ignore the request, or to gather VM memory core dump and/or
    reset/shutdown of the VM.
  - KVM_SYSTEM_EVENT_SEV_TERM -- an AMD SEV guest requested termination.
-   The guest physical address of the guest's GHCB is stored in `data[0]`.
 
-Valid flags are:
+If KVM_CAP_SYSTEM_EVENT_DATA is present, the 'data' field can contain
+architecture specific information for the system-level event.  Only
+the first `ndata` items (possibly zero) of the data array are valid.
+
+ - for arm64, data[0] is set to KVM_SYSTEM_EVENT_RESET_FLAG_PSCI_RESET2 if
+   the guest issued a SYSTEM_RESET2 call according to v1.1 of the PSCI
+   specification.
+
+ - for RISC-V, data[0] is set to the value of the second argument of the
+   ``sbi_system_reset`` call.
 
- - KVM_SYSTEM_EVENT_RESET_FLAG_PSCI_RESET2 (arm64 only) -- the guest issued
-   a SYSTEM_RESET2 call according to v1.1 of the PSCI specification.
+ - for x86, KVM_SYSTEM_EVENT_SEV_TERM stores the guest physical address of the
+   guest's GHCB in `data[0]`.
 
-Extra data for this event is stored in the `data[]` array, up to index
-`ndata-1` included, if bit 31 is set in `type`.  The data depends on the
-`type` field.  There is no extra data if bit 31 is clear or `ndata` is zero.
+Previous versions of Linux defined a `flags` member in this struct.  The
+field however was never written.
 
 ::
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 5a57f74b4903..f76ffecda38a 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1147,6 +1147,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_PMU_CAPABILITY 212
 #define KVM_CAP_DISABLE_QUIRKS2 213
 #define KVM_CAP_VM_TSC_CONTROL 214
+#define KVM_CAP_SYSTEM_EVENT_DATA 215
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index dfb7dabdbc63..ac57fc2c935f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4333,6 +4333,7 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 		return 0;
 #endif
 	case KVM_CAP_BINARY_STATS_FD:
+	case KVM_CAP_SYSTEM_EVENT_DATA:
 		return 1;
 	default:
 		break;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess
  2022-04-21 18:04 [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess Paolo Bonzini
                   ` (3 preceding siblings ...)
  2022-04-21 18:04 ` [PATCH 4/4] KVM: tell userspace that system_event.ndata is valid Paolo Bonzini
@ 2022-04-22  7:58 ` Oliver Upton
  2022-04-22  9:41   ` Paolo Bonzini
  4 siblings, 1 reply; 11+ messages in thread
From: Oliver Upton @ 2022-04-22  7:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: linux-kernel, kvm, will, maz, apatel, atishp, seanjc, pgonda

Hi Paolo,

On Thu, Apr 21, 2022 at 02:04:39PM -0400, Paolo Bonzini wrote:
> The KVM_SYSTEM_EVENT_NDATA_VALID mechanism that was introduced
> contextually with KVM_SYSTEM_EVENT_SEV_TERM is not a good match
> for ARM and RISC-V, which want to communicate information even
> for existing KVM_SYSTEM_EVENT_* constants.  Userspace is not ready
> to filter out bit 31 of type, and fails to process the
> KVM_EXIT_SYSTEM_EVENT exit.
> 
> Therefore, tie the availability of ndata to a system capability;
> if the capability is present, ndata is always valid, so patch 1
> makes x86 always initialize it.  Then patches 2 and 3 fix
> ARM and RISC-V compilation and patch 4 enables the capability.
> 
> Only compiled on x86, waiting for acks.
> 
> Paolo
> 
> Paolo Bonzini (4):
>   KVM: x86: always initialize system_event.ndata
>   KVM: ARM: replace system_event.flags with ndata and data[0]
>   KVM: RISC-V: replace system_event.flags with ndata and data[0]
>   KVM: tell userspace that system_event.ndata is valid

Is there any way we could clean this up in 5.18 and leave the whole
ndata/data pattern for 5.19?

IOW, for 5.18 go back and fix the padding:

	struct {
		__u32 type;
		__u32 pad;
		__u64 flags;
	} system_event;

Then for 5.19 circle back on the data business, except use a flag bit
for it:

	struct {
		__u32 type;
		__u32 pad;
	#define KVM_SYSTEM_EVENT_NDATA_VALID	(1u << 63)
		__u64 flags;
		__u64 ndata;
		__u64 data[16];
	} system_event;

Where we apply that bit to system_event::flags this time instead of
::type. Could also go the CAP route.

Wouldn't this be enough to preserve ABI with whatever userspace has been
spun up for system_event::flags already and also keep the SEV stuff
happy in 5.19?

It is a bit weird to churn existing UAPI usage in the very next kernel
cycle, but could be convinced otherwise :)

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess
  2022-04-22  7:58 ` [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess Oliver Upton
@ 2022-04-22  9:41   ` Paolo Bonzini
  2022-04-22 10:01     ` Marc Zyngier
  0 siblings, 1 reply; 11+ messages in thread
From: Paolo Bonzini @ 2022-04-22  9:41 UTC (permalink / raw)
  To: Oliver Upton; +Cc: linux-kernel, kvm, will, maz, apatel, atishp, seanjc, pgonda

On 4/22/22 09:58, Oliver Upton wrote:
> Is there any way we could clean this up in 5.18 and leave the whole
> ndata/data pattern for 5.19?
> 
> IOW, for 5.18 go back and fix the padding:
> 
> 	struct {
> 		__u32 type;
> 		__u32 pad;
> 		__u64 flags;
> 	} system_event;
> 
> Then for 5.19 circle back on the data business, except use a flag bit
> for it:
> 
> 	struct {
> 		__u32 type;
> 		__u32 pad;
> 	#define KVM_SYSTEM_EVENT_NDATA_VALID	(1u << 63)
> 		__u64 flags;
> 		__u64 ndata;
> 		__u64 data[16];
> 	} system_event;
> 
> Where we apply that bit to system_event::flags this time instead of
> ::type. Could also go the CAP route.

These patches are against kvm/next, so that is already what I did. :)

On the other hand right now the ARM and RISC-V flags are unusable with 
32-bit userspace, so we need to fix _something_ in 5.18 as well.  For 
your proposal, all that's missing is a 5.18 patch to add the padding. 
But since the flags UAPI was completely unused before 5.18 and there's 
no reason to inflict the different naming of fields to userspace.  So I 
think we want to apply this UAPI change in 5.18 too.

Paolo


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] KVM: x86: always initialize system_event.ndata
  2022-04-21 18:04 ` [PATCH 1/4] KVM: x86: always initialize system_event.ndata Paolo Bonzini
@ 2022-04-22  9:48   ` Marc Zyngier
  2022-04-22 10:11     ` Paolo Bonzini
  0 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2022-04-22  9:48 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, will, apatel, atishp, seanjc, pgonda

On Thu, 21 Apr 2022 19:04:40 +0100,
Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> The KVM_SYSTEM_EVENT_NDATA_VALID mechanism that was introduced
> contextually with KVM_SYSTEM_EVENT_SEV_TERM is not a good match
> for ARM and RISC-V, which want to communicate information even
> for existing KVM_SYSTEM_EVENT_* constants.  Userspace is not ready
> to filter out bit 31 of type, and fails to process the
> KVM_EXIT_SYSTEM_EVENT exit.
> 
> Therefore, tie the availability of ndata to a system capability
> (which will be added once all architectures are on board).
> Userspace written for released versions of Linux has no reason to
> check flags, since it was never written, so it is okay to replace
> it with ndata and data[0] (on 32-bit kernels) or with data[0]
> (on 64-bit kernels).

How is it going to work for new userspace on old kernels, for which
the ndata field is left uninitialised?

> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/svm/sev.c   | 3 +--
>  arch/x86/kvm/x86.c       | 2 ++
>  include/uapi/linux/kvm.h | 1 -
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index a93f0d01bb90..51b963ec122b 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2739,8 +2739,7 @@ static int sev_handle_vmgexit_msr_protocol(struct vcpu_svm *svm)
>  			reason_set, reason_code);
>  
>  		vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
> -		vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM |
> -					       KVM_SYSTEM_EVENT_NDATA_VALID;
> +		vcpu->run->system_event.type = KVM_SYSTEM_EVENT_SEV_TERM;
>  		vcpu->run->system_event.ndata = 1;
>  		vcpu->run->system_event.data[1] = control->ghcb_gpa;

Isn't this really odd? ndata = 1, and yet you populate data[1]?

>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4e7f3a8da16a..517c0228881c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10056,12 +10056,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  		if (kvm_check_request(KVM_REQ_HV_CRASH, vcpu)) {
>  			vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
>  			vcpu->run->system_event.type = KVM_SYSTEM_EVENT_CRASH;
> +			vcpu->run->system_event.ndata = 0;
>  			r = 0;
>  			goto out;
>  		}
>  		if (kvm_check_request(KVM_REQ_HV_RESET, vcpu)) {
>  			vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
>  			vcpu->run->system_event.type = KVM_SYSTEM_EVENT_RESET;
> +			vcpu->run->system_event.ndata = 0;
>  			r = 0;
>  			goto out;
>  		}
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index dd1d8167e71f..5a57f74b4903 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -445,7 +445,6 @@ struct kvm_run {
>  #define KVM_SYSTEM_EVENT_RESET          2
>  #define KVM_SYSTEM_EVENT_CRASH          3
>  #define KVM_SYSTEM_EVENT_SEV_TERM       4
> -#define KVM_SYSTEM_EVENT_NDATA_VALID    (1u << 31)
>  			__u32 type;
>  			__u32 ndata;
>  			__u64 data[16];

Cat we please get a #define that aliases data[0] to flags? At the next
merge of the KVM headers into their respective trees, all the existing
VMM are going to break if they have a reference to this field (CrosVM
definitely does today -- yes, we're ahead of time).

Also, getting a bisectable series would be good.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess
  2022-04-22  9:41   ` Paolo Bonzini
@ 2022-04-22 10:01     ` Marc Zyngier
  2022-04-22 10:19       ` Paolo Bonzini
  0 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2022-04-22 10:01 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Oliver Upton, linux-kernel, kvm, will, apatel, atishp, seanjc, pgonda

On Fri, 22 Apr 2022 10:41:34 +0100,
Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> On 4/22/22 09:58, Oliver Upton wrote:
> > Is there any way we could clean this up in 5.18 and leave the whole
> > ndata/data pattern for 5.19?
> > 
> > IOW, for 5.18 go back and fix the padding:
> > 
> > 	struct {
> > 		__u32 type;
> > 		__u32 pad;
> > 		__u64 flags;
> > 	} system_event;
> > 
> > Then for 5.19 circle back on the data business, except use a flag bit
> > for it:
> > 
> > 	struct {
> > 		__u32 type;
> > 		__u32 pad;
> > 	#define KVM_SYSTEM_EVENT_NDATA_VALID	(1u << 63)
> > 		__u64 flags;
> > 		__u64 ndata;
> > 		__u64 data[16];
> > 	} system_event;
> > 
> > Where we apply that bit to system_event::flags this time instead of
> > ::type. Could also go the CAP route.
> 
> These patches are against kvm/next, so that is already what I did. :)

Can you please post a complete series? It is becoming really hard to
track what you are doing.

> On the other hand right now the ARM and RISC-V flags are unusable with
> 32-bit userspace, so we need to fix _something_ in 5.18 as well.

What 32bit userspace? arm64 doesn't have any that can interact with KVM,
so I don't see anything to fix on that front.

> For
> your proposal, all that's missing is a 5.18 patch to add the
> padding. But since the flags UAPI was completely unused before 5.18
> and there's no reason to inflict the different naming of fields to
> userspace.  So I think we want to apply this UAPI change in 5.18 too.

As it was pointed out already, CrosVM has already started looking at
the flags. The fact that it was always 0 until now doesn't make it
less of a UAPI.

I'd like to see a full series that implements the transition before we
make a decision on this.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] KVM: x86: always initialize system_event.ndata
  2022-04-22  9:48   ` Marc Zyngier
@ 2022-04-22 10:11     ` Paolo Bonzini
  0 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2022-04-22 10:11 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: linux-kernel, kvm, will, apatel, atishp, seanjc, pgonda

On 4/22/22 11:48, Marc Zyngier wrote:
> On Thu, 21 Apr 2022 19:04:40 +0100,
> Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> The KVM_SYSTEM_EVENT_NDATA_VALID mechanism that was introduced
>> contextually with KVM_SYSTEM_EVENT_SEV_TERM is not a good match
>> for ARM and RISC-V, which want to communicate information even
>> for existing KVM_SYSTEM_EVENT_* constants.  Userspace is not ready
>> to filter out bit 31 of type, and fails to process the
>> KVM_EXIT_SYSTEM_EVENT exit.
>>
>> Therefore, tie the availability of ndata to a system capability
>> (which will be added once all architectures are on board).
>> Userspace written for released versions of Linux has no reason to
>> check flags, since it was never written, so it is okay to replace
>> it with ndata and data[0] (on 32-bit kernels) or with data[0]
>> (on 64-bit kernels).
> 
> How is it going to work for new userspace on old kernels, for which
> the ndata field is left uninitialised?

New userspace needs to check the capability before accessing ndata.

If it is desirable for Android, crosvm can "quirk" that ARM always has 
valid data[0] even in the absence of the capability.

> Cat we please get a #define that aliases data[0] to flags? At the next
> merge of the KVM headers into their respective trees, all the existing
> VMM are going to break if they have a reference to this field (CrosVM
> definitely does today -- yes, we're ahead of time).

That would be a union rather than a define but yes, I can do it.

Paolo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess
  2022-04-22 10:01     ` Marc Zyngier
@ 2022-04-22 10:19       ` Paolo Bonzini
  0 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2022-04-22 10:19 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Oliver Upton, linux-kernel, kvm, will, apatel, atishp, seanjc, pgonda

On 4/22/22 12:01, Marc Zyngier wrote:
>> These patches are against kvm/next, so that is already what I did. :)
> 
> Can you please post a complete series? It is becoming really hard to
> track what you are doing.

Yes, I'll post a complete series for kvm/master in a second.

>> On the other hand right now the ARM and RISC-V flags are unusable with
>> 32-bit userspace, so we need to fix _something_ in 5.18 as well.
> 
> What 32bit userspace? arm64 doesn't have any that can interact with KVM,
> so I don't see anything to fix on that front.

You're right, ARM64 does not define KVM_COMPAT.  But RISC-V does.

>> For your proposal, all that's missing is a 5.18 patch to add the
>> padding. But since the flags UAPI was completely unused before 5.18
>> and there's no reason to inflict the different naming of fields to
>> userspace.  So I think we want to apply this UAPI change in 5.18 too.
> 
> As it was pointed out already, CrosVM has already started looking at
> the flags. The fact that it was always 0 until now doesn't make it
> less of a UAPI.

I heard that and that's exactly why I dropped the idea of using 
NDATA_VALID in bit 31 of flags, and switched to a capability instead. 
If it is desirable for Android, crosvm can "quirk" that ARM always has 
valid data[0].

Paolo

> I'd like to see a full series that implements the transition before we
> make a decision on this.


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-04-22 10:20 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-21 18:04 [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess Paolo Bonzini
2022-04-21 18:04 ` [PATCH 1/4] KVM: x86: always initialize system_event.ndata Paolo Bonzini
2022-04-22  9:48   ` Marc Zyngier
2022-04-22 10:11     ` Paolo Bonzini
2022-04-21 18:04 ` [PATCH 2/4] KVM: ARM: replace system_event.flags with ndata and data[0] Paolo Bonzini
2022-04-21 18:04 ` [PATCH 3/4] KVM: RISC-V: " Paolo Bonzini
2022-04-21 18:04 ` [PATCH 4/4] KVM: tell userspace that system_event.ndata is valid Paolo Bonzini
2022-04-22  7:58 ` [PATCH 0/4] KVM: fix KVM_EXIT_SYSTEM_EVENT mess Oliver Upton
2022-04-22  9:41   ` Paolo Bonzini
2022-04-22 10:01     ` Marc Zyngier
2022-04-22 10:19       ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).