From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.3 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_PASS,USER_AGENT_MUTT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 85138C5CFEB for ; Wed, 11 Jul 2018 10:38:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3C99B20BEC for ; Wed, 11 Jul 2018 10:38:29 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3C99B20BEC Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732320AbeGKKmH (ORCPT ); Wed, 11 Jul 2018 06:42:07 -0400 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:32970 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726644AbeGKKmH (ORCPT ); Wed, 11 Jul 2018 06:42:07 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5D23480D; Wed, 11 Jul 2018 03:38:26 -0700 (PDT) Received: from e103592.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.72.51.249]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 4B64B3F589; Wed, 11 Jul 2018 03:38:24 -0700 (PDT) Date: Wed, 11 Jul 2018 11:38:21 +0100 From: Dave Martin To: Suzuki K Poulose Cc: cdall@kernel.org, kvm@vger.kernel.org, Marc Zyngier , catalin.marinas@arm.com, punit.agrawal@arm.com, Will Deacon , linux-kernel@vger.kernel.org, qemu-devel@nongnu.org, Paolo Bonzini , kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org Subject: Re: [PATCH v3 15/20] kvm: arm/arm64: Allow tuning the physical address size for VM Message-ID: <20180711103819.GK9486@e103592.cambridge.arm.com> References: <12d1832a-1a13-7dd4-662b-addf58400789@arm.com> <9f1af26e-2913-2b0b-1352-63160096f78f@arm.com> <20180709112326.GD9486@e103592.cambridge.arm.com> <17f8d585-3489-ab6f-6ee1-4d8d337dcf9c@arm.com> <20180709133750.GE9486@e103592.cambridge.arm.com> <377420ce-97a8-4359-6224-273d91f37247@arm.com> <20180710170330.GJ9486@e103592.cambridge.arm.com> <0d9d48fc-f913-9f05-afbe-83299c470611@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <0d9d48fc-f913-9f05-afbe-83299c470611@arm.com> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jul 11, 2018 at 10:05:50AM +0100, Suzuki K Poulose wrote: > On 10/07/18 18:03, Dave Martin wrote: > >On Tue, Jul 10, 2018 at 05:38:39PM +0100, Suzuki K Poulose wrote: > >>On 09/07/18 14:37, Dave Martin wrote: > >>>On Mon, Jul 09, 2018 at 01:29:42PM +0100, Marc Zyngier wrote: > >>>>On 09/07/18 12:23, Dave Martin wrote: > > > >[...] > > > >>>>>Wedging arguments into a few bits in the type argument feels awkward, > >>>>>and may be regretted later if we run out of bits, or something can't be > >>>>>represented in the chosen encoding. > >>>> > >>>>I think that's a pretty convincing argument for a "better" CREATE_VM, > >>>>one that would have a clearly defined, structured (and potentially > >>>>extensible) argument. > >>>> > >>>>I've quickly hacked the following: > >>>> > >>>>diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > >>>>index b6270a3b38e9..3e76214034c2 100644 > >>>>--- a/include/uapi/linux/kvm.h > >>>>+++ b/include/uapi/linux/kvm.h > >>>>@@ -735,6 +735,20 @@ struct kvm_ppc_resize_hpt { > >>>> __u32 pad; > >>>> }; > >>>> > >>>>+struct kvm_create_vm2 { > >>>>+ __u64 version; /* Or maybe not */ > >>>>+ union { > >>>>+ struct { > >>>>+#define KVM_ARM_SVE_CAPABLE (1 << 0) > >>>>+#define KVM_ARM_SELECT_IPA {1 << 1) > >>>>+ __u64 capabilities; > >>>>+ __u16 sve_vlen; > >>>>+ __u8 ipa_size; > >>>>+ } arm64; > >>>>+ __u64 dummy[15]; > >>>>+ }; > >>>>+}; > >>>>+ > >>>> #define KVMIO 0xAE > >>>> > >>>> /* machine type bits, to be used as argument to KVM_CREATE_VM */ > >>>> > >>>>Other architectures could fill in their own bits if they need to. > >>>> > >>>>Thoughts? > >>> > >>>This kind of thing should work, but it may still get messy when we > >>>add additional fields. > >> > >> > >>Marc, Dave, > >> > >>I like Dave's approach. Some comments below. > >> > >>> > >>>It we want this to work cross-arch, would it make sense to go > >>>for a more generic approach, say > >>> > >>>struct kvm_create_vm_attr_any { > >>> __u32 type; > >>>}; > >>> > >>>#define KVM_CREATE_VM_ATTR_ARCH_CAPABILITIES 1 > >>>struct kvm_create_vm_attr_arch_capabilities { > >>> __u32 type; > >>> __u16 size; /* support future expansion of capabilities[] */ > >>> __u16 reserved; > >>> __u64 capabilities[1]; > >>>}; > >> > >>We also need to advertise which attributes are supported by the host, > >>so that the user can tune the available ones. That would make a bit mask > >>like the above trickier, unless we return the supported values back > >>in the argument ptr for the "probe" call. And this scheme in general > >>can be useful for passing back a non-boolean result specific to the > >>attribute, without having a per-attribute ioctl. (e.g, maximum limit > >>for IPA). > > > >Maybe, but this could quickly become bloated. (My approach already > >feels a bit bloated...) > > > >I'm not sure that arbitrarily complex negotiation will really be > >needed, but userspace might want to change its mind if setting a > >particular propertiy fails. > > > >An alternative might be to have a bunch of per-VM ioctls to configure > >different things, like x86 has. There's at least precedent for that. > >For arm, we currently only have a few. That allows for easy extension, > >at the cost of adding ioctls. > > As you know, one of the major problems with the per-VM ioctls is > the ordering of different operations and tracking to make sure that > the userspace follows the expected order. e.g, the first approach for > IPA series was based on this and it made things complex enough to drop > it. I'm aware of that, but if we are adding a new KVM_CREATE_VM, we could perhaps give it different semantics: i.e., we create a half-created VM that only accepts configuration ioctls and a "finish creation" ioctl that finalises everything before you're allowed to create devices, vcpus etc. This is the sort of thing I was moving torwards for SVE (but for vcpus there). I'm not saying we should drop the existing KVM_CREATE_VM2 ideas, but that we should take a step back if it starts to accrue complexity. > > > >There may be some ioctls we can reuse, like KVM_ENABLE_CAP for per- > >vm capability flags. > > May be we could switch to KVM_VM_CAPS and pass a list of capabilities > to be enabled at creation time ? The kvm_enable_cap can pass in additional > arguments for each cap. That way we don't have to rely on a new set of > attributes and probing becomes straight forward. That's a possibility. I guess we'd need to understand how exactly x86 uses this. Cheers ---Dave From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:48730) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fdCVz-000637-Nd for qemu-devel@nongnu.org; Wed, 11 Jul 2018 06:38:32 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fdCVw-0006ou-Ja for qemu-devel@nongnu.org; Wed, 11 Jul 2018 06:38:31 -0400 Received: from foss.arm.com ([217.140.101.70]:40422) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fdCVw-0006oj-9k for qemu-devel@nongnu.org; Wed, 11 Jul 2018 06:38:28 -0400 Date: Wed, 11 Jul 2018 11:38:21 +0100 From: Dave Martin Message-ID: <20180711103819.GK9486@e103592.cambridge.arm.com> References: <12d1832a-1a13-7dd4-662b-addf58400789@arm.com> <9f1af26e-2913-2b0b-1352-63160096f78f@arm.com> <20180709112326.GD9486@e103592.cambridge.arm.com> <17f8d585-3489-ab6f-6ee1-4d8d337dcf9c@arm.com> <20180709133750.GE9486@e103592.cambridge.arm.com> <377420ce-97a8-4359-6224-273d91f37247@arm.com> <20180710170330.GJ9486@e103592.cambridge.arm.com> <0d9d48fc-f913-9f05-afbe-83299c470611@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <0d9d48fc-f913-9f05-afbe-83299c470611@arm.com> Subject: Re: [Qemu-devel] [PATCH v3 15/20] kvm: arm/arm64: Allow tuning the physical address size for VM List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Suzuki K Poulose Cc: cdall@kernel.org, kvm@vger.kernel.org, Marc Zyngier , catalin.marinas@arm.com, punit.agrawal@arm.com, Will Deacon , linux-kernel@vger.kernel.org, qemu-devel@nongnu.org, Paolo Bonzini , kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org On Wed, Jul 11, 2018 at 10:05:50AM +0100, Suzuki K Poulose wrote: > On 10/07/18 18:03, Dave Martin wrote: > >On Tue, Jul 10, 2018 at 05:38:39PM +0100, Suzuki K Poulose wrote: > >>On 09/07/18 14:37, Dave Martin wrote: > >>>On Mon, Jul 09, 2018 at 01:29:42PM +0100, Marc Zyngier wrote: > >>>>On 09/07/18 12:23, Dave Martin wrote: > > > >[...] > > > >>>>>Wedging arguments into a few bits in the type argument feels awkward, > >>>>>and may be regretted later if we run out of bits, or something can't be > >>>>>represented in the chosen encoding. > >>>> > >>>>I think that's a pretty convincing argument for a "better" CREATE_VM, > >>>>one that would have a clearly defined, structured (and potentially > >>>>extensible) argument. > >>>> > >>>>I've quickly hacked the following: > >>>> > >>>>diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > >>>>index b6270a3b38e9..3e76214034c2 100644 > >>>>--- a/include/uapi/linux/kvm.h > >>>>+++ b/include/uapi/linux/kvm.h > >>>>@@ -735,6 +735,20 @@ struct kvm_ppc_resize_hpt { > >>>> __u32 pad; > >>>> }; > >>>> > >>>>+struct kvm_create_vm2 { > >>>>+ __u64 version; /* Or maybe not */ > >>>>+ union { > >>>>+ struct { > >>>>+#define KVM_ARM_SVE_CAPABLE (1 << 0) > >>>>+#define KVM_ARM_SELECT_IPA {1 << 1) > >>>>+ __u64 capabilities; > >>>>+ __u16 sve_vlen; > >>>>+ __u8 ipa_size; > >>>>+ } arm64; > >>>>+ __u64 dummy[15]; > >>>>+ }; > >>>>+}; > >>>>+ > >>>> #define KVMIO 0xAE > >>>> > >>>> /* machine type bits, to be used as argument to KVM_CREATE_VM */ > >>>> > >>>>Other architectures could fill in their own bits if they need to. > >>>> > >>>>Thoughts? > >>> > >>>This kind of thing should work, but it may still get messy when we > >>>add additional fields. > >> > >> > >>Marc, Dave, > >> > >>I like Dave's approach. Some comments below. > >> > >>> > >>>It we want this to work cross-arch, would it make sense to go > >>>for a more generic approach, say > >>> > >>>struct kvm_create_vm_attr_any { > >>> __u32 type; > >>>}; > >>> > >>>#define KVM_CREATE_VM_ATTR_ARCH_CAPABILITIES 1 > >>>struct kvm_create_vm_attr_arch_capabilities { > >>> __u32 type; > >>> __u16 size; /* support future expansion of capabilities[] */ > >>> __u16 reserved; > >>> __u64 capabilities[1]; > >>>}; > >> > >>We also need to advertise which attributes are supported by the host, > >>so that the user can tune the available ones. That would make a bit mask > >>like the above trickier, unless we return the supported values back > >>in the argument ptr for the "probe" call. And this scheme in general > >>can be useful for passing back a non-boolean result specific to the > >>attribute, without having a per-attribute ioctl. (e.g, maximum limit > >>for IPA). > > > >Maybe, but this could quickly become bloated. (My approach already > >feels a bit bloated...) > > > >I'm not sure that arbitrarily complex negotiation will really be > >needed, but userspace might want to change its mind if setting a > >particular propertiy fails. > > > >An alternative might be to have a bunch of per-VM ioctls to configure > >different things, like x86 has. There's at least precedent for that. > >For arm, we currently only have a few. That allows for easy extension, > >at the cost of adding ioctls. > > As you know, one of the major problems with the per-VM ioctls is > the ordering of different operations and tracking to make sure that > the userspace follows the expected order. e.g, the first approach for > IPA series was based on this and it made things complex enough to drop > it. I'm aware of that, but if we are adding a new KVM_CREATE_VM, we could perhaps give it different semantics: i.e., we create a half-created VM that only accepts configuration ioctls and a "finish creation" ioctl that finalises everything before you're allowed to create devices, vcpus etc. This is the sort of thing I was moving torwards for SVE (but for vcpus there). I'm not saying we should drop the existing KVM_CREATE_VM2 ideas, but that we should take a step back if it starts to accrue complexity. > > > >There may be some ioctls we can reuse, like KVM_ENABLE_CAP for per- > >vm capability flags. > > May be we could switch to KVM_VM_CAPS and pass a list of capabilities > to be enabled at creation time ? The kvm_enable_cap can pass in additional > arguments for each cap. That way we don't have to rely on a new set of > attributes and probing becomes straight forward. That's a possibility. I guess we'd need to understand how exactly x86 uses this. Cheers ---Dave From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dave.Martin@arm.com (Dave Martin) Date: Wed, 11 Jul 2018 11:38:21 +0100 Subject: [PATCH v3 15/20] kvm: arm/arm64: Allow tuning the physical address size for VM In-Reply-To: <0d9d48fc-f913-9f05-afbe-83299c470611@arm.com> References: <12d1832a-1a13-7dd4-662b-addf58400789@arm.com> <9f1af26e-2913-2b0b-1352-63160096f78f@arm.com> <20180709112326.GD9486@e103592.cambridge.arm.com> <17f8d585-3489-ab6f-6ee1-4d8d337dcf9c@arm.com> <20180709133750.GE9486@e103592.cambridge.arm.com> <377420ce-97a8-4359-6224-273d91f37247@arm.com> <20180710170330.GJ9486@e103592.cambridge.arm.com> <0d9d48fc-f913-9f05-afbe-83299c470611@arm.com> Message-ID: <20180711103819.GK9486@e103592.cambridge.arm.com> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org On Wed, Jul 11, 2018 at 10:05:50AM +0100, Suzuki K Poulose wrote: > On 10/07/18 18:03, Dave Martin wrote: > >On Tue, Jul 10, 2018 at 05:38:39PM +0100, Suzuki K Poulose wrote: > >>On 09/07/18 14:37, Dave Martin wrote: > >>>On Mon, Jul 09, 2018 at 01:29:42PM +0100, Marc Zyngier wrote: > >>>>On 09/07/18 12:23, Dave Martin wrote: > > > >[...] > > > >>>>>Wedging arguments into a few bits in the type argument feels awkward, > >>>>>and may be regretted later if we run out of bits, or something can't be > >>>>>represented in the chosen encoding. > >>>> > >>>>I think that's a pretty convincing argument for a "better" CREATE_VM, > >>>>one that would have a clearly defined, structured (and potentially > >>>>extensible) argument. > >>>> > >>>>I've quickly hacked the following: > >>>> > >>>>diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > >>>>index b6270a3b38e9..3e76214034c2 100644 > >>>>--- a/include/uapi/linux/kvm.h > >>>>+++ b/include/uapi/linux/kvm.h > >>>>@@ -735,6 +735,20 @@ struct kvm_ppc_resize_hpt { > >>>> __u32 pad; > >>>> }; > >>>> > >>>>+struct kvm_create_vm2 { > >>>>+ __u64 version; /* Or maybe not */ > >>>>+ union { > >>>>+ struct { > >>>>+#define KVM_ARM_SVE_CAPABLE (1 << 0) > >>>>+#define KVM_ARM_SELECT_IPA {1 << 1) > >>>>+ __u64 capabilities; > >>>>+ __u16 sve_vlen; > >>>>+ __u8 ipa_size; > >>>>+ } arm64; > >>>>+ __u64 dummy[15]; > >>>>+ }; > >>>>+}; > >>>>+ > >>>> #define KVMIO 0xAE > >>>> > >>>> /* machine type bits, to be used as argument to KVM_CREATE_VM */ > >>>> > >>>>Other architectures could fill in their own bits if they need to. > >>>> > >>>>Thoughts? > >>> > >>>This kind of thing should work, but it may still get messy when we > >>>add additional fields. > >> > >> > >>Marc, Dave, > >> > >>I like Dave's approach. Some comments below. > >> > >>> > >>>It we want this to work cross-arch, would it make sense to go > >>>for a more generic approach, say > >>> > >>>struct kvm_create_vm_attr_any { > >>> __u32 type; > >>>}; > >>> > >>>#define KVM_CREATE_VM_ATTR_ARCH_CAPABILITIES 1 > >>>struct kvm_create_vm_attr_arch_capabilities { > >>> __u32 type; > >>> __u16 size; /* support future expansion of capabilities[] */ > >>> __u16 reserved; > >>> __u64 capabilities[1]; > >>>}; > >> > >>We also need to advertise which attributes are supported by the host, > >>so that the user can tune the available ones. That would make a bit mask > >>like the above trickier, unless we return the supported values back > >>in the argument ptr for the "probe" call. And this scheme in general > >>can be useful for passing back a non-boolean result specific to the > >>attribute, without having a per-attribute ioctl. (e.g, maximum limit > >>for IPA). > > > >Maybe, but this could quickly become bloated. (My approach already > >feels a bit bloated...) > > > >I'm not sure that arbitrarily complex negotiation will really be > >needed, but userspace might want to change its mind if setting a > >particular propertiy fails. > > > >An alternative might be to have a bunch of per-VM ioctls to configure > >different things, like x86 has. There's at least precedent for that. > >For arm, we currently only have a few. That allows for easy extension, > >at the cost of adding ioctls. > > As you know, one of the major problems with the per-VM ioctls is > the ordering of different operations and tracking to make sure that > the userspace follows the expected order. e.g, the first approach for > IPA series was based on this and it made things complex enough to drop > it. I'm aware of that, but if we are adding a new KVM_CREATE_VM, we could perhaps give it different semantics: i.e., we create a half-created VM that only accepts configuration ioctls and a "finish creation" ioctl that finalises everything before you're allowed to create devices, vcpus etc. This is the sort of thing I was moving torwards for SVE (but for vcpus there). I'm not saying we should drop the existing KVM_CREATE_VM2 ideas, but that we should take a step back if it starts to accrue complexity. > > > >There may be some ioctls we can reuse, like KVM_ENABLE_CAP for per- > >vm capability flags. > > May be we could switch to KVM_VM_CAPS and pass a list of capabilities > to be enabled at creation time ? The kvm_enable_cap can pass in additional > arguments for each cap. That way we don't have to rely on a new set of > attributes and probing becomes straight forward. That's a possibility. I guess we'd need to understand how exactly x86 uses this. Cheers ---Dave