On 18.11.21 16:09, Sean Christopherson wrote:
> On Thu, Nov 18, 2021, Juergen Gross wrote:
>> On 18.11.21 00:46, Sean Christopherson wrote:
>>> On Wed, Nov 17, 2021, Juergen Gross wrote:
>>>> On 16.11.21 15:10, Juergen Gross wrote:
>>>>> Today the maximum vcpu-id of a kvm guest's vcpu on x86 systems is set
>>>>> via a #define in a header file.
>>>>>
>>>>> In order to support higher vcpu-ids without generally increasing the
>>>>> memory consumption of guests on the host (some guest structures contain
>>>>> arrays sized by KVM_MAX_VCPU_IDS) add a boot parameter for adding some
>>>>> bits to the vcpu-id. Additional bits are needed as the vcpu-id is
>>>>> constructed via bit-wise concatenation of socket-id, core-id, etc.
>>>>> As those ids maximum values are not always a power of 2, the vcpu-ids
>>>>> are sparse.
>>>>>
>>>>> The additional number of bits needed is basically the number of
>>>>> topology levels with a non-power-of-2 maximum value, excluding the top
>>>>> most level.
>>>>>
>>>>> The default value of the new parameter will be 2 in order to support
>>>>> today's possible topologies. The special value of -1 will use the
>>>>> number of bits needed for a guest with the current host's topology.
>>>>>
>>>>> Calculating the maximum vcpu-id dynamically requires to allocate the
>>>>> arrays using KVM_MAX_VCPU_IDS as the size dynamically.
>>>>>
>>>>> Signed-off-by: Juergen Gross
>>>>
>>>> Just thought about vcpu-ids a little bit more.
>>>>
>>>> It would be possible to replace the topology games completely by an
>>>> arbitrary rather high vcpu-id limit (65536?) and to allocate the memory
>>>> depending on the max vcpu-id just as needed.
>>>>
>>>> Right now the only vcpu-id dependent memory is for the ioapic consisting
>>>> of a vcpu-id indexed bitmap and a vcpu-id indexed byte array (vectors).
>>>>
>>>> We could start with a minimal size when setting up an ioapic and extend
>>>> the areas in case a new vcpu created would introduce a vcpu-id outside
>>>> the currently allocated memory. Both arrays are protected by the ioapic
>>>> specific lock (at least I couldn't spot any unprotected usage when
>>>> looking briefly into the code), so reallocating those arrays shouldn't
>>>> be hard. In case of ENOMEM the related vcpu creation would just fail.
>>>>
>>>> Thoughts?
>>>
>>> Why not have userspace state the max vcpu_id it intends to create on a per-VM
>>> basis? Same end result, but doesn't require the complexity of reallocating the
>>> I/O APIC stuff.
>>>
>>
>> And if the userspace doesn't do it (like today)?
>
> Similar to my comments in patch 4, KVM's current limits could be used as the
> defaults, and any use case wanting to go beyond that would need an updated
> userspace. Exceeding those limits today doesn't work, so there's no ABI breakage
> by requiring a userspace change.

Hmm, nice idea. Will look into it.

> Or again, this could be a Kconfig knob, though that feels a bit weird in this case.
> But it might make sense if it can be tied to something in the kernel's config?

Having a Kconfig knob for an absolute upper bound of vcpus should be
fine. If someone doesn't like the capability to explicitly let qemu
create very large VMs, he/she can still set that upper bound to the
normal KVM_MAX_VCPUS value.


Juergen
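
For illustration only, here is a rough userspace sketch of the scheme
discussed above: each VM starts with a default vcpu-id limit, userspace
may raise it up to a build-time upper bound before the first vcpu is
created, and the two vcpu-id indexed arrays are sized from that per-VM
limit. This is not KVM code. All names (struct vm, vm_set_max_vcpu_ids(),
and so on) are hypothetical, and both constants are placeholder values,
not the real KVM limits.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Placeholder values (assumptions), not the real KVM limits. */
#define DEFAULT_MAX_VCPU_IDS 4096u   /* stand-in for today's fixed limit */
#define UPPER_BOUND_VCPU_IDS 65536u  /* stand-in for a Kconfig upper bound */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical per-VM state: the limit plus the two id-indexed arrays. */
struct vm {
	unsigned int max_vcpu_ids;   /* valid vcpu-ids are 0..max_vcpu_ids-1 */
	unsigned long *id_bitmap;    /* one bit per vcpu-id */
	uint8_t *vectors;            /* one byte per vcpu-id */
};

static void vm_init(struct vm *vm)
{
	/* Userspace that never sets a limit gets the default. */
	vm->max_vcpu_ids = DEFAULT_MAX_VCPU_IDS;
	vm->id_bitmap = NULL;
	vm->vectors = NULL;
}

/* Only allowed before the arrays exist, i.e. before the first vcpu. */
static int vm_set_max_vcpu_ids(struct vm *vm, unsigned int ids)
{
	if (vm->id_bitmap || ids == 0 || ids > UPPER_BOUND_VCPU_IDS)
		return -EINVAL;
	vm->max_vcpu_ids = ids;
	return 0;
}

/* Allocate both arrays once, sized by the per-VM limit. */
static int vm_alloc_arrays(struct vm *vm)
{
	size_t longs;

	assert(vm->max_vcpu_ids > 0);
	longs = (vm->max_vcpu_ids + BITS_PER_LONG - 1) / BITS_PER_LONG;
	vm->id_bitmap = calloc(longs, sizeof(unsigned long));
	vm->vectors = calloc(vm->max_vcpu_ids, sizeof(uint8_t));
	if (!vm->id_bitmap || !vm->vectors) {
		free(vm->id_bitmap);
		free(vm->vectors);
		vm->id_bitmap = NULL;
		vm->vectors = NULL;
		return -ENOMEM;
	}
	return 0;
}

static int vm_create_vcpu(struct vm *vm, unsigned int id)
{
	unsigned long mask, *word;
	int ret;

	if (id >= vm->max_vcpu_ids)
		return -EINVAL;
	if (!vm->id_bitmap) {
		ret = vm_alloc_arrays(vm);
		if (ret)
			return ret;
	}
	word = &vm->id_bitmap[id / BITS_PER_LONG];
	mask = 1UL << (id % BITS_PER_LONG);
	if (*word & mask)
		return -EEXIST;      /* vcpu-id already in use */
	*word |= mask;
	return 0;
}

static void vm_destroy(struct vm *vm)
{
	free(vm->id_bitmap);
	free(vm->vectors);
	vm->id_bitmap = NULL;
	vm->vectors = NULL;
}
```

With this shape no reallocation is needed: the arrays are allocated
exactly once, and a VM that never raises the limit behaves as it does
today. Setting the upper bound equal to the default would disable larger
VMs entirely.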