From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932137AbcGAOis (ORCPT ); Fri, 1 Jul 2016 10:38:48 -0400
Received: from mx1.redhat.com ([209.132.183.28]:54546 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752509AbcGAOiq (ORCPT ); Fri, 1 Jul 2016 10:38:46 -0400
Date: Fri, 1 Jul 2016 16:38:28 +0200
From: Radim =?utf-8?B?S3LEjW3DocWZ?=
To: Paolo Bonzini
Cc: Andrew Honig , linux-kernel@vger.kernel.org, kvm , "Lan, Tianyu" , Igor Mammedov , Jan Kiszka , Peter Xu
Subject: Re: [PATCH v1 03/11] KVM: x86: dynamic kvm_apic_map
Message-ID: <20160701143827.GE27840@potion>
References: <20160630205429.16480-1-rkrcmar@redhat.com> <20160630205429.16480-4-rkrcmar@redhat.com> <20160701124421.GA2301@potion>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Fri, 01 Jul 2016 14:38:32 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

2016-07-01 16:03+0200, Paolo Bonzini:
> On 01/07/2016 14:44, Radim Krčmář wrote:
>> 2016-07-01 10:42+0200, Paolo Bonzini:
>>> On 01/07/2016 00:15, Andrew Honig wrote:
>>>>>> +	/* kvm_apic_map_get_logical_dest() expects multiples of 16 */
>>>>>> +	size = round_up(max_id + 1, 16);
>>>> Now that you're using the full range of apic_id values, could this
>>>> calculation overflow? Perhaps max_id could be u64?
>>>
>>> Good point, but I wonder if it's a good idea to let userspace allocate
>>> 32 GB of memory. :)
>>
>> Yes, both could happen. I'll change it to u64 to make it future proof.
>
> It's not necessary to change it to u64 if you put a limit, but you can
> add a WARN_ON(size == 0).

Hm, to save 4 bytes and avoid a WARN_ON, I'll change it to u32
max_apic_id instead of u32 size.

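For reference, the overflow discussed above can be reproduced in userspace.
This is only a sketch: round_up_pow2_u32(), map_size_u32() and
map_size_u64() are made-up names, the first being a stand-in for the
kernel's round_up() with a power-of-two alignment, and none of it is the
code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for round_up(x, align); align must be a power of two. */
static uint32_t round_up_pow2_u32(uint32_t x, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Size computed in 32 bits, as in the quoted hunk: wraps for large max_id. */
static uint32_t map_size_u32(uint32_t max_id)
{
	return round_up_pow2_u32(max_id + 1, 16);
}

/* Same computation widened to 64 bits, so the sum cannot wrap. */
static uint64_t map_size_u64(uint32_t max_id)
{
	uint64_t n = (uint64_t)max_id + 1;

	return (n + 15) & ~(uint64_t)15;
}
```

With max_id = 766 both variants give 768. With max_id = UINT32_MAX the
32-bit variant yields 0, which is the case a WARN_ON(size == 0) would
catch, while the 64-bit variant yields 4294967296.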
> Also if kvm_apic_map_get_logical_dest() expects multiples of 16, it
> should warn whenever the invariant is not respected.

It was to optimize the fast path ... kvm_apic_map_get_logical_dest() can
handle arbitrary values, so I'll do that instead of checking or assuming
an alignment.

>>> Let's put a limit on the maximum supported APIC ID, and report it
>>> through KVM_CHECK_EXTENSION on the new KVM_CAP_X2APIC_API capability.
>>> If 767 is enough for Knights Landing, the allocation below fits in two
>>> pages. If you need to make it higher, please change the allocation to
>>> use kvm_kvzalloc and kvfree.
>>
>> We sort of have a capability for maximum APIC ID, KVM_MAX_VCPU_ID,
>> because VCPU ID is initial APIC ID and x2APIC ID should always be the
>> initial APIC ID.
>
> Should it?

Yes, x2APIC ID cannot be changed in hardware and is initialized to the
initial APIC ID.
Letting LAPIC_SET change x2APIC ID would allow scenarios where userspace
reuses old VMs instead of building new ones after reconfiguration.
I don't think it's a sensible use case and it is currently broken,
because we don't exit to userspace when changing APIC mode, so KVM would
just set APIC ID to VCPU ID on any transition and userspace couldn't
amend it.

> According to QEMU if you have e.g. 3 cores per socket one
> socket take 4 APIC IDs. For Knights Landing the "worst" prime factor in
> 288 is 3^2 so you need APIC IDs up to 288 * (4/3)^2 = 512.

The topology can result in sparse APIC IDs, and APIC ID is initialized
from VCPU ID, so userspace has to pick VCPU IDs accordingly.
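
To illustrate why the IDs end up sparse, here is a rough sketch of the
packing described in the quote, where each topology level gets a
power-of-two sized bit field. The helper names field_width() and
apic_id_for_cpu() are hypothetical, and the socket/core/thread layout is
an assumption for illustration, not QEMU's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Bits needed to encode `count` distinct values, i.e. ceil(log2(count)). */
static unsigned field_width(unsigned count)
{
	unsigned width = 0;

	assert(count > 0);
	while (width < 31 && (1u << width) < count)
		width++;
	return width;
}

/*
 * Assumed layout: [socket | core | thread], each field padded to a
 * power of two, so a non-power-of-two count leaves holes in the ID space.
 */
static uint32_t apic_id_for_cpu(unsigned cpu_index, unsigned cores,
				unsigned threads)
{
	unsigned thread_w = field_width(threads);
	unsigned core_w = field_width(cores);
	unsigned thread = cpu_index % threads;
	unsigned core = (cpu_index / threads) % cores;
	unsigned socket = cpu_index / (threads * cores);

	return (socket << (core_w + thread_w)) | (core << thread_w) | thread;
}
```

With 3 cores per socket and 1 thread per core, CPUs 0..5 get IDs
0, 1, 2, 4, 5, 6: ID 3 is never used, so each socket takes 4 IDs and the
chosen VCPU IDs have to follow the same pattern.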