Date: Mon, 24 Apr 2017 22:03:46 +0200
From: Radim Krčmář <rkrcmar@redhat.com>
To: Cornelia Huck
Cc: David Hildenbrand, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    Christoffer Dall, Marc Zyngier, Paolo Bonzini, Christian Borntraeger,
    James Hogan, Paul Mackerras, Alexander Graf
Subject: Re: [PATCH 0/4] KVM: add KVM_CREATE_VM2 to allow dynamic kvm->vcpus array
Message-ID: <20170424200346.GD5713@potion>
References: <20170413201951.11939-1-rkrcmar@redhat.com> <20170418142903.44973c86.cornelia.huck@de.ibm.com>
In-Reply-To: <20170418142903.44973c86.cornelia.huck@de.ibm.com>

2017-04-18 14:29+0200, Cornelia Huck:
> On Tue, 18 Apr 2017 13:11:55 +0200
> David Hildenbrand wrote:
>> On 13.04.2017 22:19, Radim Krčmář wrote:
>> > new KVM_MAX_CONFIGURABLE_VCPUS, probably directly to INT_MAX/KVM_VCPU_ID,
>> > so we don't have to worry about it for a while.
>> >
>> > PPC should be interested in this as they set KVM_MAX_VCPUS to NR_CPUS
>> > and probably waste a few pages for every guest this way.
>>
>> As we just store pointers, this should be a maximum of 4 pages for ppc
>> (4k pages). Is this really worth yet another VM creation ioctl? Is there
>> not a nicer way to handle this internally?
>>
>> An alternative might be to simply realloc the array when it reaches a
>> certain size (on VCPU creation, maybe protecting the pointer via RCU).
>> But I'm not sure whether something like that could work.
>
> I like that idea better, if it does work (I think it should be doable).
> If we just double the array size every time we run out of space, we
> should be able to do this with few reallocations. That also has the
> advantage of being transparent to user space (other than the increased
> number of vcpus).

Yes, reallocating would work with protection against use-after-free, and
RCU fits well.  (Readers don't hold any lock we could piggyback on.)

I didn't go for it because of C: the kvm_for_each_vcpu macro would be
less robust if it included the locking around its body -- nested fors
are susceptible to return/goto errors inside the loop body -- and we'd
have to obfuscate several existing users of that pattern.  Open-coding
the protection everywhere pollutes the code too, IMO.

A lock-less list would solve those problems, but we access the VCPUs by
index, which makes a list suboptimal in the other direction ... using
the list for kvm_for_each_vcpu and adding an RCU-protected array for
kvm_get_vcpu and kvm_get_vcpu_by_id looks like over-engineering, as we
wouldn't save memory, performance, or lines of code by doing that.
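For concreteness, the doubling variant would look roughly like the
sketch below.  This is only an illustration of the tradeoff, not a
patch: struct kvm_vcpu_array and kvm->vcpu_array are made-up names
(today kvm->vcpus is a fixed array inside struct kvm), and every
existing kvm_for_each_vcpu user would have to move to the reader
pattern at the end.

/* Hypothetical RCU-managed container for the VCPU pointers. */
struct kvm_vcpu_array {
	struct rcu_head rcu;
	u32 capacity;
	struct kvm_vcpu *vcpus[];	/* flexible array of pointers */
};

/* Called on VCPU creation with kvm->lock held; doubles when full. */
static int grow_vcpu_array(struct kvm *kvm)
{
	struct kvm_vcpu_array *old, *new;
	u32 cap;

	old = rcu_dereference_protected(kvm->vcpu_array,
					lockdep_is_held(&kvm->lock));
	cap = old ? old->capacity * 2 : 16;
	new = kzalloc(sizeof(*new) + cap * sizeof(new->vcpus[0]),
		      GFP_KERNEL);
	if (!new)
		return -ENOMEM;
	new->capacity = cap;
	if (old)
		memcpy(new->vcpus, old->vcpus,
		       old->capacity * sizeof(old->vcpus[0]));
	rcu_assign_pointer(kvm->vcpu_array, new);
	if (old)
		kfree_rcu(old, rcu);	/* readers may still be walking 'old' */
	return 0;
}

And every reader would open-code (or hide behind a macro) something
like this, which is the pollution I was referring to:

	struct kvm_vcpu_array *array;
	int i;

	rcu_read_lock();
	array = rcu_dereference(kvm->vcpu_array);
	for (i = 0; i < array->capacity; i++) {
		struct kvm_vcpu *vcpu = array->vcpus[i];

		if (!vcpu)
			continue;
		/* a bare 'return' here would leak rcu_read_lock(),
		 * which is why hiding this inside kvm_for_each_vcpu
		 * is fragile */
	}
	rcu_read_unlock();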
I didn't see a way to untangle kvm->vcpus that would allow a nice
runtime-dynamic variant.

We currently don't need to pass more information at VM creation time
either, so I was also thinking of hijacking the parameter to
KVM_CREATE_VM for a factor-of-2 VCPU count (20 bits would last a
while), but that is already a new interface, and a new ioctl that does
a superset of an existing one seemed much better.  (A sketch of the
bit-packing idea is at the end of this mail.)

I agree that the idea is questionable.  I'll redo the series and bump
KVM_MAX_VCPUS unless you think the dynamic array could be done nicely.
(The memory saving is a minuscule fraction of a VM's size, and if we do
big increments in KVM_MAX_VCPUS, then the motivation is gone.)

Thanks.
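P.S.: the bit-packing would have been along these lines -- the shift
and mask here are illustrative guesses, not a proposed ABI, and whether
the count would be raw or rounded to a power of two is glossed over:

#include <stdio.h>

#define KVM_VM_VCPUS_SHIFT	8			/* keep low bits free for a machine type */
#define KVM_VM_VCPUS_MASK	((1UL << 20) - 1)	/* "20 bits would last a while" */

/* Pack a VCPU limit into KVM_CREATE_VM's otherwise-unused 'type' argument. */
static unsigned long pack_vm_type(unsigned long machine_type,
				  unsigned long max_vcpus)
{
	return machine_type |
	       ((max_vcpus & KVM_VM_VCPUS_MASK) << KVM_VM_VCPUS_SHIFT);
}

int main(void)
{
	unsigned long type = pack_vm_type(0, 4096);

	/* the kernel side would recover the limit with the reverse shift */
	printf("type=%#lx max_vcpus=%lu\n", type,
	       (type >> KVM_VM_VCPUS_SHIFT) & KVM_VM_VCPUS_MASK);
	return 0;
}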