From: Zeng Guang <guang.zeng@intel.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"Luck, Tony" <tony.luck@intel.com>,
	Kan Liang <kan.liang@linux.intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Kim Phillips <kim.phillips@amd.com>,
	Jarkko Sakkinen <jarkko@kernel.org>,
	Jethro Beekman <jethro@fortanix.com>,
	"Huang, Kai" <kai.huang@intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Hu, Robert" <robert.hu@intel.com>,
	"Gao, Chao" <chao.gao@intel.com>
Subject: Re: [PATCH v5 8/8] KVM: VMX: Resize PID-ponter table on demand for IPI virtualization
Date: Mon, 17 Jan 2022 23:04:18 +0800
Message-ID: <67262b95-d577-0620-79bf-20fc37906869@intel.com>
In-Reply-To: <YeGiVCn0wNH9eqxX@google.com>

On 1/15/2022 12:18 AM, Sean Christopherson wrote:
> On Fri, Jan 14, 2022, Zeng Guang wrote:
>> On 1/14/2022 6:09 AM, Sean Christopherson wrote:
>>> On Fri, Dec 31, 2021, Zeng Guang wrote:
>>>> +static int vmx_expand_pid_table(struct kvm_vmx *kvm_vmx, int entry_idx)
>>>> +{
>>>> +	u64 *last_pid_table;
>>>> +	int last_table_size, new_order;
>>>> +
>>>> +	if (entry_idx <= kvm_vmx->pid_last_index)
>>>> +		return 0;
>>>> +
>>>> +	last_pid_table = kvm_vmx->pid_table;
>>>> +	last_table_size = table_index_to_size(kvm_vmx->pid_last_index + 1);
>>>> +	new_order = get_order(table_index_to_size(entry_idx + 1));
>>>> +
>>>> +	if (vmx_alloc_pid_table(kvm_vmx, new_order))
>>>> +		return -ENOMEM;
>>>> +
>>>> +	memcpy(kvm_vmx->pid_table, last_pid_table, last_table_size);
>>>> +	kvm_make_all_cpus_request(&kvm_vmx->kvm, KVM_REQ_PID_TABLE_UPDATE);
>>>> +
>>>> +	/* Now old PID table can be freed safely as no vCPU is using it. */
>>>> +	free_pages((unsigned long)last_pid_table, get_order(last_table_size));
>>> This is terrifying.  I think it's safe?  But it's still terrifying.
>> Free old PID table here is safe as kvm making request KVM_REQ_PI_TABLE_UPDATE
>> with KVM_REQUEST_WAIT flag force all vcpus trigger vm-exit to update vmcs
>> field to new allocated PID table. At this time, it makes sure old PID table
>> not referenced by any vcpu.
>> Do you mean it still has potential problem?
> No, I do think it's safe, but it is still terrifying :-)
>
>>> Rather than dynamically react as vCPUs are created, what about we make max_vcpus
>>> common[*], extend KVM_CAP_MAX_VCPUS to allow userspace to override max_vcpus,
>>> and then have the IPIv support allocate the PID table on first vCPU creation
>>> instead of in vmx_vm_init()?
>>>
>>> That will give userspace an opportunity to lower max_vcpus to reduce memory
>>> consumption without needing to dynamically muck with the table in KVM.  Then
>>> this entire patch goes away.
>> IIUC, it's risky if relying on userspace .
> That's why we have cgroups, rlimits, etc...
>
>> In this way userspace also have chance to assign large max_vcpus but not use
>> them at all. This cannot approach the goal to save memory as much as possible
>> just similar as using KVM_MAX_VCPU_IDS to allocate PID table.
> Userspace can simply do KVM_CREATE_VCPU until it hits KVM_MAX_VCPU_IDS...
IIUC, what you propose is to introduce max_vcpus in KVM for the x86 arch
(not present yet) and to provide a new API that lets userspace tell KVM how
many vCPUs the current VM session will have, prior to vCPU creation.
IPIv could then set up the PID table with this information in one shot.
I think this leaves several things uncertain:
1. Cannot identify the exact max APIC ID corresponding to max_vcpus
The APIC ID definition is platform dependent. In theory, a large APIC ID
could be assigned to a vCPU even when running with a small max_vcpus. We
cannot figure out which max APIC ID a given max_vcpus maps to.

2. Cannot minimize the memory consumption of the PID table at run-time
In the case of "-smp small_n,maxcpus=large_N", KVM has to allocate memory
to accommodate large_N vCPUs up front, no matter whether all maxcpus will
ever run.

3. Potential backward-compatibility problem
When running with an old QEMU version, KVM cannot get the expected
information and has to fall back to KVM_MAX_VCPU_IDS by default. That is
feasible, but it brings no memory saving for the PID table.

What's your opinion? Thanks.
