From: "Radim Krčmář" <rkrcmar@redhat.com>
To: Wanpeng Li <kernellwp@gmail.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
Paolo Bonzini <pbonzini@redhat.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>
Subject: Re: [PATCH v3 2/6] KVM: X86: Implement PV IPIs in linux guest
Date: Thu, 19 Jul 2018 18:28:27 +0200
Message-ID: <20180719162826.GB11749@flask>
In-Reply-To: <1530598891-21370-3-git-send-email-wanpengli@tencent.com>
2018-07-03 14:21+0800, Wanpeng Li:
> From: Wanpeng Li <wanpengli@tencent.com>
>
> Implement paravirtual apic hooks to enable PV IPIs.
>
> apic->send_IPI_mask
> apic->send_IPI_mask_allbutself
> apic->send_IPI_allbutself
> apic->send_IPI_all
>
> The PV IPIs supports maximal 128 vCPUs VM, it is big enough for cloud
> environment currently, supporting more vCPUs needs to introduce more
> complex logic, in the future this might be extended if needed.
>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> ---
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> @@ -454,6 +454,71 @@ static void __init sev_map_percpu_data(void)
> }
>
> #ifdef CONFIG_SMP
> +
> +#ifdef CONFIG_X86_64
> +static void __send_ipi_mask(const struct cpumask *mask, int vector)
> +{
> + unsigned long flags, ipi_bitmap_low = 0, ipi_bitmap_high = 0;
> + int cpu, apic_id;
> +
> + if (cpumask_empty(mask))
> + return;
> +
> + local_irq_save(flags);
> +
> + for_each_cpu(cpu, mask) {
> + apic_id = per_cpu(x86_cpu_to_apicid, cpu);
> + if (apic_id < BITS_PER_LONG)
> + __set_bit(apic_id, &ipi_bitmap_low);
> + else if (apic_id < 2 * BITS_PER_LONG)
> + __set_bit(apic_id - BITS_PER_LONG, &ipi_bitmap_high);
It'd be nicer with 'unsigned long ipi_bitmap[2]' and a single
__set_bit(apic_id, ipi_bitmap);
> + }
> +
> + kvm_hypercall3(KVM_HC_SEND_IPI, ipi_bitmap_low, ipi_bitmap_high, vector);
and
kvm_hypercall3(KVM_HC_SEND_IPI, ipi_bitmap[0], ipi_bitmap[1], vector);
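A minimal user-space sketch of the array form suggested above. The helper names set_bit_in_array() and ipi_bitmap_add() are hypothetical stand-ins (the kernel would use __set_bit() directly), and BITS_PER_LONG is redefined locally so the snippet compiles outside the kernel:

```c
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define IPI_BITMAP_LONGS 2

/* User-space stand-in for the kernel's non-atomic __set_bit(). */
static void set_bit_in_array(unsigned int nr, unsigned long *addr)
{
	addr[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

/*
 * Mark apic_id in the bitmap. Returns false when the ID does not fit,
 * which corresponds to the IDs the patch silently skips.
 */
static bool ipi_bitmap_add(unsigned long *ipi_bitmap, unsigned int apic_id)
{
	if (apic_id >= IPI_BITMAP_LONGS * BITS_PER_LONG)
		return false;
	set_bit_in_array(apic_id, ipi_bitmap);
	return true;
}
```

With the array, the low/high split disappears from the loop body and the hypercall just passes ipi_bitmap[0] and ipi_bitmap[1].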
Still, the main problem is that we can only address 128 APICs.
A simple improvement would reuse the vector field (as we need only 8
bits) and put an 'offset' in the rest. The offset would say which
cluster of 128 we are addressing. 24 bits of offset results in 2^31
total addressable CPUs (we probably should even use that many bits).
The downside of this is that we can only address 128 at a time.
It's basically the same as x2apic cluster mode, only with a cluster
size of 128 instead of 16, so the code should be a straightforward port.
And because x2apic code doesn't seem to use any division by the cluster
size, we could even try to use kvm_hypercall4, add ipi_bitmap[2], and
make the cluster size 192. :)
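One way the vector/offset packing described above could look, as a user-space sketch only. All names here (pv_ipi_pack() and friends) are hypothetical, and the layout (vector in the low 8 bits, cluster index in the upper 24) is an assumption for illustration, not an agreed ABI:

```c
#include <stdint.h>

#define PV_IPI_VECTOR_BITS  8u
#define PV_IPI_VECTOR_MASK  ((1u << PV_IPI_VECTOR_BITS) - 1)
#define PV_IPI_CLUSTER_SIZE 128u	/* two 64-bit bitmap words */

/*
 * Pack the 8-bit vector and a cluster index into one 32-bit argument.
 * The caller must keep cluster below 2^24; higher bits are shifted out.
 */
static uint32_t pv_ipi_pack(uint8_t vector, uint32_t cluster)
{
	return (cluster << PV_IPI_VECTOR_BITS) | vector;
}

static uint8_t pv_ipi_vector(uint32_t arg)
{
	return arg & PV_IPI_VECTOR_MASK;
}

static uint32_t pv_ipi_cluster(uint32_t arg)
{
	return arg >> PV_IPI_VECTOR_BITS;
}

/* Cluster that an APIC ID falls in, and its bit within that cluster. */
static uint32_t apic_id_to_cluster(uint32_t apic_id)
{
	return apic_id / PV_IPI_CLUSTER_SIZE;
}

static uint32_t apic_id_to_bit(uint32_t apic_id)
{
	return apic_id % PV_IPI_CLUSTER_SIZE;
}
```

A sender would group destinations by apic_id_to_cluster() and issue one hypercall per cluster, so 2^24 clusters of 128 give the 2^31 addressable IDs mentioned above.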
But because it is very similar to x2apic, I'd really need some real
performance data to see if this benefits a real workload.
Hardware could further optimize LAPIC (apicv, vapic) in the future,
which we'd lose by using paravirt.
E.g. AMD's acceleration should be superior to this when using < 8 VCPUs,
as they can use logical xAPIC and send without VM exits (when all VCPUs
are running).
> +
> + local_irq_restore(flags);
> +}
> +
> +static void kvm_send_ipi_mask(const struct cpumask *mask, int vector)
> +{
> + __send_ipi_mask(mask, vector);
> +}
> +
> +static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector)
> +{
> + unsigned int this_cpu = smp_processor_id();
> + struct cpumask new_mask;
> + const struct cpumask *local_mask;
> +
> + cpumask_copy(&new_mask, mask);
> + cpumask_clear_cpu(this_cpu, &new_mask);
> + local_mask = &new_mask;
> + __send_ipi_mask(local_mask, vector);
> +}
> +
> +static void kvm_send_ipi_allbutself(int vector)
> +{
> + kvm_send_ipi_mask_allbutself(cpu_online_mask, vector);
> +}
> +
> +static void kvm_send_ipi_all(int vector)
> +{
> + __send_ipi_mask(cpu_online_mask, vector);
These should be faster when using the native APIC shorthand -- is this
the "Broadcast" in your tests?
> +}
> +
> +/*
> + * Set the IPI entry points
> + */
> +static void kvm_setup_pv_ipi(void)
> +{
> + apic->send_IPI_mask = kvm_send_ipi_mask;
> + apic->send_IPI_mask_allbutself = kvm_send_ipi_mask_allbutself;
> + apic->send_IPI_allbutself = kvm_send_ipi_allbutself;
> + apic->send_IPI_all = kvm_send_ipi_all;
> + pr_info("KVM setup pv IPIs\n");
> +}
> +#endif
> +
> static void __init kvm_smp_prepare_cpus(unsigned int max_cpus)
> {
> native_smp_prepare_cpus(max_cpus);
> @@ -626,6 +691,11 @@ static uint32_t __init kvm_detect(void)
>
> static void __init kvm_apic_init(void)
> {
> +#if defined(CONFIG_SMP) && defined(CONFIG_X86_64)
> + if (kvm_para_has_feature(KVM_FEATURE_PV_SEND_IPI) &&
> + num_possible_cpus() <= 2 * BITS_PER_LONG)
It looks like num_possible_cpus() is actually NR_CPUS, so the feature
would never be used on a standard Linux distro.
And we're using APIC_ID, which can be higher even if the maximum CPU
number is lower. Just remove it.
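To illustrate why a CPU-count check says little about APIC IDs, here is a small user-space sketch. The helper all_apic_ids_fit() and the limit macro are hypothetical and exist only for this example; the point is that a guest with few CPUs can still have sparse APIC IDs at or above 128:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PV_IPI_MAX_APIC_ID 128u	/* 2 * BITS_PER_LONG on x86-64 */

/* True only if every listed APIC ID fits in the two-word bitmap. */
static bool all_apic_ids_fit(const uint32_t *apic_ids, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (apic_ids[i] >= PV_IPI_MAX_APIC_ID)
			return false;
	}
	return true;
}
```

For example, four CPUs with APIC IDs {0, 1, 128, 129} would pass a "number of CPUs <= 128" test yet still contain IDs the bitmap cannot express, so the per-ID range check in the send path is what actually matters.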