linux-hyperv.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>,
	Michael Kelley <mikelley@microsoft.com>,
	Siddharth Chandrasekaran <sidcha@amazon.de>,
	Yuan Yao <yuan.yao@linux.intel.com>,
	Maxim Levitsky <mlevitsk@redhat.com>,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v10 10/39] KVM: x86: hyper-v: Don't use sparse_set_to_vcpu_mask() in kvm_hv_send_ipi()
Date: Wed, 21 Sep 2022 20:54:34 +0000	[thread overview]
Message-ID: <Yyt6Clofhg7u+bI+@google.com> (raw)
In-Reply-To: <20220921152436.3673454-11-vkuznets@redhat.com>

On Wed, Sep 21, 2022, Vitaly Kuznetsov wrote:
> Get rid of on-stack allocation of vcpu_mask and optimize kvm_hv_send_ipi()
> for a smaller number of vCPUs in the request. When Hyper-V TLB flush
> is in  use, HvSendSyntheticClusterIpi{,Ex} calls are not commonly used to
> send IPIs to a large number of vCPUs (and are rarely used in general).
> 
> Introduce hv_is_vp_in_sparse_set() to directly check if the specified
> VP_ID is present in sparse vCPU set.
> 
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/kvm/hyperv.c | 37 ++++++++++++++++++++++++++-----------
>  1 file changed, 26 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 69891c48c12a..9764ebb7fd5f 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1741,6 +1741,25 @@ static void sparse_set_to_vcpu_mask(struct kvm *kvm, u64 *sparse_banks,
>  	}
>  }
>  
> +static bool hv_is_vp_in_sparse_set(u32 vp_id, u64 valid_bank_mask, u64 sparse_banks[])
> +{
> +	int bank, sbank = 0;
> +
> +	if (!test_bit(vp_id / HV_VCPUS_PER_SPARSE_BANK,
> +		      (unsigned long *)&valid_bank_mask))
> +		return false;
> +
> +	for_each_set_bit(bank, (unsigned long *)&valid_bank_mask,
> +			 KVM_HV_MAX_SPARSE_VCPU_SET_BITS) {
> +		if (bank == vp_id / HV_VCPUS_PER_SPARSE_BANK)
> +			break;
> +		sbank++;

At the risk of being too clever, this can be heavily optimized, which given what
this helper is used for is probably worth doing.  The index into sparse_banks is
the number of bits preceding the target bit, and POPCNT can determine the number
of bits.  So, to get the index, simply strip off the upper bits and do hweight64().

And to avoid bugs while also optimizing for "small" VMs, the math can be skipped
if vp_id < 64, i.e. if bank==0, because in that case there can't possibly be
preceding bits.

Compile tested only...

	int valid_bit_nr = vp_id / HV_VCPUS_PER_SPARSE_BANK;
	unsigned long sbank;

	if (!test_bit(valid_bit_nr, (unsigned long *)&valid_bank_mask))
		return false;

	/*
	 * The index into the sparse bank is the number of preceding bits in
	 * the valid mask.  Optimize for VMs with <64 vCPUs by skipping the
	 * fancy math if there can't possibly be preceding bits.
	 */
	if (valid_bit_nr)
		sbank = hweight64(valid_bank_mask & GENMASK_ULL(valid_bit_nr - 1, 0));
	else
		sbank = 0;

	return test_bit(vp_id % HV_VCPUS_PER_SPARSE_BANK,
			(unsigned long *)&sparse_banks[sbank]);

yields this, where the "call __sw_hweight64" will be patched to POPCNT on 64-bit
hosts (POPCNT has been around for a long time).

   	call   0xffffffff810c3ea0 <__fentry__>
   	push   %rbp
   	mov    %edi,%eax
   	mov    %rsp,%rbp
   	shr    $0x6,%eax
   	sub    $0x8,%rsp
   	mov    %rsi,-0x8(%rbp)
   	mov    %eax,%ecx
   	bt     %rcx,-0x8(%rbp)
   	setb   %cl
   	jae    0xffffffff81064784 <hv_is_vp_in_sparse_set+52>
   	test   %eax,%eax
   	mov    %edi,%r8d
   	jne    0xffffffff8106478c <hv_is_vp_in_sparse_set+60>
   	and    $0x3f,%r8d
   	bt     %r8,(%rdx)
   	setb   %cl
   	leave  
   	mov    %ecx,%eax
   	jmp    0xffffffff81c02200 <__x86_return_thunk>
   	mov    $0x40,%ecx
   	mov    $0xffffffffffffffff,%rdi
   	sub    %eax,%ecx
   	shr    %cl,%rdi
   	and    -0x8(%rbp),%rdi
   	call   0xffffffff815beea0 <__sw_hweight64>
   	lea    (%rdx,%rax,8),%rdx
   	jmp    0xffffffff81064779 <hv_is_vp_in_sparse_set+41>

Alternatively, we could choose not to optimize bit==0 and just do:

	sbank = hweight64(valid_bank_mask & GENMASK_ULL(valid_bit_nr, 0)) - 1;

but there's enough prep work needed for hweight64() that I think it's worth
opimtizing because "small" VMs are probably very common.

  reply	other threads:[~2022-09-21 20:54 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-21 15:23 [PATCH v10 00/39] KVM: x86: hyper-v: Fine-grained TLB flush + L2 TLB flush features Vitaly Kuznetsov
2022-09-21 15:23 ` [PATCH v10 01/39] KVM: x86: Rename 'enable_direct_tlbflush' to 'enable_l2_tlb_flush' Vitaly Kuznetsov
2022-09-21 15:23 ` [PATCH v10 02/39] KVM: x86: hyper-v: Resurrect dedicated KVM_REQ_HV_TLB_FLUSH flag Vitaly Kuznetsov
2022-09-21 16:23   ` Sean Christopherson
2022-09-21 16:45     ` Sean Christopherson
2022-09-22  9:35       ` Vitaly Kuznetsov
2022-09-22  9:31     ` Vitaly Kuznetsov
2022-09-22 15:23       ` Sean Christopherson
2022-09-22 15:37         ` Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 03/39] KVM: x86: hyper-v: Introduce TLB flush fifo Vitaly Kuznetsov
2022-09-21 16:56   ` Sean Christopherson
2022-09-22  9:42     ` Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 04/39] KVM: x86: hyper-v: Add helper to read hypercall data for array Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 05/39] KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently Vitaly Kuznetsov
2022-09-21 17:00   ` Sean Christopherson
2022-09-21 15:24 ` [PATCH v10 06/39] KVM: x86: hyper-v: Expose support for extended gva ranges for flush hypercalls Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 07/39] KVM: x86: Prepare kvm_hv_flush_tlb() to handle L2's GPAs Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 08/39] x86/hyperv: Introduce HV_MAX_SPARSE_VCPU_BANKS/HV_VCPUS_PER_SPARSE_BANK constants Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 09/39] KVM: x86: hyper-v: Use HV_MAX_SPARSE_VCPU_BANKS/HV_VCPUS_PER_SPARSE_BANK instead of raw '64' Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 10/39] KVM: x86: hyper-v: Don't use sparse_set_to_vcpu_mask() in kvm_hv_send_ipi() Vitaly Kuznetsov
2022-09-21 20:54   ` Sean Christopherson [this message]
2022-09-21 15:24 ` [PATCH v10 11/39] KVM: x86: hyper-v: Create a separate fifo for L2 TLB flush Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 12/39] KVM: x86: hyper-v: Use preallocated buffer in 'struct kvm_vcpu_hv' instead of on-stack 'sparse_banks' Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 13/39] KVM: nVMX: Keep track of hv_vm_id/hv_vp_id when eVMCS is in use Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 14/39] KVM: nSVM: Keep track of Hyper-V hv_vm_id/hv_vp_id Vitaly Kuznetsov
2022-09-21 21:16   ` Sean Christopherson
2022-09-22  9:51     ` Vitaly Kuznetsov
2022-09-22 19:52       ` Sean Christopherson
2022-09-21 15:24 ` [PATCH v10 15/39] KVM: x86: Introduce .hv_inject_synthetic_vmexit_post_tlb_flush() nested hook Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 16/39] KVM: x86: hyper-v: Introduce kvm_hv_is_tlb_flush_hcall() Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 17/39] KVM: x86: hyper-v: L2 TLB flush Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 18/39] KVM: x86: hyper-v: Introduce fast guest_hv_cpuid_has_l2_tlb_flush() check Vitaly Kuznetsov
2022-09-21 21:19   ` Sean Christopherson
2022-09-21 15:24 ` [PATCH v10 19/39] KVM: nVMX: hyper-v: Cache VP assist page in 'struct kvm_vcpu_hv' Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 20/39] KVM: nVMX: hyper-v: Enable L2 TLB flush Vitaly Kuznetsov
2022-09-21 21:24   ` Sean Christopherson
2022-09-22 16:05   ` Sean Christopherson
2022-09-21 15:24 ` [PATCH v10 21/39] KVM: nSVM: " Vitaly Kuznetsov
2022-09-21 21:31   ` Sean Christopherson
2022-09-21 15:24 ` [PATCH v10 22/39] KVM: x86: Expose Hyper-V L2 TLB flush feature Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 23/39] KVM: selftests: Better XMM read/write helpers Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 24/39] KVM: selftests: Move HYPERV_LINUX_OS_ID definition to a common header Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 25/39] KVM: selftests: Move the function doing Hyper-V hypercall " Vitaly Kuznetsov
2022-09-21 21:51   ` Sean Christopherson
2022-09-21 15:24 ` [PATCH v10 26/39] KVM: selftests: Hyper-V PV IPI selftest Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 27/39] KVM: selftests: Fill in vm->vpages_mapped bitmap in virt_map() too Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 28/39] KVM: selftests: Export vm_vaddr_unused_gap() to make it possible to request unmapped ranges Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 29/39] KVM: selftests: Export _vm_get_page_table_entry() Vitaly Kuznetsov
2022-09-21 22:13   ` Sean Christopherson
2022-09-21 15:24 ` [PATCH v10 30/39] KVM: selftests: Hyper-V PV TLB flush selftest Vitaly Kuznetsov
2022-09-21 22:52   ` Sean Christopherson
2022-10-03 13:01     ` Vitaly Kuznetsov
2022-10-03 15:47       ` Sean Christopherson
2022-10-03 16:00         ` Sean Christopherson
2022-09-21 15:24 ` [PATCH v10 31/39] KVM: selftests: Sync 'struct hv_enlightened_vmcs' definition with hyperv-tlfs.h Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 32/39] KVM: selftests: Sync 'struct hv_vp_assist_page' " Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 33/39] KVM: selftests: Move Hyper-V VP assist page enablement out of evmcs.h Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 34/39] KVM: selftests: Split off load_evmcs() from load_vmcs() Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 35/39] KVM: selftests: Create a vendor independent helper to allocate Hyper-V specific test pages Vitaly Kuznetsov
2022-09-21 22:59   ` Sean Christopherson
2022-09-21 15:24 ` [PATCH v10 36/39] KVM: selftests: Allocate Hyper-V partition assist page Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 37/39] KVM: selftests: evmcs_test: Introduce L2 TLB flush test Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 38/39] KVM: selftests: hyperv_svm_test: " Vitaly Kuznetsov
2022-09-21 15:24 ` [PATCH v10 39/39] KVM: selftests: Rename 'evmcs_test' to 'hyperv_evmcs' Vitaly Kuznetsov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yyt6Clofhg7u+bI+@google.com \
    --to=seanjc@google.com \
    --cc=jmattson@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mikelley@microsoft.com \
    --cc=mlevitsk@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=sidcha@amazon.de \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=yuan.yao@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).