From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.3 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,NICE_REPLY_A, SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C931AC433E0 for ; Fri, 5 Mar 2021 02:33:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9A7C964EC9 for ; Fri, 5 Mar 2021 02:33:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229791AbhCECdc (ORCPT ); Thu, 4 Mar 2021 21:33:32 -0500 Received: from mga06.intel.com ([134.134.136.31]:38809 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229436AbhCECdb (ORCPT ); Thu, 4 Mar 2021 21:33:31 -0500 IronPort-SDR: J5ZrL8MbjQAIS+PNlNF2fc4GMVOYa64tAfkOP2STYc+/W8rU7pxgFw1q3+f4bXYPhrRxxGpyLZ 7PDr+hT1em+A== X-IronPort-AV: E=McAfee;i="6000,8403,9913"; a="248941328" X-IronPort-AV: E=Sophos;i="5.81,224,1610438400"; d="scan'208";a="248941328" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Mar 2021 18:33:30 -0800 IronPort-SDR: ZBEGOxWl1Occbip9Q3/yVi4Hx23VPJWDUFNza22g7byXr/dQhCIv+jUgZ6NB6D1WIeO0qGXVVC vjEQtDRhraiA== X-IronPort-AV: E=Sophos;i="5.81,224,1610438400"; d="scan'208";a="401093201" Received: from likexu-mobl1.ccr.corp.intel.com (HELO [10.238.4.93]) ([10.238.4.93]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Mar 2021 18:33:26 -0800 Subject: Re: [PATCH v3 5/9] KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for Arch LBR To: Sean Christopherson Cc: Peter Zijlstra , Paolo Bonzini , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , Kan Liang , Dave Hansen , wei.w.wang@intel.com, Borislav Petkov , kvm@vger.kernel.org, x86@kernel.org, linux-kernel@vger.kernel.org, Like Xu References: <20210303135756.1546253-1-like.xu@linux.intel.com> <20210303135756.1546253-6-like.xu@linux.intel.com> <890a6f34-812a-5937-8761-d448a04f67d7@intel.com> From: "Xu, Like" Message-ID: <5be999eb-64d7-de0e-254b-82711acc5e24@intel.com> Date: Fri, 5 Mar 2021 10:33:24 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.8.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2021/3/5 0:12, Sean Christopherson wrote: > On Thu, Mar 04, 2021, Xu, Like wrote: >> Hi Sean, >> >> Thanks for your detailed review on the patch set. >> >> On 2021/3/4 0:58, Sean Christopherson wrote: >>> On Wed, Mar 03, 2021, Like Xu wrote: >>>> @@ -348,10 +352,26 @@ static bool intel_pmu_handle_lbr_msrs_access(struct kvm_vcpu *vcpu, >>>> return true; >>>> } >>>> +/* >>>> + * Check if the requested depth values is supported >>>> + * based on the bits [0:7] of the guest cpuid.1c.eax. >>>> + */ >>>> +static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth) >>>> +{ >>>> + struct kvm_cpuid_entry2 *best; >>>> + >>>> + best = kvm_find_cpuid_entry(vcpu, 0x1c, 0); >>>> + if (best && depth && !(depth % 8)) >>> This is still wrong, it fails to weed out depth > 64. >> How come ? The testcases depth = {65, 127, 128} get #GP as expected. > @depth is a u64, throw in a number that is a multiple of 8 and >= 520, and the > "(1ULL << (depth / 8 - 1))" will trigger undefined behavior due to shifting > beyond the capacity of a ULL / u64. Extra: when we say "undefined behavior" if shifting beyond the capacity of a ULL, do you mean that the actual behavior depends on the machine, architecture or compiler? > > Adding the "< 64" check would also allow dropping the " & 0xff" since the check > would ensure the shift doesn't go beyond bit 7. I'm not sure the cleverness is > worth shaving a cycle, though. Finally how about:     if (best && depth && (depth < 65) && !(depth & 7))         return best->eax & BIT_ULL(depth / 8 - 1);     return false; Do you see the room for optimization ? > >>> Not that this is a hot path, but it's probably worth double checking that the >>> compiler generates simple code for "depth % 8", e.g. it can be "depth & 7)". >> Emm, the "%" operation is quite normal over kernel code. > So is "&" :-) I was just pointing out that the compiler should optimize this, > and it did. > >> if (best && depth && !(depth % 8)) >>    10659:       48 85 c0                test   rax,rax >>    1065c:       74 c7                   je     10625 >>    1065e:       4d 85 e4                test   r12,r12 >>    10661:       74 c2                   je     10625 >>    10663:       41 f6 c4 07             test   r12b,0x7 >>    10667:       75 bc                   jne    10625 >> >> It looks like the compiler does the right thing. >> Do you see the room for optimization ? >> >>>> + return (best->eax & 0xff) & (1ULL << (depth / 8 - 1)); > Actually, looking at this again, I would explicitly use BIT() instead of 1ULL > (or BIT_ULL), since the shift must be 7 or less. > >>>> + >>>> + return false; >>>> +} >>>> +