From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Longpeng (Mike)"
Subject: Re: [Question] About the behavior of HLT in VMX guest mode
Date: Fri, 17 Mar 2017 13:22:03 +0800
Message-ID: <58CB727B.20902@huawei.com>
References: <58C64672.1070706@huawei.com> <20170315173254.GF14081@potion> <58C9F3A5.3090604@huawei.com> <20170316142340.GD14076@potion>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Cc: kvm , Jan Beulich , Gonglei
To: Radim Krčmář
Return-path:
Received: from szxga03-in.huawei.com ([45.249.212.189]:4421 "EHLO dggrg03-dlp.huawei.com" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1750980AbdCQFW4 (ORCPT ); Fri, 17 Mar 2017 01:22:56 -0400
In-Reply-To: <20170316142340.GD14076@potion>
Sender: kvm-owner@vger.kernel.org
List-ID:

Hi Radim,

On 2017/3/16 22:23, Radim Krčmář wrote:

> 2017-03-16 10:08+0800, Longpeng (Mike):
>> Hi, Radim,
>>
>> On 2017/3/16 1:32, Radim Krčmář wrote:
>>
>>> 2017-03-13 15:12+0800, Longpeng (Mike):
>>>> Hi guys,
>>>>
>>>> I'm confusing about the behavior of HLT instruction in VMX guest mode.
>>>>
>>>> I set "hlt exiting" bit to 0 in VMCS, and the vcpu didn't vmexit when execute
>>>> HLT as expected. However, I used powertop/cpupower on host to watch the pcpu's
>>>> c-states, it seems that the pcpu didn't enter C1/C1E state during this period.
>>>>
>>>> I searched the Intel spec vol-3, and only found that guest MWAIT won't entering
>>>> a low-power sleep state under certain conditions(ch 25.3), but not mentioned HLT.
>>>>
>>>> My questions are
>>>> 1) Does executing HLT instruction in guest-mode won't enter C1/C1E state ?
>>>
>>> Do you get a different result when running HLT outside VMX?
>>>
>>
>> Yep, I'm sure that executing HLT in host will enter C1/C1E state, but it won't
>> when executing in guest.
>
> I'd go for the thermal monitoring (ideally with constant fan speed) if
> CPU counters are lacking.
> Thermal sensors are easily accessible and far
> more trustworthy for measuring power saving. :)
>
>>>> 2) If it won't, then whether it would release the hardware resources shared with
>>>> another hyper-thread ?
>>>
>>
>>> No idea. Aren't hyperthreaded resources scheduled dynamically, so even
>>> a nop-spinning VCPU won't hinder the other hyper-thread?
>>>
>>
>>
>> I had wrote a testcase in kvm-unit-tests, and it seems that guest-mode HLT-ed
>> vcpu won't compete the hardware resources( maybe including the pipeline ) any more.
>>
>> My testcase is: binding vcpu1 and vcpu2 to a core's 2 hyper-threads, and
>>
>> (vcpu1)
>> t1 = rdtsc();
>> for (int i = 0; i < 10000000; ++i) ;
>> t2 = rdtsc();
>> costs = t2 - t1;
>>
>> (vcpu2)
>> "halt" or "while (1) ;"
>>
>> The result is:
>> -----------------------------------------------------------------------
>>                   (vcpu2)idle=poll    (vcpu2)idle=halt
>> (HLT exiting=1)
>> vcpu1 costs       3800931             1900209
>>
>> (HLT exiting=0)
>> vcpu1 costs       3800193             1913514
>> -----------------------------------------------------------------------
>
> Oh, great results.
> I wonder if the slightly better time on HLT exiting=1 is because the
> other hyper-thread goes into deeper sleep after exit.

Yes, maybe.

Another possible reason is that the host's overhead is lower. With
"HLT exiting=1 && idle=halt" the host is idle and its CPU usage is close to 0%,
while with "HLT exiting=0 && idle=halt" the host is actually very busy and its
CPU usage is close to 100%. Maybe the host kernel does more work when it's busy.

> Btw. does adding pause() into the while loop bring the performance close
> to halt?

Good suggestion! :)

I just tested with pause() added to the poll loop and "ple_gap=0" set. The
performance is much better than with "while (1) ;", but it's still obviously
slower than with halt.
-----------------------------------------------------------------------
                  (vcpu2)poll    (vcpu2)pause    (vcpu2)halt
(HLT exiting=1)
vcpu1 costs       3800931        2572812         1916724

(HLT exiting=0)
vcpu1 costs       3800193        2573685         1912443
-----------------------------------------------------------------------

>
>> I found that https://www.spinics.net/lists/kvm-commits/msg00137.html had maked
>> "HLT exiting" configurable, while
>> http://lkml.iu.edu/hypermail/linux/kernel/1202.0/03309.html removed it due to
>> redundant with CFS hardlimit.
>>
>> I focus on the VM's performance. According the result, I think running HLT in
>> guest-mode is better than idle=poll with HLT-exiting in *certain* scenarios.
>
> Yes, and using MWAIT for idle is even better than HLT (you can be woken

Yes, agreed.

> up without IPI) -- any reason to prefer HLT?
>

In my humble opinion:

1) As Intel SDM vol. 3, ch. 25.3 says, MWAIT operates normally under certain
conditions (which I think includes entering deeper sleep states). Some deeper
sleep states (such as C4E/C6/C7) will clear the L1/L2/L3 cache. This is insecure
unless we take other protective measures, such as limiting the guest's
max C-state. Fortunately the power subsystem isn't supported by QEMU, but we
should still be careful in case it is used for some special purpose. HLT in
guest mode, on the other hand, can't put the hardware into a sleep state.

2) According to Intel SDM vol. 3, ch. 26.3.3 and ch. 27.5.6, I think MONITOR in
guest mode sometimes can't work as well as it does on the host. For example,
suppose a vcpu executes MONITOR on an address and then MWAIT. If an external
interrupt (one that doesn't cause any virtual event to be injected) causes a
VMEXIT, the monitored address will be cleared, so the MWAIT will no longer be
woken up by a store to the monitored address.

But I'm glad to do some tests if time permits, thanks :)

Radim, how about making HLT-exiting configurable again upstream?

If you like the idea, there is one problem that needs to be resolved: async PF
conflicts with "HLT-exiting = 0" in certain situations.

> Thanks.
>
> .
>

--
Regards,
Longpeng(Mike)