From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752803AbdF0OXw (ORCPT ); Tue, 27 Jun 2017 10:23:52 -0400
Received: from mx1.redhat.com ([209.132.183.28]:45172 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1752310AbdF0OXQ (ORCPT ); Tue, 27 Jun 2017 10:23:16 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com A537CC04574E
Authentication-Results: ext-mx08.extmail.prod.ext.phx2.redhat.com;
        dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx08.extmail.prod.ext.phx2.redhat.com;
        spf=pass smtp.mailfrom=rkrcmar@redhat.com
DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com A537CC04574E
Date: Tue, 27 Jun 2017 16:22:51 +0200
From: Radim Krčmář
To: Paolo Bonzini
Cc: Wanpeng Li , Yang Zhang , Thomas Gleixner , Ingo Molnar ,
        "H. Peter Anvin" , the arch/x86 maintainers , Jonathan Corbet ,
        tony.luck@intel.com, Borislav Petkov , Peter Zijlstra ,
        mchehab@kernel.org, Andrew Morton , krzk@kernel.org,
        jpoimboe@redhat.com, Andy Lutomirski , Christian Borntraeger ,
        Thomas Garnier , Robert Gerst , Mathias Krause ,
        douly.fnst@cn.fujitsu.com, Nicolai Stange , Frederic Weisbecker ,
        dvlasenk@redhat.com, Daniel Bristot de Oliveira ,
        yamada.masahiro@socionext.com, mika.westerberg@linux.intel.com,
        Chen Yu , aaron.lu@intel.com, Steven Rostedt , Kyle Huey ,
        Len Brown , Prarit Bhargava , hidehiro.kawai.ez@hitachi.com,
        fengtiantian@huawei.com, pmladek@suse.com, jeyu@redhat.com,
        Larry.Finger@lwfinger.net, zijun_hu@htc.com, luisbg@osg.samsung.com,
        johannes.berg@intel.com, niklas.soderlund+renesas@ragnatech.se,
        zlpnobody@gmail.com, Alexey Dobriyan ,
        fgao@48lvckh6395k16k5.yundunddos.com, ebiederm@xmission.com,
        Subash Abhinov Kasiviswanathan , Arnd Bergmann , Matt Fleming ,
        Mel Gorman , "linux-kernel@vger.kernel.org" ,
        linux-doc@vger.kernel.org, linux-edac@vger.kernel.org, kvm
Subject: Re: [PATCH 2/2] x86/idle: use dynamic halt poll
Message-ID: <20170627142251.GB1487@potion>
References: <1498130534-26568-1-git-send-email-root@ip-172-31-39-62.us-west-2.compute.internal>
        <1498130534-26568-3-git-send-email-root@ip-172-31-39-62.us-west-2.compute.internal>
        <4444ffc8-9e7b-5bd2-20da-af422fe834cc@redhat.com>
        <2245bef7-b668-9265-f3f8-3b63d71b1033@gmail.com>
        <7d085956-2573-212f-44f4-86104beba9bb@gmail.com>
        <05ec7efc-fb9c-ae24-5770-66fc472545a4@redhat.com>
        <20170627134043.GA1487@potion>
        <2771f905-d1b0-b118-9ae9-db5fb87f877c@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <2771f905-d1b0-b118-9ae9-db5fb87f877c@redhat.com>
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
        (mx1.redhat.com [10.5.110.32]); Tue, 27 Jun 2017 14:23:10 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

2017-06-27 15:56+0200, Paolo Bonzini:
> On 27/06/2017 15:40, Radim Krčmář wrote:
>>> ... which is not necessarily _wrong_.  It's just a different heuristic.
>> Right, it's just harder to use than host's single_task_running() -- the
>> VCPU calling vcpu_is_preempted() is never preempted, so we have to look
>> at other VCPUs that are not halted, but still preempted.
>>
>> If we see some ratio of preempted VCPUs (> 0?), then we stop polling and
>> yield to the host.  Working under the assumption that there is work for
>> this PCPU if other VCPUs have stuff to do.  The downside is that it
>> misses information about host's topology, so it would be hard to make it
>> work well.
>
> I would just use vcpu_is_preempted on the current CPU.  From guest POV
> this option is really a "f*** everyone else" setting just like
> idle=poll, only a little more polite.

vcpu_is_preempted() on current cpu cannot return true, AFAIK.

> If we've been preempted and we were polling, there are two cases.  If an
> interrupt was queued while the guest was preempted, the poll will be
> treated as successful anyway.

I think the poll should be treated as invalid if the window has expired
while the VCPU was preempted -- the guest can't tell whether the
interrupt arrived still within the poll window (unless we added paravirt
for that), so it shouldn't be wasting time waiting for it.

>                                If it hasn't, let others run---but really
> that's not because the guest wants to be polite, it's to avoid that the
> scheduler penalizes it excessively.

This sounds like a VM entry just to do an immediate VM exit, so paravirt
seems better here as well ... (the guest telling the host about its
window -- which could also be used to rule it out as a target in the
pause loop random kick.)

> So until it's preempted, I think it's okay if the guest doesn't care
> about others.  You wouldn't use this option anyway in overcommitted
> situations.
>
> (I'm still not very convinced about the idea).

Me neither.  (The same mechanism is applicable to bare-metal, but was
never used there, so I would rather bring the guest behavior closer to
bare-metal.)
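
The two guest-side heuristics discussed above (stop polling when other
non-halted VCPUs are preempted, and discard a poll whose window expired
while the VCPU was preempted) could be sketched roughly as below.  This
is only an illustration under stated assumptions: struct vcpu_state,
count_preempted_busy(), should_stop_polling() and poll_result_valid()
are hypothetical names invented here, not kernel APIs, and a real guest
would obtain the per-VCPU state from vcpu_is_preempted() rather than
from a plain array.  The "> 0?" threshold is left as a parameter because
the thread does not settle on a value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of one VCPU as seen from inside the guest. */
struct vcpu_state {
    bool halted;     /* VCPU is idle, so it has no pending work */
    bool preempted;  /* host has descheduled this VCPU */
};

/*
 * Count the other VCPUs that are not halted but are preempted.
 * The calling VCPU (index self) is skipped: it is running, so it
 * cannot observe itself as preempted.
 */
static size_t count_preempted_busy(const struct vcpu_state *v, size_t n,
                                   size_t self)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (i == self)
            continue;
        if (!v[i].halted && v[i].preempted)
            count++;
    }
    return count;
}

/*
 * Stop polling and yield to the host when more than `threshold`
 * busy VCPUs are preempted (threshold is an assumed tunable).
 */
static bool should_stop_polling(const struct vcpu_state *v, size_t n,
                                size_t self, size_t threshold)
{
    return count_preempted_busy(v, n, self) > threshold;
}

/*
 * A poll only counts if the current time is still inside the poll
 * window; if the window expired (for example while the VCPU was
 * preempted), the guest cannot tell when the interrupt arrived.
 */
static bool poll_result_valid(uint64_t poll_start_ns, uint64_t window_ns,
                              uint64_t now_ns)
{
    return now_ns - poll_start_ns <= window_ns;
}
```

As noted in the thread, such a count ignores the host's topology, so a
preempted VCPU elsewhere does not necessarily mean there is work for
this PCPU.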