From: Quan Xu <quan.xu0@gmail.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Quan Xu <quan.xu03@gmail.com>,
	kvm@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	virtualization@lists.linux-foundation.org, x86@kernel.org,
	xen-devel@lists.xenproject.org,
	Yang Zhang <yang.zhang.wz@gmail.com>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Borislav Petkov <bp@alien8.de>, Kyle Huey <me@kylehuey.com>,
	Len Brown <len.brown@intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	Tobias Klauser <tklauser@distanz.ch>,
	Daniel Lezcano <daniel.lezcano@linaro.org>
Subject: Re: [PATCH RFC v3 3/6] sched/idle: Add a generic poll before enter real idle path
Date: Fri, 17 Nov 2017 20:21:42 +0800
Message-ID: <a9f02b7f-04e2-41e9-38c5-f770f40f6faf@gmail.com>
In-Reply-To: <alpine.DEB.2.20.1711171229100.7700@nanos>



On 2017-11-17 19:36, Thomas Gleixner wrote:
> On Fri, 17 Nov 2017, Quan Xu wrote:
>> On 2017-11-16 17:53, Thomas Gleixner wrote:
>>> That's just plain wrong. We don't want to see any of this PARAVIRT crap in
>>> anything outside the architecture/hypervisor interfacing code which really
>>> needs it.
>>>
>>> The problem can and must be solved at the generic level in the first place
>>> to gather the data which can be used to make such decisions.
>>>
>>> How that information is used might be either completely generic or requires
>>> system specific variants. But as long as we don't have any information at
>>> all we cannot discuss that.
>>>
>>> Please sit down and write up which data needs to be considered to make
>>> decisions about probabilistic polling. Then we need to compare and contrast
>>> that with the data which is necessary to make power/idle state decisions.
>>>
>>> I would be very surprised if this data would not overlap by at least 90%.
>>>
>> 1. which data needs to be considered to make decisions about probabilistic polling
>>
>> I really need to write up which data needs to be considered to make
>> decisions about probabilistic polling. For the last several months,
>> I have focused on the data _from idle to reschedule_, in order to bypass
>> the idle loops. Unfortunately, this inevitably makes me touch
>> scheduler/idle/nohz code.
>>
>> Following tglx's suggestion, the data necessary to make power/idle
>> state decisions is the last idle state's residency time. IIUC this data
>> is the duration from idle to wakeup, which may be caused by a reschedule irq or another irq.
> That's part of the picture, but not complete.

tglx, could you share more? I am very curious about it.

>> I also tested that the reschedule irq overlaps by more than 90% (tracing the
>> need_resched status after cpuidle_idle_call) when I ran ctxsw/netperf for
>> one minute.
>>
>> Given the overlap, I think I can use the last idle state's residency time as
>> input to make decisions about probabilistic polling, as @dev->last_residency does.
>> That data is much easier to get.
> That's only true for your particular use case.
>
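A minimal sketch of how a poll window could be adapted from the last idle residency, assuming a grow/shrink heuristic similar in spirit to the KVM host's halt polling. The function name, the constants and the heuristic itself are illustrative assumptions; none of them come from the patch series.

```c
#include <stdint.h>

/* Hypothetical tunables, chosen only for illustration. */
#define POLL_MAX_NS   200000u  /* upper bound of the poll window */
#define POLL_START_NS 10000u   /* first non-zero poll window */
#define POLL_GROW     2u       /* growth factor after a short idle */
#define POLL_SHRINK   2u       /* shrink divisor after a long idle */

/*
 * Return the poll window to use before the next idle entry, given the
 * current window and the residency of the last idle period (both in ns).
 */
static uint32_t next_poll_ns(uint32_t poll_ns, uint64_t last_residency_ns)
{
    if (last_residency_ns <= POLL_MAX_NS) {
        /* Short idle: a long enough poll would have caught the wakeup. */
        if (poll_ns == 0)
            poll_ns = POLL_START_NS;
        else if (poll_ns < last_residency_ns)
            poll_ns *= POLL_GROW;
        if (poll_ns > POLL_MAX_NS)
            poll_ns = POLL_MAX_NS;
        return poll_ns;
    }
    /* Long idle: polling was wasted, so back off. */
    return poll_ns / POLL_SHRINK;
}
```

As tglx points out, the last residency alone reflects one workload; a generic version would likely need more inputs than this sketch takes.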
>> 2. do a HV specific idle driver (function)
>>
>> So far, power management is not exposed to the guest. Idle is simple for a KVM
>> guest: it calls "sti" / "hlt" (cpuidle_idle_call() --> default_idle_call()).
>> Thanks to the Xen guys, who implemented the paravirt framework, I can implement
>> it as easily as the following:
>>
>>               --- a/arch/x86/kernel/kvm.c
> Your email client is using a very strange formatting.

My bad, I inserted spaces to highlight this code.
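The quoted diff is trimmed above, so the following is only a user-space sketch of the general "poll before halt" idea being discussed, not the code from the patch. The flag `work_pending` stands in for the kernel's need_resched state, and `poll_before_halt()`, `idle_once()` and `halts_taken` are hypothetical names. The poll budget is counted in loop iterations rather than time to keep the example deterministic.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the need_resched state; hypothetical name. */
static atomic_bool work_pending;

/* Counts how often the expensive halt path would have been taken. */
static unsigned int halts_taken;

/* Spin for up to max_spins iterations; true means work arrived in time. */
static bool poll_before_halt(unsigned int max_spins)
{
    for (unsigned int i = 0; i < max_spins; i++) {
        if (atomic_load_explicit(&work_pending, memory_order_acquire))
            return true;
    }
    return false;
}

/* One pass through a simplified idle step. */
static void idle_once(unsigned int max_spins)
{
    if (poll_before_halt(max_spins))
        return;         /* work showed up, skip the halt */
    halts_taken++;      /* a real guest would execute "sti; hlt" here */
}
```

With `max_spins` set to 0 the polling is effectively disabled and every idle step falls through to the halt path.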

> This is definitely better than what you proposed so far and implementing it
> as a proof of concept seems to be worthwhile.
>
> But I doubt that this is the final solution. It's not generic and not
> necessarily suitable for all use case scenarios.
>
>
yes, I am exhausted :):)


Could you tell me what is missing for it to be generic and suitable for
all use case scenarios? Is it the lack of irq/idle predictors?

I really want to upstream it for all public cloud users/providers.

As the KVM host has a similar mechanism, is it possible to upstream it under
the following conditions?
     1). add a QEMU configuration option to enable or disable it, disabled by default.
     2). add some "TODO" comments near the code.
     3). ...


Anyway, thanks for your help.

Quan
  Alibaba Cloud

