Linux-PM Archive on lore.kernel.org
From: Nicholas Piggin <npiggin@gmail.com>
To: Abhishek Goel <huntbag@linux.vnet.ibm.com>,
	linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org
Cc: daniel.lezcano@linaro.org, dja@axtens.net,
	ego@linux.vnet.ibm.com, mpe@ellerman.id.au, rjw@rjwysocki.net
Subject: Re: [PATCH v2 1/1] cpuidle-powernv : forced wakeup for stop states
Date: Wed, 19 Jun 2019 14:23:03 +1000
Message-ID: <1560917320.mk5nn6r8jw.astroid@bobo.none>
In-Reply-To: <20190617095648.18847-2-huntbag@linux.vnet.ibm.com>

Abhishek Goel wrote on June 17, 2019 7:56 pm:
> Currently, the cpuidle governors determine what idle state a idling CPU
> should enter into based on heuristics that depend on the idle history on
> that CPU. Given that no predictive heuristic is perfect, there are cases
> where the governor predicts a shallow idle state, hoping that the CPU will
> be busy soon. However, if no new workload is scheduled on that CPU in the
> near future, the CPU may end up in the shallow state.
> 
> This is problematic, when the predicted state in the aforementioned
> scenario is a shallow stop state on a tickless system. As we might get
> stuck into shallow states for hours, in absence of ticks or interrupts.
> 
> To address this, We forcefully wakeup the cpu by setting the
> decrementer. The decrementer is set to a value that corresponds with the
> residency of the next available state. Thus firing up a timer that will
> forcefully wakeup the cpu. Few such iterations will essentially train the
> governor to select a deeper state for that cpu, as the timer here
> corresponds to the next available cpuidle state residency. Thus, cpu will
> eventually end up in the deepest possible state.
> 
> Signed-off-by: Abhishek Goel <huntbag@linux.vnet.ibm.com>
> ---
> 
> Auto-promotion
>  v1 : started as auto promotion logic for cpuidle states in generic
> driver
>  v2 : Removed timeout_needed and rebased the code to upstream kernel
> Forced-wakeup
>  v1 : New patch with name of forced wakeup started
>  v2 : Extending the forced wakeup logic for all states. Setting the
> decrementer instead of queuing up a hrtimer to implement the logic.
> 
>  drivers/cpuidle/cpuidle-powernv.c | 38 +++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
> index 84b1ebe212b3..bc9ca18ae7e3 100644
> --- a/drivers/cpuidle/cpuidle-powernv.c
> +++ b/drivers/cpuidle/cpuidle-powernv.c
> @@ -46,6 +46,26 @@ static struct stop_psscr_table stop_psscr_table[CPUIDLE_STATE_MAX] __read_mostly
>  static u64 default_snooze_timeout __read_mostly;
>  static bool snooze_timeout_en __read_mostly;
>  
> +static u64 forced_wakeup_timeout(struct cpuidle_device *dev,
> +				 struct cpuidle_driver *drv,
> +				 int index)
> +{
> +	int i;
> +
> +	for (i = index + 1; i < drv->state_count; i++) {
> +		struct cpuidle_state *s = &drv->states[i];
> +		struct cpuidle_state_usage *su = &dev->states_usage[i];
> +
> +		if (s->disabled || su->disable)
> +			continue;
> +
> +		return (s->target_residency + 2 * s->exit_latency) *
> +			tb_ticks_per_usec;
> +	}
> +
> +	return 0;
> +}

It would be nice to not have this kind of loop iteration in the
idle fast path. Can we add a flag or something to the idle state?
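
One possible shape for this, as a rough userspace sketch rather than
kernel code: compute the timeout for each state once, off the fast
path, and let the idle path do a single table lookup. Everything named
here (struct model_state, forced_wakeup_tb[], precompute_forced_wakeup(),
forced_wakeup_timeout_fast()) is hypothetical and only mirrors the
fields used in the patch. It also assumes the table is recomputed
whenever a state is enabled or disabled, which the sketch does not show.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_STATE_MAX 10

/* Simplified, hypothetical stand-in for struct cpuidle_state. */
struct model_state {
	unsigned int target_residency;	/* microseconds */
	unsigned int exit_latency;	/* microseconds */
	int disabled;
};

/* Per-state timeout in timebase ticks; 0 means "no forced wakeup". */
static uint64_t forced_wakeup_tb[MODEL_STATE_MAX];

/*
 * Slow path: walk from the deepest state to the shallowest, carrying
 * the timeout of the nearest enabled deeper state in 'next'.
 */
static void precompute_forced_wakeup(const struct model_state *states,
				     int count, uint64_t ticks_per_usec)
{
	uint64_t next = 0;
	int i;

	assert(count >= 0 && count <= MODEL_STATE_MAX);

	for (i = count - 1; i >= 0; i--) {
		forced_wakeup_tb[i] = next;
		if (!states[i].disabled)
			next = ((uint64_t)states[i].target_residency +
				2ULL * states[i].exit_latency) *
				ticks_per_usec;
	}
}

/* Fast path: one array load, no iteration. */
static inline uint64_t forced_wakeup_timeout_fast(int index)
{
	return forced_wakeup_tb[index];
}
```

The per-device disable bit (su->disable in the patch) can change at
runtime, so a real version would need some way to refresh the table
when that happens.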

> +
>  static u64 get_snooze_timeout(struct cpuidle_device *dev,
>  			      struct cpuidle_driver *drv,
>  			      int index)
> @@ -144,8 +164,26 @@ static int stop_loop(struct cpuidle_device *dev,
>  		     struct cpuidle_driver *drv,
>  		     int index)
>  {
> +	u64 dec_expiry_tb, dec, timeout_tb, forced_wakeup;
> +
> +	dec = mfspr(SPRN_DEC);
> +	timeout_tb = forced_wakeup_timeout(dev, drv, index);
> +	forced_wakeup = 0;
> +
> +	if (timeout_tb && timeout_tb < dec) {
> +		forced_wakeup = 1;
> +		dec_expiry_tb = mftb() + dec;
> +	}

The compiler probably can't optimise away the SPR manipulations so try
to avoid them if possible.

> +
> +	if (forced_wakeup)
> +		mtspr(SPRN_DEC, timeout_tb);

This should just be put in the above 'if'.
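
A minimal userspace model of the merged form, assuming a made-up
struct cpu_model whose fields stand in for SPRN_DEC and the timebase
(the real code would use mfspr()/mtspr()/mftb() directly). The helper
name arm_forced_wakeup() is hypothetical as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU timer registers. */
struct cpu_model {
	uint64_t dec;	/* models SPRN_DEC: ticks until the next timer event */
	uint64_t tb;	/* models the timebase value returned by mftb() */
};

/*
 * Arm the forced wakeup in a single branch. Returns true if the
 * decrementer was shortened; *dec_expiry_tb then holds the timebase
 * value at which the original timer event was due.
 */
static bool arm_forced_wakeup(struct cpu_model *cpu, uint64_t timeout_tb,
			      uint64_t *dec_expiry_tb)
{
	uint64_t dec = cpu->dec;	/* one read, models mfspr(SPRN_DEC) */

	if (timeout_tb && timeout_tb < dec) {
		*dec_expiry_tb = cpu->tb + dec;
		cpu->dec = timeout_tb;	/* models mtspr(SPRN_DEC, timeout_tb) */
		return true;
	}
	return false;
}
```

With this shape the write to the decrementer and the bookkeeping live
in the same branch, and the return value replaces the separate
forced_wakeup flag.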

> +
>  	power9_idle_type(stop_psscr_table[index].val,
>  			 stop_psscr_table[index].mask);
> +
> +	if (forced_wakeup)
> +		mtspr(SPRN_DEC, dec_expiry_tb - mftb());

This will sometimes go negative and result in another timer interrupt.
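
To make the arithmetic concrete, here is a small userspace sketch with
hypothetical helper names (dec_remaining(), dec_restore_is_safe()). It
only models the subtraction from the patch: once the timebase has
passed dec_expiry_tb, the value that would be written back is negative.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Ticks left until expiry_tb, as a signed value. A negative result
 * means the original timer event was already due at now_tb.
 */
static int64_t dec_remaining(uint64_t expiry_tb, uint64_t now_tb)
{
	if (now_tb > expiry_tb)
		return -(int64_t)(now_tb - expiry_tb);
	return (int64_t)(expiry_tb - now_tb);
}

/* True only when a positive number of ticks remains to be restored. */
static bool dec_restore_is_safe(uint64_t expiry_tb, uint64_t now_tb)
{
	return dec_remaining(expiry_tb, now_tb) > 0;
}
```

Any restore path therefore needs a check along these lines before
writing the decrementer.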

It also breaks irq work (which can be set here by machine check, I
believe).

May need to implement some timer code to do this for you.

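static void reset_dec_after_idle(void)
{
	u64 now;
	u64 *next_tb;

	/* Don't reprogram the decrementer while irq work is pending. */
	if (test_irq_work_pending())
		return;
	now = mftb();
	next_tb = this_cpu_ptr(&decrementers_next_tb);

	/* Next event is already due: never write a negative value. */
	if (now >= *next_tb)
		return;
	set_dec(*next_tb - now);
	/* irq work may have been queued since the first check. */
	if (test_irq_work_pending())
		set_dec(1);
}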

Something vaguely like that. See timer_interrupt().

Thanks,
Nick


Thread overview: 6+ messages
2019-06-17  9:56 [PATCH v2 0/1] Forced-wakeup for stop states on Powernv Abhishek Goel
2019-06-17  9:56 ` [PATCH v2 1/1] cpuidle-powernv : forced wakeup for stop states Abhishek Goel
2019-06-19  4:23   ` Nicholas Piggin [this message]
2019-06-19  9:08     ` Abhishek
2019-06-19 10:09       ` Nicholas Piggin
2019-06-26  9:09         ` Abhishek
