All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rafael@kernel.org>
To: Artem Bityutskiy <dedekind1@gmail.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Linux PM Mailing List <linux-pm@vger.kernel.org>
Subject: Re: [PATCH 1/1] intel_idle: enable interrupts before C1 on Xeons
Date: Fri, 24 Sep 2021 18:45:52 +0200	[thread overview]
Message-ID: <CAJZ5v0gns1EunzRfgDAHDS22aeAXnG_3aiFX3Px=crtER00Gtg@mail.gmail.com> (raw)
In-Reply-To: <20210917072022.3234272-2-dedekind1@gmail.com>

On Fri, Sep 17, 2021 at 9:20 AM Artem Bityutskiy <dedekind1@gmail.com> wrote:
>
> From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
>
> Enable local interrupts before requesting C1 on the last two generations of
> Intel Xeon platforms: Sky Lake, Cascade Lake, Cooper Lake, Ice Lake.
> This decreases average C1 interrupt latency by about 5-10%, as measured with
> the 'wult' tool.
>
> The '->enter()' function of the driver enters C-states with local interrupts
> disabled by executing the 'monitor' and 'mwait' pair of instructions. If an
> interrupt happens, the CPU exits the C-state and continues executing
> instructions after 'mwait'. It does not jump to the interrupt handler, because
> local interrupts are disabled. The cpuidle subsystem enables interrupts a bit
> later, after doing some housekeeping.
>
> With this patch, we enable local interrupts before requesting C1. In this case,
> if the CPU wakes up because of an interrupt, it will jump to the interrupt
> handler right away. The cpuidle housekeeping will be done after the pending
> interrupt(s) are handled.
>
> Enabling interrupts before entering a C-state has measurable impact for faster
> C-states, like C1. Deeper, but slower C-states like C6 do not really benefit
> from this sort of change, because their latency is a lot higher comparing to
> the delay added by cpuidle housekeeping.
>
> This change was also tested with cyclictest and dbench. In case of Ice Lake,
> the average cyclictest latency decreased by 5.1%, and the average 'dbench'
> throughput increased by about 0.8%. Both tests were run for 4 hours with only
> C1 enabled (all other idle states, including 'POLL', were disabled). CPU
> frequency was pinned to HFM, and uncore frequency was pinned to the maximum
> value. The other platforms had similar single-digit percentage improvements.
>
> It is worth noting, that this patch affects 'cpuidle' statistics a tiny bit.
> Before this patch, C1 residency did not include the interrupt handling time, but
> with this patch, it will include it. This is similar to what happens in case of
> the 'POLL' state, which also runs with interrupts enabled.
>
> Suggested-by: Len Brown <len.brown@intel.com>
> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> ---
>  drivers/idle/intel_idle.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index e6c543b5ee1d..0b66e25c0e2d 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -88,6 +88,12 @@ static struct cpuidle_state *cpuidle_state_table __initdata;
>
>  static unsigned int mwait_substates __initdata;
>
> +/*
> + * Enable interrupts before entering the C-state. On some platforms and for
> + * some C-states, this may measurably decrease interrupt latency.
> + */
> +#define CPUIDLE_FLAG_IRQ_ENABLE                BIT(14)
> +
>  /*
>   * Enable this state by default even if the ACPI _CST does not list it.
>   */
> @@ -127,6 +133,9 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
>         unsigned long eax = flg2MWAIT(state->flags);
>         unsigned long ecx = 1; /* break on interrupt flag */
>
> +       if (state->flags & CPUIDLE_FLAG_IRQ_ENABLE)
> +               local_irq_enable();
> +
>         mwait_idle_with_hints(eax, ecx);
>
>         return index;
> @@ -698,7 +707,7 @@ static struct cpuidle_state skx_cstates[] __initdata = {
>         {
>                 .name = "C1",
>                 .desc = "MWAIT 0x00",
> -               .flags = MWAIT2flg(0x00),
> +               .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
>                 .exit_latency = 2,
>                 .target_residency = 2,
>                 .enter = &intel_idle,
> @@ -727,7 +736,7 @@ static struct cpuidle_state icx_cstates[] __initdata = {
>         {
>                 .name = "C1",
>                 .desc = "MWAIT 0x00",
> -               .flags = MWAIT2flg(0x00),
> +               .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
>                 .exit_latency = 1,
>                 .target_residency = 1,
>                 .enter = &intel_idle,
> --

Applied as 5.16 material, thanks!

      reply	other threads:[~2021-09-24 16:46 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-17  7:20 [PATCH 0/1] intel_idle: improve Xeon C1 interrupt response time Artem Bityutskiy
2021-09-17  7:20 ` [PATCH 1/1] intel_idle: enable interrupts before C1 on Xeons Artem Bityutskiy
2021-09-24 16:45   ` Rafael J. Wysocki [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAJZ5v0gns1EunzRfgDAHDS22aeAXnG_3aiFX3Px=crtER00Gtg@mail.gmail.com' \
    --to=rafael@kernel.org \
    --cc=dedekind1@gmail.com \
    --cc=linux-pm@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.