From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>,
	Michal Suchanek <msuchanek@suse.de>
Cc: linux-pm@vger.kernel.org, joedecke@de.ibm.com,
	linuxppc-dev@lists.ozlabs.org,
	"Gautham R. Shenoy" <ego@linux.vnet.ibm.com>,
	Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Subject: [PATCH v5 1/2] cpuidle/pseries: Fixup CEDE0 latency only for POWER10 onwards
Date: Mon, 19 Jul 2021 12:03:18 +0530
Message-ID: <1626676399-15975-2-git-send-email-ego@linux.vnet.ibm.com>
In-Reply-To: <1626676399-15975-1-git-send-email-ego@linux.vnet.ibm.com>

From: "Gautham R. Shenoy" <ego@linux.vnet.ibm.com>

Commit d947fb4c965c ("cpuidle: pseries: Fixup exit latency for
CEDE(0)") sets the exit latency of CEDE(0) based on the latency values
of the Extended CEDE states advertised by the platform.

On POWER9 LPARs, the firmware advertises a very low value of 2us for
the CEDE1 exit latency on a Dedicated LPAR. The latency advertised by
the PHYP hypervisor corresponds to the latency required to wake up from
the underlying hardware idle state. However, the wakeup latency from
the LPAR perspective should also include:

1. The time taken to transition the CPU from the Hypervisor into the
   LPAR post wakeup from the platform idle state.

2. The time taken to send the IPI from the source CPU (waker) to the
   idle target CPU (wakee).

1. can be measured via a timer idle test, where we queue a timer, say
for 1ms, and enter the CEDE state. When the timer fires, in the timer
handler we compute how much extra time over the expected 1ms we have
consumed. On a POWER9 LPAR the numbers are:

CEDE latency measured using a timer (numbers in ns)
N       Min     Median  Avg      90%ile  99%ile  Max     Stddev
400     2601    5677    5668.74  5917    6413    9299    455.01

1. and 2. combined can be determined by an IPI latency test, where we
send an IPI to an idle CPU and in the handler compute the time
difference between when the IPI was sent and when the handler ran. We
see the following numbers on a POWER9 LPAR:

CEDE latency measured using an IPI (numbers in ns)
N       Min     Median  Avg      90%ile  99%ile  Max     Stddev
400     711     7564    7369.43  8559    9514    9698    1200.01

If we take the 99th percentile latency measured using the IPI as the
wakeup latency, the value is 9.5us, which is in the ballpark of the
default value of 10us.

Hence, set the exit latency of CEDE(0) based on the latency values
advertised by the platform only from POWER10 onwards. The values
advertised on POWER10 platforms are more realistic and informed by the
latency measurements. For earlier platforms, stick to the default
value of 10us. The fix was suggested by Michael Ellerman.

Reported-by: Enrico Joedecke <joedecke@de.ibm.com>
Fixes: commit d947fb4c965c ("cpuidle: pseries: Fixup exit latency for
CEDE(0)")
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
---
 drivers/cpuidle/cpuidle-pseries.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index a2b5c6f..e592280d 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -419,7 +419,21 @@ static int pseries_idle_probe(void)
 			cpuidle_state_table = shared_states;
 			max_idle_state = ARRAY_SIZE(shared_states);
 		} else {
-			fixup_cede0_latency();
+			/*
+			 * Use firmware provided latency values
+			 * starting with POWER10 platforms. In the
+			 * case that we are running on a POWER10
+			 * platform but in an earlier compat mode, we
+			 * can still use the firmware provided values.
+			 *
+			 * However, on platforms prior to POWER10, we
+			 * cannot rely on the accuracy of the firmware
+			 * provided latency values. On such platforms,
+			 * go with the conservative default estimate
+			 * of 10us.
+			 */
+			if (cpu_has_feature(CPU_FTR_ARCH_31) || pvr_version_is(PVR_POWER10))
+				fixup_cede0_latency();
 			cpuidle_state_table = dedicated_states;
 			max_idle_state = NR_DEDICATED_STATES;
 		}
-- 
1.9.4
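
The gating rule in the hunk above can be modelled outside the kernel. The
following is only a minimal userspace sketch, not the driver code: the
struct platform_info type, the cede0_exit_latency_us() helper and the
DEFAULT_CEDE0_LATENCY_US macro are hypothetical names introduced for
illustration, and the fw_latency_us argument stands in for whatever value
fixup_cede0_latency() would derive from the firmware-advertised Extended
CEDE latencies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Conservative default CEDE(0) exit latency named in the patch description. */
#define DEFAULT_CEDE0_LATENCY_US 10u

/*
 * Hypothetical stand-in for the two kernel checks in the patch:
 *   arch_31        ~ cpu_has_feature(CPU_FTR_ARCH_31)
 *   pvr_is_power10 ~ pvr_version_is(PVR_POWER10), which stays true when a
 *                    POWER10 machine runs in an earlier compat mode.
 */
struct platform_info {
	bool arch_31;
	bool pvr_is_power10;
};

/*
 * Return the exit latency (in us) to use for CEDE(0).
 * fw_latency_us is an assumed input: the value derived from the
 * firmware-advertised latencies. It is trusted only on POWER10 or later;
 * earlier platforms keep the 10us default.
 */
static uint32_t cede0_exit_latency_us(const struct platform_info *p,
				      uint32_t fw_latency_us)
{
	if (p->arch_31 || p->pvr_is_power10)
		return fw_latency_us;

	return DEFAULT_CEDE0_LATENCY_US;
}
```

With both flags false (a pre-POWER10 machine) and the 2us value quoted
above for POWER9, the helper returns the 10us default. With either flag
set, the firmware-derived value is passed through unchanged.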