* [PATCH v1 1/3] intel_idle: add SPR support
@ 2022-03-02 8:15 Artem Bityutskiy
2022-03-02 8:15 ` [PATCH v1 2/3] intel_idle: add 'preferred_cstates' module argument Artem Bityutskiy
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Artem Bityutskiy @ 2022-03-02 8:15 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Linux PM Mailing List, chang.seok.bae
From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Add Sapphire Rapids Xeon support.
Up until very recently, the C1 and C1E C-states were independent, but this
has changed in some new chips, including Sapphire Rapids Xeon (SPR). In these
chips the C1 and C1E states cannot be enabled at the same time. The "C1E
promotion" bit in 'MSR_IA32_POWER_CTL' also has its semantics changed a bit.
Here are the C1, C1E, and "C1E promotion" bit rules on Xeons before SPR.
1. If C1E promotion bit is disabled.
a. C1 requests end up with C1 C-state.
b. C1E requests end up with C1E C-state.
2. If C1E promotion bit is enabled.
a. C1 requests end up with C1E C-state.
b. C1E requests end up with C1E C-state.
Here are the C1, C1E, and "C1E promotion" bit rules on Sapphire Rapids Xeon.
1. If C1E promotion bit is disabled.
a. C1 requests end up with C1 C-state.
b. C1E requests end up with C1 C-state.
2. If C1E promotion bit is enabled.
a. C1 requests end up with C1E C-state.
b. C1E requests end up with C1E C-state.
Before SPR Xeon, the 'intel_idle' driver was disabling C1E promotion and was
exposing C1 and C1E as independent C-states. But on SPR, C1 and C1E cannot be
enabled at the same time.
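To illustrate the two rule sets above, here is a minimal user-space C sketch.
It is not part of the patch: the enum and both function names are hypothetical,
and it models only which C-state a request ends up in for a given value of the
"C1E promotion" bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two shallow C-states discussed above. */
enum cstate { C1, C1E };

/*
 * Xeons before SPR: the promotion bit only affects C1 requests,
 * C1E requests always end up in C1E.
 */
static enum cstate resolve_pre_spr(enum cstate requested, bool c1e_promotion)
{
	if (requested == C1 && c1e_promotion)
		return C1E;
	return requested;
}

/*
 * SPR: the promotion bit alone decides the outcome, so only one of
 * C1 and C1E is reachable at any given time.
 */
static enum cstate resolve_spr(enum cstate requested, bool c1e_promotion)
{
	assert(requested == C1 || requested == C1E);
	(void)requested;	/* the result does not depend on the request */
	return c1e_promotion ? C1E : C1;
}
```

In this model resolve_spr(C1E, false) yields C1, which is the SPR-specific
case (rule 1b) that makes it pointless to expose both states at once.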
This patch adds both C1 and C1E states. However, C1E is marked with the
"CPUIDLE_FLAG_UNUSABLE" flag, which means that it won't be registered by
default. The C1E promotion bit will be cleared, which means that by default
only C1 and C6 will be registered on SPR.
The next patch will add an option for enabling C1E and disabling C1 on SPR.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
---
drivers/idle/intel_idle.c | 47 +++++++++++++++++++++++++++++++++++++++
1 file changed, 47 insertions(+)
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 0b66e25c0e2d..1c7c25909e54 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -761,6 +761,46 @@ static struct cpuidle_state icx_cstates[] __initdata = {
.enter = NULL }
};
+/*
+ * On Sapphire Rapids Xeon C1 has to be disabled if C1E is enabled, and vice
+ * versa. On SPR C1E is enabled only if "C1E promotion" bit is set in
+ * MSR_IA32_POWER_CTL. But in this case there is effectively no C1, because C1
+ * requests are promoted to C1E. If the "C1E promotion" bit is cleared, then
+ * both C1 and C1E requests end up with C1, so there is effectively no C1E.
+ *
+ * By default we enable C1 and disable C1E by marking it with
+ * 'CPUIDLE_FLAG_UNUSABLE'.
+ */
+static struct cpuidle_state spr_cstates[] __initdata = {
+	{
+		.name = "C1",
+		.desc = "MWAIT 0x00",
+		.flags = MWAIT2flg(0x00),
+		.exit_latency = 1,
+		.target_residency = 1,
+		.enter = &intel_idle,
+		.enter_s2idle = intel_idle_s2idle, },
+	{
+		.name = "C1E",
+		.desc = "MWAIT 0x01",
+		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE | \
+					   CPUIDLE_FLAG_UNUSABLE,
+		.exit_latency = 2,
+		.target_residency = 4,
+		.enter = &intel_idle,
+		.enter_s2idle = intel_idle_s2idle, },
+	{
+		.name = "C6",
+		.desc = "MWAIT 0x20",
+		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+		.exit_latency = 290,
+		.target_residency = 800,
+		.enter = &intel_idle,
+		.enter_s2idle = intel_idle_s2idle, },
+	{
+		.enter = NULL }
+};
+
static struct cpuidle_state atom_cstates[] __initdata = {
{
.name = "C1E",
@@ -1104,6 +1144,12 @@ static const struct idle_cpu idle_cpu_icx __initconst = {
.use_acpi = true,
};
+static const struct idle_cpu idle_cpu_spr __initconst = {
+	.state_table = spr_cstates,
+	.disable_promotion_to_c1e = true,
+	.use_acpi = true,
+};
+
static const struct idle_cpu idle_cpu_avn __initconst = {
.state_table = avn_cstates,
.disable_promotion_to_c1e = true,
@@ -1166,6 +1212,7 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, &idle_cpu_skx),
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, &idle_cpu_icx),
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, &idle_cpu_icx),
+ X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &idle_cpu_spr),
X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNL, &idle_cpu_knl),
X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNM, &idle_cpu_knl),
X86_MATCH_INTEL_FAM6_MODEL(ATOM_GOLDMONT, &idle_cpu_bxt),
--
2.31.1
* [PATCH v1 2/3] intel_idle: add 'preferred_cstates' module argument
2022-03-02 8:15 [PATCH v1 1/3] intel_idle: add SPR support Artem Bityutskiy
@ 2022-03-02 8:15 ` Artem Bityutskiy
2022-03-02 8:16 ` [PATCH v1 3/3] intel_idle: add core C6 optimization for SPR Artem Bityutskiy
2022-03-04 18:56 ` [PATCH v1 1/3] intel_idle: add SPR support Rafael J. Wysocki
2 siblings, 0 replies; 4+ messages in thread
From: Artem Bityutskiy @ 2022-03-02 8:15 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Linux PM Mailing List, chang.seok.bae
From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
On Sapphire Rapids Xeon (SPR) the C1 and C1E states are mutually exclusive:
only one of them can be enabled. By default, the 'intel_idle' driver enables
C1 and disables C1E. However, some users prefer to use C1E instead of C1,
because it saves more energy.
This patch adds a new module parameter ('preferred_cstates') for enabling C1E
and disabling C1. Here is the idea behind it.
1. This option has an effect only for "mutually exclusive" C-states like C1
and C1E on SPR.
2. It does not have any effect on independent C-states, which do not require
other C-states to be disabled (most states on most platforms as of today).
3. For mutually exclusive C-states, the 'intel_idle' driver always has a
reasonable default, such as enabling C1 on SPR by default. On other
platforms, the default may be different.
4. Users can override the default using the 'preferred_cstates' parameter.
5. The parameter accepts the preferred C-states bit-mask, similarly to the
existing 'states_off' parameter.
6. This parameter is not limited to C1/C1E, and leaves room for supporting
other mutually exclusive C-states, if they come in the future.
Today 'intel_idle' can only be compiled-in, which means that on SPR, in order
to disable C1 and enable C1E, users should boot with the following kernel
argument: intel_idle.preferred_cstates=4
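As a rough illustration of why the value is 4, here is a minimal user-space C
sketch of the mask decoding. It is not the driver code: the macro and function
names are hypothetical, and it assumes that bit N of the mask refers to idle
state N with state 0 being the polling state, so that C1 is bit 1 and C1E is
bit 2 on SPR.

```c
#include <assert.h>
#include <stdbool.h>

#define BIT(n) (1U << (n))

/* Assumption: state 0 is the polling state, so C1 is 1 and C1E is 2. */
#define SPR_C1_BIT	BIT(1)
#define SPR_C1E_BIT	BIT(2)

/* preferred_cstates=4 sets exactly the (assumed) C1E bit. */
static_assert(SPR_C1E_BIT == 4, "preferred_cstates=4 selects C1E");

/*
 * Hypothetical helper: returns true when C1E should replace C1.
 * Asking for both states keeps the default (C1), because the two
 * states cannot be enabled at the same time.
 */
static bool spr_prefer_c1e(unsigned int preferred_mask)
{
	if (!(preferred_mask & SPR_C1E_BIT))
		return false;
	if (preferred_mask & SPR_C1_BIT)
		return false;
	return true;
}
```

Under these assumptions a mask of 4 switches to C1E, while 0, 2 and 6 all
leave the default C1 in place.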
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
---
drivers/idle/intel_idle.c | 46 +++++++++++++++++++++++++++++++++++++++
1 file changed, 46 insertions(+)
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 1c7c25909e54..b2688c326522 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -64,6 +64,7 @@ static struct cpuidle_driver intel_idle_driver = {
/* intel_idle.max_cstate=0 disables driver */
static int max_cstate = CPUIDLE_STATE_MAX - 1;
static unsigned int disabled_states_mask;
+static unsigned int preferred_states_mask;
static struct cpuidle_device __percpu *intel_idle_cpuidle_devices;
@@ -1400,6 +1401,8 @@ static inline void intel_idle_init_cstates_acpi(struct cpuidle_driver *drv) { }
static inline bool intel_idle_off_by_default(u32 mwait_hint) { return false; }
#endif /* !CONFIG_ACPI_PROCESSOR_CSTATE */
+static void c1e_promotion_enable(void);
+
/**
* ivt_idle_state_table_update - Tune the idle states table for Ivy Town.
*
@@ -1570,6 +1573,26 @@ static void __init skx_idle_state_table_update(void)
}
}
+/**
+ * spr_idle_state_table_update - Adjust Sapphire Rapids idle states table.
+ */
+static void __init spr_idle_state_table_update(void)
+{
+	/* Check if user prefers C1E over C1. */
+	if (preferred_states_mask & BIT(2)) {
+		if (preferred_states_mask & BIT(1))
+			/* Both can't be enabled, stick to the defaults. */
+			return;
+
+		spr_cstates[0].flags |= CPUIDLE_FLAG_UNUSABLE;
+		spr_cstates[1].flags &= ~CPUIDLE_FLAG_UNUSABLE;
+
+		/* Enable C1E using the "C1E promotion" bit. */
+		c1e_promotion_enable();
+		disable_promotion_to_c1e = false;
+	}
+}
+
static bool __init intel_idle_verify_cstate(unsigned int mwait_hint)
{
unsigned int mwait_cstate = MWAIT_HINT2CSTATE(mwait_hint) + 1;
@@ -1604,6 +1627,9 @@ static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
case INTEL_FAM6_SKYLAKE_X:
skx_idle_state_table_update();
break;
+ case INTEL_FAM6_SAPPHIRERAPIDS_X:
+ spr_idle_state_table_update();
+ break;
}
for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) {
@@ -1676,6 +1702,15 @@ static void auto_demotion_disable(void)
wrmsrl(MSR_PKG_CST_CONFIG_CONTROL, msr_bits);
}
+static void c1e_promotion_enable(void)
+{
+	unsigned long long msr_bits;
+
+	rdmsrl(MSR_IA32_POWER_CTL, msr_bits);
+	msr_bits |= 0x2;
+	wrmsrl(MSR_IA32_POWER_CTL, msr_bits);
+}
+
static void c1e_promotion_disable(void)
{
unsigned long long msr_bits;
@@ -1845,3 +1880,14 @@ module_param(max_cstate, int, 0444);
*/
module_param_named(states_off, disabled_states_mask, uint, 0444);
MODULE_PARM_DESC(states_off, "Mask of disabled idle states");
+/*
+ * Some platforms come with mutually exclusive C-states, so that if one is
+ * enabled, the other C-states must not be used. Example: C1 and C1E on
+ * Sapphire Rapids platform. This parameter allows for selecting the
+ * preferred C-states among the groups of mutually exclusive C-states - the
+ * selected C-states will be registered, the other C-states from the mutually
+ * exclusive group won't be registered. If the platform has no mutually
+ * exclusive C-states, this parameter has no effect.
+ */
+module_param_named(preferred_cstates, preferred_states_mask, uint, 0444);
+MODULE_PARM_DESC(preferred_cstates, "Mask of preferred idle states");
--
2.31.1
* [PATCH v1 3/3] intel_idle: add core C6 optimization for SPR
2022-03-02 8:15 [PATCH v1 1/3] intel_idle: add SPR support Artem Bityutskiy
2022-03-02 8:15 ` [PATCH v1 2/3] intel_idle: add 'preferred_cstates' module argument Artem Bityutskiy
@ 2022-03-02 8:16 ` Artem Bityutskiy
2022-03-04 18:56 ` [PATCH v1 1/3] intel_idle: add SPR support Rafael J. Wysocki
2 siblings, 0 replies; 4+ messages in thread
From: Artem Bityutskiy @ 2022-03-02 8:16 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Linux PM Mailing List, chang.seok.bae
From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Add a Sapphire Rapids Xeon C6 optimization, similar to what we have for Skylake
Xeon: if package C6 is disabled, adjust C6 exit latency and target residency to
match core C6 values, instead of using the default package C6 values.
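The selection can be sketched in user-space C as follows. This is only a
model of the logic, not the driver code: the struct, the mask macro and the
function name are hypothetical, while the numbers (in microseconds) and the
"limit below 2" test are taken from this series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two tunables of the C6 state. */
struct c6_params {
	unsigned int exit_latency;	/* microseconds */
	unsigned int target_residency;	/* microseconds */
};

/* Hypothetical name for the low 3 bits holding the package C-state limit. */
#define PKG_CST_LIMIT_MASK 0x7ULL

/*
 * Pick the C6 numbers from the raw MSR_PKG_CST_CONFIG_CONTROL value:
 * package C6 defaults (290/800) when the limit allows PC6, core C6
 * values (190/600) when the limit is below 2.
 */
static struct c6_params spr_c6_params(uint64_t pkg_cst_config)
{
	struct c6_params p = { .exit_latency = 290, .target_residency = 800 };

	if ((pkg_cst_config & PKG_CST_LIMIT_MASK) < 2) {
		p.exit_latency = 190;
		p.target_residency = 600;
	}

	/* Sanity check: the exit latency must not exceed the residency. */
	assert(p.exit_latency <= p.target_residency);
	return p;
}
```

Only the low three bits matter in this model, so a raw value such as 0x8
(limit 0) still selects the core C6 numbers.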
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
---
drivers/idle/intel_idle.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index b2688c326522..e385ddf15b32 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -1578,6 +1578,8 @@ static void __init skx_idle_state_table_update(void)
*/
static void __init spr_idle_state_table_update(void)
{
+ unsigned long long msr;
+
/* Check if user prefers C1E over C1. */
if (preferred_states_mask & BIT(2)) {
if (preferred_states_mask & BIT(1))
@@ -1591,6 +1593,19 @@ static void __init spr_idle_state_table_update(void)
c1e_promotion_enable();
disable_promotion_to_c1e = false;
}
+
+	/*
+	 * By default, the C6 state assumes the worst-case scenario of package
+	 * C6. However, if PC6 is disabled, we update the numbers to match
+	 * core C6.
+	 */
+	rdmsrl(MSR_PKG_CST_CONFIG_CONTROL, msr);
+
+	/* Limit values of 2 and above allow for PC6. */
+	if ((msr & 0x7) < 2) {
+		spr_cstates[2].exit_latency = 190;
+		spr_cstates[2].target_residency = 600;
+	}
}
static bool __init intel_idle_verify_cstate(unsigned int mwait_hint)
--
2.31.1
* Re: [PATCH v1 1/3] intel_idle: add SPR support
2022-03-02 8:15 [PATCH v1 1/3] intel_idle: add SPR support Artem Bityutskiy
2022-03-02 8:15 ` [PATCH v1 2/3] intel_idle: add 'preferred_cstates' module argument Artem Bityutskiy
2022-03-02 8:16 ` [PATCH v1 3/3] intel_idle: add core C6 optimization for SPR Artem Bityutskiy
@ 2022-03-04 18:56 ` Rafael J. Wysocki
2 siblings, 0 replies; 4+ messages in thread
From: Rafael J. Wysocki @ 2022-03-04 18:56 UTC (permalink / raw)
To: Artem Bityutskiy; +Cc: Rafael J. Wysocki, Linux PM Mailing List, Chang S. Bae
On Wed, Mar 2, 2022 at 9:16 AM Artem Bityutskiy <dedekind1@gmail.com> wrote:
>
> From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
>
> Add Sapphire Rapids Xeon support.
>
> Up until very recently, the C1 and C1E C-states were independent, but this
> has changed in some new chips, including Sapphire Rapids Xeon (SPR). In these
> chips the C1 and C1E states cannot be enabled at the same time. The "C1E
> promotion" bit in 'MSR_IA32_POWER_CTL' also has its semantics changed a bit.
>
> Here are the C1, C1E, and "C1E promotion" bit rules on Xeons before SPR.
>
> 1. If C1E promotion bit is disabled.
> a. C1 requests end up with C1 C-state.
> b. C1E requests end up with C1E C-state.
> 2. If C1E promotion bit is enabled.
> a. C1 requests end up with C1E C-state.
> b. C1E requests end up with C1E C-state.
>
> Here are the C1, C1E, and "C1E promotion" bit rules on Sapphire Rapids Xeon.
> 1. If C1E promotion bit is disabled.
> a. C1 requests end up with C1 C-state.
> b. C1E requests end up with C1 C-state.
> 2. If C1E promotion bit is enabled.
> a. C1 requests end up with C1E C-state.
> b. C1E requests end up with C1E C-state.
>
> Before SPR Xeon, the 'intel_idle' driver was disabling C1E promotion and was
> exposing C1 and C1E as independent C-states. But on SPR, C1 and C1E cannot be
> enabled at the same time.
>
> This patch adds both C1 and C1E states. However, C1E is marked with the
> "CPUIDLE_FLAG_UNUSABLE" flag, which means that it won't be registered by
> default. The C1E promotion bit will be cleared, which means that by default
> only C1 and C6 will be registered on SPR.
>
> The next patch will add an option for enabling C1E and disabling C1 on SPR.
>
> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> ---
> drivers/idle/intel_idle.c | 47 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 47 insertions(+)
>
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index 0b66e25c0e2d..1c7c25909e54 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -761,6 +761,46 @@ static struct cpuidle_state icx_cstates[] __initdata = {
> .enter = NULL }
> };
>
> +/*
> + * On Sapphire Rapids Xeon C1 has to be disabled if C1E is enabled, and vice
> + * versa. On SPR C1E is enabled only if "C1E promotion" bit is set in
> + * MSR_IA32_POWER_CTL. But in this case there is effectively no C1, because C1
> + * requests are promoted to C1E. If the "C1E promotion" bit is cleared, then
> + * both C1 and C1E requests end up with C1, so there is effectively no C1E.
> + *
> + * By default we enable C1 and disable C1E by marking it with
> + * 'CPUIDLE_FLAG_UNUSABLE'.
> + */
> +static struct cpuidle_state spr_cstates[] __initdata = {
> + {
> + .name = "C1",
> + .desc = "MWAIT 0x00",
> + .flags = MWAIT2flg(0x00),
> + .exit_latency = 1,
> + .target_residency = 1,
> + .enter = &intel_idle,
> + .enter_s2idle = intel_idle_s2idle, },
> + {
> + .name = "C1E",
> + .desc = "MWAIT 0x01",
> + .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE | \
> + CPUIDLE_FLAG_UNUSABLE,
> + .exit_latency = 2,
> + .target_residency = 4,
> + .enter = &intel_idle,
> + .enter_s2idle = intel_idle_s2idle, },
> + {
> + .name = "C6",
> + .desc = "MWAIT 0x20",
> + .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
> + .exit_latency = 290,
> + .target_residency = 800,
> + .enter = &intel_idle,
> + .enter_s2idle = intel_idle_s2idle, },
> + {
> + .enter = NULL }
> +};
> +
> static struct cpuidle_state atom_cstates[] __initdata = {
> {
> .name = "C1E",
> @@ -1104,6 +1144,12 @@ static const struct idle_cpu idle_cpu_icx __initconst = {
> .use_acpi = true,
> };
>
> +static const struct idle_cpu idle_cpu_spr __initconst = {
> + .state_table = spr_cstates,
> + .disable_promotion_to_c1e = true,
> + .use_acpi = true,
> +};
> +
> static const struct idle_cpu idle_cpu_avn __initconst = {
> .state_table = avn_cstates,
> .disable_promotion_to_c1e = true,
> @@ -1166,6 +1212,7 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
> X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, &idle_cpu_skx),
> X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, &idle_cpu_icx),
> X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, &idle_cpu_icx),
> + X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &idle_cpu_spr),
> X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNL, &idle_cpu_knl),
> X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNM, &idle_cpu_knl),
> X86_MATCH_INTEL_FAM6_MODEL(ATOM_GOLDMONT, &idle_cpu_bxt),
> --
Applied as 5.18 material along with the rest of the series.
Thanks!