* [PATCH v2 0/5] x86/mwait-idle: updates from Linux (and more)
@ 2022-01-20 14:00 Jan Beulich
  2022-01-20 14:01 ` [PATCH v2 1/5] x86/mwait-idle: stop exposing platform acronyms Jan Beulich
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Jan Beulich @ 2022-01-20 14:00 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

Besides the prerequisites requested for the remaining patch from v1,
there's one more thing to consider pulling in (RFC for now), and one
further custom change I'd like us to consider making (potentially
going even farther than presented here).

1: stop exposing platform acronyms
2: switch to using bool
3: add SnowRidge C-state table
4: enable interrupts before C1 on Xeons
5: squash stats update when not actually entering C-state

Jan




* [PATCH v2 1/5] x86/mwait-idle: stop exposing platform acronyms
  2022-01-20 14:00 [PATCH v2 0/5] x86/mwait-idle: updates from Linux (and more) Jan Beulich
@ 2022-01-20 14:01 ` Jan Beulich
  2022-01-20 14:31   ` Roger Pau Monné
  2022-01-20 14:02 ` [PATCH v2 2/5] x86/mwait-idle: switch to using bool Jan Beulich
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2022-01-20 14:01 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

This follows Linux commit de09cdd09fa1 ("intel_idle: stop exposing
platform acronyms in sysfs"), but their main justifications (sysfs
exposure and similarity with acpi-idle) don't apply to us. The field is
only used in a single printk() right now, but having the platform tags
there isn't useful either.

Requested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.
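
For illustration, the kind of log line affected (a sketch only - the
exact printk() in mwait-idle.c may be worded differently, and "cstate"
here is just an assumed index into the table):

	printk(PREFIX "state %s, MWAIT hint %#x\n",
	       cpuidle_state_table[cstate].name,
	       flg2MWAIT(cpuidle_state_table[cstate].flags));

With the platform tags dropped, such a line then reports e.g. "C6"
rather than "C6-SKX".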

--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -134,25 +134,25 @@ static const struct cpuidle_state {
  */
 static const struct cpuidle_state nehalem_cstates[] = {
 	{
-		.name = "C1-NHM",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 3,
 		.target_residency = 6,
 	},
 	{
-		.name = "C1E-NHM",
+		.name = "C1E",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 20,
 	},
 	{
-		.name = "C3-NHM",
+		.name = "C3",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 20,
 		.target_residency = 80,
 	},
 	{
-		.name = "C6-NHM",
+		.name = "C6",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 200,
 		.target_residency = 800,
@@ -162,31 +162,31 @@ static const struct cpuidle_state nehale
 
 static const struct cpuidle_state snb_cstates[] = {
 	{
-		.name = "C1-SNB",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 2,
 		.target_residency = 2,
 	},
 	{
-		.name = "C1E-SNB",
+		.name = "C1E",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 20,
 	},
 	{
-		.name = "C3-SNB",
+		.name = "C3",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 80,
 		.target_residency = 211,
 	},
 	{
-		.name = "C6-SNB",
+		.name = "C6",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 104,
 		.target_residency = 345,
 	},
 	{
-		.name = "C7-SNB",
+		.name = "C7",
 		.flags = MWAIT2flg(0x30) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 109,
 		.target_residency = 345,
@@ -196,31 +196,31 @@ static const struct cpuidle_state snb_cs
 
 static const struct cpuidle_state byt_cstates[] = {
 	{
-		.name = "C1-BYT",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 1,
 		.target_residency = 1,
 	},
 	{
-		.name = "C6N-BYT",
+		.name = "C6N",
 		.flags = MWAIT2flg(0x58) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 300,
 		.target_residency = 275,
 	},
 	{
-		.name = "C6S-BYT",
+		.name = "C6S",
 		.flags = MWAIT2flg(0x52) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 500,
 		.target_residency = 560,
 	},
 	{
-		.name = "C7-BYT",
+		.name = "C7",
 		.flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 1200,
 		.target_residency = 4000,
 	},
 	{
-		.name = "C7S-BYT",
+		.name = "C7S",
 		.flags = MWAIT2flg(0x64) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 10000,
 		.target_residency = 20000,
@@ -230,31 +230,31 @@ static const struct cpuidle_state byt_cs
 
 static const struct cpuidle_state cht_cstates[] = {
 	{
-		.name = "C1-CHT",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 1,
 		.target_residency = 1,
 	},
 	{
-		.name = "C6N-CHT",
+		.name = "C6N",
 		.flags = MWAIT2flg(0x58) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 80,
 		.target_residency = 275,
 	},
 	{
-		.name = "C6S-CHT",
+		.name = "C6S",
 		.flags = MWAIT2flg(0x52) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 200,
 		.target_residency = 560,
 	},
 	{
-		.name = "C7-CHT",
+		.name = "C7",
 		.flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 1200,
 		.target_residency = 4000,
 	},
 	{
-		.name = "C7S-CHT",
+		.name = "C7S",
 		.flags = MWAIT2flg(0x64) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 10000,
 		.target_residency = 20000,
@@ -264,31 +264,31 @@ static const struct cpuidle_state cht_cs
 
 static const struct cpuidle_state ivb_cstates[] = {
 	{
-		.name = "C1-IVB",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 1,
 		.target_residency = 1,
 	},
 	{
-		.name = "C1E-IVB",
+		.name = "C1E",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 20,
 	},
 	{
-		.name = "C3-IVB",
+		.name = "C3",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 59,
 		.target_residency = 156,
 	},
 	{
-		.name = "C6-IVB",
+		.name = "C6",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 80,
 		.target_residency = 300,
 	},
 	{
-		.name = "C7-IVB",
+		.name = "C7",
 		.flags = MWAIT2flg(0x30) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 87,
 		.target_residency = 300,
@@ -298,25 +298,25 @@ static const struct cpuidle_state ivb_cs
 
 static const struct cpuidle_state ivt_cstates[] = {
 	{
-		.name = "C1-IVT",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 1,
 		.target_residency = 1,
 	},
 	{
-		.name = "C1E-IVT",
+		.name = "C1E",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 80,
 	},
 	{
-		.name = "C3-IVT",
+		.name = "C3",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 59,
 		.target_residency = 156,
 	},
 	{
-		.name = "C6-IVT",
+		.name = "C6",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 82,
 		.target_residency = 300,
@@ -326,25 +326,25 @@ static const struct cpuidle_state ivt_cs
 
 static const struct cpuidle_state ivt_cstates_4s[] = {
 	{
-		.name = "C1-IVT-4S",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 1,
 		.target_residency = 1,
 	},
 	{
-		.name = "C1E-IVT-4S",
+		.name = "C1E",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 250,
 	},
 	{
-		.name = "C3-IVT-4S",
+		.name = "C3",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 59,
 		.target_residency = 300,
 	},
 	{
-		.name = "C6-IVT-4S",
+		.name = "C6",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 84,
 		.target_residency = 400,
@@ -354,25 +354,25 @@ static const struct cpuidle_state ivt_cs
 
 static const struct cpuidle_state ivt_cstates_8s[] = {
 	{
-		.name = "C1-IVT-8S",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 1,
 		.target_residency = 1,
 	},
 	{
-		.name = "C1E-IVT-8S",
+		.name = "C1E",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 500,
 	},
 	{
-		.name = "C3-IVT-8S",
+		.name = "C3",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 59,
 		.target_residency = 600,
 	},
 	{
-		.name = "C6-IVT-8S",
+		.name = "C6",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 88,
 		.target_residency = 700,
@@ -382,49 +382,49 @@ static const struct cpuidle_state ivt_cs
 
 static const struct cpuidle_state hsw_cstates[] = {
 	{
-		.name = "C1-HSW",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 2,
 		.target_residency = 2,
 	},
 	{
-		.name = "C1E-HSW",
+		.name = "C1E",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 20,
 	},
 	{
-		.name = "C3-HSW",
+		.name = "C3",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 33,
 		.target_residency = 100,
 	},
 	{
-		.name = "C6-HSW",
+		.name = "C6",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 133,
 		.target_residency = 400,
 	},
 	{
-		.name = "C7s-HSW",
+		.name = "C7s",
 		.flags = MWAIT2flg(0x32) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 166,
 		.target_residency = 500,
 	},
  	{
-		.name = "C8-HSW",
+		.name = "C8",
 		.flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 300,
 		.target_residency = 900,
 	},
 	{
-		.name = "C9-HSW",
+		.name = "C9",
 		.flags = MWAIT2flg(0x50) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 600,
 		.target_residency = 1800,
 	},
 	{
-		.name = "C10-HSW",
+		.name = "C10",
 		.flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 2600,
 		.target_residency = 7700,
@@ -434,49 +434,49 @@ static const struct cpuidle_state hsw_cs
 
 static const struct cpuidle_state bdw_cstates[] = {
 	{
-		.name = "C1-BDW",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 2,
 		.target_residency = 2,
 	},
 	{
-		.name = "C1E-BDW",
+		.name = "C1E",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 20,
 	},
 	{
-		.name = "C3-BDW",
+		.name = "C3",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 40,
 		.target_residency = 100,
 	},
 	{
-		.name = "C6-BDW",
+		.name = "C6",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 133,
 		.target_residency = 400,
 	},
 	{
-		.name = "C7s-BDW",
+		.name = "C7s",
 		.flags = MWAIT2flg(0x32) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 166,
 		.target_residency = 500,
 	},
 	{
-		.name = "C8-BDW",
+		.name = "C8",
 		.flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 300,
 		.target_residency = 900,
 	},
 	{
-		.name = "C9-BDW",
+		.name = "C9",
 		.flags = MWAIT2flg(0x50) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 600,
 		.target_residency = 1800,
 	},
 	{
-		.name = "C10-BDW",
+		.name = "C10",
 		.flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 2600,
 		.target_residency = 7700,
@@ -486,49 +486,49 @@ static const struct cpuidle_state bdw_cs
 
 static struct cpuidle_state __read_mostly skl_cstates[] = {
 	{
-		.name = "C1-SKL",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 2,
 		.target_residency = 2,
 	},
 	{
-		.name = "C1E-SKL",
+		.name = "C1E",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 20,
 	},
 	{
-		.name = "C3-SKL",
+		.name = "C3",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 70,
 		.target_residency = 100,
 	},
 	{
-		.name = "C6-SKL",
+		.name = "C6",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 85,
 		.target_residency = 200,
 	},
 	{
-		.name = "C7s-SKL",
+		.name = "C7s",
 		.flags = MWAIT2flg(0x33) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 124,
 		.target_residency = 800,
 	},
 	{
-		.name = "C8-SKL",
+		.name = "C8",
 		.flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 200,
 		.target_residency = 800,
 	},
 	{
-		.name = "C9-SKL",
+		.name = "C9",
 		.flags = MWAIT2flg(0x50) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 480,
 		.target_residency = 5000,
 	},
 	{
-		.name = "C10-SKL",
+		.name = "C10",
 		.flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 890,
 		.target_residency = 5000,
@@ -538,19 +538,19 @@ static struct cpuidle_state __read_mostl
 
 static struct cpuidle_state __read_mostly skx_cstates[] = {
 	{
-		.name = "C1-SKX",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 2,
 		.target_residency = 2,
 	},
 	{
-		.name = "C1E-SKX",
+		.name = "C1E",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 20,
 	},
 	{
-		.name = "C6-SKX",
+		.name = "C6",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 133,
 		.target_residency = 600,
@@ -560,19 +560,19 @@ static struct cpuidle_state __read_mostl
 
 static const struct cpuidle_state icx_cstates[] = {
        {
-               .name = "C1-ICX",
+               .name = "C1",
                .flags = MWAIT2flg(0x00),
                .exit_latency = 1,
                .target_residency = 1,
        },
        {
-               .name = "C1E-ICX",
+               .name = "C1E",
                .flags = MWAIT2flg(0x01),
                .exit_latency = 4,
                .target_residency = 4,
        },
        {
-               .name = "C6-ICX",
+               .name = "C6",
                .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
                .exit_latency = 170,
                .target_residency = 600,
@@ -582,25 +582,25 @@ static const struct cpuidle_state icx_cs
 
 static const struct cpuidle_state atom_cstates[] = {
 	{
-		.name = "C1E-ATM",
+		.name = "C1E",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 10,
 		.target_residency = 20,
 	},
 	{
-		.name = "C2-ATM",
+		.name = "C2",
 		.flags = MWAIT2flg(0x10),
 		.exit_latency = 20,
 		.target_residency = 80,
 	},
 	{
-		.name = "C4-ATM",
+		.name = "C4",
 		.flags = MWAIT2flg(0x30) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 100,
 		.target_residency = 400,
 	},
 	{
-		.name = "C6-ATM",
+		.name = "C6",
 		.flags = MWAIT2flg(0x52) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 140,
 		.target_residency = 560,
@@ -610,31 +610,31 @@ static const struct cpuidle_state atom_c
 
 static const struct cpuidle_state tangier_cstates[] = {
 	{
-		.name = "C1-TNG",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 1,
 		.target_residency = 4,
 	},
 	{
-		.name = "C4-TNG",
+		.name = "C4",
 		.flags = MWAIT2flg(0x30) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 100,
 		.target_residency = 400,
 	},
 	{
-		.name = "C6-TNG",
+		.name = "C6",
 		.flags = MWAIT2flg(0x52) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 140,
 		.target_residency = 560,
 	},
 	{
-		.name = "C7-TNG",
+		.name = "C7",
 		.flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 1200,
 		.target_residency = 4000,
 	},
 	{
-		.name = "C9-TNG",
+		.name = "C9",
 		.flags = MWAIT2flg(0x64) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 10000,
 		.target_residency = 20000,
@@ -644,13 +644,13 @@ static const struct cpuidle_state tangie
 
 static const struct cpuidle_state avn_cstates[] = {
 	{
-		.name = "C1-AVN",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 2,
 		.target_residency = 2,
 	},
 	{
-		.name = "C6-AVN",
+		.name = "C6",
 		.flags = MWAIT2flg(0x51) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 15,
 		.target_residency = 45,
@@ -660,13 +660,13 @@ static const struct cpuidle_state avn_cs
 
 static const struct cpuidle_state knl_cstates[] = {
 	{
-		.name = "C1-KNL",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 1,
 		.target_residency = 2,
 	},
 	{
-		.name = "C6-KNL",
+		.name = "C6",
 		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 120,
 		.target_residency = 500,
@@ -676,43 +676,43 @@ static const struct cpuidle_state knl_cs
 
 static struct cpuidle_state __read_mostly bxt_cstates[] = {
 	{
-		.name = "C1-BXT",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 2,
 		.target_residency = 2,
 	},
 	{
-		.name = "C1E-BXT",
+		.name = "C1E",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 20,
 	},
 	{
-		.name = "C6-BXT",
+		.name = "C6",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 133,
 		.target_residency = 133,
 	},
 	{
-		.name = "C7s-BXT",
+		.name = "C7s",
 		.flags = MWAIT2flg(0x31) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 155,
 		.target_residency = 155,
 	},
 	{
-		.name = "C8-BXT",
+		.name = "C8",
 		.flags = MWAIT2flg(0x40) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 1000,
 		.target_residency = 1000,
 	},
 	{
-		.name = "C9-BXT",
+		.name = "C9",
 		.flags = MWAIT2flg(0x50) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 2000,
 		.target_residency = 2000,
 	},
 	{
-		.name = "C10-BXT",
+		.name = "C10",
 		.flags = MWAIT2flg(0x60) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 10000,
 		.target_residency = 10000,
@@ -722,19 +722,19 @@ static struct cpuidle_state __read_mostl
 
 static const struct cpuidle_state dnv_cstates[] = {
 	{
-		.name = "C1-DNV",
+		.name = "C1",
 		.flags = MWAIT2flg(0x00),
 		.exit_latency = 2,
 		.target_residency = 2,
 	},
 	{
-		.name = "C1E-DNV",
+		.name = "C1E",
 		.flags = MWAIT2flg(0x01),
 		.exit_latency = 10,
 		.target_residency = 20,
 	},
 	{
-		.name = "C6-DNV",
+		.name = "C6",
 		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
 		.exit_latency = 50,
 		.target_residency = 500,




* [PATCH v2 2/5] x86/mwait-idle: switch to using bool
  2022-01-20 14:00 [PATCH v2 0/5] x86/mwait-idle: updates from Linux (and more) Jan Beulich
  2022-01-20 14:01 ` [PATCH v2 1/5] x86/mwait-idle: stop exposing platform acronyms Jan Beulich
@ 2022-01-20 14:02 ` Jan Beulich
  2022-01-20 14:34   ` Roger Pau Monné
  2022-01-20 14:02 ` [PATCH v2 3/5] x86/mwait-idle: add SnowRidge C-state table Jan Beulich
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2022-01-20 14:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

When the driver was first ported, we didn't have "bool" yet, so
conversion to bool_t / 0 / 1 was necessary. Undo that conversion, easing
ports of newer changes as well as tidying things up.

Requested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.

--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -76,7 +76,7 @@
 # define pr_debug(fmt...)
 #endif
 
-static __initdata bool_t opt_mwait_idle = 1;
+static __initdata bool opt_mwait_idle = true;
 boolean_param("mwait-idle", opt_mwait_idle);
 
 static unsigned int mwait_substates;
@@ -93,8 +93,8 @@ struct idle_cpu {
 	 * Indicate which enable bits to clear here.
 	 */
 	unsigned long auto_demotion_disable_flags;
-	bool_t byt_auto_demotion_disable_flag;
-	bool_t disable_promotion_to_c1e;
+	bool byt_auto_demotion_disable_flag;
+	bool disable_promotion_to_c1e;
 };
 
 static const struct idle_cpu *icpu;
@@ -867,7 +867,7 @@ static void c1e_promotion_disable(void *
 static const struct idle_cpu idle_cpu_nehalem = {
 	.state_table = nehalem_cstates,
 	.auto_demotion_disable_flags = NHM_C1_AUTO_DEMOTE | NHM_C3_AUTO_DEMOTE,
-	.disable_promotion_to_c1e = 1,
+	.disable_promotion_to_c1e = true,
 };
 
 static const struct idle_cpu idle_cpu_atom = {
@@ -885,59 +885,59 @@ static const struct idle_cpu idle_cpu_li
 
 static const struct idle_cpu idle_cpu_snb = {
 	.state_table = snb_cstates,
-	.disable_promotion_to_c1e = 1,
+	.disable_promotion_to_c1e = true,
 };
 
 static const struct idle_cpu idle_cpu_byt = {
 	.state_table = byt_cstates,
-	.disable_promotion_to_c1e = 1,
-	.byt_auto_demotion_disable_flag = 1,
+	.disable_promotion_to_c1e = true,
+	.byt_auto_demotion_disable_flag = true,
 };
 
 static const struct idle_cpu idle_cpu_cht = {
 	.state_table = cht_cstates,
-	.disable_promotion_to_c1e = 1,
-	.byt_auto_demotion_disable_flag = 1,
+	.disable_promotion_to_c1e = true,
+	.byt_auto_demotion_disable_flag = true,
 };
 
 static const struct idle_cpu idle_cpu_ivb = {
 	.state_table = ivb_cstates,
-	.disable_promotion_to_c1e = 1,
+	.disable_promotion_to_c1e = true,
 };
 
 static const struct idle_cpu idle_cpu_ivt = {
 	.state_table = ivt_cstates,
-	.disable_promotion_to_c1e = 1,
+	.disable_promotion_to_c1e = true,
 };
 
 static const struct idle_cpu idle_cpu_hsw = {
 	.state_table = hsw_cstates,
-	.disable_promotion_to_c1e = 1,
+	.disable_promotion_to_c1e = true,
 };
 
 static const struct idle_cpu idle_cpu_bdw = {
 	.state_table = bdw_cstates,
-	.disable_promotion_to_c1e = 1,
+	.disable_promotion_to_c1e = true,
 };
 
 static const struct idle_cpu idle_cpu_skl = {
 	.state_table = skl_cstates,
-	.disable_promotion_to_c1e = 1,
+	.disable_promotion_to_c1e = true,
 };
 
 static const struct idle_cpu idle_cpu_skx = {
 	.state_table = skx_cstates,
-	.disable_promotion_to_c1e = 1,
+	.disable_promotion_to_c1e = true,
 };
 
 static const struct idle_cpu idle_cpu_icx = {
        .state_table = icx_cstates,
-       .disable_promotion_to_c1e = 1,
+       .disable_promotion_to_c1e = true,
 };
 
 static const struct idle_cpu idle_cpu_avn = {
 	.state_table = avn_cstates,
-	.disable_promotion_to_c1e = 1,
+	.disable_promotion_to_c1e = true,
 };
 
 static const struct idle_cpu idle_cpu_knl = {
@@ -946,12 +946,12 @@ static const struct idle_cpu idle_cpu_kn
 
 static const struct idle_cpu idle_cpu_bxt = {
 	.state_table = bxt_cstates,
-	.disable_promotion_to_c1e = 1,
+	.disable_promotion_to_c1e = true,
 };
 
 static const struct idle_cpu idle_cpu_dnv = {
 	.state_table = dnv_cstates,
-	.disable_promotion_to_c1e = 1,
+	.disable_promotion_to_c1e = true,
 };
 
 #define ICPU(model, cpu) \




* [PATCH v2 3/5] x86/mwait-idle: add SnowRidge C-state table
  2022-01-20 14:00 [PATCH v2 0/5] x86/mwait-idle: updates from Linux (and more) Jan Beulich
  2022-01-20 14:01 ` [PATCH v2 1/5] x86/mwait-idle: stop exposing platform acronyms Jan Beulich
  2022-01-20 14:02 ` [PATCH v2 2/5] x86/mwait-idle: switch to using bool Jan Beulich
@ 2022-01-20 14:02 ` Jan Beulich
  2022-01-20 14:04 ` [PATCH RFC v2 4/5] x86/mwait-idle: enable interrupts before C1 on Xeons Jan Beulich
  2022-01-20 14:05 ` [PATCH RFC v2 5/5] x86/mwait-idle: squash stats update when not actually entering C-state Jan Beulich
  4 siblings, 0 replies; 14+ messages in thread
From: Jan Beulich @ 2022-01-20 14:02 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

Add C-state table for the SnowRidge SoC which is found on Intel Jacobsville
platforms.

The following has been changed:

 1. C1E latency changed from 10us to 15us. It was measured using the
    open source "wult" tool (the "nic" method, 15us is the 99.99th
    percentile).

 2. C1E power break even changed from 20us to 25us, which may result
    in less C1E residency in some workloads.

 3. C6 latency changed from 50us to 130us. Measured the same way as C1E.

The C6 C-state is supported only by some SnowRidge revisions, so add a comment
about this to the C-state table.

On SnowRidge, C6 support is enumerated via the usual mechanism: the "mwait"
leaf of the "cpuid" instruction. The 'intel_idle' driver does check this leaf,
so even though C6 is present in the table, the driver will only use it if the
CPU does support it.
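
As a rough illustration of that check, in terms of the identifiers the
driver already uses (a sketch, not a quote of the actual code; the
shift/mask constants are assumptions):

	/* CPUID leaf 5 EDX holds a 4-bit sub-C-state count per C-state. */
	unsigned int hint = flg2MWAIT(cpuidle_state_table[cstate].flags);
	unsigned int num_substates =
		(mwait_substates >> ((MWAIT_HINT2CSTATE(hint) + 1) * 4)) &
		MWAIT_SUBSTATE_MASK;

	if (!num_substates)
		continue;	/* e.g. C6 listed in snr_cstates[], but absent on this part */

Hence listing C6 unconditionally in the table is safe.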

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[Linux commit: 9cf93f056f783f986c19f40d5304d1bcffa0fc0d]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -742,6 +742,32 @@ static const struct cpuidle_state dnv_cs
 	{}
 };
 
+/*
+ * Note, depending on HW and FW revision, SnowRidge SoC may or may not support
+ * C6, and this is indicated in the CPUID mwait leaf.
+ */
+static const struct cpuidle_state snr_cstates[] = {
+	{
+		.name = "C1",
+		.flags = MWAIT2flg(0x00),
+		.exit_latency = 2,
+		.target_residency = 2,
+	},
+	{
+		.name = "C1E",
+		.flags = MWAIT2flg(0x01),
+		.exit_latency = 15,
+		.target_residency = 25,
+	},
+	{
+		.name = "C6",
+		.flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+		.exit_latency = 130,
+		.target_residency = 500,
+	},
+	{}
+};
+
 static void mwait_idle(void)
 {
 	unsigned int cpu = smp_processor_id();
@@ -954,6 +980,11 @@ static const struct idle_cpu idle_cpu_dn
 	.disable_promotion_to_c1e = true,
 };
 
+static const struct idle_cpu idle_cpu_snr = {
+	.state_table = snr_cstates,
+	.disable_promotion_to_c1e = true,
+};
+
 #define ICPU(model, cpu) \
 	{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_ALWAYS, &idle_cpu_##cpu}
 
@@ -996,7 +1027,7 @@ static const struct x86_cpu_id intel_idl
 	ICPU(0x5c, bxt),
 	ICPU(0x7a, bxt),
 	ICPU(0x5f, dnv),
-	ICPU(0x86, dnv),
+	ICPU(0x86, snr),
 	{}
 };
 




* [PATCH RFC v2 4/5] x86/mwait-idle: enable interrupts before C1 on Xeons
  2022-01-20 14:00 [PATCH v2 0/5] x86/mwait-idle: updates from Linux (and more) Jan Beulich
                   ` (2 preceding siblings ...)
  2022-01-20 14:02 ` [PATCH v2 3/5] x86/mwait-idle: add SnowRidge C-state table Jan Beulich
@ 2022-01-20 14:04 ` Jan Beulich
  2022-01-20 15:52   ` Roger Pau Monné
  2022-01-20 14:05 ` [PATCH RFC v2 5/5] x86/mwait-idle: squash stats update when not actually entering C-state Jan Beulich
  4 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2022-01-20 14:04 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

Enable local interrupts before requesting C1 on the last two generations
of Intel Xeon platforms: Sky Lake, Cascade Lake, Cooper Lake, Ice Lake.
This decreases average C1 interrupt latency by about 5-10%, as measured
with the 'wult' tool.

The '->enter()' function of the driver enters C-states with local
interrupts disabled by executing the 'monitor' and 'mwait' pair of
instructions. If an interrupt happens, the CPU exits the C-state and
continues executing instructions after 'mwait'. It does not jump to
the interrupt handler, because local interrupts are disabled. The
cpuidle subsystem enables interrupts a bit later, after doing some
housekeeping.

With this patch, we enable local interrupts before requesting C1. In
this case, if the CPU wakes up because of an interrupt, it will jump
to the interrupt handler right away. The cpuidle housekeeping will be
done after the pending interrupt(s) are handled.

Enabling interrupts before entering a C-state has a measurable impact
for faster C-states, like C1. Deeper, but slower C-states like C6 do
not really benefit from this sort of change, because their latency is
a lot higher compared to the delay added by cpuidle housekeeping.

This change was also tested with cyclictest and dbench. In case of Ice
Lake, the average cyclictest latency decreased by 5.1%, and the average
'dbench' throughput increased by about 0.8%. Both tests were run for 4
hours with only C1 enabled (all other idle states, including 'POLL',
were disabled). CPU frequency was pinned to HFM, and uncore frequency
was pinned to the maximum value. The other platforms had similar
single-digit percentage improvements.

It is worth noting that this patch affects 'cpuidle' statistics a tiny
bit.  Before this patch, C1 residency did not include the interrupt
handling time, but with this patch, it will include it. This is similar
to what happens in case of the 'POLL' state, which also runs with
interrupts enabled.

Suggested-by: Len Brown <len.brown@intel.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
[Linux commit: c227233ad64c77e57db738ab0e46439db71822a3]

We don't have a pointer into cpuidle_state_table[] readily available.
To compensate, make use of the new flag only appearing for C1 and hence
only in the first table entry.

Unlike Linux we want to disable IRQs again after MWAITing, as
subsequently invoked functions assume so.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
RFC: I'm not entirely certain that we want to take this, i.e. whether
     we're as worried about interrupt latency.
RFC: I was going back and forth between putting the local_irq_enable()
     ahead of or after cpu_is_haltable().
---
v2: New.

--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -108,6 +108,11 @@ static const struct cpuidle_state {
 
 #define CPUIDLE_FLAG_DISABLED		0x1
 /*
+ * Enable interrupts before entering the C-state. On some platforms and for
+ * some C-states, this may measurably decrease interrupt latency.
+ */
+#define CPUIDLE_FLAG_IRQ_ENABLE		0x8000
+/*
  * Set this flag for states where the HW flushes the TLB for us
  * and so we don't need cross-calls to keep it consistent.
  * If this flag is set, SW flushes the TLB, so even if the
@@ -539,7 +544,7 @@ static struct cpuidle_state __read_mostl
 static struct cpuidle_state __read_mostly skx_cstates[] = {
 	{
 		.name = "C1",
-		.flags = MWAIT2flg(0x00),
+		.flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
 		.exit_latency = 2,
 		.target_residency = 2,
 	},
@@ -561,7 +566,7 @@ static struct cpuidle_state __read_mostl
 static const struct cpuidle_state icx_cstates[] = {
        {
                .name = "C1",
-               .flags = MWAIT2flg(0x00),
+               .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
                .exit_latency = 1,
                .target_residency = 1,
        },
@@ -776,6 +781,7 @@ static void mwait_idle(void)
 	unsigned int next_state;
 	u64 before, after;
 	u32 exp = 0, pred = 0, irq_traced[4] = { 0 };
+	bool irq_enable_early = false;
 
 	if (max_cstate > 0 && power &&
 	    (next_state = cpuidle_current_governor->select(power)) > 0) {
@@ -806,6 +812,12 @@ static void mwait_idle(void)
 		return;
 	}
 
+	if (cx->idx == 1 && cx->type == ACPI_STATE_C1 &&
+	    (cpuidle_state_table[0].flags & CPUIDLE_FLAG_IRQ_ENABLE)) {
+		ASSERT(cx->address == flg2MWAIT(cpuidle_state_table[0].flags));
+		irq_enable_early = true;
+	}
+
 	cpufreq_dbs_timer_suspend();
 
 	rcu_idle_enter(cpu);
@@ -842,9 +854,15 @@ static void mwait_idle(void)
 
 	update_last_cx_stat(power, cx, before);
 
-	if (cpu_is_haltable(cpu))
+	if (cpu_is_haltable(cpu)) {
+		if (irq_enable_early)
+			local_irq_enable();
+
 		mwait_idle_with_hints(cx->address, MWAIT_ECX_INTERRUPT_BREAK);
 
+		local_irq_disable();
+	}
+
 	after = alternative_call(cpuidle_get_tick);
 
 	cstate_restore_tsc();




* [PATCH RFC v2 5/5] x86/mwait-idle: squash stats update when not actually entering C-state
  2022-01-20 14:00 [PATCH v2 0/5] x86/mwait-idle: updates from Linux (and more) Jan Beulich
                   ` (3 preceding siblings ...)
  2022-01-20 14:04 ` [PATCH RFC v2 4/5] x86/mwait-idle: enable interrupts before C1 on Xeons Jan Beulich
@ 2022-01-20 14:05 ` Jan Beulich
  2022-01-20 16:00   ` Roger Pau Monné
  4 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2022-01-20 14:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

While we don't want to skip calling update_idle_stats(), arrange for it
to not increment the overall time spent in the state we didn't really
enter.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
RFC: Arguably more of what follows could be moved into the if() -
     thoughts?
---
v2: New.

--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -861,9 +861,11 @@ static void mwait_idle(void)
 		mwait_idle_with_hints(cx->address, MWAIT_ECX_INTERRUPT_BREAK);
 
 		local_irq_disable();
-	}
 
-	after = alternative_call(cpuidle_get_tick);
+		after = alternative_call(cpuidle_get_tick);
+	}
+	else
+		before = after = alternative_call(cpuidle_get_tick);
 
 	cstate_restore_tsc();
 	trace_exit_reason(irq_traced);




* Re: [PATCH v2 1/5] x86/mwait-idle: stop exposing platform acronyms
  2022-01-20 14:01 ` [PATCH v2 1/5] x86/mwait-idle: stop exposing platform acronyms Jan Beulich
@ 2022-01-20 14:31   ` Roger Pau Monné
  0 siblings, 0 replies; 14+ messages in thread
From: Roger Pau Monné @ 2022-01-20 14:31 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu

On Thu, Jan 20, 2022 at 03:01:45PM +0100, Jan Beulich wrote:
> This follows Linux commit de09cdd09fa1 ("intel_idle: stop exposing
> platform acronyms in sysfs"), but their main justifications (sysfs
> exposure and similarity with acpi-idle) don't apply to us. The field is
> only used in a single printk() right now, but having the platform tags
> there isn't useful either.
> 
> Requested-by: Roger Pau Monné <roger.pau@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.



* Re: [PATCH v2 2/5] x86/mwait-idle: switch to using bool
  2022-01-20 14:02 ` [PATCH v2 2/5] x86/mwait-idle: switch to using bool Jan Beulich
@ 2022-01-20 14:34   ` Roger Pau Monné
  0 siblings, 0 replies; 14+ messages in thread
From: Roger Pau Monné @ 2022-01-20 14:34 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu

On Thu, Jan 20, 2022 at 03:02:14PM +0100, Jan Beulich wrote:
> When the driver was first ported, we didn't have "bool" yet, so
> conversion to bool_t / 0 / 1 was necessary. Undo that conversion, easing
> ports of newer changes as well as tidying things up.
> 
> Requested-by: Roger Pau Monné <roger.pau@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.



* Re: [PATCH RFC v2 4/5] x86/mwait-idle: enable interrupts before C1 on Xeons
  2022-01-20 14:04 ` [PATCH RFC v2 4/5] x86/mwait-idle: enable interrupts before C1 on Xeons Jan Beulich
@ 2022-01-20 15:52   ` Roger Pau Monné
  2022-01-24 14:44     ` Jan Beulich
  0 siblings, 1 reply; 14+ messages in thread
From: Roger Pau Monné @ 2022-01-20 15:52 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu

On Thu, Jan 20, 2022 at 03:04:39PM +0100, Jan Beulich wrote:
> From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> 
> Enable local interrupts before requesting C1 on the last two generations
> of Intel Xeon platforms: Sky Lake, Cascade Lake, Cooper Lake, Ice Lake.
> This decreases average C1 interrupt latency by about 5-10%, as measured
> with the 'wult' tool.
> 
> The '->enter()' function of the driver enters C-states with local
> interrupts disabled by executing the 'monitor' and 'mwait' pair of
> instructions. If an interrupt happens, the CPU exits the C-state and
> continues executing instructions after 'mwait'. It does not jump to
> the interrupt handler, because local interrupts are disabled. The
> cpuidle subsystem enables interrupts a bit later, after doing some
> housekeeping.
> 
> With this patch, we enable local interrupts before requesting C1. In
> this case, if the CPU wakes up because of an interrupt, it will jump
> to the interrupt handler right away. The cpuidle housekeeping will be
> done after the pending interrupt(s) are handled.
> 
> Enabling interrupts before entering a C-state has measurable impact
> for faster C-states, like C1. Deeper, but slower C-states like C6 do
> not really benefit from this sort of change, because their latency is
> a lot higher comparing to the delay added by cpuidle housekeeping.
> 
> This change was also tested with cyclictest and dbench. In case of Ice
> Lake, the average cyclictest latency decreased by 5.1%, and the average
> 'dbench' throughput increased by about 0.8%. Both tests were run for 4
> hours with only C1 enabled (all other idle states, including 'POLL',
> were disabled). CPU frequency was pinned to HFM, and uncore frequency
> was pinned to the maximum value. The other platforms had similar
> single-digit percentage improvements.
> 
> It is worth noting that this patch affects 'cpuidle' statistics a tiny
> bit.  Before this patch, C1 residency did not include the interrupt
> handling time, but with this patch, it will include it. This is similar
> to what happens in case of the 'POLL' state, which also runs with
> interrupts enabled.
> 
> Suggested-by: Len Brown <len.brown@intel.com>
> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> [Linux commit: c227233ad64c77e57db738ab0e46439db71822a3]
> 
> We don't have a pointer into cpuidle_state_table[] readily available.
> To compensate, make use of the new flag only appearing for C1 and hence
> only in the first table entry.

We could likely use the 8 padding bits in acpi_processor_cx between
entry_method and method to store a flags field?

It would seem more resilient, as I guess at some point we could
enable interrupts for further states?
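
Something along these lines, perhaps (purely illustrative - the actual
neighbouring fields and their types would need checking):

	struct acpi_processor_cx {
		/* ... */
		u8 entry_method;	/* ACPI_CSTATE_EM_xxx */
		u8 flags;		/* currently padding; could hold e.g. an "IRQs on" bit */
		u8 method;
		/* ... */
	};

mwait_idle() could then test cx->flags directly instead of peeking at
cpuidle_state_table[0].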

> 
> Unlike Linux we want to disable IRQs again after MWAITing, as
> subsequently invoked functions assume so.

I'm also wondering whether there could be interrupts that rely on some
of the housekeeping that's done when returning from mwait. I guess
it's unlikely for an interrupt handler to have anything to do with the
TSC not having been restored.

> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

I think it's important to keep in sync with our upstream, which is
Linux here.

Albeit I would prefer if we could carry the flags into acpi_processor_cx.

Thanks, Roger.



* Re: [PATCH RFC v2 5/5] x86/mwait-idle: squash stats update when not actually entering C-state
  2022-01-20 14:05 ` [PATCH RFC v2 5/5] x86/mwait-idle: squash stats update when not actually entering C-state Jan Beulich
@ 2022-01-20 16:00   ` Roger Pau Monné
  2022-01-20 16:21     ` Jan Beulich
  0 siblings, 1 reply; 14+ messages in thread
From: Roger Pau Monné @ 2022-01-20 16:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu

On Thu, Jan 20, 2022 at 03:05:12PM +0100, Jan Beulich wrote:
> While we don't want to skip calling update_idle_stats(), arrange for it
> to not increment the overall time spent in the state we didn't really
> enter.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> RFC: Arguably more of what follows could be moved into the if() -
>      thoughts?

I would move at least the restoring of the TSC, but I would be fine
with moving everything up to local_irq_enable unless I'm missing
something.

I think you could likely also avoid the call to update_idle_stats so
there's no need to fetch the new tick count.

Thanks, Roger.



* Re: [PATCH RFC v2 5/5] x86/mwait-idle: squash stats update when not actually entering C-state
  2022-01-20 16:00   ` Roger Pau Monné
@ 2022-01-20 16:21     ` Jan Beulich
  2022-01-21  9:32       ` Roger Pau Monné
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2022-01-20 16:21 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Wei Liu

On 20.01.2022 17:00, Roger Pau Monné wrote:
> On Thu, Jan 20, 2022 at 03:05:12PM +0100, Jan Beulich wrote:
>> While we don't want to skip calling update_idle_stats(), arrange for it
>> to not increment the overall time spent in the state we didn't really
>> enter.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> RFC: Arguably more of what follows could be moved into the if() -
>>      thoughts?
> 
> I would move at least the restoring of the TSC, but I would be fine
> with moving everything up to local_irq_enable unless I'm missing
> something.

Yes, that's what I was considering.

> I think you could likely also avoid the call to update_idle_stats so
> there's no need to fetch the new tick count.

No, this call cannot be bypassed. At least not without further
rearrangements elsewhere.

Jan




* Re: [PATCH RFC v2 5/5] x86/mwait-idle: squash stats update when not actually entering C-state
  2022-01-20 16:21     ` Jan Beulich
@ 2022-01-21  9:32       ` Roger Pau Monné
  0 siblings, 0 replies; 14+ messages in thread
From: Roger Pau Monné @ 2022-01-21  9:32 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu

On Thu, Jan 20, 2022 at 05:21:31PM +0100, Jan Beulich wrote:
> On 20.01.2022 17:00, Roger Pau Monné wrote:
> > On Thu, Jan 20, 2022 at 03:05:12PM +0100, Jan Beulich wrote:
> >> While we don't want to skip calling update_idle_stats(), arrange for it
> >> to not increment the overall time spent in the state we didn't really
> >> enter.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >> ---
> >> RFC: Arguably more of what follows could be moved into the if() -
> >>      thoughts?
> > 
> > I would move at least the restoring of the TSC, but I would be fine
> > with moving everything up to local_irq_enable unless I'm missing
> > something.
> 
> Yes, that's what I was considering.
> 
> > I think you could likely also avoid the call to update_idle_stats so
> > there's no need to fetch the new tick count.
> 
> No, this call cannot be bypassed. At least not without further
> rearrangements elsewhere.

Ack, I would move everything you can then.
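
For concreteness, roughly the shape I have in mind (a sketch only,
reusing names from the patch context; the update_idle_stats() argument
list is assumed, and exactly which calls can safely move is still to
be settled):

	if (cpu_is_haltable(cpu)) {
		if (irq_enable_early)
			local_irq_enable();

		mwait_idle_with_hints(cx->address, MWAIT_ECX_INTERRUPT_BREAK);

		local_irq_disable();

		after = alternative_call(cpuidle_get_tick);

		cstate_restore_tsc();
		trace_exit_reason(irq_traced);
		/* ... any further post-MWAIT housekeeping that can move ... */
	} else
		before = after = alternative_call(cpuidle_get_tick);

	/* Still reached in both cases, as per your point above. */
	update_idle_stats(power, cx, before, after);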

Thanks, Roger.



* Re: [PATCH RFC v2 4/5] x86/mwait-idle: enable interrupts before C1 on Xeons
  2022-01-20 15:52   ` Roger Pau Monné
@ 2022-01-24 14:44     ` Jan Beulich
  2022-01-25 12:43       ` Roger Pau Monné
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Beulich @ 2022-01-24 14:44 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Wei Liu

On 20.01.2022 16:52, Roger Pau Monné wrote:
> On Thu, Jan 20, 2022 at 03:04:39PM +0100, Jan Beulich wrote:
>> From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
>>
>> Enable local interrupts before requesting C1 on the last two generations
>> of Intel Xeon platforms: Sky Lake, Cascade Lake, Cooper Lake, Ice Lake.
>> This decreases average C1 interrupt latency by about 5-10%, as measured
>> with the 'wult' tool.
>>
>> The '->enter()' function of the driver enters C-states with local
>> interrupts disabled by executing the 'monitor' and 'mwait' pair of
>> instructions. If an interrupt happens, the CPU exits the C-state and
>> continues executing instructions after 'mwait'. It does not jump to
>> the interrupt handler, because local interrupts are disabled. The
>> cpuidle subsystem enables interrupts a bit later, after doing some
>> housekeeping.
>>
>> With this patch, we enable local interrupts before requesting C1. In
>> this case, if the CPU wakes up because of an interrupt, it will jump
>> to the interrupt handler right away. The cpuidle housekeeping will be
>> done after the pending interrupt(s) are handled.
>>
>> Enabling interrupts before entering a C-state has measurable impact
>> for faster C-states, like C1. Deeper, but slower C-states like C6 do
>> not really benefit from this sort of change, because their latency is
>> a lot higher comparing to the delay added by cpuidle housekeeping.
>>
>> This change was also tested with cyclictest and dbench. In case of Ice
>> Lake, the average cyclictest latency decreased by 5.1%, and the average
>> 'dbench' throughput increased by about 0.8%. Both tests were run for 4
>> hours with only C1 enabled (all other idle states, including 'POLL',
>> were disabled). CPU frequency was pinned to HFM, and uncore frequency
>> was pinned to the maximum value. The other platforms had similar
>> single-digit percentage improvements.
>>
>> It is worth noting that this patch affects 'cpuidle' statistics a tiny
>> bit.  Before this patch, C1 residency did not include the interrupt
>> handling time, but with this patch, it will include it. This is similar
>> to what happens in case of the 'POLL' state, which also runs with
>> interrupts enabled.
>>
>> Suggested-by: Len Brown <len.brown@intel.com>
>> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
>> [Linux commit: c227233ad64c77e57db738ab0e46439db71822a3]
>>
>> We don't have a pointer into cpuidle_state_table[] readily available.
>> To compensate, make use of the new flag only appearing for C1 and hence
>> only in the first table entry.
> 
> We could likely use the 8 padding bits in acpi_processor_cx between
> entry_method and method to store a flags field?

When looking there initially, I thought it wouldn't be good to add a
field there. But looking again, I now don't see why I thought what
I thought. (We actually have an 8-bit padding slot there and a
32-bit one.)

> It would seem more resilient, as I guess at some point we could
> enable interrupts for further states?

C1E maybe; I don't think anything beyond, for having higher wakeup
latency anyway.

>> Unlike Linux we want to disable IRQs again after MWAITing, as
>> subsequently invoked functions assume so.
> 
> I'm also wondering whether there could be interrupts that rely on some
> of the housekeeping that's done when returning from mwait. I guess
> it's unlikely for an interrupt handler to have anything to do with the
> TSC not having been restored.

Actually this is a good point you make: we don't want to enable
IRQs when cstate_restore_tsc() is not a no-op, or else we might
confuse the time rendezvous. (I seem to remember the TSC stopping
only in deeper C-states, but maybe I'm mixing this up with the
APIC timer.)

>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, but as per above - not just yet.

Jan




* Re: [PATCH RFC v2 4/5] x86/mwait-idle: enable interrupts before C1 on Xeons
  2022-01-24 14:44     ` Jan Beulich
@ 2022-01-25 12:43       ` Roger Pau Monné
  0 siblings, 0 replies; 14+ messages in thread
From: Roger Pau Monné @ 2022-01-25 12:43 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu

On Mon, Jan 24, 2022 at 03:44:53PM +0100, Jan Beulich wrote:
> On 20.01.2022 16:52, Roger Pau Monné wrote:
> > On Thu, Jan 20, 2022 at 03:04:39PM +0100, Jan Beulich wrote:
> >> From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> >> Unlike Linux we want to disable IRQs again after MWAITing, as
> >> subsequently invoked functions assume so.
> > 
> > I'm also wondering whether there could be interrupts that rely on some
> > of the housekeeping that's done when returning from mwait. I guess
> > it's unlikely for an interrupt handler to have anything to do with the
> > TSC not having been restored.
> 
> Actually this is a good point you make: We don't want to enable
> IRQs when cstate_restore_tsc() is not a no-op, or else we might
> confuse the time rendezvous. (I thought that I would remember
> TSC to stop only in deeper C-states, but maybe I'm mixing this up
> with the APIC timer.)

There's a comment in time.c that mentions the TSC only stopping in
'deep C states'. Also note that in that case the rendezvous function
already updates the TSC, so I'm not sure whether calling it with an
out of date TSC would be harmful - it will be updated anyway to match
the master TSC.

Might be safer to disable interrupts unconditionally on CPUs that
don't have a non-stop TSC just to be on the safe side.
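
I.e. something like the below guard in mwait_idle() (just a sketch,
assuming the feature flag spelling used elsewhere in Xen):

	/* Only enable IRQs early if the TSC doesn't stop in C-states. */
	if (irq_enable_early && !boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
		irq_enable_early = false;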

Thanks, Roger.


