linux-acpi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] intel_idle: Two new module parameters
@ 2020-01-30 14:44 Rafael J. Wysocki
  2020-01-30 14:46 ` [PATCH 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-01-30 14:44 UTC (permalink / raw)
  To: Linux PM
  Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box, Artem Bityutskiy

Hi All,

This series adds to module parameters to intel_idle (on top of the material
already in the mainline since Monday), one to make it use ACPI even if that
is not enabled for the processor model present in the system and the other
one to allow passing a list of idle states to disable by default to it.

Details in the patch changelogs.

Thanks,
Rafael




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] intel_idle: Introduce 'use_acpi' module parameter
  2020-01-30 14:44 [PATCH 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
@ 2020-01-30 14:46 ` Rafael J. Wysocki
  2020-02-02 14:23   ` kbuild test robot
  2020-01-30 14:47 ` [PATCH 2/2] intel_idle: Introduce 'states_off' " Rafael J. Wysocki
  2020-02-03 11:13 ` [PATCH v2 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
  2 siblings, 1 reply; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-01-30 14:46 UTC (permalink / raw)
  To: Linux PM
  Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box,
	Artem Bityutskiy, Rafael J. Wysocki

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

For diagnostics, it is generally useful to be able to make intel_idle
take the system's ACPI tables into consideration even if that is not
required for the processor model in there, so introduce a new module
parameter, 'use_acpi', to make that happen and update the documentation
to cover it.

While at it, fix the 'no_acpi' module parameter name in the
documentation.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 Documentation/admin-guide/pm/intel_idle.rst |   13 +++++++++----
 drivers/idle/intel_idle.c                   |   11 +++++++++--
 2 files changed, 18 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/idle/intel_idle.c
===================================================================
--- linux-pm.orig/drivers/idle/intel_idle.c
+++ linux-pm/drivers/idle/intel_idle.c
@@ -1131,6 +1131,10 @@ static bool no_acpi __read_mostly;
 module_param(no_acpi, bool, 0444);
 MODULE_PARM_DESC(no_acpi, "Do not use ACPI _CST for building the idle states list");
 
+static bool use_acpi __read_mostly; /* No effect if no_acpi is set. */
+module_param(use_acpi, bool, 0444);
+MODULE_PARM_DESC(use_acpi, "Use ACPI _CST for building the idle states list");
+
 static struct acpi_processor_power acpi_state_table __initdata;
 
 /**
@@ -1258,6 +1262,8 @@ static bool __init intel_idle_off_by_def
 	return true;
 }
 #else /* !CONFIG_ACPI_PROCESSOR_CSTATE */
+#define use_acpi	(false)
+
 static inline bool intel_idle_acpi_cst_extract(void) { return false; }
 static inline void intel_idle_init_cstates_acpi(struct cpuidle_driver *drv) { }
 static inline bool intel_idle_off_by_default(u32 mwait_hint) { return false; }
@@ -1460,7 +1466,8 @@ static void __init intel_idle_init_cstat
 		/* Structure copy. */
 		drv->states[drv->state_count] = cpuidle_state_table[cstate];
 
-		if (icpu->use_acpi && intel_idle_off_by_default(mwait_hint) &&
+		if ((icpu->use_acpi || use_acpi) &&
+		    intel_idle_off_by_default(mwait_hint) &&
 		    !(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE))
 			drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF;
 
@@ -1607,7 +1614,7 @@ static int __init intel_idle_init(void)
 	icpu = (const struct idle_cpu *)id->driver_data;
 	if (icpu) {
 		cpuidle_state_table = icpu->state_table;
-		if (icpu->use_acpi)
+		if (icpu->use_acpi || use_acpi)
 			intel_idle_acpi_cst_extract();
 	} else if (!intel_idle_acpi_cst_extract()) {
 		return -ENODEV;
Index: linux-pm/Documentation/admin-guide/pm/intel_idle.rst
===================================================================
--- linux-pm.orig/Documentation/admin-guide/pm/intel_idle.rst
+++ linux-pm/Documentation/admin-guide/pm/intel_idle.rst
@@ -60,6 +60,9 @@ of the system.  The former are always us
 recognized by ``intel_idle`` and the latter are used if that is required for
 the given processor model (which is the case for all server processor models
 recognized by ``intel_idle``) or if the processor model is not recognized.
+[There is a module parameter that can be used to make the driver use the ACPI
+tables with any processor model recognized by it; see
+`below <intel-idle-parameters_>`_.]
 
 If the ACPI tables are going to be used for building the list of available idle
 states, ``intel_idle`` first looks for a ``_CST`` object under one of the ACPI
@@ -165,7 +168,7 @@ and ``idle=nomwait``.  If any of them is
 ``MWAIT`` instruction is not allowed to be used, so the initialization of
 ``intel_idle`` will fail.
 
-Apart from that there are two module parameters recognized by ``intel_idle``
+Apart from that there are three module parameters recognized by ``intel_idle``
 itself that can be set via the kernel command line (they cannot be updated via
 sysfs, so that is the only way to change their values).
 
@@ -186,9 +189,11 @@ QoS) feature can be used to prevent ``CP
 even if they have been enumerated (see :ref:`cpu-pm-qos` in :doc:`cpuidle`).
 Setting ``max_cstate`` to 0 causes the ``intel_idle`` initialization to fail.
 
-The ``noacpi`` module parameter (which is recognized by ``intel_idle`` if the
-kernel has been configured with ACPI support), can be set to make the driver
-ignore the system's ACPI tables entirely (it is unset by default).
+The ``no_acpi`` and ``use_acpi`` module parameters (recognized by ``intel_idle``
+if the kernel has been configured with ACPI support) can be set to make the
+driver ignore the system's ACPI tables entirely or use them for all of the
+recognized processor models, respectively (they both are unset by default and
+``use_acpi`` has no effect if ``no_acpi`` is set).
 
 
 .. _intel-idle-core-and-package-idle-states:




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 2/2] intel_idle: Introduce 'states_off' module parameter
  2020-01-30 14:44 [PATCH 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
  2020-01-30 14:46 ` [PATCH 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
@ 2020-01-30 14:47 ` Rafael J. Wysocki
  2020-01-31 11:07   ` David Laight
  2020-02-03 11:13 ` [PATCH v2 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
  2 siblings, 1 reply; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-01-30 14:47 UTC (permalink / raw)
  To: Linux PM
  Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box,
	Artem Bityutskiy, Rafael J. Wysocki

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

In certain system configurations it may not be desirable to use some
C-states assumed to be available by intel_idle and the driver needs
to be prevented from using them even before the cpuidle sysfs
interface becomes accessible to user space.  Currently, the only way
to achieve that is by setting the 'max_cstate' module parameter to a
value lower than the index of the shallowest of the C-states in
question, but that may be overly intrusive, because it effectively
makes all of the idle states deeper than the 'max_cstate' one go
away (and the C-state to avoid may be in the middle of the range
normally regarded as available).

To allow that limitation to be overcome, introduce a new module
parameter called 'states_off' to represent a list of idle states to
be disabled by default in the form of a bitmask and update the
documentation to cover it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 Documentation/admin-guide/pm/intel_idle.rst |   15 ++++++++++++++-
 drivers/idle/intel_idle.c                   |   24 +++++++++++++++++++++---
 2 files changed, 35 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/idle/intel_idle.c
===================================================================
--- linux-pm.orig/drivers/idle/intel_idle.c
+++ linux-pm/drivers/idle/intel_idle.c
@@ -63,6 +63,7 @@ static struct cpuidle_driver intel_idle_
 };
 /* intel_idle.max_cstate=0 disables driver */
 static int max_cstate = CPUIDLE_STATE_MAX - 1;
+static unsigned int disabled_states_mask;
 
 static unsigned int mwait_substates;
 
@@ -1234,6 +1235,9 @@ static void __init intel_idle_init_cstat
 		if (cx->type > ACPI_STATE_C2)
 			state->flags |= CPUIDLE_FLAG_TLB_FLUSHED;
 
+		if (disabled_states_mask & BIT(cstate))
+			state->flags |= CPUIDLE_FLAG_OFF;
+
 		state->enter = intel_idle;
 		state->enter_s2idle = intel_idle_s2idle;
 	}
@@ -1466,9 +1470,10 @@ static void __init intel_idle_init_cstat
 		/* Structure copy. */
 		drv->states[drv->state_count] = cpuidle_state_table[cstate];
 
-		if ((icpu->use_acpi || use_acpi) &&
-		    intel_idle_off_by_default(mwait_hint) &&
-		    !(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE))
+		if ((disabled_states_mask & BIT(drv->state_count)) ||
+		    ((icpu->use_acpi || use_acpi) &&
+		     intel_idle_off_by_default(mwait_hint) &&
+		     !(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE)))
 			drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF;
 
 		drv->state_count++;
@@ -1487,6 +1492,10 @@ static void __init intel_idle_init_cstat
 static void __init intel_idle_cpuidle_driver_init(struct cpuidle_driver *drv)
 {
 	cpuidle_poll_state_init(drv);
+
+	if (disabled_states_mask & BIT(0))
+		drv->states[0].flags |= CPUIDLE_FLAG_OFF;
+
 	drv->state_count = 1;
 
 	if (icpu)
@@ -1667,3 +1676,12 @@ device_initcall(intel_idle_init);
  * is the easiest way (currently) to continue doing that.
  */
 module_param(max_cstate, int, 0444);
+/*
+ * The positions of the bits that are set in the two's complement representation
+ * of this value are the indices of the idle states to be disabled by default
+ * (as reflected by the names of the corresponding idle state directories in
+ * sysfs, "state0", "state1" ... "state<i>" ..., where <i> is the index of the
+ * given state).
+ */
+module_param_named(states_off, disabled_states_mask, uint, 0444);
+MODULE_PARM_DESC(states_off, "Mask of disabled idle states");
Index: linux-pm/Documentation/admin-guide/pm/intel_idle.rst
===================================================================
--- linux-pm.orig/Documentation/admin-guide/pm/intel_idle.rst
+++ linux-pm/Documentation/admin-guide/pm/intel_idle.rst
@@ -168,7 +168,7 @@ and ``idle=nomwait``.  If any of them is
 ``MWAIT`` instruction is not allowed to be used, so the initialization of
 ``intel_idle`` will fail.
 
-Apart from that there are three module parameters recognized by ``intel_idle``
+Apart from that there are four module parameters recognized by ``intel_idle``
 itself that can be set via the kernel command line (they cannot be updated via
 sysfs, so that is the only way to change their values).
 
@@ -195,6 +195,19 @@ driver ignore the system's ACPI tables e
 recognized processor models, respectively (they both are unset by default and
 ``use_acpi`` has no effect if ``no_acpi`` is set).
 
+The value of the ``states_off`` module parameter (0 by default) represents a
+list of idle states to be disabled by default in the form of a bitmask.  Namely,
+the positions of the bits that are set in the two's complement representation of
+that value are the indices of idle states to be disabled by default (as
+reflected by the names of the corresponding idle state directories in ``sysfs``,
+:file:`state0`, :file:`state1` ... :file:`state<i>` ..., where ``<i>`` is the
+index of the given idle state; see :ref:`idle-states-representation` in
+:doc:`cpuidle`).  For example, if ``states_off`` is equal to 3, the driver will
+disable idle states 0 and 1 by default, and if it is equal to 8, idle state 3
+will be disabled by default and so on (bit positions beyond the maximum idle
+state index are ignored).  The idle states disabled this way can be enabled (on
+a per-CPU basis) from user space via ``sysfs``.
+
 
 .. _intel-idle-core-and-package-idle-states:
 




^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH 2/2] intel_idle: Introduce 'states_off' module parameter
  2020-01-30 14:47 ` [PATCH 2/2] intel_idle: Introduce 'states_off' " Rafael J. Wysocki
@ 2020-01-31 11:07   ` David Laight
  2020-01-31 11:23     ` Rafael J. Wysocki
  2020-01-31 11:24     ` Artem Bityutskiy
  0 siblings, 2 replies; 12+ messages in thread
From: David Laight @ 2020-01-31 11:07 UTC (permalink / raw)
  To: 'Rafael J. Wysocki', Linux PM
  Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box,
	Artem Bityutskiy, Rafael J. Wysocki

From: Rafael J. Wysocki
> Sent: 30 January 2020 14:47
> 
> In certain system configurations it may not be desirable to use some
> C-states assumed to be available by intel_idle and the driver needs
> to be prevented from using them even before the cpuidle sysfs
> interface becomes accessible to user space.  Currently, the only way
> to achieve that is by setting the 'max_cstate' module parameter to a
> value lower than the index of the shallowest of the C-states in
> question, but that may be overly intrusive, because it effectively
> makes all of the idle states deeper than the 'max_cstate' one go
> away (and the C-state to avoid may be in the middle of the range
> normally regarded as available).
> 
> To allow that limitation to be overcome, introduce a new module
> parameter called 'states_off' to represent a list of idle states to
> be disabled by default in the form of a bitmask and update the
> documentation to cover it.

The problem I see is that there are (at least) 3 different ways of
referring to the C-States:

1) The state names, C1, C1E, C3, C7 etc.
   I'm not sure these are visible outside intel_idle.c.
2) The maximum allowed latency in us.
3) The index into the cpu-dependant tables in intel_idle.c.

Boot parameters that set 3 are completely hopeless for normal
users. The C-state names might be - but they aren't documented.

Unless you know exactly which cpu table is being used the
only constraint a user can request is the latency.

(I've had the misfortune to read intel_idle.c in the last week.
Almost impenetrable TLA ridden uncommented code.)

...
> + * The positions of the bits that are set in the two's complement representation
> + * of this value are the indices of the idle states to be disabled by default
> + * (as reflected by the names of the corresponding idle state directories in
> + * sysfs, "state0", "state1" ... "state<i>" ..., where <i> is the index of the
> + * given state).

What has 'two's complement' got to do with anything?

...
> +The value of the ``states_off`` module parameter (0 by default) represents a
> +list of idle states to be disabled by default in the form of a bitmask.  Namely,
> +the positions of the bits that are set in the two's complement representation of
> +that value are the indices of idle states to be disabled by default (as
> +reflected by the names of the corresponding idle state directories in ``sysfs``,
> +:file:`state0`, :file:`state1` ... :file:`state<i>` ..., where ``<i>`` is the
> +index of the given idle state; see :ref:`idle-states-representation` in
> +:doc:`cpuidle`).  For example, if ``states_off`` is equal to 3, the driver will
> +disable idle states 0 and 1 by default, and if it is equal to 8, idle state 3
> +will be disabled by default and so on (bit positions beyond the maximum idle
> +state index are ignored).  The idle states disabled this way can be enabled (on
> +a per-CPU basis) from user space via ``sysfs``.

A few line breaks would make that easier to read.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] intel_idle: Introduce 'states_off' module parameter
  2020-01-31 11:07   ` David Laight
@ 2020-01-31 11:23     ` Rafael J. Wysocki
  2020-01-31 11:24     ` Artem Bityutskiy
  1 sibling, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-01-31 11:23 UTC (permalink / raw)
  To: David Laight
  Cc: Rafael J. Wysocki, Linux PM, Linux ACPI, LKML, Len Brown,
	Zhang Rui, David Box, Artem Bityutskiy, Rafael J. Wysocki

On Fri, Jan 31, 2020 at 12:07 PM David Laight <David.Laight@aculab.com> wrote:
>
> From: Rafael J. Wysocki
> > Sent: 30 January 2020 14:47
> >
> > In certain system configurations it may not be desirable to use some
> > C-states assumed to be available by intel_idle and the driver needs
> > to be prevented from using them even before the cpuidle sysfs
> > interface becomes accessible to user space.  Currently, the only way
> > to achieve that is by setting the 'max_cstate' module parameter to a
> > value lower than the index of the shallowest of the C-states in
> > question, but that may be overly intrusive, because it effectively
> > makes all of the idle states deeper than the 'max_cstate' one go
> > away (and the C-state to avoid may be in the middle of the range
> > normally regarded as available).
> >
> > To allow that limitation to be overcome, introduce a new module
> > parameter called 'states_off' to represent a list of idle states to
> > be disabled by default in the form of a bitmask and update the
> > documentation to cover it.
>
> The problem I see is that there are (at least) 3 different ways of
> referring to the C-States:

So the mask is not referring to the C-states in the first place.

> 1) The state names, C1, C1E, C3, C7 etc.
>    I'm not sure these are visible outside intel_idle.c.

Yes, they are, in sysfs.

> 2) The maximum allowed latency in us.
> 3) The index into the cpu-dependant tables in intel_idle.c.
>
> Boot parameters that set 3 are completely hopeless for normal
> users. The C-state names might be - but they aren't documented.
>
> Unless you know exactly which cpu table is being used the
> only constraint a user can request is the latency.

So this mask refers to the idle states numbering in sysfs, as stated
in the documentation update.  That covers state0 which is not a
C-state too.

> (I've had the misfortune to read intel_idle.c in the last week.
> Almost impenetrable TLA ridden uncommented code.)

I have some patches to improve that, will post them after this is settled.

> ...
> > + * The positions of the bits that are set in the two's complement representation
> > + * of this value are the indices of the idle states to be disabled by default
> > + * (as reflected by the names of the corresponding idle state directories in
> > + * sysfs, "state0", "state1" ... "state<i>" ..., where <i> is the index of the
> > + * given state).
>
> What has 'two's complement' got to do with anything?

Well, it is the representation in which bits are used.  Kind of as
opposed to decimal or hex digits.  But I can replace that phrase with
"bits that are set in this number" easily enough.

> ...
> > +The value of the ``states_off`` module parameter (0 by default) represents a
> > +list of idle states to be disabled by default in the form of a bitmask.  Namely,
> > +the positions of the bits that are set in the two's complement representation of
> > +that value are the indices of idle states to be disabled by default (as
> > +reflected by the names of the corresponding idle state directories in ``sysfs``,
> > +:file:`state0`, :file:`state1` ... :file:`state<i>` ..., where ``<i>`` is the
> > +index of the given idle state; see :ref:`idle-states-representation` in
> > +:doc:`cpuidle`).  For example, if ``states_off`` is equal to 3, the driver will
> > +disable idle states 0 and 1 by default, and if it is equal to 8, idle state 3
> > +will be disabled by default and so on (bit positions beyond the maximum idle
> > +state index are ignored).  The idle states disabled this way can be enabled (on
> > +a per-CPU basis) from user space via ``sysfs``.
>
> A few line breaks would make that easier to read.

Fair enough.

Thanks!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] intel_idle: Introduce 'states_off' module parameter
  2020-01-31 11:07   ` David Laight
  2020-01-31 11:23     ` Rafael J. Wysocki
@ 2020-01-31 11:24     ` Artem Bityutskiy
  2020-01-31 11:54       ` David Laight
  1 sibling, 1 reply; 12+ messages in thread
From: Artem Bityutskiy @ 2020-01-31 11:24 UTC (permalink / raw)
  To: David Laight, 'Rafael J. Wysocki', Linux PM
  Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box, Rafael J. Wysocki

On Fri, 2020-01-31 at 11:07 +0000, David Laight wrote:
> Unless you know exactly which cpu table is being used the
> only constraint a user can request is the latency.

Hi David,

in all my use-cases I always know what is the CPU I am dealing with and
what are the C-states. Simply because in my view they are always CPU-
dependent in terms of what they do and how are they named.

What you say sounds to me like you would want to disable some C-states
without knowing anything (or much) about the CPU you are dealing with
and the C-state names.

If so, could you please share examples of such use-cases?

Thanks!

-- 
Best Regards,
Artem Bityutskiy


^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH 2/2] intel_idle: Introduce 'states_off' module parameter
  2020-01-31 11:24     ` Artem Bityutskiy
@ 2020-01-31 11:54       ` David Laight
  2020-01-31 12:03         ` Rafael J. Wysocki
  0 siblings, 1 reply; 12+ messages in thread
From: David Laight @ 2020-01-31 11:54 UTC (permalink / raw)
  To: 'artem.bityutskiy@linux.intel.com',
	'Rafael J. Wysocki',
	Linux PM
  Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box, Rafael J. Wysocki

From: Artem Bityutskiy >
> Sent: 31 January 2020 11:24
> On Fri, 2020-01-31 at 11:07 +0000, David Laight wrote:
> > Unless you know exactly which cpu table is being used the
> > only constraint a user can request is the latency.
> 
> Hi David,
> 
> in all my use-cases I always know what is the CPU I am dealing with and
> what are the C-states. Simply because in my view they are always CPU-
> dependent in terms of what they do and how are they named.
> 
> What you say sounds to me like you would want to disable some C-states
> without knowing anything (or much) about the CPU you are dealing with
> and the C-state names.
> 
> If so, could you please share examples of such use-cases?

Dunno, but clearly you want to disable (say) C3 while leaving C6
enabled.

I was trying to find why it was taking 600+us for a RT process
to get rescheduled when it had only been sleeping for a few us.

I found where it was sleeping, but that didn't help at all.
Someone pointed me at a 'random' pdf that referred to /dev/cpu_dma_latency.
Setting that to a small value (eg 20) helps no end.
But there are no references in the code or man pages to that.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] intel_idle: Introduce 'states_off' module parameter
  2020-01-31 11:54       ` David Laight
@ 2020-01-31 12:03         ` Rafael J. Wysocki
  0 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-01-31 12:03 UTC (permalink / raw)
  To: David Laight
  Cc: artem.bityutskiy, Rafael J. Wysocki, Linux PM, Linux ACPI, LKML,
	Len Brown, Zhang Rui, David Box, Rafael J. Wysocki

On Fri, Jan 31, 2020 at 12:54 PM David Laight <David.Laight@aculab.com> wrote:
>
> From: Artem Bityutskiy >
> > Sent: 31 January 2020 11:24
> > On Fri, 2020-01-31 at 11:07 +0000, David Laight wrote:
> > > Unless you know exactly which cpu table is being used the
> > > only constraint a user can request is the latency.
> >
> > Hi David,
> >
> > in all my use-cases I always know what is the CPU I am dealing with and
> > what are the C-states. Simply because in my view they are always CPU-
> > dependent in terms of what they do and how are they named.
> >
> > What you say sounds to me like you would want to disable some C-states
> > without knowing anything (or much) about the CPU you are dealing with
> > and the C-state names.
> >
> > If so, could you please share examples of such use-cases?
>
> Dunno, but clearly you want to disable (say) C3 while leaving C6
> enabled.
>
> I was trying to find why it was taking 600+us for a RT process
> to get rescheduled when it had only been sleeping for a few us.
>
> I found where it was sleeping, but that didn't help at all.
> Someone pointed me at a 'random' pdf that referred to /dev/cpu_dma_latency.
> Setting that to a small value (eg 20) helps no end.
> But there are no references in the code or man pages to that.

There is a piece of kernel documentation regarding it, however:

https://www.kernel.org/doc/html/latest/admin-guide/pm/cpuidle.html#cpu-pm-qos

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] intel_idle: Introduce 'use_acpi' module parameter
  2020-01-30 14:46 ` [PATCH 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
@ 2020-02-02 14:23   ` kbuild test robot
  0 siblings, 0 replies; 12+ messages in thread
From: kbuild test robot @ 2020-02-02 14:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: kbuild-all, Linux PM, Linux ACPI, LKML, Len Brown, Zhang Rui,
	David Box, Artem Bityutskiy, Rafael J. Wysocki

[-- Attachment #1: Type: text/plain, Size: 3745 bytes --]

Hi "Rafael,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on next-20200130]
[cannot apply to acpi/next linux/master v5.5]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Rafael-J-Wysocki/intel_idle-Two-new-module-parameters/20200202-161814
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 94f2630b18975bb56eee5d1a36371db967643479
config: i386-randconfig-a001-20200202 (attached as .config)
compiler: gcc-6 (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

   drivers//idle/intel_idle.c: In function 'intel_idle_init_cstates_icpu':
>> drivers//idle/intel_idle.c:1265:18: error: expected identifier before '(' token
    #define use_acpi (false)
                     ^
>> drivers//idle/intel_idle.c:1469:14: note: in expansion of macro 'use_acpi'
      if ((icpu->use_acpi || use_acpi) &&
                 ^~~~~~~~
   drivers//idle/intel_idle.c: In function 'intel_idle_init':
>> drivers//idle/intel_idle.c:1265:18: error: expected identifier before '(' token
    #define use_acpi (false)
                     ^
   drivers//idle/intel_idle.c:1617:13: note: in expansion of macro 'use_acpi'
      if (icpu->use_acpi || use_acpi)
                ^~~~~~~~

vim +/use_acpi +1469 drivers//idle/intel_idle.c

  1427	
  1428	static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
  1429	{
  1430		int cstate;
  1431	
  1432		switch (boot_cpu_data.x86_model) {
  1433		case INTEL_FAM6_IVYBRIDGE_X:
  1434			ivt_idle_state_table_update();
  1435			break;
  1436		case INTEL_FAM6_ATOM_GOLDMONT:
  1437		case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
  1438			bxt_idle_state_table_update();
  1439			break;
  1440		case INTEL_FAM6_SKYLAKE:
  1441			sklh_idle_state_table_update();
  1442			break;
  1443		}
  1444	
  1445		for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) {
  1446			unsigned int mwait_hint;
  1447	
  1448			if (intel_idle_max_cstate_reached(cstate))
  1449				break;
  1450	
  1451			if (!cpuidle_state_table[cstate].enter &&
  1452			    !cpuidle_state_table[cstate].enter_s2idle)
  1453				break;
  1454	
  1455			/* If marked as unusable, skip this state. */
  1456			if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_UNUSABLE) {
  1457				pr_debug("state %s is disabled\n",
  1458					 cpuidle_state_table[cstate].name);
  1459				continue;
  1460			}
  1461	
  1462			mwait_hint = flg2MWAIT(cpuidle_state_table[cstate].flags);
  1463			if (!intel_idle_verify_cstate(mwait_hint))
  1464				continue;
  1465	
  1466			/* Structure copy. */
  1467			drv->states[drv->state_count] = cpuidle_state_table[cstate];
  1468	
> 1469			if ((icpu->use_acpi || use_acpi) &&
  1470			    intel_idle_off_by_default(mwait_hint) &&
  1471			    !(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE))
  1472				drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF;
  1473	
  1474			drv->state_count++;
  1475		}
  1476	
  1477		if (icpu->byt_auto_demotion_disable_flag) {
  1478			wrmsrl(MSR_CC6_DEMOTION_POLICY_CONFIG, 0);
  1479			wrmsrl(MSR_MC6_DEMOTION_POLICY_CONFIG, 0);
  1480		}
  1481	}
  1482	

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 39486 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 0/2] intel_idle: Two new module parameters
  2020-01-30 14:44 [PATCH 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
  2020-01-30 14:46 ` [PATCH 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
  2020-01-30 14:47 ` [PATCH 2/2] intel_idle: Introduce 'states_off' " Rafael J. Wysocki
@ 2020-02-03 11:13 ` Rafael J. Wysocki
  2020-02-03 11:15   ` [PATCH v2 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
  2020-02-03 11:19   ` [PATCH v2 2/2] intel_idle: Introduce 'states_off' " Rafael J. Wysocki
  2 siblings, 2 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-02-03 11:13 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Linux ACPI, LKML, Len Brown, Zhang Rui, David Box,
	Artem Bityutskiy

On Thursday, January 30, 2020 3:44:40 PM CET Rafael J. Wysocki wrote:
> Hi All,
> 
> This series adds to module parameters to intel_idle (on top of the material
> already in the mainline since Monday), one to make it use ACPI even if that
> is not enabled for the processor model present in the system and the other
> one to allow passing a list of idle states to disable by default to it.
> 
> Details in the patch changelogs.

This is a small update of the original set fixing a build issue in the first
patch and addressing a couple of review comments in the second one.

Thanks,
Rafael




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/2] intel_idle: Introduce 'use_acpi' module parameter
  2020-02-03 11:13 ` [PATCH v2 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
@ 2020-02-03 11:15   ` Rafael J. Wysocki
  2020-02-03 11:19   ` [PATCH v2 2/2] intel_idle: Introduce 'states_off' " Rafael J. Wysocki
  1 sibling, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-02-03 11:15 UTC (permalink / raw)
  To: Linux PM
  Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box, Artem Bityutskiy

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

For diagnostics, it is generally useful to be able to make intel_idle
take the system's ACPI tables into consideration even if that is not
required for the processor model in there, so introduce a new module
parameter, 'use_acpi', to make that happen and update the documentation
to cover it.

While at it, fix the 'no_acpi' module parameter name in the
documentation.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

-> v2:
   * Fix build for CONFIG_ACPI unset.

---
 Documentation/admin-guide/pm/intel_idle.rst |   13 +++++++++----
 drivers/idle/intel_idle.c                   |   11 +++++++++--
 2 files changed, 18 insertions(+), 6 deletions(-)

Index: linux-pm/drivers/idle/intel_idle.c
===================================================================
--- linux-pm.orig/drivers/idle/intel_idle.c
+++ linux-pm/drivers/idle/intel_idle.c
@@ -1131,6 +1131,10 @@ static bool no_acpi __read_mostly;
 module_param(no_acpi, bool, 0444);
 MODULE_PARM_DESC(no_acpi, "Do not use ACPI _CST for building the idle states list");
 
+static bool force_use_acpi __read_mostly; /* No effect if no_acpi is set. */
+module_param_named(use_acpi, force_use_acpi, bool, 0444);
+MODULE_PARM_DESC(use_acpi, "Use ACPI _CST for building the idle states list");
+
 static struct acpi_processor_power acpi_state_table __initdata;
 
 /**
@@ -1258,6 +1262,8 @@ static bool __init intel_idle_off_by_def
 	return true;
 }
 #else /* !CONFIG_ACPI_PROCESSOR_CSTATE */
+#define force_use_acpi	(false)
+
 static inline bool intel_idle_acpi_cst_extract(void) { return false; }
 static inline void intel_idle_init_cstates_acpi(struct cpuidle_driver *drv) { }
 static inline bool intel_idle_off_by_default(u32 mwait_hint) { return false; }
@@ -1460,7 +1466,8 @@ static void __init intel_idle_init_cstat
 		/* Structure copy. */
 		drv->states[drv->state_count] = cpuidle_state_table[cstate];
 
-		if (icpu->use_acpi && intel_idle_off_by_default(mwait_hint) &&
+		if ((icpu->use_acpi || force_use_acpi) &&
+		    intel_idle_off_by_default(mwait_hint) &&
 		    !(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE))
 			drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF;
 
@@ -1607,7 +1614,7 @@ static int __init intel_idle_init(void)
 	icpu = (const struct idle_cpu *)id->driver_data;
 	if (icpu) {
 		cpuidle_state_table = icpu->state_table;
-		if (icpu->use_acpi)
+		if (icpu->use_acpi || force_use_acpi)
 			intel_idle_acpi_cst_extract();
 	} else if (!intel_idle_acpi_cst_extract()) {
 		return -ENODEV;
Index: linux-pm/Documentation/admin-guide/pm/intel_idle.rst
===================================================================
--- linux-pm.orig/Documentation/admin-guide/pm/intel_idle.rst
+++ linux-pm/Documentation/admin-guide/pm/intel_idle.rst
@@ -60,6 +60,9 @@ of the system.  The former are always us
 recognized by ``intel_idle`` and the latter are used if that is required for
 the given processor model (which is the case for all server processor models
 recognized by ``intel_idle``) or if the processor model is not recognized.
+[There is a module parameter that can be used to make the driver use the ACPI
+tables with any processor model recognized by it; see
+`below <intel-idle-parameters_>`_.]
 
 If the ACPI tables are going to be used for building the list of available idle
 states, ``intel_idle`` first looks for a ``_CST`` object under one of the ACPI
@@ -165,7 +168,7 @@ and ``idle=nomwait``.  If any of them is
 ``MWAIT`` instruction is not allowed to be used, so the initialization of
 ``intel_idle`` will fail.
 
-Apart from that there are two module parameters recognized by ``intel_idle``
+Apart from that there are three module parameters recognized by ``intel_idle``
 itself that can be set via the kernel command line (they cannot be updated via
 sysfs, so that is the only way to change their values).
 
@@ -186,9 +189,11 @@ QoS) feature can be used to prevent ``CP
 even if they have been enumerated (see :ref:`cpu-pm-qos` in :doc:`cpuidle`).
 Setting ``max_cstate`` to 0 causes the ``intel_idle`` initialization to fail.
 
-The ``noacpi`` module parameter (which is recognized by ``intel_idle`` if the
-kernel has been configured with ACPI support), can be set to make the driver
-ignore the system's ACPI tables entirely (it is unset by default).
+The ``no_acpi`` and ``use_acpi`` module parameters (recognized by ``intel_idle``
+if the kernel has been configured with ACPI support) can be set to make the
+driver ignore the system's ACPI tables entirely or use them for all of the
+recognized processor models, respectively (they both are unset by default and
+``use_acpi`` has no effect if ``no_acpi`` is set).
 
 
 .. _intel-idle-core-and-package-idle-states:




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 2/2] intel_idle: Introduce 'states_off' module parameter
  2020-02-03 11:13 ` [PATCH v2 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
  2020-02-03 11:15   ` [PATCH v2 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
@ 2020-02-03 11:19   ` Rafael J. Wysocki
  1 sibling, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-02-03 11:19 UTC (permalink / raw)
  To: Linux PM
  Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box,
	Artem Bityutskiy, David Laight

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

In certain system configurations it may not be desirable to use some
C-states assumed to be available by intel_idle and the driver needs
to be prevented from using them even before the cpuidle sysfs
interface becomes accessible to user space.  Currently, the only way
to achieve that is by setting the 'max_cstate' module parameter to a
value lower than the index of the shallowest of the C-states in
question, but that may be overly intrusive, because it effectively
makes all of the idle states deeper than the 'max_cstate' one go
away (and the C-state to avoid may be in the middle of the range
normally regarded as available).

To allow that limitation to be overcome, introduce a new module
parameter called 'states_off' to represent a list of idle states to
be disabled by default in the form of a bitmask and update the
documentation to cover it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

-> v2:
   * Address a couple of review comments from David Laight.

---
 Documentation/admin-guide/pm/intel_idle.rst |   19 ++++++++++++++++++-
 drivers/idle/intel_idle.c                   |   23 ++++++++++++++++++++---
 2 files changed, 38 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/idle/intel_idle.c
===================================================================
--- linux-pm.orig/drivers/idle/intel_idle.c
+++ linux-pm/drivers/idle/intel_idle.c
@@ -63,6 +63,7 @@ static struct cpuidle_driver intel_idle_
 };
 /* intel_idle.max_cstate=0 disables driver */
 static int max_cstate = CPUIDLE_STATE_MAX - 1;
+static unsigned int disabled_states_mask;
 
 static unsigned int mwait_substates;
 
@@ -1234,6 +1235,9 @@ static void __init intel_idle_init_cstat
 		if (cx->type > ACPI_STATE_C2)
 			state->flags |= CPUIDLE_FLAG_TLB_FLUSHED;
 
+		if (disabled_states_mask & BIT(cstate))
+			state->flags |= CPUIDLE_FLAG_OFF;
+
 		state->enter = intel_idle;
 		state->enter_s2idle = intel_idle_s2idle;
 	}
@@ -1466,9 +1470,10 @@ static void __init intel_idle_init_cstat
 		/* Structure copy. */
 		drv->states[drv->state_count] = cpuidle_state_table[cstate];
 
-		if ((icpu->use_acpi || force_use_acpi) &&
-		    intel_idle_off_by_default(mwait_hint) &&
-		    !(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE))
+		if ((disabled_states_mask & BIT(drv->state_count)) ||
+		    ((icpu->use_acpi || force_use_acpi) &&
+		     intel_idle_off_by_default(mwait_hint) &&
+		     !(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE)))
 			drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF;
 
 		drv->state_count++;
@@ -1487,6 +1492,10 @@ static void __init intel_idle_init_cstat
 static void __init intel_idle_cpuidle_driver_init(struct cpuidle_driver *drv)
 {
 	cpuidle_poll_state_init(drv);
+
+	if (disabled_states_mask & BIT(0))
+		drv->states[0].flags |= CPUIDLE_FLAG_OFF;
+
 	drv->state_count = 1;
 
 	if (icpu)
@@ -1667,3 +1676,11 @@ device_initcall(intel_idle_init);
  * is the easiest way (currently) to continue doing that.
  */
 module_param(max_cstate, int, 0444);
+/*
+ * The positions of the bits that are set in this number are the indices of the
+ * idle states to be disabled by default (as reflected by the names of the
+ * corresponding idle state directories in sysfs, "state0", "state1" ...
+ * "state<i>" ..., where <i> is the index of the given state).
+ */
+module_param_named(states_off, disabled_states_mask, uint, 0444);
+MODULE_PARM_DESC(states_off, "Mask of disabled idle states");
Index: linux-pm/Documentation/admin-guide/pm/intel_idle.rst
===================================================================
--- linux-pm.orig/Documentation/admin-guide/pm/intel_idle.rst
+++ linux-pm/Documentation/admin-guide/pm/intel_idle.rst
@@ -168,7 +168,7 @@ and ``idle=nomwait``.  If any of them is
 ``MWAIT`` instruction is not allowed to be used, so the initialization of
 ``intel_idle`` will fail.
 
-Apart from that there are three module parameters recognized by ``intel_idle``
+Apart from that there are four module parameters recognized by ``intel_idle``
 itself that can be set via the kernel command line (they cannot be updated via
 sysfs, so that is the only way to change their values).
 
@@ -195,6 +195,23 @@ driver ignore the system's ACPI tables e
 recognized processor models, respectively (they both are unset by default and
 ``use_acpi`` has no effect if ``no_acpi`` is set).
 
+The value of the ``states_off`` module parameter (0 by default) represents a
+list of idle states to be disabled by default in the form of a bitmask.
+
+Namely, the positions of the bits that are set in the ``states_off`` value are
+the indices of idle states to be disabled by default (as reflected by the names
+of the corresponding idle state directories in ``sysfs``, :file:`state0`,
+:file:`state1` ... :file:`state<i>` ..., where ``<i>`` is the index of the given
+idle state; see :ref:`idle-states-representation` in :doc:`cpuidle`).
+
+For example, if ``states_off`` is equal to 3, the driver will disable idle
+states 0 and 1 by default, and if it is equal to 8, idle state 3 will be
+disabled by default and so on (bit positions beyond the maximum idle state index
+are ignored).
+
+The idle states disabled this way can be enabled (on a per-CPU basis) from user
+space via ``sysfs``.
+
 
 .. _intel-idle-core-and-package-idle-states:
 




^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-02-03 11:19 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-30 14:44 [PATCH 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
2020-01-30 14:46 ` [PATCH 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
2020-02-02 14:23   ` kbuild test robot
2020-01-30 14:47 ` [PATCH 2/2] intel_idle: Introduce 'states_off' " Rafael J. Wysocki
2020-01-31 11:07   ` David Laight
2020-01-31 11:23     ` Rafael J. Wysocki
2020-01-31 11:24     ` Artem Bityutskiy
2020-01-31 11:54       ` David Laight
2020-01-31 12:03         ` Rafael J. Wysocki
2020-02-03 11:13 ` [PATCH v2 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
2020-02-03 11:15   ` [PATCH v2 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
2020-02-03 11:19   ` [PATCH v2 2/2] intel_idle: Introduce 'states_off' " Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).