* [PATCH 0/2] intel_idle: Two new module parameters
@ 2020-01-30 14:44 Rafael J. Wysocki
2020-01-30 14:46 ` [PATCH 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-01-30 14:44 UTC (permalink / raw)
To: Linux PM
Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box, Artem Bityutskiy
Hi All,
This series adds to module parameters to intel_idle (on top of the material
already in the mainline since Monday), one to make it use ACPI even if that
is not enabled for the processor model present in the system and the other
one to allow passing a list of idle states to disable by default to it.
Details in the patch changelogs.
Thanks,
Rafael
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/2] intel_idle: Introduce 'use_acpi' module parameter
2020-01-30 14:44 [PATCH 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
@ 2020-01-30 14:46 ` Rafael J. Wysocki
2020-02-02 14:23 ` kbuild test robot
2020-01-30 14:47 ` [PATCH 2/2] intel_idle: Introduce 'states_off' " Rafael J. Wysocki
2020-02-03 11:13 ` [PATCH v2 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
2 siblings, 1 reply; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-01-30 14:46 UTC (permalink / raw)
To: Linux PM
Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box,
Artem Bityutskiy, Rafael J. Wysocki
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
For diagnostics, it is generally useful to be able to make intel_idle
take the system's ACPI tables into consideration even if that is not
required for the processor model in there, so introduce a new module
parameter, 'use_acpi', to make that happen and update the documentation
to cover it.
While at it, fix the 'no_acpi' module parameter name in the
documentation.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
Documentation/admin-guide/pm/intel_idle.rst | 13 +++++++++----
drivers/idle/intel_idle.c | 11 +++++++++--
2 files changed, 18 insertions(+), 6 deletions(-)
Index: linux-pm/drivers/idle/intel_idle.c
===================================================================
--- linux-pm.orig/drivers/idle/intel_idle.c
+++ linux-pm/drivers/idle/intel_idle.c
@@ -1131,6 +1131,10 @@ static bool no_acpi __read_mostly;
module_param(no_acpi, bool, 0444);
MODULE_PARM_DESC(no_acpi, "Do not use ACPI _CST for building the idle states list");
+static bool use_acpi __read_mostly; /* No effect if no_acpi is set. */
+module_param(use_acpi, bool, 0444);
+MODULE_PARM_DESC(use_acpi, "Use ACPI _CST for building the idle states list");
+
static struct acpi_processor_power acpi_state_table __initdata;
/**
@@ -1258,6 +1262,8 @@ static bool __init intel_idle_off_by_def
return true;
}
#else /* !CONFIG_ACPI_PROCESSOR_CSTATE */
+#define use_acpi (false)
+
static inline bool intel_idle_acpi_cst_extract(void) { return false; }
static inline void intel_idle_init_cstates_acpi(struct cpuidle_driver *drv) { }
static inline bool intel_idle_off_by_default(u32 mwait_hint) { return false; }
@@ -1460,7 +1466,8 @@ static void __init intel_idle_init_cstat
/* Structure copy. */
drv->states[drv->state_count] = cpuidle_state_table[cstate];
- if (icpu->use_acpi && intel_idle_off_by_default(mwait_hint) &&
+ if ((icpu->use_acpi || use_acpi) &&
+ intel_idle_off_by_default(mwait_hint) &&
!(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE))
drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF;
@@ -1607,7 +1614,7 @@ static int __init intel_idle_init(void)
icpu = (const struct idle_cpu *)id->driver_data;
if (icpu) {
cpuidle_state_table = icpu->state_table;
- if (icpu->use_acpi)
+ if (icpu->use_acpi || use_acpi)
intel_idle_acpi_cst_extract();
} else if (!intel_idle_acpi_cst_extract()) {
return -ENODEV;
Index: linux-pm/Documentation/admin-guide/pm/intel_idle.rst
===================================================================
--- linux-pm.orig/Documentation/admin-guide/pm/intel_idle.rst
+++ linux-pm/Documentation/admin-guide/pm/intel_idle.rst
@@ -60,6 +60,9 @@ of the system. The former are always us
recognized by ``intel_idle`` and the latter are used if that is required for
the given processor model (which is the case for all server processor models
recognized by ``intel_idle``) or if the processor model is not recognized.
+[There is a module parameter that can be used to make the driver use the ACPI
+tables with any processor model recognized by it; see
+`below <intel-idle-parameters_>`_.]
If the ACPI tables are going to be used for building the list of available idle
states, ``intel_idle`` first looks for a ``_CST`` object under one of the ACPI
@@ -165,7 +168,7 @@ and ``idle=nomwait``. If any of them is
``MWAIT`` instruction is not allowed to be used, so the initialization of
``intel_idle`` will fail.
-Apart from that there are two module parameters recognized by ``intel_idle``
+Apart from that there are three module parameters recognized by ``intel_idle``
itself that can be set via the kernel command line (they cannot be updated via
sysfs, so that is the only way to change their values).
@@ -186,9 +189,11 @@ QoS) feature can be used to prevent ``CP
even if they have been enumerated (see :ref:`cpu-pm-qos` in :doc:`cpuidle`).
Setting ``max_cstate`` to 0 causes the ``intel_idle`` initialization to fail.
-The ``noacpi`` module parameter (which is recognized by ``intel_idle`` if the
-kernel has been configured with ACPI support), can be set to make the driver
-ignore the system's ACPI tables entirely (it is unset by default).
+The ``no_acpi`` and ``use_acpi`` module parameters (recognized by ``intel_idle``
+if the kernel has been configured with ACPI support) can be set to make the
+driver ignore the system's ACPI tables entirely or use them for all of the
+recognized processor models, respectively (they both are unset by default and
+``use_acpi`` has no effect if ``no_acpi`` is set).
.. _intel-idle-core-and-package-idle-states:
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/2] intel_idle: Introduce 'states_off' module parameter
2020-01-30 14:44 [PATCH 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
2020-01-30 14:46 ` [PATCH 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
@ 2020-01-30 14:47 ` Rafael J. Wysocki
2020-01-31 11:07 ` David Laight
2020-02-03 11:13 ` [PATCH v2 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
2 siblings, 1 reply; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-01-30 14:47 UTC (permalink / raw)
To: Linux PM
Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box,
Artem Bityutskiy, Rafael J. Wysocki
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In certain system configurations it may not be desirable to use some
C-states assumed to be available by intel_idle and the driver needs
to be prevented from using them even before the cpuidle sysfs
interface becomes accessible to user space. Currently, the only way
to achieve that is by setting the 'max_cstate' module parameter to a
value lower than the index of the shallowest of the C-states in
question, but that may be overly intrusive, because it effectively
makes all of the idle states deeper than the 'max_cstate' one go
away (and the C-state to avoid may be in the middle of the range
normally regarded as available).
To allow that limitation to be overcome, introduce a new module
parameter called 'states_off' to represent a list of idle states to
be disabled by default in the form of a bitmask and update the
documentation to cover it.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
Documentation/admin-guide/pm/intel_idle.rst | 15 ++++++++++++++-
drivers/idle/intel_idle.c | 24 +++++++++++++++++++++---
2 files changed, 35 insertions(+), 4 deletions(-)
Index: linux-pm/drivers/idle/intel_idle.c
===================================================================
--- linux-pm.orig/drivers/idle/intel_idle.c
+++ linux-pm/drivers/idle/intel_idle.c
@@ -63,6 +63,7 @@ static struct cpuidle_driver intel_idle_
};
/* intel_idle.max_cstate=0 disables driver */
static int max_cstate = CPUIDLE_STATE_MAX - 1;
+static unsigned int disabled_states_mask;
static unsigned int mwait_substates;
@@ -1234,6 +1235,9 @@ static void __init intel_idle_init_cstat
if (cx->type > ACPI_STATE_C2)
state->flags |= CPUIDLE_FLAG_TLB_FLUSHED;
+ if (disabled_states_mask & BIT(cstate))
+ state->flags |= CPUIDLE_FLAG_OFF;
+
state->enter = intel_idle;
state->enter_s2idle = intel_idle_s2idle;
}
@@ -1466,9 +1470,10 @@ static void __init intel_idle_init_cstat
/* Structure copy. */
drv->states[drv->state_count] = cpuidle_state_table[cstate];
- if ((icpu->use_acpi || use_acpi) &&
- intel_idle_off_by_default(mwait_hint) &&
- !(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE))
+ if ((disabled_states_mask & BIT(drv->state_count)) ||
+ ((icpu->use_acpi || use_acpi) &&
+ intel_idle_off_by_default(mwait_hint) &&
+ !(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE)))
drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF;
drv->state_count++;
@@ -1487,6 +1492,10 @@ static void __init intel_idle_init_cstat
static void __init intel_idle_cpuidle_driver_init(struct cpuidle_driver *drv)
{
cpuidle_poll_state_init(drv);
+
+ if (disabled_states_mask & BIT(0))
+ drv->states[0].flags |= CPUIDLE_FLAG_OFF;
+
drv->state_count = 1;
if (icpu)
@@ -1667,3 +1676,12 @@ device_initcall(intel_idle_init);
* is the easiest way (currently) to continue doing that.
*/
module_param(max_cstate, int, 0444);
+/*
+ * The positions of the bits that are set in the two's complement representation
+ * of this value are the indices of the idle states to be disabled by default
+ * (as reflected by the names of the corresponding idle state directories in
+ * sysfs, "state0", "state1" ... "state<i>" ..., where <i> is the index of the
+ * given state).
+ */
+module_param_named(states_off, disabled_states_mask, uint, 0444);
+MODULE_PARM_DESC(states_off, "Mask of disabled idle states");
Index: linux-pm/Documentation/admin-guide/pm/intel_idle.rst
===================================================================
--- linux-pm.orig/Documentation/admin-guide/pm/intel_idle.rst
+++ linux-pm/Documentation/admin-guide/pm/intel_idle.rst
@@ -168,7 +168,7 @@ and ``idle=nomwait``. If any of them is
``MWAIT`` instruction is not allowed to be used, so the initialization of
``intel_idle`` will fail.
-Apart from that there are three module parameters recognized by ``intel_idle``
+Apart from that there are four module parameters recognized by ``intel_idle``
itself that can be set via the kernel command line (they cannot be updated via
sysfs, so that is the only way to change their values).
@@ -195,6 +195,19 @@ driver ignore the system's ACPI tables e
recognized processor models, respectively (they both are unset by default and
``use_acpi`` has no effect if ``no_acpi`` is set).
+The value of the ``states_off`` module parameter (0 by default) represents a
+list of idle states to be disabled by default in the form of a bitmask. Namely,
+the positions of the bits that are set in the two's complement representation of
+that value are the indices of idle states to be disabled by default (as
+reflected by the names of the corresponding idle state directories in ``sysfs``,
+:file:`state0`, :file:`state1` ... :file:`state<i>` ..., where ``<i>`` is the
+index of the given idle state; see :ref:`idle-states-representation` in
+:doc:`cpuidle`). For example, if ``states_off`` is equal to 3, the driver will
+disable idle states 0 and 1 by default, and if it is equal to 8, idle state 3
+will be disabled by default and so on (bit positions beyond the maximum idle
+state index are ignored). The idle states disabled this way can be enabled (on
+a per-CPU basis) from user space via ``sysfs``.
+
.. _intel-idle-core-and-package-idle-states:
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: [PATCH 2/2] intel_idle: Introduce 'states_off' module parameter
2020-01-30 14:47 ` [PATCH 2/2] intel_idle: Introduce 'states_off' " Rafael J. Wysocki
@ 2020-01-31 11:07 ` David Laight
2020-01-31 11:23 ` Rafael J. Wysocki
2020-01-31 11:24 ` Artem Bityutskiy
0 siblings, 2 replies; 12+ messages in thread
From: David Laight @ 2020-01-31 11:07 UTC (permalink / raw)
To: 'Rafael J. Wysocki', Linux PM
Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box,
Artem Bityutskiy, Rafael J. Wysocki
From: Rafael J. Wysocki
> Sent: 30 January 2020 14:47
>
> In certain system configurations it may not be desirable to use some
> C-states assumed to be available by intel_idle and the driver needs
> to be prevented from using them even before the cpuidle sysfs
> interface becomes accessible to user space. Currently, the only way
> to achieve that is by setting the 'max_cstate' module parameter to a
> value lower than the index of the shallowest of the C-states in
> question, but that may be overly intrusive, because it effectively
> makes all of the idle states deeper than the 'max_cstate' one go
> away (and the C-state to avoid may be in the middle of the range
> normally regarded as available).
>
> To allow that limitation to be overcome, introduce a new module
> parameter called 'states_off' to represent a list of idle states to
> be disabled by default in the form of a bitmask and update the
> documentation to cover it.
The problem I see is that there are (at least) 3 different ways of
referring to the C-States:
1) The state names, C1, C1E, C3, C7 etc.
I'm not sure these are visible outside intel_idle.c.
2) The maximum allowed latency in us.
3) The index into the cpu-dependant tables in intel_idle.c.
Boot parameters that set 3 are completely hopeless for normal
users. The C-state names might be - but they aren't documented.
Unless you know exactly which cpu table is being used the
only constraint a user can request is the latency.
(I've had the misfortune to read intel_idle.c in the last week.
Almost impenetrable TLA ridden uncommented code.)
...
> + * The positions of the bits that are set in the two's complement representation
> + * of this value are the indices of the idle states to be disabled by default
> + * (as reflected by the names of the corresponding idle state directories in
> + * sysfs, "state0", "state1" ... "state<i>" ..., where <i> is the index of the
> + * given state).
What has 'two's complement' got to do with anything?
...
> +The value of the ``states_off`` module parameter (0 by default) represents a
> +list of idle states to be disabled by default in the form of a bitmask. Namely,
> +the positions of the bits that are set in the two's complement representation of
> +that value are the indices of idle states to be disabled by default (as
> +reflected by the names of the corresponding idle state directories in ``sysfs``,
> +:file:`state0`, :file:`state1` ... :file:`state<i>` ..., where ``<i>`` is the
> +index of the given idle state; see :ref:`idle-states-representation` in
> +:doc:`cpuidle`). For example, if ``states_off`` is equal to 3, the driver will
> +disable idle states 0 and 1 by default, and if it is equal to 8, idle state 3
> +will be disabled by default and so on (bit positions beyond the maximum idle
> +state index are ignored). The idle states disabled this way can be enabled (on
> +a per-CPU basis) from user space via ``sysfs``.
A few line breaks would make that easier to read.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] intel_idle: Introduce 'states_off' module parameter
2020-01-31 11:07 ` David Laight
@ 2020-01-31 11:23 ` Rafael J. Wysocki
2020-01-31 11:24 ` Artem Bityutskiy
1 sibling, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-01-31 11:23 UTC (permalink / raw)
To: David Laight
Cc: Rafael J. Wysocki, Linux PM, Linux ACPI, LKML, Len Brown,
Zhang Rui, David Box, Artem Bityutskiy, Rafael J. Wysocki
On Fri, Jan 31, 2020 at 12:07 PM David Laight <David.Laight@aculab.com> wrote:
>
> From: Rafael J. Wysocki
> > Sent: 30 January 2020 14:47
> >
> > In certain system configurations it may not be desirable to use some
> > C-states assumed to be available by intel_idle and the driver needs
> > to be prevented from using them even before the cpuidle sysfs
> > interface becomes accessible to user space. Currently, the only way
> > to achieve that is by setting the 'max_cstate' module parameter to a
> > value lower than the index of the shallowest of the C-states in
> > question, but that may be overly intrusive, because it effectively
> > makes all of the idle states deeper than the 'max_cstate' one go
> > away (and the C-state to avoid may be in the middle of the range
> > normally regarded as available).
> >
> > To allow that limitation to be overcome, introduce a new module
> > parameter called 'states_off' to represent a list of idle states to
> > be disabled by default in the form of a bitmask and update the
> > documentation to cover it.
>
> The problem I see is that there are (at least) 3 different ways of
> referring to the C-States:
So the mask is not referring to the C-states in the first place.
> 1) The state names, C1, C1E, C3, C7 etc.
> I'm not sure these are visible outside intel_idle.c.
Yes, they are, in sysfs.
> 2) The maximum allowed latency in us.
> 3) The index into the cpu-dependant tables in intel_idle.c.
>
> Boot parameters that set 3 are completely hopeless for normal
> users. The C-state names might be - but they aren't documented.
>
> Unless you know exactly which cpu table is being used the
> only constraint a user can request is the latency.
So this mask refers to the idle states numbering in sysfs, as stated
in the documentation update. That covers state0 which is not a
C-state too.
> (I've had the misfortune to read intel_idle.c in the last week.
> Almost impenetrable TLA ridden uncommented code.)
I have some patches to improve that, will post them after this is settled.
> ...
> > + * The positions of the bits that are set in the two's complement representation
> > + * of this value are the indices of the idle states to be disabled by default
> > + * (as reflected by the names of the corresponding idle state directories in
> > + * sysfs, "state0", "state1" ... "state<i>" ..., where <i> is the index of the
> > + * given state).
>
> What has 'two's complement' got to do with anything?
Well, it is the representation in which bits are used. Kind of as
opposed to decimal or hex digits. But I can replace that phrase with
"bits that are set in this number" easily enough.
> ...
> > +The value of the ``states_off`` module parameter (0 by default) represents a
> > +list of idle states to be disabled by default in the form of a bitmask. Namely,
> > +the positions of the bits that are set in the two's complement representation of
> > +that value are the indices of idle states to be disabled by default (as
> > +reflected by the names of the corresponding idle state directories in ``sysfs``,
> > +:file:`state0`, :file:`state1` ... :file:`state<i>` ..., where ``<i>`` is the
> > +index of the given idle state; see :ref:`idle-states-representation` in
> > +:doc:`cpuidle`). For example, if ``states_off`` is equal to 3, the driver will
> > +disable idle states 0 and 1 by default, and if it is equal to 8, idle state 3
> > +will be disabled by default and so on (bit positions beyond the maximum idle
> > +state index are ignored). The idle states disabled this way can be enabled (on
> > +a per-CPU basis) from user space via ``sysfs``.
>
> A few line breaks would make that easier to read.
Fair enough.
Thanks!
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] intel_idle: Introduce 'states_off' module parameter
2020-01-31 11:07 ` David Laight
2020-01-31 11:23 ` Rafael J. Wysocki
@ 2020-01-31 11:24 ` Artem Bityutskiy
2020-01-31 11:54 ` David Laight
1 sibling, 1 reply; 12+ messages in thread
From: Artem Bityutskiy @ 2020-01-31 11:24 UTC (permalink / raw)
To: David Laight, 'Rafael J. Wysocki', Linux PM
Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box, Rafael J. Wysocki
On Fri, 2020-01-31 at 11:07 +0000, David Laight wrote:
> Unless you know exactly which cpu table is being used the
> only constraint a user can request is the latency.
Hi David,
in all my use-cases I always know what is the CPU I am dealing with and
what are the C-states. Simply because in my view they are always CPU-
dependent in terms of what they do and how are they named.
What you say sounds to me like you would want to disable some C-states
without knowing anything (or much) about the CPU you are dealing with
and the C-state names.
If so, could you please share examples of such use-cases?
Thanks!
--
Best Regards,
Artem Bityutskiy
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: [PATCH 2/2] intel_idle: Introduce 'states_off' module parameter
2020-01-31 11:24 ` Artem Bityutskiy
@ 2020-01-31 11:54 ` David Laight
2020-01-31 12:03 ` Rafael J. Wysocki
0 siblings, 1 reply; 12+ messages in thread
From: David Laight @ 2020-01-31 11:54 UTC (permalink / raw)
To: 'artem.bityutskiy@linux.intel.com',
'Rafael J. Wysocki',
Linux PM
Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box, Rafael J. Wysocki
From: Artem Bityutskiy >
> Sent: 31 January 2020 11:24
> On Fri, 2020-01-31 at 11:07 +0000, David Laight wrote:
> > Unless you know exactly which cpu table is being used the
> > only constraint a user can request is the latency.
>
> Hi David,
>
> in all my use-cases I always know what is the CPU I am dealing with and
> what are the C-states. Simply because in my view they are always CPU-
> dependent in terms of what they do and how are they named.
>
> What you say sounds to me like you would want to disable some C-states
> without knowing anything (or much) about the CPU you are dealing with
> and the C-state names.
>
> If so, could you please share examples of such use-cases?
Dunno, but clearly you want to disable (say) C3 while leaving C6
enabled.
I was trying to find why it was taking 600+us for a RT process
to get rescheduled when it had only been sleeping for a few us.
I found where it was sleeping, but that didn't help at all.
Someone pointed me at a 'random' pdf that referred to /dev/cpu_dma_latency.
Setting that to a small value (eg 20) helps no end.
But there are no references in the code or man pages to that.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] intel_idle: Introduce 'states_off' module parameter
2020-01-31 11:54 ` David Laight
@ 2020-01-31 12:03 ` Rafael J. Wysocki
0 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-01-31 12:03 UTC (permalink / raw)
To: David Laight
Cc: artem.bityutskiy, Rafael J. Wysocki, Linux PM, Linux ACPI, LKML,
Len Brown, Zhang Rui, David Box, Rafael J. Wysocki
On Fri, Jan 31, 2020 at 12:54 PM David Laight <David.Laight@aculab.com> wrote:
>
> From: Artem Bityutskiy >
> > Sent: 31 January 2020 11:24
> > On Fri, 2020-01-31 at 11:07 +0000, David Laight wrote:
> > > Unless you know exactly which cpu table is being used the
> > > only constraint a user can request is the latency.
> >
> > Hi David,
> >
> > in all my use-cases I always know what is the CPU I am dealing with and
> > what are the C-states. Simply because in my view they are always CPU-
> > dependent in terms of what they do and how are they named.
> >
> > What you say sounds to me like you would want to disable some C-states
> > without knowing anything (or much) about the CPU you are dealing with
> > and the C-state names.
> >
> > If so, could you please share examples of such use-cases?
>
> Dunno, but clearly you want to disable (say) C3 while leaving C6
> enabled.
>
> I was trying to find why it was taking 600+us for a RT process
> to get rescheduled when it had only been sleeping for a few us.
>
> I found where it was sleeping, but that didn't help at all.
> Someone pointed me at a 'random' pdf that referred to /dev/cpu_dma_latency.
> Setting that to a small value (eg 20) helps no end.
> But there are no references in the code or man pages to that.
There is a piece of kernel documentation regarding it, however:
https://www.kernel.org/doc/html/latest/admin-guide/pm/cpuidle.html#cpu-pm-qos
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] intel_idle: Introduce 'use_acpi' module parameter
2020-01-30 14:46 ` [PATCH 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
@ 2020-02-02 14:23 ` kbuild test robot
0 siblings, 0 replies; 12+ messages in thread
From: kbuild test robot @ 2020-02-02 14:23 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: kbuild-all, Linux PM, Linux ACPI, LKML, Len Brown, Zhang Rui,
David Box, Artem Bityutskiy, Rafael J. Wysocki
[-- Attachment #1: Type: text/plain, Size: 3745 bytes --]
Hi "Rafael,
I love your patch! Yet something to improve:
[auto build test ERROR on linus/master]
[also build test ERROR on next-20200130]
[cannot apply to acpi/next linux/master v5.5]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Rafael-J-Wysocki/intel_idle-Two-new-module-parameters/20200202-161814
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 94f2630b18975bb56eee5d1a36371db967643479
config: i386-randconfig-a001-20200202 (attached as .config)
compiler: gcc-6 (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>
All error/warnings (new ones prefixed by >>):
drivers//idle/intel_idle.c: In function 'intel_idle_init_cstates_icpu':
>> drivers//idle/intel_idle.c:1265:18: error: expected identifier before '(' token
#define use_acpi (false)
^
>> drivers//idle/intel_idle.c:1469:14: note: in expansion of macro 'use_acpi'
if ((icpu->use_acpi || use_acpi) &&
^~~~~~~~
drivers//idle/intel_idle.c: In function 'intel_idle_init':
>> drivers//idle/intel_idle.c:1265:18: error: expected identifier before '(' token
#define use_acpi (false)
^
drivers//idle/intel_idle.c:1617:13: note: in expansion of macro 'use_acpi'
if (icpu->use_acpi || use_acpi)
^~~~~~~~
vim +/use_acpi +1469 drivers//idle/intel_idle.c
1427
1428 static void __init intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
1429 {
1430 int cstate;
1431
1432 switch (boot_cpu_data.x86_model) {
1433 case INTEL_FAM6_IVYBRIDGE_X:
1434 ivt_idle_state_table_update();
1435 break;
1436 case INTEL_FAM6_ATOM_GOLDMONT:
1437 case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
1438 bxt_idle_state_table_update();
1439 break;
1440 case INTEL_FAM6_SKYLAKE:
1441 sklh_idle_state_table_update();
1442 break;
1443 }
1444
1445 for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) {
1446 unsigned int mwait_hint;
1447
1448 if (intel_idle_max_cstate_reached(cstate))
1449 break;
1450
1451 if (!cpuidle_state_table[cstate].enter &&
1452 !cpuidle_state_table[cstate].enter_s2idle)
1453 break;
1454
1455 /* If marked as unusable, skip this state. */
1456 if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_UNUSABLE) {
1457 pr_debug("state %s is disabled\n",
1458 cpuidle_state_table[cstate].name);
1459 continue;
1460 }
1461
1462 mwait_hint = flg2MWAIT(cpuidle_state_table[cstate].flags);
1463 if (!intel_idle_verify_cstate(mwait_hint))
1464 continue;
1465
1466 /* Structure copy. */
1467 drv->states[drv->state_count] = cpuidle_state_table[cstate];
1468
> 1469 if ((icpu->use_acpi || use_acpi) &&
1470 intel_idle_off_by_default(mwait_hint) &&
1471 !(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE))
1472 drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF;
1473
1474 drv->state_count++;
1475 }
1476
1477 if (icpu->byt_auto_demotion_disable_flag) {
1478 wrmsrl(MSR_CC6_DEMOTION_POLICY_CONFIG, 0);
1479 wrmsrl(MSR_MC6_DEMOTION_POLICY_CONFIG, 0);
1480 }
1481 }
1482
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 39486 bytes --]
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 0/2] intel_idle: Two new module parameters
2020-01-30 14:44 [PATCH 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
2020-01-30 14:46 ` [PATCH 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
2020-01-30 14:47 ` [PATCH 2/2] intel_idle: Introduce 'states_off' " Rafael J. Wysocki
@ 2020-02-03 11:13 ` Rafael J. Wysocki
2020-02-03 11:15 ` [PATCH v2 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
2020-02-03 11:19 ` [PATCH v2 2/2] intel_idle: Introduce 'states_off' " Rafael J. Wysocki
2 siblings, 2 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-02-03 11:13 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Linux PM, Linux ACPI, LKML, Len Brown, Zhang Rui, David Box,
Artem Bityutskiy
On Thursday, January 30, 2020 3:44:40 PM CET Rafael J. Wysocki wrote:
> Hi All,
>
> This series adds to module parameters to intel_idle (on top of the material
> already in the mainline since Monday), one to make it use ACPI even if that
> is not enabled for the processor model present in the system and the other
> one to allow passing a list of idle states to disable by default to it.
>
> Details in the patch changelogs.
This is a small update of the original set fixing a build issue in the first
patch and addressing a couple of review comments in the second one.
Thanks,
Rafael
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 1/2] intel_idle: Introduce 'use_acpi' module parameter
2020-02-03 11:13 ` [PATCH v2 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
@ 2020-02-03 11:15 ` Rafael J. Wysocki
2020-02-03 11:19 ` [PATCH v2 2/2] intel_idle: Introduce 'states_off' " Rafael J. Wysocki
1 sibling, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-02-03 11:15 UTC (permalink / raw)
To: Linux PM
Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box, Artem Bityutskiy
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
For diagnostics, it is generally useful to be able to make intel_idle
take the system's ACPI tables into consideration even if that is not
required for the processor model in there, so introduce a new module
parameter, 'use_acpi', to make that happen and update the documentation
to cover it.
While at it, fix the 'no_acpi' module parameter name in the
documentation.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
-> v2:
* Fix build for CONFIG_ACPI unset.
---
Documentation/admin-guide/pm/intel_idle.rst | 13 +++++++++----
drivers/idle/intel_idle.c | 11 +++++++++--
2 files changed, 18 insertions(+), 6 deletions(-)
Index: linux-pm/drivers/idle/intel_idle.c
===================================================================
--- linux-pm.orig/drivers/idle/intel_idle.c
+++ linux-pm/drivers/idle/intel_idle.c
@@ -1131,6 +1131,10 @@ static bool no_acpi __read_mostly;
module_param(no_acpi, bool, 0444);
MODULE_PARM_DESC(no_acpi, "Do not use ACPI _CST for building the idle states list");
+static bool force_use_acpi __read_mostly; /* No effect if no_acpi is set. */
+module_param_named(use_acpi, force_use_acpi, bool, 0444);
+MODULE_PARM_DESC(use_acpi, "Use ACPI _CST for building the idle states list");
+
static struct acpi_processor_power acpi_state_table __initdata;
/**
@@ -1258,6 +1262,8 @@ static bool __init intel_idle_off_by_def
return true;
}
#else /* !CONFIG_ACPI_PROCESSOR_CSTATE */
+#define force_use_acpi (false)
+
static inline bool intel_idle_acpi_cst_extract(void) { return false; }
static inline void intel_idle_init_cstates_acpi(struct cpuidle_driver *drv) { }
static inline bool intel_idle_off_by_default(u32 mwait_hint) { return false; }
@@ -1460,7 +1466,8 @@ static void __init intel_idle_init_cstat
/* Structure copy. */
drv->states[drv->state_count] = cpuidle_state_table[cstate];
- if (icpu->use_acpi && intel_idle_off_by_default(mwait_hint) &&
+ if ((icpu->use_acpi || force_use_acpi) &&
+ intel_idle_off_by_default(mwait_hint) &&
!(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE))
drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF;
@@ -1607,7 +1614,7 @@ static int __init intel_idle_init(void)
icpu = (const struct idle_cpu *)id->driver_data;
if (icpu) {
cpuidle_state_table = icpu->state_table;
- if (icpu->use_acpi)
+ if (icpu->use_acpi || force_use_acpi)
intel_idle_acpi_cst_extract();
} else if (!intel_idle_acpi_cst_extract()) {
return -ENODEV;
Index: linux-pm/Documentation/admin-guide/pm/intel_idle.rst
===================================================================
--- linux-pm.orig/Documentation/admin-guide/pm/intel_idle.rst
+++ linux-pm/Documentation/admin-guide/pm/intel_idle.rst
@@ -60,6 +60,9 @@ of the system. The former are always us
recognized by ``intel_idle`` and the latter are used if that is required for
the given processor model (which is the case for all server processor models
recognized by ``intel_idle``) or if the processor model is not recognized.
+[There is a module parameter that can be used to make the driver use the ACPI
+tables with any processor model recognized by it; see
+`below <intel-idle-parameters_>`_.]
If the ACPI tables are going to be used for building the list of available idle
states, ``intel_idle`` first looks for a ``_CST`` object under one of the ACPI
@@ -165,7 +168,7 @@ and ``idle=nomwait``. If any of them is
``MWAIT`` instruction is not allowed to be used, so the initialization of
``intel_idle`` will fail.
-Apart from that there are two module parameters recognized by ``intel_idle``
+Apart from that there are three module parameters recognized by ``intel_idle``
itself that can be set via the kernel command line (they cannot be updated via
sysfs, so that is the only way to change their values).
@@ -186,9 +189,11 @@ QoS) feature can be used to prevent ``CP
even if they have been enumerated (see :ref:`cpu-pm-qos` in :doc:`cpuidle`).
Setting ``max_cstate`` to 0 causes the ``intel_idle`` initialization to fail.
-The ``noacpi`` module parameter (which is recognized by ``intel_idle`` if the
-kernel has been configured with ACPI support), can be set to make the driver
-ignore the system's ACPI tables entirely (it is unset by default).
+The ``no_acpi`` and ``use_acpi`` module parameters (recognized by ``intel_idle``
+if the kernel has been configured with ACPI support) can be set to make the
+driver ignore the system's ACPI tables entirely or use them for all of the
+recognized processor models, respectively (they both are unset by default and
+``use_acpi`` has no effect if ``no_acpi`` is set).
.. _intel-idle-core-and-package-idle-states:
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 2/2] intel_idle: Introduce 'states_off' module parameter
2020-02-03 11:13 ` [PATCH v2 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
2020-02-03 11:15 ` [PATCH v2 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
@ 2020-02-03 11:19 ` Rafael J. Wysocki
1 sibling, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2020-02-03 11:19 UTC (permalink / raw)
To: Linux PM
Cc: Linux ACPI, LKML, Len Brown, Zhang Rui, David Box,
Artem Bityutskiy, David Laight
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In certain system configurations it may not be desirable to use some
C-states assumed to be available by intel_idle and the driver needs
to be prevented from using them even before the cpuidle sysfs
interface becomes accessible to user space. Currently, the only way
to achieve that is by setting the 'max_cstate' module parameter to a
value lower than the index of the shallowest of the C-states in
question, but that may be overly intrusive, because it effectively
makes all of the idle states deeper than the 'max_cstate' one go
away (and the C-state to avoid may be in the middle of the range
normally regarded as available).
To allow that limitation to be overcome, introduce a new module
parameter called 'states_off' to represent a list of idle states to
be disabled by default in the form of a bitmask and update the
documentation to cover it.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
-> v2:
* Address a couple of review comments from David Laight.
---
Documentation/admin-guide/pm/intel_idle.rst | 19 ++++++++++++++++++-
drivers/idle/intel_idle.c | 23 ++++++++++++++++++++---
2 files changed, 38 insertions(+), 4 deletions(-)
Index: linux-pm/drivers/idle/intel_idle.c
===================================================================
--- linux-pm.orig/drivers/idle/intel_idle.c
+++ linux-pm/drivers/idle/intel_idle.c
@@ -63,6 +63,7 @@ static struct cpuidle_driver intel_idle_
};
/* intel_idle.max_cstate=0 disables driver */
static int max_cstate = CPUIDLE_STATE_MAX - 1;
+static unsigned int disabled_states_mask;
static unsigned int mwait_substates;
@@ -1234,6 +1235,9 @@ static void __init intel_idle_init_cstat
if (cx->type > ACPI_STATE_C2)
state->flags |= CPUIDLE_FLAG_TLB_FLUSHED;
+ if (disabled_states_mask & BIT(cstate))
+ state->flags |= CPUIDLE_FLAG_OFF;
+
state->enter = intel_idle;
state->enter_s2idle = intel_idle_s2idle;
}
@@ -1466,9 +1470,10 @@ static void __init intel_idle_init_cstat
/* Structure copy. */
drv->states[drv->state_count] = cpuidle_state_table[cstate];
- if ((icpu->use_acpi || force_use_acpi) &&
- intel_idle_off_by_default(mwait_hint) &&
- !(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE))
+ if ((disabled_states_mask & BIT(drv->state_count)) ||
+ ((icpu->use_acpi || force_use_acpi) &&
+ intel_idle_off_by_default(mwait_hint) &&
+ !(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE)))
drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF;
drv->state_count++;
@@ -1487,6 +1492,10 @@ static void __init intel_idle_init_cstat
static void __init intel_idle_cpuidle_driver_init(struct cpuidle_driver *drv)
{
cpuidle_poll_state_init(drv);
+
+ if (disabled_states_mask & BIT(0))
+ drv->states[0].flags |= CPUIDLE_FLAG_OFF;
+
drv->state_count = 1;
if (icpu)
@@ -1667,3 +1676,11 @@ device_initcall(intel_idle_init);
* is the easiest way (currently) to continue doing that.
*/
module_param(max_cstate, int, 0444);
+/*
+ * The positions of the bits that are set in this number are the indices of the
+ * idle states to be disabled by default (as reflected by the names of the
+ * corresponding idle state directories in sysfs, "state0", "state1" ...
+ * "state<i>" ..., where <i> is the index of the given state).
+ */
+module_param_named(states_off, disabled_states_mask, uint, 0444);
+MODULE_PARM_DESC(states_off, "Mask of disabled idle states");
Index: linux-pm/Documentation/admin-guide/pm/intel_idle.rst
===================================================================
--- linux-pm.orig/Documentation/admin-guide/pm/intel_idle.rst
+++ linux-pm/Documentation/admin-guide/pm/intel_idle.rst
@@ -168,7 +168,7 @@ and ``idle=nomwait``. If any of them is
``MWAIT`` instruction is not allowed to be used, so the initialization of
``intel_idle`` will fail.
-Apart from that there are three module parameters recognized by ``intel_idle``
+Apart from that there are four module parameters recognized by ``intel_idle``
itself that can be set via the kernel command line (they cannot be updated via
sysfs, so that is the only way to change their values).
@@ -195,6 +195,23 @@ driver ignore the system's ACPI tables e
recognized processor models, respectively (they both are unset by default and
``use_acpi`` has no effect if ``no_acpi`` is set).
+The value of the ``states_off`` module parameter (0 by default) represents a
+list of idle states to be disabled by default in the form of a bitmask.
+
+Namely, the positions of the bits that are set in the ``states_off`` value are
+the indices of idle states to be disabled by default (as reflected by the names
+of the corresponding idle state directories in ``sysfs``, :file:`state0`,
+:file:`state1` ... :file:`state<i>` ..., where ``<i>`` is the index of the given
+idle state; see :ref:`idle-states-representation` in :doc:`cpuidle`).
+
+For example, if ``states_off`` is equal to 3, the driver will disable idle
+states 0 and 1 by default, and if it is equal to 8, idle state 3 will be
+disabled by default and so on (bit positions beyond the maximum idle state index
+are ignored).
+
+The idle states disabled this way can be enabled (on a per-CPU basis) from user
+space via ``sysfs``.
+
.. _intel-idle-core-and-package-idle-states:
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2020-02-03 11:19 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-30 14:44 [PATCH 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
2020-01-30 14:46 ` [PATCH 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
2020-02-02 14:23 ` kbuild test robot
2020-01-30 14:47 ` [PATCH 2/2] intel_idle: Introduce 'states_off' " Rafael J. Wysocki
2020-01-31 11:07 ` David Laight
2020-01-31 11:23 ` Rafael J. Wysocki
2020-01-31 11:24 ` Artem Bityutskiy
2020-01-31 11:54 ` David Laight
2020-01-31 12:03 ` Rafael J. Wysocki
2020-02-03 11:13 ` [PATCH v2 0/2] intel_idle: Two new module parameters Rafael J. Wysocki
2020-02-03 11:15 ` [PATCH v2 1/2] intel_idle: Introduce 'use_acpi' module parameter Rafael J. Wysocki
2020-02-03 11:19 ` [PATCH v2 2/2] intel_idle: Introduce 'states_off' " Rafael J. Wysocki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).