linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH V2 0/4] cpuidle: global registration of idle states with per-cpu statistics
@ 2011-04-11  7:15 Trinabh Gupta
  2011-04-11  7:16 ` [RFC PATCH V2 1/4] Move dev->last_residency update to driver enter routine; remove dev->last_state Trinabh Gupta
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Trinabh Gupta @ 2011-04-11  7:15 UTC (permalink / raw)
  To: linux-pm, peterz; +Cc: linux-kernel, rnayak, karthik-dp, magnus.damm

The core change is to split the cpuidle_device structure into parts
that can be global and parts that has to remain per-cpu.  The per-cpu
pieces are mostly generic statistics that can be independent of
current running driver. 

Motivation:
* Simplify the cpuidle subsystem framework and have 
  registration/unregistration done by single cpu.

* Have single copy of cpuidle_states structure and thus reduce 
  memory consumption.

* Only in very rare cases asymmetric C-states exist which
  can be handled within the cpuidle driver. Most architectures
  do not have asymmetric C-states.

* References: 
	https://lkml.org/lkml/2011/2/10/37
	https://lkml.org/lkml/2011/3/25/52

The first couple of patches in the series move the statistics accounting 
part to withing cpuidle driver; which ensures correctness in updation
of statistics. Moving statistics within cpuidle driver is also required to
make cpuidle_state not writable and thus global. Third patch splits 
the per-cpu part out of cpuidle_state structure and the fourth patch 
makes the cpuidle_state global. Please refer to 
https://lkml.org/lkml/2011/3/25/52 . Version 1 of the series is at
https://lkml.org/lkml/2011/3/22/161

This patch series applies on top of 2.6.38 and is tested on x86 Nehalem 
system with multiple ACPI C-States.

To Do:

1. This patch series works only for acpi_idle driver. Changes would
  have to be done for other idle drivers i.e. intel_idle, at91_idle_driver,
  davinci_idle_driver, kirkwood_idle_driver, omap3_idle_driver.

2. Make ladder governor follow changed API. Currently it works for menu
  governor only.

Thanks,
-Trinabh

---

Trinabh Gupta (4):
      Single/Global registration of idle states
      Split cpuidle_state structure and move per-cpu statistics fields
      Remove CPUIDLE_FLAG_IGNORE and dev->prepare()
      Move dev->last_residency update to driver enter routine; remove dev->last_state


 drivers/acpi/processor_driver.c  |   18 ----
 drivers/acpi/processor_idle.c    |  184 ++++++++++++++++++++++++++++++--------
 drivers/cpuidle/cpuidle.c        |   80 +++++------------
 drivers/cpuidle/driver.c         |   26 +++++
 drivers/cpuidle/governors/menu.c |   25 +++--
 drivers/cpuidle/sysfs.c          |   18 ++--
 include/linux/cpuidle.h          |   49 ++++++----
 7 files changed, 247 insertions(+), 153 deletions(-)

--

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [RFC PATCH V2 1/4] Move dev->last_residency update to driver enter routine; remove dev->last_state
  2011-04-11  7:15 [RFC PATCH V2 0/4] cpuidle: global registration of idle states with per-cpu statistics Trinabh Gupta
@ 2011-04-11  7:16 ` Trinabh Gupta
  2011-04-11  7:17 ` [RFC PATCH V2 2/4] Remove CPUIDLE_FLAG_IGNORE and dev->prepare() Trinabh Gupta
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Trinabh Gupta @ 2011-04-11  7:16 UTC (permalink / raw)
  To: linux-pm, peterz; +Cc: linux-kernel, rnayak, karthik-dp, magnus.damm

Cpuidle subsystem only suggests the state to enter and does not
guarantee if the suggested state is entered. The actual entered state
may be different because of software or hardware demotion. Current
cpuidle code uses last_state field to capture the actual state entered
and based on that updates the statistics for the state entered.

Ideally the driver enter routine should update the counters,
and it should return the state actually entered rather than the time
spent there. The generic cpuidle code should simply handle where
the counters live in the sysfs namespace, not updating the counters.

Reference:
https://lkml.org/lkml/2011/3/25/52

Signed-off-by: Trinabh Gupta <trinabh@linux.vnet.ibm.com>
---

 drivers/acpi/processor_idle.c    |   81 ++++++++++++++++++++++++++------------
 drivers/cpuidle/cpuidle.c        |   27 +++++--------
 drivers/cpuidle/governors/menu.c |    7 ++-
 include/linux/cpuidle.h          |    7 +--
 4 files changed, 73 insertions(+), 49 deletions(-)

diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index d615b7d..00712a7 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -741,22 +741,24 @@ static inline void acpi_idle_do_entry(struct acpi_processor_cx *cx)
 /**
  * acpi_idle_enter_c1 - enters an ACPI C1 state-type
  * @dev: the target CPU
- * @state: the state data
+ * @index: index of target state
  *
  * This is equivalent to the HALT instruction.
  */
 static int acpi_idle_enter_c1(struct cpuidle_device *dev,
-			      struct cpuidle_state *state)
+				int index)
 {
 	ktime_t  kt1, kt2;
 	s64 idle_time;
 	struct acpi_processor *pr;
+	struct cpuidle_state *state = &dev->states[index];
 	struct acpi_processor_cx *cx = cpuidle_get_statedata(state);
 
 	pr = __this_cpu_read(processors);
+	dev->last_residency = 0;
 
 	if (unlikely(!pr))
-		return 0;
+		return -EINVAL;
 
 	local_irq_disable();
 
@@ -764,7 +766,7 @@ static int acpi_idle_enter_c1(struct cpuidle_device *dev,
 	if (acpi_idle_suspend) {
 		local_irq_enable();
 		cpu_relax();
-		return 0;
+		return -EINVAL;
 	}
 
 	lapic_timer_state_broadcast(pr, cx, 1);
@@ -773,37 +775,48 @@ static int acpi_idle_enter_c1(struct cpuidle_device *dev,
 	kt2 = ktime_get_real();
 	idle_time =  ktime_to_us(ktime_sub(kt2, kt1));
 
+	/* Update device last_residency and state counters*/
+	dev->last_residency = (int)idle_time;
+	dev->states[index].time += (unsigned long long)dev->last_residency;
+	dev->states[index].usage++;
+
 	local_irq_enable();
 	cx->usage++;
 	lapic_timer_state_broadcast(pr, cx, 0);
 
-	return idle_time;
+	return index;
 }
 
 /**
  * acpi_idle_enter_simple - enters an ACPI state without BM handling
  * @dev: the target CPU
- * @state: the state data
+ * @index: the index of suggested state
  */
 static int acpi_idle_enter_simple(struct cpuidle_device *dev,
-				  struct cpuidle_state *state)
+				int index)
 {
 	struct acpi_processor *pr;
+	struct cpuidle_state *state = &dev->states[index];
 	struct acpi_processor_cx *cx = cpuidle_get_statedata(state);
 	ktime_t  kt1, kt2;
 	s64 idle_time_ns;
 	s64 idle_time;
 
 	pr = __this_cpu_read(processors);
+	dev->last_residency = 0;
 
 	if (unlikely(!pr))
-		return 0;
-
-	if (acpi_idle_suspend)
-		return(acpi_idle_enter_c1(dev, state));
+		return -EINVAL;
 
 	local_irq_disable();
 
+	if (acpi_idle_suspend) {
+		local_irq_enable();
+		cpu_relax();
+		return -EINVAL;
+	}
+
+
 	if (cx->entry_method != ACPI_CSTATE_FFH) {
 		current_thread_info()->status &= ~TS_POLLING;
 		/*
@@ -815,7 +828,7 @@ static int acpi_idle_enter_simple(struct cpuidle_device *dev,
 		if (unlikely(need_resched())) {
 			current_thread_info()->status |= TS_POLLING;
 			local_irq_enable();
-			return 0;
+			return -EINVAL;
 		}
 	}
 
@@ -837,6 +850,11 @@ static int acpi_idle_enter_simple(struct cpuidle_device *dev,
 	idle_time = idle_time_ns;
 	do_div(idle_time, NSEC_PER_USEC);
 
+	/* Update device last_residency and state counters*/
+	dev->last_residency = (int)idle_time;
+	dev->states[index].time += (unsigned long long)dev->last_residency;
+	dev->states[index].usage++;
+
 	/* Tell the scheduler how much we idled: */
 	sched_clock_idle_wakeup_event(idle_time_ns);
 
@@ -848,7 +866,7 @@ static int acpi_idle_enter_simple(struct cpuidle_device *dev,
 
 	lapic_timer_state_broadcast(pr, cx, 0);
 	cx->time += idle_time;
-	return idle_time;
+	return index;
 }
 
 static int c3_cpu_count;
@@ -857,14 +875,15 @@ static DEFINE_SPINLOCK(c3_lock);
 /**
  * acpi_idle_enter_bm - enters C3 with proper BM handling
  * @dev: the target CPU
- * @state: the state data
+ * @index: the index of suggested state
  *
  * If BM is detected, the deepest non-C3 idle state is entered instead.
  */
 static int acpi_idle_enter_bm(struct cpuidle_device *dev,
-			      struct cpuidle_state *state)
+				int index)
 {
 	struct acpi_processor *pr;
+	struct cpuidle_state *state = &dev->states[index];
 	struct acpi_processor_cx *cx = cpuidle_get_statedata(state);
 	ktime_t  kt1, kt2;
 	s64 idle_time_ns;
@@ -872,22 +891,26 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
 
 
 	pr = __this_cpu_read(processors);
+	dev->last_residency = 0;
 
 	if (unlikely(!pr))
-		return 0;
+		return -EINVAL;
 
-	if (acpi_idle_suspend)
-		return(acpi_idle_enter_c1(dev, state));
+
+	if (acpi_idle_suspend) {
+		cpu_relax();
+		return -EINVAL;
+	}
 
 	if (!cx->bm_sts_skip && acpi_idle_bm_check()) {
-		if (dev->safe_state) {
-			dev->last_state = dev->safe_state;
-			return dev->safe_state->enter(dev, dev->safe_state);
+		if (dev->safe_state_index >= 0) {
+			return dev->states[dev->safe_state_index].enter(dev,
+						dev->safe_state_index);
 		} else {
 			local_irq_disable();
 			acpi_safe_halt();
 			local_irq_enable();
-			return 0;
+			return -EINVAL;
 		}
 	}
 
@@ -904,7 +927,7 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
 		if (unlikely(need_resched())) {
 			current_thread_info()->status |= TS_POLLING;
 			local_irq_enable();
-			return 0;
+			return -EINVAL;
 		}
 	}
 
@@ -954,6 +977,11 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
 	idle_time = idle_time_ns;
 	do_div(idle_time, NSEC_PER_USEC);
 
+	/* Update device last_residency and state counters*/
+	dev->last_residency = (int)idle_time;
+	dev->states[index].time += (unsigned long long)dev->last_residency;
+	dev->states[index].usage++;
+
 	/* Tell the scheduler how much we idled: */
 	sched_clock_idle_wakeup_event(idle_time_ns);
 
@@ -965,7 +993,7 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
 
 	lapic_timer_state_broadcast(pr, cx, 0);
 	cx->time += idle_time;
-	return idle_time;
+	return index;
 }
 
 struct cpuidle_driver acpi_idle_driver = {
@@ -992,6 +1020,7 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 	}
 
 	dev->cpu = pr->id;
+	dev->safe_state_index = -1;
 	for (i = 0; i < CPUIDLE_STATE_MAX; i++) {
 		dev->states[i].name[0] = '\0';
 		dev->states[i].desc[0] = '\0';
@@ -1027,13 +1056,13 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 				state->flags |= CPUIDLE_FLAG_TIME_VALID;
 
 			state->enter = acpi_idle_enter_c1;
-			dev->safe_state = state;
+			dev->safe_state_index = count;
 			break;
 
 			case ACPI_STATE_C2:
 			state->flags |= CPUIDLE_FLAG_TIME_VALID;
 			state->enter = acpi_idle_enter_simple;
-			dev->safe_state = state;
+			dev->safe_state_index = count;
 			break;
 
 			case ACPI_STATE_C3:
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index bf50924..355b078 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -51,7 +51,7 @@ static void cpuidle_idle_call(void)
 {
 	struct cpuidle_device *dev = __this_cpu_read(cpuidle_devices);
 	struct cpuidle_state *target_state;
-	int next_state;
+	int next_state, entered_state;
 
 	/* check if the device is ready */
 	if (!dev || !dev->enabled) {
@@ -94,26 +94,18 @@ static void cpuidle_idle_call(void)
 
 	target_state = &dev->states[next_state];
 
-	/* enter the state and update stats */
-	dev->last_state = target_state;
-
+	/* Is using next_state here correct?? */
 	trace_power_start(POWER_CSTATE, next_state, dev->cpu);
 	trace_cpu_idle(next_state, dev->cpu);
 
-	dev->last_residency = target_state->enter(dev, target_state);
+	entered_state = target_state->enter(dev, next_state);
 
 	trace_power_end(dev->cpu);
 	trace_cpu_idle(PWR_EVENT_EXIT, dev->cpu);
 
-	if (dev->last_state)
-		target_state = dev->last_state;
-
-	target_state->time += (unsigned long long)dev->last_residency;
-	target_state->usage++;
-
 	/* give the governor an opportunity to reflect on the outcome */
 	if (cpuidle_curr_governor->reflect)
-		cpuidle_curr_governor->reflect(dev);
+		cpuidle_curr_governor->reflect(dev, entered_state);
 }
 
 /**
@@ -162,11 +154,10 @@ void cpuidle_resume_and_unlock(void)
 EXPORT_SYMBOL_GPL(cpuidle_resume_and_unlock);
 
 #ifdef CONFIG_ARCH_HAS_CPU_RELAX
-static int poll_idle(struct cpuidle_device *dev, struct cpuidle_state *st)
+static int poll_idle(struct cpuidle_device *dev, int index)
 {
 	ktime_t	t1, t2;
 	s64 diff;
-	int ret;
 
 	t1 = ktime_get();
 	local_irq_enable();
@@ -178,8 +169,11 @@ static int poll_idle(struct cpuidle_device *dev, struct cpuidle_state *st)
 	if (diff > INT_MAX)
 		diff = INT_MAX;
 
-	ret = (int) diff;
-	return ret;
+	dev->last_residency = (int) diff;
+	dev->states[index].time += (unsigned long long)dev->last_residency;
+	dev->states[index].usage++;
+
+	return index;
 }
 
 static void poll_idle_init(struct cpuidle_device *dev)
@@ -238,7 +232,6 @@ int cpuidle_enable_device(struct cpuidle_device *dev)
 		dev->states[i].time = 0;
 	}
 	dev->last_residency = 0;
-	dev->last_state = NULL;
 
 	smp_wmb();
 
diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index f508690..70d9982 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -308,14 +308,17 @@ static int menu_select(struct cpuidle_device *dev)
 /**
  * menu_reflect - records that data structures need update
  * @dev: the CPU
+ * @index: the index of actual entered state
  *
  * NOTE: it's important to be fast here because this operation will add to
  *       the overall exit latency.
  */
-static void menu_reflect(struct cpuidle_device *dev)
+static void menu_reflect(struct cpuidle_device *dev, int index)
 {
 	struct menu_device *data = &__get_cpu_var(menu_devices);
-	data->needs_update = 1;
+	data->last_state_idx = index;
+	if (index >= 0)
+		data->needs_update = 1;
 }
 
 /**
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 36719ea..45eef60 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -42,7 +42,7 @@ struct cpuidle_state {
 	unsigned long long	time; /* in US */
 
 	int (*enter)	(struct cpuidle_device *dev,
-			 struct cpuidle_state *state);
+			int index);
 };
 
 /* Idle State Flags */
@@ -87,13 +87,12 @@ struct cpuidle_device {
 	int			state_count;
 	struct cpuidle_state	states[CPUIDLE_STATE_MAX];
 	struct cpuidle_state_kobj *kobjs[CPUIDLE_STATE_MAX];
-	struct cpuidle_state	*last_state;
 
 	struct list_head 	device_list;
 	struct kobject		kobj;
 	struct completion	kobj_unregister;
 	void			*governor_data;
-	struct cpuidle_state	*safe_state;
+	int			safe_state_index;
 
 	int (*prepare)		(struct cpuidle_device *dev);
 };
@@ -165,7 +164,7 @@ struct cpuidle_governor {
 	void (*disable)		(struct cpuidle_device *dev);
 
 	int  (*select)		(struct cpuidle_device *dev);
-	void (*reflect)		(struct cpuidle_device *dev);
+	void (*reflect)		(struct cpuidle_device *dev, int index);
 
 	struct module 		*owner;
 };


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [RFC PATCH V2 2/4] Remove CPUIDLE_FLAG_IGNORE and dev->prepare()
  2011-04-11  7:15 [RFC PATCH V2 0/4] cpuidle: global registration of idle states with per-cpu statistics Trinabh Gupta
  2011-04-11  7:16 ` [RFC PATCH V2 1/4] Move dev->last_residency update to driver enter routine; remove dev->last_state Trinabh Gupta
@ 2011-04-11  7:17 ` Trinabh Gupta
  2011-04-11  7:17 ` [RFC PATCH V2 3/4] Split cpuidle_state structure and move per-cpu statistics fields Trinabh Gupta
  2011-04-11  7:17 ` [RFC PATCH V2 4/4] Single/Global registration of idle states Trinabh Gupta
  3 siblings, 0 replies; 5+ messages in thread
From: Trinabh Gupta @ 2011-04-11  7:17 UTC (permalink / raw)
  To: linux-pm, peterz; +Cc: linux-kernel, rnayak, karthik-dp, magnus.damm

The cpuidle_device->prepare() mechanism causes updates to the
cpuidle_state[].flags, setting and clearing CPUIDLE_FLAG_IGNORE
to tell the governor not to chose a state on a per-cpu basis at
run-time. State demotion is now handled by the driver and it returns
the actual state entered. Hence, this mechanism is not required.
Also this removes per-cpu flags from cpuidle_state enabling
it to be made global.

Signed-off-by: Trinabh Gupta <trinabh@linux.vnet.ibm.com>
---

 drivers/cpuidle/cpuidle.c        |   10 ----------
 drivers/cpuidle/governors/menu.c |    2 --
 include/linux/cpuidle.h          |    3 ---
 3 files changed, 0 insertions(+), 15 deletions(-)

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 355b078..92a6216 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -75,16 +75,6 @@ static void cpuidle_idle_call(void)
 	hrtimer_peek_ahead_timers();
 #endif
 
-	/*
-	 * Call the device's prepare function before calling the
-	 * governor's select function.  ->prepare gives the device's
-	 * cpuidle driver a chance to update any dynamic information
-	 * of its cpuidle states for the current idle period, e.g.
-	 * state availability, latencies, residencies, etc.
-	 */
-	if (dev->prepare)
-		dev->prepare(dev);
-
 	/* ask the governor for the next state */
 	next_state = cpuidle_curr_governor->select(dev);
 	if (need_resched()) {
diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 70d9982..40b5630 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -286,8 +286,6 @@ static int menu_select(struct cpuidle_device *dev)
 	for (i = CPUIDLE_DRIVER_STATE_START; i < dev->state_count; i++) {
 		struct cpuidle_state *s = &dev->states[i];
 
-		if (s->flags & CPUIDLE_FLAG_IGNORE)
-			continue;
 		if (s->target_residency > data->predicted_us)
 			continue;
 		if (s->exit_latency > latency_req)
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 45eef60..a3306be 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -47,7 +47,6 @@ struct cpuidle_state {
 
 /* Idle State Flags */
 #define CPUIDLE_FLAG_TIME_VALID	(0x01) /* is residency time measurable? */
-#define CPUIDLE_FLAG_IGNORE	(0x100) /* ignore during this idle period */
 
 #define CPUIDLE_DRIVER_FLAGS_MASK (0xFFFF0000)
 
@@ -93,8 +92,6 @@ struct cpuidle_device {
 	struct completion	kobj_unregister;
 	void			*governor_data;
 	int			safe_state_index;
-
-	int (*prepare)		(struct cpuidle_device *dev);
 };
 
 DECLARE_PER_CPU(struct cpuidle_device *, cpuidle_devices);


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [RFC PATCH V2 3/4] Split cpuidle_state structure and move per-cpu statistics fields
  2011-04-11  7:15 [RFC PATCH V2 0/4] cpuidle: global registration of idle states with per-cpu statistics Trinabh Gupta
  2011-04-11  7:16 ` [RFC PATCH V2 1/4] Move dev->last_residency update to driver enter routine; remove dev->last_state Trinabh Gupta
  2011-04-11  7:17 ` [RFC PATCH V2 2/4] Remove CPUIDLE_FLAG_IGNORE and dev->prepare() Trinabh Gupta
@ 2011-04-11  7:17 ` Trinabh Gupta
  2011-04-11  7:17 ` [RFC PATCH V2 4/4] Single/Global registration of idle states Trinabh Gupta
  3 siblings, 0 replies; 5+ messages in thread
From: Trinabh Gupta @ 2011-04-11  7:17 UTC (permalink / raw)
  To: linux-pm, peterz; +Cc: linux-kernel, rnayak, karthik-dp, magnus.damm

This is the first step towards global registration of cpuidle
states. The statistics used primarily by the governor are per-cpu
and have to be split from rest of the fields inside cpuidle_state,
which would be made global i.e. single copy. The driver_data field
is also per-cpu and moved.

Signed-off-by: Trinabh Gupta <trinabh@linux.vnet.ibm.com>
---

 drivers/acpi/processor_idle.c |   37 ++++++++++++++++++-------------------
 drivers/cpuidle/cpuidle.c     |   12 +++++++-----
 drivers/cpuidle/sysfs.c       |   15 +++++++++------
 include/linux/cpuidle.h       |   25 +++++++++++++++----------
 4 files changed, 49 insertions(+), 40 deletions(-)

diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 00712a7..bd29363 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -745,14 +745,13 @@ static inline void acpi_idle_do_entry(struct acpi_processor_cx *cx)
  *
  * This is equivalent to the HALT instruction.
  */
-static int acpi_idle_enter_c1(struct cpuidle_device *dev,
-				int index)
+static int acpi_idle_enter_c1(struct cpuidle_device *dev, int index)
 {
 	ktime_t  kt1, kt2;
 	s64 idle_time;
 	struct acpi_processor *pr;
-	struct cpuidle_state *state = &dev->states[index];
-	struct acpi_processor_cx *cx = cpuidle_get_statedata(state);
+	struct cpuidle_state_usage *state_usage = &dev->states_usage[index];
+	struct acpi_processor_cx *cx = cpuidle_get_statedata(state_usage);
 
 	pr = __this_cpu_read(processors);
 	dev->last_residency = 0;
@@ -777,8 +776,8 @@ static int acpi_idle_enter_c1(struct cpuidle_device *dev,
 
 	/* Update device last_residency and state counters*/
 	dev->last_residency = (int)idle_time;
-	dev->states[index].time += (unsigned long long)dev->last_residency;
-	dev->states[index].usage++;
+	state_usage->time += (unsigned long long)dev->last_residency;
+	state_usage->usage++;
 
 	local_irq_enable();
 	cx->usage++;
@@ -792,12 +791,11 @@ static int acpi_idle_enter_c1(struct cpuidle_device *dev,
  * @dev: the target CPU
  * @index: the index of suggested state
  */
-static int acpi_idle_enter_simple(struct cpuidle_device *dev,
-				int index)
+static int acpi_idle_enter_simple(struct cpuidle_device *dev, int index)
 {
 	struct acpi_processor *pr;
-	struct cpuidle_state *state = &dev->states[index];
-	struct acpi_processor_cx *cx = cpuidle_get_statedata(state);
+	struct cpuidle_state_usage *state_usage = &dev->states_usage[index];
+	struct acpi_processor_cx *cx = cpuidle_get_statedata(state_usage);
 	ktime_t  kt1, kt2;
 	s64 idle_time_ns;
 	s64 idle_time;
@@ -852,8 +850,8 @@ static int acpi_idle_enter_simple(struct cpuidle_device *dev,
 
 	/* Update device last_residency and state counters*/
 	dev->last_residency = (int)idle_time;
-	dev->states[index].time += (unsigned long long)dev->last_residency;
-	dev->states[index].usage++;
+	state_usage->time += (unsigned long long)dev->last_residency;
+	state_usage->usage++;
 
 	/* Tell the scheduler how much we idled: */
 	sched_clock_idle_wakeup_event(idle_time_ns);
@@ -879,12 +877,11 @@ static DEFINE_SPINLOCK(c3_lock);
  *
  * If BM is detected, the deepest non-C3 idle state is entered instead.
  */
-static int acpi_idle_enter_bm(struct cpuidle_device *dev,
-				int index)
+static int acpi_idle_enter_bm(struct cpuidle_device *dev, int index)
 {
 	struct acpi_processor *pr;
-	struct cpuidle_state *state = &dev->states[index];
-	struct acpi_processor_cx *cx = cpuidle_get_statedata(state);
+	struct cpuidle_state_usage *state_usage = &dev->states_usage[index];
+	struct acpi_processor_cx *cx = cpuidle_get_statedata(state_usage);
 	ktime_t  kt1, kt2;
 	s64 idle_time_ns;
 	s64 idle_time;
@@ -979,8 +976,8 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev,
 
 	/* Update device last_residency and state counters*/
 	dev->last_residency = (int)idle_time;
-	dev->states[index].time += (unsigned long long)dev->last_residency;
-	dev->states[index].usage++;
+	state_usage->time += (unsigned long long)dev->last_residency;
+	state_usage->usage++;
 
 	/* Tell the scheduler how much we idled: */
 	sched_clock_idle_wakeup_event(idle_time_ns);
@@ -1010,6 +1007,7 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 	int i, count = CPUIDLE_DRIVER_STATE_START;
 	struct acpi_processor_cx *cx;
 	struct cpuidle_state *state;
+	struct cpuidle_state_usage *state_usage;
 	struct cpuidle_device *dev = &pr->power.dev;
 
 	if (!pr->flags.power_setup_done)
@@ -1032,6 +1030,7 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 	for (i = 1; i < ACPI_PROCESSOR_MAX_POWER && i <= max_cstate; i++) {
 		cx = &pr->power.states[i];
 		state = &dev->states[count];
+		state_usage = &dev->states_usage[count];
 
 		if (!cx->valid)
 			continue;
@@ -1042,7 +1041,7 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 		    !(acpi_gbl_FADT.flags & ACPI_FADT_C2_MP_SUPPORTED))
 			continue;
 #endif
-		cpuidle_set_statedata(state, cx);
+		cpuidle_set_statedata(state_usage, cx);
 
 		snprintf(state->name, CPUIDLE_NAME_LEN, "C%d", i);
 		strncpy(state->desc, cx->desc, CPUIDLE_DESC_LEN);
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 92a6216..5d6f98d 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -160,8 +160,9 @@ static int poll_idle(struct cpuidle_device *dev, int index)
 		diff = INT_MAX;
 
 	dev->last_residency = (int) diff;
-	dev->states[index].time += (unsigned long long)dev->last_residency;
-	dev->states[index].usage++;
+	dev->states_usage[index].time +=
+			(unsigned long long)dev->last_residency;
+	dev->states_usage[index].usage++;
 
 	return index;
 }
@@ -169,8 +170,9 @@ static int poll_idle(struct cpuidle_device *dev, int index)
 static void poll_idle_init(struct cpuidle_device *dev)
 {
 	struct cpuidle_state *state = &dev->states[0];
+	struct cpuidle_state_usage *state_usage = &dev->states_usage[0];
 
-	cpuidle_set_statedata(state, NULL);
+	cpuidle_set_statedata(state_usage, NULL);
 
 	snprintf(state->name, CPUIDLE_NAME_LEN, "POLL");
 	snprintf(state->desc, CPUIDLE_DESC_LEN, "CPUIDLE CORE POLL IDLE");
@@ -218,8 +220,8 @@ int cpuidle_enable_device(struct cpuidle_device *dev)
 		goto fail_sysfs;
 
 	for (i = 0; i < dev->state_count; i++) {
-		dev->states[i].usage = 0;
-		dev->states[i].time = 0;
+		dev->states_usage[i].usage = 0;
+		dev->states_usage[i].time = 0;
 	}
 	dev->last_residency = 0;
 
diff --git a/drivers/cpuidle/sysfs.c b/drivers/cpuidle/sysfs.c
index be7917e..09c9c77 100644
--- a/drivers/cpuidle/sysfs.c
+++ b/drivers/cpuidle/sysfs.c
@@ -216,7 +216,7 @@ static struct kobj_type ktype_cpuidle = {
 
 struct cpuidle_state_attr {
 	struct attribute attr;
-	ssize_t (*show)(struct cpuidle_state *, char *);
+	ssize_t (*show)(struct cpuidle_state *, struct cpuidle_state_usage *, char *);
 	ssize_t (*store)(struct cpuidle_state *, const char *, size_t);
 };
 
@@ -224,19 +224,19 @@ struct cpuidle_state_attr {
 static struct cpuidle_state_attr attr_##_name = __ATTR(_name, 0444, show, NULL)
 
 #define define_show_state_function(_name) \
-static ssize_t show_state_##_name(struct cpuidle_state *state, char *buf) \
+static ssize_t show_state_##_name(struct cpuidle_state *state, struct cpuidle_state_usage *state_usage, char *buf) \
 { \
 	return sprintf(buf, "%u\n", state->_name);\
 }
 
 #define define_show_state_ull_function(_name) \
-static ssize_t show_state_##_name(struct cpuidle_state *state, char *buf) \
+static ssize_t show_state_##_name(struct cpuidle_state *state, struct cpuidle_state_usage *state_usage, char *buf) \
 { \
-	return sprintf(buf, "%llu\n", state->_name);\
+	return sprintf(buf, "%llu\n", state_usage->_name);\
 }
 
 #define define_show_state_str_function(_name) \
-static ssize_t show_state_##_name(struct cpuidle_state *state, char *buf) \
+static ssize_t show_state_##_name(struct cpuidle_state *state, struct cpuidle_state_usage *state_usage, char *buf) \
 { \
 	if (state->_name[0] == '\0')\
 		return sprintf(buf, "<null>\n");\
@@ -269,16 +269,18 @@ static struct attribute *cpuidle_state_default_attrs[] = {
 
 #define kobj_to_state_obj(k) container_of(k, struct cpuidle_state_kobj, kobj)
 #define kobj_to_state(k) (kobj_to_state_obj(k)->state)
+#define kobj_to_state_usage(k) (kobj_to_state_obj(k)->state_usage)
 #define attr_to_stateattr(a) container_of(a, struct cpuidle_state_attr, attr)
 static ssize_t cpuidle_state_show(struct kobject * kobj,
 	struct attribute * attr ,char * buf)
 {
 	int ret = -EIO;
 	struct cpuidle_state *state = kobj_to_state(kobj);
+	struct cpuidle_state_usage *state_usage = kobj_to_state_usage(kobj);
 	struct cpuidle_state_attr * cattr = attr_to_stateattr(attr);
 
 	if (cattr->show)
-		ret = cattr->show(state, buf);
+		ret = cattr->show(state, state_usage, buf);
 
 	return ret;
 }
@@ -323,6 +325,7 @@ int cpuidle_add_state_sysfs(struct cpuidle_device *device)
 		if (!kobj)
 			goto error_state;
 		kobj->state = &device->states[i];
+		kobj->state_usage = &device->states_usage[i];
 		init_completion(&kobj->kobj_unregister);
 
 		ret = kobject_init_and_add(&kobj->kobj, &ktype_state_cpuidle, &device->kobj,
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index a3306be..5a1a238 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -28,19 +28,22 @@ struct cpuidle_device;
  * CPUIDLE DEVICE INTERFACE *
  ****************************/
 
+struct cpuidle_state_usage {
+	void		*driver_data;
+
+	unsigned long long	usage;
+	unsigned long long	time; /* in US */
+};
+
 struct cpuidle_state {
 	char		name[CPUIDLE_NAME_LEN];
 	char		desc[CPUIDLE_DESC_LEN];
-	void		*driver_data;
 
 	unsigned int	flags;
 	unsigned int	exit_latency; /* in US */
 	unsigned int	power_usage; /* in mW */
 	unsigned int	target_residency; /* in US */
 
-	unsigned long long	usage;
-	unsigned long long	time; /* in US */
-
 	int (*enter)	(struct cpuidle_device *dev,
 			int index);
 };
@@ -52,26 +55,27 @@ struct cpuidle_state {
 
 /**
  * cpuidle_get_statedata - retrieves private driver state data
- * @state: the state
+ * @st_usage: the state usage statistics
  */
-static inline void * cpuidle_get_statedata(struct cpuidle_state *state)
+static inline void *cpuidle_get_statedata(struct cpuidle_state_usage *st_usage)
 {
-	return state->driver_data;
+	return st_usage->driver_data;
 }
 
 /**
  * cpuidle_set_statedata - stores private driver state data
- * @state: the state
+ * @st_usage: the state usage statistics
  * @data: the private data
  */
 static inline void
-cpuidle_set_statedata(struct cpuidle_state *state, void *data)
+cpuidle_set_statedata(struct cpuidle_state_usage *st_usage, void *data)
 {
-	state->driver_data = data;
+	st_usage->driver_data = data;
 }
 
 struct cpuidle_state_kobj {
 	struct cpuidle_state *state;
+	struct cpuidle_state_usage *state_usage;
 	struct completion kobj_unregister;
 	struct kobject kobj;
 };
@@ -85,6 +89,7 @@ struct cpuidle_device {
 	int			last_residency;
 	int			state_count;
 	struct cpuidle_state	states[CPUIDLE_STATE_MAX];
+	struct cpuidle_state_usage	states_usage[CPUIDLE_STATE_MAX];
 	struct cpuidle_state_kobj *kobjs[CPUIDLE_STATE_MAX];
 
 	struct list_head 	device_list;


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [RFC PATCH V2 4/4] Single/Global registration of idle states
  2011-04-11  7:15 [RFC PATCH V2 0/4] cpuidle: global registration of idle states with per-cpu statistics Trinabh Gupta
                   ` (2 preceding siblings ...)
  2011-04-11  7:17 ` [RFC PATCH V2 3/4] Split cpuidle_state structure and move per-cpu statistics fields Trinabh Gupta
@ 2011-04-11  7:17 ` Trinabh Gupta
  3 siblings, 0 replies; 5+ messages in thread
From: Trinabh Gupta @ 2011-04-11  7:17 UTC (permalink / raw)
  To: linux-pm, peterz; +Cc: linux-kernel, rnayak, karthik-dp, magnus.damm

With this patch there is single copy of cpuidle_states structure
instead of per-cpu. The statistics needed on per-cpu basis
by the governor are kept per-cpu. This simplifies the cpuidle
subsystem as state registration is done by single cpu only.
Having single copy of cpuidle_states saves memory. Rare case
of asymmetric C-states can be handled within the cpuidle driverand
architectures such as POWER do not have asymmetric C-states.

To Do:

1. Handle the case when idle states may change at run time
and acpi_processor_cst_has_changed() routine is called.

2. Handle acpi_processor_power_exit() correctly i.e. ensure
unregistration of cpuidle driver since registration is now
moved inside acpi_processor_power_init().


Signed-off-by: Trinabh Gupta <trinabh@linux.vnet.ibm.com>
---

 drivers/acpi/processor_driver.c  |   18 +-----
 drivers/acpi/processor_idle.c    |  117 +++++++++++++++++++++++++++++++-------
 drivers/cpuidle/cpuidle.c        |   42 ++++----------
 drivers/cpuidle/driver.c         |   25 ++++++++
 drivers/cpuidle/governors/menu.c |   16 +++--
 drivers/cpuidle/sysfs.c          |    3 +
 include/linux/cpuidle.h          |   16 ++++-
 7 files changed, 156 insertions(+), 81 deletions(-)

diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index a4e0f1b..91464b4 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -503,8 +503,7 @@ static int __cpuinit acpi_processor_add(struct acpi_device *device)
 	acpi_processor_get_throttling_info(pr);
 	acpi_processor_get_limit_info(pr);
 
-
-	if (cpuidle_get_driver() == &acpi_idle_driver)
+	if (!cpuidle_get_driver() || cpuidle_get_driver() == &acpi_idle_driver)
 		acpi_processor_power_init(pr, device);
 
 	pr->cdev = thermal_cooling_device_register("Processor", device,
@@ -800,17 +799,9 @@ static int __init acpi_processor_init(void)
 
 	memset(&errata, 0, sizeof(errata));
 
-	if (!cpuidle_register_driver(&acpi_idle_driver)) {
-		printk(KERN_DEBUG "ACPI: %s registered with cpuidle\n",
-			acpi_idle_driver.name);
-	} else {
-		printk(KERN_DEBUG "ACPI: acpi_idle yielding to %s\n",
-			cpuidle_get_driver()->name);
-	}
-
 	result = acpi_bus_register_driver(&acpi_processor_driver);
 	if (result < 0)
-		goto out_cpuidle;
+		return result;
 
 	acpi_processor_install_hotplug_notify();
 
@@ -821,11 +812,6 @@ static int __init acpi_processor_init(void)
 	acpi_processor_throttling_init();
 
 	return 0;
-
-out_cpuidle:
-	cpuidle_unregister_driver(&acpi_idle_driver);
-
-	return result;
 }
 
 static void __exit acpi_processor_exit(void)
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index bd29363..505355d 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -741,11 +741,13 @@ static inline void acpi_idle_do_entry(struct acpi_processor_cx *cx)
 /**
  * acpi_idle_enter_c1 - enters an ACPI C1 state-type
  * @dev: the target CPU
+ * @drv: cpuidle driver containing cpuidle state info
  * @index: index of target state
  *
  * This is equivalent to the HALT instruction.
  */
-static int acpi_idle_enter_c1(struct cpuidle_device *dev, int index)
+static int acpi_idle_enter_c1(struct cpuidle_device *dev,
+		struct cpuidle_driver *drv, int index)
 {
 	ktime_t  kt1, kt2;
 	s64 idle_time;
@@ -789,9 +791,11 @@ static int acpi_idle_enter_c1(struct cpuidle_device *dev, int index)
 /**
  * acpi_idle_enter_simple - enters an ACPI state without BM handling
  * @dev: the target CPU
+ * @drv: cpuidle driver with cpuidle state information
  * @index: the index of suggested state
  */
-static int acpi_idle_enter_simple(struct cpuidle_device *dev, int index)
+static int acpi_idle_enter_simple(struct cpuidle_device *dev,
+		struct cpuidle_driver *drv, int index)
 {
 	struct acpi_processor *pr;
 	struct cpuidle_state_usage *state_usage = &dev->states_usage[index];
@@ -873,11 +877,13 @@ static DEFINE_SPINLOCK(c3_lock);
 /**
  * acpi_idle_enter_bm - enters C3 with proper BM handling
  * @dev: the target CPU
+ * @drv: cpuidle driver containing state data
  * @index: the index of suggested state
  *
  * If BM is detected, the deepest non-C3 idle state is entered instead.
  */
-static int acpi_idle_enter_bm(struct cpuidle_device *dev, int index)
+static int acpi_idle_enter_bm(struct cpuidle_device *dev,
+		struct cpuidle_driver *drv, int index)
 {
 	struct acpi_processor *pr;
 	struct cpuidle_state_usage *state_usage = &dev->states_usage[index];
@@ -900,9 +906,9 @@ static int acpi_idle_enter_bm(struct cpuidle_device *dev, int index)
 	}
 
 	if (!cx->bm_sts_skip && acpi_idle_bm_check()) {
-		if (dev->safe_state_index >= 0) {
-			return dev->states[dev->safe_state_index].enter(dev,
-						dev->safe_state_index);
+		if (drv->safe_state_index >= 0) {
+			return drv->states[drv->safe_state_index].enter(dev,
+						drv, drv->safe_state_index);
 		} else {
 			local_irq_disable();
 			acpi_safe_halt();
@@ -999,14 +1005,15 @@ struct cpuidle_driver acpi_idle_driver = {
 };
 
 /**
- * acpi_processor_setup_cpuidle - prepares and configures CPUIDLE
+ * acpi_processor_setup_cpuidle_cx - prepares and configures CPUIDLE
+ * device i.e. per-cpu data
+ *
  * @pr: the ACPI processor
  */
-static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
+static int acpi_processor_setup_cpuidle_cx(struct acpi_processor *pr)
 {
 	int i, count = CPUIDLE_DRIVER_STATE_START;
 	struct acpi_processor_cx *cx;
-	struct cpuidle_state *state;
 	struct cpuidle_state_usage *state_usage;
 	struct cpuidle_device *dev = &pr->power.dev;
 
@@ -1018,18 +1025,12 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 	}
 
 	dev->cpu = pr->id;
-	dev->safe_state_index = -1;
-	for (i = 0; i < CPUIDLE_STATE_MAX; i++) {
-		dev->states[i].name[0] = '\0';
-		dev->states[i].desc[0] = '\0';
-	}
 
 	if (max_cstate == 0)
 		max_cstate = 1;
 
 	for (i = 1; i < ACPI_PROCESSOR_MAX_POWER && i <= max_cstate; i++) {
 		cx = &pr->power.states[i];
-		state = &dev->states[count];
 		state_usage = &dev->states_usage[count];
 
 		if (!cx->valid)
@@ -1041,8 +1042,61 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 		    !(acpi_gbl_FADT.flags & ACPI_FADT_C2_MP_SUPPORTED))
 			continue;
 #endif
+
 		cpuidle_set_statedata(state_usage, cx);
 
+		count++;
+		if (count == CPUIDLE_STATE_MAX)
+			break;
+	}
+
+	dev->state_count = count;
+
+	if (!count)
+		return -EINVAL;
+
+	return 0;
+}
+
+/**
+ * acpi_processor_setup_cpuidle states- prepares and configures cpuidle
+ * global state data i.e. idle routines
+ *
+ * @pr: the ACPI processor
+ */
+static int acpi_processor_setup_cpuidle_states(struct acpi_processor *pr)
+{
+	int i, count = CPUIDLE_DRIVER_STATE_START;
+	struct acpi_processor_cx *cx;
+	struct cpuidle_state *state;
+	struct cpuidle_driver *drv = &acpi_idle_driver;
+
+	if (!pr->flags.power_setup_done)
+		return -EINVAL;
+
+	if (pr->flags.power == 0) {
+		return -EINVAL;
+	}
+
+	drv->safe_state_index = -1;
+
+	if (max_cstate == 0)
+		max_cstate = 1;
+
+	for (i = 1; i < ACPI_PROCESSOR_MAX_POWER && i <= max_cstate; i++) {
+		cx = &pr->power.states[i];
+
+		if (!cx->valid)
+			continue;
+
+#ifdef CONFIG_HOTPLUG_CPU
+		if ((cx->type != ACPI_STATE_C1) && (num_online_cpus() > 1) &&
+		    !pr->flags.has_cst &&
+		    !(acpi_gbl_FADT.flags & ACPI_FADT_C2_MP_SUPPORTED))
+			continue;
+#endif
+
+		state = &drv->states[count];
 		snprintf(state->name, CPUIDLE_NAME_LEN, "C%d", i);
 		strncpy(state->desc, cx->desc, CPUIDLE_DESC_LEN);
 		state->exit_latency = cx->latency;
@@ -1055,13 +1109,13 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 				state->flags |= CPUIDLE_FLAG_TIME_VALID;
 
 			state->enter = acpi_idle_enter_c1;
-			dev->safe_state_index = count;
+			drv->safe_state_index = count;
 			break;
 
 			case ACPI_STATE_C2:
 			state->flags |= CPUIDLE_FLAG_TIME_VALID;
 			state->enter = acpi_idle_enter_simple;
-			dev->safe_state_index = count;
+			drv->safe_state_index = count;
 			break;
 
 			case ACPI_STATE_C3:
@@ -1077,7 +1131,7 @@ static int acpi_processor_setup_cpuidle(struct acpi_processor *pr)
 			break;
 	}
 
-	dev->state_count = count;
+	drv->state_count = count;
 
 	if (!count)
 		return -EINVAL;
@@ -1106,7 +1160,15 @@ int acpi_processor_cst_has_changed(struct acpi_processor *pr)
 	cpuidle_disable_device(&pr->power.dev);
 	acpi_processor_get_power_info(pr);
 	if (pr->flags.power) {
-		acpi_processor_setup_cpuidle(pr);
+		/*
+		 * To Do: Currently state data within driver
+		 * is not updated. So change this to also update
+		 * actual state data i.e. what this routine is
+		 * meant for. Maybe complete unregistration and
+		 * re-registration.
+		 *
+		 */
+		acpi_processor_setup_cpuidle_cx(pr);
 		ret = cpuidle_enable_device(&pr->power.dev);
 	}
 	cpuidle_resume_and_unlock();
@@ -1118,7 +1180,7 @@ int __cpuinit acpi_processor_power_init(struct acpi_processor *pr,
 			      struct acpi_device *device)
 {
 	acpi_status status = 0;
-	static int first_run;
+	static int first_run, acpi_processor_registered;
 
 	if (disabled_by_idle_boot_param())
 		return 0;
@@ -1154,7 +1216,15 @@ int __cpuinit acpi_processor_power_init(struct acpi_processor *pr,
 	 * platforms that only support C1.
 	 */
 	if (pr->flags.power) {
-		acpi_processor_setup_cpuidle(pr);
+		if (!acpi_processor_registered) {
+			acpi_processor_setup_cpuidle_states(pr);
+			if (!cpuidle_register_driver(&acpi_idle_driver)) {
+				printk(KERN_DEBUG "ACPI: %s registered with cpuidle\n",
+					acpi_idle_driver.name);
+				acpi_processor_registered = 1;
+			}
+		}
+		acpi_processor_setup_cpuidle_cx(pr);
 		if (cpuidle_register_device(&pr->power.dev))
 			return -EIO;
 	}
@@ -1168,6 +1238,11 @@ int acpi_processor_power_exit(struct acpi_processor *pr,
 		return 0;
 
 	cpuidle_unregister_device(&pr->power.dev);
+	/* We will have to have unregister driver as well here
+	 * since we move register_driver to power_init. Maybe
+	 * just do an unregister everytime; which will be successful
+	 * only during the last call.
+	 */
 	pr->flags.power_setup_done = 0;
 
 	return 0;
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 5d6f98d..69c5de5 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -50,6 +50,7 @@ static int __cpuidle_register_device(struct cpuidle_device *dev);
 static void cpuidle_idle_call(void)
 {
 	struct cpuidle_device *dev = __this_cpu_read(cpuidle_devices);
+	struct cpuidle_driver *drv = cpuidle_get_driver();
 	struct cpuidle_state *target_state;
 	int next_state, entered_state;
 
@@ -76,19 +77,19 @@ static void cpuidle_idle_call(void)
 #endif
 
 	/* ask the governor for the next state */
-	next_state = cpuidle_curr_governor->select(dev);
+	next_state = cpuidle_curr_governor->select(drv, dev);
 	if (need_resched()) {
 		local_irq_enable();
 		return;
 	}
 
-	target_state = &dev->states[next_state];
+	target_state = &drv->states[next_state];
 
 	/* Is using next_state here correct?? */
 	trace_power_start(POWER_CSTATE, next_state, dev->cpu);
 	trace_cpu_idle(next_state, dev->cpu);
 
-	entered_state = target_state->enter(dev, next_state);
+	entered_state = target_state->enter(dev, drv, next_state);
 
 	trace_power_end(dev->cpu);
 	trace_cpu_idle(PWR_EVENT_EXIT, dev->cpu);
@@ -144,7 +145,8 @@ void cpuidle_resume_and_unlock(void)
 EXPORT_SYMBOL_GPL(cpuidle_resume_and_unlock);
 
 #ifdef CONFIG_ARCH_HAS_CPU_RELAX
-static int poll_idle(struct cpuidle_device *dev, int index)
+static int poll_idle(struct cpuidle_device *dev,
+		struct cpuidle_driver *drv, int index)
 {
 	ktime_t	t1, t2;
 	s64 diff;
@@ -167,12 +169,9 @@ static int poll_idle(struct cpuidle_device *dev, int index)
 	return index;
 }
 
-static void poll_idle_init(struct cpuidle_device *dev)
+static void poll_idle_init(struct cpuidle_driver *drv)
 {
-	struct cpuidle_state *state = &dev->states[0];
-	struct cpuidle_state_usage *state_usage = &dev->states_usage[0];
-
-	cpuidle_set_statedata(state_usage, NULL);
+	struct cpuidle_state *state = &drv->states[0];
 
 	snprintf(state->name, CPUIDLE_NAME_LEN, "POLL");
 	snprintf(state->desc, CPUIDLE_DESC_LEN, "CPUIDLE CORE POLL IDLE");
@@ -183,7 +182,7 @@ static void poll_idle_init(struct cpuidle_device *dev)
 	state->enter = poll_idle;
 }
 #else
-static void poll_idle_init(struct cpuidle_device *dev) {}
+static void poll_idle_init(struct cpuidle_driver *drv) {}
 #endif /* CONFIG_ARCH_HAS_CPU_RELAX */
 
 /**
@@ -210,7 +209,8 @@ int cpuidle_enable_device(struct cpuidle_device *dev)
 			return ret;
 	}
 
-	poll_idle_init(dev);
+	poll_idle_init(cpuidle_get_driver());
+	cpuidle_set_statedata(&dev->states_usage[0], NULL);
 
 	if ((ret = cpuidle_add_state_sysfs(dev)))
 		return ret;
@@ -285,26 +285,6 @@ static int __cpuidle_register_device(struct cpuidle_device *dev)
 
 	init_completion(&dev->kobj_unregister);
 
-	/*
-	 * cpuidle driver should set the dev->power_specified bit
-	 * before registering the device if the driver provides
-	 * power_usage numbers.
-	 *
-	 * For those devices whose ->power_specified is not set,
-	 * we fill in power_usage with decreasing values as the
-	 * cpuidle code has an implicit assumption that state Cn
-	 * uses less power than C(n-1).
-	 *
-	 * With CONFIG_ARCH_HAS_CPU_RELAX, C0 is already assigned
-	 * an power value of -1.  So we use -2, -3, etc, for other
-	 * c-states.
-	 */
-	if (!dev->power_specified) {
-		int i;
-		for (i = CPUIDLE_DRIVER_STATE_START; i < dev->state_count; i++)
-			dev->states[i].power_usage = -1 - i;
-	}
-
 	per_cpu(cpuidle_devices, dev->cpu) = dev;
 	list_add(&dev->device_list, &cpuidle_detected_devices);
 	if ((ret = cpuidle_add_sysfs(sys_dev))) {
diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c
index fd1601e..4b04129 100644
--- a/drivers/cpuidle/driver.c
+++ b/drivers/cpuidle/driver.c
@@ -17,6 +17,30 @@
 static struct cpuidle_driver *cpuidle_curr_driver;
 DEFINE_SPINLOCK(cpuidle_driver_lock);
 
+static void __cpuidle_register_driver(struct cpuidle_driver *drv)
+{
+	/*
+	 * cpuidle driver should set the drv->power_specified bit
+	 * before registering if the driver provides
+	 * power_usage numbers.
+	 *
+	 * If power_specified is not set,
+	 * we fill in power_usage with decreasing values as the
+	 * cpuidle code has an implicit assumption that state Cn
+	 * uses less power than C(n-1).
+	 *
+	 * With CONFIG_ARCH_HAS_CPU_RELAX, C0 is already assigned
+	 * an power value of -1.  So we use -2, -3, etc, for other
+	 * c-states.
+	 */
+	if (!drv->power_specified) {
+		int i;
+		for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++)
+			drv->states[i].power_usage = -1 - i;
+	}
+}
+
+
 /**
  * cpuidle_register_driver - registers a driver
  * @drv: the driver
@@ -31,6 +55,7 @@ int cpuidle_register_driver(struct cpuidle_driver *drv)
 		spin_unlock(&cpuidle_driver_lock);
 		return -EBUSY;
 	}
+	__cpuidle_register_driver(drv);
 	cpuidle_curr_driver = drv;
 	spin_unlock(&cpuidle_driver_lock);
 
diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index 40b5630..44651ae 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -182,7 +182,7 @@ static inline int performance_multiplier(void)
 
 static DEFINE_PER_CPU(struct menu_device, menu_devices);
 
-static void menu_update(struct cpuidle_device *dev);
+static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev);
 
 /* This implements DIV_ROUND_CLOSEST but avoids 64 bit division */
 static u64 div_round64(u64 dividend, u32 divisor)
@@ -228,9 +228,10 @@ static void detect_repeating_patterns(struct menu_device *data)
 
 /**
  * menu_select - selects the next idle state to enter
+ * @drv: cpuidle driver containing state data
  * @dev: the CPU
  */
-static int menu_select(struct cpuidle_device *dev)
+static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 {
 	struct menu_device *data = &__get_cpu_var(menu_devices);
 	int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
@@ -239,7 +240,7 @@ static int menu_select(struct cpuidle_device *dev)
 	int multiplier;
 
 	if (data->needs_update) {
-		menu_update(dev);
+		menu_update(drv, dev);
 		data->needs_update = 0;
 	}
 
@@ -283,8 +284,8 @@ static int menu_select(struct cpuidle_device *dev)
 	 * Find the idle state with the lowest power while satisfying
 	 * our constraints.
 	 */
-	for (i = CPUIDLE_DRIVER_STATE_START; i < dev->state_count; i++) {
-		struct cpuidle_state *s = &dev->states[i];
+	for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) {
+		struct cpuidle_state *s = &drv->states[i];
 
 		if (s->target_residency > data->predicted_us)
 			continue;
@@ -321,14 +322,15 @@ static void menu_reflect(struct cpuidle_device *dev, int index)
 
 /**
  * menu_update - attempts to guess what happened after entry
+ * @drv: cpuidle driver containing state data
  * @dev: the CPU
  */
-static void menu_update(struct cpuidle_device *dev)
+static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 {
 	struct menu_device *data = &__get_cpu_var(menu_devices);
 	int last_idx = data->last_state_idx;
 	unsigned int last_idle_us = cpuidle_get_last_residency(dev);
-	struct cpuidle_state *target = &dev->states[last_idx];
+	struct cpuidle_state *target = &drv->states[last_idx];
 	unsigned int measured_us;
 	u64 new_factor;
 
diff --git a/drivers/cpuidle/sysfs.c b/drivers/cpuidle/sysfs.c
index 09c9c77..90be2ad 100644
--- a/drivers/cpuidle/sysfs.c
+++ b/drivers/cpuidle/sysfs.c
@@ -318,13 +318,14 @@ int cpuidle_add_state_sysfs(struct cpuidle_device *device)
 {
 	int i, ret = -ENOMEM;
 	struct cpuidle_state_kobj *kobj;
+	struct cpuidle_driver *drv = cpuidle_get_driver();
 
 	/* state statistics */
 	for (i = 0; i < device->state_count; i++) {
 		kobj = kzalloc(sizeof(struct cpuidle_state_kobj), GFP_KERNEL);
 		if (!kobj)
 			goto error_state;
-		kobj->state = &device->states[i];
+		kobj->state = &drv->states[i];
 		kobj->state_usage = &device->states_usage[i];
 		init_completion(&kobj->kobj_unregister);
 
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 5a1a238..0964f21 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -22,6 +22,7 @@
 #define CPUIDLE_DESC_LEN	32
 
 struct cpuidle_device;
+struct cpuidle_driver;
 
 
 /****************************
@@ -45,6 +46,7 @@ struct cpuidle_state {
 	unsigned int	target_residency; /* in US */
 
 	int (*enter)	(struct cpuidle_device *dev,
+			struct cpuidle_driver *drv,
 			int index);
 };
 
@@ -83,20 +85,17 @@ struct cpuidle_state_kobj {
 struct cpuidle_device {
 	unsigned int		registered:1;
 	unsigned int		enabled:1;
-	unsigned int		power_specified:1;
 	unsigned int		cpu;
 
 	int			last_residency;
 	int			state_count;
-	struct cpuidle_state	states[CPUIDLE_STATE_MAX];
 	struct cpuidle_state_usage	states_usage[CPUIDLE_STATE_MAX];
 	struct cpuidle_state_kobj *kobjs[CPUIDLE_STATE_MAX];
 
-	struct list_head 	device_list;
+	struct list_head	device_list;
 	struct kobject		kobj;
 	struct completion	kobj_unregister;
 	void			*governor_data;
-	int			safe_state_index;
 };
 
 DECLARE_PER_CPU(struct cpuidle_device *, cpuidle_devices);
@@ -120,6 +119,12 @@ static inline int cpuidle_get_last_residency(struct cpuidle_device *dev)
 struct cpuidle_driver {
 	char			name[CPUIDLE_NAME_LEN];
 	struct module 		*owner;
+
+	unsigned int		power_specified:1;
+	struct cpuidle_state	states[CPUIDLE_STATE_MAX];
+	int			state_count;
+	struct cpuidle_state	*safe_state;
+	int			safe_state_index;
 };
 
 #ifdef CONFIG_CPU_IDLE
@@ -165,7 +170,8 @@ struct cpuidle_governor {
 	int  (*enable)		(struct cpuidle_device *dev);
 	void (*disable)		(struct cpuidle_device *dev);
 
-	int  (*select)		(struct cpuidle_device *dev);
+	int  (*select)		(struct cpuidle_driver *drv,
+					struct cpuidle_device *dev);
 	void (*reflect)		(struct cpuidle_device *dev, int index);
 
 	struct module 		*owner;


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2011-04-11  7:17 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-04-11  7:15 [RFC PATCH V2 0/4] cpuidle: global registration of idle states with per-cpu statistics Trinabh Gupta
2011-04-11  7:16 ` [RFC PATCH V2 1/4] Move dev->last_residency update to driver enter routine; remove dev->last_state Trinabh Gupta
2011-04-11  7:17 ` [RFC PATCH V2 2/4] Remove CPUIDLE_FLAG_IGNORE and dev->prepare() Trinabh Gupta
2011-04-11  7:17 ` [RFC PATCH V2 3/4] Split cpuidle_state structure and move per-cpu statistics fields Trinabh Gupta
2011-04-11  7:17 ` [RFC PATCH V2 4/4] Single/Global registration of idle states Trinabh Gupta

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).