linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
@ 2012-02-07  1:00 Rafael J. Wysocki
  2012-02-07  1:01 ` [PATCH 1/8] PM / Sleep: Initialize wakeup source locks in wakeup_source_add() Rafael J. Wysocki
                   ` (12 more replies)
  0 siblings, 13 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-07  1:00 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

Hi all,

This series tests the theory that the easiest way to sell a once rejected
feature is to advertise it under a different name.

Well, there actually are two different features, although they are closely
related to each other.  First, patch [6/8] introduces a feature that allows
the kernel to trigger system suspend (or more generally a transition into
a sleep state) whenever there are no active wakeup sources (no, they aren't
called wakelocks).  It is called "autosleep" here, but it was called a few
different names in the past ("opportunistic suspend" was probably the most
popular one).  Second, patch [8/8] introduces "wake locks" that are,
essentially, wakeup sources which may be created and manipulated by user
space.  Using them user space may control the autosleep feature introduced
earlier.

This also is a kind of a proof of concept for the people who wanted me to
show a kernel-based implementation of automatic suspend, so there you go.
Please note, however, that it is done so that the user space "wake locks"
interface is compatible with Android in support of its user space.  I don't
really like this interface, but since the Android's user space seems to rely
on it, I'm fine with using it as is.  YMMV.

Let me say a few words about every patch in the series individually.

[1/8] - This really is a bug fix, so it's v3.4 material.  Nobody has stepped
  on this bug so far, but it should be fixed anyway.

[2/8] - This is a freezer cleanup, worth doing anyway IMO, so v3.4 material too.

[3/8] - This is something we can do no problem, although completely optional
  without the autosleep feature.  Rather necessary with it, though.

[4/8] - This kind of reintroduces my original idea of using a wait queue for
  waiting until there are no wakeup events in progress.  Alan convinced me that
  it would be better to poll the counter to prevent wakeup_source_deactivate()
  from having to call wake_up_all() occasionally (that may be costly in fast
  paths), but then quite some people told me that the wait queue migh be
  better.  I think that the polling will make much less sense with autosleep
  and user space "wake locks".  Anyway, [4/8] is something we can do without
  those things too.

The patches above were given Sign-off-by tags, because I think they make some
sense regardless of the features introcuded by the remaining patches that in
turn are total RFC.

[5/8] - This changes wakeup source statistics so that they are more similar to
  the statistics collected for wakelocks on Android.  The file those statistics
  may be read from is still located in debugfs, though (I don't think it
  belongs to proc and its name is different from the analogous Android's file
  name anyway).  It could be done without autosleep, but then it would be a bit
  pointless.  BTW, this changes interfaces that _in_ _theory_ may be used by
  someone, but I'm not aware of anyone using them.  If you are one, I'll be
  pleased to learn about that, so please tell me who you are. :-)

[6/8] - Autosleep implementation.  I think the changelog explains the idea
  quite well and the code is really nothing special.  It doesn't really add
  anything new to the kernel in terms of infrastructure etc., it just uses
  the existing stuff to implement an alternative method of triggering system
  sleep transitions.  Note, though, that the interface here is different
  from the Android's one, because Android actually modifies /sys/power/state
  to trigger something called "early suspend" (that is never going to be
  implemented in the "stock" kernel as long as I have any influence on it) and
  we simply can't do that in the mainline.

[7/8] - This adds a wakeup source statistics that only makes sense with
  autosleep and (I believe) is analogous to the Android's prevent_suspend_time
  statistics.  Nothing really special, but I didn't want
  wakeup_source_activate/deactivate() to take a common lock to avoid
  congestion.

[8/8] - This adds a user space interface to create, activate and deactivate
  wakeup sources.  Since the files it consists of are called wake_lock and
  wake_unlock, to follow Android, the objects the wakeup sources are wrapped
  into are called "wakelocks" (for added confusion).  Since the interface
  doesn't provide any means to destroy those "wakelocks", I added a garbage
  collection mechanism to get rid of the unused ones, if any.  I also tought
  it might be a good idea to put a limit on the number of those things that
  user space can operate simultaneously, so I did that too.

All in all, it's not as much code as I thought it would be and it seems to be
relatively simple (which rises the question why the Android people didn't
even _try_ to do something like this instead of slapping the "real" wakelocks
onto the kernel FWIW).  IMHO it doesn't add anything really new to the kernel,
except for the user space interfaces that should be maintainable.  At least I
think I should be able to maintain them. :-)

All of the above has been tested very briefly on my test-bed Mackerel board
and it quite obviously requires more thorough testing, but first I need to know
if it makes sense to spend any more time on it.

IOW, I need to know your opinions!

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 1/8] PM / Sleep: Initialize wakeup source locks in wakeup_source_add()
  2012-02-07  1:00 [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
@ 2012-02-07  1:01 ` Rafael J. Wysocki
  2012-02-07 22:29   ` John Stultz
  2012-02-07  1:03 ` [PATCH 2/8] PM / Sleep: Do not check wakeup too often in try_to_freeze_tasks() Rafael J. Wysocki
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-07  1:01 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

From: Rafael J. Wysocki <rjw@sisk.pl>

Initialize wakeup source locks in wakeup_source_add() instead of
wakeup_source_create(), because otherwise the locks of the wakeup
sources that haven't been allocated with wakeup_source_create()
aren't initialized and handled properly.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/power/wakeup.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -64,7 +64,6 @@ struct wakeup_source *wakeup_source_crea
 	if (!ws)
 		return NULL;
 
-	spin_lock_init(&ws->lock);
 	if (name)
 		ws->name = kstrdup(name, GFP_KERNEL);
 
@@ -105,6 +104,7 @@ void wakeup_source_add(struct wakeup_sou
 	if (WARN_ON(!ws))
 		return;
 
+	spin_lock_init(&ws->lock);
 	setup_timer(&ws->timer, pm_wakeup_timer_fn, (unsigned long)ws);
 	ws->active = false;
 


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 2/8] PM / Sleep: Do not check wakeup too often in try_to_freeze_tasks()
  2012-02-07  1:00 [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
  2012-02-07  1:01 ` [PATCH 1/8] PM / Sleep: Initialize wakeup source locks in wakeup_source_add() Rafael J. Wysocki
@ 2012-02-07  1:03 ` Rafael J. Wysocki
  2012-02-07  1:03 ` [PATCH 3/8] PM / Sleep: Look for wakeup events in later stages of device suspend Rafael J. Wysocki
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-07  1:03 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

From: Rafael J. Wysocki <rjw@sisk.pl>

Use the observation that it is more efficient to check the wakeup
variable once before the loop reporting tasks that were not
frozen in try_to_freeze_tasks() than to do that in every step of that
loop.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/process.c |   16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

Index: linux/kernel/power/process.c
===================================================================
--- linux.orig/kernel/power/process.c
+++ linux/kernel/power/process.c
@@ -98,13 +98,15 @@ static int try_to_freeze_tasks(bool user
 		       elapsed_csecs / 100, elapsed_csecs % 100,
 		       todo - wq_busy, wq_busy);
 
-		read_lock(&tasklist_lock);
-		do_each_thread(g, p) {
-			if (!wakeup && !freezer_should_skip(p) &&
-			    p != current && freezing(p) && !frozen(p))
-				sched_show_task(p);
-		} while_each_thread(g, p);
-		read_unlock(&tasklist_lock);
+		if (!wakeup) {
+			read_lock(&tasklist_lock);
+			do_each_thread(g, p) {
+				if (p != current && !freezer_should_skip(p)
+				    && freezing(p) && !frozen(p))
+					sched_show_task(p);
+			} while_each_thread(g, p);
+			read_unlock(&tasklist_lock);
+		}
 	} else {
 		printk("(elapsed %d.%02d seconds) ", elapsed_csecs / 100,
 			elapsed_csecs % 100);


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 3/8] PM / Sleep: Look for wakeup events in later stages of device suspend
  2012-02-07  1:00 [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
  2012-02-07  1:01 ` [PATCH 1/8] PM / Sleep: Initialize wakeup source locks in wakeup_source_add() Rafael J. Wysocki
  2012-02-07  1:03 ` [PATCH 2/8] PM / Sleep: Do not check wakeup too often in try_to_freeze_tasks() Rafael J. Wysocki
@ 2012-02-07  1:03 ` Rafael J. Wysocki
  2012-02-07  1:04 ` [PATCH 4/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress" Rafael J. Wysocki
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-07  1:03 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

From: Rafael J. Wysocki <rjw@sisk.pl>

Currently, the device suspend code only checks if there have been
any wakeup events, and therefore the ongoing system transition to a
sleep state should be aborted, during the first (i.e. "suspend")
device suspend phase.  However, wakeup events may be reported later
as well, so it's reasonable to look for them in the in the subsequent
(i.e. "late suspend" and "suspend noirq") phases.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/power/main.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

Index: linux/drivers/base/power/main.c
===================================================================
--- linux.orig/drivers/base/power/main.c
+++ linux/drivers/base/power/main.c
@@ -889,6 +889,11 @@ static int dpm_suspend_noirq(pm_message_
 		if (!list_empty(&dev->power.entry))
 			list_move(&dev->power.entry, &dpm_noirq_list);
 		put_device(dev);
+
+		if (pm_wakeup_pending()) {
+			error = -EBUSY;
+			break;
+		}
 	}
 	mutex_unlock(&dpm_list_mtx);
 	if (error)
@@ -962,6 +967,11 @@ static int dpm_suspend_late(pm_message_t
 		if (!list_empty(&dev->power.entry))
 			list_move(&dev->power.entry, &dpm_late_early_list);
 		put_device(dev);
+
+		if (pm_wakeup_pending()) {
+			error = -EBUSY;
+			break;
+		}
 	}
 	mutex_unlock(&dpm_list_mtx);
 	if (error)


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 4/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress"
  2012-02-07  1:00 [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2012-02-07  1:03 ` [PATCH 3/8] PM / Sleep: Look for wakeup events in later stages of device suspend Rafael J. Wysocki
@ 2012-02-07  1:04 ` Rafael J. Wysocki
  2012-02-08 23:10   ` NeilBrown
  2012-02-12  1:27   ` mark gross
  2012-02-07  1:05 ` [RFC][PATCH 5/8] PM / Sleep: Change wakeup statistics Rafael J. Wysocki
                   ` (8 subsequent siblings)
  12 siblings, 2 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-07  1:04 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

From: Rafael J. Wysocki <rjw@sisk.pl>

The current wakeup source deactivation code doesn't do anything when
the counter of wakeup events in progress goes down to zero, which
requires pm_get_wakeup_count() to poll that counter periodically.
Although this reduces the average time it takes to deactivate a
wakeup source, it also may lead to a substantial amount of unnecessary
polling if there are extended periods of wakeup activity.  Thus it
seems reasonable to use a wait queue for signaling the "no wakeup
events in progress" condition and remove the polling.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/power/wakeup.c |   18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -17,8 +17,6 @@
 
 #include "power.h"
 
-#define TIMEOUT		100
-
 /*
  * If set, the suspend/hibernate code will abort transitions to a sleep state
  * if wakeup events are registered during or immediately before the transition.
@@ -52,6 +50,8 @@ static void pm_wakeup_timer_fn(unsigned
 
 static LIST_HEAD(wakeup_sources);
 
+static DECLARE_WAIT_QUEUE_HEAD(wakeup_count_wait_queue);
+
 /**
  * wakeup_source_create - Create a struct wakeup_source object.
  * @name: Name of the new wakeup source.
@@ -84,7 +84,7 @@ void wakeup_source_destroy(struct wakeup
 	while (ws->active) {
 		spin_unlock_irq(&ws->lock);
 
-		schedule_timeout_interruptible(msecs_to_jiffies(TIMEOUT));
+		schedule_timeout_interruptible(msecs_to_jiffies(100));
 
 		spin_lock_irq(&ws->lock);
 	}
@@ -411,6 +411,7 @@ EXPORT_SYMBOL_GPL(pm_stay_awake);
  */
 static void wakeup_source_deactivate(struct wakeup_source *ws)
 {
+	unsigned int cnt, inpr;
 	ktime_t duration;
 	ktime_t now;
 
@@ -444,6 +445,10 @@ static void wakeup_source_deactivate(str
 	 * couter of wakeup events in progress simultaneously.
 	 */
 	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
+
+	split_counters(&cnt, &inpr);
+	if (!inpr)
+		wake_up_all(&wakeup_count_wait_queue);
 }
 
 /**
@@ -624,14 +629,19 @@ bool pm_wakeup_pending(void)
 bool pm_get_wakeup_count(unsigned int *count)
 {
 	unsigned int cnt, inpr;
+	DEFINE_WAIT(wait);
 
 	for (;;) {
+		prepare_to_wait(&wakeup_count_wait_queue, &wait,
+				TASK_INTERRUPTIBLE);
 		split_counters(&cnt, &inpr);
 		if (inpr == 0 || signal_pending(current))
 			break;
 		pm_wakeup_update_hit_counts();
-		schedule_timeout_interruptible(msecs_to_jiffies(TIMEOUT));
+
+		schedule();
 	}
+	finish_wait(&wakeup_count_wait_queue, &wait);
 
 	split_counters(&cnt, &inpr);
 	*count = cnt;


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 5/8] PM / Sleep: Change wakeup statistics
  2012-02-07  1:00 [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
                   ` (3 preceding siblings ...)
  2012-02-07  1:04 ` [PATCH 4/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress" Rafael J. Wysocki
@ 2012-02-07  1:05 ` Rafael J. Wysocki
  2012-02-15  6:15   ` Arve Hjønnevåg
  2012-02-07  1:06 ` [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep Rafael J. Wysocki
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-07  1:05 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

From: Rafael J. Wysocki <rjw@sisk.pl>

Wakeup statistics used by Android are slightly different from what we
have at the moment, so modify them to follow Android more closely.

This removes the struct wakeup_source's hit_cout field, which is very
rough and therefore not very useful, and adds two new fields,
wakeup_count and expire_count.  The first one tracks how many times
the wakeup source is activated with events_check_enabled set (which
roughly corresponds to the situations when a system power transition
to a sleep state is in progress and should be aborted by this wakeup
source if it is the only active one at that time) and the second one
is the number of times the wakeup source has been activated with a
timeout that expired.

Additionally, the last_time field is now updated when the wakeup
source is deactivated too (previously it was only updated during
the wakeup source's activation), which seems to be what Android does
with the analogous counter for wakelocks.

---
 drivers/base/power/sysfs.c  |   30 +++++++++++++++++++++++-----
 drivers/base/power/wakeup.c |   47 +++++++++++++++++---------------------------
 include/linux/pm_wakeup.h   |   12 +++++++----
 3 files changed, 52 insertions(+), 37 deletions(-)

Index: linux/include/linux/pm_wakeup.h
===================================================================
--- linux.orig/include/linux/pm_wakeup.h
+++ linux/include/linux/pm_wakeup.h
@@ -33,12 +33,14 @@
  *
  * @total_time: Total time this wakeup source has been active.
  * @max_time: Maximum time this wakeup source has been continuously active.
- * @last_time: Monotonic clock when the wakeup source's was activated last time.
+ * @last_time: Monotonic clock when the wakeup source's was touched last time.
  * @event_count: Number of signaled wakeup events.
  * @active_count: Number of times the wakeup sorce was activated.
  * @relax_count: Number of times the wakeup sorce was deactivated.
- * @hit_count: Number of times the wakeup sorce might abort system suspend.
+ * @expire_count: Number of times the wakeup source's timeout has expired.
+ * @wakeup_count: Number of times the wakeup source might abort suspend.
  * @active: Status of the wakeup source.
+ * @has_timeout: The wakeup source has been activated with a timeout.
  */
 struct wakeup_source {
 	char 			*name;
@@ -52,8 +54,10 @@ struct wakeup_source {
 	unsigned long		event_count;
 	unsigned long		active_count;
 	unsigned long		relax_count;
-	unsigned long		hit_count;
-	unsigned int		active:1;
+	unsigned long		expire_count;
+	unsigned long		wakeup_count;
+	bool			active:1;
+	bool			has_timeout:1;
 };
 
 #ifdef CONFIG_PM_SLEEP
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -21,7 +21,7 @@
  * If set, the suspend/hibernate code will abort transitions to a sleep state
  * if wakeup events are registered during or immediately before the transition.
  */
-bool events_check_enabled;
+bool events_check_enabled __read_mostly;
 
 /*
  * Combined counters of registered wakeup events and wakeup events in progress.
@@ -370,9 +370,15 @@ void __pm_stay_awake(struct wakeup_sourc
 		return;
 
 	spin_lock_irqsave(&ws->lock, flags);
+
 	ws->event_count++;
 	if (!ws->active)
 		wakeup_source_activate(ws);
+
+	/* This is racy, but the counter is approximate anyway. */
+	if (events_check_enabled)
+		ws->wakeup_count++;
+
 	spin_unlock_irqrestore(&ws->lock, flags);
 }
 EXPORT_SYMBOL_GPL(__pm_stay_awake);
@@ -438,6 +444,11 @@ static void wakeup_source_deactivate(str
 	if (ktime_to_ns(duration) > ktime_to_ns(ws->max_time))
 		ws->max_time = duration;
 
+	ws->last_time = now;
+	if (ws->has_timeout && time_after(jiffies, ws->timer_expires))
+		ws->expire_count++;
+
+	ws->has_timeout = false;
 	del_timer(&ws->timer);
 
 	/*
@@ -542,6 +553,7 @@ void __pm_wakeup_event(struct wakeup_sou
 	if (time_after(expires, ws->timer_expires)) {
 		mod_timer(&ws->timer, expires);
 		ws->timer_expires = expires;
+		ws->has_timeout = true;
 	}
 
  unlock:
@@ -571,24 +583,6 @@ void pm_wakeup_event(struct device *dev,
 EXPORT_SYMBOL_GPL(pm_wakeup_event);
 
 /**
- * pm_wakeup_update_hit_counts - Update hit counts of all active wakeup sources.
- */
-static void pm_wakeup_update_hit_counts(void)
-{
-	unsigned long flags;
-	struct wakeup_source *ws;
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(ws, &wakeup_sources, entry) {
-		spin_lock_irqsave(&ws->lock, flags);
-		if (ws->active)
-			ws->hit_count++;
-		spin_unlock_irqrestore(&ws->lock, flags);
-	}
-	rcu_read_unlock();
-}
-
-/**
  * pm_wakeup_pending - Check if power transition in progress should be aborted.
  *
  * Compare the current number of registered wakeup events with its preserved
@@ -610,8 +604,6 @@ bool pm_wakeup_pending(void)
 		events_check_enabled = !ret;
 	}
 	spin_unlock_irqrestore(&events_lock, flags);
-	if (ret)
-		pm_wakeup_update_hit_counts();
 	return ret;
 }
 
@@ -637,7 +629,6 @@ bool pm_get_wakeup_count(unsigned int *c
 		split_counters(&cnt, &inpr);
 		if (inpr == 0 || signal_pending(current))
 			break;
-		pm_wakeup_update_hit_counts();
 
 		schedule();
 	}
@@ -670,8 +661,6 @@ bool pm_save_wakeup_count(unsigned int c
 		events_check_enabled = true;
 	}
 	spin_unlock_irq(&events_lock);
-	if (!events_check_enabled)
-		pm_wakeup_update_hit_counts();
 	return events_check_enabled;
 }
 
@@ -706,9 +695,10 @@ static int print_wakeup_source_stats(str
 		active_time = ktime_set(0, 0);
 	}
 
-	ret = seq_printf(m, "%-12s\t%lu\t\t%lu\t\t%lu\t\t"
+	ret = seq_printf(m, "%-12s\t%lu\t\t%lu\t\t%lu\t\t%lu\t\t"
 			"%lld\t\t%lld\t\t%lld\t\t%lld\n",
-			ws->name, active_count, ws->event_count, ws->hit_count,
+			ws->name, active_count, ws->event_count,
+			ws->wakeup_count, ws->expire_count,
 			ktime_to_ms(active_time), ktime_to_ms(total_time),
 			ktime_to_ms(max_time), ktime_to_ms(ws->last_time));
 
@@ -725,8 +715,9 @@ static int wakeup_sources_stats_show(str
 {
 	struct wakeup_source *ws;
 
-	seq_puts(m, "name\t\tactive_count\tevent_count\thit_count\t"
-		"active_since\ttotal_time\tmax_time\tlast_change\n");
+	seq_puts(m, "name\t\tactive_count\tevent_count\twakeup_count\t"
+		"expire_count\tactive_since\ttotal_time\tmax_time\t"
+		"last_change\n");
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(ws, &wakeup_sources, entry)
Index: linux/drivers/base/power/sysfs.c
===================================================================
--- linux.orig/drivers/base/power/sysfs.c
+++ linux/drivers/base/power/sysfs.c
@@ -288,22 +288,41 @@ static ssize_t wakeup_active_count_show(
 
 static DEVICE_ATTR(wakeup_active_count, 0444, wakeup_active_count_show, NULL);
 
-static ssize_t wakeup_hit_count_show(struct device *dev,
-				struct device_attribute *attr, char *buf)
+static ssize_t wakeup_wakeup_count_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	unsigned long count = 0;
+	bool enabled = false;
+
+	spin_lock_irq(&dev->power.lock);
+	if (dev->power.wakeup) {
+		count = dev->power.wakeup->wakeup_count;
+		enabled = true;
+	}
+	spin_unlock_irq(&dev->power.lock);
+	return enabled ? sprintf(buf, "%lu\n", count) : sprintf(buf, "\n");
+}
+
+static DEVICE_ATTR(wakeup_wakeup_count, 0444, wakeup_wakeup_count_show, NULL);
+
+static ssize_t wakeup_expire_count_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
 {
 	unsigned long count = 0;
 	bool enabled = false;
 
 	spin_lock_irq(&dev->power.lock);
 	if (dev->power.wakeup) {
-		count = dev->power.wakeup->hit_count;
+		count = dev->power.wakeup->expire_count;
 		enabled = true;
 	}
 	spin_unlock_irq(&dev->power.lock);
 	return enabled ? sprintf(buf, "%lu\n", count) : sprintf(buf, "\n");
 }
 
-static DEVICE_ATTR(wakeup_hit_count, 0444, wakeup_hit_count_show, NULL);
+static DEVICE_ATTR(wakeup_expire_count, 0444, wakeup_expire_count_show, NULL);
 
 static ssize_t wakeup_active_show(struct device *dev,
 				struct device_attribute *attr, char *buf)
@@ -460,7 +479,8 @@ static struct attribute *wakeup_attrs[]
 	&dev_attr_wakeup.attr,
 	&dev_attr_wakeup_count.attr,
 	&dev_attr_wakeup_active_count.attr,
-	&dev_attr_wakeup_hit_count.attr,
+	&dev_attr_wakeup_wakeup_count.attr,
+	&dev_attr_wakeup_expire_count.attr,
 	&dev_attr_wakeup_active.attr,
 	&dev_attr_wakeup_total_time_ms.attr,
 	&dev_attr_wakeup_max_time_ms.attr,


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
  2012-02-07  1:00 [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
                   ` (4 preceding siblings ...)
  2012-02-07  1:05 ` [RFC][PATCH 5/8] PM / Sleep: Change wakeup statistics Rafael J. Wysocki
@ 2012-02-07  1:06 ` Rafael J. Wysocki
  2012-02-07 22:49   ` [Update][RFC][PATCH " Rafael J. Wysocki
  2012-02-07  1:06 ` [RFC][PATCH 7/8] PM / Sleep: Add "prevent autosleep time" statistics to wakeup sources Rafael J. Wysocki
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-07  1:06 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

From: Rafael J. Wysocki <rjw@sisk.pl>

Introduce a mechanism by which the kernel can trigger global
transitions to a sleep state chosen by user space if there are no
active wakeup sources.

It consists of a new sysfs attribute, /sys/power/autosleep, that
can be written one of the strings returned by reads from
/sys/power/state, a freezable ordered workqueue and a work item
carrying out the "suspend" operations.  If a string representing
the system's sleep state is written to /sys/power/autosleep, the
work item triggering transitions to that state is queued up and
it requeues it self after every execution until user space writes
"off" to /sys/power/autosleep.  That work item enables the detection
of wakeup events using the functions already defined in
drivers/base/power/wakeup.c (with one small modification) and calls
either pm_suspend(), or hibernate() to put the system into a sleep
state.  If a wakeup event is reported while the transition is in
progress, it will abort the transition and the "system suspend" work
item will be queued up again.

---
 drivers/base/power/wakeup.c |   38 ++++++++------
 include/linux/suspend.h     |   13 ++++
 kernel/power/Kconfig        |    8 +++
 kernel/power/Makefile       |    1 
 kernel/power/autosleep.c    |  115 ++++++++++++++++++++++++++++++++++++++++++++
 kernel/power/main.c         |   93 +++++++++++++++++++++++++++++------
 kernel/power/power.h        |   18 ++++++
 7 files changed, 254 insertions(+), 32 deletions(-)

Index: linux/kernel/power/Makefile
===================================================================
--- linux.orig/kernel/power/Makefile
+++ linux/kernel/power/Makefile
@@ -8,5 +8,6 @@ obj-$(CONFIG_SUSPEND)		+= suspend.o
 obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend_test.o
 obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
 				   block_io.o
+obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
 
 obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
Index: linux/kernel/power/Kconfig
===================================================================
--- linux.orig/kernel/power/Kconfig
+++ linux/kernel/power/Kconfig
@@ -103,6 +103,14 @@ config PM_SLEEP_SMP
 	select HOTPLUG
 	select HOTPLUG_CPU
 
+config PM_AUTOSLEEP
+	bool "Opportunistic sleep"
+	depends on PM_SLEEP
+	default n
+	---help---
+	Allow the kernel to trigger a system transition into a global sleep
+	state automatically whenever there are no active wakeup sources.
+
 config PM_RUNTIME
 	bool "Run-time PM core functionality"
 	depends on !IA64_HP_SIM
Index: linux/kernel/power/power.h
===================================================================
--- linux.orig/kernel/power/power.h
+++ linux/kernel/power/power.h
@@ -269,3 +269,21 @@ static inline void suspend_thaw_processe
 {
 }
 #endif
+
+#ifdef CONFIG_PM_AUTOSLEEP
+
+/* kernel/power/autosleep.c */
+extern int pm_autosleep_init(void);
+extern void pm_autosleep_lock(void);
+extern void pm_autosleep_unlock(void);
+extern suspend_state_t pm_autosleep_state(void);
+extern int pm_autosleep_set_state(suspend_state_t state);
+
+#else /* !CONFIG_PM_AUTOSLEEP */
+
+static inline int pm_autosleep_init(void) { return 0; }
+static inline void pm_autosleep_lock(void) {}
+static inline void pm_autosleep_unlock(void) {}
+static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
+
+#endif /* !CONFIG_PM_AUTOSLEEP */
Index: linux/include/linux/suspend.h
===================================================================
--- linux.orig/include/linux/suspend.h
+++ linux/include/linux/suspend.h
@@ -372,7 +372,7 @@ extern int unregister_pm_notifier(struct
 extern bool events_check_enabled;
 
 extern bool pm_wakeup_pending(void);
-extern bool pm_get_wakeup_count(unsigned int *count);
+extern bool pm_get_wakeup_count(unsigned int *count, bool block);
 extern bool pm_save_wakeup_count(unsigned int count);
 
 static inline void lock_system_sleep(void)
@@ -423,6 +423,17 @@ static inline void unlock_system_sleep(v
 
 #endif /* !CONFIG_PM_SLEEP */
 
+#ifdef CONFIG_PM_AUTOSLEEP
+
+/* kernel/power/autosleep.c */
+void queue_up_suspend_work(void);
+
+#else /* !CONFIG_PM_AUTOSLEEP */
+
+static inline void queue_up_suspend_work(void) {}
+
+#endif /* !CONFIG_PM_AUTOSLEEP */
+
 #ifdef CONFIG_ARCH_SAVE_PAGE_KEYS
 /*
  * The ARCH_SAVE_PAGE_KEYS functions can be used by an architecture
Index: linux/kernel/power/autosleep.c
===================================================================
--- /dev/null
+++ linux/kernel/power/autosleep.c
@@ -0,0 +1,115 @@
+/*
+ * kernel/power/autosleep.c
+ *
+ * Opportunistic sleep support.
+ *
+ * Copyright (C) 2012 Rafael J. Wysocki <rjw@sisk.pl>
+ */
+
+#include <linux/device.h>
+#include <linux/mutex.h>
+#include <linux/pm_wakeup.h>
+
+#include "power.h"
+
+static struct workqueue_struct *autosleep_wq;
+static struct wakeup_source *autosleep_ws;
+
+static DEFINE_MUTEX(autosleep_lock);
+static DECLARE_COMPLETION(suspend_completion);
+
+static suspend_state_t autosleep_state;
+
+static void try_to_suspend(struct work_struct *work)
+{
+	unsigned int initial_count, final_count;
+
+	if (!pm_get_wakeup_count(&initial_count, true))
+		goto out;
+
+	if (!pm_save_wakeup_count(initial_count))
+		goto out;
+
+	mutex_lock(&autosleep_lock);
+	if (autosleep_state == PM_SUSPEND_ON) {
+		mutex_unlock(&autosleep_lock);
+		return;
+	}
+	INIT_COMPLETION(suspend_completion);
+	if (autosleep_state >= PM_SUSPEND_MAX)
+		hibernate();
+	else
+		pm_suspend(autosleep_state);
+
+	complete_all(&suspend_completion);
+	mutex_unlock(&autosleep_lock);
+
+	if (!pm_get_wakeup_count(&final_count, false))
+		goto out;
+
+	if (final_count == initial_count)
+		schedule_timeout(HZ / 2);
+
+ out:
+	queue_up_suspend_work();
+}
+
+static DECLARE_WORK(suspend_work, try_to_suspend);
+
+void queue_up_suspend_work(void)
+{
+	if (!work_pending(&suspend_work) && autosleep_state > PM_SUSPEND_ON)
+		queue_work(autosleep_wq, &suspend_work);
+}
+
+suspend_state_t pm_autosleep_state(void)
+{
+	return autosleep_state;
+}
+
+int pm_autosleep_set_state(suspend_state_t state)
+{
+#ifndef CONFIG_HIBERNATION
+	if (state >= PM_SUSPEND_MAX)
+		return -EINVAL;
+#endif
+	mutex_lock(&autosleep_lock);
+	__pm_stay_awake(autosleep_ws);
+	if (state == PM_SUSPEND_ON && autosleep_state != PM_SUSPEND_ON) {
+		autosleep_state = PM_SUSPEND_ON;
+		__pm_relax(autosleep_ws);
+		mutex_unlock(&autosleep_lock);
+		wait_for_completion(&suspend_completion);
+	} else if (state > PM_SUSPEND_ON) {
+		autosleep_state = state;
+		__pm_relax(autosleep_ws);
+		queue_up_suspend_work();
+		mutex_unlock(&autosleep_lock);
+	}
+	return 0;
+}
+
+void pm_autosleep_lock(void)
+{
+	mutex_lock(&autosleep_lock);
+}
+
+void pm_autosleep_unlock(void)
+{
+	mutex_unlock(&autosleep_lock);
+}
+
+int __init pm_autosleep_init(void)
+{
+	complete_all(&suspend_completion);
+	autosleep_ws = wakeup_source_register("main");
+	if (!autosleep_ws)
+		return -ENOMEM;
+
+	autosleep_wq = alloc_ordered_workqueue("autosleep", 0);
+	if (autosleep_wq)
+		return 0;
+
+	wakeup_source_unregister(autosleep_ws);
+	return -ENOMEM;
+}
Index: linux/kernel/power/main.c
===================================================================
--- linux.orig/kernel/power/main.c
+++ linux/kernel/power/main.c
@@ -269,8 +269,7 @@ static ssize_t state_show(struct kobject
 	return (s - buf);
 }
 
-static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
-			   const char *buf, size_t n)
+static suspend_state_t decode_state(const char *buf, size_t n)
 {
 #ifdef CONFIG_SUSPEND
 	suspend_state_t state = PM_SUSPEND_STANDBY;
@@ -278,29 +277,46 @@ static ssize_t state_store(struct kobjec
 #endif
 	char *p;
 	int len;
-	int error = -EINVAL;
 
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
-	/* First, check if we are requested to hibernate */
-	if (len == 4 && !strncmp(buf, "disk", len)) {
-		error = hibernate();
-		goto Exit;
-	}
+	/* Check hibernation first. */
+	if (len == 4 && !strncmp(buf, "disk", len))
+		return PM_SUSPEND_MAX;
 
 #ifdef CONFIG_SUSPEND
 	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
 		if (*s && len == strlen(*s) && !strncmp(buf, *s, len))
 			break;
 	}
-	if (state < PM_SUSPEND_MAX && *s) {
-		error = enter_state(state);
-		suspend_stats_update(error);
-	}
+	if (state < PM_SUSPEND_MAX && *s)
+		return state;
 #endif
 
- Exit:
+	return PM_SUSPEND_ON;
+}
+
+static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
+			   const char *buf, size_t n)
+{
+	suspend_state_t state;
+	int error = -EINVAL;
+
+	pm_autosleep_lock();
+	if (pm_autosleep_state() > PM_SUSPEND_ON) {
+		error = -EBUSY;
+		goto out;
+	}
+
+	state = decode_state(buf, n);
+	if (state < PM_SUSPEND_MAX)
+		error = pm_suspend(state);
+	else if (state > PM_SUSPEND_ON)
+		error = hibernate();
+
+ out:
+	pm_autosleep_unlock();
 	return error ? error : n;
 }
 
@@ -341,7 +357,8 @@ static ssize_t wakeup_count_show(struct
 {
 	unsigned int val;
 
-	return pm_get_wakeup_count(&val) ? sprintf(buf, "%u\n", val) : -EINTR;
+	return pm_get_wakeup_count(&val, true) ?
+		sprintf(buf, "%u\n", val) : -EINTR;
 }
 
 static ssize_t wakeup_count_store(struct kobject *kobj,
@@ -358,6 +375,46 @@ static ssize_t wakeup_count_store(struct
 }
 
 power_attr(wakeup_count);
+
+#ifdef CONFIG_PM_AUTOSLEEP
+static ssize_t autosleep_show(struct kobject *kobj,
+			      struct kobj_attribute *attr,
+			      char *buf)
+{
+	suspend_state_t state = pm_autosleep_state();
+
+	if (state == PM_SUSPEND_ON)
+		return sprintf(buf, "off\n");
+
+#ifdef CONFIG_SUSPEND
+	if (state < PM_SUSPEND_MAX)
+		return sprintf(buf, "%s\n", valid_state(state) ?
+						pm_states[state] : "error");
+#endif
+#ifdef CONFIG_HIBERNATION
+	return sprintf(buf, "disk\n");
+#else
+	return sprintf(buf, "error");
+#endif
+}
+
+static ssize_t autosleep_store(struct kobject *kobj,
+			       struct kobj_attribute *attr,
+			       const char *buf, size_t n)
+{
+	suspend_state_t state = decode_state(buf, n);
+	int error;
+
+	if (state == PM_SUSPEND_ON && strncmp(buf, "off", 3)
+	    && strncmp(buf, "off\n", 4))
+		return -EINVAL;
+
+	error = pm_autosleep_set_state(state);
+	return error ? error : n;
+}
+
+power_attr(autosleep);
+#endif /* CONFIG_PM_AUTOSLEEP */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_TRACE
@@ -411,6 +468,9 @@ static struct attribute * g[] = {
 #ifdef CONFIG_PM_SLEEP
 	&pm_async_attr.attr,
 	&wakeup_count_attr.attr,
+#ifdef CONFIG_PM_AUTOSLEEP
+	&autosleep_attr.attr,
+#endif
 #ifdef CONFIG_PM_DEBUG
 	&pm_test_attr.attr,
 #endif
@@ -446,7 +506,10 @@ static int __init pm_init(void)
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
-	return sysfs_create_group(power_kobj, &attr_group);
+	error = sysfs_create_group(power_kobj, &attr_group);
+	if (error)
+		return error;
+	return pm_autosleep_init();
 }
 
 core_initcall(pm_init);
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -458,8 +458,10 @@ static void wakeup_source_deactivate(str
 	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
 
 	split_counters(&cnt, &inpr);
-	if (!inpr)
+	if (!inpr) {
 		wake_up_all(&wakeup_count_wait_queue);
+		queue_up_suspend_work();
+	}
 }
 
 /**
@@ -610,29 +612,33 @@ bool pm_wakeup_pending(void)
 /**
  * pm_get_wakeup_count - Read the number of registered wakeup events.
  * @count: Address to store the value at.
+ * @block: Whether or not to block.
  *
- * Store the number of registered wakeup events at the address in @count.  Block
- * if the current number of wakeup events being processed is nonzero.
+ * Store the number of registered wakeup events at the address in @count.  If
+ * @block is set, block until the current number of wakeup events being
+ * processed is zero.
  *
- * Return 'false' if the wait for the number of wakeup events being processed to
- * drop down to zero has been interrupted by a signal (and the current number
- * of wakeup events being processed is still nonzero).  Otherwise return 'true'.
+ * Return 'false' if the current number of wakeup events being processed is
+ * nonzero.  Otherwise return 'true'.
  */
-bool pm_get_wakeup_count(unsigned int *count)
+bool pm_get_wakeup_count(unsigned int *count, bool block)
 {
 	unsigned int cnt, inpr;
-	DEFINE_WAIT(wait);
 
-	for (;;) {
-		prepare_to_wait(&wakeup_count_wait_queue, &wait,
-				TASK_INTERRUPTIBLE);
-		split_counters(&cnt, &inpr);
-		if (inpr == 0 || signal_pending(current))
-			break;
+	if (block) {
+		DEFINE_WAIT(wait);
 
-		schedule();
+		for (;;) {
+			prepare_to_wait(&wakeup_count_wait_queue, &wait,
+					TASK_INTERRUPTIBLE);
+			split_counters(&cnt, &inpr);
+			if (inpr == 0 || signal_pending(current))
+				break;
+
+			schedule();
+		}
+		finish_wait(&wakeup_count_wait_queue, &wait);
 	}
-	finish_wait(&wakeup_count_wait_queue, &wait);
 
 	split_counters(&cnt, &inpr);
 	*count = cnt;


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 7/8] PM / Sleep: Add "prevent autosleep time" statistics to wakeup sources
  2012-02-07  1:00 [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
                   ` (5 preceding siblings ...)
  2012-02-07  1:06 ` [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep Rafael J. Wysocki
@ 2012-02-07  1:06 ` Rafael J. Wysocki
  2012-02-07  1:07 ` [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating " Rafael J. Wysocki
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-07  1:06 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

From: Rafael J. Wysocki <rjw@sisk.pl>

Android uses one wakelock statistics that is only necessary for
opportunistic sleep.  Namely, the prevent_suspend_time field
accumulates the total time the given wakelock has been locked
while "automatic suspend" was enabled.  Add an analogous field,
prevent_sleep_time, to wakeup sources and make it behave in a similar
way.

---
 drivers/base/power/wakeup.c |   61 +++++++++++++++++++++++++++++++++++++++++---
 include/linux/pm_wakeup.h   |    4 ++
 include/linux/suspend.h     |    1 
 kernel/power/autosleep.c    |    2 +
 4 files changed, 64 insertions(+), 4 deletions(-)

Index: linux/include/linux/pm_wakeup.h
===================================================================
--- linux.orig/include/linux/pm_wakeup.h
+++ linux/include/linux/pm_wakeup.h
@@ -34,6 +34,7 @@
  * @total_time: Total time this wakeup source has been active.
  * @max_time: Maximum time this wakeup source has been continuously active.
  * @last_time: Monotonic clock when the wakeup source's was touched last time.
+ * @prevent_sleep_time: Total time this source has been preventing autosleep.
  * @event_count: Number of signaled wakeup events.
  * @active_count: Number of times the wakeup sorce was activated.
  * @relax_count: Number of times the wakeup sorce was deactivated.
@@ -51,6 +52,8 @@ struct wakeup_source {
 	ktime_t total_time;
 	ktime_t max_time;
 	ktime_t last_time;
+	ktime_t start_prevent_time;
+	ktime_t prevent_sleep_time;
 	unsigned long		event_count;
 	unsigned long		active_count;
 	unsigned long		relax_count;
@@ -58,6 +61,7 @@ struct wakeup_source {
 	unsigned long		wakeup_count;
 	bool			active:1;
 	bool			has_timeout:1;
+	bool			autosleep_enabled:1;
 };
 
 #ifdef CONFIG_PM_SLEEP
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -351,6 +351,8 @@ static void wakeup_source_activate(struc
 	ws->active_count++;
 	ws->timer_expires = jiffies;
 	ws->last_time = ktime_get();
+	if (ws->autosleep_enabled)
+		ws->start_prevent_time = ws->last_time;
 
 	/* Increment the counter of events in progress. */
 	atomic_inc(&combined_event_count);
@@ -407,6 +409,17 @@ void pm_stay_awake(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(pm_stay_awake);
 
+#ifdef CONFIG_PM_AUTOSLEEP
+static void update_prevent_sleep_time(struct wakeup_source *ws, ktime_t now)
+{
+	ktime_t delta = ktime_sub(now, ws->start_prevent_time);
+	ws->prevent_sleep_time = ktime_add(ws->prevent_sleep_time, delta);
+}
+#else
+static inline void update_prevent_sleep_time(struct wakeup_source *ws,
+					     ktime_t now) {}
+#endif
+
 /**
  * wakup_source_deactivate - Mark given wakeup source as inactive.
  * @ws: Wakeup source to handle.
@@ -451,6 +464,9 @@ static void wakeup_source_deactivate(str
 	ws->has_timeout = false;
 	del_timer(&ws->timer);
 
+	if (ws->autosleep_enabled)
+		update_prevent_sleep_time(ws, now);
+
 	/*
 	 * Increment the counter of registered wakeup events and decrement the
 	 * couter of wakeup events in progress simultaneously.
@@ -670,6 +686,34 @@ bool pm_save_wakeup_count(unsigned int c
 	return events_check_enabled;
 }
 
+#ifdef CONFIG_PM_AUTOSLEEP
+/**
+ * pm_wakep_autosleep_enabled - Modify autosleep_enabled for all wakeup sources.
+ * @enabled: Whether to set or to clear the autosleep_enabled flags.
+ */
+void pm_wakep_autosleep_enabled(bool set)
+{
+	struct wakeup_source *ws;
+	ktime_t now = ktime_get();
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ws, &wakeup_sources, entry) {
+		spin_lock_irq(&ws->lock);
+		if (ws->autosleep_enabled != set) {
+			ws->autosleep_enabled = set;
+			if (ws->active) {
+				if (set)
+					ws->start_prevent_time = now;
+				else
+					update_prevent_sleep_time(ws, now);
+			}
+		}
+		spin_unlock_irq(&ws->lock);
+	}
+	rcu_read_unlock();
+}
+#endif /* CONFIG_PM_AUTOSLEEP */
+
 static struct dentry *wakeup_sources_stats_dentry;
 
 /**
@@ -685,28 +729,37 @@ static int print_wakeup_source_stats(str
 	ktime_t max_time;
 	unsigned long active_count;
 	ktime_t active_time;
+	ktime_t prevent_sleep_time;
 	int ret;
 
 	spin_lock_irqsave(&ws->lock, flags);
 
 	total_time = ws->total_time;
 	max_time = ws->max_time;
+	prevent_sleep_time = ws->prevent_sleep_time;
 	active_count = ws->active_count;
 	if (ws->active) {
-		active_time = ktime_sub(ktime_get(), ws->last_time);
+		ktime_t now = ktime_get();
+
+		active_time = ktime_sub(now, ws->last_time);
 		total_time = ktime_add(total_time, active_time);
 		if (active_time.tv64 > max_time.tv64)
 			max_time = active_time;
+
+		if (ws->autosleep_enabled)
+			prevent_sleep_time = ktime_add(prevent_sleep_time,
+				ktime_sub(now, ws->start_prevent_time));
 	} else {
 		active_time = ktime_set(0, 0);
 	}
 
 	ret = seq_printf(m, "%-12s\t%lu\t\t%lu\t\t%lu\t\t%lu\t\t"
-			"%lld\t\t%lld\t\t%lld\t\t%lld\n",
+			"%lld\t\t%lld\t\t%lld\t\t%lld\t\t%lld\n",
 			ws->name, active_count, ws->event_count,
 			ws->wakeup_count, ws->expire_count,
 			ktime_to_ms(active_time), ktime_to_ms(total_time),
-			ktime_to_ms(max_time), ktime_to_ms(ws->last_time));
+			ktime_to_ms(max_time), ktime_to_ms(ws->last_time),
+			ktime_to_ms(prevent_sleep_time));
 
 	spin_unlock_irqrestore(&ws->lock, flags);
 
@@ -723,7 +776,7 @@ static int wakeup_sources_stats_show(str
 
 	seq_puts(m, "name\t\tactive_count\tevent_count\twakeup_count\t"
 		"expire_count\tactive_since\ttotal_time\tmax_time\t"
-		"last_change\n");
+		"last_change\tprevent_suspend_time\n");
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(ws, &wakeup_sources, entry)
Index: linux/include/linux/suspend.h
===================================================================
--- linux.orig/include/linux/suspend.h
+++ linux/include/linux/suspend.h
@@ -374,6 +374,7 @@ extern bool events_check_enabled;
 extern bool pm_wakeup_pending(void);
 extern bool pm_get_wakeup_count(unsigned int *count, bool block);
 extern bool pm_save_wakeup_count(unsigned int count);
+extern void pm_wakep_autosleep_enabled(bool set);
 
 static inline void lock_system_sleep(void)
 {
Index: linux/kernel/power/autosleep.c
===================================================================
--- linux.orig/kernel/power/autosleep.c
+++ linux/kernel/power/autosleep.c
@@ -78,11 +78,13 @@ int pm_autosleep_set_state(suspend_state
 	if (state == PM_SUSPEND_ON && autosleep_state != PM_SUSPEND_ON) {
 		autosleep_state = PM_SUSPEND_ON;
 		__pm_relax(autosleep_ws);
+		pm_wakep_autosleep_enabled(false);
 		mutex_unlock(&autosleep_lock);
 		wait_for_completion(&suspend_completion);
 	} else if (state > PM_SUSPEND_ON) {
 		autosleep_state = state;
 		__pm_relax(autosleep_ws);
+		pm_wakep_autosleep_enabled(true);
 		queue_up_suspend_work();
 		mutex_unlock(&autosleep_lock);
 	}


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating wakeup sources
  2012-02-07  1:00 [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
                   ` (6 preceding siblings ...)
  2012-02-07  1:06 ` [RFC][PATCH 7/8] PM / Sleep: Add "prevent autosleep time" statistics to wakeup sources Rafael J. Wysocki
@ 2012-02-07  1:07 ` Rafael J. Wysocki
  2012-02-07  1:13 ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-07  1:07 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

From: Rafael J. Wysocki <rjw@sisk.pl>

Android allows user space to manipulate wakelocks using two
sysfs file located in /sys/power/, wake_lock and wake_unlock.
Writing a wakelock name and optionally a timeout to the wake_lock
file causes the wakelock whose name was written to be acquired (it
is created before is necessary), optionally with the given timeout.
Writing the name of a wakelock to wake_unlock causes that wakelock
to be released.

Implement an analogous interface for user space using wakeup sources.
Add the /sys/power/wake_lock and /sys/power/wake_unlock files
allowing user space to create, activate and deactivate wakeup
sources, such that writing a name and optionally a timeout to
wake_lock causes the wakeup source of that name to be activated,
optionally with the given timeout.  If that wakeup source doesn't
exist, it will be created and then activated.  Writing a name to
wake_unlock causes the wakeup source of that name, if there is one,
to be deactivated.  Wakeup sources created with the help of
wake_lock that haven't been used for more than 5 minutes are garbage
collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
wakeup sources created with the help of wake_lock present at a time.

The data type used to track wakeup sources created by user space is
called "struct wakelock" to indicate the origins of this feature.

---
 drivers/base/power/wakeup.c |    1 
 kernel/power/Kconfig        |    8 +
 kernel/power/Makefile       |    1 
 kernel/power/main.c         |   41 ++++++++
 kernel/power/power.h        |    9 +
 kernel/power/wakelock.c     |  218 ++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 278 insertions(+)

Index: linux/kernel/power/main.c
===================================================================
--- linux.orig/kernel/power/main.c
+++ linux/kernel/power/main.c
@@ -415,6 +415,43 @@ static ssize_t autosleep_store(struct ko
 
 power_attr(autosleep);
 #endif /* CONFIG_PM_AUTOSLEEP */
+
+#ifdef CONFIG_PM_WAKELOCKS
+static ssize_t wake_lock_show(struct kobject *kobj,
+			      struct kobj_attribute *attr,
+			      char *buf)
+{
+	return pm_show_wakelocks(buf, true);
+}
+
+static ssize_t wake_lock_store(struct kobject *kobj,
+			       struct kobj_attribute *attr,
+			       const char *buf, size_t n)
+{
+	int error = pm_wake_lock(buf);
+	return error ? error : n;
+}
+
+power_attr(wake_lock);
+
+static ssize_t wake_unlock_show(struct kobject *kobj,
+				struct kobj_attribute *attr,
+				char *buf)
+{
+	return pm_show_wakelocks(buf, false);
+}
+
+static ssize_t wake_unlock_store(struct kobject *kobj,
+				 struct kobj_attribute *attr,
+				 const char *buf, size_t n)
+{
+	int error = pm_wake_unlock(buf);
+	return error ? error : n;
+}
+
+power_attr(wake_unlock);
+
+#endif /* CONFIG_PM_WAKELOCKS */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_TRACE
@@ -471,6 +508,10 @@ static struct attribute * g[] = {
 #ifdef CONFIG_PM_AUTOSLEEP
 	&autosleep_attr.attr,
 #endif
+#ifdef CONFIG_PM_WAKELOCKS
+	&wake_lock_attr.attr,
+	&wake_unlock_attr.attr,
+#endif
 #ifdef CONFIG_PM_DEBUG
 	&pm_test_attr.attr,
 #endif
Index: linux/kernel/power/power.h
===================================================================
--- linux.orig/kernel/power/power.h
+++ linux/kernel/power/power.h
@@ -287,3 +287,12 @@ static inline void pm_autosleep_unlock(v
 static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
 
 #endif /* !CONFIG_PM_AUTOSLEEP */
+
+#ifdef CONFIG_PM_WAKELOCKS
+
+/* kernel/power/wakelock.c */
+extern ssize_t pm_show_wakelocks(char *buf, bool show_active);
+extern int pm_wake_lock(const char *buf);
+extern int pm_wake_unlock(const char *buf);
+
+#endif /* !CONFIG_PM_WAKELOCKS */
Index: linux/kernel/power/Kconfig
===================================================================
--- linux.orig/kernel/power/Kconfig
+++ linux/kernel/power/Kconfig
@@ -111,6 +111,14 @@ config PM_AUTOSLEEP
 	Allow the kernel to trigger a system transition into a global sleep
 	state automatically whenever there are no active wakeup sources.
 
+config PM_WAKELOCKS
+	bool "User space wakeup sources interface"
+	depends on PM_SLEEP
+	default n
+	---help---
+	Allow user space to create, activate and deactivate wakeup source
+	objects with the help of a sysfs-based interface.
+
 config PM_RUNTIME
 	bool "Run-time PM core functionality"
 	depends on !IA64_HP_SIM
Index: linux/kernel/power/wakelock.c
===================================================================
--- /dev/null
+++ linux/kernel/power/wakelock.c
@@ -0,0 +1,218 @@
+/*
+ * kernel/power/wakelock.c
+ *
+ * User space wakeup sources support.
+ *
+ * Copyright (C) 2012 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This code is based on the analogous interface allowing user space to
+ * manipulate wakelocks on Android.
+ */
+
+#include <linux/ctype.h>
+#include <linux/device.h>
+#include <linux/err.h>
+#include <linux/hrtimer.h>
+#include <linux/list.h>
+#include <linux/rbtree.h>
+#include <linux/slab.h>
+
+#define WL_NUMBER_LIMIT	100
+#define WL_GC_COUNT_MAX	100
+#define WL_GC_TIME_SEC	300
+
+static DEFINE_MUTEX(wakelocks_lock);
+
+struct wakelock {
+	char			*name;
+	struct rb_node		node;
+	struct wakeup_source	ws;
+	struct list_head	lru;
+};
+
+static struct rb_root wakelocks_tree = RB_ROOT;
+static LIST_HEAD(wakelocks_lru_list);
+static unsigned int number_of_wakelocks;
+static unsigned int wakelocks_gc_count;
+
+ssize_t pm_show_wakelocks(char *buf, bool show_active)
+{
+	struct rb_node *node;
+	struct wakelock *wl;
+	char *str = buf;
+	char *end = buf + PAGE_SIZE;
+
+	mutex_lock(&wakelocks_lock);
+
+	for (node = rb_first(&wakelocks_tree); node; node = rb_next(node)) {
+		bool active;
+
+		wl = rb_entry(node, struct wakelock, node);
+		spin_lock_irq(&wl->ws.lock);
+		active = wl->ws.active;
+		spin_unlock_irq(&wl->ws.lock);
+		if (active == show_active)
+			str += scnprintf(str, end - str, "%s ", wl->name);
+	}
+	str += scnprintf(str, end - str, "\n");
+
+	mutex_unlock(&wakelocks_lock);
+	return (str - buf);
+}
+
+static struct wakelock *wakelock_lookup_add(const char *name, size_t len,
+					    bool add_if_not_found)
+{
+	struct rb_node **node = &wakelocks_tree.rb_node;
+	struct rb_node *parent = *node;
+	struct wakelock *wl;
+
+	while (*node) {
+		int diff;
+
+		wl = rb_entry(*node, struct wakelock, node);
+		diff = strncmp(name, wl->name, len);
+		if (diff == 0) {
+			if (wl->name[len])
+				diff = -1;
+			else
+				return wl;
+		}
+		if (diff < 0)
+			node = &(*node)->rb_left;
+		else
+			node = &(*node)->rb_right;
+
+		parent = *node;
+	}
+	if (!add_if_not_found)
+		return ERR_PTR(-EINVAL);
+
+	if (number_of_wakelocks > WL_NUMBER_LIMIT)
+		return ERR_PTR(-ENOSPC);
+
+	/* Not found, we have to add a new one. */
+	wl = kzalloc(sizeof(*wl), GFP_KERNEL);
+	if (!wl)
+		return ERR_PTR(-ENOMEM);
+
+	wl->name = kstrndup(name, len, GFP_KERNEL);
+	if (!wl->name) {
+		kfree(wl);
+		return ERR_PTR(-ENOMEM);
+	}
+	wl->ws.name = wl->name;
+	wakeup_source_add(&wl->ws);
+	rb_link_node(&wl->node, parent, node);
+	rb_insert_color(&wl->node, &wakelocks_tree);
+	list_add(&wl->lru, &wakelocks_lru_list);
+	number_of_wakelocks++;
+	return wl;
+}
+
+int pm_wake_lock(const char *buf)
+{
+	const char *str = buf;
+	struct wakelock *wl;
+	u64 timeout_ns = 0;
+	size_t len;
+	int ret = 0;
+
+	while (*str && !isspace(*str))
+		str++;
+
+	len = str - buf;
+	if (!len)
+		return -EINVAL;
+
+	if (*str && *str != '\n') {
+		/* Find out if there's a valid timeout string appended. */
+		ret = kstrtou64(skip_spaces(str), 10, &timeout_ns);
+		if (ret)
+			return -EINVAL;
+	}
+
+	mutex_lock(&wakelocks_lock);
+
+	wl = wakelock_lookup_add(buf, len, true);
+	if (IS_ERR(wl)) {
+		ret = PTR_ERR(wl);
+		goto out;
+	}
+	if (timeout_ns) {
+		u64 timeout_ms = timeout_ns + NSEC_PER_MSEC - 1;
+
+		do_div(timeout_ms, NSEC_PER_MSEC);
+		__pm_wakeup_event(&wl->ws, timeout_ms);
+	} else {
+		__pm_stay_awake(&wl->ws);
+	}
+
+	list_move(&wl->lru, &wakelocks_lru_list);
+
+ out:
+	mutex_unlock(&wakelocks_lock);
+	return ret;
+}
+
+static void wakelocks_gc(void)
+{
+	struct wakelock *wl, *aux;
+	ktime_t now = ktime_get();
+
+	list_for_each_entry_safe_reverse(wl, aux, &wakelocks_lru_list, lru) {
+		u64 idle_time_ns;
+		bool active;
+
+		spin_lock_irq(&wl->ws.lock);
+		idle_time_ns = ktime_to_ns(ktime_sub(now, wl->ws.last_time));
+		active = wl->ws.active;
+		spin_unlock_irq(&wl->ws.lock);
+
+		if (idle_time_ns < ((u64)WL_GC_TIME_SEC * NSEC_PER_SEC))
+			break;
+
+		if (!active) {
+			wakeup_source_remove(&wl->ws);
+			rb_erase(&wl->node, &wakelocks_tree);
+			list_del(&wl->lru);
+			kfree(wl->name);
+			kfree(wl);
+			number_of_wakelocks--;
+		}
+	}
+	wakelocks_gc_count = 0;
+}
+
+int pm_wake_unlock(const char *buf)
+{
+	struct wakelock *wl;
+	size_t len;
+	int ret = 0;
+
+	len = strlen(buf);
+	if (!len)
+		return -EINVAL;
+
+	if (buf[len-1] == '\n')
+		len--;
+
+	if (!len)
+		return -EINVAL;
+
+	mutex_lock(&wakelocks_lock);
+
+	wl = wakelock_lookup_add(buf, len, false);
+	if (IS_ERR(wl)) {
+		ret = PTR_ERR(wl);
+		goto out;
+	}
+	__pm_relax(&wl->ws);
+	list_move(&wl->lru, &wakelocks_lru_list);
+	if (++wakelocks_gc_count > WL_GC_COUNT_MAX)
+		wakelocks_gc();
+
+ out:
+	mutex_unlock(&wakelocks_lock);
+	return ret;
+}
Index: linux/kernel/power/Makefile
===================================================================
--- linux.orig/kernel/power/Makefile
+++ linux/kernel/power/Makefile
@@ -9,5 +9,6 @@ obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend
 obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
 				   block_io.o
 obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
+obj-$(CONFIG_PM_WAKELOCKS)	+= wakelock.o
 
 obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -107,6 +107,7 @@ void wakeup_source_add(struct wakeup_sou
 	spin_lock_init(&ws->lock);
 	setup_timer(&ws->timer, pm_wakeup_timer_fn, (unsigned long)ws);
 	ws->active = false;
+	ws->last_time = ktime_get();
 
 	spin_lock_irq(&events_lock);
 	list_add_rcu(&ws->entry, &wakeup_sources);

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-07  1:00 [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
                   ` (7 preceding siblings ...)
  2012-02-07  1:07 ` [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating " Rafael J. Wysocki
@ 2012-02-07  1:13 ` Rafael J. Wysocki
  2012-02-08 23:57 ` NeilBrown
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-07  1:13 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

On Tuesday, February 07, 2012, Rafael J. Wysocki wrote:
> Hi all,
> 
> This series tests the theory that the easiest way to sell a once rejected
> feature is to advertise it under a different name.
> 
> Well, there actually are two different features, although they are closely
> related to each other.  First, patch [6/8] introduces a feature that allows
> the kernel to trigger system suspend (or more generally a transition into
> a sleep state) whenever there are no active wakeup sources (no, they aren't
> called wakelocks).  It is called "autosleep" here, but it was called a few
> different names in the past ("opportunistic suspend" was probably the most
> popular one).  Second, patch [8/8] introduces "wake locks" that are,
> essentially, wakeup sources which may be created and manipulated by user
> space.  Using them user space may control the autosleep feature introduced
> earlier.
> 
> This also is a kind of a proof of concept for the people who wanted me to
> show a kernel-based implementation of automatic suspend, so there you go.
> Please note, however, that it is done so that the user space "wake locks"
> interface is compatible with Android in support of its user space.  I don't
> really like this interface, but since the Android's user space seems to rely
> on it, I'm fine with using it as is.  YMMV.
> 
> Let me say a few words about every patch in the series individually.
> 
> [1/8] - This really is a bug fix, so it's v3.4 material.  Nobody has stepped
>   on this bug so far, but it should be fixed anyway.
> 
> [2/8] - This is a freezer cleanup, worth doing anyway IMO, so v3.4 material too.
> 
> [3/8] - This is something we can do no problem, although completely optional
>   without the autosleep feature.  Rather necessary with it, though.
> 
> [4/8] - This kind of reintroduces my original idea of using a wait queue for
>   waiting until there are no wakeup events in progress.  Alan convinced me that
>   it would be better to poll the counter to prevent wakeup_source_deactivate()
>   from having to call wake_up_all() occasionally (that may be costly in fast
>   paths), but then quite some people told me that the wait queue migh be
>   better.  I think that the polling will make much less sense with autosleep
>   and user space "wake locks".  Anyway, [4/8] is something we can do without
>   those things too.
> 
> The patches above were given Sign-off-by tags, because I think they make some
> sense regardless of the features introcuded by the remaining patches that in
> turn are total RFC.
> 
> [5/8] - This changes wakeup source statistics so that they are more similar to
>   the statistics collected for wakelocks on Android.  The file those statistics
>   may be read from is still located in debugfs, though (I don't think it
>   belongs to proc and its name is different from the analogous Android's file
>   name anyway).  It could be done without autosleep, but then it would be a bit
>   pointless.  BTW, this changes interfaces that _in_ _theory_ may be used by
>   someone, but I'm not aware of anyone using them.  If you are one, I'll be
>   pleased to learn about that, so please tell me who you are. :-)
> 
> [6/8] - Autosleep implementation.  I think the changelog explains the idea
>   quite well and the code is really nothing special.  It doesn't really add
>   anything new to the kernel in terms of infrastructure etc., it just uses
>   the existing stuff to implement an alternative method of triggering system
>   sleep transitions.  Note, though, that the interface here is different
>   from the Android's one, because Android actually modifies /sys/power/state
>   to trigger something called "early suspend" (that is never going to be
>   implemented in the "stock" kernel as long as I have any influence on it) and
>   we simply can't do that in the mainline.
> 
> [7/8] - This adds a wakeup source statistics that only makes sense with
>   autosleep and (I believe) is analogous to the Android's prevent_suspend_time
>   statistics.  Nothing really special, but I didn't want
>   wakeup_source_activate/deactivate() to take a common lock to avoid
>   congestion.
> 
> [8/8] - This adds a user space interface to create, activate and deactivate
>   wakeup sources.  Since the files it consists of are called wake_lock and
>   wake_unlock, to follow Android, the objects the wakeup sources are wrapped
>   into are called "wakelocks" (for added confusion).  Since the interface
>   doesn't provide any means to destroy those "wakelocks", I added a garbage
>   collection mechanism to get rid of the unused ones, if any.  I also tought
>   it might be a good idea to put a limit on the number of those things that
>   user space can operate simultaneously, so I did that too.
> 
> All in all, it's not as much code as I thought it would be and it seems to be
> relatively simple (which rises the question why the Android people didn't
> even _try_ to do something like this instead of slapping the "real" wakelocks
> onto the kernel FWIW).  IMHO it doesn't add anything really new to the kernel,
> except for the user space interfaces that should be maintainable.  At least I
> think I should be able to maintain them. :-)
> 
> All of the above has been tested very briefly on my test-bed Mackerel board
> and it quite obviously requires more thorough testing, but first I need to know
> if it makes sense to spend any more time on it.
> 
> IOW, I need to know your opinions!

Ouch.  Sorry for breaking the Greg's address.  Please replace it with the
correct one when you reply.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 1/8] PM / Sleep: Initialize wakeup source locks in wakeup_source_add()
  2012-02-07  1:01 ` [PATCH 1/8] PM / Sleep: Initialize wakeup source locks in wakeup_source_add() Rafael J. Wysocki
@ 2012-02-07 22:29   ` John Stultz
  2012-02-07 22:41     ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: John Stultz @ 2012-02-07 22:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, Brian Swetland, Neil Brown,
	Alan Stern

On Tue, 2012-02-07 at 02:01 +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Initialize wakeup source locks in wakeup_source_add() instead of
> wakeup_source_create(), because otherwise the locks of the wakeup
> sources that haven't been allocated with wakeup_source_create()
> aren't initialized and handled properly.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

Ah, I've shot myself in the foot before, forgetting to init the wakeup
source, so this should be good. Although, would a WARN_ON be better then
just initializing the lock in add? That way bad behavior is more likely
to be corrected, rather then just ignored.

thanks
-john


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 1/8] PM / Sleep: Initialize wakeup source locks in wakeup_source_add()
  2012-02-07 22:29   ` John Stultz
@ 2012-02-07 22:41     ` Rafael J. Wysocki
  0 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-07 22:41 UTC (permalink / raw)
  To: John Stultz
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, Brian Swetland, Neil Brown,
	Alan Stern

On Tuesday, February 07, 2012, John Stultz wrote:
> On Tue, 2012-02-07 at 02:01 +0100, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > Initialize wakeup source locks in wakeup_source_add() instead of
> > wakeup_source_create(), because otherwise the locks of the wakeup
> > sources that haven't been allocated with wakeup_source_create()
> > aren't initialized and handled properly.
> > 
> > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Ah, I've shot myself in the foot before, forgetting to init the wakeup
> source, so this should be good. Although, would a WARN_ON be better then
> just initializing the lock in add? That way bad behavior is more likely
> to be corrected, rather then just ignored.

Well, that's not bad behavior, since users are not supposed to open code
wakeup source initialization.  _add() is supposed to do the job (that's
why I regard this one as a fix).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [Update][RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
  2012-02-07  1:06 ` [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep Rafael J. Wysocki
@ 2012-02-07 22:49   ` Rafael J. Wysocki
  0 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-07 22:49 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

On Tuesday, February 07, 2012, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Introduce a mechanism by which the kernel can trigger global
> transitions to a sleep state chosen by user space if there are no
> active wakeup sources.
> 
> It consists of a new sysfs attribute, /sys/power/autosleep, that
> can be written one of the strings returned by reads from
> /sys/power/state, a freezable ordered workqueue and a work item
> carrying out the "suspend" operations.  If a string representing
> the system's sleep state is written to /sys/power/autosleep, the
> work item triggering transitions to that state is queued up and
> it requeues it self after every execution until user space writes
> "off" to /sys/power/autosleep.  That work item enables the detection
> of wakeup events using the functions already defined in
> drivers/base/power/wakeup.c (with one small modification) and calls
> either pm_suspend(), or hibernate() to put the system into a sleep
> state.  If a wakeup event is reported while the transition is in
> progress, it will abort the transition and the "system suspend" work
> item will be queued up again.

OK, so before somebody points that out to me, the completion was redundant
(it was a leftover from one of the previous versions of the patch, sorry
about that).

Moreover, try_to_suspend() is racy with respect to wakeup_count_store()
(in theory, an automatic suspend without checking wakeup sources may happen
if the latter is used carelessly when autosleep is enabled).

Thus below is an updated patch (it requires [8/8] to be updated too because
of the changes in pm_autosleep_set_state(), but that's rather trivial).

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM / Sleep: Implement opportunistic sleep

Introduce a mechanism by which the kernel can trigger global
transitions to a sleep state chosen by user space if there are no
active wakeup sources.

It consists of a new sysfs attribute, /sys/power/autosleep, that
can be written one of the strings returned by reads from
/sys/power/state, an ordered workqueue and a work item carrying out
the "suspend" operations.  If a string representing the system's
sleep state is written to /sys/power/autosleep, the work item
triggering transitions to that state is queued up and it requeues
itself after every execution until user space writes "off" to
/sys/power/autosleep.

That work item enables the detection of wakeup events using the
functions already defined in drivers/base/power/wakeup.c (with one
small modification) and calls either pm_suspend(), or hibernate() to
put the system into a sleep state.  If a wakeup event is reported
while the transition is in progress, it will abort the transition and
the "system suspend" work item will be queued up again.

---
 drivers/base/power/wakeup.c |   38 ++++++++------
 include/linux/suspend.h     |   13 ++++-
 kernel/power/Kconfig        |    8 +++
 kernel/power/Makefile       |    1 
 kernel/power/autosleep.c    |  112 ++++++++++++++++++++++++++++++++++++++++++++
 kernel/power/main.c         |  105 ++++++++++++++++++++++++++++++++++-------
 kernel/power/power.h        |   18 +++++++
 7 files changed, 262 insertions(+), 33 deletions(-)

Index: linux/kernel/power/Makefile
===================================================================
--- linux.orig/kernel/power/Makefile
+++ linux/kernel/power/Makefile
@@ -8,5 +8,6 @@ obj-$(CONFIG_SUSPEND)		+= suspend.o
 obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend_test.o
 obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
 				   block_io.o
+obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
 
 obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
Index: linux/kernel/power/Kconfig
===================================================================
--- linux.orig/kernel/power/Kconfig
+++ linux/kernel/power/Kconfig
@@ -103,6 +103,14 @@ config PM_SLEEP_SMP
 	select HOTPLUG
 	select HOTPLUG_CPU
 
+config PM_AUTOSLEEP
+	bool "Opportunistic sleep"
+	depends on PM_SLEEP
+	default n
+	---help---
+	Allow the kernel to trigger a system transition into a global sleep
+	state automatically whenever there are no active wakeup sources.
+
 config PM_RUNTIME
 	bool "Run-time PM core functionality"
 	depends on !IA64_HP_SIM
Index: linux/kernel/power/power.h
===================================================================
--- linux.orig/kernel/power/power.h
+++ linux/kernel/power/power.h
@@ -269,3 +269,21 @@ static inline void suspend_thaw_processe
 {
 }
 #endif
+
+#ifdef CONFIG_PM_AUTOSLEEP
+
+/* kernel/power/autosleep.c */
+extern int pm_autosleep_init(void);
+extern void pm_autosleep_lock(void);
+extern void pm_autosleep_unlock(void);
+extern suspend_state_t pm_autosleep_state(void);
+extern int pm_autosleep_set_state(suspend_state_t state);
+
+#else /* !CONFIG_PM_AUTOSLEEP */
+
+static inline int pm_autosleep_init(void) { return 0; }
+static inline void pm_autosleep_lock(void) {}
+static inline void pm_autosleep_unlock(void) {}
+static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
+
+#endif /* !CONFIG_PM_AUTOSLEEP */
Index: linux/include/linux/suspend.h
===================================================================
--- linux.orig/include/linux/suspend.h
+++ linux/include/linux/suspend.h
@@ -372,7 +372,7 @@ extern int unregister_pm_notifier(struct
 extern bool events_check_enabled;
 
 extern bool pm_wakeup_pending(void);
-extern bool pm_get_wakeup_count(unsigned int *count);
+extern bool pm_get_wakeup_count(unsigned int *count, bool block);
 extern bool pm_save_wakeup_count(unsigned int count);
 
 static inline void lock_system_sleep(void)
@@ -423,6 +423,17 @@ static inline void unlock_system_sleep(v
 
 #endif /* !CONFIG_PM_SLEEP */
 
+#ifdef CONFIG_PM_AUTOSLEEP
+
+/* kernel/power/autosleep.c */
+void queue_up_suspend_work(void);
+
+#else /* !CONFIG_PM_AUTOSLEEP */
+
+static inline void queue_up_suspend_work(void) {}
+
+#endif /* !CONFIG_PM_AUTOSLEEP */
+
 #ifdef CONFIG_ARCH_SAVE_PAGE_KEYS
 /*
  * The ARCH_SAVE_PAGE_KEYS functions can be used by an architecture
Index: linux/kernel/power/autosleep.c
===================================================================
--- /dev/null
+++ linux/kernel/power/autosleep.c
@@ -0,0 +1,112 @@
+/*
+ * kernel/power/autosleep.c
+ *
+ * Opportunistic sleep support.
+ *
+ * Copyright (C) 2012 Rafael J. Wysocki <rjw@sisk.pl>
+ */
+
+#include <linux/device.h>
+#include <linux/mutex.h>
+#include <linux/pm_wakeup.h>
+
+#include "power.h"
+
+static struct workqueue_struct *autosleep_wq;
+static struct wakeup_source *autosleep_ws;
+
+static DEFINE_MUTEX(autosleep_lock);
+
+static suspend_state_t autosleep_state;
+
+static void try_to_suspend(struct work_struct *work)
+{
+	unsigned int initial_count, final_count;
+
+	if (!pm_get_wakeup_count(&initial_count, true))
+		goto out;
+
+	mutex_lock(&autosleep_lock);
+
+	if (!pm_save_wakeup_count(initial_count)) {
+		mutex_unlock(&autosleep_lock);
+		goto out;
+	}
+
+	if (autosleep_state == PM_SUSPEND_ON) {
+		mutex_unlock(&autosleep_lock);
+		return;
+	}
+	if (autosleep_state >= PM_SUSPEND_MAX)
+		hibernate();
+	else
+		pm_suspend(autosleep_state);
+
+	mutex_unlock(&autosleep_lock);
+
+	if (!pm_get_wakeup_count(&final_count, false))
+		goto out;
+
+	if (final_count == initial_count)
+		schedule_timeout(HZ / 2);
+
+ out:
+	queue_up_suspend_work();
+}
+
+static DECLARE_WORK(suspend_work, try_to_suspend);
+
+void queue_up_suspend_work(void)
+{
+	if (!work_pending(&suspend_work) && autosleep_state > PM_SUSPEND_ON)
+		queue_work(autosleep_wq, &suspend_work);
+}
+
+suspend_state_t pm_autosleep_state(void)
+{
+	return autosleep_state;
+}
+
+int pm_autosleep_set_state(suspend_state_t state)
+{
+#ifndef CONFIG_HIBERNATION
+	if (state >= PM_SUSPEND_MAX)
+		return -EINVAL;
+#endif
+	mutex_lock(&autosleep_lock);
+	__pm_stay_awake(autosleep_ws);
+	if (state == PM_SUSPEND_ON && autosleep_state != PM_SUSPEND_ON) {
+		autosleep_state = PM_SUSPEND_ON;
+		__pm_relax(autosleep_ws);
+	} else if (state > PM_SUSPEND_ON) {
+		autosleep_state = state;
+		__pm_relax(autosleep_ws);
+		queue_up_suspend_work();
+	}
+	mutex_unlock(&autosleep_lock);
+	return 0;
+}
+
+void pm_autosleep_lock(void)
+{
+	mutex_lock(&autosleep_lock);
+}
+
+void pm_autosleep_unlock(void)
+{
+	mutex_unlock(&autosleep_lock);
+}
+
+int __init pm_autosleep_init(void)
+{
+	autosleep_ws = wakeup_source_register("main");
+	if (!autosleep_ws)
+		return -ENOMEM;
+
+	autosleep_wq = alloc_ordered_workqueue("autosleep", 0);
+	if (autosleep_wq)
+		return 0;
+
+	wakeup_source_unregister(autosleep_ws);
+	return -ENOMEM;
+}
Index: linux/kernel/power/main.c
===================================================================
--- linux.orig/kernel/power/main.c
+++ linux/kernel/power/main.c
@@ -269,8 +269,7 @@ static ssize_t state_show(struct kobject
 	return (s - buf);
 }
 
-static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
-			   const char *buf, size_t n)
+static suspend_state_t decode_state(const char *buf, size_t n)
 {
 #ifdef CONFIG_SUSPEND
 	suspend_state_t state = PM_SUSPEND_STANDBY;
@@ -278,29 +277,46 @@ static ssize_t state_store(struct kobjec
 #endif
 	char *p;
 	int len;
-	int error = -EINVAL;
 
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
-	/* First, check if we are requested to hibernate */
-	if (len == 4 && !strncmp(buf, "disk", len)) {
-		error = hibernate();
-		goto Exit;
-	}
+	/* Check hibernation first. */
+	if (len == 4 && !strncmp(buf, "disk", len))
+		return PM_SUSPEND_MAX;
 
 #ifdef CONFIG_SUSPEND
 	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
 		if (*s && len == strlen(*s) && !strncmp(buf, *s, len))
 			break;
 	}
-	if (state < PM_SUSPEND_MAX && *s) {
-		error = enter_state(state);
-		suspend_stats_update(error);
-	}
+	if (state < PM_SUSPEND_MAX && *s)
+		return state;
 #endif
 
- Exit:
+	return PM_SUSPEND_ON;
+}
+
+static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
+			   const char *buf, size_t n)
+{
+	suspend_state_t state;
+	int error = -EINVAL;
+
+	pm_autosleep_lock();
+	if (pm_autosleep_state() > PM_SUSPEND_ON) {
+		error = -EBUSY;
+		goto out;
+	}
+
+	state = decode_state(buf, n);
+	if (state < PM_SUSPEND_MAX)
+		error = pm_suspend(state);
+	else if (state > PM_SUSPEND_ON)
+		error = hibernate();
+
+ out:
+	pm_autosleep_unlock();
 	return error ? error : n;
 }
 
@@ -341,7 +357,8 @@ static ssize_t wakeup_count_show(struct
 {
 	unsigned int val;
 
-	return pm_get_wakeup_count(&val) ? sprintf(buf, "%u\n", val) : -EINTR;
+	return pm_get_wakeup_count(&val, true) ?
+		sprintf(buf, "%u\n", val) : -EINTR;
 }
 
 static ssize_t wakeup_count_store(struct kobject *kobj,
@@ -349,15 +366,65 @@ static ssize_t wakeup_count_store(struct
 				const char *buf, size_t n)
 {
 	unsigned int val;
+	int error = -EINVAL;
+
+	pm_autosleep_lock();
+	if (pm_autosleep_state() > PM_SUSPEND_ON) {
+		error = -EBUSY;
+		goto out;
+	}
 
 	if (sscanf(buf, "%u", &val) == 1) {
 		if (pm_save_wakeup_count(val))
 			return n;
 	}
-	return -EINVAL;
+
+ out:
+	pm_autosleep_unlock();
+	return error;
 }
 
 power_attr(wakeup_count);
+
+#ifdef CONFIG_PM_AUTOSLEEP
+static ssize_t autosleep_show(struct kobject *kobj,
+			      struct kobj_attribute *attr,
+			      char *buf)
+{
+	suspend_state_t state = pm_autosleep_state();
+
+	if (state == PM_SUSPEND_ON)
+		return sprintf(buf, "off\n");
+
+#ifdef CONFIG_SUSPEND
+	if (state < PM_SUSPEND_MAX)
+		return sprintf(buf, "%s\n", valid_state(state) ?
+						pm_states[state] : "error");
+#endif
+#ifdef CONFIG_HIBERNATION
+	return sprintf(buf, "disk\n");
+#else
+	return sprintf(buf, "error");
+#endif
+}
+
+static ssize_t autosleep_store(struct kobject *kobj,
+			       struct kobj_attribute *attr,
+			       const char *buf, size_t n)
+{
+	suspend_state_t state = decode_state(buf, n);
+	int error;
+
+	if (state == PM_SUSPEND_ON && strncmp(buf, "off", 3)
+	    && strncmp(buf, "off\n", 4))
+		return -EINVAL;
+
+	error = pm_autosleep_set_state(state);
+	return error ? error : n;
+}
+
+power_attr(autosleep);
+#endif /* CONFIG_PM_AUTOSLEEP */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_TRACE
@@ -411,6 +478,9 @@ static struct attribute * g[] = {
 #ifdef CONFIG_PM_SLEEP
 	&pm_async_attr.attr,
 	&wakeup_count_attr.attr,
+#ifdef CONFIG_PM_AUTOSLEEP
+	&autosleep_attr.attr,
+#endif
 #ifdef CONFIG_PM_DEBUG
 	&pm_test_attr.attr,
 #endif
@@ -446,7 +516,10 @@ static int __init pm_init(void)
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
-	return sysfs_create_group(power_kobj, &attr_group);
+	error = sysfs_create_group(power_kobj, &attr_group);
+	if (error)
+		return error;
+	return pm_autosleep_init();
 }
 
 core_initcall(pm_init);
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -458,8 +458,10 @@ static void wakeup_source_deactivate(str
 	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
 
 	split_counters(&cnt, &inpr);
-	if (!inpr)
+	if (!inpr) {
 		wake_up_all(&wakeup_count_wait_queue);
+		queue_up_suspend_work();
+	}
 }
 
 /**
@@ -610,29 +612,33 @@ bool pm_wakeup_pending(void)
 /**
  * pm_get_wakeup_count - Read the number of registered wakeup events.
  * @count: Address to store the value at.
+ * @block: Whether or not to block.
  *
- * Store the number of registered wakeup events at the address in @count.  Block
- * if the current number of wakeup events being processed is nonzero.
+ * Store the number of registered wakeup events at the address in @count.  If
+ * @block is set, block until the current number of wakeup events being
+ * processed is zero.
  *
- * Return 'false' if the wait for the number of wakeup events being processed to
- * drop down to zero has been interrupted by a signal (and the current number
- * of wakeup events being processed is still nonzero).  Otherwise return 'true'.
+ * Return 'false' if the current number of wakeup events being processed is
+ * nonzero.  Otherwise return 'true'.
  */
-bool pm_get_wakeup_count(unsigned int *count)
+bool pm_get_wakeup_count(unsigned int *count, bool block)
 {
 	unsigned int cnt, inpr;
-	DEFINE_WAIT(wait);
 
-	for (;;) {
-		prepare_to_wait(&wakeup_count_wait_queue, &wait,
-				TASK_INTERRUPTIBLE);
-		split_counters(&cnt, &inpr);
-		if (inpr == 0 || signal_pending(current))
-			break;
+	if (block) {
+		DEFINE_WAIT(wait);
 
-		schedule();
+		for (;;) {
+			prepare_to_wait(&wakeup_count_wait_queue, &wait,
+					TASK_INTERRUPTIBLE);
+			split_counters(&cnt, &inpr);
+			if (inpr == 0 || signal_pending(current))
+				break;
+
+			schedule();
+		}
+		finish_wait(&wakeup_count_wait_queue, &wait);
 	}
-	finish_wait(&wakeup_count_wait_queue, &wait);
 
 	split_counters(&cnt, &inpr);
 	*count = cnt;

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 4/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress"
  2012-02-07  1:04 ` [PATCH 4/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress" Rafael J. Wysocki
@ 2012-02-08 23:10   ` NeilBrown
  2012-02-09  0:05     ` Rafael J. Wysocki
  2012-02-12  1:27   ` mark gross
  1 sibling, 1 reply; 129+ messages in thread
From: NeilBrown @ 2012-02-08 23:10 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Alan Stern

[-- Attachment #1: Type: text/plain, Size: 3382 bytes --]

On Tue, 7 Feb 2012 02:04:19 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> The current wakeup source deactivation code doesn't do anything when
> the counter of wakeup events in progress goes down to zero, which
> requires pm_get_wakeup_count() to poll that counter periodically.
> Although this reduces the average time it takes to deactivate a
> wakeup source, it also may lead to a substantial amount of unnecessary
> polling if there are extended periods of wakeup activity.  Thus it
> seems reasonable to use a wait queue for signaling the "no wakeup
> events in progress" condition and remove the polling.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
>  drivers/base/power/wakeup.c |   18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> Index: linux/drivers/base/power/wakeup.c
> ===================================================================
> --- linux.orig/drivers/base/power/wakeup.c
> +++ linux/drivers/base/power/wakeup.c
> @@ -17,8 +17,6 @@
>  
>  #include "power.h"
>  
> -#define TIMEOUT		100
> -
>  /*
>   * If set, the suspend/hibernate code will abort transitions to a sleep state
>   * if wakeup events are registered during or immediately before the transition.
> @@ -52,6 +50,8 @@ static void pm_wakeup_timer_fn(unsigned
>  
>  static LIST_HEAD(wakeup_sources);
>  
> +static DECLARE_WAIT_QUEUE_HEAD(wakeup_count_wait_queue);
> +
>  /**
>   * wakeup_source_create - Create a struct wakeup_source object.
>   * @name: Name of the new wakeup source.
> @@ -84,7 +84,7 @@ void wakeup_source_destroy(struct wakeup
>  	while (ws->active) {
>  		spin_unlock_irq(&ws->lock);
>  
> -		schedule_timeout_interruptible(msecs_to_jiffies(TIMEOUT));
> +		schedule_timeout_interruptible(msecs_to_jiffies(100));
>  
>  		spin_lock_irq(&ws->lock);
>  	}
> @@ -411,6 +411,7 @@ EXPORT_SYMBOL_GPL(pm_stay_awake);
>   */
>  static void wakeup_source_deactivate(struct wakeup_source *ws)
>  {
> +	unsigned int cnt, inpr;
>  	ktime_t duration;
>  	ktime_t now;
>  
> @@ -444,6 +445,10 @@ static void wakeup_source_deactivate(str
>  	 * couter of wakeup events in progress simultaneously.
>  	 */
>  	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
> +
> +	split_counters(&cnt, &inpr);
> +	if (!inpr)
> +		wake_up_all(&wakeup_count_wait_queue);
>  }

Would it be worth making this:

     if (!inpr && waitqueue_active(&wakeup_count_wait_queue))
		wake_up_all(&wakeup_count_wait_queue);

??
It would often save a spinlock.

Also was there a reason you used wake_up_all().  That is only really needed
were EXCLUSIVE waits are happening, and there aren't any of those.

Thanks,
NeilBrown


>  
>  /**
> @@ -624,14 +629,19 @@ bool pm_wakeup_pending(void)
>  bool pm_get_wakeup_count(unsigned int *count)
>  {
>  	unsigned int cnt, inpr;
> +	DEFINE_WAIT(wait);
>  
>  	for (;;) {
> +		prepare_to_wait(&wakeup_count_wait_queue, &wait,
> +				TASK_INTERRUPTIBLE);
>  		split_counters(&cnt, &inpr);
>  		if (inpr == 0 || signal_pending(current))
>  			break;
>  		pm_wakeup_update_hit_counts();
> -		schedule_timeout_interruptible(msecs_to_jiffies(TIMEOUT));
> +
> +		schedule();
>  	}
> +	finish_wait(&wakeup_count_wait_queue, &wait);
>  
>  	split_counters(&cnt, &inpr);
>  	*count = cnt;


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-07  1:00 [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
                   ` (8 preceding siblings ...)
  2012-02-07  1:13 ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
@ 2012-02-08 23:57 ` NeilBrown
  2012-02-10  0:44   ` Rafael J. Wysocki
  2012-02-12  1:54   ` mark gross
  2012-02-12  1:19 ` mark gross
                   ` (2 subsequent siblings)
  12 siblings, 2 replies; 129+ messages in thread
From: NeilBrown @ 2012-02-08 23:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Alan Stern

[-- Attachment #1: Type: text/plain, Size: 3533 bytes --]

On Tue, 7 Feb 2012 02:00:55 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:


> All in all, it's not as much code as I thought it would be and it seems to be
> relatively simple (which rises the question why the Android people didn't
> even _try_ to do something like this instead of slapping the "real" wakelocks
> onto the kernel FWIW).  IMHO it doesn't add anything really new to the kernel,
> except for the user space interfaces that should be maintainable.  At least I
> think I should be able to maintain them. :-)
> 
> All of the above has been tested very briefly on my test-bed Mackerel board
> and it quite obviously requires more thorough testing, but first I need to know
> if it makes sense to spend any more time on it.
> 
> IOW, I need to know your opinions!

I've got opinions!!!

I'll try to avoid the obvious bike-shedding about interface design...

The key point I want to make is that doing this in the kernel has one very
import difference to doing it in userspace (which, as you know, I prefer)
which may not be obvious to everyone at first sight.  So I will try to make it
apparent.

In the user-space solution that we have previously discussed, it is only
necessary for the kernel to hold a wakeup_source active until the event is
*visible* to user-space.  So a low level driver can queue e.g. an input event
and then deactivate their wakeup_source.  The event can remain in the input
queue without any wakeup_source being active and there is no risk of going to
sleep inappropriately.
This is because - in the user-space approach - user-space must effectively
poll every source of interesting wakeup events between the last wakeup_source
being deactivate and the next attempt to suspend.  This poll will notice the
event sitting in a queue so that a well-written user-space will not go to
sleep but will read the event.
(Note that this 'poll-of-every-device' need not be expensive.  It can be a
single 'poll' or 'select' or even 'read' on a pollfd).

In the kernel based approach that you have presented this is not the case.
As the kernel will initiate suspend the moment the last wakeup_source is
released (with no polling of other queues), there must be an unbroken chain of
wakeup_sources from the initial interrupt all the way up to the user.
In particular, any subsystem (such as 'input') must hold a wakeup_source
active as long as any designated 'wakeup event' is in any of its queues.
This means that the subsystem must be able to differentiate wakeup events
from non-wakeup events.
This might be easy (maybe "all events are wakeup events" or "all events on
this queue are wakeup events") but it is not obvious to me that that is the
case.

To summarise: for this solution to be effective it also requires that
 1/ every subsystem that carries wakeup events must know about wakeup_sources
    and must activate/deactivate them as events are queued/dequeued.
 2/ these subsystems must be able to differentiate between wakeup events and
    non-wakeup events, and this must be a configurable decision.

Currently, understanding wakeup events is restricted to:
 - drivers that are capable of configuring wakeup
 - user-space which cares about wakeup
The proposed solution adds:
 - intermediate subsystems which might queue wakeup events

I think that is a significant addition to make and not one to be made
lightly.  It might end up adding more code than you thought it would be :-)

Thanks for the opportunity to comment,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 4/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress"
  2012-02-08 23:10   ` NeilBrown
@ 2012-02-09  0:05     ` Rafael J. Wysocki
  0 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-09  0:05 UTC (permalink / raw)
  To: NeilBrown
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Alan Stern

On Thursday, February 09, 2012, NeilBrown wrote:
> On Tue, 7 Feb 2012 02:04:19 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > The current wakeup source deactivation code doesn't do anything when
> > the counter of wakeup events in progress goes down to zero, which
> > requires pm_get_wakeup_count() to poll that counter periodically.
> > Although this reduces the average time it takes to deactivate a
> > wakeup source, it also may lead to a substantial amount of unnecessary
> > polling if there are extended periods of wakeup activity.  Thus it
> > seems reasonable to use a wait queue for signaling the "no wakeup
> > events in progress" condition and remove the polling.
> > 
> > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > ---
> >  drivers/base/power/wakeup.c |   18 ++++++++++++++----
> >  1 file changed, 14 insertions(+), 4 deletions(-)
> > 
> > Index: linux/drivers/base/power/wakeup.c
> > ===================================================================
> > --- linux.orig/drivers/base/power/wakeup.c
> > +++ linux/drivers/base/power/wakeup.c
> > @@ -17,8 +17,6 @@
> >  
> >  #include "power.h"
> >  
> > -#define TIMEOUT		100
> > -
> >  /*
> >   * If set, the suspend/hibernate code will abort transitions to a sleep state
> >   * if wakeup events are registered during or immediately before the transition.
> > @@ -52,6 +50,8 @@ static void pm_wakeup_timer_fn(unsigned
> >  
> >  static LIST_HEAD(wakeup_sources);
> >  
> > +static DECLARE_WAIT_QUEUE_HEAD(wakeup_count_wait_queue);
> > +
> >  /**
> >   * wakeup_source_create - Create a struct wakeup_source object.
> >   * @name: Name of the new wakeup source.
> > @@ -84,7 +84,7 @@ void wakeup_source_destroy(struct wakeup
> >  	while (ws->active) {
> >  		spin_unlock_irq(&ws->lock);
> >  
> > -		schedule_timeout_interruptible(msecs_to_jiffies(TIMEOUT));
> > +		schedule_timeout_interruptible(msecs_to_jiffies(100));
> >  
> >  		spin_lock_irq(&ws->lock);
> >  	}
> > @@ -411,6 +411,7 @@ EXPORT_SYMBOL_GPL(pm_stay_awake);
> >   */
> >  static void wakeup_source_deactivate(struct wakeup_source *ws)
> >  {
> > +	unsigned int cnt, inpr;
> >  	ktime_t duration;
> >  	ktime_t now;
> >  
> > @@ -444,6 +445,10 @@ static void wakeup_source_deactivate(str
> >  	 * couter of wakeup events in progress simultaneously.
> >  	 */
> >  	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
> > +
> > +	split_counters(&cnt, &inpr);
> > +	if (!inpr)
> > +		wake_up_all(&wakeup_count_wait_queue);
> >  }
> 
> Would it be worth making this:
> 
>      if (!inpr && waitqueue_active(&wakeup_count_wait_queue))
> 		wake_up_all(&wakeup_count_wait_queue);
> 
> ??
> It would often save a spinlock.

Yes, good point. :-)

> Also was there a reason you used wake_up_all().  That is only really needed
> were EXCLUSIVE waits are happening, and there aren't any of those.

Right, I think wake_up() should be fine too.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-08 23:57 ` NeilBrown
@ 2012-02-10  0:44   ` Rafael J. Wysocki
  2012-02-12  2:05     ` mark gross
  2012-02-12  1:54   ` mark gross
  1 sibling, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-10  0:44 UTC (permalink / raw)
  To: NeilBrown
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Alan Stern

Hi,

On Thursday, February 09, 2012, NeilBrown wrote:
> On Tue, 7 Feb 2012 02:00:55 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> 
> > All in all, it's not as much code as I thought it would be and it seems to be
> > relatively simple (which rises the question why the Android people didn't
> > even _try_ to do something like this instead of slapping the "real" wakelocks
> > onto the kernel FWIW).  IMHO it doesn't add anything really new to the kernel,
> > except for the user space interfaces that should be maintainable.  At least I
> > think I should be able to maintain them. :-)
> > 
> > All of the above has been tested very briefly on my test-bed Mackerel board
> > and it quite obviously requires more thorough testing, but first I need to know
> > if it makes sense to spend any more time on it.
> > 
> > IOW, I need to know your opinions!
> 
> I've got opinions!!!

Good! :-)

It seems that no one else has.

> I'll try to avoid the obvious bike-shedding about interface design...
> 
> The key point I want to make is that doing this in the kernel has one very
> import difference to doing it in userspace (which, as you know, I prefer)
> which may not be obvious to everyone at first sight.  So I will try to make it
> apparent.
> 
> In the user-space solution that we have previously discussed, it is only
> necessary for the kernel to hold a wakeup_source active until the event is
> *visible* to user-space.  So a low level driver can queue e.g. an input event
> and then deactivate their wakeup_source.  The event can remain in the input
> queue without any wakeup_source being active and there is no risk of going to
> sleep inappropriately.
> This is because - in the user-space approach - user-space must effectively
> poll every source of interesting wakeup events between the last wakeup_source
> being deactivate and the next attempt to suspend.  This poll will notice the
> event sitting in a queue so that a well-written user-space will not go to
> sleep but will read the event.
> (Note that this 'poll-of-every-device' need not be expensive.  It can be a
> single 'poll' or 'select' or even 'read' on a pollfd).

So I see one little problem with that, which is that you'd need to teach user
space developers what to do an how to do that correctly.

Also, when you say "user space", it isn't exactly clear whether you mean a
power manager (that would carry out the attmepts to suspend) or applications
(that would need to communicate with the power manager to let it know what
they are doing).  This is important, because in general, before deactivating
a wakeup source the kernel subsystem should know that the associated event
has become visible not only to the "polling" application, but also (perhaps
indirectly) to the power manager, so that it doesn't trigger suspend too
early.

> In the kernel based approach that you have presented this is not the case.
> As the kernel will initiate suspend the moment the last wakeup_source is
> released (with no polling of other queues), there must be an unbroken chain of
> wakeup_sources from the initial interrupt all the way up to the user.
> In particular, any subsystem (such as 'input') must hold a wakeup_source
> active as long as any designated 'wakeup event' is in any of its queues.
> This means that the subsystem must be able to differentiate wakeup events
> from non-wakeup events.
> This might be easy (maybe "all events are wakeup events" or "all events on
> this queue are wakeup events") but it is not obvious to me that that is the
> case.
> 
> To summarise: for this solution to be effective it also requires that
>  1/ every subsystem that carries wakeup events must know about wakeup_sources
>     and must activate/deactivate them as events are queued/dequeued.
>  2/ these subsystems must be able to differentiate between wakeup events and
>     non-wakeup events, and this must be a configurable decision.
> 
> Currently, understanding wakeup events is restricted to:
>  - drivers that are capable of configuring wakeup
>  - user-space which cares about wakeup
> The proposed solution adds:
>  - intermediate subsystems which might queue wakeup events
> 
> I think that is a significant addition to make and not one to be made
> lightly.  It might end up adding more code than you thought it would be :-)

I'm aware of that and I expect people to come up with patches adding the
handling of wakeup events to a number of subsystems (this is kind of needed
regardless of autosleep if we want to be sure that user space has actually
consumed events we want it to take from us before suspending).  However,
I'm not expecting that to be a lot of code (I think we both can only speculate
about that at this point) and those subsystems have maintainers and the
decision whether or not to take that code is theirs.

That may be a long process, but at least we can see from Android what's
needed and where.

Still, the point here is to give people something to start with so that they
can take the Android user space, test it against the mainline and see what
doesn't work and why and come up with fixes.  Perhaps they will have better
ideas than we think right now, but surely nothing more is going to happen
without this starting point.

I'd like us and Android to use the same low-level data structures for power
management and the same API eventually, at least for drivers.  This is not
the case at the moment and it's actively hurting us as a project quite a bit.
If Android needs to add patches on top of whatever we have to get the desired
functionality, I'm fine with that, as long as they don't require drivers to use
APIs that are incompatible with the mainline.  Insisting that Android should
use a user-space-based autosleep implementation wouldn't help at all, because
realistically this isn't going to happen.

> Thanks for the opportunity to comment,

No need to thank for that, it's Open Source after all ...

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-07  1:00 [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
                   ` (9 preceding siblings ...)
  2012-02-08 23:57 ` NeilBrown
@ 2012-02-12  1:19 ` mark gross
  2012-02-14  2:07 ` Arve Hjønnevåg
  2012-02-21 23:31 ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 Rafael J. Wysocki
  12 siblings, 0 replies; 129+ messages in thread
From: mark gross @ 2012-02-12  1:19 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

On Tue, Feb 07, 2012 at 02:00:55AM +0100, Rafael J. Wysocki wrote:
> Hi all,
> 
> This series tests the theory that the easiest way to sell a once rejected
> feature is to advertise it under a different name.
> 
> Well, there actually are two different features, although they are closely
> related to each other.  First, patch [6/8] introduces a feature that allows
> the kernel to trigger system suspend (or more generally a transition into
> a sleep state) whenever there are no active wakeup sources (no, they aren't
> called wakelocks).  It is called "autosleep" here, but it was called a few
> different names in the past ("opportunistic suspend" was probably the most
> popular one).  Second, patch [8/8] introduces "wake locks" that are,
> essentially, wakeup sources which may be created and manipulated by user
> space.  Using them user space may control the autosleep feature introduced
> earlier.
> 
> This also is a kind of a proof of concept for the people who wanted me to
> show a kernel-based implementation of automatic suspend, so there you go.
> Please note, however, that it is done so that the user space "wake locks"
> interface is compatible with Android in support of its user space.  I don't
> really like this interface, but since the Android's user space seems to rely
> on it, I'm fine with using it as is.  YMMV.
> 
> Let me say a few words about every patch in the series individually.
> 
> [1/8] - This really is a bug fix, so it's v3.4 material.  Nobody has stepped
>   on this bug so far, but it should be fixed anyway.
> 
> [2/8] - This is a freezer cleanup, worth doing anyway IMO, so v3.4 material too.
> 
> [3/8] - This is something we can do no problem, although completely optional
>   without the autosleep feature.  Rather necessary with it, though.
> 
> [4/8] - This kind of reintroduces my original idea of using a wait queue for
>   waiting until there are no wakeup events in progress.  Alan convinced me that
>   it would be better to poll the counter to prevent wakeup_source_deactivate()
>   from having to call wake_up_all() occasionally (that may be costly in fast
>   paths), but then quite some people told me that the wait queue migh be
>   better.  I think that the polling will make much less sense with autosleep
>   and user space "wake locks".  Anyway, [4/8] is something we can do without
>   those things too.
> 
> The patches above were given Sign-off-by tags, because I think they make some
> sense regardless of the features introcuded by the remaining patches that in
> turn are total RFC.
> 
> [5/8] - This changes wakeup source statistics so that they are more similar to
>   the statistics collected for wakelocks on Android.  The file those statistics
>   may be read from is still located in debugfs, though (I don't think it
>   belongs to proc and its name is different from the analogous Android's file
>   name anyway).  It could be done without autosleep, but then it would be a bit
>   pointless.  BTW, this changes interfaces that _in_ _theory_ may be used by
>   someone, but I'm not aware of anyone using them.  If you are one, I'll be
>   pleased to learn about that, so please tell me who you are. :-)
> 
> [6/8] - Autosleep implementation.  I think the changelog explains the idea
>   quite well and the code is really nothing special.  It doesn't really add
>   anything new to the kernel in terms of infrastructure etc., it just uses
>   the existing stuff to implement an alternative method of triggering system
>   sleep transitions.  Note, though, that the interface here is different
>   from the Android's one, because Android actually modifies /sys/power/state
>   to trigger something called "early suspend" (that is never going to be
>   implemented in the "stock" kernel as long as I have any influence on it) and
>   we simply can't do that in the mainline.
dude early suspend is the hallmark of enlightend coding for implementing
a kernel / user mode handshake to user mode when the display is turned
off.  How can you not like that shit?

> 
> [7/8] - This adds a wakeup source statistics that only makes sense with
>   autosleep and (I believe) is analogous to the Android's prevent_suspend_time
>   statistics.  Nothing really special, but I didn't want
>   wakeup_source_activate/deactivate() to take a common lock to avoid
>   congestion.
> 
> [8/8] - This adds a user space interface to create, activate and deactivate
>   wakeup sources.  Since the files it consists of are called wake_lock and
>   wake_unlock, to follow Android, the objects the wakeup sources are wrapped
>   into are called "wakelocks" (for added confusion).  Since the interface
>   doesn't provide any means to destroy those "wakelocks", I added a garbage
>   collection mechanism to get rid of the unused ones, if any.  I also tought
>   it might be a good idea to put a limit on the number of those things that
>   user space can operate simultaneously, so I did that too.
> 
> All in all, it's not as much code as I thought it would be and it seems to be
> relatively simple (which rises the question why the Android people didn't
> even _try_ to do something like this instead of slapping the "real" wakelocks
> onto the kernel FWIW).  IMHO it doesn't add anything really new to the kernel,
> except for the user space interfaces that should be maintainable.  At least I
> think I should be able to maintain them. :-)
> 
> All of the above has been tested very briefly on my test-bed Mackerel board
> and it quite obviously requires more thorough testing, but first I need to know
> if it makes sense to spend any more time on it.
> 
> IOW, I need to know your opinions!
my opinion is "sigh".

FWIW we need to bring Android wakelocks into the main line so we can fix
them WRT wake event notification handling.  But, I'll have to take a
look at the patches to see if I still have heart burn over the race
between wake sources and wake lock dropping in kernel mode.

/me goes and looks now....

--mark

> 
> Thanks,
> Rafael
> 

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 4/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress"
  2012-02-07  1:04 ` [PATCH 4/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress" Rafael J. Wysocki
  2012-02-08 23:10   ` NeilBrown
@ 2012-02-12  1:27   ` mark gross
  1 sibling, 0 replies; 129+ messages in thread
From: mark gross @ 2012-02-12  1:27 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

On Tue, Feb 07, 2012 at 02:04:19AM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> The current wakeup source deactivation code doesn't do anything when
> the counter of wakeup events in progress goes down to zero, which
> requires pm_get_wakeup_count() to poll that counter periodically.
> Although this reduces the average time it takes to deactivate a
> wakeup source, it also may lead to a substantial amount of unnecessary
> polling if there are extended periods of wakeup activity.  Thus it
> seems reasonable to use a wait queue for signaling the "no wakeup
> events in progress" condition and remove the polling.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
>  drivers/base/power/wakeup.c |   18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> Index: linux/drivers/base/power/wakeup.c
> ===================================================================
> --- linux.orig/drivers/base/power/wakeup.c
> +++ linux/drivers/base/power/wakeup.c
> @@ -17,8 +17,6 @@
>  
>  #include "power.h"
>  
> -#define TIMEOUT		100
> -
>  /*
>   * If set, the suspend/hibernate code will abort transitions to a sleep state
>   * if wakeup events are registered during or immediately before the transition.
> @@ -52,6 +50,8 @@ static void pm_wakeup_timer_fn(unsigned
>  
>  static LIST_HEAD(wakeup_sources);
>  
> +static DECLARE_WAIT_QUEUE_HEAD(wakeup_count_wait_queue);
> +
>  /**
>   * wakeup_source_create - Create a struct wakeup_source object.
>   * @name: Name of the new wakeup source.
> @@ -84,7 +84,7 @@ void wakeup_source_destroy(struct wakeup
>  	while (ws->active) {
>  		spin_unlock_irq(&ws->lock);
>  
> -		schedule_timeout_interruptible(msecs_to_jiffies(TIMEOUT));
> +		schedule_timeout_interruptible(msecs_to_jiffies(100));
Nit/ style comment: how is replacing a TIMEOUT macro with a magic number
an improvement.  (maybe timeout is a un-helpful name but 100 isn't any
better. )
>  
>  		spin_lock_irq(&ws->lock);
>  	}
> @@ -411,6 +411,7 @@ EXPORT_SYMBOL_GPL(pm_stay_awake);
>   */
>  static void wakeup_source_deactivate(struct wakeup_source *ws)
>  {
> +	unsigned int cnt, inpr;
>  	ktime_t duration;
>  	ktime_t now;
>  
> @@ -444,6 +445,10 @@ static void wakeup_source_deactivate(str
>  	 * couter of wakeup events in progress simultaneously.
>  	 */
>  	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
> +
> +	split_counters(&cnt, &inpr);
> +	if (!inpr)
> +		wake_up_all(&wakeup_count_wait_queue);
>  }
>  
>  /**
> @@ -624,14 +629,19 @@ bool pm_wakeup_pending(void)
>  bool pm_get_wakeup_count(unsigned int *count)
>  {
>  	unsigned int cnt, inpr;
> +	DEFINE_WAIT(wait);
>  
>  	for (;;) {
> +		prepare_to_wait(&wakeup_count_wait_queue, &wait,
> +				TASK_INTERRUPTIBLE);
>  		split_counters(&cnt, &inpr);
>  		if (inpr == 0 || signal_pending(current))
>  			break;
>  		pm_wakeup_update_hit_counts();
> -		schedule_timeout_interruptible(msecs_to_jiffies(TIMEOUT));
> +
> +		schedule();
>  	}
> +	finish_wait(&wakeup_count_wait_queue, &wait);
>  
>  	split_counters(&cnt, &inpr);
>  	*count = cnt;
> 

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-08 23:57 ` NeilBrown
  2012-02-10  0:44   ` Rafael J. Wysocki
@ 2012-02-12  1:54   ` mark gross
  1 sibling, 0 replies; 129+ messages in thread
From: mark gross @ 2012-02-12  1:54 UTC (permalink / raw)
  To: NeilBrown
  Cc: Rafael J. Wysocki, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg, John Stultz,
	Brian Swetland, Alan Stern

On Thu, Feb 09, 2012 at 10:57:36AM +1100, NeilBrown wrote:
> On Tue, 7 Feb 2012 02:00:55 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> 
> > All in all, it's not as much code as I thought it would be and it seems to be
> > relatively simple (which rises the question why the Android people didn't
> > even _try_ to do something like this instead of slapping the "real" wakelocks
> > onto the kernel FWIW).  IMHO it doesn't add anything really new to the kernel,
> > except for the user space interfaces that should be maintainable.  At least I
> > think I should be able to maintain them. :-)
> > 
> > All of the above has been tested very briefly on my test-bed Mackerel board
> > and it quite obviously requires more thorough testing, but first I need to know
> > if it makes sense to spend any more time on it.
> > 
> > IOW, I need to know your opinions!
> 
> I've got opinions!!!
> 
> I'll try to avoid the obvious bike-shedding about interface design...
> 
> The key point I want to make is that doing this in the kernel has one very
> import difference to doing it in userspace (which, as you know, I prefer)
> which may not be obvious to everyone at first sight.  So I will try to make it
> apparent.
> 
> In the user-space solution that we have previously discussed, it is only
> necessary for the kernel to hold a wakeup_source active until the event is
> *visible* to user-space.  So a low level driver can queue e.g. an input event
> and then deactivate their wakeup_source.  The event can remain in the input
> queue without any wakeup_source being active and there is no risk of going to
> sleep inappropriately.
> This is because - in the user-space approach - user-space must effectively
> poll every source of interesting wakeup events between the last wakeup_source
> being deactivate and the next attempt to suspend.  This poll will notice the
> event sitting in a queue so that a well-written user-space will not go to
> sleep but will read the event.
<sarcasm>
its on running on 100's of millions of devices today... It must be well
written.  Right?
</sarcasm>

> single 'poll' or 'select' or even 'read' on a pollfd).
> 
> In the kernel based approach that you have presented this is not the case.
> As the kernel will initiate suspend the moment the last wakeup_source is
> released (with no polling of other queues), there must be an unbroken chain of
> wakeup_sources from the initial interrupt all the way up to the user.
> In particular, any subsystem (such as 'input') must hold a wakeup_source
> active as long as any designated 'wakeup event' is in any of its queues.
> This means that the subsystem must be able to differentiate wakeup events
> from non-wakeup events.
> This might be easy (maybe "all events are wakeup events" or "all events on
> this queue are wakeup events") but it is not obvious to me that that is the
> case.
>
And this brings us to a wake acknowledgement of wake events from user
mode before re-suspending type of design.


> To summarise: for this solution to be effective it also requires that
>  1/ every subsystem that carries wakeup events must know about wakeup_sources
>     and must activate/deactivate them as events are queued/dequeued.
>  2/ these subsystems must be able to differentiate between wakeup events and
>     non-wakeup events, and this must be a configurable decision.
> 
> Currently, understanding wakeup events is restricted to:
>  - drivers that are capable of configuring wakeup
>  - user-space which cares about wakeup
> The proposed solution adds:
>  - intermediate subsystems which might queue wakeup events
> 
> I think that is a significant addition to make and not one to be made
> lightly.  It might end up adding more code than you thought it would be :-)
you mean wake lock-itis sprinkling time out wake locks all over the
place?

--mark

> Thanks for the opportunity to comment,
> NeilBrown



^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-10  0:44   ` Rafael J. Wysocki
@ 2012-02-12  2:05     ` mark gross
  2012-02-12 21:32       ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: mark gross @ 2012-02-12  2:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: NeilBrown, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg, John Stultz,
	Brian Swetland, Alan Stern

On Fri, Feb 10, 2012 at 01:44:10AM +0100, Rafael J. Wysocki wrote:
> Hi,
> 
> On Thursday, February 09, 2012, NeilBrown wrote:
> > On Tue, 7 Feb 2012 02:00:55 +0100 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > 
> > 
> > > All in all, it's not as much code as I thought it would be and it seems to be
> > > relatively simple (which rises the question why the Android people didn't
> > > even _try_ to do something like this instead of slapping the "real" wakelocks
> > > onto the kernel FWIW).  IMHO it doesn't add anything really new to the kernel,
> > > except for the user space interfaces that should be maintainable.  At least I
> > > think I should be able to maintain them. :-)
> > > 
> > > All of the above has been tested very briefly on my test-bed Mackerel board
> > > and it quite obviously requires more thorough testing, but first I need to know
> > > if it makes sense to spend any more time on it.
> > > 
> > > IOW, I need to know your opinions!
> > 
> > I've got opinions!!!
> 
> Good! :-)
> 
> It seems that no one else has.
I'm sorry I've been really bad this last year about my email latency.

> > I'll try to avoid the obvious bike-shedding about interface design...
> > 
> > The key point I want to make is that doing this in the kernel has one very
> > import difference to doing it in userspace (which, as you know, I prefer)
> > which may not be obvious to everyone at first sight.  So I will try to make it
> > apparent.
> > 
> > In the user-space solution that we have previously discussed, it is only
> > necessary for the kernel to hold a wakeup_source active until the event is
> > *visible* to user-space.  So a low level driver can queue e.g. an input event
> > and then deactivate their wakeup_source.  The event can remain in the input
> > queue without any wakeup_source being active and there is no risk of going to
> > sleep inappropriately.
> > This is because - in the user-space approach - user-space must effectively
> > poll every source of interesting wakeup events between the last wakeup_source
> > being deactivate and the next attempt to suspend.  This poll will notice the
> > event sitting in a queue so that a well-written user-space will not go to
> > sleep but will read the event.
> > (Note that this 'poll-of-every-device' need not be expensive.  It can be a
> > single 'poll' or 'select' or even 'read' on a pollfd).
> 
> So I see one little problem with that, which is that you'd need to teach user
> space developers what to do an how to do that correctly.
> 
> Also, when you say "user space", it isn't exactly clear whether you mean a
> power manager (that would carry out the attmepts to suspend) or applications
> (that would need to communicate with the power manager to let it know what
> they are doing).  This is important, because in general, before deactivating
> a wakeup source the kernel subsystem should know that the associated event
> has become visible not only to the "polling" application, but also (perhaps
> indirectly) to the power manager, so that it doesn't trigger suspend too
> early.

yup, an explicit user mode acknowledgment of the wake event would be
appropriate.

> > In the kernel based approach that you have presented this is not the case.
> > As the kernel will initiate suspend the moment the last wakeup_source is
> > released (with no polling of other queues), there must be an unbroken chain of
> > wakeup_sources from the initial interrupt all the way up to the user.
> > In particular, any subsystem (such as 'input') must hold a wakeup_source
> > active as long as any designated 'wakeup event' is in any of its queues.
> > This means that the subsystem must be able to differentiate wakeup events
> > from non-wakeup events.
> > This might be easy (maybe "all events are wakeup events" or "all events on
> > this queue are wakeup events") but it is not obvious to me that that is the
> > case.
> > 
> > To summarise: for this solution to be effective it also requires that
> >  1/ every subsystem that carries wakeup events must know about wakeup_sources
> >     and must activate/deactivate them as events are queued/dequeued.
> >  2/ these subsystems must be able to differentiate between wakeup events and
> >     non-wakeup events, and this must be a configurable decision.
> > 
> > Currently, understanding wakeup events is restricted to:
> >  - drivers that are capable of configuring wakeup
> >  - user-space which cares about wakeup
> > The proposed solution adds:
> >  - intermediate subsystems which might queue wakeup events
> > 
> > I think that is a significant addition to make and not one to be made
> > lightly.  It might end up adding more code than you thought it would be :-)
> 
> I'm aware of that and I expect people to come up with patches adding the
> handling of wakeup events to a number of subsystems (this is kind of needed
> regardless of autosleep if we want to be sure that user space has actually
> consumed events we want it to take from us before suspending).  However,
> I'm not expecting that to be a lot of code (I think we both can only speculate
> about that at this point) and those subsystems have maintainers and the
> decision whether or not to take that code is theirs.
> 
> That may be a long process, but at least we can see from Android what's
> needed and where.
> 
> Still, the point here is to give people something to start with so that they
> can take the Android user space, test it against the mainline and see what
> doesn't work and why and come up with fixes.  Perhaps they will have better
> ideas than we think right now, but surely nothing more is going to happen
> without this starting point.
> 
> I'd like us and Android to use the same low-level data structures for power
> management and the same API eventually, at least for drivers.  This is not
> the case at the moment and it's actively hurting us as a project quite a bit.
> If Android needs to add patches on top of whatever we have to get the desired
> functionality, I'm fine with that, as long as they don't require drivers to use
> APIs that are incompatible with the mainline.  Insisting that Android should
> use a user-space-based autosleep implementation wouldn't help at all, because
> realistically this isn't going to happen.

why not?  I don't think having the PMS explicitly acknowledge a wake
event is a big ask at all.

--mark

> > Thanks for the opportunity to comment,
> 
> No need to thank for that, it's Open Source after all ...
> 
> Thanks,
> Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-12  2:05     ` mark gross
@ 2012-02-12 21:32       ` Rafael J. Wysocki
  2012-02-14  0:11         ` Arve Hjønnevåg
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-12 21:32 UTC (permalink / raw)
  To: markgross
  Cc: NeilBrown, Linux PM list, LKML, Magnus Damm, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Alan Stern

On Sunday, February 12, 2012, mark gross wrote:
> On Fri, Feb 10, 2012 at 01:44:10AM +0100, Rafael J. Wysocki wrote:
[...]
> > I'd like us and Android to use the same low-level data structures for power
> > management and the same API eventually, at least for drivers.  This is not
> > the case at the moment and it's actively hurting us as a project quite a bit.
> > If Android needs to add patches on top of whatever we have to get the desired
> > functionality, I'm fine with that, as long as they don't require drivers to use
> > APIs that are incompatible with the mainline.  Insisting that Android should
> > use a user-space-based autosleep implementation wouldn't help at all, because
> > realistically this isn't going to happen.
> 
> why not?  I don't think having the PMS explicitly acknowledge a wake
> event is a big ask at all.

I'd like to hear what the Android people think about that, but somehow it seems
to me they won't like it. :-)

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-12 21:32       ` Rafael J. Wysocki
@ 2012-02-14  0:11         ` Arve Hjønnevåg
  2012-02-15 15:28           ` mark gross
  0 siblings, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-02-14  0:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: markgross, NeilBrown, Linux PM list, LKML, Magnus Damm,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern

On Sun, Feb 12, 2012 at 1:32 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Sunday, February 12, 2012, mark gross wrote:
>> On Fri, Feb 10, 2012 at 01:44:10AM +0100, Rafael J. Wysocki wrote:
> [...]
>> > I'd like us and Android to use the same low-level data structures for power
>> > management and the same API eventually, at least for drivers.  This is not
>> > the case at the moment and it's actively hurting us as a project quite a bit.
>> > If Android needs to add patches on top of whatever we have to get the desired
>> > functionality, I'm fine with that, as long as they don't require drivers to use
>> > APIs that are incompatible with the mainline.  Insisting that Android should
>> > use a user-space-based autosleep implementation wouldn't help at all, because
>> > realistically this isn't going to happen.
>>
>> why not?  I don't think having the PMS explicitly acknowledge a wake
>> event is a big ask at all.
>
> I'd like to hear what the Android people think about that, but somehow it seems
> to me they won't like it. :-)
>

Correct.

The android power manager service does not handle wake events and
therefore does not know when it is safe to acknowledge a wake event
(assuming this acknowledgement re-triggers suspend). Other components
handle the event and only notify the power manager if the event should
change a state (e.g. turn the screen on). Some wake events, like the
alarm used for battery monitoring, don't signal user space at all if
the user visible state did not change. Other wake events are processed
by lower level user-space services than the system-server where the
power manager runs.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-07  1:00 [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
                   ` (10 preceding siblings ...)
  2012-02-12  1:19 ` mark gross
@ 2012-02-14  2:07 ` Arve Hjønnevåg
  2012-02-14 23:22   ` Rafael J. Wysocki
  2012-02-21 23:31 ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 Rafael J. Wysocki
  12 siblings, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-02-14  2:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, John Stultz, Brian Swetland, Neil Brown, Alan Stern

On Mon, Feb 6, 2012 at 5:00 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
...
> All in all, it's not as much code as I thought it would be and it seems to be
> relatively simple (which rises the question why the Android people didn't
> even _try_ to do something like this instead of slapping the "real" wakelocks
> onto the kernel FWIW).  IMHO it doesn't add anything really new to the kernel,
> except for the user space interfaces that should be maintainable.  At least I
> think I should be able to maintain them. :-)
>

Replacing a working solution with an untested one takes time. That
said, I have recently tried replacing all our kernel wake-locks with a
thin wrapper around wake-sources. This appears to mostly work, but the
wake-source timeout feature has some bugs or incompatible apis. An
init api would also be useful for embedding wake-sources in other data
structures without adding another memory allocation. Your patch to
move the spinlock init to wakeup_source_add still require the struct
to be zero initialized and the name set manually.

I needed to use two wake-sources per wake-lock since calling
__pm_stay_awake after __pm_wakeup_event on a wake-source does not
cancel the timeout. Unless there is a reason to keep this behavior I
would like __pm_stay_awake to cancel any active timeout.

Destroying a wake-source also has some problems. If you call
wakeup_source_destroy it will spin forever if the wake-source is
active without a timeout. And, if you call __pm_relax then
wakeup_source_destroy it could free the wake-source memory while the
timer function is still running. It also looks as if the wake_source
can be immediately deactivated if you call __pm_wakeup_event at the
same time as the previous timeout expired.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-14  2:07 ` Arve Hjønnevåg
@ 2012-02-14 23:22   ` Rafael J. Wysocki
  2012-02-15  5:57     ` Arve Hjønnevåg
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-14 23:22 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, John Stultz, Brian Swetland, Neil Brown, Alan Stern

On Tuesday, February 14, 2012, Arve Hjønnevåg wrote:
> On Mon, Feb 6, 2012 at 5:00 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> ...
> > All in all, it's not as much code as I thought it would be and it seems to be
> > relatively simple (which rises the question why the Android people didn't
> > even _try_ to do something like this instead of slapping the "real" wakelocks
> > onto the kernel FWIW).  IMHO it doesn't add anything really new to the kernel,
> > except for the user space interfaces that should be maintainable.  At least I
> > think I should be able to maintain them. :-)
> >
> 
> Replacing a working solution with an untested one takes time.

Sure, that's pretty obvious. :-)

> That said, I have recently tried replacing all our kernel wake-locks with a
> thin wrapper around wake-sources. This appears to mostly work,

Good!

> but the wake-source timeout feature has some bugs or incompatible apis. An
> init api would also be useful for embedding wake-sources in other data
> structures without adding another memory allocation. Your patch to
> move the spinlock init to wakeup_source_add still require the struct
> to be zero initialized and the name set manually.

That should be easy to fix.  What about the appended patch?

> I needed to use two wake-sources per wake-lock since calling
> __pm_stay_awake after __pm_wakeup_event on a wake-source does not
> cancel the timeout. Unless there is a reason to keep this behavior I
> would like __pm_stay_awake to cancel any active timeout.

That actually is a bug.  At least it's not consistent with
__pm_wakeup_event() that will replace the existing timeout with a new
one.

I'll post a patch to fix that in the next couple of days, stay tuned. :-)

> Destroying a wake-source also has some problems. If you call
> wakeup_source_destroy it will spin forever if the wake-source is
> active without a timeout. And, if you call __pm_relax then
> wakeup_source_destroy it could free the wake-source memory while the
> timer function is still running.

This also is a bug that needs fixing anyway.

> It also looks as if the wake_source can be immediately deactivated if
> you call __pm_wakeup_event at the same time as the previous timeout expired.

Yes, there is a race window if the timer function has already started.
It looks like I wanted to make it too simple. :-)  Will fix.

Thanks,
Rafael


Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/power/wakeup.c |   44 +++++++++++++++++++++++++++++++++++++-------
 include/linux/pm_wakeup.h   |    9 +++++++++
 2 files changed, 46 insertions(+), 7 deletions(-)

Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -53,6 +53,28 @@ static void pm_wakeup_timer_fn(unsigned
 static LIST_HEAD(wakeup_sources);
 
 /**
+ * wakeup_source_init - Initialize a struct wakeup_source object.
+ * @ws: Wakeup source to initialize.
+ * @name: Name of the new wakeup source.
+ */
+int wakeup_source_init(struct wakeup_source *ws, const char *name)
+{
+	int ret = 0;
+
+	if (!ws)
+		return -EINVAL;
+
+	memset(ws, 0, sizeof(*ws));
+	if (name) {
+		ws->name = kstrdup(name, GFP_KERNEL);
+		if (!ws->name)
+			ret = -ENOMEM;
+	}
+	return ret;
+}
+EXPORT_SYMBOL_GPL(wakeup_source_init);
+
+/**
  * wakeup_source_create - Create a struct wakeup_source object.
  * @name: Name of the new wakeup source.
  */
@@ -60,22 +82,20 @@ struct wakeup_source *wakeup_source_crea
 {
 	struct wakeup_source *ws;
 
-	ws = kzalloc(sizeof(*ws), GFP_KERNEL);
+	ws = kmalloc(sizeof(*ws), GFP_KERNEL);
 	if (!ws)
 		return NULL;
 
-	if (name)
-		ws->name = kstrdup(name, GFP_KERNEL);
-
+	wakeup_source_init(ws, name);
 	return ws;
 }
 EXPORT_SYMBOL_GPL(wakeup_source_create);
 
 /**
- * wakeup_source_destroy - Destroy a struct wakeup_source object.
- * @ws: Wakeup source to destroy.
+ * wakeup_source_drop - Prepare a struct wakeup_source object for destruction.
+ * @ws: Wakeup source to prepare for destruction.
  */
-void wakeup_source_destroy(struct wakeup_source *ws)
+void wakeup_source_drop(struct wakeup_source *ws)
 {
 	if (!ws)
 		return;
@@ -91,6 +111,16 @@ void wakeup_source_destroy(struct wakeup
 	spin_unlock_irq(&ws->lock);
 
 	kfree(ws->name);
+}
+EXPORT_SYMBOL_GPL(wakeup_source_drop);
+
+/**
+ * wakeup_source_destroy - Destroy a struct wakeup_source object.
+ * @ws: Wakeup source to destroy.
+ */
+void wakeup_source_destroy(struct wakeup_source *ws)
+{
+	wakeup_source_drop(ws);
 	kfree(ws);
 }
 EXPORT_SYMBOL_GPL(wakeup_source_destroy);
Index: linux/include/linux/pm_wakeup.h
===================================================================
--- linux.orig/include/linux/pm_wakeup.h
+++ linux/include/linux/pm_wakeup.h
@@ -73,7 +73,9 @@ static inline bool device_may_wakeup(str
 }
 
 /* drivers/base/power/wakeup.c */
+extern int wakeup_source_init(struct wakeup_source *ws, const char *name);
 extern struct wakeup_source *wakeup_source_create(const char *name);
+extern void wakeup_source_drop(struct wakeup_source *ws);
 extern void wakeup_source_destroy(struct wakeup_source *ws);
 extern void wakeup_source_add(struct wakeup_source *ws);
 extern void wakeup_source_remove(struct wakeup_source *ws);
@@ -103,11 +105,18 @@ static inline bool device_can_wakeup(str
 	return dev->power.can_wakeup;
 }
 
+static inline int wakeup_source_init(struct wakeup_source *ws, const char *name)
+{
+	return -ENOSYS;
+}
+
 static inline struct wakeup_source *wakeup_source_create(const char *name)
 {
 	return NULL;
 }
 
+static inline void wakeup_source_drop(struct wakeup_source *ws) {}
+
 static inline void wakeup_source_destroy(struct wakeup_source *ws) {}
 
 static inline void wakeup_source_add(struct wakeup_source *ws) {}


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-14 23:22   ` Rafael J. Wysocki
@ 2012-02-15  5:57     ` Arve Hjønnevåg
  2012-02-15 23:07       ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-02-15  5:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, John Stultz, Brian Swetland, Neil Brown, Alan Stern

2012/2/14 Rafael J. Wysocki <rjw@sisk.pl>:
> On Tuesday, February 14, 2012, Arve Hjønnevåg wrote:
>> On Mon, Feb 6, 2012 at 5:00 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> ...
>> but the wake-source timeout feature has some bugs or incompatible apis. An
>> init api would also be useful for embedding wake-sources in other data
>> structures without adding another memory allocation. Your patch to
>> move the spinlock init to wakeup_source_add still require the struct
>> to be zero initialized and the name set manually.
>
> That should be easy to fix.  What about the appended patch?
>

That works, but I still have to call more than one function before I
can use the wakeup-source (wakeup_source_init and wakeup_source_add)
and more than one function before I can free it (__pm_relax,
wakeup_source_remove and wakeup_source_drop). Is there any reason to
keep these separate?

Also, not copying the name when the caller provides the memory for the
wakeup-source would be a closer match to the wakelock api. Most of our
wakelocks pass a string constant as the name, and making a copy of
that string is not useful. wake_lock_init is also safe to call from
atomic context, but I don't know if anyone relies on this.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 5/8] PM / Sleep: Change wakeup statistics
  2012-02-07  1:05 ` [RFC][PATCH 5/8] PM / Sleep: Change wakeup statistics Rafael J. Wysocki
@ 2012-02-15  6:15   ` Arve Hjønnevåg
  2012-02-15 22:37     ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-02-15  6:15 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, John Stultz, Brian Swetland, Neil Brown, Alan Stern

On Mon, Feb 6, 2012 at 5:05 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
>
> Wakeup statistics used by Android are slightly different from what we
> have at the moment, so modify them to follow Android more closely.
...
> @@ -438,6 +444,11 @@ static void wakeup_source_deactivate(str
>        if (ktime_to_ns(duration) > ktime_to_ns(ws->max_time))
>                ws->max_time = duration;
>
> +       ws->last_time = now;
> +       if (ws->has_timeout && time_after(jiffies, ws->timer_expires))

time_after_eq may work better (or increment the count from the timer).
I applied this patch and the expire counts I see for wakeup-sources
that always time-out do not match the active count.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-14  0:11         ` Arve Hjønnevåg
@ 2012-02-15 15:28           ` mark gross
  0 siblings, 0 replies; 129+ messages in thread
From: mark gross @ 2012-02-15 15:28 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Rafael J. Wysocki, markgross, NeilBrown, Linux PM list, LKML,
	Magnus Damm, Matthew Garrett, Greg KH, John Stultz,
	Brian Swetland, Alan Stern

On Mon, Feb 13, 2012 at 04:11:24PM -0800, Arve Hjønnevåg wrote:
> On Sun, Feb 12, 2012 at 1:32 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Sunday, February 12, 2012, mark gross wrote:
> >> On Fri, Feb 10, 2012 at 01:44:10AM +0100, Rafael J. Wysocki wrote:
> > [...]
> >> > I'd like us and Android to use the same low-level data structures for power
> >> > management and the same API eventually, at least for drivers.  This is not
> >> > the case at the moment and it's actively hurting us as a project quite a bit.
> >> > If Android needs to add patches on top of whatever we have to get the desired
> >> > functionality, I'm fine with that, as long as they don't require drivers to use
> >> > APIs that are incompatible with the mainline.  Insisting that Android should
> >> > use a user-space-based autosleep implementation wouldn't help at all, because
> >> > realistically this isn't going to happen.
> >>
> >> why not?  I don't think having the PMS explicitly acknowledge a wake
> >> event is a big ask at all.
> >
> > I'd like to hear what the Android people think about that, but somehow it seems
> > to me they won't like it. :-)
> >
> 
> Correct.
> 
> The android power manager service does not handle wake events and
> therefore does not know when it is safe to acknowledge a wake event
> (assuming this acknowledgement re-triggers suspend). Other components
> handle the event and only notify the power manager if the event should
> change a state (e.g. turn the screen on). Some wake events, like the
> alarm used for battery monitoring, don't signal user space at all if
> the user visible state did not change. Other wake events are processed
> by lower level user-space services than the system-server where the
> power manager runs.

So you are all good with the wake event suspend race condition never ever
getting corrected or the fact that we have to sprinkle overlapping
kernel wake locks up and down the stack if we want to attempt to
implement correct code or that there is *no* way to deal with the hand
off of a wake lock critical section between kernel and user mode on wake
events without having a somewhat arbitrary time out wake lock dropping in
kernel mode?

Fine, if you don't like having the PMS ack wake events how about having
the services that handle them do it?  

The basic problem with wake locks is that there is no explicit wake
event acknowledgment required before re-suspending.  How about helping
us come up with a solution to that.

--mark

> -- 
> Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 5/8] PM / Sleep: Change wakeup statistics
  2012-02-15  6:15   ` Arve Hjønnevåg
@ 2012-02-15 22:37     ` Rafael J. Wysocki
  2012-02-17  2:11       ` Arve Hjønnevåg
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-15 22:37 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, John Stultz, Brian Swetland, Neil Brown, Alan Stern

On Wednesday, February 15, 2012, Arve Hjønnevåg wrote:
> On Mon, Feb 6, 2012 at 5:05 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> >
> > Wakeup statistics used by Android are slightly different from what we
> > have at the moment, so modify them to follow Android more closely.
> ...
> > @@ -438,6 +444,11 @@ static void wakeup_source_deactivate(str
> >        if (ktime_to_ns(duration) > ktime_to_ns(ws->max_time))
> >                ws->max_time = duration;
> >
> > +       ws->last_time = now;
> > +       if (ws->has_timeout && time_after(jiffies, ws->timer_expires))
> 
> time_after_eq may work better (or increment the count from the timer).

I think incrementing the count from the timer is a better approach.

> I applied this patch and the expire counts I see for wakeup-sources
> that always time-out do not match the active count.

I see.  The reason may also be that __pm_wakeup_event() increments
ws->event_count even if the wakeup source is already active.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-15  5:57     ` Arve Hjønnevåg
@ 2012-02-15 23:07       ` Rafael J. Wysocki
  2012-02-16 22:22         ` Rafael J. Wysocki
  2012-02-17  3:55         ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Arve Hjønnevåg
  0 siblings, 2 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-15 23:07 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, John Stultz, Brian Swetland, Neil Brown, Alan Stern

On Wednesday, February 15, 2012, Arve Hjønnevåg wrote:
> 2012/2/14 Rafael J. Wysocki <rjw@sisk.pl>:
> > On Tuesday, February 14, 2012, Arve Hjønnevåg wrote:
> >> On Mon, Feb 6, 2012 at 5:00 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> ...
> >> but the wake-source timeout feature has some bugs or incompatible apis. An
> >> init api would also be useful for embedding wake-sources in other data
> >> structures without adding another memory allocation. Your patch to
> >> move the spinlock init to wakeup_source_add still require the struct
> >> to be zero initialized and the name set manually.
> >
> > That should be easy to fix.  What about the appended patch?
> >
> 
> That works, but I still have to call more than one function before I
> can use the wakeup-source (wakeup_source_init and wakeup_source_add)
> and more than one function before I can free it (__pm_relax,
> wakeup_source_remove and wakeup_source_drop). Is there any reason to
> keep these separate?

Yes, there is.  I think that wakeup_source_create/_destroy() should
use the same initialization functions internally that will be used for
externally allocated wakeup sources (to make sure that all wakeup source
objects are initialized in exactly the same way).

> Also, not copying the name when the caller provides the memory for the
> wakeup-source would be a closer match to the wakelock api. Most of our
> wakelocks pass a string constant as the name, and making a copy of
> that string is not useful. wake_lock_init is also safe to call from
> atomic context, but I don't know if anyone relies on this.

OK, below is another go.  It doesn't copy the name if wakeup_source_init() is
used (which also does the _add this time).  I think, though, that copying
the name is generally safer, because someone might use wakeup_source_init()
with the name string allocated on the stack or otherwise temporary, which would
be a bug with the new version.

Thanks,
Rafael


Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/power/wakeup.c |   41 ++++++++++++++++++++++++++++++++++-------
 include/linux/pm_wakeup.h   |   20 ++++++++++++++++++++
 2 files changed, 54 insertions(+), 7 deletions(-)

Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -53,6 +53,23 @@ static void pm_wakeup_timer_fn(unsigned
 static LIST_HEAD(wakeup_sources);
 
 /**
+ * wakeup_source_prepare - Prepare a new wakeup source for initialization.
+ * @ws: Wakeup source to prepare.
+ * @name: Pointer to the name of the new wakeup source.
+ *
+ * Callers must ensure that the @name string won't be freed when @ws is still in
+ * use.
+ */
+void wakeup_source_prepare(struct wakeup_source *ws, const char *name)
+{
+	if (ws) {
+		memset(ws, 0, sizeof(*ws));
+		ws->name = name;
+	}
+}
+EXPORT_SYMBOL_GPL(wakeup_source_prepare);
+
+/**
  * wakeup_source_create - Create a struct wakeup_source object.
  * @name: Name of the new wakeup source.
  */
@@ -60,31 +77,41 @@ struct wakeup_source *wakeup_source_crea
 {
 	struct wakeup_source *ws;
 
-	ws = kzalloc(sizeof(*ws), GFP_KERNEL);
+	ws = kmalloc(sizeof(*ws), GFP_KERNEL);
 	if (!ws)
 		return NULL;
 
-	if (name)
-		ws->name = kstrdup(name, GFP_KERNEL);
-
+	wakeup_source_prepare(ws, name ? kstrdup(name, GFP_KERNEL) : NULL);
 	return ws;
 }
 EXPORT_SYMBOL_GPL(wakeup_source_create);
 
 /**
- * wakeup_source_destroy - Destroy a struct wakeup_source object.
- * @ws: Wakeup source to destroy.
+ * wakeup_source_drop - Prepare a struct wakeup_source object for destruction.
+ * @ws: Wakeup source to prepare for destruction.
  *
  * Callers must ensure that __pm_stay_awake() or __pm_wakeup_event() will never
  * be run in parallel with this function for the same wakeup source object.
  */
-void wakeup_source_destroy(struct wakeup_source *ws)
+void wakeup_source_drop(struct wakeup_source *ws)
 {
 	if (!ws)
 		return;
 
 	del_timer_sync(&ws->timer);
 	__pm_relax(ws);
+}
+EXPORT_SYMBOL_GPL(wakeup_source_drop);
+
+/**
+ * wakeup_source_destroy - Destroy a struct wakeup_source object.
+ * @ws: Wakeup source to destroy.
+ *
+ * Use only for wakeup source objects created with wakeup_source_create().
+ */
+void wakeup_source_destroy(struct wakeup_source *ws)
+{
+	wakeup_source_drop(ws);
 	kfree(ws->name);
 	kfree(ws);
 }
Index: linux/include/linux/pm_wakeup.h
===================================================================
--- linux.orig/include/linux/pm_wakeup.h
+++ linux/include/linux/pm_wakeup.h
@@ -73,7 +73,9 @@ static inline bool device_may_wakeup(str
 }
 
 /* drivers/base/power/wakeup.c */
+extern void wakeup_source_prepare(struct wakeup_source *ws, const char *name);
 extern struct wakeup_source *wakeup_source_create(const char *name);
+extern void wakeup_source_drop(struct wakeup_source *ws);
 extern void wakeup_source_destroy(struct wakeup_source *ws);
 extern void wakeup_source_add(struct wakeup_source *ws);
 extern void wakeup_source_remove(struct wakeup_source *ws);
@@ -103,11 +105,16 @@ static inline bool device_can_wakeup(str
 	return dev->power.can_wakeup;
 }
 
+static inline void wakeup_source_prepare(struct wakeup_source *ws,
+					 const char *name) {}
+
 static inline struct wakeup_source *wakeup_source_create(const char *name)
 {
 	return NULL;
 }
 
+static inline void wakeup_source_drop(struct wakeup_source *ws) {}
+
 static inline void wakeup_source_destroy(struct wakeup_source *ws) {}
 
 static inline void wakeup_source_add(struct wakeup_source *ws) {}
@@ -165,4 +172,17 @@ static inline void pm_wakeup_event(struc
 
 #endif /* !CONFIG_PM_SLEEP */
 
+static inline void wakeup_source_init(struct wakeup_source *ws,
+				      const char *name)
+{
+	wakeup_source_prepare(ws, name);
+	wakeup_source_add(ws);
+}
+
+static inline void wakeup_source_trash(struct wakeup_source *ws)
+{
+	wakeup_source_remove(ws);
+	wakeup_source_drop(ws);
+}
+
 #endif /* _LINUX_PM_WAKEUP_H */

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-15 23:07       ` Rafael J. Wysocki
@ 2012-02-16 22:22         ` Rafael J. Wysocki
  2012-02-17  3:56           ` Arve Hjønnevåg
  2012-02-17  3:55         ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Arve Hjønnevåg
  1 sibling, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-16 22:22 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, John Stultz, Brian Swetland, Neil Brown, Alan Stern

On Thursday, February 16, 2012, Rafael J. Wysocki wrote:
> On Wednesday, February 15, 2012, Arve Hjønnevåg wrote:
> > 2012/2/14 Rafael J. Wysocki <rjw@sisk.pl>:
> > > On Tuesday, February 14, 2012, Arve Hjønnevåg wrote:
> > >> On Mon, Feb 6, 2012 at 5:00 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > >> ...
> > >> but the wake-source timeout feature has some bugs or incompatible apis. An
> > >> init api would also be useful for embedding wake-sources in other data
> > >> structures without adding another memory allocation. Your patch to
> > >> move the spinlock init to wakeup_source_add still require the struct
> > >> to be zero initialized and the name set manually.
> > >
> > > That should be easy to fix.  What about the appended patch?
> > >
> > 
> > That works, but I still have to call more than one function before I
> > can use the wakeup-source (wakeup_source_init and wakeup_source_add)
> > and more than one function before I can free it (__pm_relax,
> > wakeup_source_remove and wakeup_source_drop). Is there any reason to
> > keep these separate?
> 
> Yes, there is.  I think that wakeup_source_create/_destroy() should
> use the same initialization functions internally that will be used for
> externally allocated wakeup sources (to make sure that all wakeup source
> objects are initialized in exactly the same way).
> 
> > Also, not copying the name when the caller provides the memory for the
> > wakeup-source would be a closer match to the wakelock api. Most of our
> > wakelocks pass a string constant as the name, and making a copy of
> > that string is not useful. wake_lock_init is also safe to call from
> > atomic context, but I don't know if anyone relies on this.
> 
> OK, below is another go.  It doesn't copy the name if wakeup_source_init() is
> used (which also does the _add this time).  I think, though, that copying
> the name is generally safer, because someone might use wakeup_source_init()
> with the name string allocated on the stack or otherwise temporary, which would
> be a bug with the new version.

So, is the new version more suitable than the previous one?

Rafael


> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
>  drivers/base/power/wakeup.c |   41 ++++++++++++++++++++++++++++++++++-------
>  include/linux/pm_wakeup.h   |   20 ++++++++++++++++++++
>  2 files changed, 54 insertions(+), 7 deletions(-)
> 
> Index: linux/drivers/base/power/wakeup.c
> ===================================================================
> --- linux.orig/drivers/base/power/wakeup.c
> +++ linux/drivers/base/power/wakeup.c
> @@ -53,6 +53,23 @@ static void pm_wakeup_timer_fn(unsigned
>  static LIST_HEAD(wakeup_sources);
>  
>  /**
> + * wakeup_source_prepare - Prepare a new wakeup source for initialization.
> + * @ws: Wakeup source to prepare.
> + * @name: Pointer to the name of the new wakeup source.
> + *
> + * Callers must ensure that the @name string won't be freed when @ws is still in
> + * use.
> + */
> +void wakeup_source_prepare(struct wakeup_source *ws, const char *name)
> +{
> +	if (ws) {
> +		memset(ws, 0, sizeof(*ws));
> +		ws->name = name;
> +	}
> +}
> +EXPORT_SYMBOL_GPL(wakeup_source_prepare);
> +
> +/**
>   * wakeup_source_create - Create a struct wakeup_source object.
>   * @name: Name of the new wakeup source.
>   */
> @@ -60,31 +77,41 @@ struct wakeup_source *wakeup_source_crea
>  {
>  	struct wakeup_source *ws;
>  
> -	ws = kzalloc(sizeof(*ws), GFP_KERNEL);
> +	ws = kmalloc(sizeof(*ws), GFP_KERNEL);
>  	if (!ws)
>  		return NULL;
>  
> -	if (name)
> -		ws->name = kstrdup(name, GFP_KERNEL);
> -
> +	wakeup_source_prepare(ws, name ? kstrdup(name, GFP_KERNEL) : NULL);
>  	return ws;
>  }
>  EXPORT_SYMBOL_GPL(wakeup_source_create);
>  
>  /**
> - * wakeup_source_destroy - Destroy a struct wakeup_source object.
> - * @ws: Wakeup source to destroy.
> + * wakeup_source_drop - Prepare a struct wakeup_source object for destruction.
> + * @ws: Wakeup source to prepare for destruction.
>   *
>   * Callers must ensure that __pm_stay_awake() or __pm_wakeup_event() will never
>   * be run in parallel with this function for the same wakeup source object.
>   */
> -void wakeup_source_destroy(struct wakeup_source *ws)
> +void wakeup_source_drop(struct wakeup_source *ws)
>  {
>  	if (!ws)
>  		return;
>  
>  	del_timer_sync(&ws->timer);
>  	__pm_relax(ws);
> +}
> +EXPORT_SYMBOL_GPL(wakeup_source_drop);
> +
> +/**
> + * wakeup_source_destroy - Destroy a struct wakeup_source object.
> + * @ws: Wakeup source to destroy.
> + *
> + * Use only for wakeup source objects created with wakeup_source_create().
> + */
> +void wakeup_source_destroy(struct wakeup_source *ws)
> +{
> +	wakeup_source_drop(ws);
>  	kfree(ws->name);
>  	kfree(ws);
>  }
> Index: linux/include/linux/pm_wakeup.h
> ===================================================================
> --- linux.orig/include/linux/pm_wakeup.h
> +++ linux/include/linux/pm_wakeup.h
> @@ -73,7 +73,9 @@ static inline bool device_may_wakeup(str
>  }
>  
>  /* drivers/base/power/wakeup.c */
> +extern void wakeup_source_prepare(struct wakeup_source *ws, const char *name);
>  extern struct wakeup_source *wakeup_source_create(const char *name);
> +extern void wakeup_source_drop(struct wakeup_source *ws);
>  extern void wakeup_source_destroy(struct wakeup_source *ws);
>  extern void wakeup_source_add(struct wakeup_source *ws);
>  extern void wakeup_source_remove(struct wakeup_source *ws);
> @@ -103,11 +105,16 @@ static inline bool device_can_wakeup(str
>  	return dev->power.can_wakeup;
>  }
>  
> +static inline void wakeup_source_prepare(struct wakeup_source *ws,
> +					 const char *name) {}
> +
>  static inline struct wakeup_source *wakeup_source_create(const char *name)
>  {
>  	return NULL;
>  }
>  
> +static inline void wakeup_source_drop(struct wakeup_source *ws) {}
> +
>  static inline void wakeup_source_destroy(struct wakeup_source *ws) {}
>  
>  static inline void wakeup_source_add(struct wakeup_source *ws) {}
> @@ -165,4 +172,17 @@ static inline void pm_wakeup_event(struc
>  
>  #endif /* !CONFIG_PM_SLEEP */
>  
> +static inline void wakeup_source_init(struct wakeup_source *ws,
> +				      const char *name)
> +{
> +	wakeup_source_prepare(ws, name);
> +	wakeup_source_add(ws);
> +}
> +
> +static inline void wakeup_source_trash(struct wakeup_source *ws)
> +{
> +	wakeup_source_remove(ws);
> +	wakeup_source_drop(ws);
> +}
> +
>  #endif /* _LINUX_PM_WAKEUP_H */
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 5/8] PM / Sleep: Change wakeup statistics
  2012-02-15 22:37     ` Rafael J. Wysocki
@ 2012-02-17  2:11       ` Arve Hjønnevåg
  0 siblings, 0 replies; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-02-17  2:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, John Stultz, Brian Swetland, Neil Brown, Alan Stern

2012/2/15 Rafael J. Wysocki <rjw@sisk.pl>:
> On Wednesday, February 15, 2012, Arve Hjønnevåg wrote:
>> On Mon, Feb 6, 2012 at 5:05 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> > From: Rafael J. Wysocki <rjw@sisk.pl>
>> >
>> > Wakeup statistics used by Android are slightly different from what we
>> > have at the moment, so modify them to follow Android more closely.
>> ...
>> > @@ -438,6 +444,11 @@ static void wakeup_source_deactivate(str
>> >        if (ktime_to_ns(duration) > ktime_to_ns(ws->max_time))
>> >                ws->max_time = duration;
>> >
>> > +       ws->last_time = now;
>> > +       if (ws->has_timeout && time_after(jiffies, ws->timer_expires))
>>
>> time_after_eq may work better (or increment the count from the timer).
>
> I think incrementing the count from the timer is a better approach.
>

OK.

>> I applied this patch and the expire counts I see for wakeup-sources
>> that always time-out do not match the active count.
>
> I see.  The reason may also be that __pm_wakeup_event() increments
> ws->event_count even if the wakeup source is already active.
>

The active count, which is what I was looking at, only changes if it
was not already active though.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-15 23:07       ` Rafael J. Wysocki
  2012-02-16 22:22         ` Rafael J. Wysocki
@ 2012-02-17  3:55         ` Arve Hjønnevåg
  2012-02-17 20:57           ` Rafael J. Wysocki
  1 sibling, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-02-17  3:55 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, John Stultz, Brian Swetland, Neil Brown, Alan Stern

2012/2/15 Rafael J. Wysocki <rjw@sisk.pl>:
> On Wednesday, February 15, 2012, Arve Hjønnevåg wrote:
>> 2012/2/14 Rafael J. Wysocki <rjw@sisk.pl>:
>> > On Tuesday, February 14, 2012, Arve Hjønnevåg wrote:
>> >> On Mon, Feb 6, 2012 at 5:00 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> >> ...
>> >> but the wake-source timeout feature has some bugs or incompatible apis. An
>> >> init api would also be useful for embedding wake-sources in other data
>> >> structures without adding another memory allocation. Your patch to
>> >> move the spinlock init to wakeup_source_add still require the struct
>> >> to be zero initialized and the name set manually.
>> >
>> > That should be easy to fix.  What about the appended patch?
>> >
>>
>> That works, but I still have to call more than one function before I
>> can use the wakeup-source (wakeup_source_init and wakeup_source_add)
>> and more than one function before I can free it (__pm_relax,
>> wakeup_source_remove and wakeup_source_drop). Is there any reason to
>> keep these separate?
>
> Yes, there is.  I think that wakeup_source_create/_destroy() should
> use the same initialization functions internally that will be used for
> externally allocated wakeup sources (to make sure that all wakeup source
> objects are initialized in exactly the same way).
>

I agree with that, but is it useful to export these helper functions?

>> Also, not copying the name when the caller provides the memory for the
>> wakeup-source would be a closer match to the wakelock api. Most of our
>> wakelocks pass a string constant as the name, and making a copy of
>> that string is not useful. wake_lock_init is also safe to call from
>> atomic context, but I don't know if anyone relies on this.
>
> OK, below is another go.  It doesn't copy the name if wakeup_source_init() is
> used (which also does the _add this time).  I think, though, that copying
> the name is generally safer, because someone might use wakeup_source_init()
> with the name string allocated on the stack or otherwise temporary, which would
> be a bug with the new version.
>

I prefer this version. I have not seen a bug where someone passed a
temporary as the wakelock name, I assume since this will show up
immediately in the stats file.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-16 22:22         ` Rafael J. Wysocki
@ 2012-02-17  3:56           ` Arve Hjønnevåg
  2012-02-17 23:02             ` [PATCH] PM / Sleep: Add more wakeup source initialization routines Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-02-17  3:56 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, John Stultz, Brian Swetland, Neil Brown, Alan Stern

2012/2/16 Rafael J. Wysocki <rjw@sisk.pl>:
...
>
> So, is the new version more suitable than the previous one?
>

Yes, I think it is.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks"
  2012-02-17  3:55         ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Arve Hjønnevåg
@ 2012-02-17 20:57           ` Rafael J. Wysocki
  0 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-17 20:57 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, John Stultz, Brian Swetland, Neil Brown, Alan Stern

On Friday, February 17, 2012, Arve Hjønnevåg wrote:
> 2012/2/15 Rafael J. Wysocki <rjw@sisk.pl>:
> > On Wednesday, February 15, 2012, Arve Hjønnevåg wrote:
> >> 2012/2/14 Rafael J. Wysocki <rjw@sisk.pl>:
> >> > On Tuesday, February 14, 2012, Arve Hjønnevåg wrote:
> >> >> On Mon, Feb 6, 2012 at 5:00 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> >> ...
> >> >> but the wake-source timeout feature has some bugs or incompatible apis. An
> >> >> init api would also be useful for embedding wake-sources in other data
> >> >> structures without adding another memory allocation. Your patch to
> >> >> move the spinlock init to wakeup_source_add still require the struct
> >> >> to be zero initialized and the name set manually.
> >> >
> >> > That should be easy to fix.  What about the appended patch?
> >> >
> >>
> >> That works, but I still have to call more than one function before I
> >> can use the wakeup-source (wakeup_source_init and wakeup_source_add)
> >> and more than one function before I can free it (__pm_relax,
> >> wakeup_source_remove and wakeup_source_drop). Is there any reason to
> >> keep these separate?
> >
> > Yes, there is.  I think that wakeup_source_create/_destroy() should
> > use the same initialization functions internally that will be used for
> > externally allocated wakeup sources (to make sure that all wakeup source
> > objects are initialized in exactly the same way).
> >
> 
> I agree with that, but is it useful to export these helper functions?

Well, we need to export either them or the ones that will call them internally
and in principle someone may want to do something between _prepare() and _add()
sometimes ...

> >> Also, not copying the name when the caller provides the memory for the
> >> wakeup-source would be a closer match to the wakelock api. Most of our
> >> wakelocks pass a string constant as the name, and making a copy of
> >> that string is not useful. wake_lock_init is also safe to call from
> >> atomic context, but I don't know if anyone relies on this.
> >
> > OK, below is another go.  It doesn't copy the name if wakeup_source_init() is
> > used (which also does the _add this time).  I think, though, that copying
> > the name is generally safer, because someone might use wakeup_source_init()
> > with the name string allocated on the stack or otherwise temporary, which would
> > be a bug with the new version.
> >
> 
> I prefer this version. I have not seen a bug where someone passed a
> temporary as the wakelock name, I assume since this will show up
> immediately in the stats file.

OK

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH] PM / Sleep: Add more wakeup source initialization routines
  2012-02-17  3:56           ` Arve Hjønnevåg
@ 2012-02-17 23:02             ` Rafael J. Wysocki
  2012-02-18 23:50               ` [Update][PATCH] " Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-17 23:02 UTC (permalink / raw)
  To: Linux PM list
  Cc: Arve Hjønnevåg, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

From: Rafael J. Wysocki <rjw@sisk.pl>

The existing wakeup source initialization routines are not
particularly useful for wakeup sources that aren't created by
wakeup_source_create(), because their users have to open code
filling the objects with zeros and setting their names.  For this
reason, introduce routines that can be used for initializing, for
example, static wakeup source objects.

Requested-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---

This patch is on top of the linux-next branch of the linux-pm tree.

Thanks,
Rafael

---
 drivers/base/power/wakeup.c |   41 ++++++++++++++++++++++++++++++++++-------
 include/linux/pm_wakeup.h   |   20 ++++++++++++++++++++
 2 files changed, 54 insertions(+), 7 deletions(-)

Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -53,6 +53,23 @@ static void pm_wakeup_timer_fn(unsigned
 static LIST_HEAD(wakeup_sources);
 
 /**
+ * wakeup_source_prepare - Prepare a new wakeup source for initialization.
+ * @ws: Wakeup source to prepare.
+ * @name: Pointer to the name of the new wakeup source.
+ *
+ * Callers must ensure that the @name string won't be freed when @ws is still in
+ * use.
+ */
+void wakeup_source_prepare(struct wakeup_source *ws, const char *name)
+{
+	if (ws) {
+		memset(ws, 0, sizeof(*ws));
+		ws->name = name;
+	}
+}
+EXPORT_SYMBOL_GPL(wakeup_source_prepare);
+
+/**
  * wakeup_source_create - Create a struct wakeup_source object.
  * @name: Name of the new wakeup source.
  */
@@ -60,31 +77,41 @@ struct wakeup_source *wakeup_source_crea
 {
 	struct wakeup_source *ws;
 
-	ws = kzalloc(sizeof(*ws), GFP_KERNEL);
+	ws = kmalloc(sizeof(*ws), GFP_KERNEL);
 	if (!ws)
 		return NULL;
 
-	if (name)
-		ws->name = kstrdup(name, GFP_KERNEL);
-
+	wakeup_source_prepare(ws, name ? kstrdup(name, GFP_KERNEL) : NULL);
 	return ws;
 }
 EXPORT_SYMBOL_GPL(wakeup_source_create);
 
 /**
- * wakeup_source_destroy - Destroy a struct wakeup_source object.
- * @ws: Wakeup source to destroy.
+ * wakeup_source_drop - Prepare a struct wakeup_source object for destruction.
+ * @ws: Wakeup source to prepare for destruction.
  *
  * Callers must ensure that __pm_stay_awake() or __pm_wakeup_event() will never
  * be run in parallel with this function for the same wakeup source object.
  */
-void wakeup_source_destroy(struct wakeup_source *ws)
+void wakeup_source_drop(struct wakeup_source *ws)
 {
 	if (!ws)
 		return;
 
 	del_timer_sync(&ws->timer);
 	__pm_relax(ws);
+}
+EXPORT_SYMBOL_GPL(wakeup_source_drop);
+
+/**
+ * wakeup_source_destroy - Destroy a struct wakeup_source object.
+ * @ws: Wakeup source to destroy.
+ *
+ * Use only for wakeup source objects created with wakeup_source_create().
+ */
+void wakeup_source_destroy(struct wakeup_source *ws)
+{
+	wakeup_source_drop(ws);
 	kfree(ws->name);
 	kfree(ws);
 }
Index: linux/include/linux/pm_wakeup.h
===================================================================
--- linux.orig/include/linux/pm_wakeup.h
+++ linux/include/linux/pm_wakeup.h
@@ -73,7 +73,9 @@ static inline bool device_may_wakeup(str
 }
 
 /* drivers/base/power/wakeup.c */
+extern void wakeup_source_prepare(struct wakeup_source *ws, const char *name);
 extern struct wakeup_source *wakeup_source_create(const char *name);
+extern void wakeup_source_drop(struct wakeup_source *ws);
 extern void wakeup_source_destroy(struct wakeup_source *ws);
 extern void wakeup_source_add(struct wakeup_source *ws);
 extern void wakeup_source_remove(struct wakeup_source *ws);
@@ -103,11 +105,16 @@ static inline bool device_can_wakeup(str
 	return dev->power.can_wakeup;
 }
 
+static inline void wakeup_source_prepare(struct wakeup_source *ws,
+					 const char *name) {}
+
 static inline struct wakeup_source *wakeup_source_create(const char *name)
 {
 	return NULL;
 }
 
+static inline void wakeup_source_drop(struct wakeup_source *ws) {}
+
 static inline void wakeup_source_destroy(struct wakeup_source *ws) {}
 
 static inline void wakeup_source_add(struct wakeup_source *ws) {}
@@ -165,4 +172,17 @@ static inline void pm_wakeup_event(struc
 
 #endif /* !CONFIG_PM_SLEEP */
 
+static inline void wakeup_source_init(struct wakeup_source *ws,
+				      const char *name)
+{
+	wakeup_source_prepare(ws, name);
+	wakeup_source_add(ws);
+}
+
+static inline void wakeup_source_trash(struct wakeup_source *ws)
+{
+	wakeup_source_remove(ws);
+	wakeup_source_drop(ws);
+}
+
 #endif /* _LINUX_PM_WAKEUP_H */


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [Update][PATCH] PM / Sleep: Add more wakeup source initialization routines
  2012-02-17 23:02             ` [PATCH] PM / Sleep: Add more wakeup source initialization routines Rafael J. Wysocki
@ 2012-02-18 23:50               ` Rafael J. Wysocki
  2012-02-20 23:04                 ` [Update 2x][PATCH] " Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-18 23:50 UTC (permalink / raw)
  To: Linux PM list
  Cc: Arve Hjønnevåg, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM / Sleep: Add more wakeup source initialization routines

The existing wakeup source initialization routines are not
particularly useful for wakeup sources that aren't created by
wakeup_source_create(), because their users have to open code
filling the objects with zeros and setting their names.  For this
reason, introduce routines that can be used for initializing, for
example, static wakeup source objects.

Requested-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---

The name member of struct wakeup_source has to be of type (const char *)
due to the new dependencies between the arguments of the new initializers.
That also reflects the fact that that string is not supposed to be modified.

Thanks,
Rafael

---
 drivers/base/power/wakeup.c |   41 ++++++++++++++++++++++++++++++++++-------
 include/linux/pm_wakeup.h   |   22 +++++++++++++++++++++-
 2 files changed, 55 insertions(+), 8 deletions(-)

Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -53,6 +53,23 @@ static void pm_wakeup_timer_fn(unsigned
 static LIST_HEAD(wakeup_sources);
 
 /**
+ * wakeup_source_prepare - Prepare a new wakeup source for initialization.
+ * @ws: Wakeup source to prepare.
+ * @name: Pointer to the name of the new wakeup source.
+ *
+ * Callers must ensure that the @name string won't be freed when @ws is still in
+ * use.
+ */
+void wakeup_source_prepare(struct wakeup_source *ws, const char *name)
+{
+	if (ws) {
+		memset(ws, 0, sizeof(*ws));
+		ws->name = name;
+	}
+}
+EXPORT_SYMBOL_GPL(wakeup_source_prepare);
+
+/**
  * wakeup_source_create - Create a struct wakeup_source object.
  * @name: Name of the new wakeup source.
  */
@@ -60,31 +77,41 @@ struct wakeup_source *wakeup_source_crea
 {
 	struct wakeup_source *ws;
 
-	ws = kzalloc(sizeof(*ws), GFP_KERNEL);
+	ws = kmalloc(sizeof(*ws), GFP_KERNEL);
 	if (!ws)
 		return NULL;
 
-	if (name)
-		ws->name = kstrdup(name, GFP_KERNEL);
-
+	wakeup_source_prepare(ws, name ? kstrdup(name, GFP_KERNEL) : NULL);
 	return ws;
 }
 EXPORT_SYMBOL_GPL(wakeup_source_create);
 
 /**
- * wakeup_source_destroy - Destroy a struct wakeup_source object.
- * @ws: Wakeup source to destroy.
+ * wakeup_source_drop - Prepare a struct wakeup_source object for destruction.
+ * @ws: Wakeup source to prepare for destruction.
  *
  * Callers must ensure that __pm_stay_awake() or __pm_wakeup_event() will never
  * be run in parallel with this function for the same wakeup source object.
  */
-void wakeup_source_destroy(struct wakeup_source *ws)
+void wakeup_source_drop(struct wakeup_source *ws)
 {
 	if (!ws)
 		return;
 
 	del_timer_sync(&ws->timer);
 	__pm_relax(ws);
+}
+EXPORT_SYMBOL_GPL(wakeup_source_drop);
+
+/**
+ * wakeup_source_destroy - Destroy a struct wakeup_source object.
+ * @ws: Wakeup source to destroy.
+ *
+ * Use only for wakeup source objects created with wakeup_source_create().
+ */
+void wakeup_source_destroy(struct wakeup_source *ws)
+{
+	wakeup_source_drop(ws);
 	kfree(ws->name);
 	kfree(ws);
 }
Index: linux/include/linux/pm_wakeup.h
===================================================================
--- linux.orig/include/linux/pm_wakeup.h
+++ linux/include/linux/pm_wakeup.h
@@ -41,7 +41,7 @@
  * @active: Status of the wakeup source.
  */
 struct wakeup_source {
-	char 			*name;
+	const char 		*name;
 	struct list_head	entry;
 	spinlock_t		lock;
 	struct timer_list	timer;
@@ -73,7 +73,9 @@ static inline bool device_may_wakeup(str
 }
 
 /* drivers/base/power/wakeup.c */
+extern void wakeup_source_prepare(struct wakeup_source *ws, const char *name);
 extern struct wakeup_source *wakeup_source_create(const char *name);
+extern void wakeup_source_drop(struct wakeup_source *ws);
 extern void wakeup_source_destroy(struct wakeup_source *ws);
 extern void wakeup_source_add(struct wakeup_source *ws);
 extern void wakeup_source_remove(struct wakeup_source *ws);
@@ -103,11 +105,16 @@ static inline bool device_can_wakeup(str
 	return dev->power.can_wakeup;
 }
 
+static inline void wakeup_source_prepare(struct wakeup_source *ws,
+					 const char *name) {}
+
 static inline struct wakeup_source *wakeup_source_create(const char *name)
 {
 	return NULL;
 }
 
+static inline void wakeup_source_drop(struct wakeup_source *ws) {}
+
 static inline void wakeup_source_destroy(struct wakeup_source *ws) {}
 
 static inline void wakeup_source_add(struct wakeup_source *ws) {}
@@ -165,4 +172,17 @@ static inline void pm_wakeup_event(struc
 
 #endif /* !CONFIG_PM_SLEEP */
 
+static inline void wakeup_source_init(struct wakeup_source *ws,
+				      const char *name)
+{
+	wakeup_source_prepare(ws, name);
+	wakeup_source_add(ws);
+}
+
+static inline void wakeup_source_trash(struct wakeup_source *ws)
+{
+	wakeup_source_remove(ws);
+	wakeup_source_drop(ws);
+}
+
 #endif /* _LINUX_PM_WAKEUP_H */

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [Update 2x][PATCH] PM / Sleep: Add more wakeup source initialization routines
  2012-02-18 23:50               ` [Update][PATCH] " Rafael J. Wysocki
@ 2012-02-20 23:04                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-20 23:04 UTC (permalink / raw)
  To: Linux PM list
  Cc: Arve Hjønnevåg, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern

From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM / Sleep: Add more wakeup source initialization routines

The existing wakeup source initialization routines are not
particularly useful for wakeup sources that aren't created by
wakeup_source_create(), because their users have to open code
filling the objects with zeros and setting their names.  For this
reason, introduce routines that can be used for initializing, for
example, static wakeup source objects.

Requested-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---

Make sure that wakeup_source_unregister() won't crash or trigger the
WARN_ON() in wakeup_source_remove() if a NULL pointer is passed to it.

Thanks,
Rafael

---
 drivers/base/power/wakeup.c |   50 ++++++++++++++++++++++++++++++++++++--------
 include/linux/pm_wakeup.h   |   22 ++++++++++++++++++-
 2 files changed, 62 insertions(+), 10 deletions(-)

Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -53,6 +53,23 @@ static void pm_wakeup_timer_fn(unsigned
 static LIST_HEAD(wakeup_sources);
 
 /**
+ * wakeup_source_prepare - Prepare a new wakeup source for initialization.
+ * @ws: Wakeup source to prepare.
+ * @name: Pointer to the name of the new wakeup source.
+ *
+ * Callers must ensure that the @name string won't be freed when @ws is still in
+ * use.
+ */
+void wakeup_source_prepare(struct wakeup_source *ws, const char *name)
+{
+	if (ws) {
+		memset(ws, 0, sizeof(*ws));
+		ws->name = name;
+	}
+}
+EXPORT_SYMBOL_GPL(wakeup_source_prepare);
+
+/**
  * wakeup_source_create - Create a struct wakeup_source object.
  * @name: Name of the new wakeup source.
  */
@@ -60,31 +77,44 @@ struct wakeup_source *wakeup_source_crea
 {
 	struct wakeup_source *ws;
 
-	ws = kzalloc(sizeof(*ws), GFP_KERNEL);
+	ws = kmalloc(sizeof(*ws), GFP_KERNEL);
 	if (!ws)
 		return NULL;
 
-	if (name)
-		ws->name = kstrdup(name, GFP_KERNEL);
-
+	wakeup_source_prepare(ws, name ? kstrdup(name, GFP_KERNEL) : NULL);
 	return ws;
 }
 EXPORT_SYMBOL_GPL(wakeup_source_create);
 
 /**
- * wakeup_source_destroy - Destroy a struct wakeup_source object.
- * @ws: Wakeup source to destroy.
+ * wakeup_source_drop - Prepare a struct wakeup_source object for destruction.
+ * @ws: Wakeup source to prepare for destruction.
  *
  * Callers must ensure that __pm_stay_awake() or __pm_wakeup_event() will never
  * be run in parallel with this function for the same wakeup source object.
  */
-void wakeup_source_destroy(struct wakeup_source *ws)
+void wakeup_source_drop(struct wakeup_source *ws)
 {
 	if (!ws)
 		return;
 
 	del_timer_sync(&ws->timer);
 	__pm_relax(ws);
+}
+EXPORT_SYMBOL_GPL(wakeup_source_drop);
+
+/**
+ * wakeup_source_destroy - Destroy a struct wakeup_source object.
+ * @ws: Wakeup source to destroy.
+ *
+ * Use only for wakeup source objects created with wakeup_source_create().
+ */
+void wakeup_source_destroy(struct wakeup_source *ws)
+{
+	if (!ws)
+		return;
+
+	wakeup_source_drop(ws);
 	kfree(ws->name);
 	kfree(ws);
 }
@@ -147,8 +177,10 @@ EXPORT_SYMBOL_GPL(wakeup_source_register
  */
 void wakeup_source_unregister(struct wakeup_source *ws)
 {
-	wakeup_source_remove(ws);
-	wakeup_source_destroy(ws);
+	if (ws) {
+		wakeup_source_remove(ws);
+		wakeup_source_destroy(ws);
+	}
 }
 EXPORT_SYMBOL_GPL(wakeup_source_unregister);
 
Index: linux/include/linux/pm_wakeup.h
===================================================================
--- linux.orig/include/linux/pm_wakeup.h
+++ linux/include/linux/pm_wakeup.h
@@ -41,7 +41,7 @@
  * @active: Status of the wakeup source.
  */
 struct wakeup_source {
-	char 			*name;
+	const char 		*name;
 	struct list_head	entry;
 	spinlock_t		lock;
 	struct timer_list	timer;
@@ -73,7 +73,9 @@ static inline bool device_may_wakeup(str
 }
 
 /* drivers/base/power/wakeup.c */
+extern void wakeup_source_prepare(struct wakeup_source *ws, const char *name);
 extern struct wakeup_source *wakeup_source_create(const char *name);
+extern void wakeup_source_drop(struct wakeup_source *ws);
 extern void wakeup_source_destroy(struct wakeup_source *ws);
 extern void wakeup_source_add(struct wakeup_source *ws);
 extern void wakeup_source_remove(struct wakeup_source *ws);
@@ -103,11 +105,16 @@ static inline bool device_can_wakeup(str
 	return dev->power.can_wakeup;
 }
 
+static inline void wakeup_source_prepare(struct wakeup_source *ws,
+					 const char *name) {}
+
 static inline struct wakeup_source *wakeup_source_create(const char *name)
 {
 	return NULL;
 }
 
+static inline void wakeup_source_drop(struct wakeup_source *ws) {}
+
 static inline void wakeup_source_destroy(struct wakeup_source *ws) {}
 
 static inline void wakeup_source_add(struct wakeup_source *ws) {}
@@ -165,4 +172,17 @@ static inline void pm_wakeup_event(struc
 
 #endif /* !CONFIG_PM_SLEEP */
 
+static inline void wakeup_source_init(struct wakeup_source *ws,
+				      const char *name)
+{
+	wakeup_source_prepare(ws, name);
+	wakeup_source_add(ws);
+}
+
+static inline void wakeup_source_trash(struct wakeup_source *ws)
+{
+	wakeup_source_remove(ws);
+	wakeup_source_drop(ws);
+}
+
 #endif /* _LINUX_PM_WAKEUP_H */

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2
  2012-02-07  1:00 [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
                   ` (11 preceding siblings ...)
  2012-02-14  2:07 ` Arve Hjønnevåg
@ 2012-02-21 23:31 ` Rafael J. Wysocki
  2012-02-21 23:32   ` [RFC][PATCH 1/7] PM / Sleep: Look for wakeup events in later stages of device suspend Rafael J. Wysocki
                     ` (8 more replies)
  12 siblings, 9 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-21 23:31 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

Hi all,

After the feedback so far I've decided to follow up with a refreshed patchset.
The first two patches from the previous one went to linux-pm/linux-next
and I included the recent evdev patch from Arve (with some modifications)
to this patchset for completness.

On Tuesday, February 07, 2012, Rafael J. Wysocki wrote:
> Hi all,
> 
> This series tests the theory that the easiest way to sell a once rejected
> feature is to advertise it under a different name.
> 
> Well, there actually are two different features, although they are closely
> related to each other.  First, patch [6/8] introduces a feature that allows
> the kernel to trigger system suspend (or more generally a transition into
> a sleep state) whenever there are no active wakeup sources (no, they aren't
> called wakelocks).  It is called "autosleep" here, but it was called a few
> different names in the past ("opportunistic suspend" was probably the most
> popular one).  Second, patch [8/8] introduces "wake locks" that are,
> essentially, wakeup sources which may be created and manipulated by user
> space.  Using them user space may control the autosleep feature introduced
> earlier.
> 
> This also is a kind of a proof of concept for the people who wanted me to
> show a kernel-based implementation of automatic suspend, so there you go.
> Please note, however, that it is done so that the user space "wake locks"
> interface is compatible with Android in support of its user space.  I don't
> really like this interface, but since the Android's user space seems to rely
> on it, I'm fine with using it as is.  YMMV.
> 
> Let me say a few words about every patch in the series individually.
> 
> [1/8] - This really is a bug fix, so it's v3.4 material.  Nobody has stepped
>   on this bug so far, but it should be fixed anyway.
> 
> [2/8] - This is a freezer cleanup, worth doing anyway IMO, so v3.4 material too.

The above two are in linux-pm/linux-next now.  There are a few more fixes
related to wakeup sources in there and the patches below are based on that
branch.

> [3/8] - This is something we can do no problem, although completely optional
>   without the autosleep feature.  Rather necessary with it, though.

Now [1/7] - Look for wakeup events in later stages of device suspend.

> [4/8] - This kind of reintroduces my original idea of using a wait queue for
>   waiting until there are no wakeup events in progress.  Alan convinced me that
>   it would be better to poll the counter to prevent wakeup_source_deactivate()
>   from having to call wake_up_all() occasionally (that may be costly in fast
>   paths), but then quite some people told me that the wait queue migh be
>   better.  I think that the polling will make much less sense with autosleep
>   and user space "wake locks".  Anyway, [4/8] is something we can do without
>   those things too.

Now [2/7] - Use wait queue to signal "no wakeup events in progress"

  With a couple of improvements suggested by Neil.

> The patches above were given Sign-off-by tags, because I think they make some
> sense regardless of the features introcuded by the remaining patches that in
> turn are total RFC.

This time all of the patches are signed-off and include the requisite
documentation changes (hopefully, I haven't forgotten about anything).

> [5/8] - This changes wakeup source statistics so that they are more similar to
>   the statistics collected for wakelocks on Android.  The file those statistics
>   may be read from is still located in debugfs, though (I don't think it
>   belongs to proc and its name is different from the analogous Android's file
>   name anyway).  It could be done without autosleep, but then it would be a bit
>   pointless.  BTW, this changes interfaces that _in_ _theory_ may be used by
>   someone, but I'm not aware of anyone using them.  If you are one, I'll be
>   pleased to learn about that, so please tell me who you are. :-)

Now [3/7] - Change wakeup source statistics to follow Android.

  Rebased and reworked in accordance with the Arve's feedback.

[4/7] - Add ioctl to block suspend while event queue is not empty.

  Originally posted by Arve as http://marc.info/?l=linux-pm&m=132711288825973&w=4
  Reworked and with modified changelog (I wonder what Dmity thinks about this).

  It has some minor problems (for example, in some situations the queue wakeup
  source may be activated for events that are not coming from a wakeup device),
  but I think it's simple enough, at least for illustration.  The ioctls
  introduced here will be used by Android user space anyway, although perhaps
  under different names, AFAICS.

> [6/8] - Autosleep implementation.  I think the changelog explains the idea
>   quite well and the code is really nothing special.  It doesn't really add
>   anything new to the kernel in terms of infrastructure etc., it just uses
>   the existing stuff to implement an alternative method of triggering system
>   sleep transitions.  Note, though, that the interface here is different
>   from the Android's one, because Android actually modifies /sys/power/state
>   to trigger something called "early suspend" (that is never going to be
>   implemented in the "stock" kernel as long as I have any influence on it) and
>   we simply can't do that in the mainline.

Now [5/7] - Implement opportunistic sleep

  Rebased and simplified (most notably, I've dropped the "main" wakeup source,
  since it wasn't really necessary).

> [7/8] - This adds a wakeup source statistics that only makes sense with
>   autosleep and (I believe) is analogous to the Android's prevent_suspend_time
>   statistics.  Nothing really special, but I didn't want
>   wakeup_source_activate/deactivate() to take a common lock to avoid
>   congestion.

Now [6/7] - Add "prevent autosleep time" statistics to wakeup sources.

  Rebased.

> [8/8] - This adds a user space interface to create, activate and deactivate
>   wakeup sources.  Since the files it consists of are called wake_lock and
>   wake_unlock, to follow Android, the objects the wakeup sources are wrapped
>   into are called "wakelocks" (for added confusion).  Since the interface
>   doesn't provide any means to destroy those "wakelocks", I added a garbage
>   collection mechanism to get rid of the unused ones, if any.  I also tought
>   it might be a good idea to put a limit on the number of those things that
>   user space can operate simultaneously, so I did that too.

Now [7/7] - Add user space interface for manipulating wakeup sources.

> All of the above has been tested very briefly on my test-bed Mackerel board
> and it quite obviously requires more thorough testing, but first I need to know
> if it makes sense to spend any more time on it.

The above is still accurate, but I also verified that the patches don't break
my PC test boxes (at least as long as the new features aren't used ;-)).

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 1/7] PM / Sleep: Look for wakeup events in later stages of device suspend
  2012-02-21 23:31 ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 Rafael J. Wysocki
@ 2012-02-21 23:32   ` Rafael J. Wysocki
  2012-02-21 23:33   ` [RFC][PATCH 2/7] PM / Sleep: Use wait queue to signal "no wakeup events in progress" Rafael J. Wysocki
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-21 23:32 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

From: Rafael J. Wysocki <rjw@sisk.pl>

Currently, the device suspend code in drivers/base/power/main.c
only checks if there have been any wakeup events, and therefore the
ongoing system transition to a sleep state should be aborted, during
the first (i.e. "suspend") device suspend phase.  However, wakeup
events may be reported later as well, so it's reasonable to look for
them in the in the subsequent (i.e. "late suspend" and "suspend
noirq") phases.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/power/main.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

Index: linux/drivers/base/power/main.c
===================================================================
--- linux.orig/drivers/base/power/main.c
+++ linux/drivers/base/power/main.c
@@ -889,6 +889,11 @@ static int dpm_suspend_noirq(pm_message_
 		if (!list_empty(&dev->power.entry))
 			list_move(&dev->power.entry, &dpm_noirq_list);
 		put_device(dev);
+
+		if (pm_wakeup_pending()) {
+			error = -EBUSY;
+			break;
+		}
 	}
 	mutex_unlock(&dpm_list_mtx);
 	if (error)
@@ -962,6 +967,11 @@ static int dpm_suspend_late(pm_message_t
 		if (!list_empty(&dev->power.entry))
 			list_move(&dev->power.entry, &dpm_late_early_list);
 		put_device(dev);
+
+		if (pm_wakeup_pending()) {
+			error = -EBUSY;
+			break;
+		}
 	}
 	mutex_unlock(&dpm_list_mtx);
 	if (error)


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 2/7] PM / Sleep: Use wait queue to signal "no wakeup events in progress"
  2012-02-21 23:31 ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 Rafael J. Wysocki
  2012-02-21 23:32   ` [RFC][PATCH 1/7] PM / Sleep: Look for wakeup events in later stages of device suspend Rafael J. Wysocki
@ 2012-02-21 23:33   ` Rafael J. Wysocki
  2012-02-21 23:34   ` [RFC][PATCH 3/7] PM / Sleep: Change wakeup source statistics to follow Android Rafael J. Wysocki
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-21 23:33 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

From: Rafael J. Wysocki <rjw@sisk.pl>

The current wakeup source deactivation code doesn't do anything when
the counter of wakeup events in progress goes down to zero, which
requires pm_get_wakeup_count() to poll that counter periodically.
Although this reduces the average time it takes to deactivate a
wakeup source, it also may lead to a substantial amount of unnecessary
polling if there are extended periods of wakeup activity.  Thus it
seems reasonable to use a wait queue for signaling the "no wakeup
events in progress" condition and remove the polling.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/power/wakeup.c |   16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -17,8 +17,6 @@
 
 #include "power.h"
 
-#define TIMEOUT		100
-
 /*
  * If set, the suspend/hibernate code will abort transitions to a sleep state
  * if wakeup events are registered during or immediately before the transition.
@@ -52,6 +50,8 @@ static void pm_wakeup_timer_fn(unsigned
 
 static LIST_HEAD(wakeup_sources);
 
+static DECLARE_WAIT_QUEUE_HEAD(wakeup_count_wait_queue);
+
 /**
  * wakeup_source_prepare - Prepare a new wakeup source for initialization.
  * @ws: Wakeup source to prepare.
@@ -442,6 +442,7 @@ EXPORT_SYMBOL_GPL(pm_stay_awake);
  */
 static void wakeup_source_deactivate(struct wakeup_source *ws)
 {
+	unsigned int cnt, inpr;
 	ktime_t duration;
 	ktime_t now;
 
@@ -476,6 +477,10 @@ static void wakeup_source_deactivate(str
 	 * couter of wakeup events in progress simultaneously.
 	 */
 	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
+
+	split_counters(&cnt, &inpr);
+	if (!inpr && waitqueue_active(&wakeup_count_wait_queue))
+		wake_up(&wakeup_count_wait_queue);
 }
 
 /**
@@ -667,14 +672,19 @@ bool pm_wakeup_pending(void)
 bool pm_get_wakeup_count(unsigned int *count)
 {
 	unsigned int cnt, inpr;
+	DEFINE_WAIT(wait);
 
 	for (;;) {
+		prepare_to_wait(&wakeup_count_wait_queue, &wait,
+				TASK_INTERRUPTIBLE);
 		split_counters(&cnt, &inpr);
 		if (inpr == 0 || signal_pending(current))
 			break;
 		pm_wakeup_update_hit_counts();
-		schedule_timeout_interruptible(msecs_to_jiffies(TIMEOUT));
+
+		schedule();
 	}
+	finish_wait(&wakeup_count_wait_queue, &wait);
 
 	split_counters(&cnt, &inpr);
 	*count = cnt;


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 3/7] PM / Sleep: Change wakeup source statistics to follow Android
  2012-02-21 23:31 ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 Rafael J. Wysocki
  2012-02-21 23:32   ` [RFC][PATCH 1/7] PM / Sleep: Look for wakeup events in later stages of device suspend Rafael J. Wysocki
  2012-02-21 23:33   ` [RFC][PATCH 2/7] PM / Sleep: Use wait queue to signal "no wakeup events in progress" Rafael J. Wysocki
@ 2012-02-21 23:34   ` Rafael J. Wysocki
  2012-02-21 23:34   ` [RFC][PATCH 4/7] Input / PM: Add ioctl to block suspend while event queue is not empty Rafael J. Wysocki
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-21 23:34 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

From: Rafael J. Wysocki <rjw@sisk.pl>

Wakeup statistics used by Android are slightly different from what we
have in wakeup sources at the moment and there aren't any known
users of those statistics other than Android, so modify them to make
it easier for Android to switch to wakeup sources.

This removes the struct wakeup_source's hit_cout field, which is very
rough and therefore not very useful, and adds two new fields,
wakeup_count and expire_count.  The first one tracks how many times
the wakeup source is activated with events_check_enabled set (which
roughly corresponds to the situations when a system power transition
to a sleep state is in progress and would be aborted by this wakeup
source if it were the only active one at that time) and the second
one is the number of times the wakeup source has been activated with
a timeout that expired.

Additionally, the last_time field is now updated when the wakeup
source is deactivated too (previously it was only updated during
the wakeup source's activation), which seems to be what Android does
with the analogous counter for wakelocks.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/ABI/testing/sysfs-devices-power |   24 ++++++---
 drivers/base/power/sysfs.c                    |   30 ++++++++++--
 drivers/base/power/wakeup.c                   |   64 +++++++++++---------------
 include/linux/pm_wakeup.h                     |   11 ++--
 4 files changed, 77 insertions(+), 52 deletions(-)

Index: linux/include/linux/pm_wakeup.h
===================================================================
--- linux.orig/include/linux/pm_wakeup.h
+++ linux/include/linux/pm_wakeup.h
@@ -33,12 +33,14 @@
  *
  * @total_time: Total time this wakeup source has been active.
  * @max_time: Maximum time this wakeup source has been continuously active.
- * @last_time: Monotonic clock when the wakeup source's was activated last time.
+ * @last_time: Monotonic clock when the wakeup source's was touched last time.
  * @event_count: Number of signaled wakeup events.
  * @active_count: Number of times the wakeup sorce was activated.
  * @relax_count: Number of times the wakeup sorce was deactivated.
- * @hit_count: Number of times the wakeup sorce might abort system suspend.
+ * @expire_count: Number of times the wakeup source's timeout has expired.
+ * @wakeup_count: Number of times the wakeup source might abort suspend.
  * @active: Status of the wakeup source.
+ * @has_timeout: The wakeup source has been activated with a timeout.
  */
 struct wakeup_source {
 	const char 		*name;
@@ -52,8 +54,9 @@ struct wakeup_source {
 	unsigned long		event_count;
 	unsigned long		active_count;
 	unsigned long		relax_count;
-	unsigned long		hit_count;
-	unsigned int		active:1;
+	unsigned long		expire_count;
+	unsigned long		wakeup_count;
+	bool			active:1;
 };
 
 #ifdef CONFIG_PM_SLEEP
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -21,7 +21,7 @@
  * If set, the suspend/hibernate code will abort transitions to a sleep state
  * if wakeup events are registered during or immediately before the transition.
  */
-bool events_check_enabled;
+bool events_check_enabled __read_mostly;
 
 /*
  * Combined counters of registered wakeup events and wakeup events in progress.
@@ -383,6 +383,21 @@ static void wakeup_source_activate(struc
 }
 
 /**
+ * wakeup_source_report_event - Report wakeup event using the given source.
+ * @ws: Wakeup source to report the event for.
+ */
+static void wakeup_source_report_event(struct wakeup_source *ws)
+{
+	ws->event_count++;
+	/* This is racy, but the counter is approximate anyway. */
+	if (events_check_enabled)
+		ws->wakeup_count++;
+
+	if (!ws->active)
+		wakeup_source_activate(ws);
+}
+
+/**
  * __pm_stay_awake - Notify the PM core of a wakeup event.
  * @ws: Wakeup source object associated with the source of the event.
  *
@@ -397,10 +412,7 @@ void __pm_stay_awake(struct wakeup_sourc
 
 	spin_lock_irqsave(&ws->lock, flags);
 
-	ws->event_count++;
-	if (!ws->active)
-		wakeup_source_activate(ws);
-
+	wakeup_source_report_event(ws);
 	del_timer(&ws->timer);
 	ws->timer_expires = 0;
 
@@ -469,6 +481,7 @@ static void wakeup_source_deactivate(str
 	if (ktime_to_ns(duration) > ktime_to_ns(ws->max_time))
 		ws->max_time = duration;
 
+	ws->last_time = now;
 	del_timer(&ws->timer);
 	ws->timer_expires = 0;
 
@@ -541,8 +554,10 @@ static void pm_wakeup_timer_fn(unsigned
 	spin_lock_irqsave(&ws->lock, flags);
 
 	if (ws->active && ws->timer_expires
-	    && time_after_eq(jiffies, ws->timer_expires))
+	    && time_after_eq(jiffies, ws->timer_expires)) {
 		wakeup_source_deactivate(ws);
+		ws->expire_count++;
+	}
 
 	spin_unlock_irqrestore(&ws->lock, flags);
 }
@@ -569,9 +584,7 @@ void __pm_wakeup_event(struct wakeup_sou
 
 	spin_lock_irqsave(&ws->lock, flags);
 
-	ws->event_count++;
-	if (!ws->active)
-		wakeup_source_activate(ws);
+	wakeup_source_report_event(ws);
 
 	if (!msec) {
 		wakeup_source_deactivate(ws);
@@ -614,24 +627,6 @@ void pm_wakeup_event(struct device *dev,
 EXPORT_SYMBOL_GPL(pm_wakeup_event);
 
 /**
- * pm_wakeup_update_hit_counts - Update hit counts of all active wakeup sources.
- */
-static void pm_wakeup_update_hit_counts(void)
-{
-	unsigned long flags;
-	struct wakeup_source *ws;
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(ws, &wakeup_sources, entry) {
-		spin_lock_irqsave(&ws->lock, flags);
-		if (ws->active)
-			ws->hit_count++;
-		spin_unlock_irqrestore(&ws->lock, flags);
-	}
-	rcu_read_unlock();
-}
-
-/**
  * pm_wakeup_pending - Check if power transition in progress should be aborted.
  *
  * Compare the current number of registered wakeup events with its preserved
@@ -653,8 +648,6 @@ bool pm_wakeup_pending(void)
 		events_check_enabled = !ret;
 	}
 	spin_unlock_irqrestore(&events_lock, flags);
-	if (ret)
-		pm_wakeup_update_hit_counts();
 	return ret;
 }
 
@@ -680,7 +673,6 @@ bool pm_get_wakeup_count(unsigned int *c
 		split_counters(&cnt, &inpr);
 		if (inpr == 0 || signal_pending(current))
 			break;
-		pm_wakeup_update_hit_counts();
 
 		schedule();
 	}
@@ -713,8 +705,6 @@ bool pm_save_wakeup_count(unsigned int c
 		events_check_enabled = true;
 	}
 	spin_unlock_irq(&events_lock);
-	if (!events_check_enabled)
-		pm_wakeup_update_hit_counts();
 	return events_check_enabled;
 }
 
@@ -749,9 +739,10 @@ static int print_wakeup_source_stats(str
 		active_time = ktime_set(0, 0);
 	}
 
-	ret = seq_printf(m, "%-12s\t%lu\t\t%lu\t\t%lu\t\t"
+	ret = seq_printf(m, "%-12s\t%lu\t\t%lu\t\t%lu\t\t%lu\t\t"
 			"%lld\t\t%lld\t\t%lld\t\t%lld\n",
-			ws->name, active_count, ws->event_count, ws->hit_count,
+			ws->name, active_count, ws->event_count,
+			ws->wakeup_count, ws->expire_count,
 			ktime_to_ms(active_time), ktime_to_ms(total_time),
 			ktime_to_ms(max_time), ktime_to_ms(ws->last_time));
 
@@ -768,8 +759,9 @@ static int wakeup_sources_stats_show(str
 {
 	struct wakeup_source *ws;
 
-	seq_puts(m, "name\t\tactive_count\tevent_count\thit_count\t"
-		"active_since\ttotal_time\tmax_time\tlast_change\n");
+	seq_puts(m, "name\t\tactive_count\tevent_count\twakeup_count\t"
+		"expire_count\tactive_since\ttotal_time\tmax_time\t"
+		"last_change\n");
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(ws, &wakeup_sources, entry)
Index: linux/drivers/base/power/sysfs.c
===================================================================
--- linux.orig/drivers/base/power/sysfs.c
+++ linux/drivers/base/power/sysfs.c
@@ -288,22 +288,41 @@ static ssize_t wakeup_active_count_show(
 
 static DEVICE_ATTR(wakeup_active_count, 0444, wakeup_active_count_show, NULL);
 
-static ssize_t wakeup_hit_count_show(struct device *dev,
-				struct device_attribute *attr, char *buf)
+static ssize_t wakeup_abort_count_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	unsigned long count = 0;
+	bool enabled = false;
+
+	spin_lock_irq(&dev->power.lock);
+	if (dev->power.wakeup) {
+		count = dev->power.wakeup->wakeup_count;
+		enabled = true;
+	}
+	spin_unlock_irq(&dev->power.lock);
+	return enabled ? sprintf(buf, "%lu\n", count) : sprintf(buf, "\n");
+}
+
+static DEVICE_ATTR(wakeup_abort_count, 0444, wakeup_abort_count_show, NULL);
+
+static ssize_t wakeup_expire_count_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
 {
 	unsigned long count = 0;
 	bool enabled = false;
 
 	spin_lock_irq(&dev->power.lock);
 	if (dev->power.wakeup) {
-		count = dev->power.wakeup->hit_count;
+		count = dev->power.wakeup->expire_count;
 		enabled = true;
 	}
 	spin_unlock_irq(&dev->power.lock);
 	return enabled ? sprintf(buf, "%lu\n", count) : sprintf(buf, "\n");
 }
 
-static DEVICE_ATTR(wakeup_hit_count, 0444, wakeup_hit_count_show, NULL);
+static DEVICE_ATTR(wakeup_expire_count, 0444, wakeup_expire_count_show, NULL);
 
 static ssize_t wakeup_active_show(struct device *dev,
 				struct device_attribute *attr, char *buf)
@@ -460,7 +479,8 @@ static struct attribute *wakeup_attrs[]
 	&dev_attr_wakeup.attr,
 	&dev_attr_wakeup_count.attr,
 	&dev_attr_wakeup_active_count.attr,
-	&dev_attr_wakeup_hit_count.attr,
+	&dev_attr_wakeup_abort_count.attr,
+	&dev_attr_wakeup_expire_count.attr,
 	&dev_attr_wakeup_active.attr,
 	&dev_attr_wakeup_total_time_ms.attr,
 	&dev_attr_wakeup_max_time_ms.attr,
Index: linux/Documentation/ABI/testing/sysfs-devices-power
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-devices-power
+++ linux/Documentation/ABI/testing/sysfs-devices-power
@@ -96,16 +96,26 @@ Description:
 		is read-only.  If the device is not enabled to wake up the
 		system from sleep states, this attribute is not present.
 
-What:		/sys/devices/.../power/wakeup_hit_count
-Date:		September 2010
+What:		/sys/devices/.../power/wakeup_abort_count
+Date:		February 2012
 Contact:	Rafael J. Wysocki <rjw@sisk.pl>
 Description:
-		The /sys/devices/.../wakeup_hit_count attribute contains the
+		The /sys/devices/.../wakeup_abort_count attribute contains the
 		number of times the processing of a wakeup event associated with
-		the device might prevent the system from entering a sleep state.
-		This attribute is read-only.  If the device is not enabled to
-		wake up the system from sleep states, this attribute is not
-		present.
+		the device might have aborted system transition into a sleep
+		state in progress.  This attribute is read-only.  If the device
+		is not enabled to wake up the system from sleep states, this
+		attribute is not present.
+
+What:		/sys/devices/.../power/wakeup_expire_count
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/devices/.../wakeup_expire_count attribute contains the
+		number of times a wakeup event associated with the device has
+		been reported with a timeout that expired.  This attribute is
+		read-only.  If the device is not enabled to wake up the system
+		from sleep states, this attribute is not present.
 
 What:		/sys/devices/.../power/wakeup_active
 Date:		September 2010


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 4/7] Input / PM: Add ioctl to block suspend while event queue is not empty
  2012-02-21 23:31 ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 Rafael J. Wysocki
                     ` (2 preceding siblings ...)
  2012-02-21 23:34   ` [RFC][PATCH 3/7] PM / Sleep: Change wakeup source statistics to follow Android Rafael J. Wysocki
@ 2012-02-21 23:34   ` Rafael J. Wysocki
  2012-02-24  5:16     ` Matt Helsley
  2012-02-21 23:35   ` [RFC][PATCH 5/7] PM / Sleep: Implement opportunistic sleep Rafael J. Wysocki
                     ` (4 subsequent siblings)
  8 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-21 23:34 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

From: Arve Hjønnevåg <arve@android.com>

Add a new ioctl, EVIOCSWAKEUPSRC, to attach a wakeup source object to
an evdev client event queue, such that it will be active whenever the
queue is not empty.  Then, all events in the queue will be regarded
as wakeup events in progress and pm_get_wakeup_count() will block (or
return false if woken up by a signal) until they are removed from the
queue.  In consequence, if the checking of wakeup events is enabled
(e.g. throught the /sys/power/wakeup_count interface), the system
won't be able to go into a sleep state until the queue is empty.

This allows user space processes to handle situations in which they
want to do a select() on an evdev descriptor, so they go to sleep
until there are some events to read from the device's queue, and then
they don't want the system to go into a sleep state until all the
events are read (presumably for further processing).  Of course, if
they don't want the system to go into a sleep state _after_ all the
events have been read from the queue, they have to use a separate
mechanism that will prevent the system from doing that and it has
to be activated before reading the first event (that also may be the
last one).

[rjw: Removed unnecessary checks, changed the names of the new ioctls
 and the names of the functions that add/remove wakeup source objects
 to/from evdev clients, modified the changelog.]

Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/input/evdev.c |   55 ++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/input.h |    3 ++
 2 files changed, 58 insertions(+)

Index: linux/drivers/input/evdev.c
===================================================================
--- linux.orig/drivers/input/evdev.c
+++ linux/drivers/input/evdev.c
@@ -43,6 +43,7 @@ struct evdev_client {
 	unsigned int tail;
 	unsigned int packet_head; /* [future] position of the first element of next packet */
 	spinlock_t buffer_lock; /* protects access to buffer, head and tail */
+	struct wakeup_source *wakeup_source;
 	struct fasync_struct *fasync;
 	struct evdev *evdev;
 	struct list_head node;
@@ -75,10 +76,12 @@ static void evdev_pass_event(struct evde
 		client->buffer[client->tail].value = 0;
 
 		client->packet_head = client->tail;
+		__pm_relax(client->wakeup_source);
 	}
 
 	if (event->type == EV_SYN && event->code == SYN_REPORT) {
 		client->packet_head = client->head;
+		__pm_stay_awake(client->wakeup_source);
 		kill_fasync(&client->fasync, SIGIO, POLL_IN);
 	}
 
@@ -255,6 +258,8 @@ static int evdev_release(struct inode *i
 	mutex_unlock(&evdev->mutex);
 
 	evdev_detach_client(evdev, client);
+	wakeup_source_unregister(client->wakeup_source);
+
 	kfree(client);
 
 	evdev_close_device(evdev);
@@ -373,6 +378,8 @@ static int evdev_fetch_next_event(struct
 	if (have_event) {
 		*event = client->buffer[client->tail++];
 		client->tail &= client->bufsize - 1;
+		if (client->packet_head == client->tail)
+			__pm_relax(client->wakeup_source);
 	}
 
 	spin_unlock_irq(&client->buffer_lock);
@@ -623,6 +630,45 @@ static int evdev_handle_set_keycode_v2(s
 	return input_set_keycode(dev, &ke);
 }
 
+static int evdev_attach_wakeup_source(struct evdev *evdev,
+				      struct evdev_client *client)
+{
+	struct wakeup_source *ws;
+	char name[28];
+
+	if (client->wakeup_source)
+		return 0;
+
+	snprintf(name, sizeof(name), "%s-%d",
+		 dev_name(&evdev->dev), task_tgid_vnr(current));
+
+	ws = wakeup_source_register(name);
+	if (!ws)
+		return -ENOMEM;
+
+	spin_lock_irq(&client->buffer_lock);
+	client->wakeup_source = ws;
+	if (client->packet_head != client->tail)
+		__pm_stay_awake(client->wakeup_source);
+	spin_unlock_irq(&client->buffer_lock);
+	return 0;
+}
+
+static int evdev_detach_wakeup_source(struct evdev *evdev,
+				      struct evdev_client *client)
+{
+	struct wakeup_source *ws;
+
+	spin_lock_irq(&client->buffer_lock);
+	ws = client->wakeup_source;
+	client->wakeup_source = NULL;
+	spin_unlock_irq(&client->buffer_lock);
+
+	wakeup_source_unregister(ws);
+
+	return 0;
+}
+
 static long evdev_do_ioctl(struct file *file, unsigned int cmd,
 			   void __user *p, int compat_mode)
 {
@@ -696,6 +742,15 @@ static long evdev_do_ioctl(struct file *
 
 	case EVIOCSKEYCODE_V2:
 		return evdev_handle_set_keycode_v2(dev, p);
+
+	case EVIOCGWAKEUPSRC:
+		return put_user(!!client->wakeup_source, ip);
+
+	case EVIOCSWAKEUPSRC:
+		if (p)
+			return evdev_attach_wakeup_source(evdev, client);
+		else
+			return evdev_detach_wakeup_source(evdev, client);
 	}
 
 	size = _IOC_SIZE(cmd);
Index: linux/include/linux/input.h
===================================================================
--- linux.orig/include/linux/input.h
+++ linux/include/linux/input.h
@@ -129,6 +129,9 @@ struct input_keymap_entry {
 
 #define EVIOCGRAB		_IOW('E', 0x90, int)			/* Grab/Release device */
 
+#define EVIOCGWAKEUPSRC	_IOR('E', 0x91, int)	/* Check if wakeup handling is enabled */
+#define EVIOCSWAKEUPSRC	_IOW('E', 0x91, int)	/* Enable/disable wakeup handling */
+
 /*
  * Device properties and quirks
  */


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 5/7] PM / Sleep: Implement opportunistic sleep
  2012-02-21 23:31 ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 Rafael J. Wysocki
                     ` (3 preceding siblings ...)
  2012-02-21 23:34   ` [RFC][PATCH 4/7] Input / PM: Add ioctl to block suspend while event queue is not empty Rafael J. Wysocki
@ 2012-02-21 23:35   ` Rafael J. Wysocki
  2012-02-22  8:45     ` Srivatsa S. Bhat
  2012-02-21 23:36   ` [RFC][PATCH 6/7] PM / Sleep: Add "prevent autosleep time" statistics to wakeup sources Rafael J. Wysocki
                     ` (3 subsequent siblings)
  8 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-21 23:35 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

From: Rafael J. Wysocki <rjw@sisk.pl>

Introduce a mechanism by which the kernel can trigger global
transitions to a sleep state chosen by user space if there are no
active wakeup sources.

It consists of a new sysfs attribute, /sys/power/autosleep, that
can be written one of the strings returned by reads from
/sys/power/state, an ordered workqueue and a work item carrying out
the "suspend" operations.  If a string representing the system's
sleep state is written to /sys/power/autosleep, the work item
triggering transitions to that state is queued up and it requeues
itself after every execution until user space writes "off" to
/sys/power/autosleep.

That work item enables the detection of wakeup events using the
functions already defined in drivers/base/power/wakeup.c (with one
small modification) and calls either pm_suspend(), or hibernate() to
put the system into a sleep state.  If a wakeup event is reported
while the transition is in progress, it will abort the transition and
the "system suspend" work item will be queued up again.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/ABI/testing/sysfs-power |   17 +++++
 drivers/base/power/wakeup.c           |   38 ++++++-----
 include/linux/suspend.h               |   13 +++-
 kernel/power/Kconfig                  |    8 ++
 kernel/power/Makefile                 |    1 
 kernel/power/autosleep.c              |   98 ++++++++++++++++++++++++++++++
 kernel/power/main.c                   |  108 ++++++++++++++++++++++++++++------
 kernel/power/power.h                  |   18 +++++
 8 files changed, 266 insertions(+), 35 deletions(-)

Index: linux/kernel/power/Makefile
===================================================================
--- linux.orig/kernel/power/Makefile
+++ linux/kernel/power/Makefile
@@ -9,5 +9,6 @@ obj-$(CONFIG_SUSPEND)		+= suspend.o
 obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend_test.o
 obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
 				   block_io.o
+obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
 
 obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
Index: linux/kernel/power/Kconfig
===================================================================
--- linux.orig/kernel/power/Kconfig
+++ linux/kernel/power/Kconfig
@@ -103,6 +103,14 @@ config PM_SLEEP_SMP
 	select HOTPLUG
 	select HOTPLUG_CPU
 
+config PM_AUTOSLEEP
+	bool "Opportunistic sleep"
+	depends on PM_SLEEP
+	default n
+	---help---
+	Allow the kernel to trigger a system transition into a global sleep
+	state automatically whenever there are no active wakeup sources.
+
 config PM_RUNTIME
 	bool "Run-time PM core functionality"
 	depends on !IA64_HP_SIM
Index: linux/kernel/power/power.h
===================================================================
--- linux.orig/kernel/power/power.h
+++ linux/kernel/power/power.h
@@ -264,3 +264,21 @@ static inline void suspend_thaw_processe
 {
 }
 #endif
+
+#ifdef CONFIG_PM_AUTOSLEEP
+
+/* kernel/power/autosleep.c */
+extern int pm_autosleep_init(void);
+extern void pm_autosleep_lock(void);
+extern void pm_autosleep_unlock(void);
+extern suspend_state_t pm_autosleep_state(void);
+extern int pm_autosleep_set_state(suspend_state_t state);
+
+#else /* !CONFIG_PM_AUTOSLEEP */
+
+static inline int pm_autosleep_init(void) { return 0; }
+static inline void pm_autosleep_lock(void) {}
+static inline void pm_autosleep_unlock(void) {}
+static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
+
+#endif /* !CONFIG_PM_AUTOSLEEP */
Index: linux/include/linux/suspend.h
===================================================================
--- linux.orig/include/linux/suspend.h
+++ linux/include/linux/suspend.h
@@ -356,7 +356,7 @@ extern int unregister_pm_notifier(struct
 extern bool events_check_enabled;
 
 extern bool pm_wakeup_pending(void);
-extern bool pm_get_wakeup_count(unsigned int *count);
+extern bool pm_get_wakeup_count(unsigned int *count, bool block);
 extern bool pm_save_wakeup_count(unsigned int count);
 
 static inline void lock_system_sleep(void)
@@ -407,6 +407,17 @@ static inline void unlock_system_sleep(v
 
 #endif /* !CONFIG_PM_SLEEP */
 
+#ifdef CONFIG_PM_AUTOSLEEP
+
+/* kernel/power/autosleep.c */
+void queue_up_suspend_work(void);
+
+#else /* !CONFIG_PM_AUTOSLEEP */
+
+static inline void queue_up_suspend_work(void) {}
+
+#endif /* !CONFIG_PM_AUTOSLEEP */
+
 #ifdef CONFIG_ARCH_SAVE_PAGE_KEYS
 /*
  * The ARCH_SAVE_PAGE_KEYS functions can be used by an architecture
Index: linux/kernel/power/autosleep.c
===================================================================
--- /dev/null
+++ linux/kernel/power/autosleep.c
@@ -0,0 +1,98 @@
+/*
+ * kernel/power/autosleep.c
+ *
+ * Opportunistic sleep support.
+ *
+ * Copyright (C) 2012 Rafael J. Wysocki <rjw@sisk.pl>
+ */
+
+#include <linux/device.h>
+#include <linux/mutex.h>
+#include <linux/pm_wakeup.h>
+
+#include "power.h"
+
+static suspend_state_t autosleep_state;
+static struct workqueue_struct *autosleep_wq;
+static DEFINE_MUTEX(autosleep_lock);
+
+static void try_to_suspend(struct work_struct *work)
+{
+	unsigned int initial_count, final_count;
+
+	if (!pm_get_wakeup_count(&initial_count, true))
+		goto out;
+
+	mutex_lock(&autosleep_lock);
+
+	if (!pm_save_wakeup_count(initial_count)) {
+		mutex_unlock(&autosleep_lock);
+		goto out;
+	}
+
+	if (autosleep_state == PM_SUSPEND_ON) {
+		mutex_unlock(&autosleep_lock);
+		return;
+	}
+	if (autosleep_state >= PM_SUSPEND_MAX)
+		hibernate();
+	else
+		pm_suspend(autosleep_state);
+
+	mutex_unlock(&autosleep_lock);
+
+	if (!pm_get_wakeup_count(&final_count, false))
+		goto out;
+
+	if (final_count == initial_count)
+		schedule_timeout(HZ / 2);
+
+ out:
+	queue_up_suspend_work();
+}
+
+static DECLARE_WORK(suspend_work, try_to_suspend);
+
+void queue_up_suspend_work(void)
+{
+	if (!work_pending(&suspend_work) && autosleep_state > PM_SUSPEND_ON)
+		queue_work(autosleep_wq, &suspend_work);
+}
+
+suspend_state_t pm_autosleep_state(void)
+{
+	return autosleep_state;
+}
+
+int pm_autosleep_set_state(suspend_state_t state)
+{
+#ifndef CONFIG_HIBERNATION
+	if (state >= PM_SUSPEND_MAX)
+		return -EINVAL;
+#endif
+	mutex_lock(&autosleep_lock);
+	if (state == PM_SUSPEND_ON && autosleep_state != PM_SUSPEND_ON) {
+		autosleep_state = PM_SUSPEND_ON;
+	} else if (state > PM_SUSPEND_ON) {
+		autosleep_state = state;
+		queue_up_suspend_work();
+	}
+	mutex_unlock(&autosleep_lock);
+	return 0;
+}
+
+void pm_autosleep_lock(void)
+{
+	mutex_lock(&autosleep_lock);
+}
+
+void pm_autosleep_unlock(void)
+{
+	mutex_unlock(&autosleep_lock);
+}
+
+int __init pm_autosleep_init(void)
+{
+	autosleep_wq = alloc_ordered_workqueue("autosleep", 0);
+	return autosleep_wq ? 0 : -ENOMEM;
+}
Index: linux/kernel/power/main.c
===================================================================
--- linux.orig/kernel/power/main.c
+++ linux/kernel/power/main.c
@@ -269,8 +269,7 @@ static ssize_t state_show(struct kobject
 	return (s - buf);
 }
 
-static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
-			   const char *buf, size_t n)
+static suspend_state_t decode_state(const char *buf, size_t n)
 {
 #ifdef CONFIG_SUSPEND
 	suspend_state_t state = PM_SUSPEND_STANDBY;
@@ -278,27 +277,43 @@ static ssize_t state_store(struct kobjec
 #endif
 	char *p;
 	int len;
-	int error = -EINVAL;
 
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
-	/* First, check if we are requested to hibernate */
-	if (len == 4 && !strncmp(buf, "disk", len)) {
-		error = hibernate();
-		goto Exit;
-	}
+	/* Check hibernation first. */
+	if (len == 4 && !strncmp(buf, "disk", len))
+		return PM_SUSPEND_MAX;
 
 #ifdef CONFIG_SUSPEND
-	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
-		if (*s && len == strlen(*s) && !strncmp(buf, *s, len)) {
-			error = pm_suspend(state);
-			break;
-		}
-	}
+	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++)
+		if (*s && len == strlen(*s) && !strncmp(buf, *s, len))
+			return state;
 #endif
 
- Exit:
+	return PM_SUSPEND_ON;
+}
+
+static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
+			   const char *buf, size_t n)
+{
+	suspend_state_t state;
+	int error = -EINVAL;
+
+	pm_autosleep_lock();
+	if (pm_autosleep_state() > PM_SUSPEND_ON) {
+		error = -EBUSY;
+		goto out;
+	}
+
+	state = decode_state(buf, n);
+	if (state < PM_SUSPEND_MAX)
+		error = pm_suspend(state);
+	else if (state > PM_SUSPEND_ON)
+		error = hibernate();
+
+ out:
+	pm_autosleep_unlock();
 	return error ? error : n;
 }
 
@@ -339,7 +354,8 @@ static ssize_t wakeup_count_show(struct
 {
 	unsigned int val;
 
-	return pm_get_wakeup_count(&val) ? sprintf(buf, "%u\n", val) : -EINTR;
+	return pm_get_wakeup_count(&val, true) ?
+		sprintf(buf, "%u\n", val) : -EINTR;
 }
 
 static ssize_t wakeup_count_store(struct kobject *kobj,
@@ -347,15 +363,65 @@ static ssize_t wakeup_count_store(struct
 				const char *buf, size_t n)
 {
 	unsigned int val;
+	int error = -EINVAL;
+
+	pm_autosleep_lock();
+	if (pm_autosleep_state() > PM_SUSPEND_ON) {
+		error = -EBUSY;
+		goto out;
+	}
 
 	if (sscanf(buf, "%u", &val) == 1) {
 		if (pm_save_wakeup_count(val))
 			return n;
 	}
-	return -EINVAL;
+
+ out:
+	pm_autosleep_unlock();
+	return error;
 }
 
 power_attr(wakeup_count);
+
+#ifdef CONFIG_PM_AUTOSLEEP
+static ssize_t autosleep_show(struct kobject *kobj,
+			      struct kobj_attribute *attr,
+			      char *buf)
+{
+	suspend_state_t state = pm_autosleep_state();
+
+	if (state == PM_SUSPEND_ON)
+		return sprintf(buf, "off\n");
+
+#ifdef CONFIG_SUSPEND
+	if (state < PM_SUSPEND_MAX)
+		return sprintf(buf, "%s\n", valid_state(state) ?
+						pm_states[state] : "error");
+#endif
+#ifdef CONFIG_HIBERNATION
+	return sprintf(buf, "disk\n");
+#else
+	return sprintf(buf, "error");
+#endif
+}
+
+static ssize_t autosleep_store(struct kobject *kobj,
+			       struct kobj_attribute *attr,
+			       const char *buf, size_t n)
+{
+	suspend_state_t state = decode_state(buf, n);
+	int error;
+
+	if (state == PM_SUSPEND_ON && strncmp(buf, "off", 3)
+	    && strncmp(buf, "off\n", 4))
+		return -EINVAL;
+
+	error = pm_autosleep_set_state(state);
+	return error ? error : n;
+}
+
+power_attr(autosleep);
+#endif /* CONFIG_PM_AUTOSLEEP */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_TRACE
@@ -409,6 +475,9 @@ static struct attribute * g[] = {
 #ifdef CONFIG_PM_SLEEP
 	&pm_async_attr.attr,
 	&wakeup_count_attr.attr,
+#ifdef CONFIG_PM_AUTOSLEEP
+	&autosleep_attr.attr,
+#endif
 #ifdef CONFIG_PM_DEBUG
 	&pm_test_attr.attr,
 #endif
@@ -444,7 +513,10 @@ static int __init pm_init(void)
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
-	return sysfs_create_group(power_kobj, &attr_group);
+	error = sysfs_create_group(power_kobj, &attr_group);
+	if (error)
+		return error;
+	return pm_autosleep_init();
 }
 
 core_initcall(pm_init);
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -492,8 +492,10 @@ static void wakeup_source_deactivate(str
 	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
 
 	split_counters(&cnt, &inpr);
-	if (!inpr && waitqueue_active(&wakeup_count_wait_queue))
+	if (!inpr && waitqueue_active(&wakeup_count_wait_queue)) {
 		wake_up(&wakeup_count_wait_queue);
+		queue_up_suspend_work();
+	}
 }
 
 /**
@@ -654,29 +656,33 @@ bool pm_wakeup_pending(void)
 /**
  * pm_get_wakeup_count - Read the number of registered wakeup events.
  * @count: Address to store the value at.
+ * @block: Whether or not to block.
  *
- * Store the number of registered wakeup events at the address in @count.  Block
- * if the current number of wakeup events being processed is nonzero.
+ * Store the number of registered wakeup events at the address in @count.  If
+ * @block is set, block until the current number of wakeup events being
+ * processed is zero.
  *
- * Return 'false' if the wait for the number of wakeup events being processed to
- * drop down to zero has been interrupted by a signal (and the current number
- * of wakeup events being processed is still nonzero).  Otherwise return 'true'.
+ * Return 'false' if the current number of wakeup events being processed is
+ * nonzero.  Otherwise return 'true'.
  */
-bool pm_get_wakeup_count(unsigned int *count)
+bool pm_get_wakeup_count(unsigned int *count, bool block)
 {
 	unsigned int cnt, inpr;
-	DEFINE_WAIT(wait);
 
-	for (;;) {
-		prepare_to_wait(&wakeup_count_wait_queue, &wait,
-				TASK_INTERRUPTIBLE);
-		split_counters(&cnt, &inpr);
-		if (inpr == 0 || signal_pending(current))
-			break;
+	if (block) {
+		DEFINE_WAIT(wait);
 
-		schedule();
+		for (;;) {
+			prepare_to_wait(&wakeup_count_wait_queue, &wait,
+					TASK_INTERRUPTIBLE);
+			split_counters(&cnt, &inpr);
+			if (inpr == 0 || signal_pending(current))
+				break;
+
+			schedule();
+		}
+		finish_wait(&wakeup_count_wait_queue, &wait);
 	}
-	finish_wait(&wakeup_count_wait_queue, &wait);
 
 	split_counters(&cnt, &inpr);
 	*count = cnt;
Index: linux/Documentation/ABI/testing/sysfs-power
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-power
+++ linux/Documentation/ABI/testing/sysfs-power
@@ -172,3 +172,20 @@ Description:
 
 		Reading from this file will display the current value, which is
 		set to 1 MB by default.
+
+What:		/sys/power/autosleep
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/autosleep file can be written one of the strings
+		returned by reads from /sys/power/state.  If that happens, a
+		work item attempting to trigger a transition of the system to
+		the sleep state represented by that string is queued up.  This
+		attempt will only succeed if there are no active wakeup sources
+		in the system at that time.  After evey execution, regardless
+		of whether or not the attempt to put the system to sleep has
+		succeeded, the work item requeues itself until user space
+		writes "off" to /sys/power/autosleep.
+
+		Reading from this file causes the last string successfully
+		written to it to be displayed.


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 6/7] PM / Sleep: Add "prevent autosleep time" statistics to wakeup sources
  2012-02-21 23:31 ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 Rafael J. Wysocki
                     ` (4 preceding siblings ...)
  2012-02-21 23:35   ` [RFC][PATCH 5/7] PM / Sleep: Implement opportunistic sleep Rafael J. Wysocki
@ 2012-02-21 23:36   ` Rafael J. Wysocki
  2012-02-21 23:37   ` [RFC][PATCH 7/7] PM / Sleep: Add user space interface for manipulating " Rafael J. Wysocki
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-21 23:36 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

From: Rafael J. Wysocki <rjw@sisk.pl>

Android uses one wakelock statistics that is only necessary for
opportunistic sleep.  Namely, the prevent_suspend_time field
accumulates the total time the given wakelock has been locked
while "automatic suspend" was enabled.  Add an analogous field,
prevent_sleep_time, to wakeup sources and make it behave in a similar
way.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/ABI/testing/sysfs-devices-power |   11 ++++
 drivers/base/power/sysfs.c                    |   24 ++++++++++
 drivers/base/power/wakeup.c                   |   61 ++++++++++++++++++++++++--
 include/linux/pm_wakeup.h                     |    4 +
 include/linux/suspend.h                       |    1 
 kernel/power/autosleep.c                      |    2 
 6 files changed, 99 insertions(+), 4 deletions(-)

Index: linux/include/linux/pm_wakeup.h
===================================================================
--- linux.orig/include/linux/pm_wakeup.h
+++ linux/include/linux/pm_wakeup.h
@@ -34,6 +34,7 @@
  * @total_time: Total time this wakeup source has been active.
  * @max_time: Maximum time this wakeup source has been continuously active.
  * @last_time: Monotonic clock when the wakeup source's was touched last time.
+ * @prevent_sleep_time: Total time this source has been preventing autosleep.
  * @event_count: Number of signaled wakeup events.
  * @active_count: Number of times the wakeup sorce was activated.
  * @relax_count: Number of times the wakeup sorce was deactivated.
@@ -51,12 +52,15 @@ struct wakeup_source {
 	ktime_t total_time;
 	ktime_t max_time;
 	ktime_t last_time;
+	ktime_t start_prevent_time;
+	ktime_t prevent_sleep_time;
 	unsigned long		event_count;
 	unsigned long		active_count;
 	unsigned long		relax_count;
 	unsigned long		expire_count;
 	unsigned long		wakeup_count;
 	bool			active:1;
+	bool			autosleep_enabled:1;
 };
 
 #ifdef CONFIG_PM_SLEEP
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -377,6 +377,8 @@ static void wakeup_source_activate(struc
 	ws->active = true;
 	ws->active_count++;
 	ws->last_time = ktime_get();
+	if (ws->autosleep_enabled)
+		ws->start_prevent_time = ws->last_time;
 
 	/* Increment the counter of events in progress. */
 	atomic_inc(&combined_event_count);
@@ -444,6 +446,17 @@ void pm_stay_awake(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(pm_stay_awake);
 
+#ifdef CONFIG_PM_AUTOSLEEP
+static void update_prevent_sleep_time(struct wakeup_source *ws, ktime_t now)
+{
+	ktime_t delta = ktime_sub(now, ws->start_prevent_time);
+	ws->prevent_sleep_time = ktime_add(ws->prevent_sleep_time, delta);
+}
+#else
+static inline void update_prevent_sleep_time(struct wakeup_source *ws,
+					     ktime_t now) {}
+#endif
+
 /**
  * wakup_source_deactivate - Mark given wakeup source as inactive.
  * @ws: Wakeup source to handle.
@@ -485,6 +498,9 @@ static void wakeup_source_deactivate(str
 	del_timer(&ws->timer);
 	ws->timer_expires = 0;
 
+	if (ws->autosleep_enabled)
+		update_prevent_sleep_time(ws, now);
+
 	/*
 	 * Increment the counter of registered wakeup events and decrement the
 	 * couter of wakeup events in progress simultaneously.
@@ -714,6 +730,34 @@ bool pm_save_wakeup_count(unsigned int c
 	return events_check_enabled;
 }
 
+#ifdef CONFIG_PM_AUTOSLEEP
+/**
+ * pm_wakep_autosleep_enabled - Modify autosleep_enabled for all wakeup sources.
+ * @enabled: Whether to set or to clear the autosleep_enabled flags.
+ */
+void pm_wakep_autosleep_enabled(bool set)
+{
+	struct wakeup_source *ws;
+	ktime_t now = ktime_get();
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ws, &wakeup_sources, entry) {
+		spin_lock_irq(&ws->lock);
+		if (ws->autosleep_enabled != set) {
+			ws->autosleep_enabled = set;
+			if (ws->active) {
+				if (set)
+					ws->start_prevent_time = now;
+				else
+					update_prevent_sleep_time(ws, now);
+			}
+		}
+		spin_unlock_irq(&ws->lock);
+	}
+	rcu_read_unlock();
+}
+#endif /* CONFIG_PM_AUTOSLEEP */
+
 static struct dentry *wakeup_sources_stats_dentry;
 
 /**
@@ -729,28 +773,37 @@ static int print_wakeup_source_stats(str
 	ktime_t max_time;
 	unsigned long active_count;
 	ktime_t active_time;
+	ktime_t prevent_sleep_time;
 	int ret;
 
 	spin_lock_irqsave(&ws->lock, flags);
 
 	total_time = ws->total_time;
 	max_time = ws->max_time;
+	prevent_sleep_time = ws->prevent_sleep_time;
 	active_count = ws->active_count;
 	if (ws->active) {
-		active_time = ktime_sub(ktime_get(), ws->last_time);
+		ktime_t now = ktime_get();
+
+		active_time = ktime_sub(now, ws->last_time);
 		total_time = ktime_add(total_time, active_time);
 		if (active_time.tv64 > max_time.tv64)
 			max_time = active_time;
+
+		if (ws->autosleep_enabled)
+			prevent_sleep_time = ktime_add(prevent_sleep_time,
+				ktime_sub(now, ws->start_prevent_time));
 	} else {
 		active_time = ktime_set(0, 0);
 	}
 
 	ret = seq_printf(m, "%-12s\t%lu\t\t%lu\t\t%lu\t\t%lu\t\t"
-			"%lld\t\t%lld\t\t%lld\t\t%lld\n",
+			"%lld\t\t%lld\t\t%lld\t\t%lld\t\t%lld\n",
 			ws->name, active_count, ws->event_count,
 			ws->wakeup_count, ws->expire_count,
 			ktime_to_ms(active_time), ktime_to_ms(total_time),
-			ktime_to_ms(max_time), ktime_to_ms(ws->last_time));
+			ktime_to_ms(max_time), ktime_to_ms(ws->last_time),
+			ktime_to_ms(prevent_sleep_time));
 
 	spin_unlock_irqrestore(&ws->lock, flags);
 
@@ -767,7 +820,7 @@ static int wakeup_sources_stats_show(str
 
 	seq_puts(m, "name\t\tactive_count\tevent_count\twakeup_count\t"
 		"expire_count\tactive_since\ttotal_time\tmax_time\t"
-		"last_change\n");
+		"last_change\tprevent_suspend_time\n");
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(ws, &wakeup_sources, entry)
Index: linux/include/linux/suspend.h
===================================================================
--- linux.orig/include/linux/suspend.h
+++ linux/include/linux/suspend.h
@@ -358,6 +358,7 @@ extern bool events_check_enabled;
 extern bool pm_wakeup_pending(void);
 extern bool pm_get_wakeup_count(unsigned int *count, bool block);
 extern bool pm_save_wakeup_count(unsigned int count);
+extern void pm_wakep_autosleep_enabled(bool set);
 
 static inline void lock_system_sleep(void)
 {
Index: linux/kernel/power/autosleep.c
===================================================================
--- linux.orig/kernel/power/autosleep.c
+++ linux/kernel/power/autosleep.c
@@ -73,8 +73,10 @@ int pm_autosleep_set_state(suspend_state
 	mutex_lock(&autosleep_lock);
 	if (state == PM_SUSPEND_ON && autosleep_state != PM_SUSPEND_ON) {
 		autosleep_state = PM_SUSPEND_ON;
+		pm_wakep_autosleep_enabled(false);
 	} else if (state > PM_SUSPEND_ON) {
 		autosleep_state = state;
+		pm_wakep_autosleep_enabled(true);
 		queue_up_suspend_work();
 	}
 	mutex_unlock(&autosleep_lock);
Index: linux/drivers/base/power/sysfs.c
===================================================================
--- linux.orig/drivers/base/power/sysfs.c
+++ linux/drivers/base/power/sysfs.c
@@ -391,6 +391,27 @@ static ssize_t wakeup_last_time_show(str
 }
 
 static DEVICE_ATTR(wakeup_last_time_ms, 0444, wakeup_last_time_show, NULL);
+
+#ifdef CONFIG_PM_AUTOSLEEP
+static ssize_t wakeup_prevent_sleep_time_show(struct device *dev,
+					      struct device_attribute *attr,
+					      char *buf)
+{
+	s64 msec = 0;
+	bool enabled = false;
+
+	spin_lock_irq(&dev->power.lock);
+	if (dev->power.wakeup) {
+		msec = ktime_to_ms(dev->power.wakeup->prevent_sleep_time);
+		enabled = true;
+	}
+	spin_unlock_irq(&dev->power.lock);
+	return enabled ? sprintf(buf, "%lld\n", msec) : sprintf(buf, "\n");
+}
+
+static DEVICE_ATTR(wakeup_prevent_sleep_time_ms, 0444,
+		   wakeup_prevent_sleep_time_show, NULL);
+#endif /* CONFIG_PM_AUTOSLEEP */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_ADVANCED_DEBUG
@@ -485,6 +506,9 @@ static struct attribute *wakeup_attrs[]
 	&dev_attr_wakeup_total_time_ms.attr,
 	&dev_attr_wakeup_max_time_ms.attr,
 	&dev_attr_wakeup_last_time_ms.attr,
+#ifdef CONFIG_PM_AUTOSLEEP
+	&dev_attr_wakeup_prevent_sleep_time_ms.attr,
+#endif
 #endif
 	NULL,
 };
Index: linux/Documentation/ABI/testing/sysfs-devices-power
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-devices-power
+++ linux/Documentation/ABI/testing/sysfs-devices-power
@@ -158,6 +158,17 @@ Description:
 		not enabled to wake up the system from sleep states, this
 		attribute is not present.
 
+What:		/sys/devices/.../power/wakeup_prevent_sleep_time_ms
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/devices/.../wakeup_prevent_sleep_time_ms attribute
+		contains the total time the device has been preventing
+		opportunistic transitions to sleep states from occuring.
+		This attribute is read-only.  If the device is not enabled to
+		wake up the system from sleep states, this attribute is not
+		present.
+
 What:		/sys/devices/.../power/autosuspend_delay_ms
 Date:		September 2010
 Contact:	Alan Stern <stern@rowland.harvard.edu>


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 7/7] PM / Sleep: Add user space interface for manipulating wakeup sources
  2012-02-21 23:31 ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 Rafael J. Wysocki
                     ` (5 preceding siblings ...)
  2012-02-21 23:36   ` [RFC][PATCH 6/7] PM / Sleep: Add "prevent autosleep time" statistics to wakeup sources Rafael J. Wysocki
@ 2012-02-21 23:37   ` Rafael J. Wysocki
  2012-02-22  4:49   ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 John Stultz
  2012-04-22 21:19   ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Rafael J. Wysocki
  8 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-21 23:37 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

From: Rafael J. Wysocki <rjw@sisk.pl>

Android allows user space to manipulate wakelocks using two
sysfs file located in /sys/power/, wake_lock and wake_unlock.
Writing a wakelock name and optionally a timeout to the wake_lock
file causes the wakelock whose name was written to be acquired (it
is created before is necessary), optionally with the given timeout.
Writing the name of a wakelock to wake_unlock causes that wakelock
to be released.

Implement an analogous interface for user space using wakeup sources.
Add the /sys/power/wake_lock and /sys/power/wake_unlock files
allowing user space to create, activate and deactivate wakeup
sources, such that writing a name and optionally a timeout to
wake_lock causes the wakeup source of that name to be activated,
optionally with the given timeout.  If that wakeup source doesn't
exist, it will be created and then activated.  Writing a name to
wake_unlock causes the wakeup source of that name, if there is one,
to be deactivated.  Wakeup sources created with the help of
wake_lock that haven't been used for more than 5 minutes are garbage
collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
wakeup sources created with the help of wake_lock present at a time.

The data type used to track wakeup sources created by user space is
called "struct wakelock" to indicate the origins of this feature.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/ABI/testing/sysfs-power |   42 ++++++
 drivers/base/power/wakeup.c           |    1 
 kernel/power/Kconfig                  |    8 +
 kernel/power/Makefile                 |    1 
 kernel/power/main.c                   |   41 ++++++
 kernel/power/power.h                  |    9 +
 kernel/power/wakelock.c               |  218 ++++++++++++++++++++++++++++++++++
 7 files changed, 320 insertions(+)

Index: linux/kernel/power/main.c
===================================================================
--- linux.orig/kernel/power/main.c
+++ linux/kernel/power/main.c
@@ -422,6 +422,43 @@ static ssize_t autosleep_store(struct ko
 
 power_attr(autosleep);
 #endif /* CONFIG_PM_AUTOSLEEP */
+
+#ifdef CONFIG_PM_WAKELOCKS
+static ssize_t wake_lock_show(struct kobject *kobj,
+			      struct kobj_attribute *attr,
+			      char *buf)
+{
+	return pm_show_wakelocks(buf, true);
+}
+
+static ssize_t wake_lock_store(struct kobject *kobj,
+			       struct kobj_attribute *attr,
+			       const char *buf, size_t n)
+{
+	int error = pm_wake_lock(buf);
+	return error ? error : n;
+}
+
+power_attr(wake_lock);
+
+static ssize_t wake_unlock_show(struct kobject *kobj,
+				struct kobj_attribute *attr,
+				char *buf)
+{
+	return pm_show_wakelocks(buf, false);
+}
+
+static ssize_t wake_unlock_store(struct kobject *kobj,
+				 struct kobj_attribute *attr,
+				 const char *buf, size_t n)
+{
+	int error = pm_wake_unlock(buf);
+	return error ? error : n;
+}
+
+power_attr(wake_unlock);
+
+#endif /* CONFIG_PM_WAKELOCKS */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_TRACE
@@ -478,6 +515,10 @@ static struct attribute * g[] = {
 #ifdef CONFIG_PM_AUTOSLEEP
 	&autosleep_attr.attr,
 #endif
+#ifdef CONFIG_PM_WAKELOCKS
+	&wake_lock_attr.attr,
+	&wake_unlock_attr.attr,
+#endif
 #ifdef CONFIG_PM_DEBUG
 	&pm_test_attr.attr,
 #endif
Index: linux/kernel/power/power.h
===================================================================
--- linux.orig/kernel/power/power.h
+++ linux/kernel/power/power.h
@@ -282,3 +282,12 @@ static inline void pm_autosleep_unlock(v
 static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
 
 #endif /* !CONFIG_PM_AUTOSLEEP */
+
+#ifdef CONFIG_PM_WAKELOCKS
+
+/* kernel/power/wakelock.c */
+extern ssize_t pm_show_wakelocks(char *buf, bool show_active);
+extern int pm_wake_lock(const char *buf);
+extern int pm_wake_unlock(const char *buf);
+
+#endif /* !CONFIG_PM_WAKELOCKS */
Index: linux/kernel/power/Kconfig
===================================================================
--- linux.orig/kernel/power/Kconfig
+++ linux/kernel/power/Kconfig
@@ -111,6 +111,14 @@ config PM_AUTOSLEEP
 	Allow the kernel to trigger a system transition into a global sleep
 	state automatically whenever there are no active wakeup sources.
 
+config PM_WAKELOCKS
+	bool "User space wakeup sources interface"
+	depends on PM_SLEEP
+	default n
+	---help---
+	Allow user space to create, activate and deactivate wakeup source
+	objects with the help of a sysfs-based interface.
+
 config PM_RUNTIME
 	bool "Run-time PM core functionality"
 	depends on !IA64_HP_SIM
Index: linux/kernel/power/wakelock.c
===================================================================
--- /dev/null
+++ linux/kernel/power/wakelock.c
@@ -0,0 +1,218 @@
+/*
+ * kernel/power/wakelock.c
+ *
+ * User space wakeup sources support.
+ *
+ * Copyright (C) 2012 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This code is based on the analogous interface allowing user space to
+ * manipulate wakelocks on Android.
+ */
+
+#include <linux/ctype.h>
+#include <linux/device.h>
+#include <linux/err.h>
+#include <linux/hrtimer.h>
+#include <linux/list.h>
+#include <linux/rbtree.h>
+#include <linux/slab.h>
+
+#define WL_NUMBER_LIMIT	100
+#define WL_GC_COUNT_MAX	100
+#define WL_GC_TIME_SEC	300
+
+static DEFINE_MUTEX(wakelocks_lock);
+
+struct wakelock {
+	char			*name;
+	struct rb_node		node;
+	struct wakeup_source	ws;
+	struct list_head	lru;
+};
+
+static struct rb_root wakelocks_tree = RB_ROOT;
+static LIST_HEAD(wakelocks_lru_list);
+static unsigned int number_of_wakelocks;
+static unsigned int wakelocks_gc_count;
+
+ssize_t pm_show_wakelocks(char *buf, bool show_active)
+{
+	struct rb_node *node;
+	struct wakelock *wl;
+	char *str = buf;
+	char *end = buf + PAGE_SIZE;
+
+	mutex_lock(&wakelocks_lock);
+
+	for (node = rb_first(&wakelocks_tree); node; node = rb_next(node)) {
+		bool active;
+
+		wl = rb_entry(node, struct wakelock, node);
+		spin_lock_irq(&wl->ws.lock);
+		active = wl->ws.active;
+		spin_unlock_irq(&wl->ws.lock);
+		if (active == show_active)
+			str += scnprintf(str, end - str, "%s ", wl->name);
+	}
+	str += scnprintf(str, end - str, "\n");
+
+	mutex_unlock(&wakelocks_lock);
+	return (str - buf);
+}
+
+static struct wakelock *wakelock_lookup_add(const char *name, size_t len,
+					    bool add_if_not_found)
+{
+	struct rb_node **node = &wakelocks_tree.rb_node;
+	struct rb_node *parent = *node;
+	struct wakelock *wl;
+
+	while (*node) {
+		int diff;
+
+		wl = rb_entry(*node, struct wakelock, node);
+		diff = strncmp(name, wl->name, len);
+		if (diff == 0) {
+			if (wl->name[len])
+				diff = -1;
+			else
+				return wl;
+		}
+		if (diff < 0)
+			node = &(*node)->rb_left;
+		else
+			node = &(*node)->rb_right;
+
+		parent = *node;
+	}
+	if (!add_if_not_found)
+		return ERR_PTR(-EINVAL);
+
+	if (number_of_wakelocks > WL_NUMBER_LIMIT)
+		return ERR_PTR(-ENOSPC);
+
+	/* Not found, we have to add a new one. */
+	wl = kzalloc(sizeof(*wl), GFP_KERNEL);
+	if (!wl)
+		return ERR_PTR(-ENOMEM);
+
+	wl->name = kstrndup(name, len, GFP_KERNEL);
+	if (!wl->name) {
+		kfree(wl);
+		return ERR_PTR(-ENOMEM);
+	}
+	wl->ws.name = wl->name;
+	wakeup_source_add(&wl->ws);
+	rb_link_node(&wl->node, parent, node);
+	rb_insert_color(&wl->node, &wakelocks_tree);
+	list_add(&wl->lru, &wakelocks_lru_list);
+	number_of_wakelocks++;
+	return wl;
+}
+
+int pm_wake_lock(const char *buf)
+{
+	const char *str = buf;
+	struct wakelock *wl;
+	u64 timeout_ns = 0;
+	size_t len;
+	int ret = 0;
+
+	while (*str && !isspace(*str))
+		str++;
+
+	len = str - buf;
+	if (!len)
+		return -EINVAL;
+
+	if (*str && *str != '\n') {
+		/* Find out if there's a valid timeout string appended. */
+		ret = kstrtou64(skip_spaces(str), 10, &timeout_ns);
+		if (ret)
+			return -EINVAL;
+	}
+
+	mutex_lock(&wakelocks_lock);
+
+	wl = wakelock_lookup_add(buf, len, true);
+	if (IS_ERR(wl)) {
+		ret = PTR_ERR(wl);
+		goto out;
+	}
+	if (timeout_ns) {
+		u64 timeout_ms = timeout_ns + NSEC_PER_MSEC - 1;
+
+		do_div(timeout_ms, NSEC_PER_MSEC);
+		__pm_wakeup_event(&wl->ws, timeout_ms);
+	} else {
+		__pm_stay_awake(&wl->ws);
+	}
+
+	list_move(&wl->lru, &wakelocks_lru_list);
+
+ out:
+	mutex_unlock(&wakelocks_lock);
+	return ret;
+}
+
+static void wakelocks_gc(void)
+{
+	struct wakelock *wl, *aux;
+	ktime_t now = ktime_get();
+
+	list_for_each_entry_safe_reverse(wl, aux, &wakelocks_lru_list, lru) {
+		u64 idle_time_ns;
+		bool active;
+
+		spin_lock_irq(&wl->ws.lock);
+		idle_time_ns = ktime_to_ns(ktime_sub(now, wl->ws.last_time));
+		active = wl->ws.active;
+		spin_unlock_irq(&wl->ws.lock);
+
+		if (idle_time_ns < ((u64)WL_GC_TIME_SEC * NSEC_PER_SEC))
+			break;
+
+		if (!active) {
+			wakeup_source_remove(&wl->ws);
+			rb_erase(&wl->node, &wakelocks_tree);
+			list_del(&wl->lru);
+			kfree(wl->name);
+			kfree(wl);
+			number_of_wakelocks--;
+		}
+	}
+	wakelocks_gc_count = 0;
+}
+
+int pm_wake_unlock(const char *buf)
+{
+	struct wakelock *wl;
+	size_t len;
+	int ret = 0;
+
+	len = strlen(buf);
+	if (!len)
+		return -EINVAL;
+
+	if (buf[len-1] == '\n')
+		len--;
+
+	if (!len)
+		return -EINVAL;
+
+	mutex_lock(&wakelocks_lock);
+
+	wl = wakelock_lookup_add(buf, len, false);
+	if (IS_ERR(wl)) {
+		ret = PTR_ERR(wl);
+		goto out;
+	}
+	__pm_relax(&wl->ws);
+	list_move(&wl->lru, &wakelocks_lru_list);
+	if (++wakelocks_gc_count > WL_GC_COUNT_MAX)
+		wakelocks_gc();
+
+ out:
+	mutex_unlock(&wakelocks_lock);
+	return ret;
+}
Index: linux/kernel/power/Makefile
===================================================================
--- linux.orig/kernel/power/Makefile
+++ linux/kernel/power/Makefile
@@ -10,5 +10,6 @@ obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend
 obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
 				   block_io.o
 obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
+obj-$(CONFIG_PM_WAKELOCKS)	+= wakelock.o
 
 obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -132,6 +132,7 @@ void wakeup_source_add(struct wakeup_sou
 	spin_lock_init(&ws->lock);
 	setup_timer(&ws->timer, pm_wakeup_timer_fn, (unsigned long)ws);
 	ws->active = false;
+	ws->last_time = ktime_get();
 
 	spin_lock_irq(&events_lock);
 	list_add_rcu(&ws->entry, &wakeup_sources);
Index: linux/Documentation/ABI/testing/sysfs-power
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-power
+++ linux/Documentation/ABI/testing/sysfs-power
@@ -189,3 +189,45 @@ Description:
 
 		Reading from this file causes the last string successfully
 		written to it to be displayed.
+
+What:		/sys/power/wake_lock
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/wake_lock file allows user space to create
+		wakeup source objects and activate them on demand (if one of
+		those wakeup sources is active, reads from the
+		/sys/power/wakeup_count file block or return false).  When a
+		string without white space is written to /sys/power/wake_lock,
+		it will be assumed to represent a wakeup source name.  If there
+		is a wakeup source object with that name, it will be activated
+		(unless active already).  Otherwise, a new wakeup source object
+		will be registered, assigned the given name and activated.
+		If a string written to /sys/power/wake_lock contains white
+		space, the part of the string preceding the white space will be
+		regarded as a wakeup source name and handled as descrived above.
+		The other part of the string will be regarded as a timeout (in
+		nanoseconds) such that the wakeup source will be automatically
+		deactivated after it has expired.  The timeout, if present, is
+		set regardless of the current state of the wakeup source object
+		in question.
+
+		Reads from this file return a string consisting of the names of
+		wakeup sources created with the help of it that are active at
+		the moment, separated with spaces.
+
+
+What:		/sys/power/wake_unlock
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/wake_unlock file allows user space to deactivate
+		wakeup sources created with the help of /sys/power/wake_lock.
+		When a string is written to /sys/power/wake_unlock, it will be
+		assumed to represent the name of a wakeup source to deactivate.
+		If a wakeup source object of that name exists and is active at
+		the moment, it will be deactivated.
+
+		Reads from this file return a string consisting of the names of
+		wakeup sources created with the help of /sys/power/wake_lock
+		that are inactive at the moment, separated with spaces.


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2
  2012-02-21 23:31 ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 Rafael J. Wysocki
                     ` (6 preceding siblings ...)
  2012-02-21 23:37   ` [RFC][PATCH 7/7] PM / Sleep: Add user space interface for manipulating " Rafael J. Wysocki
@ 2012-02-22  4:49   ` John Stultz
  2012-02-22  8:44     ` Srivatsa S. Bhat
  2012-04-22 21:19   ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Rafael J. Wysocki
  8 siblings, 1 reply; 129+ messages in thread
From: John Stultz @ 2012-02-22  4:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, Brian Swetland, Neil Brown,
	Alan Stern, Dmitry Torokhov

On Wed, 2012-02-22 at 00:31 +0100, Rafael J. Wysocki wrote:
> Hi all,
> 
> After the feedback so far I've decided to follow up with a refreshed patchset.
> The first two patches from the previous one went to linux-pm/linux-next
> and I included the recent evdev patch from Arve (with some modifications)
> to this patchset for completness.

Hey Rafael, 
	Thanks again for posting this! I've started playing around with it in a
kvm environment, and got the following warning after echoing off >
autosleep:
...
PM: resume of devices complete after 185.615 msecs
PM: Finishing wakeup.
Restarting tasks ... done.
PM: Syncing filesystems ... done.
PM: Preparing system for mem sleep
Freezing user space processes ... 
Freezing of tasks failed after 20.01 seconds (1 tasks refusing to freeze, wq_busy=0):
bash            D ffff880015714010 
===============================
[ INFO: suspicious RCU usage. ]
3.3.0-rc3john+ #131 Not tainted
-------------------------------
kernel/sched/core.c:4784 suspicious rcu_dereference_check() usage!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 0
5 locks held by kworker/u:1/10:
 #0:  (autosleep){.+.+.+}, at: [<ffffffff81066db8>] process_one_work+0x2d8/0x8c0
 #1:  (suspend_work){+.+.+.}, at: [<ffffffff81066db8>] process_one_work+0x2d8/0x8c0
 #2:  (autosleep_lock){+.+.+.}, at: [<ffffffff810a2d3d>] try_to_suspend+0x2d/0xe0
 #3:  (pm_mutex){+.+.+.}, at: [<ffffffff8109b9fc>] pm_suspend+0x8c/0x210
 #4:  (tasklist_lock){.+.+..}, at: [<ffffffff8109b0f1>] try_to_freeze_tasks+0x2d1/0x400

stack backtrace:
Pid: 10, comm: kworker/u:1 Not tainted 3.3.0-rc3john+ #131
Call Trace:
 [<ffffffff81040d82>] ? vprintk+0x242/0x530
 [<ffffffff810b0fdb>] lockdep_rcu_suspicious+0xeb/0x100
 [<ffffffff81083371>] sched_show_task+0x121/0x180
 [<ffffffff8109b1e5>] try_to_freeze_tasks+0x3c5/0x400
 [<ffffffff810a2d10>] ? pm_autosleep_set_state+0x80/0x80
 [<ffffffff8109b2eb>] freeze_processes+0x3b/0xb0
 [<ffffffff8109baad>] pm_suspend+0x13d/0x210
 [<ffffffff810a2d5d>] try_to_suspend+0x4d/0xe0
 [<ffffffff81066f02>] process_one_work+0x422/0x8c0
 [<ffffffff81066db8>] ? process_one_work+0x2d8/0x8c0
 [<ffffffff810b063e>] ? put_lock_stats+0xe/0x40
 [<ffffffff81067a16>] worker_thread+0x476/0x550
 [<ffffffff810675a0>] ? rescuer_thread+0x200/0x200
 [<ffffffff810706fe>] kthread+0xae/0xc0
 [<ffffffff81af4cb4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81af3078>] ? retint_restore_args+0x13/0x13
 [<ffffffff81070650>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff81af4cb0>] ? gs_change+0x13/0x13
    0  1981   1980 0x00020004
 ffff880015715d88 0000000000000046 ffff880015715c88 ffffffff8102c22b
 ffff880015714010 ffff880015715fd8 ffff880015714010 ffff880015714000
 ffff880015715fd8 ffff880015714000 ffff880015c4e3c0 ffff88001342e540
Call Trace:
 [<ffffffff8102c22b>] ? kvm_clock_read+0x6b/0x90
 [<ffffffff810b1f2d>] ? mark_held_locks+0xad/0x150
 [<ffffffff81af10bf>] schedule+0x3f/0x60
 [<ffffffff81aef33b>] mutex_lock_nested+0x1cb/0x4c0
 [<ffffffff810a2cae>] ? pm_autosleep_set_state+0x1e/0x80
 [<ffffffff810a2cae>] ? pm_autosleep_set_state+0x1e/0x80
 [<ffffffff810a2cae>] pm_autosleep_set_state+0x1e/0x80
 [<ffffffff8109a74b>] autosleep_store+0x3b/0x80
 [<ffffffff813856e7>] kobj_attr_store+0x17/0x20
 [<ffffffff81200dcc>] sysfs_write_file+0xec/0x170
 [<ffffffff8118085f>] vfs_write+0x11f/0x1b0
 [<ffffffff811809f4>] sys_write+0x54/0xa0
 [<ffffffff81af4e66>] sysenter_dispatch+0x7/0x26
 [<ffffffff8139238e>] ? trace_hardirqs_on_thunk+0x3a/0x3f

Restarting tasks ... done.




^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2
  2012-02-22  4:49   ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 John Stultz
@ 2012-02-22  8:44     ` Srivatsa S. Bhat
  2012-02-22 22:10       ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2 Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Srivatsa S. Bhat @ 2012-02-22  8:44 UTC (permalink / raw)
  To: John Stultz
  Cc: Rafael J. Wysocki, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg,
	Brian Swetland, Neil Brown, Alan Stern, Dmitry Torokhov

On 02/22/2012 10:19 AM, John Stultz wrote:

> On Wed, 2012-02-22 at 00:31 +0100, Rafael J. Wysocki wrote:
>> Hi all,
>>
>> After the feedback so far I've decided to follow up with a refreshed patchset.
>> The first two patches from the previous one went to linux-pm/linux-next
>> and I included the recent evdev patch from Arve (with some modifications)
>> to this patchset for completness.
> 
> Hey Rafael, 
> 	Thanks again for posting this! I've started playing around with it in a
> kvm environment, and got the following warning after echoing off >
> autosleep:
> ...
> PM: resume of devices complete after 185.615 msecs
> PM: Finishing wakeup.
> Restarting tasks ... done.
> PM: Syncing filesystems ... done.
> PM: Preparing system for mem sleep
> Freezing user space processes ... 
> Freezing of tasks failed after 20.01 seconds (1 tasks refusing to freeze, wq_busy=0):
> bash            D ffff880015714010 


Ah.. I think I know what is the problem here..

The kernel was freezing userspace processes and meanwhile, you wrote "off"
to autosleep. So, as a result, this userspace process (bash) just now
entered kernel mode. Unfortunately, the autosleep_lock is held for too long,
that is, something like:

acquire autosleep_lock
modify autosleep_state
                               <============== "A"
 pm_suspend or hibernate()

release autosleep_lock

At point marked "A", we should have released the autosleep lock and only then
entered pm_suspend or hibernate(). Since the current code holds the lock and
enters suspend/hibernate, the userspace process that wrote "off" to autosleep
(or even userspace process that writes to /sys/power/state will end up waiting
on autosleep_lock, thus failing the freezing operation.)

So the solution is to always release the autosleep lock before entering
suspend/hibernation.

 
Regards,
Srivatsa S. Bhat

> ===============================
> [ INFO: suspicious RCU usage. ]
> 3.3.0-rc3john+ #131 Not tainted
> -------------------------------
> kernel/sched/core.c:4784 suspicious rcu_dereference_check() usage!
> 
> other info that might help us debug this:
> 
> 
> rcu_scheduler_active = 1, debug_locks = 0
> 5 locks held by kworker/u:1/10:
>  #0:  (autosleep){.+.+.+}, at: [<ffffffff81066db8>] process_one_work+0x2d8/0x8c0
>  #1:  (suspend_work){+.+.+.}, at: [<ffffffff81066db8>] process_one_work+0x2d8/0x8c0
>  #2:  (autosleep_lock){+.+.+.}, at: [<ffffffff810a2d3d>] try_to_suspend+0x2d/0xe0
>  #3:  (pm_mutex){+.+.+.}, at: [<ffffffff8109b9fc>] pm_suspend+0x8c/0x210
>  #4:  (tasklist_lock){.+.+..}, at: [<ffffffff8109b0f1>] try_to_freeze_tasks+0x2d1/0x400
> 
> stack backtrace:
> Pid: 10, comm: kworker/u:1 Not tainted 3.3.0-rc3john+ #131
> Call Trace:
>  [<ffffffff81040d82>] ? vprintk+0x242/0x530
>  [<ffffffff810b0fdb>] lockdep_rcu_suspicious+0xeb/0x100
>  [<ffffffff81083371>] sched_show_task+0x121/0x180
>  [<ffffffff8109b1e5>] try_to_freeze_tasks+0x3c5/0x400
>  [<ffffffff810a2d10>] ? pm_autosleep_set_state+0x80/0x80
>  [<ffffffff8109b2eb>] freeze_processes+0x3b/0xb0
>  [<ffffffff8109baad>] pm_suspend+0x13d/0x210
>  [<ffffffff810a2d5d>] try_to_suspend+0x4d/0xe0
>  [<ffffffff81066f02>] process_one_work+0x422/0x8c0
>  [<ffffffff81066db8>] ? process_one_work+0x2d8/0x8c0
>  [<ffffffff810b063e>] ? put_lock_stats+0xe/0x40
>  [<ffffffff81067a16>] worker_thread+0x476/0x550
>  [<ffffffff810675a0>] ? rescuer_thread+0x200/0x200
>  [<ffffffff810706fe>] kthread+0xae/0xc0
>  [<ffffffff81af4cb4>] kernel_thread_helper+0x4/0x10
>  [<ffffffff81af3078>] ? retint_restore_args+0x13/0x13
>  [<ffffffff81070650>] ? __init_kthread_worker+0x70/0x70
>  [<ffffffff81af4cb0>] ? gs_change+0x13/0x13
>     0  1981   1980 0x00020004
>  ffff880015715d88 0000000000000046 ffff880015715c88 ffffffff8102c22b
>  ffff880015714010 ffff880015715fd8 ffff880015714010 ffff880015714000
>  ffff880015715fd8 ffff880015714000 ffff880015c4e3c0 ffff88001342e540
> Call Trace:
>  [<ffffffff8102c22b>] ? kvm_clock_read+0x6b/0x90
>  [<ffffffff810b1f2d>] ? mark_held_locks+0xad/0x150
>  [<ffffffff81af10bf>] schedule+0x3f/0x60
>  [<ffffffff81aef33b>] mutex_lock_nested+0x1cb/0x4c0
>  [<ffffffff810a2cae>] ? pm_autosleep_set_state+0x1e/0x80
>  [<ffffffff810a2cae>] ? pm_autosleep_set_state+0x1e/0x80
>  [<ffffffff810a2cae>] pm_autosleep_set_state+0x1e/0x80
>  [<ffffffff8109a74b>] autosleep_store+0x3b/0x80
>  [<ffffffff813856e7>] kobj_attr_store+0x17/0x20
>  [<ffffffff81200dcc>] sysfs_write_file+0xec/0x170
>  [<ffffffff8118085f>] vfs_write+0x11f/0x1b0
>  [<ffffffff811809f4>] sys_write+0x54/0xa0
>  [<ffffffff81af4e66>] sysenter_dispatch+0x7/0x26
>  [<ffffffff8139238e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> 
> Restarting tasks ... done.
> 
> 
> 


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 5/7] PM / Sleep: Implement opportunistic sleep
  2012-02-21 23:35   ` [RFC][PATCH 5/7] PM / Sleep: Implement opportunistic sleep Rafael J. Wysocki
@ 2012-02-22  8:45     ` Srivatsa S. Bhat
  2012-02-22 22:10       ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Srivatsa S. Bhat @ 2012-02-22  8:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

On 02/22/2012 05:05 AM, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Introduce a mechanism by which the kernel can trigger global
> transitions to a sleep state chosen by user space if there are no
> active wakeup sources.
> 
> It consists of a new sysfs attribute, /sys/power/autosleep, that
> can be written one of the strings returned by reads from
> /sys/power/state, an ordered workqueue and a work item carrying out
> the "suspend" operations.  If a string representing the system's
> sleep state is written to /sys/power/autosleep, the work item
> triggering transitions to that state is queued up and it requeues
> itself after every execution until user space writes "off" to
> /sys/power/autosleep.
> 
> That work item enables the detection of wakeup events using the
> functions already defined in drivers/base/power/wakeup.c (with one
> small modification) and calls either pm_suspend(), or hibernate() to
> put the system into a sleep state.  If a wakeup event is reported
> while the transition is in progress, it will abort the transition and
> the "system suspend" work item will be queued up again.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
>  Documentation/ABI/testing/sysfs-power |   17 +++++
>  drivers/base/power/wakeup.c           |   38 ++++++-----
>  include/linux/suspend.h               |   13 +++-
>  kernel/power/Kconfig                  |    8 ++
>  kernel/power/Makefile                 |    1 
>  kernel/power/autosleep.c              |   98 ++++++++++++++++++++++++++++++
>  kernel/power/main.c                   |  108 ++++++++++++++++++++++++++++------
>  kernel/power/power.h                  |   18 +++++
>  8 files changed, 266 insertions(+), 35 deletions(-)
> 
> Index: linux/kernel/power/Makefile
> ===================================================================
> --- linux.orig/kernel/power/Makefile
> +++ linux/kernel/power/Makefile
> @@ -9,5 +9,6 @@ obj-$(CONFIG_SUSPEND)		+= suspend.o
>  obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend_test.o
>  obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
>  				   block_io.o
> +obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
> 
>  obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
> Index: linux/kernel/power/Kconfig
> ===================================================================
> --- linux.orig/kernel/power/Kconfig
> +++ linux/kernel/power/Kconfig
> @@ -103,6 +103,14 @@ config PM_SLEEP_SMP
>  	select HOTPLUG
>  	select HOTPLUG_CPU
> 
> +config PM_AUTOSLEEP
> +	bool "Opportunistic sleep"
> +	depends on PM_SLEEP
> +	default n
> +	---help---
> +	Allow the kernel to trigger a system transition into a global sleep
> +	state automatically whenever there are no active wakeup sources.
> +
>  config PM_RUNTIME
>  	bool "Run-time PM core functionality"
>  	depends on !IA64_HP_SIM
> Index: linux/kernel/power/power.h
> ===================================================================
> --- linux.orig/kernel/power/power.h
> +++ linux/kernel/power/power.h
> @@ -264,3 +264,21 @@ static inline void suspend_thaw_processe
>  {
>  }
>  #endif
> +
> +#ifdef CONFIG_PM_AUTOSLEEP
> +
> +/* kernel/power/autosleep.c */
> +extern int pm_autosleep_init(void);
> +extern void pm_autosleep_lock(void);
> +extern void pm_autosleep_unlock(void);
> +extern suspend_state_t pm_autosleep_state(void);
> +extern int pm_autosleep_set_state(suspend_state_t state);
> +
> +#else /* !CONFIG_PM_AUTOSLEEP */
> +
> +static inline int pm_autosleep_init(void) { return 0; }
> +static inline void pm_autosleep_lock(void) {}
> +static inline void pm_autosleep_unlock(void) {}
> +static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
> +
> +#endif /* !CONFIG_PM_AUTOSLEEP */
> Index: linux/include/linux/suspend.h
> ===================================================================
> --- linux.orig/include/linux/suspend.h
> +++ linux/include/linux/suspend.h
> @@ -356,7 +356,7 @@ extern int unregister_pm_notifier(struct
>  extern bool events_check_enabled;
> 
>  extern bool pm_wakeup_pending(void);
> -extern bool pm_get_wakeup_count(unsigned int *count);
> +extern bool pm_get_wakeup_count(unsigned int *count, bool block);
>  extern bool pm_save_wakeup_count(unsigned int count);
> 
>  static inline void lock_system_sleep(void)
> @@ -407,6 +407,17 @@ static inline void unlock_system_sleep(v
> 
>  #endif /* !CONFIG_PM_SLEEP */
> 
> +#ifdef CONFIG_PM_AUTOSLEEP
> +
> +/* kernel/power/autosleep.c */
> +void queue_up_suspend_work(void);
> +
> +#else /* !CONFIG_PM_AUTOSLEEP */
> +
> +static inline void queue_up_suspend_work(void) {}
> +
> +#endif /* !CONFIG_PM_AUTOSLEEP */
> +
>  #ifdef CONFIG_ARCH_SAVE_PAGE_KEYS
>  /*
>   * The ARCH_SAVE_PAGE_KEYS functions can be used by an architecture
> Index: linux/kernel/power/autosleep.c
> ===================================================================
> --- /dev/null
> +++ linux/kernel/power/autosleep.c
> @@ -0,0 +1,98 @@
> +/*
> + * kernel/power/autosleep.c
> + *
> + * Opportunistic sleep support.
> + *
> + * Copyright (C) 2012 Rafael J. Wysocki <rjw@sisk.pl>
> + */
> +
> +#include <linux/device.h>
> +#include <linux/mutex.h>
> +#include <linux/pm_wakeup.h>
> +
> +#include "power.h"
> +
> +static suspend_state_t autosleep_state;
> +static struct workqueue_struct *autosleep_wq;
> +static DEFINE_MUTEX(autosleep_lock);
> +
> +static void try_to_suspend(struct work_struct *work)
> +{
> +	unsigned int initial_count, final_count;
> +
> +	if (!pm_get_wakeup_count(&initial_count, true))
> +		goto out;
> +
> +	mutex_lock(&autosleep_lock);
> +
> +	if (!pm_save_wakeup_count(initial_count)) {
> +		mutex_unlock(&autosleep_lock);
> +		goto out;
> +	}
> +
> +	if (autosleep_state == PM_SUSPEND_ON) {
> +		mutex_unlock(&autosleep_lock);
> +		return;
> +	}
> +	if (autosleep_state >= PM_SUSPEND_MAX)
> +		hibernate();
> +	else
> +		pm_suspend(autosleep_state);


We are calling pm_suspend() or hibernate() directly here.
Won't this break build when CONFIG_SUSPEND or CONFIG_HIBERNATION is not set?
CONFIG_PM_AUTOSLEEP depends only on PM_SLEEP which means we could enable
either one of suspend or hibernation and yet come to this point, breaking
the option which was not enabled.

Regards,
Srivatsa S. Bhat

> +
> +	mutex_unlock(&autosleep_lock);
> +
> +	if (!pm_get_wakeup_count(&final_count, false))
> +		goto out;
> +
> +	if (final_count == initial_count)
> +		schedule_timeout(HZ / 2);
> +
> + out:
> +	queue_up_suspend_work();
> +}
> +
> +static DECLARE_WORK(suspend_work, try_to_suspend);
> +
> +void queue_up_suspend_work(void)
> +{
> +	if (!work_pending(&suspend_work) && autosleep_state > PM_SUSPEND_ON)
> +		queue_work(autosleep_wq, &suspend_work);
> +}
> +
> +suspend_state_t pm_autosleep_state(void)
> +{
> +	return autosleep_state;
> +}
> +
> +int pm_autosleep_set_state(suspend_state_t state)
> +{
> +#ifndef CONFIG_HIBERNATION
> +	if (state >= PM_SUSPEND_MAX)
> +		return -EINVAL;
> +#endif
> +	mutex_lock(&autosleep_lock);
> +	if (state == PM_SUSPEND_ON && autosleep_state != PM_SUSPEND_ON) {
> +		autosleep_state = PM_SUSPEND_ON;
> +	} else if (state > PM_SUSPEND_ON) {
> +		autosleep_state = state;
> +		queue_up_suspend_work();
> +	}
> +	mutex_unlock(&autosleep_lock);
> +	return 0;
> +}
> +
> +void pm_autosleep_lock(void)
> +{
> +	mutex_lock(&autosleep_lock);
> +}
> +
> +void pm_autosleep_unlock(void)
> +{
> +	mutex_unlock(&autosleep_lock);
> +}
> +
> +int __init pm_autosleep_init(void)
> +{
> +	autosleep_wq = alloc_ordered_workqueue("autosleep", 0);
> +	return autosleep_wq ? 0 : -ENOMEM;
> +}
> Index: linux/kernel/power/main.c
> ===================================================================
> --- linux.orig/kernel/power/main.c
> +++ linux/kernel/power/main.c
> @@ -269,8 +269,7 @@ static ssize_t state_show(struct kobject
>  	return (s - buf);
>  }
> 
> -static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
> -			   const char *buf, size_t n)
> +static suspend_state_t decode_state(const char *buf, size_t n)
>  {
>  #ifdef CONFIG_SUSPEND
>  	suspend_state_t state = PM_SUSPEND_STANDBY;
> @@ -278,27 +277,43 @@ static ssize_t state_store(struct kobjec
>  #endif
>  	char *p;
>  	int len;
> -	int error = -EINVAL;
> 
>  	p = memchr(buf, '\n', n);
>  	len = p ? p - buf : n;
> 
> -	/* First, check if we are requested to hibernate */
> -	if (len == 4 && !strncmp(buf, "disk", len)) {
> -		error = hibernate();
> -		goto Exit;
> -	}
> +	/* Check hibernation first. */
> +	if (len == 4 && !strncmp(buf, "disk", len))
> +		return PM_SUSPEND_MAX;
> 
>  #ifdef CONFIG_SUSPEND
> -	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
> -		if (*s && len == strlen(*s) && !strncmp(buf, *s, len)) {
> -			error = pm_suspend(state);
> -			break;
> -		}
> -	}
> +	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++)
> +		if (*s && len == strlen(*s) && !strncmp(buf, *s, len))
> +			return state;
>  #endif
> 
> - Exit:
> +	return PM_SUSPEND_ON;
> +}
> +
> +static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
> +			   const char *buf, size_t n)
> +{
> +	suspend_state_t state;
> +	int error = -EINVAL;
> +
> +	pm_autosleep_lock();
> +	if (pm_autosleep_state() > PM_SUSPEND_ON) {
> +		error = -EBUSY;
> +		goto out;
> +	}
> +
> +	state = decode_state(buf, n);
> +	if (state < PM_SUSPEND_MAX)
> +		error = pm_suspend(state);
> +	else if (state > PM_SUSPEND_ON)
> +		error = hibernate();
> +
> + out:
> +	pm_autosleep_unlock();
>  	return error ? error : n;
>  }
> 
> @@ -339,7 +354,8 @@ static ssize_t wakeup_count_show(struct
>  {
>  	unsigned int val;
> 
> -	return pm_get_wakeup_count(&val) ? sprintf(buf, "%u\n", val) : -EINTR;
> +	return pm_get_wakeup_count(&val, true) ?
> +		sprintf(buf, "%u\n", val) : -EINTR;
>  }
> 
>  static ssize_t wakeup_count_store(struct kobject *kobj,
> @@ -347,15 +363,65 @@ static ssize_t wakeup_count_store(struct
>  				const char *buf, size_t n)
>  {
>  	unsigned int val;
> +	int error = -EINVAL;
> +
> +	pm_autosleep_lock();
> +	if (pm_autosleep_state() > PM_SUSPEND_ON) {
> +		error = -EBUSY;
> +		goto out;
> +	}
> 
>  	if (sscanf(buf, "%u", &val) == 1) {
>  		if (pm_save_wakeup_count(val))
>  			return n;
>  	}
> -	return -EINVAL;
> +
> + out:
> +	pm_autosleep_unlock();
> +	return error;
>  }
> 
>  power_attr(wakeup_count);
> +
> +#ifdef CONFIG_PM_AUTOSLEEP
> +static ssize_t autosleep_show(struct kobject *kobj,
> +			      struct kobj_attribute *attr,
> +			      char *buf)
> +{
> +	suspend_state_t state = pm_autosleep_state();
> +
> +	if (state == PM_SUSPEND_ON)
> +		return sprintf(buf, "off\n");
> +
> +#ifdef CONFIG_SUSPEND
> +	if (state < PM_SUSPEND_MAX)
> +		return sprintf(buf, "%s\n", valid_state(state) ?
> +						pm_states[state] : "error");
> +#endif
> +#ifdef CONFIG_HIBERNATION
> +	return sprintf(buf, "disk\n");
> +#else
> +	return sprintf(buf, "error");
> +#endif
> +}
> +
> +static ssize_t autosleep_store(struct kobject *kobj,
> +			       struct kobj_attribute *attr,
> +			       const char *buf, size_t n)
> +{
> +	suspend_state_t state = decode_state(buf, n);
> +	int error;
> +
> +	if (state == PM_SUSPEND_ON && strncmp(buf, "off", 3)
> +	    && strncmp(buf, "off\n", 4))
> +		return -EINVAL;
> +
> +	error = pm_autosleep_set_state(state);
> +	return error ? error : n;
> +}
> +
> +power_attr(autosleep);
> +#endif /* CONFIG_PM_AUTOSLEEP */
>  #endif /* CONFIG_PM_SLEEP */
> 
>  #ifdef CONFIG_PM_TRACE
> @@ -409,6 +475,9 @@ static struct attribute * g[] = {
>  #ifdef CONFIG_PM_SLEEP
>  	&pm_async_attr.attr,
>  	&wakeup_count_attr.attr,
> +#ifdef CONFIG_PM_AUTOSLEEP
> +	&autosleep_attr.attr,
> +#endif
>  #ifdef CONFIG_PM_DEBUG
>  	&pm_test_attr.attr,
>  #endif
> @@ -444,7 +513,10 @@ static int __init pm_init(void)
>  	power_kobj = kobject_create_and_add("power", NULL);
>  	if (!power_kobj)
>  		return -ENOMEM;
> -	return sysfs_create_group(power_kobj, &attr_group);
> +	error = sysfs_create_group(power_kobj, &attr_group);
> +	if (error)
> +		return error;
> +	return pm_autosleep_init();
>  }
> 
>  core_initcall(pm_init);
> Index: linux/drivers/base/power/wakeup.c
> ===================================================================
> --- linux.orig/drivers/base/power/wakeup.c
> +++ linux/drivers/base/power/wakeup.c
> @@ -492,8 +492,10 @@ static void wakeup_source_deactivate(str
>  	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
> 
>  	split_counters(&cnt, &inpr);
> -	if (!inpr && waitqueue_active(&wakeup_count_wait_queue))
> +	if (!inpr && waitqueue_active(&wakeup_count_wait_queue)) {
>  		wake_up(&wakeup_count_wait_queue);
> +		queue_up_suspend_work();
> +	}
>  }
> 
>  /**
> @@ -654,29 +656,33 @@ bool pm_wakeup_pending(void)
>  /**
>   * pm_get_wakeup_count - Read the number of registered wakeup events.
>   * @count: Address to store the value at.
> + * @block: Whether or not to block.
>   *
> - * Store the number of registered wakeup events at the address in @count.  Block
> - * if the current number of wakeup events being processed is nonzero.
> + * Store the number of registered wakeup events at the address in @count.  If
> + * @block is set, block until the current number of wakeup events being
> + * processed is zero.
>   *
> - * Return 'false' if the wait for the number of wakeup events being processed to
> - * drop down to zero has been interrupted by a signal (and the current number
> - * of wakeup events being processed is still nonzero).  Otherwise return 'true'.
> + * Return 'false' if the current number of wakeup events being processed is
> + * nonzero.  Otherwise return 'true'.
>   */
> -bool pm_get_wakeup_count(unsigned int *count)
> +bool pm_get_wakeup_count(unsigned int *count, bool block)
>  {
>  	unsigned int cnt, inpr;
> -	DEFINE_WAIT(wait);
> 
> -	for (;;) {
> -		prepare_to_wait(&wakeup_count_wait_queue, &wait,
> -				TASK_INTERRUPTIBLE);
> -		split_counters(&cnt, &inpr);
> -		if (inpr == 0 || signal_pending(current))
> -			break;
> +	if (block) {
> +		DEFINE_WAIT(wait);
> 
> -		schedule();
> +		for (;;) {
> +			prepare_to_wait(&wakeup_count_wait_queue, &wait,
> +					TASK_INTERRUPTIBLE);
> +			split_counters(&cnt, &inpr);
> +			if (inpr == 0 || signal_pending(current))
> +				break;
> +
> +			schedule();
> +		}
> +		finish_wait(&wakeup_count_wait_queue, &wait);
>  	}
> -	finish_wait(&wakeup_count_wait_queue, &wait);
> 
>  	split_counters(&cnt, &inpr);
>  	*count = cnt;
> Index: linux/Documentation/ABI/testing/sysfs-power
> ===================================================================
> --- linux.orig/Documentation/ABI/testing/sysfs-power
> +++ linux/Documentation/ABI/testing/sysfs-power
> @@ -172,3 +172,20 @@ Description:
> 
>  		Reading from this file will display the current value, which is
>  		set to 1 MB by default.
> +
> +What:		/sys/power/autosleep
> +Date:		February 2012
> +Contact:	Rafael J. Wysocki <rjw@sisk.pl>
> +Description:
> +		The /sys/power/autosleep file can be written one of the strings
> +		returned by reads from /sys/power/state.  If that happens, a
> +		work item attempting to trigger a transition of the system to
> +		the sleep state represented by that string is queued up.  This
> +		attempt will only succeed if there are no active wakeup sources
> +		in the system at that time.  After evey execution, regardless
> +		of whether or not the attempt to put the system to sleep has
> +		succeeded, the work item requeues itself until user space
> +		writes "off" to /sys/power/autosleep.
> +
> +		Reading from this file causes the last string successfully
> +		written to it to be displayed.
> 

 


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2
  2012-02-22  8:44     ` Srivatsa S. Bhat
@ 2012-02-22 22:10       ` Rafael J. Wysocki
  2012-02-23  6:25         ` Srivatsa S. Bhat
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-22 22:10 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: John Stultz, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg,
	Brian Swetland, Neil Brown, Alan Stern, Dmitry Torokhov

On Wednesday, February 22, 2012, Srivatsa S. Bhat wrote:
> On 02/22/2012 10:19 AM, John Stultz wrote:
> 
> > On Wed, 2012-02-22 at 00:31 +0100, Rafael J. Wysocki wrote:
> >> Hi all,
> >>
> >> After the feedback so far I've decided to follow up with a refreshed patchset.
> >> The first two patches from the previous one went to linux-pm/linux-next
> >> and I included the recent evdev patch from Arve (with some modifications)
> >> to this patchset for completness.
> > 
> > Hey Rafael, 
> > 	Thanks again for posting this! I've started playing around with it in a
> > kvm environment, and got the following warning after echoing off >
> > autosleep:
> > ...
> > PM: resume of devices complete after 185.615 msecs
> > PM: Finishing wakeup.
> > Restarting tasks ... done.
> > PM: Syncing filesystems ... done.
> > PM: Preparing system for mem sleep
> > Freezing user space processes ... 
> > Freezing of tasks failed after 20.01 seconds (1 tasks refusing to freeze, wq_busy=0):
> > bash            D ffff880015714010 
> 
> 
> Ah.. I think I know what is the problem here..
> 
> The kernel was freezing userspace processes and meanwhile, you wrote "off"
> to autosleep. So, as a result, this userspace process (bash) just now
> entered kernel mode. Unfortunately, the autosleep_lock is held for too long,
> that is, something like:
> 
> acquire autosleep_lock
> modify autosleep_state
>                                <============== "A"
>  pm_suspend or hibernate()
> 
> release autosleep_lock
> 
> At point marked "A", we should have released the autosleep lock and only then
> entered pm_suspend or hibernate(). Since the current code holds the lock and
> enters suspend/hibernate, the userspace process that wrote "off" to autosleep
> (or even userspace process that writes to /sys/power/state will end up waiting
> on autosleep_lock, thus failing the freezing operation.)
> 
> So the solution is to always release the autosleep lock before entering
> suspend/hibernation.

Well, the autosleep lock is intentionally held around suspend/hibernation in
try_to_suspend(), because otherwise it would be possible to trigger automatic
suspend right after user space has disabled it.

I think the solution is to make pm_autosleep_lock() do a _trylock() and
return error code if already locked.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 5/7] PM / Sleep: Implement opportunistic sleep
  2012-02-22  8:45     ` Srivatsa S. Bhat
@ 2012-02-22 22:10       ` Rafael J. Wysocki
  2012-02-23  5:35         ` Srivatsa S. Bhat
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-22 22:10 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

On Wednesday, February 22, 2012, Srivatsa S. Bhat wrote:
> On 02/22/2012 05:05 AM, Rafael J. Wysocki wrote:
> 
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > Introduce a mechanism by which the kernel can trigger global
> > transitions to a sleep state chosen by user space if there are no
> > active wakeup sources.
> > 
> > It consists of a new sysfs attribute, /sys/power/autosleep, that
> > can be written one of the strings returned by reads from
> > /sys/power/state, an ordered workqueue and a work item carrying out
> > the "suspend" operations.  If a string representing the system's
> > sleep state is written to /sys/power/autosleep, the work item
> > triggering transitions to that state is queued up and it requeues
> > itself after every execution until user space writes "off" to
> > /sys/power/autosleep.
> > 
> > That work item enables the detection of wakeup events using the
> > functions already defined in drivers/base/power/wakeup.c (with one
> > small modification) and calls either pm_suspend(), or hibernate() to
> > put the system into a sleep state.  If a wakeup event is reported
> > while the transition is in progress, it will abort the transition and
> > the "system suspend" work item will be queued up again.
> > 
> > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> > ---
> >  Documentation/ABI/testing/sysfs-power |   17 +++++
> >  drivers/base/power/wakeup.c           |   38 ++++++-----
> >  include/linux/suspend.h               |   13 +++-
> >  kernel/power/Kconfig                  |    8 ++
> >  kernel/power/Makefile                 |    1 
> >  kernel/power/autosleep.c              |   98 ++++++++++++++++++++++++++++++
> >  kernel/power/main.c                   |  108 ++++++++++++++++++++++++++++------
> >  kernel/power/power.h                  |   18 +++++
> >  8 files changed, 266 insertions(+), 35 deletions(-)
> > 
> > Index: linux/kernel/power/Makefile
> > ===================================================================
> > --- linux.orig/kernel/power/Makefile
> > +++ linux/kernel/power/Makefile
> > @@ -9,5 +9,6 @@ obj-$(CONFIG_SUSPEND)		+= suspend.o
> >  obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend_test.o
> >  obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
> >  				   block_io.o
> > +obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
> > 
> >  obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
> > Index: linux/kernel/power/Kconfig
> > ===================================================================
> > --- linux.orig/kernel/power/Kconfig
> > +++ linux/kernel/power/Kconfig
> > @@ -103,6 +103,14 @@ config PM_SLEEP_SMP
> >  	select HOTPLUG
> >  	select HOTPLUG_CPU
> > 
> > +config PM_AUTOSLEEP
> > +	bool "Opportunistic sleep"
> > +	depends on PM_SLEEP
> > +	default n
> > +	---help---
> > +	Allow the kernel to trigger a system transition into a global sleep
> > +	state automatically whenever there are no active wakeup sources.
> > +
> >  config PM_RUNTIME
> >  	bool "Run-time PM core functionality"
> >  	depends on !IA64_HP_SIM
> > Index: linux/kernel/power/power.h
> > ===================================================================
> > --- linux.orig/kernel/power/power.h
> > +++ linux/kernel/power/power.h
> > @@ -264,3 +264,21 @@ static inline void suspend_thaw_processe
> >  {
> >  }
> >  #endif
> > +
> > +#ifdef CONFIG_PM_AUTOSLEEP
> > +
> > +/* kernel/power/autosleep.c */
> > +extern int pm_autosleep_init(void);
> > +extern void pm_autosleep_lock(void);
> > +extern void pm_autosleep_unlock(void);
> > +extern suspend_state_t pm_autosleep_state(void);
> > +extern int pm_autosleep_set_state(suspend_state_t state);
> > +
> > +#else /* !CONFIG_PM_AUTOSLEEP */
> > +
> > +static inline int pm_autosleep_init(void) { return 0; }
> > +static inline void pm_autosleep_lock(void) {}
> > +static inline void pm_autosleep_unlock(void) {}
> > +static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
> > +
> > +#endif /* !CONFIG_PM_AUTOSLEEP */
> > Index: linux/include/linux/suspend.h
> > ===================================================================
> > --- linux.orig/include/linux/suspend.h
> > +++ linux/include/linux/suspend.h
> > @@ -356,7 +356,7 @@ extern int unregister_pm_notifier(struct
> >  extern bool events_check_enabled;
> > 
> >  extern bool pm_wakeup_pending(void);
> > -extern bool pm_get_wakeup_count(unsigned int *count);
> > +extern bool pm_get_wakeup_count(unsigned int *count, bool block);
> >  extern bool pm_save_wakeup_count(unsigned int count);
> > 
> >  static inline void lock_system_sleep(void)
> > @@ -407,6 +407,17 @@ static inline void unlock_system_sleep(v
> > 
> >  #endif /* !CONFIG_PM_SLEEP */
> > 
> > +#ifdef CONFIG_PM_AUTOSLEEP
> > +
> > +/* kernel/power/autosleep.c */
> > +void queue_up_suspend_work(void);
> > +
> > +#else /* !CONFIG_PM_AUTOSLEEP */
> > +
> > +static inline void queue_up_suspend_work(void) {}
> > +
> > +#endif /* !CONFIG_PM_AUTOSLEEP */
> > +
> >  #ifdef CONFIG_ARCH_SAVE_PAGE_KEYS
> >  /*
> >   * The ARCH_SAVE_PAGE_KEYS functions can be used by an architecture
> > Index: linux/kernel/power/autosleep.c
> > ===================================================================
> > --- /dev/null
> > +++ linux/kernel/power/autosleep.c
> > @@ -0,0 +1,98 @@
> > +/*
> > + * kernel/power/autosleep.c
> > + *
> > + * Opportunistic sleep support.
> > + *
> > + * Copyright (C) 2012 Rafael J. Wysocki <rjw@sisk.pl>
> > + */
> > +
> > +#include <linux/device.h>
> > +#include <linux/mutex.h>
> > +#include <linux/pm_wakeup.h>
> > +
> > +#include "power.h"
> > +
> > +static suspend_state_t autosleep_state;
> > +static struct workqueue_struct *autosleep_wq;
> > +static DEFINE_MUTEX(autosleep_lock);
> > +
> > +static void try_to_suspend(struct work_struct *work)
> > +{
> > +	unsigned int initial_count, final_count;
> > +
> > +	if (!pm_get_wakeup_count(&initial_count, true))
> > +		goto out;
> > +
> > +	mutex_lock(&autosleep_lock);
> > +
> > +	if (!pm_save_wakeup_count(initial_count)) {
> > +		mutex_unlock(&autosleep_lock);
> > +		goto out;
> > +	}
> > +
> > +	if (autosleep_state == PM_SUSPEND_ON) {
> > +		mutex_unlock(&autosleep_lock);
> > +		return;
> > +	}
> > +	if (autosleep_state >= PM_SUSPEND_MAX)
> > +		hibernate();
> > +	else
> > +		pm_suspend(autosleep_state);
> 
> 
> We are calling pm_suspend() or hibernate() directly here.
> Won't this break build when CONFIG_SUSPEND or CONFIG_HIBERNATION is not set?
> CONFIG_PM_AUTOSLEEP depends only on PM_SLEEP which means we could enable
> either one of suspend or hibernation and yet come to this point, breaking
> the option which was not enabled.

Both pm_suspend() and hibernate() have appropriate static inline definitions
for !CONFIG_SUSPEND and !CONFIG_HIBERNATION (in suspend.h), as far as I can say.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 5/7] PM / Sleep: Implement opportunistic sleep
  2012-02-22 22:10       ` Rafael J. Wysocki
@ 2012-02-23  5:35         ` Srivatsa S. Bhat
  0 siblings, 0 replies; 129+ messages in thread
From: Srivatsa S. Bhat @ 2012-02-23  5:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

On 02/23/2012 03:40 AM, Rafael J. Wysocki wrote:

> On Wednesday, February 22, 2012, Srivatsa S. Bhat wrote:
>> On 02/22/2012 05:05 AM, Rafael J. Wysocki wrote:
>>
>>> From: Rafael J. Wysocki <rjw@sisk.pl>
>>>
>>> Introduce a mechanism by which the kernel can trigger global
>>> transitions to a sleep state chosen by user space if there are no
>>> active wakeup sources.
>>>
>>> It consists of a new sysfs attribute, /sys/power/autosleep, that
>>> can be written one of the strings returned by reads from
>>> /sys/power/state, an ordered workqueue and a work item carrying out
>>> the "suspend" operations.  If a string representing the system's
>>> sleep state is written to /sys/power/autosleep, the work item
>>> triggering transitions to that state is queued up and it requeues
>>> itself after every execution until user space writes "off" to
>>> /sys/power/autosleep.
>>>
>>> That work item enables the detection of wakeup events using the
>>> functions already defined in drivers/base/power/wakeup.c (with one
>>> small modification) and calls either pm_suspend(), or hibernate() to
>>> put the system into a sleep state.  If a wakeup event is reported
>>> while the transition is in progress, it will abort the transition and
>>> the "system suspend" work item will be queued up again.
>>>
>>> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
>>> ---
>>>  Documentation/ABI/testing/sysfs-power |   17 +++++
>>>  drivers/base/power/wakeup.c           |   38 ++++++-----
>>>  include/linux/suspend.h               |   13 +++-
>>>  kernel/power/Kconfig                  |    8 ++
>>>  kernel/power/Makefile                 |    1 
>>>  kernel/power/autosleep.c              |   98 ++++++++++++++++++++++++++++++
>>>  kernel/power/main.c                   |  108 ++++++++++++++++++++++++++++------
>>>  kernel/power/power.h                  |   18 +++++
>>>  8 files changed, 266 insertions(+), 35 deletions(-)
>>>
>>> Index: linux/kernel/power/Makefile
>>> ===================================================================
>>> --- linux.orig/kernel/power/Makefile
>>> +++ linux/kernel/power/Makefile
>>> @@ -9,5 +9,6 @@ obj-$(CONFIG_SUSPEND)		+= suspend.o
>>>  obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend_test.o
>>>  obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
>>>  				   block_io.o
>>> +obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
>>>
>>>  obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
>>> Index: linux/kernel/power/Kconfig
>>> ===================================================================
>>> --- linux.orig/kernel/power/Kconfig
>>> +++ linux/kernel/power/Kconfig
>>> @@ -103,6 +103,14 @@ config PM_SLEEP_SMP
>>>  	select HOTPLUG
>>>  	select HOTPLUG_CPU
>>>
>>> +config PM_AUTOSLEEP
>>> +	bool "Opportunistic sleep"
>>> +	depends on PM_SLEEP
>>> +	default n
>>> +	---help---
>>> +	Allow the kernel to trigger a system transition into a global sleep
>>> +	state automatically whenever there are no active wakeup sources.
>>> +
>>>  config PM_RUNTIME
>>>  	bool "Run-time PM core functionality"
>>>  	depends on !IA64_HP_SIM
>>> Index: linux/kernel/power/power.h
>>> ===================================================================
>>> --- linux.orig/kernel/power/power.h
>>> +++ linux/kernel/power/power.h
>>> @@ -264,3 +264,21 @@ static inline void suspend_thaw_processe
>>>  {
>>>  }
>>>  #endif
>>> +
>>> +#ifdef CONFIG_PM_AUTOSLEEP
>>> +
>>> +/* kernel/power/autosleep.c */
>>> +extern int pm_autosleep_init(void);
>>> +extern void pm_autosleep_lock(void);
>>> +extern void pm_autosleep_unlock(void);
>>> +extern suspend_state_t pm_autosleep_state(void);
>>> +extern int pm_autosleep_set_state(suspend_state_t state);
>>> +
>>> +#else /* !CONFIG_PM_AUTOSLEEP */
>>> +
>>> +static inline int pm_autosleep_init(void) { return 0; }
>>> +static inline void pm_autosleep_lock(void) {}
>>> +static inline void pm_autosleep_unlock(void) {}
>>> +static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
>>> +
>>> +#endif /* !CONFIG_PM_AUTOSLEEP */
>>> Index: linux/include/linux/suspend.h
>>> ===================================================================
>>> --- linux.orig/include/linux/suspend.h
>>> +++ linux/include/linux/suspend.h
>>> @@ -356,7 +356,7 @@ extern int unregister_pm_notifier(struct
>>>  extern bool events_check_enabled;
>>>
>>>  extern bool pm_wakeup_pending(void);
>>> -extern bool pm_get_wakeup_count(unsigned int *count);
>>> +extern bool pm_get_wakeup_count(unsigned int *count, bool block);
>>>  extern bool pm_save_wakeup_count(unsigned int count);
>>>
>>>  static inline void lock_system_sleep(void)
>>> @@ -407,6 +407,17 @@ static inline void unlock_system_sleep(v
>>>
>>>  #endif /* !CONFIG_PM_SLEEP */
>>>
>>> +#ifdef CONFIG_PM_AUTOSLEEP
>>> +
>>> +/* kernel/power/autosleep.c */
>>> +void queue_up_suspend_work(void);
>>> +
>>> +#else /* !CONFIG_PM_AUTOSLEEP */
>>> +
>>> +static inline void queue_up_suspend_work(void) {}
>>> +
>>> +#endif /* !CONFIG_PM_AUTOSLEEP */
>>> +
>>>  #ifdef CONFIG_ARCH_SAVE_PAGE_KEYS
>>>  /*
>>>   * The ARCH_SAVE_PAGE_KEYS functions can be used by an architecture
>>> Index: linux/kernel/power/autosleep.c
>>> ===================================================================
>>> --- /dev/null
>>> +++ linux/kernel/power/autosleep.c
>>> @@ -0,0 +1,98 @@
>>> +/*
>>> + * kernel/power/autosleep.c
>>> + *
>>> + * Opportunistic sleep support.
>>> + *
>>> + * Copyright (C) 2012 Rafael J. Wysocki <rjw@sisk.pl>
>>> + */
>>> +
>>> +#include <linux/device.h>
>>> +#include <linux/mutex.h>
>>> +#include <linux/pm_wakeup.h>
>>> +
>>> +#include "power.h"
>>> +
>>> +static suspend_state_t autosleep_state;
>>> +static struct workqueue_struct *autosleep_wq;
>>> +static DEFINE_MUTEX(autosleep_lock);
>>> +
>>> +static void try_to_suspend(struct work_struct *work)
>>> +{
>>> +	unsigned int initial_count, final_count;
>>> +
>>> +	if (!pm_get_wakeup_count(&initial_count, true))
>>> +		goto out;
>>> +
>>> +	mutex_lock(&autosleep_lock);
>>> +
>>> +	if (!pm_save_wakeup_count(initial_count)) {
>>> +		mutex_unlock(&autosleep_lock);
>>> +		goto out;
>>> +	}
>>> +
>>> +	if (autosleep_state == PM_SUSPEND_ON) {
>>> +		mutex_unlock(&autosleep_lock);
>>> +		return;
>>> +	}
>>> +	if (autosleep_state >= PM_SUSPEND_MAX)
>>> +		hibernate();
>>> +	else
>>> +		pm_suspend(autosleep_state);
>>
>>
>> We are calling pm_suspend() or hibernate() directly here.
>> Won't this break build when CONFIG_SUSPEND or CONFIG_HIBERNATION is not set?
>> CONFIG_PM_AUTOSLEEP depends only on PM_SLEEP which means we could enable
>> either one of suspend or hibernation and yet come to this point, breaking
>> the option which was not enabled.
> 
> Both pm_suspend() and hibernate() have appropriate static inline definitions
> for !CONFIG_SUSPEND and !CONFIG_HIBERNATION (in suspend.h), as far as I can say.
> 


Oh, you are right.. I overlooked that, sorry!
 
Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2
  2012-02-22 22:10       ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2 Rafael J. Wysocki
@ 2012-02-23  6:25         ` Srivatsa S. Bhat
  2012-02-23 21:26           ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Srivatsa S. Bhat @ 2012-02-23  6:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: John Stultz, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg,
	Brian Swetland, Neil Brown, Alan Stern, Dmitry Torokhov

On 02/23/2012 03:40 AM, Rafael J. Wysocki wrote:

> On Wednesday, February 22, 2012, Srivatsa S. Bhat wrote:
>> On 02/22/2012 10:19 AM, John Stultz wrote:
>>
>>> On Wed, 2012-02-22 at 00:31 +0100, Rafael J. Wysocki wrote:
>>>> Hi all,
>>>>
>>>> After the feedback so far I've decided to follow up with a refreshed patchset.
>>>> The first two patches from the previous one went to linux-pm/linux-next
>>>> and I included the recent evdev patch from Arve (with some modifications)
>>>> to this patchset for completness.
>>>
>>> Hey Rafael, 
>>> 	Thanks again for posting this! I've started playing around with it in a
>>> kvm environment, and got the following warning after echoing off >
>>> autosleep:
>>> ...
>>> PM: resume of devices complete after 185.615 msecs
>>> PM: Finishing wakeup.
>>> Restarting tasks ... done.
>>> PM: Syncing filesystems ... done.
>>> PM: Preparing system for mem sleep
>>> Freezing user space processes ... 
>>> Freezing of tasks failed after 20.01 seconds (1 tasks refusing to freeze, wq_busy=0):
>>> bash            D ffff880015714010 
>>
>>
>> Ah.. I think I know what is the problem here..
>>
>> The kernel was freezing userspace processes and meanwhile, you wrote "off"
>> to autosleep. So, as a result, this userspace process (bash) just now
>> entered kernel mode. Unfortunately, the autosleep_lock is held for too long,
>> that is, something like:
>>
>> acquire autosleep_lock
>> modify autosleep_state
>>                                <============== "A"
>>  pm_suspend or hibernate()
>>
>> release autosleep_lock
>>
>> At point marked "A", we should have released the autosleep lock and only then
>> entered pm_suspend or hibernate(). Since the current code holds the lock and
>> enters suspend/hibernate, the userspace process that wrote "off" to autosleep
>> (or even userspace process that writes to /sys/power/state will end up waiting
>> on autosleep_lock, thus failing the freezing operation.)
>>
>> So the solution is to always release the autosleep lock before entering
>> suspend/hibernation.
> 
> Well, the autosleep lock is intentionally held around suspend/hibernation in
> try_to_suspend(), because otherwise it would be possible to trigger automatic
> suspend right after user space has disabled it.
>


Hmm.. I was just wondering if we could avoid holding yet another lock in the
suspend/hibernate path, if possible.. 

 
> I think the solution is to make pm_autosleep_lock() do a _trylock() and
> return error code if already locked.
>

... and also do a trylock() in pm_autosleep_set_state() right?.... that is
where John hit the problem..

By the way, I am just curious.. how difficult will this make it for userspace
to disable autosleep? I mean, would a trylock mean that the user has to keep
fighting until he finally gets a chance to disable autosleep?

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2
  2012-02-23  6:25         ` Srivatsa S. Bhat
@ 2012-02-23 21:26           ` Rafael J. Wysocki
  2012-02-23 21:32             ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-23 21:26 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: John Stultz, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg,
	Brian Swetland, Neil Brown, Alan Stern, Dmitry Torokhov

On Thursday, February 23, 2012, Srivatsa S. Bhat wrote:
> On 02/23/2012 03:40 AM, Rafael J. Wysocki wrote:
> 
> > On Wednesday, February 22, 2012, Srivatsa S. Bhat wrote:
> >> On 02/22/2012 10:19 AM, John Stultz wrote:
> >>
> >>> On Wed, 2012-02-22 at 00:31 +0100, Rafael J. Wysocki wrote:
> >>>> Hi all,
> >>>>
> >>>> After the feedback so far I've decided to follow up with a refreshed patchset.
> >>>> The first two patches from the previous one went to linux-pm/linux-next
> >>>> and I included the recent evdev patch from Arve (with some modifications)
> >>>> to this patchset for completness.
> >>>
> >>> Hey Rafael, 
> >>> 	Thanks again for posting this! I've started playing around with it in a
> >>> kvm environment, and got the following warning after echoing off >
> >>> autosleep:
> >>> ...
> >>> PM: resume of devices complete after 185.615 msecs
> >>> PM: Finishing wakeup.
> >>> Restarting tasks ... done.
> >>> PM: Syncing filesystems ... done.
> >>> PM: Preparing system for mem sleep
> >>> Freezing user space processes ... 
> >>> Freezing of tasks failed after 20.01 seconds (1 tasks refusing to freeze, wq_busy=0):
> >>> bash            D ffff880015714010 
> >>
> >>
> >> Ah.. I think I know what is the problem here..
> >>
> >> The kernel was freezing userspace processes and meanwhile, you wrote "off"
> >> to autosleep. So, as a result, this userspace process (bash) just now
> >> entered kernel mode. Unfortunately, the autosleep_lock is held for too long,
> >> that is, something like:
> >>
> >> acquire autosleep_lock
> >> modify autosleep_state
> >>                                <============== "A"
> >>  pm_suspend or hibernate()
> >>
> >> release autosleep_lock
> >>
> >> At point marked "A", we should have released the autosleep lock and only then
> >> entered pm_suspend or hibernate(). Since the current code holds the lock and
> >> enters suspend/hibernate, the userspace process that wrote "off" to autosleep
> >> (or even userspace process that writes to /sys/power/state will end up waiting
> >> on autosleep_lock, thus failing the freezing operation.)
> >>
> >> So the solution is to always release the autosleep lock before entering
> >> suspend/hibernation.
> > 
> > Well, the autosleep lock is intentionally held around suspend/hibernation in
> > try_to_suspend(), because otherwise it would be possible to trigger automatic
> > suspend right after user space has disabled it.
> >
> 
> 
> Hmm.. I was just wondering if we could avoid holding yet another lock in the
> suspend/hibernate path, if possible.. 
> 
>  
> > I think the solution is to make pm_autosleep_lock() do a _trylock() and
> > return error code if already locked.
> >
> 
> ... and also do a trylock() in pm_autosleep_set_state() right?.... that is
> where John hit the problem..
> 
> By the way, I am just curious.. how difficult will this make it for userspace
> to disable autosleep? I mean, would a trylock mean that the user has to keep
> fighting until he finally gets a chance to disable autosleep?

That's a good point, so I think it may be a good idea to do
mutex_lock_interruptible() in pm_autosleep_set_state() instead.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2
  2012-02-23 21:26           ` Rafael J. Wysocki
@ 2012-02-23 21:32             ` Rafael J. Wysocki
  2012-02-24  4:44               ` Srivatsa S. Bhat
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-23 21:32 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: John Stultz, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg,
	Brian Swetland, Neil Brown, Alan Stern, Dmitry Torokhov

On Thursday, February 23, 2012, Rafael J. Wysocki wrote:
> On Thursday, February 23, 2012, Srivatsa S. Bhat wrote:
> > On 02/23/2012 03:40 AM, Rafael J. Wysocki wrote:
> > 
> > > On Wednesday, February 22, 2012, Srivatsa S. Bhat wrote:
> > >> On 02/22/2012 10:19 AM, John Stultz wrote:
> > >>
> > >>> On Wed, 2012-02-22 at 00:31 +0100, Rafael J. Wysocki wrote:
> > >>>> Hi all,
> > >>>>
> > >>>> After the feedback so far I've decided to follow up with a refreshed patchset.
> > >>>> The first two patches from the previous one went to linux-pm/linux-next
> > >>>> and I included the recent evdev patch from Arve (with some modifications)
> > >>>> to this patchset for completness.
> > >>>
> > >>> Hey Rafael, 
> > >>> 	Thanks again for posting this! I've started playing around with it in a
> > >>> kvm environment, and got the following warning after echoing off >
> > >>> autosleep:
> > >>> ...
> > >>> PM: resume of devices complete after 185.615 msecs
> > >>> PM: Finishing wakeup.
> > >>> Restarting tasks ... done.
> > >>> PM: Syncing filesystems ... done.
> > >>> PM: Preparing system for mem sleep
> > >>> Freezing user space processes ... 
> > >>> Freezing of tasks failed after 20.01 seconds (1 tasks refusing to freeze, wq_busy=0):
> > >>> bash            D ffff880015714010 
> > >>
> > >>
> > >> Ah.. I think I know what is the problem here..
> > >>
> > >> The kernel was freezing userspace processes and meanwhile, you wrote "off"
> > >> to autosleep. So, as a result, this userspace process (bash) just now
> > >> entered kernel mode. Unfortunately, the autosleep_lock is held for too long,
> > >> that is, something like:
> > >>
> > >> acquire autosleep_lock
> > >> modify autosleep_state
> > >>                                <============== "A"
> > >>  pm_suspend or hibernate()
> > >>
> > >> release autosleep_lock
> > >>
> > >> At point marked "A", we should have released the autosleep lock and only then
> > >> entered pm_suspend or hibernate(). Since the current code holds the lock and
> > >> enters suspend/hibernate, the userspace process that wrote "off" to autosleep
> > >> (or even userspace process that writes to /sys/power/state will end up waiting
> > >> on autosleep_lock, thus failing the freezing operation.)
> > >>
> > >> So the solution is to always release the autosleep lock before entering
> > >> suspend/hibernation.
> > > 
> > > Well, the autosleep lock is intentionally held around suspend/hibernation in
> > > try_to_suspend(), because otherwise it would be possible to trigger automatic
> > > suspend right after user space has disabled it.
> > >
> > 
> > 
> > Hmm.. I was just wondering if we could avoid holding yet another lock in the
> > suspend/hibernate path, if possible.. 
> > 
> >  
> > > I think the solution is to make pm_autosleep_lock() do a _trylock() and
> > > return error code if already locked.
> > >
> > 
> > ... and also do a trylock() in pm_autosleep_set_state() right?.... that is
> > where John hit the problem..
> > 
> > By the way, I am just curious.. how difficult will this make it for userspace
> > to disable autosleep? I mean, would a trylock mean that the user has to keep
> > fighting until he finally gets a chance to disable autosleep?
> 
> That's a good point, so I think it may be a good idea to do
> mutex_lock_interruptible() in pm_autosleep_set_state() instead.

Now that I think of it, perhaps it's a good idea to just make
pm_autosleep_lock() do mutex_lock_interruptible() _and_ make
pm_autosleep_set_state() use pm_autosleep_lock().

What do you think?

Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2
  2012-02-23 21:32             ` Rafael J. Wysocki
@ 2012-02-24  4:44               ` Srivatsa S. Bhat
  2012-02-24 23:21                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Srivatsa S. Bhat @ 2012-02-24  4:44 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: John Stultz, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg,
	Brian Swetland, Neil Brown, Alan Stern, Dmitry Torokhov

On 02/24/2012 03:02 AM, Rafael J. Wysocki wrote:

> On Thursday, February 23, 2012, Rafael J. Wysocki wrote:
>> On Thursday, February 23, 2012, Srivatsa S. Bhat wrote:
>>> On 02/23/2012 03:40 AM, Rafael J. Wysocki wrote:
>>>
>>>> On Wednesday, February 22, 2012, Srivatsa S. Bhat wrote:
>>>>> On 02/22/2012 10:19 AM, John Stultz wrote:
>>>>>
>>>>>> On Wed, 2012-02-22 at 00:31 +0100, Rafael J. Wysocki wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> After the feedback so far I've decided to follow up with a refreshed patchset.
>>>>>>> The first two patches from the previous one went to linux-pm/linux-next
>>>>>>> and I included the recent evdev patch from Arve (with some modifications)
>>>>>>> to this patchset for completness.
>>>>>>
>>>>>> Hey Rafael, 
>>>>>> 	Thanks again for posting this! I've started playing around with it in a
>>>>>> kvm environment, and got the following warning after echoing off >
>>>>>> autosleep:
>>>>>> ...
>>>>>> PM: resume of devices complete after 185.615 msecs
>>>>>> PM: Finishing wakeup.
>>>>>> Restarting tasks ... done.
>>>>>> PM: Syncing filesystems ... done.
>>>>>> PM: Preparing system for mem sleep
>>>>>> Freezing user space processes ... 
>>>>>> Freezing of tasks failed after 20.01 seconds (1 tasks refusing to freeze, wq_busy=0):
>>>>>> bash            D ffff880015714010 
>>>>>
>>>>>
>>>>> Ah.. I think I know what is the problem here..
>>>>>
>>>>> The kernel was freezing userspace processes and meanwhile, you wrote "off"
>>>>> to autosleep. So, as a result, this userspace process (bash) just now
>>>>> entered kernel mode. Unfortunately, the autosleep_lock is held for too long,
>>>>> that is, something like:
>>>>>
>>>>> acquire autosleep_lock
>>>>> modify autosleep_state
>>>>>                                <============== "A"
>>>>>  pm_suspend or hibernate()
>>>>>
>>>>> release autosleep_lock
>>>>>
>>>>> At point marked "A", we should have released the autosleep lock and only then
>>>>> entered pm_suspend or hibernate(). Since the current code holds the lock and
>>>>> enters suspend/hibernate, the userspace process that wrote "off" to autosleep
>>>>> (or even userspace process that writes to /sys/power/state will end up waiting
>>>>> on autosleep_lock, thus failing the freezing operation.)
>>>>>
>>>>> So the solution is to always release the autosleep lock before entering
>>>>> suspend/hibernation.
>>>>
>>>> Well, the autosleep lock is intentionally held around suspend/hibernation in
>>>> try_to_suspend(), because otherwise it would be possible to trigger automatic
>>>> suspend right after user space has disabled it.
>>>>
>>>
>>>
>>> Hmm.. I was just wondering if we could avoid holding yet another lock in the
>>> suspend/hibernate path, if possible.. 
>>>
>>>  
>>>> I think the solution is to make pm_autosleep_lock() do a _trylock() and
>>>> return error code if already locked.
>>>>
>>>
>>> ... and also do a trylock() in pm_autosleep_set_state() right?.... that is
>>> where John hit the problem..
>>>
>>> By the way, I am just curious.. how difficult will this make it for userspace
>>> to disable autosleep? I mean, would a trylock mean that the user has to keep
>>> fighting until he finally gets a chance to disable autosleep?
>>
>> That's a good point, so I think it may be a good idea to do
>> mutex_lock_interruptible() in pm_autosleep_set_state() instead.
> 
> Now that I think of it, perhaps it's a good idea to just make
> pm_autosleep_lock() do mutex_lock_interruptible() _and_ make
> pm_autosleep_set_state() use pm_autosleep_lock().
> 
> What do you think?
> 


Well, I don't think mutex_lock_interruptible() would help us much..
Consider what would happen, if we use it:

* pm-suspend got initiated as part of autosleep. Acquired autosleep lock.
* Userspace is about to get frozen.
* Now, the user tries to write "off" to autosleep. And hence, he is waiting
  for autosleep lock, interruptibly.
* The freezer sent a fake signal to all userspace processes and hence
  this process also got interrupted.. it is no longer waiting on autosleep
  lock - it got the signal and returned, and got frozen.
  (And when the userspace gets thawed later, this process won't have the
   autosleep lock - which is a different (but yet another) problem).

So ultimately the only thing we achieved is to ensure that freezing of
userspace goes smoothly. But the user process could not succeed in
disabling autosleep. Of course we can work around that by having the
mutex_lock_interruptible() in a loop and so on, but that gets very
ugly pretty soon.

So, I would suggest the following solution:

We want to achieve 2 things here:
 a. A user process trying to write to /sys/power/state or
    /sys/power/autosleep should not cause freezing failures.
 b. When a user process writes "off" to autosleep, the suspend/hibernate
    attempt that is on-going, if any, must be immediately aborted, to give
    the user the feeling that his preference has been noticed and respected.

And to achieve this, we note that a user process can write "off" to autosleep
only until the userspace gets frozen. No chance after that.

So, let's do this:
1. Drop the autosleep lock before entering pm-suspend/hibernate.
2. This means, a user process can get hold of this lock and successfully
   disable autosleep a moment after we initiated suspend, but before userspace
   got frozen fully.
3. So, to respect the user's wish, we add a check immediately after the
   freezing of userspace is complete - we check if the user disabled autosleep
   and bail out, if he did. Otherwise, we continue and suspend the machine.

IOW, this is like hitting 2 birds with one stone ;-)
We don't hold autosleep lock throughout suspend/hibernate, but still react
instantly when the user disables autosleep. And of course, freezing of tasks
won't fail, ever! :-)

 
Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 4/7] Input / PM: Add ioctl to block suspend while event queue is not empty
  2012-02-21 23:34   ` [RFC][PATCH 4/7] Input / PM: Add ioctl to block suspend while event queue is not empty Rafael J. Wysocki
@ 2012-02-24  5:16     ` Matt Helsley
  2012-02-25  4:25       ` Arve Hjønnevåg
  2012-02-26 20:57       ` Rafael J. Wysocki
  0 siblings, 2 replies; 129+ messages in thread
From: Matt Helsley @ 2012-02-24  5:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

On Wed, Feb 22, 2012 at 12:34:58AM +0100, Rafael J. Wysocki wrote:
> From: Arve Hjønnevåg <arve@android.com>
> 
> Add a new ioctl, EVIOCSWAKEUPSRC, to attach a wakeup source object to
> an evdev client event queue, such that it will be active whenever the
> queue is not empty.  Then, all events in the queue will be regarded
> as wakeup events in progress and pm_get_wakeup_count() will block (or
> return false if woken up by a signal) until they are removed from the
> queue.  In consequence, if the checking of wakeup events is enabled
> (e.g. throught the /sys/power/wakeup_count interface), the system
> won't be able to go into a sleep state until the queue is empty.
> 
> This allows user space processes to handle situations in which they
> want to do a select() on an evdev descriptor, so they go to sleep
> until there are some events to read from the device's queue, and then
> they don't want the system to go into a sleep state until all the
> events are read (presumably for further processing).  Of course, if
> they don't want the system to go into a sleep state _after_ all the
> events have been read from the queue, they have to use a separate
> mechanism that will prevent the system from doing that and it has
> to be activated before reading the first event (that also may be the
> last one).

I haven't seen this idea mentioned before but I must admit I haven't
been following this thread too closely so apologies (and don't bother
rehashing) if it has:

Could you just add this to epoll so that any fd userspace chooses would be
capable of doing this without introducing potentially ecclectic ioctl
interfaces?

struct epoll_event ev;

epfd = epoll_create1(EPOLL_STAY_AWAKE_SET);
ev.data.ptr = foo;
epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);

Which could be useful because you can put one epollfd in another's epoll
set. Or maybe as an EPOLLKEEPAWAKE flag in the event struct sort of like
EPOLLET:

epfd = epoll_create1(0);
ev.events = EPOLLIN|EPOLLKEEPAWAKE;
epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);

> 
> [rjw: Removed unnecessary checks, changed the names of the new ioctls
>  and the names of the functions that add/remove wakeup source objects
>  to/from evdev clients, modified the changelog.
> Signed-off-by: Arve Hjønnevåg <arve@android.com>
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
>  drivers/input/evdev.c |   55 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/input.h |    3 ++
>  2 files changed, 58 insertions(+)
> 
> Index: linux/drivers/input/evdev.c
> ===================================================================
> --- linux.orig/drivers/input/evdev.c
> +++ linux/drivers/input/evdev.c
> @@ -43,6 +43,7 @@ struct evdev_client {
>  	unsigned int tail;
>  	unsigned int packet_head; /* [future] position of the first element of next packet */
>  	spinlock_t buffer_lock; /* protects access to buffer, head and tail */
> +	struct wakeup_source *wakeup_source;
>  	struct fasync_struct *fasync;
>  	struct evdev *evdev;
>  	struct list_head node;
> @@ -75,10 +76,12 @@ static void evdev_pass_event(struct evde
>  		client->buffer[client->tail].value = 0;
> 
>  		client->packet_head = client->tail;
> +		__pm_relax(client->wakeup_source);
>  	}
> 
>  	if (event->type == EV_SYN && event->code == SYN_REPORT) {
>  		client->packet_head = client->head;
> +		__pm_stay_awake(client->wakeup_source);
>  		kill_fasync(&client->fasync, SIGIO, POLL_IN);
>  	}
> 
> @@ -255,6 +258,8 @@ static int evdev_release(struct inode *i
>  	mutex_unlock(&evdev->mutex);
> 
>  	evdev_detach_client(evdev, client);
> +	wakeup_source_unregister(client->wakeup_source);
> +
>  	kfree(client);
> 
>  	evdev_close_device(evdev);
> @@ -373,6 +378,8 @@ static int evdev_fetch_next_event(struct
>  	if (have_event) {
>  		*event = client->buffer[client->tail++];
>  		client->tail &= client->bufsize - 1;
> +		if (client->packet_head == client->tail)
> +			__pm_relax(client->wakeup_source);
>  	}
> 
>  	spin_unlock_irq(&client->buffer_lock);
> @@ -623,6 +630,45 @@ static int evdev_handle_set_keycode_v2(s
>  	return input_set_keycode(dev, &ke);
>  }
> 
> +static int evdev_attach_wakeup_source(struct evdev *evdev,
> +				      struct evdev_client *client)
> +{
> +	struct wakeup_source *ws;
> +	char name[28];
> +
> +	if (client->wakeup_source)
> +		return 0;
> +
> +	snprintf(name, sizeof(name), "%s-%d",
> +		 dev_name(&evdev->dev), task_tgid_vnr(current));

This does not look like it will work well with tasks in different pid
namespaces. What should happen, I think, is the wakeup_source should hold a
reference to either the struct pid of current or current itself. Then
when someone reads the file you should get the pid vnr in the reader's
pid namespace. That way instead of a bogus pid vnr 0 would show up if
"current" here is not in the reader's pid namepsace.

Cheers,
	-Matt Helsley


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2
  2012-02-24  4:44               ` Srivatsa S. Bhat
@ 2012-02-24 23:21                 ` Rafael J. Wysocki
  2012-02-25  4:43                   ` Arve Hjønnevåg
  2012-02-25 19:20                   ` Srivatsa S. Bhat
  0 siblings, 2 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-24 23:21 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: John Stultz, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg,
	Brian Swetland, Neil Brown, Alan Stern, Dmitry Torokhov

On Friday, February 24, 2012, Srivatsa S. Bhat wrote:
> On 02/24/2012 03:02 AM, Rafael J. Wysocki wrote:
> 
> > On Thursday, February 23, 2012, Rafael J. Wysocki wrote:
> >> On Thursday, February 23, 2012, Srivatsa S. Bhat wrote:
> >>> On 02/23/2012 03:40 AM, Rafael J. Wysocki wrote:
[...]
> >>>
> >>> By the way, I am just curious.. how difficult will this make it for userspace
> >>> to disable autosleep? I mean, would a trylock mean that the user has to keep
> >>> fighting until he finally gets a chance to disable autosleep?
> >>
> >> That's a good point, so I think it may be a good idea to do
> >> mutex_lock_interruptible() in pm_autosleep_set_state() instead.
> > 
> > Now that I think of it, perhaps it's a good idea to just make
> > pm_autosleep_lock() do mutex_lock_interruptible() _and_ make
> > pm_autosleep_set_state() use pm_autosleep_lock().
> > 
> > What do you think?
> > 
> 
> 
> Well, I don't think mutex_lock_interruptible() would help us much..
> Consider what would happen, if we use it:
> 
> * pm-suspend got initiated as part of autosleep. Acquired autosleep lock.
> * Userspace is about to get frozen.
> * Now, the user tries to write "off" to autosleep. And hence, he is waiting
>   for autosleep lock, interruptibly.
> * The freezer sent a fake signal to all userspace processes and hence
>   this process also got interrupted.. it is no longer waiting on autosleep
>   lock - it got the signal and returned, and got frozen.
>   (And when the userspace gets thawed later, this process won't have the
>    autosleep lock - which is a different (but yet another) problem).
> 
> So ultimately the only thing we achieved is to ensure that freezing of
> userspace goes smoothly. But the user process could not succeed in
> disabling autosleep. Of course we can work around that by having the
> mutex_lock_interruptible() in a loop and so on, but that gets very
> ugly pretty soon.
> 
> So, I would suggest the following solution:
> 
> We want to achieve 2 things here:
>  a. A user process trying to write to /sys/power/state or
>     /sys/power/autosleep should not cause freezing failures.
>  b. When a user process writes "off" to autosleep, the suspend/hibernate
>     attempt that is on-going, if any, must be immediately aborted, to give
>     the user the feeling that his preference has been noticed and respected.
> 
> And to achieve this, we note that a user process can write "off" to autosleep
> only until the userspace gets frozen. No chance after that.
> 
> So, let's do this:
> 1. Drop the autosleep lock before entering pm-suspend/hibernate.
> 2. This means, a user process can get hold of this lock and successfully
>    disable autosleep a moment after we initiated suspend, but before userspace
>    got frozen fully.
> 3. So, to respect the user's wish, we add a check immediately after the
>    freezing of userspace is complete - we check if the user disabled autosleep
>    and bail out, if he did. Otherwise, we continue and suspend the machine.
> 
> IOW, this is like hitting 2 birds with one stone ;-)
> We don't hold autosleep lock throughout suspend/hibernate, but still react
> instantly when the user disables autosleep. And of course, freezing of tasks
> won't fail, ever! :-)

Well, you essentially are postulating to restore the "interface" wakeup source
that was present in the previous version of this patch and that I dropped in
order to simplify the code.

I guess I can do that ...

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 4/7] Input / PM: Add ioctl to block suspend while event queue is not empty
  2012-02-24  5:16     ` Matt Helsley
@ 2012-02-25  4:25       ` Arve Hjønnevåg
  2012-02-25 23:33         ` Rafael J. Wysocki
  2012-02-28  0:19         ` Matt Helsley
  2012-02-26 20:57       ` Rafael J. Wysocki
  1 sibling, 2 replies; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-02-25  4:25 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Rafael J. Wysocki, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

On Thu, Feb 23, 2012 at 9:16 PM, Matt Helsley <matthltc@us.ibm.com> wrote:
> On Wed, Feb 22, 2012 at 12:34:58AM +0100, Rafael J. Wysocki wrote:
>> From: Arve Hjønnevåg <arve@android.com>
>>
>> Add a new ioctl, EVIOCSWAKEUPSRC, to attach a wakeup source object to
>> an evdev client event queue, such that it will be active whenever the
>> queue is not empty.  Then, all events in the queue will be regarded
>> as wakeup events in progress and pm_get_wakeup_count() will block (or
>> return false if woken up by a signal) until they are removed from the
>> queue.  In consequence, if the checking of wakeup events is enabled
>> (e.g. throught the /sys/power/wakeup_count interface), the system
>> won't be able to go into a sleep state until the queue is empty.
>>
>> This allows user space processes to handle situations in which they
>> want to do a select() on an evdev descriptor, so they go to sleep
>> until there are some events to read from the device's queue, and then
>> they don't want the system to go into a sleep state until all the
>> events are read (presumably for further processing).  Of course, if
>> they don't want the system to go into a sleep state _after_ all the
>> events have been read from the queue, they have to use a separate
>> mechanism that will prevent the system from doing that and it has
>> to be activated before reading the first event (that also may be the
>> last one).
>
> I haven't seen this idea mentioned before but I must admit I haven't
> been following this thread too closely so apologies (and don't bother
> rehashing) if it has:
>
> Could you just add this to epoll so that any fd userspace chooses would be
> capable of doing this without introducing potentially ecclectic ioctl
> interfaces?
>

This is an interesting idea, but I'm not sure how well it would work.

I looked at the epoll code and it looks like it is possible to
activate the wakeup-source from the wait queue function it uses. The
epoll callback will happen without holding evdev client buffer_lock,
so the wakeup-source and buffer state will not always be in sync (this
may be OK, but require more thought). This callback is also called if
no data was added to the queue we are polling on because another
client has grabbed the input device (is this a bug or intended?).

There is no call into the epoll code when input queue is emptied, so
we can't deactivate the wakeup-source until epoll_wait is called
again. This also should be workable, but result in different stats.

It does not look like the normal poll and select interfaces can be
extended the same way (since they remove themselves from the
wait-queue before returning to user-space), so user-space has to be
changed to use epoll even if select or poll would be a better fit.

I don't know how many other drivers this would work for. The input
driver will wake up user-space from the same thread or interrupt
handler that queued the event, but other drivers may defer this to
another thread which makes an epoll wakeup-source insufficient.

...
>> +     snprintf(name, sizeof(name), "%s-%d",
>> +              dev_name(&evdev->dev), task_tgid_vnr(current));
>
> This does not look like it will work well with tasks in different pid
> namespaces. What should happen, I think, is the wakeup_source should hold a
> reference to either the struct pid of current or current itself. Then
> when someone reads the file you should get the pid vnr in the reader's
> pid namespace. That way instead of a bogus pid vnr 0 would show up if
> "current" here is not in the reader's pid namepsace.
>

The pid here is only used for debugging purposes, and used less than
the dev_name. I don't think tracking pid namespaces is worth the
trouble here, so if this is a real problem we can just drop the pid
from the name for now.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2
  2012-02-24 23:21                 ` Rafael J. Wysocki
@ 2012-02-25  4:43                   ` Arve Hjønnevåg
  2012-02-25 20:43                     ` Rafael J. Wysocki
  2012-02-25 19:20                   ` Srivatsa S. Bhat
  1 sibling, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-02-25  4:43 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Srivatsa S. Bhat, John Stultz, Linux PM list, LKML, Magnus Damm,
	markgross, Matthew Garrett, Greg KH, Brian Swetland, Neil Brown,
	Alan Stern, Dmitry Torokhov

On Fri, Feb 24, 2012 at 3:21 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Friday, February 24, 2012, Srivatsa S. Bhat wrote:
>> On 02/24/2012 03:02 AM, Rafael J. Wysocki wrote:
>>
>> > On Thursday, February 23, 2012, Rafael J. Wysocki wrote:
>> >> On Thursday, February 23, 2012, Srivatsa S. Bhat wrote:
>> >>> On 02/23/2012 03:40 AM, Rafael J. Wysocki wrote:
> [...]
>> >>>
>> >>> By the way, I am just curious.. how difficult will this make it for userspace
>> >>> to disable autosleep? I mean, would a trylock mean that the user has to keep
>> >>> fighting until he finally gets a chance to disable autosleep?
>> >>
>> >> That's a good point, so I think it may be a good idea to do
>> >> mutex_lock_interruptible() in pm_autosleep_set_state() instead.
>> >
>> > Now that I think of it, perhaps it's a good idea to just make
>> > pm_autosleep_lock() do mutex_lock_interruptible() _and_ make
>> > pm_autosleep_set_state() use pm_autosleep_lock().
>> >
>> > What do you think?
>> >
>>
>>
>> Well, I don't think mutex_lock_interruptible() would help us much..
>> Consider what would happen, if we use it:
>>
>> * pm-suspend got initiated as part of autosleep. Acquired autosleep lock.
>> * Userspace is about to get frozen.
>> * Now, the user tries to write "off" to autosleep. And hence, he is waiting
>>   for autosleep lock, interruptibly.
>> * The freezer sent a fake signal to all userspace processes and hence
>>   this process also got interrupted.. it is no longer waiting on autosleep
>>   lock - it got the signal and returned, and got frozen.
>>   (And when the userspace gets thawed later, this process won't have the
>>    autosleep lock - which is a different (but yet another) problem).
>>
>> So ultimately the only thing we achieved is to ensure that freezing of
>> userspace goes smoothly. But the user process could not succeed in
>> disabling autosleep. Of course we can work around that by having the
>> mutex_lock_interruptible() in a loop and so on, but that gets very
>> ugly pretty soon.
>>
>> So, I would suggest the following solution:
>>
>> We want to achieve 2 things here:
>>  a. A user process trying to write to /sys/power/state or
>>     /sys/power/autosleep should not cause freezing failures.
>>  b. When a user process writes "off" to autosleep, the suspend/hibernate
>>     attempt that is on-going, if any, must be immediately aborted, to give
>>     the user the feeling that his preference has been noticed and respected.
>>
>> And to achieve this, we note that a user process can write "off" to autosleep
>> only until the userspace gets frozen. No chance after that.
>>
>> So, let's do this:
>> 1. Drop the autosleep lock before entering pm-suspend/hibernate.
>> 2. This means, a user process can get hold of this lock and successfully
>>    disable autosleep a moment after we initiated suspend, but before userspace
>>    got frozen fully.
>> 3. So, to respect the user's wish, we add a check immediately after the
>>    freezing of userspace is complete - we check if the user disabled autosleep
>>    and bail out, if he did. Otherwise, we continue and suspend the machine.
>>
>> IOW, this is like hitting 2 birds with one stone ;-)
>> We don't hold autosleep lock throughout suspend/hibernate, but still react
>> instantly when the user disables autosleep. And of course, freezing of tasks
>> won't fail, ever! :-)
>
> Well, you essentially are postulating to restore the "interface" wakeup source
> that was present in the previous version of this patch and that I dropped in
> order to simplify the code.
>
> I guess I can do that ...
>

If this wakeup source is reported as active whenever user-space has
not requested suspend that would be useful in the stats. It does not
look like your original patch did this however, but you could have a
main wakeup-source that you release when any form of suspend is
requested and activate when turning off auto suspend or returning from
a one-shot suspend operation.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2
  2012-02-24 23:21                 ` Rafael J. Wysocki
  2012-02-25  4:43                   ` Arve Hjønnevåg
@ 2012-02-25 19:20                   ` Srivatsa S. Bhat
  2012-02-25 21:01                     ` Rafael J. Wysocki
  1 sibling, 1 reply; 129+ messages in thread
From: Srivatsa S. Bhat @ 2012-02-25 19:20 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: John Stultz, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg,
	Brian Swetland, Neil Brown, Alan Stern, Dmitry Torokhov

On 02/25/2012 04:51 AM, Rafael J. Wysocki wrote:

> On Friday, February 24, 2012, Srivatsa S. Bhat wrote:
>> On 02/24/2012 03:02 AM, Rafael J. Wysocki wrote:
>>
>>> On Thursday, February 23, 2012, Rafael J. Wysocki wrote:
>>>> On Thursday, February 23, 2012, Srivatsa S. Bhat wrote:
>>>>> On 02/23/2012 03:40 AM, Rafael J. Wysocki wrote:
> [...]
>>>>>
>>>>> By the way, I am just curious.. how difficult will this make it for userspace
>>>>> to disable autosleep? I mean, would a trylock mean that the user has to keep
>>>>> fighting until he finally gets a chance to disable autosleep?
>>>>
>>>> That's a good point, so I think it may be a good idea to do
>>>> mutex_lock_interruptible() in pm_autosleep_set_state() instead.
>>>
>>> Now that I think of it, perhaps it's a good idea to just make
>>> pm_autosleep_lock() do mutex_lock_interruptible() _and_ make
>>> pm_autosleep_set_state() use pm_autosleep_lock().
>>>
>>> What do you think?
>>>
>>
>>
>> Well, I don't think mutex_lock_interruptible() would help us much..
>> Consider what would happen, if we use it:
>>
>> * pm-suspend got initiated as part of autosleep. Acquired autosleep lock.
>> * Userspace is about to get frozen.
>> * Now, the user tries to write "off" to autosleep. And hence, he is waiting
>>   for autosleep lock, interruptibly.
>> * The freezer sent a fake signal to all userspace processes and hence
>>   this process also got interrupted.. it is no longer waiting on autosleep
>>   lock - it got the signal and returned, and got frozen.
>>   (And when the userspace gets thawed later, this process won't have the
>>    autosleep lock - which is a different (but yet another) problem).
>>
>> So ultimately the only thing we achieved is to ensure that freezing of
>> userspace goes smoothly. But the user process could not succeed in
>> disabling autosleep. Of course we can work around that by having the
>> mutex_lock_interruptible() in a loop and so on, but that gets very
>> ugly pretty soon.
>>
>> So, I would suggest the following solution:
>>
>> We want to achieve 2 things here:
>>  a. A user process trying to write to /sys/power/state or
>>     /sys/power/autosleep should not cause freezing failures.
>>  b. When a user process writes "off" to autosleep, the suspend/hibernate
>>     attempt that is on-going, if any, must be immediately aborted, to give
>>     the user the feeling that his preference has been noticed and respected.
>>
>> And to achieve this, we note that a user process can write "off" to autosleep
>> only until the userspace gets frozen. No chance after that.
>>
>> So, let's do this:
>> 1. Drop the autosleep lock before entering pm-suspend/hibernate.
>> 2. This means, a user process can get hold of this lock and successfully
>>    disable autosleep a moment after we initiated suspend, but before userspace
>>    got frozen fully.
>> 3. So, to respect the user's wish, we add a check immediately after the
>>    freezing of userspace is complete - we check if the user disabled autosleep
>>    and bail out, if he did. Otherwise, we continue and suspend the machine.
>>
>> IOW, this is like hitting 2 birds with one stone ;-)
>> We don't hold autosleep lock throughout suspend/hibernate, but still react
>> instantly when the user disables autosleep. And of course, freezing of tasks
>> won't fail, ever! :-)
> 
> Well, you essentially are postulating to restore the "interface" wakeup source
> that was present in the previous version of this patch and that I dropped in
> order to simplify the code.
> 


Oh is it? I guess I haven't followed this thread very closely...

> I guess I can do that ...
> 


Oh by the way, this scheme doesn't solve all problems. It might be effective
in reacting "instantly" to a request by the user to *switch off* autosleep.
But say, when the user wants to switch to suspend instead of hibernate as the
autosleep preference, for example, I don't think it would be as quick in
responding... (I mean, it might do the old operation one more time before
switching to the new one..)

But I guess at this point it might be wiser to say "sigh.. we can do only so
much..." instead of complicating the code too much in an attempt to meet
everybody's expectations :-)

Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2
  2012-02-25  4:43                   ` Arve Hjønnevåg
@ 2012-02-25 20:43                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-25 20:43 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Srivatsa S. Bhat, John Stultz, Linux PM list, LKML, Magnus Damm,
	markgross, Matthew Garrett, Greg KH, Brian Swetland, Neil Brown,
	Alan Stern, Dmitry Torokhov

On Saturday, February 25, 2012, Arve Hjønnevåg wrote:
> On Fri, Feb 24, 2012 at 3:21 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Friday, February 24, 2012, Srivatsa S. Bhat wrote:
> >> On 02/24/2012 03:02 AM, Rafael J. Wysocki wrote:
> >>
> >> > On Thursday, February 23, 2012, Rafael J. Wysocki wrote:
> >> >> On Thursday, February 23, 2012, Srivatsa S. Bhat wrote:
> >> >>> On 02/23/2012 03:40 AM, Rafael J. Wysocki wrote:
> > [...]
> >> >>>
> >> >>> By the way, I am just curious.. how difficult will this make it for userspace
> >> >>> to disable autosleep? I mean, would a trylock mean that the user has to keep
> >> >>> fighting until he finally gets a chance to disable autosleep?
> >> >>
> >> >> That's a good point, so I think it may be a good idea to do
> >> >> mutex_lock_interruptible() in pm_autosleep_set_state() instead.
> >> >
> >> > Now that I think of it, perhaps it's a good idea to just make
> >> > pm_autosleep_lock() do mutex_lock_interruptible() _and_ make
> >> > pm_autosleep_set_state() use pm_autosleep_lock().
> >> >
> >> > What do you think?
> >> >
> >>
> >>
> >> Well, I don't think mutex_lock_interruptible() would help us much..
> >> Consider what would happen, if we use it:
> >>
> >> * pm-suspend got initiated as part of autosleep. Acquired autosleep lock.
> >> * Userspace is about to get frozen.
> >> * Now, the user tries to write "off" to autosleep. And hence, he is waiting
> >>   for autosleep lock, interruptibly.
> >> * The freezer sent a fake signal to all userspace processes and hence
> >>   this process also got interrupted.. it is no longer waiting on autosleep
> >>   lock - it got the signal and returned, and got frozen.
> >>   (And when the userspace gets thawed later, this process won't have the
> >>    autosleep lock - which is a different (but yet another) problem).
> >>
> >> So ultimately the only thing we achieved is to ensure that freezing of
> >> userspace goes smoothly. But the user process could not succeed in
> >> disabling autosleep. Of course we can work around that by having the
> >> mutex_lock_interruptible() in a loop and so on, but that gets very
> >> ugly pretty soon.
> >>
> >> So, I would suggest the following solution:
> >>
> >> We want to achieve 2 things here:
> >>  a. A user process trying to write to /sys/power/state or
> >>     /sys/power/autosleep should not cause freezing failures.
> >>  b. When a user process writes "off" to autosleep, the suspend/hibernate
> >>     attempt that is on-going, if any, must be immediately aborted, to give
> >>     the user the feeling that his preference has been noticed and respected.
> >>
> >> And to achieve this, we note that a user process can write "off" to autosleep
> >> only until the userspace gets frozen. No chance after that.
> >>
> >> So, let's do this:
> >> 1. Drop the autosleep lock before entering pm-suspend/hibernate.
> >> 2. This means, a user process can get hold of this lock and successfully
> >>    disable autosleep a moment after we initiated suspend, but before userspace
> >>    got frozen fully.
> >> 3. So, to respect the user's wish, we add a check immediately after the
> >>    freezing of userspace is complete - we check if the user disabled autosleep
> >>    and bail out, if he did. Otherwise, we continue and suspend the machine.
> >>
> >> IOW, this is like hitting 2 birds with one stone ;-)
> >> We don't hold autosleep lock throughout suspend/hibernate, but still react
> >> instantly when the user disables autosleep. And of course, freezing of tasks
> >> won't fail, ever! :-)
> >
> > Well, you essentially are postulating to restore the "interface" wakeup source
> > that was present in the previous version of this patch and that I dropped in
> > order to simplify the code.
> >
> > I guess I can do that ...
> >
> 
> If this wakeup source is reported as active whenever user-space has
> not requested suspend that would be useful in the stats. It does not
> look like your original patch did this however,

No, it didn't.

> but you could have a
> main wakeup-source that you release when any form of suspend is
> requested and activate when turning off auto suspend or returning from
> a one-shot suspend operation.

I honestly don't think I can do that and handle the /sys/power/wakeup_count
 -> /sys/power/state handoff (which is used by OLPC, as we've learnt recently)
sanely at the same time.  OTOH, I don't want CONFIG_AUTOSLEEP to disable that
interface entirely, because things like that basically prevent people from
trying alternative features, which is essential to us for "interesting
feedback" reasons.

So, my "main" wakeup source is only going to register the number of times user
space has (successfully) written to /sysp/power/autosleep (please have a look
at the updated patch I'm going to send in a reply to Srivatsa in a little
while).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2
  2012-02-25 19:20                   ` Srivatsa S. Bhat
@ 2012-02-25 21:01                     ` Rafael J. Wysocki
  2012-02-28 10:24                       ` Srivatsa S. Bhat
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-25 21:01 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: John Stultz, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg,
	Brian Swetland, Neil Brown, Alan Stern, Dmitry Torokhov

On Saturday, February 25, 2012, Srivatsa S. Bhat wrote:
> On 02/25/2012 04:51 AM, Rafael J. Wysocki wrote:
> 
> > On Friday, February 24, 2012, Srivatsa S. Bhat wrote:
> >> On 02/24/2012 03:02 AM, Rafael J. Wysocki wrote:
> >>
> >>> On Thursday, February 23, 2012, Rafael J. Wysocki wrote:
> >>>> On Thursday, February 23, 2012, Srivatsa S. Bhat wrote:
> >>>>> On 02/23/2012 03:40 AM, Rafael J. Wysocki wrote:
> > [...]
> >>>>>
> >>>>> By the way, I am just curious.. how difficult will this make it for userspace
> >>>>> to disable autosleep? I mean, would a trylock mean that the user has to keep
> >>>>> fighting until he finally gets a chance to disable autosleep?
> >>>>
> >>>> That's a good point, so I think it may be a good idea to do
> >>>> mutex_lock_interruptible() in pm_autosleep_set_state() instead.
> >>>
> >>> Now that I think of it, perhaps it's a good idea to just make
> >>> pm_autosleep_lock() do mutex_lock_interruptible() _and_ make
> >>> pm_autosleep_set_state() use pm_autosleep_lock().
> >>>
> >>> What do you think?
> >>>
> >>
> >>
> >> Well, I don't think mutex_lock_interruptible() would help us much..
> >> Consider what would happen, if we use it:
> >>
> >> * pm-suspend got initiated as part of autosleep. Acquired autosleep lock.
> >> * Userspace is about to get frozen.
> >> * Now, the user tries to write "off" to autosleep. And hence, he is waiting
> >>   for autosleep lock, interruptibly.
> >> * The freezer sent a fake signal to all userspace processes and hence
> >>   this process also got interrupted.. it is no longer waiting on autosleep
> >>   lock - it got the signal and returned, and got frozen.
> >>   (And when the userspace gets thawed later, this process won't have the
> >>    autosleep lock - which is a different (but yet another) problem).
> >>
> >> So ultimately the only thing we achieved is to ensure that freezing of
> >> userspace goes smoothly. But the user process could not succeed in
> >> disabling autosleep. Of course we can work around that by having the
> >> mutex_lock_interruptible() in a loop and so on, but that gets very
> >> ugly pretty soon.
> >>
> >> So, I would suggest the following solution:
> >>
> >> We want to achieve 2 things here:
> >>  a. A user process trying to write to /sys/power/state or
> >>     /sys/power/autosleep should not cause freezing failures.
> >>  b. When a user process writes "off" to autosleep, the suspend/hibernate
> >>     attempt that is on-going, if any, must be immediately aborted, to give
> >>     the user the feeling that his preference has been noticed and respected.
> >>
> >> And to achieve this, we note that a user process can write "off" to autosleep
> >> only until the userspace gets frozen. No chance after that.
> >>
> >> So, let's do this:
> >> 1. Drop the autosleep lock before entering pm-suspend/hibernate.
> >> 2. This means, a user process can get hold of this lock and successfully
> >>    disable autosleep a moment after we initiated suspend, but before userspace
> >>    got frozen fully.
> >> 3. So, to respect the user's wish, we add a check immediately after the
> >>    freezing of userspace is complete - we check if the user disabled autosleep
> >>    and bail out, if he did. Otherwise, we continue and suspend the machine.
> >>
> >> IOW, this is like hitting 2 birds with one stone ;-)
> >> We don't hold autosleep lock throughout suspend/hibernate, but still react
> >> instantly when the user disables autosleep. And of course, freezing of tasks
> >> won't fail, ever! :-)
> > 
> > Well, you essentially are postulating to restore the "interface" wakeup source
> > that was present in the previous version of this patch and that I dropped in
> > order to simplify the code.
> > 
> 
> 
> Oh is it? I guess I haven't followed this thread very closely...
> 
> > I guess I can do that ...
> > 
> 
> 
> Oh by the way, this scheme doesn't solve all problems. It might be effective
> in reacting "instantly" to a request by the user to *switch off* autosleep.
> But say, when the user wants to switch to suspend instead of hibernate as the
> autosleep preference, for example, I don't think it would be as quick in
> responding... (I mean, it might do the old operation one more time before
> switching to the new one..)
> 
> But I guess at this point it might be wiser to say "sigh.. we can do only so
> much..." instead of complicating the code too much in an attempt to meet
> everybody's expectations :-)

I think we can do something like in the updated patch [5/7] below.

It uses a special wakeup source object called "autosleep" to bump up the
number of wakeup events in progress before acquiring autosleep_lock in
pm_autosleep_set_state().  This way, either pm_autosleep_set_state() will
acquire autosleep_lock before try_to_suspend(), in which case the latter
will see the change of autosleep_state immediately (after autosleep_lock has
been passed to it), or try_to_suspend() will get it first, but then
pm_save_wakeup_count() or pm_suspend()/hibernate() will see the nonzero counter
of wakeup events in progress and return error code (sooner or later).

The drawback is that writes to /sys/power/autosleep may interfere with
the /sys/power/wakeup_count + /sys/power/state interface by interrupting
transitions started by writing to /sys/power/state, for example (although
I think that's highly unlikely).

Additionally, I made pm_autosleep_lock() use mutex_trylock_interruptible()
to prevent operations on /sys/power/wakeup_count and/or /sys/power/state
from failing the freezing of tasks started by try_to_suspend().

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM / Sleep: Implement opportunistic sleep

Introduce a mechanism by which the kernel can trigger global
transitions to a sleep state chosen by user space if there are no
active wakeup sources.

It consists of a new sysfs attribute, /sys/power/autosleep, that
can be written one of the strings returned by reads from
/sys/power/state, an ordered workqueue and a work item carrying out
the "suspend" operations.  If a string representing the system's
sleep state is written to /sys/power/autosleep, the work item
triggering transitions to that state is queued up and it requeues
itself after every execution until user space writes "off" to
/sys/power/autosleep.

That work item enables the detection of wakeup events using the
functions already defined in drivers/base/power/wakeup.c (with one
small modification) and calls either pm_suspend(), or hibernate() to
put the system into a sleep state.  If a wakeup event is reported
while the transition is in progress, it will abort the transition and
the "system suspend" work item will be queued up again.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/ABI/testing/sysfs-power |   17 ++++
 drivers/base/power/wakeup.c           |   38 ++++++-----
 include/linux/suspend.h               |   13 +++
 kernel/power/Kconfig                  |    8 ++
 kernel/power/Makefile                 |    1 
 kernel/power/autosleep.c              |  113 ++++++++++++++++++++++++++++++++
 kernel/power/main.c                   |  117 ++++++++++++++++++++++++++++------
 kernel/power/power.h                  |   18 +++++
 8 files changed, 290 insertions(+), 35 deletions(-)

Index: linux/kernel/power/Makefile
===================================================================
--- linux.orig/kernel/power/Makefile
+++ linux/kernel/power/Makefile
@@ -9,5 +9,6 @@ obj-$(CONFIG_SUSPEND)		+= suspend.o
 obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend_test.o
 obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
 				   block_io.o
+obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
 
 obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
Index: linux/kernel/power/Kconfig
===================================================================
--- linux.orig/kernel/power/Kconfig
+++ linux/kernel/power/Kconfig
@@ -103,6 +103,14 @@ config PM_SLEEP_SMP
 	select HOTPLUG
 	select HOTPLUG_CPU
 
+config PM_AUTOSLEEP
+	bool "Opportunistic sleep"
+	depends on PM_SLEEP
+	default n
+	---help---
+	Allow the kernel to trigger a system transition into a global sleep
+	state automatically whenever there are no active wakeup sources.
+
 config PM_RUNTIME
 	bool "Run-time PM core functionality"
 	depends on !IA64_HP_SIM
Index: linux/kernel/power/power.h
===================================================================
--- linux.orig/kernel/power/power.h
+++ linux/kernel/power/power.h
@@ -264,3 +264,21 @@ static inline void suspend_thaw_processe
 {
 }
 #endif
+
+#ifdef CONFIG_PM_AUTOSLEEP
+
+/* kernel/power/autosleep.c */
+extern int pm_autosleep_init(void);
+extern int pm_autosleep_lock(void);
+extern void pm_autosleep_unlock(void);
+extern suspend_state_t pm_autosleep_state(void);
+extern int pm_autosleep_set_state(suspend_state_t state);
+
+#else /* !CONFIG_PM_AUTOSLEEP */
+
+static inline int pm_autosleep_init(void) { return 0; }
+static inline int pm_autosleep_lock(void) { return 0; }
+static inline void pm_autosleep_unlock(void) {}
+static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
+
+#endif /* !CONFIG_PM_AUTOSLEEP */
Index: linux/include/linux/suspend.h
===================================================================
--- linux.orig/include/linux/suspend.h
+++ linux/include/linux/suspend.h
@@ -356,7 +356,7 @@ extern int unregister_pm_notifier(struct
 extern bool events_check_enabled;
 
 extern bool pm_wakeup_pending(void);
-extern bool pm_get_wakeup_count(unsigned int *count);
+extern bool pm_get_wakeup_count(unsigned int *count, bool block);
 extern bool pm_save_wakeup_count(unsigned int count);
 
 static inline void lock_system_sleep(void)
@@ -407,6 +407,17 @@ static inline void unlock_system_sleep(v
 
 #endif /* !CONFIG_PM_SLEEP */
 
+#ifdef CONFIG_PM_AUTOSLEEP
+
+/* kernel/power/autosleep.c */
+void queue_up_suspend_work(void);
+
+#else /* !CONFIG_PM_AUTOSLEEP */
+
+static inline void queue_up_suspend_work(void) {}
+
+#endif /* !CONFIG_PM_AUTOSLEEP */
+
 #ifdef CONFIG_ARCH_SAVE_PAGE_KEYS
 /*
  * The ARCH_SAVE_PAGE_KEYS functions can be used by an architecture
Index: linux/kernel/power/autosleep.c
===================================================================
--- /dev/null
+++ linux/kernel/power/autosleep.c
@@ -0,0 +1,113 @@
+/*
+ * kernel/power/autosleep.c
+ *
+ * Opportunistic sleep support.
+ *
+ * Copyright (C) 2012 Rafael J. Wysocki <rjw@sisk.pl>
+ */
+
+#include <linux/device.h>
+#include <linux/mutex.h>
+#include <linux/pm_wakeup.h>
+
+#include "power.h"
+
+static suspend_state_t autosleep_state;
+static struct workqueue_struct *autosleep_wq;
+static DEFINE_MUTEX(autosleep_lock);
+static struct wakeup_source *autosleep_ws;
+
+static void try_to_suspend(struct work_struct *work)
+{
+	unsigned int initial_count, final_count;
+
+	if (!pm_get_wakeup_count(&initial_count, true))
+		goto out;
+
+	mutex_lock(&autosleep_lock);
+
+	if (!pm_save_wakeup_count(initial_count)) {
+		mutex_unlock(&autosleep_lock);
+		goto out;
+	}
+
+	if (autosleep_state == PM_SUSPEND_ON) {
+		mutex_unlock(&autosleep_lock);
+		return;
+	}
+	if (autosleep_state >= PM_SUSPEND_MAX)
+		hibernate();
+	else
+		pm_suspend(autosleep_state);
+
+	mutex_unlock(&autosleep_lock);
+
+	if (!pm_get_wakeup_count(&final_count, false))
+		goto out;
+
+	if (final_count == initial_count)
+		schedule_timeout(HZ / 2);
+
+ out:
+	queue_up_suspend_work();
+}
+
+static DECLARE_WORK(suspend_work, try_to_suspend);
+
+void queue_up_suspend_work(void)
+{
+	if (!work_pending(&suspend_work) && autosleep_state > PM_SUSPEND_ON)
+		queue_work(autosleep_wq, &suspend_work);
+}
+
+suspend_state_t pm_autosleep_state(void)
+{
+	return autosleep_state;
+}
+
+int pm_autosleep_lock(void)
+{
+	return mutex_lock_interruptible(&autosleep_lock);
+}
+
+void pm_autosleep_unlock(void)
+{
+	mutex_unlock(&autosleep_lock);
+}
+
+int pm_autosleep_set_state(suspend_state_t state)
+{
+
+#ifndef CONFIG_HIBERNATION
+	if (state >= PM_SUSPEND_MAX)
+		return -EINVAL;
+#endif
+
+	__pm_stay_awake(autosleep_ws);
+
+	mutex_lock(&autosleep_lock);
+
+	autosleep_state = state;
+
+	__pm_relax(autosleep_ws);
+
+	if (state > PM_SUSPEND_ON)
+		queue_up_suspend_work();
+
+	mutex_unlock(&autosleep_lock);
+	return 0;
+}
+
+int __init pm_autosleep_init(void)
+{
+	autosleep_ws = wakeup_source_register("autosleep");
+	if (!autosleep_ws)
+		return -ENOMEM;
+
+	autosleep_wq = alloc_ordered_workqueue("autosleep", 0);
+	if (autosleep_wq)
+		return 0;
+
+	wakeup_source_unregister(autosleep_ws);
+	return -ENOMEM;
+}
Index: linux/kernel/power/main.c
===================================================================
--- linux.orig/kernel/power/main.c
+++ linux/kernel/power/main.c
@@ -269,8 +269,7 @@ static ssize_t state_show(struct kobject
 	return (s - buf);
 }
 
-static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
-			   const char *buf, size_t n)
+static suspend_state_t decode_state(const char *buf, size_t n)
 {
 #ifdef CONFIG_SUSPEND
 	suspend_state_t state = PM_SUSPEND_STANDBY;
@@ -278,27 +277,48 @@ static ssize_t state_store(struct kobjec
 #endif
 	char *p;
 	int len;
-	int error = -EINVAL;
 
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
-	/* First, check if we are requested to hibernate */
-	if (len == 4 && !strncmp(buf, "disk", len)) {
-		error = hibernate();
-		goto Exit;
-	}
+	/* Check hibernation first. */
+	if (len == 4 && !strncmp(buf, "disk", len))
+		return PM_SUSPEND_MAX;
 
 #ifdef CONFIG_SUSPEND
-	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
-		if (*s && len == strlen(*s) && !strncmp(buf, *s, len)) {
-			error = pm_suspend(state);
-			break;
-		}
-	}
+	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++)
+		if (*s && len == strlen(*s) && !strncmp(buf, *s, len))
+			return state;
 #endif
 
- Exit:
+	return PM_SUSPEND_ON;
+}
+
+static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
+			   const char *buf, size_t n)
+{
+	suspend_state_t state;
+	int error;
+
+	error = pm_autosleep_lock();
+	if (error)
+		return error;
+
+	if (pm_autosleep_state() > PM_SUSPEND_ON) {
+		error = -EBUSY;
+		goto out;
+	}
+
+	state = decode_state(buf, n);
+	if (state < PM_SUSPEND_MAX)
+		error = pm_suspend(state);
+	else if (state > PM_SUSPEND_ON)
+		error = hibernate();
+	else
+		error = -EINVAL;
+
+ out:
+	pm_autosleep_unlock();
 	return error ? error : n;
 }
 
@@ -339,7 +359,8 @@ static ssize_t wakeup_count_show(struct
 {
 	unsigned int val;
 
-	return pm_get_wakeup_count(&val) ? sprintf(buf, "%u\n", val) : -EINTR;
+	return pm_get_wakeup_count(&val, true) ?
+		sprintf(buf, "%u\n", val) : -EINTR;
 }
 
 static ssize_t wakeup_count_store(struct kobject *kobj,
@@ -347,15 +368,69 @@ static ssize_t wakeup_count_store(struct
 				const char *buf, size_t n)
 {
 	unsigned int val;
+	int error;
+
+	error = pm_autosleep_lock();
+	if (error)
+		return error;
+
+	if (pm_autosleep_state() > PM_SUSPEND_ON) {
+		error = -EBUSY;
+		goto out;
+	}
 
 	if (sscanf(buf, "%u", &val) == 1) {
 		if (pm_save_wakeup_count(val))
 			return n;
 	}
-	return -EINVAL;
+	error = -EINVAL;
+
+ out:
+	pm_autosleep_unlock();
+	return error;
 }
 
 power_attr(wakeup_count);
+
+#ifdef CONFIG_PM_AUTOSLEEP
+static ssize_t autosleep_show(struct kobject *kobj,
+			      struct kobj_attribute *attr,
+			      char *buf)
+{
+	suspend_state_t state = pm_autosleep_state();
+
+	if (state == PM_SUSPEND_ON)
+		return sprintf(buf, "off\n");
+
+#ifdef CONFIG_SUSPEND
+	if (state < PM_SUSPEND_MAX)
+		return sprintf(buf, "%s\n", valid_state(state) ?
+						pm_states[state] : "error");
+#endif
+#ifdef CONFIG_HIBERNATION
+	return sprintf(buf, "disk\n");
+#else
+	return sprintf(buf, "error");
+#endif
+}
+
+static ssize_t autosleep_store(struct kobject *kobj,
+			       struct kobj_attribute *attr,
+			       const char *buf, size_t n)
+{
+	suspend_state_t state = decode_state(buf, n);
+	int error;
+
+	if (state == PM_SUSPEND_ON && strncmp(buf, "off", 3)
+	    && strncmp(buf, "off\n", 4))
+		return -EINVAL;
+
+	error = pm_autosleep_set_state(state);
+	return error ? error : n;
+}
+
+power_attr(autosleep);
+#endif /* CONFIG_PM_AUTOSLEEP */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_TRACE
@@ -409,6 +484,9 @@ static struct attribute * g[] = {
 #ifdef CONFIG_PM_SLEEP
 	&pm_async_attr.attr,
 	&wakeup_count_attr.attr,
+#ifdef CONFIG_PM_AUTOSLEEP
+	&autosleep_attr.attr,
+#endif
 #ifdef CONFIG_PM_DEBUG
 	&pm_test_attr.attr,
 #endif
@@ -444,7 +522,10 @@ static int __init pm_init(void)
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
-	return sysfs_create_group(power_kobj, &attr_group);
+	error = sysfs_create_group(power_kobj, &attr_group);
+	if (error)
+		return error;
+	return pm_autosleep_init();
 }
 
 core_initcall(pm_init);
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -492,8 +492,10 @@ static void wakeup_source_deactivate(str
 	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
 
 	split_counters(&cnt, &inpr);
-	if (!inpr && waitqueue_active(&wakeup_count_wait_queue))
+	if (!inpr && waitqueue_active(&wakeup_count_wait_queue)) {
 		wake_up(&wakeup_count_wait_queue);
+		queue_up_suspend_work();
+	}
 }
 
 /**
@@ -654,29 +656,33 @@ bool pm_wakeup_pending(void)
 /**
  * pm_get_wakeup_count - Read the number of registered wakeup events.
  * @count: Address to store the value at.
+ * @block: Whether or not to block.
  *
- * Store the number of registered wakeup events at the address in @count.  Block
- * if the current number of wakeup events being processed is nonzero.
+ * Store the number of registered wakeup events at the address in @count.  If
+ * @block is set, block until the current number of wakeup events being
+ * processed is zero.
  *
- * Return 'false' if the wait for the number of wakeup events being processed to
- * drop down to zero has been interrupted by a signal (and the current number
- * of wakeup events being processed is still nonzero).  Otherwise return 'true'.
+ * Return 'false' if the current number of wakeup events being processed is
+ * nonzero.  Otherwise return 'true'.
  */
-bool pm_get_wakeup_count(unsigned int *count)
+bool pm_get_wakeup_count(unsigned int *count, bool block)
 {
 	unsigned int cnt, inpr;
-	DEFINE_WAIT(wait);
 
-	for (;;) {
-		prepare_to_wait(&wakeup_count_wait_queue, &wait,
-				TASK_INTERRUPTIBLE);
-		split_counters(&cnt, &inpr);
-		if (inpr == 0 || signal_pending(current))
-			break;
+	if (block) {
+		DEFINE_WAIT(wait);
 
-		schedule();
+		for (;;) {
+			prepare_to_wait(&wakeup_count_wait_queue, &wait,
+					TASK_INTERRUPTIBLE);
+			split_counters(&cnt, &inpr);
+			if (inpr == 0 || signal_pending(current))
+				break;
+
+			schedule();
+		}
+		finish_wait(&wakeup_count_wait_queue, &wait);
 	}
-	finish_wait(&wakeup_count_wait_queue, &wait);
 
 	split_counters(&cnt, &inpr);
 	*count = cnt;
Index: linux/Documentation/ABI/testing/sysfs-power
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-power
+++ linux/Documentation/ABI/testing/sysfs-power
@@ -172,3 +172,20 @@ Description:
 
 		Reading from this file will display the current value, which is
 		set to 1 MB by default.
+
+What:		/sys/power/autosleep
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/autosleep file can be written one of the strings
+		returned by reads from /sys/power/state.  If that happens, a
+		work item attempting to trigger a transition of the system to
+		the sleep state represented by that string is queued up.  This
+		attempt will only succeed if there are no active wakeup sources
+		in the system at that time.  After evey execution, regardless
+		of whether or not the attempt to put the system to sleep has
+		succeeded, the work item requeues itself until user space
+		writes "off" to /sys/power/autosleep.
+
+		Reading from this file causes the last string successfully
+		written to it to be displayed.

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 4/7] Input / PM: Add ioctl to block suspend while event queue is not empty
  2012-02-25  4:25       ` Arve Hjønnevåg
@ 2012-02-25 23:33         ` Rafael J. Wysocki
  2012-02-28  0:19         ` Matt Helsley
  1 sibling, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-25 23:33 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Matt Helsley, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

On Saturday, February 25, 2012, Arve Hjønnevåg wrote:
> On Thu, Feb 23, 2012 at 9:16 PM, Matt Helsley <matthltc@us.ibm.com> wrote:
> > On Wed, Feb 22, 2012 at 12:34:58AM +0100, Rafael J. Wysocki wrote:
> >> From: Arve Hjønnevåg <arve@android.com>
> >>
> >> Add a new ioctl, EVIOCSWAKEUPSRC, to attach a wakeup source object to
> >> an evdev client event queue, such that it will be active whenever the
> >> queue is not empty.  Then, all events in the queue will be regarded
> >> as wakeup events in progress and pm_get_wakeup_count() will block (or
> >> return false if woken up by a signal) until they are removed from the
> >> queue.  In consequence, if the checking of wakeup events is enabled
> >> (e.g. throught the /sys/power/wakeup_count interface), the system
> >> won't be able to go into a sleep state until the queue is empty.
> >>
> >> This allows user space processes to handle situations in which they
> >> want to do a select() on an evdev descriptor, so they go to sleep
> >> until there are some events to read from the device's queue, and then
> >> they don't want the system to go into a sleep state until all the
> >> events are read (presumably for further processing).  Of course, if
> >> they don't want the system to go into a sleep state _after_ all the
> >> events have been read from the queue, they have to use a separate
> >> mechanism that will prevent the system from doing that and it has
> >> to be activated before reading the first event (that also may be the
> >> last one).
> >
> > I haven't seen this idea mentioned before but I must admit I haven't
> > been following this thread too closely so apologies (and don't bother
> > rehashing) if it has:
> >
> > Could you just add this to epoll so that any fd userspace chooses would be
> > capable of doing this without introducing potentially ecclectic ioctl
> > interfaces?
> >
> 
> This is an interesting idea, but I'm not sure how well it would work.
> 
> I looked at the epoll code and it looks like it is possible to
> activate the wakeup-source from the wait queue function it uses.

I'm not sure I'm following you here.  How exactly would you like to do that?

In particular, what data structure would the wakeup source object be
associated with?

> The epoll callback will happen without holding evdev client buffer_lock,
> so the wakeup-source and buffer state will not always be in sync (this
> may be OK, but require more thought). This callback is also called if
> no data was added to the queue we are polling on because another
> client has grabbed the input device (is this a bug or intended?).
> 
> There is no call into the epoll code when input queue is emptied, so
> we can't deactivate the wakeup-source until epoll_wait is called
> again. This also should be workable, but result in different stats.
> 
> It does not look like the normal poll and select interfaces can be
> extended the same way (since they remove themselves from the
> wait-queue before returning to user-space), so user-space has to be
> changed to use epoll even if select or poll would be a better fit.

Well, epoll without EPOLLET is equivalent to poll, so the only potential
issue is select.  How serious may the problem with that be?

> I don't know how many other drivers this would work for. The input
> driver will wake up user-space from the same thread or interrupt
> handler that queued the event, but other drivers may defer this to
> another thread which makes an epoll wakeup-source insufficient.

If we go for new ioctls insread, we'll have to add them to all of those
drivers, so I would prefer the epoll-based approach if that's viable at
least for a subset of the relevant drivers.

> ...
> >> +     snprintf(name, sizeof(name), "%s-%d",
> >> +              dev_name(&evdev->dev), task_tgid_vnr(current));
> >
> > This does not look like it will work well with tasks in different pid
> > namespaces. What should happen, I think, is the wakeup_source should hold a
> > reference to either the struct pid of current or current itself. Then
> > when someone reads the file you should get the pid vnr in the reader's
> > pid namespace. That way instead of a bogus pid vnr 0 would show up if
> > "current" here is not in the reader's pid namepsace.
> >
> 
> The pid here is only used for debugging purposes, and used less than
> the dev_name. I don't think tracking pid namespaces is worth the
> trouble here, so if this is a real problem we can just drop the pid
> from the name for now.

OK

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 4/7] Input / PM: Add ioctl to block suspend while event queue is not empty
  2012-02-24  5:16     ` Matt Helsley
  2012-02-25  4:25       ` Arve Hjønnevåg
@ 2012-02-26 20:57       ` Rafael J. Wysocki
  2012-02-27 22:18         ` Matt Helsley
  2012-02-28  5:58         ` Arve Hjønnevåg
  1 sibling, 2 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-26 20:57 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

On Friday, February 24, 2012, Matt Helsley wrote:
> On Wed, Feb 22, 2012 at 12:34:58AM +0100, Rafael J. Wysocki wrote:
> > From: Arve Hjønnevåg <arve@android.com>
> > 
> > Add a new ioctl, EVIOCSWAKEUPSRC, to attach a wakeup source object to
> > an evdev client event queue, such that it will be active whenever the
> > queue is not empty.  Then, all events in the queue will be regarded
> > as wakeup events in progress and pm_get_wakeup_count() will block (or
> > return false if woken up by a signal) until they are removed from the
> > queue.  In consequence, if the checking of wakeup events is enabled
> > (e.g. throught the /sys/power/wakeup_count interface), the system
> > won't be able to go into a sleep state until the queue is empty.
> > 
> > This allows user space processes to handle situations in which they
> > want to do a select() on an evdev descriptor, so they go to sleep
> > until there are some events to read from the device's queue, and then
> > they don't want the system to go into a sleep state until all the
> > events are read (presumably for further processing).  Of course, if
> > they don't want the system to go into a sleep state _after_ all the
> > events have been read from the queue, they have to use a separate
> > mechanism that will prevent the system from doing that and it has
> > to be activated before reading the first event (that also may be the
> > last one).
> 
> I haven't seen this idea mentioned before but I must admit I haven't
> been following this thread too closely so apologies (and don't bother
> rehashing) if it has:
> 
> Could you just add this to epoll so that any fd userspace chooses would be
> capable of doing this without introducing potentially ecclectic ioctl
> interfaces?
> 
> struct epoll_event ev;
> 
> epfd = epoll_create1(EPOLL_STAY_AWAKE_SET);
> ev.data.ptr = foo;
> epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
> 
> Which could be useful because you can put one epollfd in another's epoll
> set. Or maybe as an EPOLLKEEPAWAKE flag in the event struct sort of like
> EPOLLET:
> 
> epfd = epoll_create1(0);
> ev.events = EPOLLIN|EPOLLKEEPAWAKE;
> epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);

Do you mean something like the patch below, or something different?

Rafael

---
 drivers/input/evdev.c     |   55 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/eventpoll.c            |   15 +++++++++++-
 include/linux/eventpoll.h |    6 +++++
 include/linux/fs.h        |    1 
 4 files changed, 76 insertions(+), 1 deletion(-)

Index: linux/include/linux/fs.h
===================================================================
--- linux.orig/include/linux/fs.h
+++ linux/include/linux/fs.h
@@ -1604,6 +1604,7 @@ struct file_operations {
 	ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
 	int (*readdir) (struct file *, void *, filldir_t);
 	unsigned int (*poll) (struct file *, struct poll_table_struct *);
+	void (*epoll_ctl) (struct file *, int, unsigned int);
 	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
Index: linux/fs/eventpoll.c
===================================================================
--- linux.orig/fs/eventpoll.c
+++ linux/fs/eventpoll.c
@@ -609,6 +609,10 @@ static int ep_remove(struct eventpoll *e
 	unsigned long flags;
 	struct file *file = epi->ffd.file;
 
+	/* Notify the underlying driver that the polling has completed */
+	if (file->f_op->epoll_ctl)
+		file->f_op->epoll_ctl(file, EPOLL_CTL_DEL, epi->event.events);
+
 	/*
 	 * Removes poll wait queue hooks. We _have_ to do this without holding
 	 * the "ep->lock" otherwise a deadlock might occur. This because of the
@@ -1094,6 +1098,10 @@ static int ep_insert(struct eventpoll *e
 	epq.epi = epi;
 	init_poll_funcptr(&epq.pt, ep_ptable_queue_proc);
 
+	/* Notify the underlying driver that we want to poll it */
+	if (tfile->f_op->epoll_ctl)
+		tfile->f_op->epoll_ctl(tfile, EPOLL_CTL_ADD, event->events);
+
 	/*
 	 * Attach the item to the poll hooks and get current event bits.
 	 * We can safely use the file* here because its usage count has
@@ -1185,6 +1193,7 @@ error_unregister:
  */
 static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_event *event)
 {
+	struct file *file = epi->ffd.file;
 	int pwake = 0;
 	unsigned int revents;
 
@@ -1196,11 +1205,15 @@ static int ep_modify(struct eventpoll *e
 	epi->event.events = event->events;
 	epi->event.data = event->data; /* protected by mtx */
 
+	/* Notify the underlying driver of the change */
+	if (file->f_op->epoll_ctl)
+		file->f_op->epoll_ctl(file, EPOLL_CTL_MOD, event->events);
+
 	/*
 	 * Get current event bits. We can safely use the file* here because
 	 * its usage count has been increased by the caller of this function.
 	 */
-	revents = epi->ffd.file->f_op->poll(epi->ffd.file, NULL);
+	revents = file->f_op->poll(file, NULL);
 
 	/*
 	 * If the item is "hot" and it is not registered inside the ready
Index: linux/drivers/input/evdev.c
===================================================================
--- linux.orig/drivers/input/evdev.c
+++ linux/drivers/input/evdev.c
@@ -16,6 +16,7 @@
 #define EVDEV_BUF_PACKETS	8
 
 #include <linux/poll.h>
+#include <linux/eventpoll.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/module.h>
@@ -43,6 +44,7 @@ struct evdev_client {
 	unsigned int tail;
 	unsigned int packet_head; /* [future] position of the first element of next packet */
 	spinlock_t buffer_lock; /* protects access to buffer, head and tail */
+	struct wakeup_source *wakeup_source;
 	struct fasync_struct *fasync;
 	struct evdev *evdev;
 	struct list_head node;
@@ -75,10 +77,12 @@ static void evdev_pass_event(struct evde
 		client->buffer[client->tail].value = 0;
 
 		client->packet_head = client->tail;
+		__pm_relax(client->wakeup_source);
 	}
 
 	if (event->type == EV_SYN && event->code == SYN_REPORT) {
 		client->packet_head = client->head;
+		__pm_stay_awake(client->wakeup_source);
 		kill_fasync(&client->fasync, SIGIO, POLL_IN);
 	}
 
@@ -255,6 +259,8 @@ static int evdev_release(struct inode *i
 	mutex_unlock(&evdev->mutex);
 
 	evdev_detach_client(evdev, client);
+	wakeup_source_unregister(client->wakeup_source);
+
 	kfree(client);
 
 	evdev_close_device(evdev);
@@ -373,6 +379,8 @@ static int evdev_fetch_next_event(struct
 	if (have_event) {
 		*event = client->buffer[client->tail++];
 		client->tail &= client->bufsize - 1;
+		if (client->packet_head == client->tail)
+			__pm_relax(client->wakeup_source);
 	}
 
 	spin_unlock_irq(&client->buffer_lock);
@@ -433,6 +441,52 @@ static unsigned int evdev_poll(struct fi
 	return mask;
 }
 
+static void evdev_client_attach_wakeup_source(struct evdev_client *client)
+{
+	struct wakeup_source *ws;
+
+	ws = wakeup_source_register(dev_name(&client->evdev->dev));
+	spin_lock_irq(&client->buffer_lock);
+	client->wakeup_source = ws;
+	if (client->packet_head != client->tail)
+		__pm_stay_awake(client->wakeup_source);
+	spin_unlock_irq(&client->buffer_lock);
+}
+
+static void evdev_client_detach_wakeup_source(struct evdev_client *client)
+{
+	struct wakeup_source *ws;
+
+	spin_lock_irq(&client->buffer_lock);
+	ws = client->wakeup_source;
+	client->wakeup_source = NULL;
+	spin_unlock_irq(&client->buffer_lock);
+	wakeup_source_unregister(ws);
+}
+
+static void evdev_epoll_ctl(struct file *file, int op,
+				    unsigned int events)
+{
+	struct evdev_client *client = file->private_data;
+
+	switch (op) {
+	case EPOLL_CTL_ADD:
+		if ((events & EPOLLWAKEUP) && !client->wakeup_source)
+			evdev_client_attach_wakeup_source(client);
+		break;
+	case EPOLL_CTL_DEL:
+		if (events & EPOLLWAKEUP)
+			evdev_client_detach_wakeup_source(client);
+		break;
+	case EPOLL_CTL_MOD:
+		/* 'events' is the new events mask (after the change) */
+		if ((events & EPOLLWAKEUP) && !client->wakeup_source)
+			evdev_client_attach_wakeup_source(client);
+		else if (!(events & EPOLLWAKEUP))
+			evdev_client_detach_wakeup_source(client);
+	}
+}
+
 #ifdef CONFIG_COMPAT
 
 #define BITS_PER_LONG_COMPAT (sizeof(compat_long_t) * 8)
@@ -845,6 +899,7 @@ static const struct file_operations evde
 	.read		= evdev_read,
 	.write		= evdev_write,
 	.poll		= evdev_poll,
+	.epoll_ctl	= evdev_epoll_ctl,
 	.open		= evdev_open,
 	.release	= evdev_release,
 	.unlocked_ioctl	= evdev_ioctl,
Index: linux/include/linux/eventpoll.h
===================================================================
--- linux.orig/include/linux/eventpoll.h
+++ linux/include/linux/eventpoll.h
@@ -26,6 +26,12 @@
 #define EPOLL_CTL_DEL 2
 #define EPOLL_CTL_MOD 3
 
+/*
+ * Request the handling of system wakeup events so as to prevent automatic
+ * system suspends from happening while those events are being processed.
+ */
+#define EPOLLWAKEUP (1 << 29)
+
 /* Set the One Shot behaviour for the target file descriptor */
 #define EPOLLONESHOT (1 << 30)
 

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 4/7] Input / PM: Add ioctl to block suspend while event queue is not empty
  2012-02-26 20:57       ` Rafael J. Wysocki
@ 2012-02-27 22:18         ` Matt Helsley
  2012-02-28  1:17           ` Rafael J. Wysocki
  2012-02-28  5:58         ` Arve Hjønnevåg
  1 sibling, 1 reply; 129+ messages in thread
From: Matt Helsley @ 2012-02-27 22:18 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Matt Helsley, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg, John Stultz,
	Brian Swetland, Neil Brown, Alan Stern, Dmitry Torokhov

On Sun, Feb 26, 2012 at 09:57:18PM +0100, Rafael J. Wysocki wrote:
> On Friday, February 24, 2012, Matt Helsley wrote:
> > On Wed, Feb 22, 2012 at 12:34:58AM +0100, Rafael J. Wysocki wrote:
> > > From: Arve Hjønnevåg <arve@android.com>
> > > 
> > > Add a new ioctl, EVIOCSWAKEUPSRC, to attach a wakeup source object to
> > > an evdev client event queue, such that it will be active whenever the
> > > queue is not empty.  Then, all events in the queue will be regarded
> > > as wakeup events in progress and pm_get_wakeup_count() will block (or
> > > return false if woken up by a signal) until they are removed from the
> > > queue.  In consequence, if the checking of wakeup events is enabled
> > > (e.g. throught the /sys/power/wakeup_count interface), the system
> > > won't be able to go into a sleep state until the queue is empty.
> > > 
> > > This allows user space processes to handle situations in which they
> > > want to do a select() on an evdev descriptor, so they go to sleep
> > > until there are some events to read from the device's queue, and then
> > > they don't want the system to go into a sleep state until all the
> > > events are read (presumably for further processing).  Of course, if
> > > they don't want the system to go into a sleep state _after_ all the
> > > events have been read from the queue, they have to use a separate
> > > mechanism that will prevent the system from doing that and it has
> > > to be activated before reading the first event (that also may be the
> > > last one).
> > 
> > I haven't seen this idea mentioned before but I must admit I haven't
> > been following this thread too closely so apologies (and don't bother
> > rehashing) if it has:
> > 
> > Could you just add this to epoll so that any fd userspace chooses would be
> > capable of doing this without introducing potentially ecclectic ioctl
> > interfaces?
> > 
> > struct epoll_event ev;
> > 
> > epfd = epoll_create1(EPOLL_STAY_AWAKE_SET);
> > ev.data.ptr = foo;
> > epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
> > 
> > Which could be useful because you can put one epollfd in another's epoll
> > set. Or maybe as an EPOLLKEEPAWAKE flag in the event struct sort of like
> > EPOLLET:
> > 
> > epfd = epoll_create1(0);
> > ev.events = EPOLLIN|EPOLLKEEPAWAKE;
> > epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
> 
> Do you mean something like the patch below, or something different?

Yeah, this was sort of what I was thinking of. It nicely avoids the
ioctl() bits. I guess my only issue is the fop mimics the epoll
interface -- should it just be an fop to manage the file as a wakeup
source rather than a generic hook into epoll?

Cheers,
	-Matt Helsley

> 
> Rafael
> 
> ---
>  drivers/input/evdev.c     |   55 ++++++++++++++++++++++++++++++++++++++++++++++
>  fs/eventpoll.c            |   15 +++++++++++-
>  include/linux/eventpoll.h |    6 +++++
>  include/linux/fs.h        |    1 
>  4 files changed, 76 insertions(+), 1 deletion(-)
> 
> Index: linux/include/linux/fs.h
> ===================================================================
> --- linux.orig/include/linux/fs.h
> +++ linux/include/linux/fs.h
> @@ -1604,6 +1604,7 @@ struct file_operations {
>  	ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
>  	int (*readdir) (struct file *, void *, filldir_t);
>  	unsigned int (*poll) (struct file *, struct poll_table_struct *);
> +	void (*epoll_ctl) (struct file *, int, unsigned int);
>  	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
>  	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
>  	int (*mmap) (struct file *, struct vm_area_struct *);
> Index: linux/fs/eventpoll.c
> ===================================================================
> --- linux.orig/fs/eventpoll.c
> +++ linux/fs/eventpoll.c
> @@ -609,6 +609,10 @@ static int ep_remove(struct eventpoll *e
>  	unsigned long flags;
>  	struct file *file = epi->ffd.file;
>  
> +	/* Notify the underlying driver that the polling has completed */
> +	if (file->f_op->epoll_ctl)
> +		file->f_op->epoll_ctl(file, EPOLL_CTL_DEL, epi->event.events);
> +
>  	/*
>  	 * Removes poll wait queue hooks. We _have_ to do this without holding
>  	 * the "ep->lock" otherwise a deadlock might occur. This because of the
> @@ -1094,6 +1098,10 @@ static int ep_insert(struct eventpoll *e
>  	epq.epi = epi;
>  	init_poll_funcptr(&epq.pt, ep_ptable_queue_proc);
>  
> +	/* Notify the underlying driver that we want to poll it */
> +	if (tfile->f_op->epoll_ctl)
> +		tfile->f_op->epoll_ctl(tfile, EPOLL_CTL_ADD, event->events);
> +
>  	/*
>  	 * Attach the item to the poll hooks and get current event bits.
>  	 * We can safely use the file* here because its usage count has
> @@ -1185,6 +1193,7 @@ error_unregister:
>   */
>  static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_event *event)
>  {
> +	struct file *file = epi->ffd.file;
>  	int pwake = 0;
>  	unsigned int revents;
>  
> @@ -1196,11 +1205,15 @@ static int ep_modify(struct eventpoll *e
>  	epi->event.events = event->events;
>  	epi->event.data = event->data; /* protected by mtx */
>  
> +	/* Notify the underlying driver of the change */
> +	if (file->f_op->epoll_ctl)
> +		file->f_op->epoll_ctl(file, EPOLL_CTL_MOD, event->events);
> +
>  	/*
>  	 * Get current event bits. We can safely use the file* here because
>  	 * its usage count has been increased by the caller of this function.
>  	 */
> -	revents = epi->ffd.file->f_op->poll(epi->ffd.file, NULL);
> +	revents = file->f_op->poll(file, NULL);
>  
>  	/*
>  	 * If the item is "hot" and it is not registered inside the ready
> Index: linux/drivers/input/evdev.c
> ===================================================================
> --- linux.orig/drivers/input/evdev.c
> +++ linux/drivers/input/evdev.c
> @@ -16,6 +16,7 @@
>  #define EVDEV_BUF_PACKETS	8
>  
>  #include <linux/poll.h>
> +#include <linux/eventpoll.h>
>  #include <linux/sched.h>
>  #include <linux/slab.h>
>  #include <linux/module.h>
> @@ -43,6 +44,7 @@ struct evdev_client {
>  	unsigned int tail;
>  	unsigned int packet_head; /* [future] position of the first element of next packet */
>  	spinlock_t buffer_lock; /* protects access to buffer, head and tail */
> +	struct wakeup_source *wakeup_source;
>  	struct fasync_struct *fasync;
>  	struct evdev *evdev;
>  	struct list_head node;
> @@ -75,10 +77,12 @@ static void evdev_pass_event(struct evde
>  		client->buffer[client->tail].value = 0;
>  
>  		client->packet_head = client->tail;
> +		__pm_relax(client->wakeup_source);
>  	}
>  
>  	if (event->type == EV_SYN && event->code == SYN_REPORT) {
>  		client->packet_head = client->head;
> +		__pm_stay_awake(client->wakeup_source);
>  		kill_fasync(&client->fasync, SIGIO, POLL_IN);
>  	}
>  
> @@ -255,6 +259,8 @@ static int evdev_release(struct inode *i
>  	mutex_unlock(&evdev->mutex);
>  
>  	evdev_detach_client(evdev, client);
> +	wakeup_source_unregister(client->wakeup_source);
> +
>  	kfree(client);
>  
>  	evdev_close_device(evdev);
> @@ -373,6 +379,8 @@ static int evdev_fetch_next_event(struct
>  	if (have_event) {
>  		*event = client->buffer[client->tail++];
>  		client->tail &= client->bufsize - 1;
> +		if (client->packet_head == client->tail)
> +			__pm_relax(client->wakeup_source);
>  	}
>  
>  	spin_unlock_irq(&client->buffer_lock);
> @@ -433,6 +441,52 @@ static unsigned int evdev_poll(struct fi
>  	return mask;
>  }
>  
> +static void evdev_client_attach_wakeup_source(struct evdev_client *client)
> +{
> +	struct wakeup_source *ws;
> +
> +	ws = wakeup_source_register(dev_name(&client->evdev->dev));
> +	spin_lock_irq(&client->buffer_lock);
> +	client->wakeup_source = ws;
> +	if (client->packet_head != client->tail)
> +		__pm_stay_awake(client->wakeup_source);
> +	spin_unlock_irq(&client->buffer_lock);
> +}
> +
> +static void evdev_client_detach_wakeup_source(struct evdev_client *client)
> +{
> +	struct wakeup_source *ws;
> +
> +	spin_lock_irq(&client->buffer_lock);
> +	ws = client->wakeup_source;
> +	client->wakeup_source = NULL;
> +	spin_unlock_irq(&client->buffer_lock);
> +	wakeup_source_unregister(ws);
> +}
> +
> +static void evdev_epoll_ctl(struct file *file, int op,
> +				    unsigned int events)
> +{
> +	struct evdev_client *client = file->private_data;
> +
> +	switch (op) {
> +	case EPOLL_CTL_ADD:
> +		if ((events & EPOLLWAKEUP) && !client->wakeup_source)
> +			evdev_client_attach_wakeup_source(client);
> +		break;
> +	case EPOLL_CTL_DEL:
> +		if (events & EPOLLWAKEUP)
> +			evdev_client_detach_wakeup_source(client);
> +		break;
> +	case EPOLL_CTL_MOD:
> +		/* 'events' is the new events mask (after the change) */
> +		if ((events & EPOLLWAKEUP) && !client->wakeup_source)
> +			evdev_client_attach_wakeup_source(client);
> +		else if (!(events & EPOLLWAKEUP))
> +			evdev_client_detach_wakeup_source(client);
> +	}
> +}
> +
>  #ifdef CONFIG_COMPAT
>  
>  #define BITS_PER_LONG_COMPAT (sizeof(compat_long_t) * 8)
> @@ -845,6 +899,7 @@ static const struct file_operations evde
>  	.read		= evdev_read,
>  	.write		= evdev_write,
>  	.poll		= evdev_poll,
> +	.epoll_ctl	= evdev_epoll_ctl,
>  	.open		= evdev_open,
>  	.release	= evdev_release,
>  	.unlocked_ioctl	= evdev_ioctl,
> Index: linux/include/linux/eventpoll.h
> ===================================================================
> --- linux.orig/include/linux/eventpoll.h
> +++ linux/include/linux/eventpoll.h
> @@ -26,6 +26,12 @@
>  #define EPOLL_CTL_DEL 2
>  #define EPOLL_CTL_MOD 3
>  
> +/*
> + * Request the handling of system wakeup events so as to prevent automatic
> + * system suspends from happening while those events are being processed.
> + */
> +#define EPOLLWAKEUP (1 << 29)
> +
>  /* Set the One Shot behaviour for the target file descriptor */
>  #define EPOLLONESHOT (1 << 30)
>  
> 


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 4/7] Input / PM: Add ioctl to block suspend while event queue is not empty
  2012-02-25  4:25       ` Arve Hjønnevåg
  2012-02-25 23:33         ` Rafael J. Wysocki
@ 2012-02-28  0:19         ` Matt Helsley
  1 sibling, 0 replies; 129+ messages in thread
From: Matt Helsley @ 2012-02-28  0:19 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Matt Helsley, Rafael J. Wysocki, Linux PM list, LKML,
	Magnus Damm, markgross, Matthew Garrett, Greg KH, John Stultz,
	Brian Swetland, Neil Brown, Alan Stern, Dmitry Torokhov

On Fri, Feb 24, 2012 at 08:25:30PM -0800, Arve Hjønnevåg wrote:
> On Thu, Feb 23, 2012 at 9:16 PM, Matt Helsley <matthltc@us.ibm.com> wrote:
> > On Wed, Feb 22, 2012 at 12:34:58AM +0100, Rafael J. Wysocki wrote:
> >> From: Arve Hjønnevåg <arve@android.com>
> >>
> >> Add a new ioctl, EVIOCSWAKEUPSRC, to attach a wakeup source object to
> >> an evdev client event queue, such that it will be active whenever the
> >> queue is not empty.  Then, all events in the queue will be regarded
> >> as wakeup events in progress and pm_get_wakeup_count() will block (or
> >> return false if woken up by a signal) until they are removed from the
> >> queue.  In consequence, if the checking of wakeup events is enabled
> >> (e.g. throught the /sys/power/wakeup_count interface), the system
> >> won't be able to go into a sleep state until the queue is empty.
> >>
> >> This allows user space processes to handle situations in which they
> >> want to do a select() on an evdev descriptor, so they go to sleep
> >> until there are some events to read from the device's queue, and then
> >> they don't want the system to go into a sleep state until all the
> >> events are read (presumably for further processing).  Of course, if
> >> they don't want the system to go into a sleep state _after_ all the
> >> events have been read from the queue, they have to use a separate
> >> mechanism that will prevent the system from doing that and it has
> >> to be activated before reading the first event (that also may be the
> >> last one).
> >
> > I haven't seen this idea mentioned before but I must admit I haven't
> > been following this thread too closely so apologies (and don't bother
> > rehashing) if it has:
> >
> > Could you just add this to epoll so that any fd userspace chooses would be
> > capable of doing this without introducing potentially ecclectic ioctl
> > interfaces?
> >
> 
> This is an interesting idea, but I'm not sure how well it would work.
> 
> I looked at the epoll code and it looks like it is possible to
> activate the wakeup-source from the wait queue function it uses. The
> epoll callback will happen without holding evdev client buffer_lock,
> so the wakeup-source and buffer state will not always be in sync (this
> may be OK, but require more thought). This callback is also called if
> no data was added to the queue we are polling on because another
> client has grabbed the input device (is this a bug or intended?).
> 
> There is no call into the epoll code when input queue is emptied, so
> we can't deactivate the wakeup-source until epoll_wait is called
> again. This also should be workable, but result in different stats.
> 
> It does not look like the normal poll and select interfaces can be
> extended the same way (since they remove themselves from the
> wait-queue before returning to user-space), so user-space has to be

Yup, that is exactly why epoll is so well suited to this.

> changed to use epoll even if select or poll would be a better fit.

Either way, modification of application code is necessary, right?

> I don't know how many other drivers this would work for. The input
> driver will wake up user-space from the same thread or interrupt
> handler that queued the event, but other drivers may defer this to
> another thread which makes an epoll wakeup-source insufficient.

I don't understand how this would be insufficient. So long as the
interrupt causes the wakeup source to prevent the machine from suspending
before finishing interrupt handling does it matter whether the event
handling itself is deferred?

In case there's some confusion: I'm not saying that this idea will solve
all of the problems, especially:

> >> Of course, if
> >> they don't want the system to go into a sleep state _after_ all the
> >> events have been read from the queue, they have to use a separate
> >> mechanism that will prevent the system from doing that and it has
> >> to be activated before reading the first event (that also may be
> >> the
> >> last one).

(endquote)

> 
> ...
> >> +     snprintf(name, sizeof(name), "%s-%d",
> >> +              dev_name(&evdev->dev), task_tgid_vnr(current));
> >
> > This does not look like it will work well with tasks in different pid
> > namespaces. What should happen, I think, is the wakeup_source should hold a
> > reference to either the struct pid of current or current itself. Then
> > when someone reads the file you should get the pid vnr in the reader's
> > pid namespace. That way instead of a bogus pid vnr 0 would show up if
> > "current" here is not in the reader's pid namepsace.
> >
> 
> The pid here is only used for debugging purposes, and used less than
> the dev_name. I don't think tracking pid namespaces is worth the
> trouble here, so if this is a real problem we can just drop the pid
> from the name for now.

I think dropping the pid would be the best choice. If it's absolutely
necessary in the output then it should be made to work with pid namespaces
because the interface will be maintained forever.

Cheers,
	-Matt


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 4/7] Input / PM: Add ioctl to block suspend while event queue is not empty
  2012-02-27 22:18         ` Matt Helsley
@ 2012-02-28  1:17           ` Rafael J. Wysocki
  0 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-02-28  1:17 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov

On Monday, February 27, 2012, Matt Helsley wrote:
> On Sun, Feb 26, 2012 at 09:57:18PM +0100, Rafael J. Wysocki wrote:
> > On Friday, February 24, 2012, Matt Helsley wrote:
> > > On Wed, Feb 22, 2012 at 12:34:58AM +0100, Rafael J. Wysocki wrote:
> > > > From: Arve Hjønnevåg <arve@android.com>
> > > > 
> > > > Add a new ioctl, EVIOCSWAKEUPSRC, to attach a wakeup source object to
> > > > an evdev client event queue, such that it will be active whenever the
> > > > queue is not empty.  Then, all events in the queue will be regarded
> > > > as wakeup events in progress and pm_get_wakeup_count() will block (or
> > > > return false if woken up by a signal) until they are removed from the
> > > > queue.  In consequence, if the checking of wakeup events is enabled
> > > > (e.g. throught the /sys/power/wakeup_count interface), the system
> > > > won't be able to go into a sleep state until the queue is empty.
> > > > 
> > > > This allows user space processes to handle situations in which they
> > > > want to do a select() on an evdev descriptor, so they go to sleep
> > > > until there are some events to read from the device's queue, and then
> > > > they don't want the system to go into a sleep state until all the
> > > > events are read (presumably for further processing).  Of course, if
> > > > they don't want the system to go into a sleep state _after_ all the
> > > > events have been read from the queue, they have to use a separate
> > > > mechanism that will prevent the system from doing that and it has
> > > > to be activated before reading the first event (that also may be the
> > > > last one).
> > > 
> > > I haven't seen this idea mentioned before but I must admit I haven't
> > > been following this thread too closely so apologies (and don't bother
> > > rehashing) if it has:
> > > 
> > > Could you just add this to epoll so that any fd userspace chooses would be
> > > capable of doing this without introducing potentially ecclectic ioctl
> > > interfaces?
> > > 
> > > struct epoll_event ev;
> > > 
> > > epfd = epoll_create1(EPOLL_STAY_AWAKE_SET);
> > > ev.data.ptr = foo;
> > > epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
> > > 
> > > Which could be useful because you can put one epollfd in another's epoll
> > > set. Or maybe as an EPOLLKEEPAWAKE flag in the event struct sort of like
> > > EPOLLET:
> > > 
> > > epfd = epoll_create1(0);
> > > ev.events = EPOLLIN|EPOLLKEEPAWAKE;
> > > epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
> > 
> > Do you mean something like the patch below, or something different?
> 
> Yeah, this was sort of what I was thinking of. It nicely avoids the
> ioctl() bits. I guess my only issue is the fop mimics the epoll
> interface -- should it just be an fop to manage the file as a wakeup
> source rather than a generic hook into epoll?

I'm not exactly sure what you mean, could you be a bit more specific, please?

Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 4/7] Input / PM: Add ioctl to block suspend while event queue is not empty
  2012-02-26 20:57       ` Rafael J. Wysocki
  2012-02-27 22:18         ` Matt Helsley
@ 2012-02-28  5:58         ` Arve Hjønnevåg
  2012-03-04 22:56           ` Rafael J. Wysocki
  1 sibling, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-02-28  5:58 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Matt Helsley, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, jeffbrown

On Sun, Feb 26, 2012 at 12:57 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Friday, February 24, 2012, Matt Helsley wrote:
>> On Wed, Feb 22, 2012 at 12:34:58AM +0100, Rafael J. Wysocki wrote:
>> > From: Arve Hjønnevåg <arve@android.com>
>> >
>> > Add a new ioctl, EVIOCSWAKEUPSRC, to attach a wakeup source object to
>> > an evdev client event queue, such that it will be active whenever the
>> > queue is not empty.  Then, all events in the queue will be regarded
>> > as wakeup events in progress and pm_get_wakeup_count() will block (or
>> > return false if woken up by a signal) until they are removed from the
>> > queue.  In consequence, if the checking of wakeup events is enabled
>> > (e.g. throught the /sys/power/wakeup_count interface), the system
>> > won't be able to go into a sleep state until the queue is empty.
>> >
>> > This allows user space processes to handle situations in which they
>> > want to do a select() on an evdev descriptor, so they go to sleep
>> > until there are some events to read from the device's queue, and then
>> > they don't want the system to go into a sleep state until all the
>> > events are read (presumably for further processing).  Of course, if
>> > they don't want the system to go into a sleep state _after_ all the
>> > events have been read from the queue, they have to use a separate
>> > mechanism that will prevent the system from doing that and it has
>> > to be activated before reading the first event (that also may be the
>> > last one).
>>
>> I haven't seen this idea mentioned before but I must admit I haven't
>> been following this thread too closely so apologies (and don't bother
>> rehashing) if it has:
>>
>> Could you just add this to epoll so that any fd userspace chooses would be
>> capable of doing this without introducing potentially ecclectic ioctl
>> interfaces?
>>
>> struct epoll_event ev;
>>
>> epfd = epoll_create1(EPOLL_STAY_AWAKE_SET);
>> ev.data.ptr = foo;
>> epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
>>
>> Which could be useful because you can put one epollfd in another's epoll
>> set. Or maybe as an EPOLLKEEPAWAKE flag in the event struct sort of like
>> EPOLLET:
>>
>> epfd = epoll_create1(0);
>> ev.events = EPOLLIN|EPOLLKEEPAWAKE;
>> epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
>
> Do you mean something like the patch below, or something different?
>
> Rafael
>
> ---

I don't think it is useful to tie an evdev implementation to epoll
that way. You just replaced the ioctl with a new control function.

The code below tries to implement the same flag without modifying
evdev at all. The behavior of this is different as it will keep the
device awake until user-space calls epoll_wait again. I also used an
extra wakeup source to handle the function that runs without the
spin_lock held which means that non-wakeup files in same epoll list
could abort suspend.

-- 
Arve Hjønnevåg

----
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index f9cfd16..45af494 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -33,6 +33,7 @@
 #include <linux/bitops.h>
 #include <linux/mutex.h>
 #include <linux/anon_inodes.h>
+#include <linux/device.h>
 #include <asm/uaccess.h>
 #include <asm/system.h>
 #include <asm/io.h>
@@ -79,7 +80,7 @@
  */

 /* Epoll private bits inside the event mask */
-#define EP_PRIVATE_BITS (EPOLLONESHOT | EPOLLET)
+#define EP_PRIVATE_BITS (EPOLLWAKEUP | EPOLLONESHOT | EPOLLET)

 /* Maximum number of nesting allowed inside epoll sets */
 #define EP_MAX_NESTS 4
@@ -146,6 +147,9 @@ struct epitem {
 	/* List header used to link this item to the "struct file" items list */
 	struct list_head fllink;

+	/* wakeup_source used when EPOLLWAKEUP is set */
+	struct wakeup_source *ws;
+
 	/* The structure that describe the interested events and the source fd */
 	struct epoll_event event;
 };
@@ -186,6 +190,9 @@ struct eventpoll {
 	 */
 	struct epitem *ovflist;

+	/* wakeup_source used when ep_scan_ready_list is running */
+	struct wakeup_source *ws;
+
 	/* The user that created the eventpoll descriptor */
 	struct user_struct *user;
 };
@@ -492,6 +499,7 @@ static int ep_scan_ready_list(struct eventpoll *ep,
 	 * in a lockless way.
 	 */
 	spin_lock_irqsave(&ep->lock, flags);
+	__pm_stay_awake(ep->ws);
 	list_splice_init(&ep->rdllist, &txlist);
 	ep->ovflist = NULL;
 	spin_unlock_irqrestore(&ep->lock, flags);
@@ -515,9 +523,12 @@ static int ep_scan_ready_list(struct eventpoll *ep,
 		 * queued into ->ovflist but the "txlist" might already
 		 * contain them, and the list_splice() below takes care of them.
 		 */
-		if (!ep_is_linked(&epi->rdllink))
+		if (!ep_is_linked(&epi->rdllink)) {
 			list_add_tail(&epi->rdllink, &ep->rdllist);
+			__pm_stay_awake(epi->ws);
+		}
 	}
+
 	/*
 	 * We need to set back ep->ovflist to EP_UNACTIVE_PTR, so that after
 	 * releasing the lock, events will be queued in the normal way inside
@@ -529,6 +540,7 @@ static int ep_scan_ready_list(struct eventpoll *ep,
 	 * Quickly re-inject items left on "txlist".
 	 */
 	list_splice(&txlist, &ep->rdllist);
+	__pm_relax(ep->ws);

 	if (!list_empty(&ep->rdllist)) {
 		/*
@@ -583,6 +595,9 @@ static int ep_remove(struct eventpoll *ep, struct
epitem *epi)
 		list_del_init(&epi->rdllink);
 	spin_unlock_irqrestore(&ep->lock, flags);

+	if (epi->ws)
+		wakeup_source_unregister(epi->ws);
+
 	/* At this point it is safe to free the eventpoll item */
 	kmem_cache_free(epi_cache, epi);

@@ -633,6 +648,8 @@ static void ep_free(struct eventpoll *ep)
 	mutex_unlock(&epmutex);
 	mutex_destroy(&ep->mtx);
 	free_uid(ep->user);
+	if (ep->ws)
+		wakeup_source_unregister(ep->ws);
 	kfree(ep);
 }

@@ -661,6 +678,7 @@ static int ep_read_events_proc(struct eventpoll
*ep, struct list_head *head,
 			 * callback, but it's not actually ready, as far as
 			 * caller requested events goes. We can remove it here.
 			 */
+			__pm_relax(epi->ws);
 			list_del_init(&epi->rdllink);
 		}
 	}
@@ -851,8 +869,10 @@ static int ep_poll_callback(wait_queue_t *wait,
unsigned mode, int sync, void *k
 	}

 	/* If this file is already in the ready list we exit soon */
-	if (!ep_is_linked(&epi->rdllink))
+	if (!ep_is_linked(&epi->rdllink)) {
 		list_add_tail(&epi->rdllink, &ep->rdllist);
+		__pm_stay_awake(epi->ws);
+	}

 	/*
 	 * Wake up ( if active ) both the eventpoll wait list and the ->poll()
@@ -915,6 +935,30 @@ static void ep_rbtree_insert(struct eventpoll
*ep, struct epitem *epi)
 	rb_insert_color(&epi->rbn, &ep->rbr);
 }

+static int ep_create_wakeup_source(struct epitem *epi)
+{
+	const char *name;
+
+	if (!epi->ep->ws) {
+		epi->ep->ws = wakeup_source_register("eventpoll");
+		if (!epi->ep->ws)
+			return -ENOMEM;
+	}
+
+	name = epi->ffd.file->f_path.dentry->d_name.name;
+	epi->ws = wakeup_source_register(name);
+	if (!epi->ws)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void ep_destroy_wakeup_source(struct epitem *epi)
+{
+	wakeup_source_unregister(epi->ws);
+	epi->ws = NULL;
+}
+
 /*
  * Must be called with "mtx" held.
  */
@@ -942,6 +986,13 @@ static int ep_insert(struct eventpoll *ep, struct
epoll_event *event,
 	epi->event = *event;
 	epi->nwait = 0;
 	epi->next = EP_UNACTIVE_PTR;
+	if (epi->event.events & EPOLLWAKEUP) {
+		error = ep_create_wakeup_source(epi);
+		if (error)
+			goto error_create_wakeup_source;
+	} else {
+		epi->ws = NULL;
+	}

 	/* Initialize the poll table using the queue callback */
 	epq.epi = epi;
@@ -982,6 +1033,7 @@ static int ep_insert(struct eventpoll *ep, struct
epoll_event *event,
 	/* If the file is already "ready" we drop it inside the ready list */
 	if ((revents & event->events) && !ep_is_linked(&epi->rdllink)) {
 		list_add_tail(&epi->rdllink, &ep->rdllist);
+		__pm_stay_awake(epi->ws);

 		/* Notify waiting tasks that events are available */
 		if (waitqueue_active(&ep->wq))
@@ -1014,6 +1066,10 @@ error_unregister:
 		list_del_init(&epi->rdllink);
 	spin_unlock_irqrestore(&ep->lock, flags);

+	if (epi->ws)
+		wakeup_source_unregister(epi->ws);
+
+error_create_wakeup_source:
 	kmem_cache_free(epi_cache, epi);

 	return error;
@@ -1035,6 +1091,12 @@ static int ep_modify(struct eventpoll *ep,
struct epitem *epi, struct epoll_even
 	 */
 	epi->event.events = event->events;
 	epi->event.data = event->data; /* protected by mtx */
+	if (epi->event.events & EPOLLWAKEUP) {
+		if (!epi->ws)
+			ep_create_wakeup_source(epi);
+	} else if (epi->ws) {
+		ep_destroy_wakeup_source(epi);
+	}

 	/*
 	 * Get current event bits. We can safely use the file* here because
@@ -1050,6 +1112,7 @@ static int ep_modify(struct eventpoll *ep,
struct epitem *epi, struct epoll_even
 		spin_lock_irq(&ep->lock);
 		if (!ep_is_linked(&epi->rdllink)) {
 			list_add_tail(&epi->rdllink, &ep->rdllist);
+			__pm_stay_awake(epi->ws);

 			/* Notify waiting tasks that events are available */
 			if (waitqueue_active(&ep->wq))
@@ -1085,6 +1148,7 @@ static int ep_send_events_proc(struct eventpoll
*ep, struct list_head *head,
 	     !list_empty(head) && eventcnt < esed->maxevents;) {
 		epi = list_first_entry(head, struct epitem, rdllink);

+		__pm_relax(epi->ws);
 		list_del_init(&epi->rdllink);

 		revents = epi->ffd.file->f_op->poll(epi->ffd.file, NULL) &
@@ -1100,6 +1164,7 @@ static int ep_send_events_proc(struct eventpoll
*ep, struct list_head *head,
 			if (__put_user(revents, &uevent->events) ||
 			    __put_user(epi->event.data, &uevent->data)) {
 				list_add(&epi->rdllink, head);
+				__pm_stay_awake(epi->ws);
 				return eventcnt ? eventcnt : -EFAULT;
 			}
 			eventcnt++;
@@ -1119,6 +1184,7 @@ static int ep_send_events_proc(struct eventpoll
*ep, struct list_head *head,
 				 * poll callback will queue them in ep->ovflist.
 				 */
 				list_add_tail(&epi->rdllink, &ep->rdllist);
+				__pm_stay_awake(epi->ws);
 			}
 		}
 	}
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index f362733..cd156ff 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -26,6 +26,12 @@
 #define EPOLL_CTL_DEL 2
 #define EPOLL_CTL_MOD 3

+/*
+ * Request the handling of system wakeup events so as to prevent automatic
+ * system suspends from happening while those events are being processed.
+ */
+#define EPOLLWAKEUP (1 << 29)
+
 /* Set the One Shot behaviour for the target file descriptor */
 #define EPOLLONESHOT (1 << 30)

^ permalink raw reply related	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2
  2012-02-25 21:01                     ` Rafael J. Wysocki
@ 2012-02-28 10:24                       ` Srivatsa S. Bhat
  0 siblings, 0 replies; 129+ messages in thread
From: Srivatsa S. Bhat @ 2012-02-28 10:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: John Stultz, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg,
	Brian Swetland, Neil Brown, Alan Stern, Dmitry Torokhov

On 02/26/2012 02:31 AM, Rafael J. Wysocki wrote:

> 
> I think we can do something like in the updated patch [5/7] below.
> 
> It uses a special wakeup source object called "autosleep" to bump up the
> number of wakeup events in progress before acquiring autosleep_lock in
> pm_autosleep_set_state().  This way, either pm_autosleep_set_state() will
> acquire autosleep_lock before try_to_suspend(), in which case the latter
> will see the change of autosleep_state immediately (after autosleep_lock has
> been passed to it), or try_to_suspend() will get it first, but then
> pm_save_wakeup_count() or pm_suspend()/hibernate() will see the nonzero counter
> of wakeup events in progress and return error code (sooner or later).
> 
> The drawback is that writes to /sys/power/autosleep may interfere with
> the /sys/power/wakeup_count + /sys/power/state interface by interrupting
> transitions started by writing to /sys/power/state, for example (although
> I think that's highly unlikely).


Yes, but I think we can live with that.. It doesn't look like a big issue.

> 
> Additionally, I made pm_autosleep_lock() use mutex_trylock_interruptible()


You have used mutex_lock_interruptible() in the code below.. It wouldn't matter
as long as you have used some form of "interruptible" but I think
mutex_trylock_interruptible would be even better..

> to prevent operations on /sys/power/wakeup_count and/or /sys/power/state
> from failing the freezing of tasks started by try_to_suspend().
> 
> Thanks,
> Rafael
> 


The approach taken by the patch below looks good to me. I don't see any obvious
problems, except for the minor ones listed below.

> ---
> From: Rafael J. Wysocki <rjw@sisk.pl>
> Subject: PM / Sleep: Implement opportunistic sleep
> 
> Introduce a mechanism by which the kernel can trigger global
> transitions to a sleep state chosen by user space if there are no
> active wakeup sources.
> 
> It consists of a new sysfs attribute, /sys/power/autosleep, that
> can be written one of the strings returned by reads from
> /sys/power/state, an ordered workqueue and a work item carrying out
> the "suspend" operations.  If a string representing the system's
> sleep state is written to /sys/power/autosleep, the work item
> triggering transitions to that state is queued up and it requeues
> itself after every execution until user space writes "off" to
> /sys/power/autosleep.
> 
> That work item enables the detection of wakeup events using the
> functions already defined in drivers/base/power/wakeup.c (with one
> small modification) and calls either pm_suspend(), or hibernate() to
> put the system into a sleep state.  If a wakeup event is reported
> while the transition is in progress, it will abort the transition and
> the "system suspend" work item will be queued up again.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> Index: linux/kernel/power/main.c
> ===================================================================
> --- linux.orig/kernel/power/main.c
> +++ linux/kernel/power/main.c
> @@ -269,8 +269,7 @@ static ssize_t state_show(struct kobject
>  	return (s - buf);
>  }
> 
> -static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
> -			   const char *buf, size_t n)
> +static suspend_state_t decode_state(const char *buf, size_t n)
>  {
>  #ifdef CONFIG_SUSPEND
>  	suspend_state_t state = PM_SUSPEND_STANDBY;
> @@ -278,27 +277,48 @@ static ssize_t state_store(struct kobjec
>  #endif
>  	char *p;
>  	int len;
> -	int error = -EINVAL;
> 
>  	p = memchr(buf, '\n', n);
>  	len = p ? p - buf : n;
> 
> -	/* First, check if we are requested to hibernate */
> -	if (len == 4 && !strncmp(buf, "disk", len)) {
> -		error = hibernate();
> -		goto Exit;
> -	}
> +	/* Check hibernation first. */
> +	if (len == 4 && !strncmp(buf, "disk", len))
> +		return PM_SUSPEND_MAX;
> 
>  #ifdef CONFIG_SUSPEND
> -	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
> -		if (*s && len == strlen(*s) && !strncmp(buf, *s, len)) {
> -			error = pm_suspend(state);
> -			break;
> -		}
> -	}
> +	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++)
> +		if (*s && len == strlen(*s) && !strncmp(buf, *s, len))
> +			return state;
>  #endif
> 
> - Exit:
> +	return PM_SUSPEND_ON;
> +}
> +
> +static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
> +			   const char *buf, size_t n)
> +{
> +	suspend_state_t state;
> +	int error;
> +
> +	error = pm_autosleep_lock();
> +	if (error)
> +		return error;
> +
> +	if (pm_autosleep_state() > PM_SUSPEND_ON) {
> +		error = -EBUSY;
> +		goto out;
> +	}
> +
> +	state = decode_state(buf, n);
> +	if (state < PM_SUSPEND_MAX)
> +		error = pm_suspend(state);
> +	else if (state > PM_SUSPEND_ON)
> +		error = hibernate();
> +	else
> +		error = -EINVAL;


By the way, the condition checks in the above if-else block look kinda
odd, considering what is done in other similar places, which are more
readable. It would be great if you could make them consistent.

> +
> + out:
> +	pm_autosleep_unlock();
>  	return error ? error : n;
>  }
> 
> @@ -339,7 +359,8 @@ static ssize_t wakeup_count_show(struct
>  {
>  	unsigned int val;
> 
> -	return pm_get_wakeup_count(&val) ? sprintf(buf, "%u\n", val) : -EINTR;
> +	return pm_get_wakeup_count(&val, true) ?
> +		sprintf(buf, "%u\n", val) : -EINTR;
>  }
> 
> +
> +static ssize_t autosleep_store(struct kobject *kobj,
> +			       struct kobj_attribute *attr,
> +			       const char *buf, size_t n)
> +{
> +	suspend_state_t state = decode_state(buf, n);
> +	int error;
> +
> +	if (state == PM_SUSPEND_ON && strncmp(buf, "off", 3)
> +	    && strncmp(buf, "off\n", 4))
> +		return -EINVAL;
> +


I am pretty sure you meant "if autosleep is already off, and the user
wrote "off" to /sys/power/autosleep, then return -EINVAL"

But strncmp() returns 0 if the strings match, and hence the code above
doesn't seem to do what you intended.

> +	error = pm_autosleep_set_state(state);
> +	return error ? error : n;
> +}
> +
> +power_attr(autosleep);
> +#endif /* CONFIG_PM_AUTOSLEEP */
>  #endif /* CONFIG_PM_SLEEP */
> 
>  #ifdef CONFIG_PM_TRACE


Regards,
Srivatsa S. Bhat


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 4/7] Input / PM: Add ioctl to block suspend while event queue is not empty
  2012-02-28  5:58         ` Arve Hjønnevåg
@ 2012-03-04 22:56           ` Rafael J. Wysocki
  2012-03-06  1:04             ` [PATCH 1/2] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready Arve Hjønnevåg
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-03-04 22:56 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Matt Helsley, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, jeffbrown

On Tuesday, February 28, 2012, Arve Hjønnevåg wrote:
> On Sun, Feb 26, 2012 at 12:57 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Friday, February 24, 2012, Matt Helsley wrote:
> >> On Wed, Feb 22, 2012 at 12:34:58AM +0100, Rafael J. Wysocki wrote:
> >> > From: Arve Hjønnevåg <arve@android.com>
> >> >
> >> > Add a new ioctl, EVIOCSWAKEUPSRC, to attach a wakeup source object to
> >> > an evdev client event queue, such that it will be active whenever the
> >> > queue is not empty.  Then, all events in the queue will be regarded
> >> > as wakeup events in progress and pm_get_wakeup_count() will block (or
> >> > return false if woken up by a signal) until they are removed from the
> >> > queue.  In consequence, if the checking of wakeup events is enabled
> >> > (e.g. throught the /sys/power/wakeup_count interface), the system
> >> > won't be able to go into a sleep state until the queue is empty.
> >> >
> >> > This allows user space processes to handle situations in which they
> >> > want to do a select() on an evdev descriptor, so they go to sleep
> >> > until there are some events to read from the device's queue, and then
> >> > they don't want the system to go into a sleep state until all the
> >> > events are read (presumably for further processing).  Of course, if
> >> > they don't want the system to go into a sleep state _after_ all the
> >> > events have been read from the queue, they have to use a separate
> >> > mechanism that will prevent the system from doing that and it has
> >> > to be activated before reading the first event (that also may be the
> >> > last one).
> >>
> >> I haven't seen this idea mentioned before but I must admit I haven't
> >> been following this thread too closely so apologies (and don't bother
> >> rehashing) if it has:
> >>
> >> Could you just add this to epoll so that any fd userspace chooses would be
> >> capable of doing this without introducing potentially ecclectic ioctl
> >> interfaces?
> >>
> >> struct epoll_event ev;
> >>
> >> epfd = epoll_create1(EPOLL_STAY_AWAKE_SET);
> >> ev.data.ptr = foo;
> >> epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
> >>
> >> Which could be useful because you can put one epollfd in another's epoll
> >> set. Or maybe as an EPOLLKEEPAWAKE flag in the event struct sort of like
> >> EPOLLET:
> >>
> >> epfd = epoll_create1(0);
> >> ev.events = EPOLLIN|EPOLLKEEPAWAKE;
> >> epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
> >
> > Do you mean something like the patch below, or something different?
> >
> > Rafael
> >
> > ---
> 
> I don't think it is useful to tie an evdev implementation to epoll
> that way. You just replaced the ioctl with a new control function.
> 
> The code below tries to implement the same flag without modifying
> evdev at all. The behavior of this is different as it will keep the
> device awake until user-space calls epoll_wait again. I also used an
> extra wakeup source to handle the function that runs without the
> spin_lock held which means that non-wakeup files in same epoll list
> could abort suspend.

Well, if that works for you, it will be better than adding ioctls to evdev
(and presumably a number of other devices).

Care to resubmit with a proper changelog and sign-off?

Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 1/2] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-03-04 22:56           ` Rafael J. Wysocki
@ 2012-03-06  1:04             ` Arve Hjønnevåg
  2012-03-06  1:04               ` [PATCH 2/2] PM / Sleep: Add wakeup_source_activate and wakeup_source_deactivate tracepoints Arve Hjønnevåg
  0 siblings, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-03-06  1:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Matt Helsley, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, jeffbrown,
	Arve Hjønnevåg

When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
wakeup_source will be active to prevent suspend. This can be used to
handle wakeup events from a driver that support poll, e.g. input, if
that driver wakes up the waitqueue passed to epoll before allowing
suspend.

The current implementation uses an extra wakeup_source when
ep_scan_ready_list runs. This can cause problems if a single thread
is polling on wakeup events and frequent non-wakeup events (events
usually arrive during thread freezing) using the same epoll file.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
---
 fs/eventpoll.c            |   71 +++++++++++++++++++++++++++++++++++++++++++--
 include/linux/eventpoll.h |    6 ++++
 2 files changed, 74 insertions(+), 3 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index aabdfc3..6263ac6 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -33,6 +33,7 @@
 #include <linux/bitops.h>
 #include <linux/mutex.h>
 #include <linux/anon_inodes.h>
+#include <linux/device.h>
 #include <asm/uaccess.h>
 #include <asm/system.h>
 #include <asm/io.h>
@@ -88,7 +89,7 @@
  */
 
 /* Epoll private bits inside the event mask */
-#define EP_PRIVATE_BITS (EPOLLONESHOT | EPOLLET)
+#define EP_PRIVATE_BITS (EPOLLWAKEUP | EPOLLONESHOT | EPOLLET)
 
 /* Maximum number of nesting allowed inside epoll sets */
 #define EP_MAX_NESTS 4
@@ -155,6 +156,9 @@ struct epitem {
 	/* List header used to link this item to the "struct file" items list */
 	struct list_head fllink;
 
+	/* wakeup_source used when EPOLLWAKEUP is set */
+	struct wakeup_source *ws;
+
 	/* The structure that describe the interested events and the source fd */
 	struct epoll_event event;
 };
@@ -195,6 +199,9 @@ struct eventpoll {
 	 */
 	struct epitem *ovflist;
 
+	/* wakeup_source used when ep_scan_ready_list is running */
+	struct wakeup_source *ws;
+
 	/* The user that created the eventpoll descriptor */
 	struct user_struct *user;
 
@@ -524,6 +531,7 @@ static int ep_scan_ready_list(struct eventpoll *ep,
 	 * in a lockless way.
 	 */
 	spin_lock_irqsave(&ep->lock, flags);
+	__pm_stay_awake(ep->ws);
 	list_splice_init(&ep->rdllist, &txlist);
 	ep->ovflist = NULL;
 	spin_unlock_irqrestore(&ep->lock, flags);
@@ -547,8 +555,10 @@ static int ep_scan_ready_list(struct eventpoll *ep,
 		 * queued into ->ovflist but the "txlist" might already
 		 * contain them, and the list_splice() below takes care of them.
 		 */
-		if (!ep_is_linked(&epi->rdllink))
+		if (!ep_is_linked(&epi->rdllink)) {
 			list_add_tail(&epi->rdllink, &ep->rdllist);
+			__pm_stay_awake(epi->ws);
+		}
 	}
 	/*
 	 * We need to set back ep->ovflist to EP_UNACTIVE_PTR, so that after
@@ -561,6 +571,7 @@ static int ep_scan_ready_list(struct eventpoll *ep,
 	 * Quickly re-inject items left on "txlist".
 	 */
 	list_splice(&txlist, &ep->rdllist);
+	__pm_relax(ep->ws);
 
 	if (!list_empty(&ep->rdllist)) {
 		/*
@@ -615,6 +626,9 @@ static int ep_remove(struct eventpoll *ep, struct epitem *epi)
 		list_del_init(&epi->rdllink);
 	spin_unlock_irqrestore(&ep->lock, flags);
 
+	if (epi->ws)
+		wakeup_source_unregister(epi->ws);
+
 	/* At this point it is safe to free the eventpoll item */
 	kmem_cache_free(epi_cache, epi);
 
@@ -665,6 +679,8 @@ static void ep_free(struct eventpoll *ep)
 	mutex_unlock(&epmutex);
 	mutex_destroy(&ep->mtx);
 	free_uid(ep->user);
+	if (ep->ws)
+		wakeup_source_unregister(ep->ws);
 	kfree(ep);
 }
 
@@ -693,6 +709,7 @@ static int ep_read_events_proc(struct eventpoll *ep, struct list_head *head,
 			 * callback, but it's not actually ready, as far as
 			 * caller requested events goes. We can remove it here.
 			 */
+			__pm_relax(epi->ws);
 			list_del_init(&epi->rdllink);
 		}
 	}
@@ -877,8 +894,10 @@ static int ep_poll_callback(wait_queue_t *wait, unsigned mode, int sync, void *k
 	}
 
 	/* If this file is already in the ready list we exit soon */
-	if (!ep_is_linked(&epi->rdllink))
+	if (!ep_is_linked(&epi->rdllink)) {
 		list_add_tail(&epi->rdllink, &ep->rdllist);
+		__pm_stay_awake(epi->ws);
+	}
 
 	/*
 	 * Wake up ( if active ) both the eventpoll wait list and the ->poll()
@@ -1034,6 +1053,30 @@ static int reverse_path_check(void)
 	return error;
 }
 
+static int ep_create_wakeup_source(struct epitem *epi)
+{
+	const char *name;
+
+	if (!epi->ep->ws) {
+		epi->ep->ws = wakeup_source_register("eventpoll");
+		if (!epi->ep->ws)
+			return -ENOMEM;
+	}
+
+	name = epi->ffd.file->f_path.dentry->d_name.name;
+	epi->ws = wakeup_source_register(name);
+	if (!epi->ws)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void ep_destroy_wakeup_source(struct epitem *epi)
+{
+	wakeup_source_unregister(epi->ws);
+	epi->ws = NULL;
+}
+
 /*
  * Must be called with "mtx" held.
  */
@@ -1061,6 +1104,13 @@ static int ep_insert(struct eventpoll *ep, struct epoll_event *event,
 	epi->event = *event;
 	epi->nwait = 0;
 	epi->next = EP_UNACTIVE_PTR;
+	if (epi->event.events & EPOLLWAKEUP) {
+		error = ep_create_wakeup_source(epi);
+		if (error)
+			goto error_create_wakeup_source;
+	} else {
+		epi->ws = NULL;
+	}
 
 	/* Initialize the poll table using the queue callback */
 	epq.epi = epi;
@@ -1106,6 +1156,7 @@ static int ep_insert(struct eventpoll *ep, struct epoll_event *event,
 	/* If the file is already "ready" we drop it inside the ready list */
 	if ((revents & event->events) && !ep_is_linked(&epi->rdllink)) {
 		list_add_tail(&epi->rdllink, &ep->rdllist);
+		__pm_stay_awake(epi->ws);
 
 		/* Notify waiting tasks that events are available */
 		if (waitqueue_active(&ep->wq))
@@ -1146,6 +1197,10 @@ error_unregister:
 		list_del_init(&epi->rdllink);
 	spin_unlock_irqrestore(&ep->lock, flags);
 
+	if (epi->ws)
+		wakeup_source_unregister(epi->ws);
+
+error_create_wakeup_source:
 	kmem_cache_free(epi_cache, epi);
 
 	return error;
@@ -1167,6 +1222,12 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_even
 	 */
 	epi->event.events = event->events;
 	epi->event.data = event->data; /* protected by mtx */
+	if (epi->event.events & EPOLLWAKEUP) {
+		if (!epi->ws)
+			ep_create_wakeup_source(epi);
+	} else if (epi->ws) {
+		ep_destroy_wakeup_source(epi);
+	}
 
 	/*
 	 * Get current event bits. We can safely use the file* here because
@@ -1182,6 +1243,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_even
 		spin_lock_irq(&ep->lock);
 		if (!ep_is_linked(&epi->rdllink)) {
 			list_add_tail(&epi->rdllink, &ep->rdllist);
+			__pm_stay_awake(epi->ws);
 
 			/* Notify waiting tasks that events are available */
 			if (waitqueue_active(&ep->wq))
@@ -1217,6 +1279,7 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
 	     !list_empty(head) && eventcnt < esed->maxevents;) {
 		epi = list_first_entry(head, struct epitem, rdllink);
 
+		__pm_relax(epi->ws);
 		list_del_init(&epi->rdllink);
 
 		revents = epi->ffd.file->f_op->poll(epi->ffd.file, NULL) &
@@ -1232,6 +1295,7 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
 			if (__put_user(revents, &uevent->events) ||
 			    __put_user(epi->event.data, &uevent->data)) {
 				list_add(&epi->rdllink, head);
+				__pm_stay_awake(epi->ws);
 				return eventcnt ? eventcnt : -EFAULT;
 			}
 			eventcnt++;
@@ -1251,6 +1315,7 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
 				 * poll callback will queue them in ep->ovflist.
 				 */
 				list_add_tail(&epi->rdllink, &ep->rdllist);
+				__pm_stay_awake(epi->ws);
 			}
 		}
 	}
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 657ab55..520a57c 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -26,6 +26,12 @@
 #define EPOLL_CTL_DEL 2
 #define EPOLL_CTL_MOD 3
 
+/*
+ * Request the handling of system wakeup events so as to prevent automatic
+ * system suspends from happening while those events are being processed.
+ */
+#define EPOLLWAKEUP (1 << 29)
+
 /* Set the One Shot behaviour for the target file descriptor */
 #define EPOLLONESHOT (1 << 30)
 
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [PATCH 2/2] PM / Sleep: Add wakeup_source_activate and wakeup_source_deactivate tracepoints
  2012-03-06  1:04             ` [PATCH 1/2] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready Arve Hjønnevåg
@ 2012-03-06  1:04               ` Arve Hjønnevåg
  0 siblings, 0 replies; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-03-06  1:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Matt Helsley, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, jeffbrown,
	Arve Hjønnevåg

Add tracepoints to wakeup_source_activate and wakeup_source_deactivate.
Useful for checking that specific wakeup sources overlap as expected.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
---
 drivers/base/power/wakeup.c  |   12 +++++++++---
 include/trace/events/power.h |   34 ++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
index a896cc8..94b843d 100644
--- a/drivers/base/power/wakeup.c
+++ b/drivers/base/power/wakeup.c
@@ -14,6 +14,7 @@
 #include <linux/suspend.h>
 #include <linux/seq_file.h>
 #include <linux/debugfs.h>
+#include <trace/events/power.h>
 
 #include "power.h"
 
@@ -375,6 +376,8 @@ EXPORT_SYMBOL_GPL(device_set_wakeup_enable);
  */
 static void wakeup_source_activate(struct wakeup_source *ws)
 {
+	unsigned int cec;
+
 	ws->active = true;
 	ws->active_count++;
 	ws->last_time = ktime_get();
@@ -382,7 +385,9 @@ static void wakeup_source_activate(struct wakeup_source *ws)
 		ws->start_prevent_time = ws->last_time;
 
 	/* Increment the counter of events in progress. */
-	atomic_inc(&combined_event_count);
+	cec = atomic_inc_return(&combined_event_count);
+
+	trace_wakeup_source_activate(ws->name, cec);
 }
 
 /**
@@ -468,7 +473,7 @@ static inline void update_prevent_sleep_time(struct wakeup_source *ws,
  */
 static void wakeup_source_deactivate(struct wakeup_source *ws)
 {
-	unsigned int cnt, inpr;
+	unsigned int cnt, inpr, cec;
 	ktime_t duration;
 	ktime_t now;
 
@@ -506,7 +511,8 @@ static void wakeup_source_deactivate(struct wakeup_source *ws)
 	 * Increment the counter of registered wakeup events and decrement the
 	 * couter of wakeup events in progress simultaneously.
 	 */
-	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
+	cec = atomic_add_return(MAX_IN_PROGRESS, &combined_event_count);
+	trace_wakeup_source_deactivate(ws->name, cec);
 
 	split_counters(&cnt, &inpr);
 	if (!inpr && waitqueue_active(&wakeup_count_wait_queue)) {
diff --git a/include/trace/events/power.h b/include/trace/events/power.h
index 1bcc2a8..5c7b721 100644
--- a/include/trace/events/power.h
+++ b/include/trace/events/power.h
@@ -65,6 +65,40 @@ TRACE_EVENT(machine_suspend,
 	TP_printk("state=%lu", (unsigned long)__entry->state)
 );
 
+DECLARE_EVENT_CLASS(wakeup_source,
+
+	TP_PROTO(const char *name, unsigned int state),
+
+	TP_ARGS(name, state),
+
+	TP_STRUCT__entry(
+		__string(       name,           name            )
+		__field(        u64,            state           )
+	),
+
+	TP_fast_assign(
+		__assign_str(name, name);
+		__entry->state = state;
+	),
+
+	TP_printk("%s state=0x%lx", __get_str(name),
+		(unsigned long)__entry->state)
+);
+
+DEFINE_EVENT(wakeup_source, wakeup_source_activate,
+
+	TP_PROTO(const char *name, unsigned int state),
+
+	TP_ARGS(name, state)
+);
+
+DEFINE_EVENT(wakeup_source, wakeup_source_deactivate,
+
+	TP_PROTO(const char *name, unsigned int state),
+
+	TP_ARGS(name, state)
+);
+
 /* This code will be removed after deprecation time exceeded (2.6.41) */
 #ifdef CONFIG_EVENT_POWER_TRACING_DEPRECATED
 
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3
  2012-02-21 23:31 ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 Rafael J. Wysocki
                     ` (7 preceding siblings ...)
  2012-02-22  4:49   ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 John Stultz
@ 2012-04-22 21:19   ` Rafael J. Wysocki
  2012-04-22 21:19     ` [PATCH 1/8] PM / Sleep: Look for wakeup events in later stages of device suspend Rafael J. Wysocki
                       ` (8 more replies)
  8 siblings, 9 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-22 21:19 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

Hi all,

Following is the third update of the autosleep patchset.

Patches [1-4/8] are regarded as v3.5 material, the rest - depending on
the feedback I get (lack of feedback will be understood as no objections,
though).

On Wednesday, February 22, 2012, Rafael J. Wysocki wrote:
> Hi all,
> 
> After the feedback so far I've decided to follow up with a refreshed patchset.
> The first two patches from the previous one went to linux-pm/linux-next
> and I included the recent evdev patch from Arve (with some modifications)
> to this patchset for completness.
> 
> On Tuesday, February 07, 2012, Rafael J. Wysocki wrote:
> > Hi all,
> > 
> > This series tests the theory that the easiest way to sell a once rejected
> > feature is to advertise it under a different name.
> > 
> > Well, there actually are two different features, although they are closely
> > related to each other.  First, patch [6/8] introduces a feature that allows
> > the kernel to trigger system suspend (or more generally a transition into
> > a sleep state) whenever there are no active wakeup sources (no, they aren't
> > called wakelocks).  It is called "autosleep" here, but it was called a few
> > different names in the past ("opportunistic suspend" was probably the most
> > popular one).  Second, patch [8/8] introduces "wake locks" that are,
> > essentially, wakeup sources which may be created and manipulated by user
> > space.  Using them user space may control the autosleep feature introduced
> > earlier.
> > 
> > This also is a kind of a proof of concept for the people who wanted me to
> > show a kernel-based implementation of automatic suspend, so there you go.
> > Please note, however, that it is done so that the user space "wake locks"
> > interface is compatible with Android in support of its user space.  I don't
> > really like this interface, but since the Android's user space seems to rely
> > on it, I'm fine with using it as is.  YMMV.
> > 
> > Let me say a few words about every patch in the series individually.
> > 
> > [1/8] - This really is a bug fix, so it's v3.4 material.  Nobody has stepped
> >   on this bug so far, but it should be fixed anyway.
> > 
> > [2/8] - This is a freezer cleanup, worth doing anyway IMO, so v3.4 material too.

The two patches above have been merged.

> The above two are in linux-pm/linux-next now.  There are a few more fixes
> related to wakeup sources in there and the patches below are based on that
> branch.
> 
> > [3/8] - This is something we can do no problem, although completely optional
> >   without the autosleep feature.  Rather necessary with it, though.
>
> Now [1/7] - Look for wakeup events in later stages of device suspend.

[1/8] now - Look for wakeup events later down the suspend code path.
 
> > [4/8] - This kind of reintroduces my original idea of using a wait queue for
> >   waiting until there are no wakeup events in progress.  Alan convinced me that
> >   it would be better to poll the counter to prevent wakeup_source_deactivate()
> >   from having to call wake_up_all() occasionally (that may be costly in fast
> >   paths), but then quite some people told me that the wait queue migh be
> >   better.  I think that the polling will make much less sense with autosleep
> >   and user space "wake locks".  Anyway, [4/8] is something we can do without
> >   those things too.
> 
> Now [2/7] - Use wait queue to signal "no wakeup events in progress"
> 
>   With a couple of improvements suggested by Neil.

[2/8] now - Use wait queue to signal "no wakeup events in progress" condition.

> > The patches above were given Sign-off-by tags, because I think they make some
> > sense regardless of the features introcuded by the remaining patches that in
> > turn are total RFC.
> 
> This time all of the patches are signed-off and include the requisite
> documentation changes (hopefully, I haven't forgotten about anything).
> 
> > [5/8] - This changes wakeup source statistics so that they are more similar to
> >   the statistics collected for wakelocks on Android.  The file those statistics
> >   may be read from is still located in debugfs, though (I don't think it
> >   belongs to proc and its name is different from the analogous Android's file
> >   name anyway).  It could be done without autosleep, but then it would be a bit
> >   pointless.  BTW, this changes interfaces that _in_ _theory_ may be used by
> >   someone, but I'm not aware of anyone using them.  If you are one, I'll be
> >   pleased to learn about that, so please tell me who you are. :-)
> 
> Now [3/7] - Change wakeup source statistics to follow Android.
>
>   Rebased and reworked in accordance with the Arve's feedback.

[3/8] now - Change wakeup source statistics to follow Android.

[4/8] - Add tracepoints to wakeup_source_{de}activate()

[5/8] - Teach epoll to use wakeup sources if requested

This should be sufficient to ensure that a wakeup source will be kept active
after a wakeup event all the way up to user space without a need to make a
number of random drivers use wakeup sources.

> > [6/8] - Autosleep implementation.  I think the changelog explains the idea
> >   quite well and the code is really nothing special.  It doesn't really add
> >   anything new to the kernel in terms of infrastructure etc., it just uses
> >   the existing stuff to implement an alternative method of triggering system
> >   sleep transitions.  Note, though, that the interface here is different
> >   from the Android's one, because Android actually modifies /sys/power/state
> >   to trigger something called "early suspend" (that is never going to be
> >   implemented in the "stock" kernel as long as I have any influence on it) and
> >   we simply can't do that in the mainline.
> 
> Now [5/7] - Implement opportunistic sleep
> 
>   Rebased and simplified (most notably, I've dropped the "main" wakeup source,
>   since it wasn't really necessary).

[6/8] now - Implement apportunistic sleep.

> > [7/8] - This adds a wakeup source statistics that only makes sense with
> >   autosleep and (I believe) is analogous to the Android's prevent_suspend_time
> >   statistics.  Nothing really special, but I didn't want
> >   wakeup_source_activate/deactivate() to take a common lock to avoid
> >   congestion.
> 
> Now [6/7] - Add "prevent autosleep time" statistics to wakeup sources.
> 
>   Rebased.

[7/8] now - Add "prevent autosleep time" statistics to wakeup sources.

> > [8/8] - This adds a user space interface to create, activate and deactivate
> >   wakeup sources.  Since the files it consists of are called wake_lock and
> >   wake_unlock, to follow Android, the objects the wakeup sources are wrapped
> >   into are called "wakelocks" (for added confusion).  Since the interface
> >   doesn't provide any means to destroy those "wakelocks", I added a garbage
> >   collection mechanism to get rid of the unused ones, if any.  I also tought
> >   it might be a good idea to put a limit on the number of those things that
> >   user space can operate simultaneously, so I did that too.
> 
> Now [7/7] - Add user space interface for manipulating wakeup sources.

[8/8] now - Add user space interface for manipulating wakeup sources.

> > All of the above has been tested very briefly on my test-bed Mackerel board
> > and it quite obviously requires more thorough testing, but first I need to know
> > if it makes sense to spend any more time on it.
> 
> The above is still accurate, but I also verified that the patches don't break
> my PC test boxes (at least as long as the new features aren't used ;-)).

Nothing has changed in that respect, as far as I can say.

The patches in the following series are available from the autosleep branch in
the linux-pm tree.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 1/8] PM / Sleep: Look for wakeup events in later stages of device suspend
  2012-04-22 21:19   ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Rafael J. Wysocki
@ 2012-04-22 21:19     ` Rafael J. Wysocki
  2012-04-22 21:20     ` [PATCH 2/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress" Rafael J. Wysocki
                       ` (7 subsequent siblings)
  8 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-22 21:19 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

From: Rafael J. Wysocki <rjw@sisk.pl>

Currently, the device suspend code in drivers/base/power/main.c
only checks if there have been any wakeup events, and therefore the
ongoing system transition to a sleep state should be aborted, during
the first (i.e. "suspend") device suspend phase.  However, wakeup
events may be reported later as well, so it's reasonable to look for
them in the in the subsequent (i.e. "late suspend" and "suspend
noirq") phases.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/power/main.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

Index: linux/drivers/base/power/main.c
===================================================================
--- linux.orig/drivers/base/power/main.c
+++ linux/drivers/base/power/main.c
@@ -889,6 +889,11 @@ static int dpm_suspend_noirq(pm_message_
 		if (!list_empty(&dev->power.entry))
 			list_move(&dev->power.entry, &dpm_noirq_list);
 		put_device(dev);
+
+		if (pm_wakeup_pending()) {
+			error = -EBUSY;
+			break;
+		}
 	}
 	mutex_unlock(&dpm_list_mtx);
 	if (error)
@@ -962,6 +967,11 @@ static int dpm_suspend_late(pm_message_t
 		if (!list_empty(&dev->power.entry))
 			list_move(&dev->power.entry, &dpm_late_early_list);
 		put_device(dev);
+
+		if (pm_wakeup_pending()) {
+			error = -EBUSY;
+			break;
+		}
 	}
 	mutex_unlock(&dpm_list_mtx);
 	if (error)


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 2/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress"
  2012-04-22 21:19   ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Rafael J. Wysocki
  2012-04-22 21:19     ` [PATCH 1/8] PM / Sleep: Look for wakeup events in later stages of device suspend Rafael J. Wysocki
@ 2012-04-22 21:20     ` Rafael J. Wysocki
  2012-04-23  4:01       ` mark gross
  2012-04-22 21:21     ` [PATCH 3/8] PM / Sleep: Change wakeup source statistics to follow Android Rafael J. Wysocki
                       ` (6 subsequent siblings)
  8 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-22 21:20 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

From: Rafael J. Wysocki <rjw@sisk.pl>

The current wakeup source deactivation code doesn't do anything when
the counter of wakeup events in progress goes down to zero, which
requires pm_get_wakeup_count() to poll that counter periodically.
Although this reduces the average time it takes to deactivate a
wakeup source, it also may lead to a substantial amount of unnecessary
polling if there are extended periods of wakeup activity.  Thus it
seems reasonable to use a wait queue for signaling the "no wakeup
events in progress" condition and remove the polling.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/power/wakeup.c |   16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -17,8 +17,6 @@
 
 #include "power.h"
 
-#define TIMEOUT		100
-
 /*
  * If set, the suspend/hibernate code will abort transitions to a sleep state
  * if wakeup events are registered during or immediately before the transition.
@@ -52,6 +50,8 @@ static void pm_wakeup_timer_fn(unsigned
 
 static LIST_HEAD(wakeup_sources);
 
+static DECLARE_WAIT_QUEUE_HEAD(wakeup_count_wait_queue);
+
 /**
  * wakeup_source_prepare - Prepare a new wakeup source for initialization.
  * @ws: Wakeup source to prepare.
@@ -442,6 +442,7 @@ EXPORT_SYMBOL_GPL(pm_stay_awake);
  */
 static void wakeup_source_deactivate(struct wakeup_source *ws)
 {
+	unsigned int cnt, inpr;
 	ktime_t duration;
 	ktime_t now;
 
@@ -476,6 +477,10 @@ static void wakeup_source_deactivate(str
 	 * couter of wakeup events in progress simultaneously.
 	 */
 	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
+
+	split_counters(&cnt, &inpr);
+	if (!inpr && waitqueue_active(&wakeup_count_wait_queue))
+		wake_up(&wakeup_count_wait_queue);
 }
 
 /**
@@ -667,14 +672,19 @@ bool pm_wakeup_pending(void)
 bool pm_get_wakeup_count(unsigned int *count)
 {
 	unsigned int cnt, inpr;
+	DEFINE_WAIT(wait);
 
 	for (;;) {
+		prepare_to_wait(&wakeup_count_wait_queue, &wait,
+				TASK_INTERRUPTIBLE);
 		split_counters(&cnt, &inpr);
 		if (inpr == 0 || signal_pending(current))
 			break;
 		pm_wakeup_update_hit_counts();
-		schedule_timeout_interruptible(msecs_to_jiffies(TIMEOUT));
+
+		schedule();
 	}
+	finish_wait(&wakeup_count_wait_queue, &wait);
 
 	split_counters(&cnt, &inpr);
 	*count = cnt;


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 3/8] PM / Sleep: Change wakeup source statistics to follow Android
  2012-04-22 21:19   ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Rafael J. Wysocki
  2012-04-22 21:19     ` [PATCH 1/8] PM / Sleep: Look for wakeup events in later stages of device suspend Rafael J. Wysocki
  2012-04-22 21:20     ` [PATCH 2/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress" Rafael J. Wysocki
@ 2012-04-22 21:21     ` Rafael J. Wysocki
  2012-04-22 21:21     ` [PATCH 4/8] PM / Sleep: Add wakeup_source_activate and wakeup_source_deactivate tracepoints Rafael J. Wysocki
                       ` (5 subsequent siblings)
  8 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-22 21:21 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

From: Rafael J. Wysocki <rjw@sisk.pl>

Wakeup statistics used by Android are slightly different from what we
have in wakeup sources at the moment and there aren't any known
users of those statistics other than Android, so modify them to make
it easier for Android to switch to wakeup sources.

This removes the struct wakeup_source's hit_cout field, which is very
rough and therefore not very useful, and adds two new fields,
wakeup_count and expire_count.  The first one tracks how many times
the wakeup source is activated with events_check_enabled set (which
roughly corresponds to the situations when a system power transition
to a sleep state is in progress and would be aborted by this wakeup
source if it were the only active one at that time) and the second
one is the number of times the wakeup source has been activated with
a timeout that expired.

Additionally, the last_time field is now updated when the wakeup
source is deactivated too (previously it was only updated during
the wakeup source's activation), which seems to be what Android does
with the analogous counter for wakelocks.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/ABI/testing/sysfs-devices-power |   24 ++++++---
 drivers/base/power/sysfs.c                    |   30 ++++++++++--
 drivers/base/power/wakeup.c                   |   64 +++++++++++---------------
 include/linux/pm_wakeup.h                     |   11 ++--
 4 files changed, 77 insertions(+), 52 deletions(-)

Index: linux/include/linux/pm_wakeup.h
===================================================================
--- linux.orig/include/linux/pm_wakeup.h
+++ linux/include/linux/pm_wakeup.h
@@ -33,12 +33,14 @@
  *
  * @total_time: Total time this wakeup source has been active.
  * @max_time: Maximum time this wakeup source has been continuously active.
- * @last_time: Monotonic clock when the wakeup source's was activated last time.
+ * @last_time: Monotonic clock when the wakeup source's was touched last time.
  * @event_count: Number of signaled wakeup events.
  * @active_count: Number of times the wakeup sorce was activated.
  * @relax_count: Number of times the wakeup sorce was deactivated.
- * @hit_count: Number of times the wakeup sorce might abort system suspend.
+ * @expire_count: Number of times the wakeup source's timeout has expired.
+ * @wakeup_count: Number of times the wakeup source might abort suspend.
  * @active: Status of the wakeup source.
+ * @has_timeout: The wakeup source has been activated with a timeout.
  */
 struct wakeup_source {
 	const char 		*name;
@@ -52,8 +54,9 @@ struct wakeup_source {
 	unsigned long		event_count;
 	unsigned long		active_count;
 	unsigned long		relax_count;
-	unsigned long		hit_count;
-	unsigned int		active:1;
+	unsigned long		expire_count;
+	unsigned long		wakeup_count;
+	bool			active:1;
 };
 
 #ifdef CONFIG_PM_SLEEP
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -21,7 +21,7 @@
  * If set, the suspend/hibernate code will abort transitions to a sleep state
  * if wakeup events are registered during or immediately before the transition.
  */
-bool events_check_enabled;
+bool events_check_enabled __read_mostly;
 
 /*
  * Combined counters of registered wakeup events and wakeup events in progress.
@@ -383,6 +383,21 @@ static void wakeup_source_activate(struc
 }
 
 /**
+ * wakeup_source_report_event - Report wakeup event using the given source.
+ * @ws: Wakeup source to report the event for.
+ */
+static void wakeup_source_report_event(struct wakeup_source *ws)
+{
+	ws->event_count++;
+	/* This is racy, but the counter is approximate anyway. */
+	if (events_check_enabled)
+		ws->wakeup_count++;
+
+	if (!ws->active)
+		wakeup_source_activate(ws);
+}
+
+/**
  * __pm_stay_awake - Notify the PM core of a wakeup event.
  * @ws: Wakeup source object associated with the source of the event.
  *
@@ -397,10 +412,7 @@ void __pm_stay_awake(struct wakeup_sourc
 
 	spin_lock_irqsave(&ws->lock, flags);
 
-	ws->event_count++;
-	if (!ws->active)
-		wakeup_source_activate(ws);
-
+	wakeup_source_report_event(ws);
 	del_timer(&ws->timer);
 	ws->timer_expires = 0;
 
@@ -469,6 +481,7 @@ static void wakeup_source_deactivate(str
 	if (ktime_to_ns(duration) > ktime_to_ns(ws->max_time))
 		ws->max_time = duration;
 
+	ws->last_time = now;
 	del_timer(&ws->timer);
 	ws->timer_expires = 0;
 
@@ -541,8 +554,10 @@ static void pm_wakeup_timer_fn(unsigned
 	spin_lock_irqsave(&ws->lock, flags);
 
 	if (ws->active && ws->timer_expires
-	    && time_after_eq(jiffies, ws->timer_expires))
+	    && time_after_eq(jiffies, ws->timer_expires)) {
 		wakeup_source_deactivate(ws);
+		ws->expire_count++;
+	}
 
 	spin_unlock_irqrestore(&ws->lock, flags);
 }
@@ -569,9 +584,7 @@ void __pm_wakeup_event(struct wakeup_sou
 
 	spin_lock_irqsave(&ws->lock, flags);
 
-	ws->event_count++;
-	if (!ws->active)
-		wakeup_source_activate(ws);
+	wakeup_source_report_event(ws);
 
 	if (!msec) {
 		wakeup_source_deactivate(ws);
@@ -614,24 +627,6 @@ void pm_wakeup_event(struct device *dev,
 EXPORT_SYMBOL_GPL(pm_wakeup_event);
 
 /**
- * pm_wakeup_update_hit_counts - Update hit counts of all active wakeup sources.
- */
-static void pm_wakeup_update_hit_counts(void)
-{
-	unsigned long flags;
-	struct wakeup_source *ws;
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(ws, &wakeup_sources, entry) {
-		spin_lock_irqsave(&ws->lock, flags);
-		if (ws->active)
-			ws->hit_count++;
-		spin_unlock_irqrestore(&ws->lock, flags);
-	}
-	rcu_read_unlock();
-}
-
-/**
  * pm_wakeup_pending - Check if power transition in progress should be aborted.
  *
  * Compare the current number of registered wakeup events with its preserved
@@ -653,8 +648,6 @@ bool pm_wakeup_pending(void)
 		events_check_enabled = !ret;
 	}
 	spin_unlock_irqrestore(&events_lock, flags);
-	if (ret)
-		pm_wakeup_update_hit_counts();
 	return ret;
 }
 
@@ -680,7 +673,6 @@ bool pm_get_wakeup_count(unsigned int *c
 		split_counters(&cnt, &inpr);
 		if (inpr == 0 || signal_pending(current))
 			break;
-		pm_wakeup_update_hit_counts();
 
 		schedule();
 	}
@@ -713,8 +705,6 @@ bool pm_save_wakeup_count(unsigned int c
 		events_check_enabled = true;
 	}
 	spin_unlock_irq(&events_lock);
-	if (!events_check_enabled)
-		pm_wakeup_update_hit_counts();
 	return events_check_enabled;
 }
 
@@ -749,9 +739,10 @@ static int print_wakeup_source_stats(str
 		active_time = ktime_set(0, 0);
 	}
 
-	ret = seq_printf(m, "%-12s\t%lu\t\t%lu\t\t%lu\t\t"
+	ret = seq_printf(m, "%-12s\t%lu\t\t%lu\t\t%lu\t\t%lu\t\t"
 			"%lld\t\t%lld\t\t%lld\t\t%lld\n",
-			ws->name, active_count, ws->event_count, ws->hit_count,
+			ws->name, active_count, ws->event_count,
+			ws->wakeup_count, ws->expire_count,
 			ktime_to_ms(active_time), ktime_to_ms(total_time),
 			ktime_to_ms(max_time), ktime_to_ms(ws->last_time));
 
@@ -768,8 +759,9 @@ static int wakeup_sources_stats_show(str
 {
 	struct wakeup_source *ws;
 
-	seq_puts(m, "name\t\tactive_count\tevent_count\thit_count\t"
-		"active_since\ttotal_time\tmax_time\tlast_change\n");
+	seq_puts(m, "name\t\tactive_count\tevent_count\twakeup_count\t"
+		"expire_count\tactive_since\ttotal_time\tmax_time\t"
+		"last_change\n");
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(ws, &wakeup_sources, entry)
Index: linux/drivers/base/power/sysfs.c
===================================================================
--- linux.orig/drivers/base/power/sysfs.c
+++ linux/drivers/base/power/sysfs.c
@@ -314,22 +314,41 @@ static ssize_t wakeup_active_count_show(
 
 static DEVICE_ATTR(wakeup_active_count, 0444, wakeup_active_count_show, NULL);
 
-static ssize_t wakeup_hit_count_show(struct device *dev,
-				struct device_attribute *attr, char *buf)
+static ssize_t wakeup_abort_count_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	unsigned long count = 0;
+	bool enabled = false;
+
+	spin_lock_irq(&dev->power.lock);
+	if (dev->power.wakeup) {
+		count = dev->power.wakeup->wakeup_count;
+		enabled = true;
+	}
+	spin_unlock_irq(&dev->power.lock);
+	return enabled ? sprintf(buf, "%lu\n", count) : sprintf(buf, "\n");
+}
+
+static DEVICE_ATTR(wakeup_abort_count, 0444, wakeup_abort_count_show, NULL);
+
+static ssize_t wakeup_expire_count_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
 {
 	unsigned long count = 0;
 	bool enabled = false;
 
 	spin_lock_irq(&dev->power.lock);
 	if (dev->power.wakeup) {
-		count = dev->power.wakeup->hit_count;
+		count = dev->power.wakeup->expire_count;
 		enabled = true;
 	}
 	spin_unlock_irq(&dev->power.lock);
 	return enabled ? sprintf(buf, "%lu\n", count) : sprintf(buf, "\n");
 }
 
-static DEVICE_ATTR(wakeup_hit_count, 0444, wakeup_hit_count_show, NULL);
+static DEVICE_ATTR(wakeup_expire_count, 0444, wakeup_expire_count_show, NULL);
 
 static ssize_t wakeup_active_show(struct device *dev,
 				struct device_attribute *attr, char *buf)
@@ -486,7 +505,8 @@ static struct attribute *wakeup_attrs[]
 	&dev_attr_wakeup.attr,
 	&dev_attr_wakeup_count.attr,
 	&dev_attr_wakeup_active_count.attr,
-	&dev_attr_wakeup_hit_count.attr,
+	&dev_attr_wakeup_abort_count.attr,
+	&dev_attr_wakeup_expire_count.attr,
 	&dev_attr_wakeup_active.attr,
 	&dev_attr_wakeup_total_time_ms.attr,
 	&dev_attr_wakeup_max_time_ms.attr,
Index: linux/Documentation/ABI/testing/sysfs-devices-power
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-devices-power
+++ linux/Documentation/ABI/testing/sysfs-devices-power
@@ -96,16 +96,26 @@ Description:
 		is read-only.  If the device is not enabled to wake up the
 		system from sleep states, this attribute is not present.
 
-What:		/sys/devices/.../power/wakeup_hit_count
-Date:		September 2010
+What:		/sys/devices/.../power/wakeup_abort_count
+Date:		February 2012
 Contact:	Rafael J. Wysocki <rjw@sisk.pl>
 Description:
-		The /sys/devices/.../wakeup_hit_count attribute contains the
+		The /sys/devices/.../wakeup_abort_count attribute contains the
 		number of times the processing of a wakeup event associated with
-		the device might prevent the system from entering a sleep state.
-		This attribute is read-only.  If the device is not enabled to
-		wake up the system from sleep states, this attribute is not
-		present.
+		the device might have aborted system transition into a sleep
+		state in progress.  This attribute is read-only.  If the device
+		is not enabled to wake up the system from sleep states, this
+		attribute is not present.
+
+What:		/sys/devices/.../power/wakeup_expire_count
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/devices/.../wakeup_expire_count attribute contains the
+		number of times a wakeup event associated with the device has
+		been reported with a timeout that expired.  This attribute is
+		read-only.  If the device is not enabled to wake up the system
+		from sleep states, this attribute is not present.
 
 What:		/sys/devices/.../power/wakeup_active
 Date:		September 2010


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 4/8] PM / Sleep: Add wakeup_source_activate and wakeup_source_deactivate tracepoints
  2012-04-22 21:19   ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Rafael J. Wysocki
                       ` (2 preceding siblings ...)
  2012-04-22 21:21     ` [PATCH 3/8] PM / Sleep: Change wakeup source statistics to follow Android Rafael J. Wysocki
@ 2012-04-22 21:21     ` Rafael J. Wysocki
  2012-04-22 21:22     ` [RFC][PATCH 5/8] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready Rafael J. Wysocki
                       ` (4 subsequent siblings)
  8 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-22 21:21 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

From: Arve Hjønnevåg <arve@android.com>

Add tracepoints to wakeup_source_activate and wakeup_source_deactivate.
Useful for checking that specific wakeup sources overlap as expected.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 drivers/base/power/wakeup.c  |   12 +++++++++---
 include/trace/events/power.h |   34 ++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+), 3 deletions(-)

Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -14,6 +14,7 @@
 #include <linux/suspend.h>
 #include <linux/seq_file.h>
 #include <linux/debugfs.h>
+#include <trace/events/power.h>
 
 #include "power.h"
 
@@ -374,12 +375,16 @@ EXPORT_SYMBOL_GPL(device_set_wakeup_enab
  */
 static void wakeup_source_activate(struct wakeup_source *ws)
 {
+	unsigned int cec;
+
 	ws->active = true;
 	ws->active_count++;
 	ws->last_time = ktime_get();
 
 	/* Increment the counter of events in progress. */
-	atomic_inc(&combined_event_count);
+	cec = atomic_inc_return(&combined_event_count);
+
+	trace_wakeup_source_activate(ws->name, cec);
 }
 
 /**
@@ -454,7 +459,7 @@ EXPORT_SYMBOL_GPL(pm_stay_awake);
  */
 static void wakeup_source_deactivate(struct wakeup_source *ws)
 {
-	unsigned int cnt, inpr;
+	unsigned int cnt, inpr, cec;
 	ktime_t duration;
 	ktime_t now;
 
@@ -489,7 +494,8 @@ static void wakeup_source_deactivate(str
 	 * Increment the counter of registered wakeup events and decrement the
 	 * couter of wakeup events in progress simultaneously.
 	 */
-	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
+	cec = atomic_add_return(MAX_IN_PROGRESS, &combined_event_count);
+	trace_wakeup_source_deactivate(ws->name, cec);
 
 	split_counters(&cnt, &inpr);
 	if (!inpr && waitqueue_active(&wakeup_count_wait_queue))
Index: linux/include/trace/events/power.h
===================================================================
--- linux.orig/include/trace/events/power.h
+++ linux/include/trace/events/power.h
@@ -65,6 +65,40 @@ TRACE_EVENT(machine_suspend,
 	TP_printk("state=%lu", (unsigned long)__entry->state)
 );
 
+DECLARE_EVENT_CLASS(wakeup_source,
+
+	TP_PROTO(const char *name, unsigned int state),
+
+	TP_ARGS(name, state),
+
+	TP_STRUCT__entry(
+		__string(       name,           name            )
+		__field(        u64,            state           )
+	),
+
+	TP_fast_assign(
+		__assign_str(name, name);
+		__entry->state = state;
+	),
+
+	TP_printk("%s state=0x%lx", __get_str(name),
+		(unsigned long)__entry->state)
+);
+
+DEFINE_EVENT(wakeup_source, wakeup_source_activate,
+
+	TP_PROTO(const char *name, unsigned int state),
+
+	TP_ARGS(name, state)
+);
+
+DEFINE_EVENT(wakeup_source, wakeup_source_deactivate,
+
+	TP_PROTO(const char *name, unsigned int state),
+
+	TP_ARGS(name, state)
+);
+
 #ifdef CONFIG_EVENT_POWER_TRACING_DEPRECATED
 
 /*


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 5/8] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-04-22 21:19   ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Rafael J. Wysocki
                       ` (3 preceding siblings ...)
  2012-04-22 21:21     ` [PATCH 4/8] PM / Sleep: Add wakeup_source_activate and wakeup_source_deactivate tracepoints Rafael J. Wysocki
@ 2012-04-22 21:22     ` Rafael J. Wysocki
  2012-04-26  4:03       ` NeilBrown
  2012-04-22 21:23     ` [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep Rafael J. Wysocki
                       ` (3 subsequent siblings)
  8 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-22 21:22 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

From: Arve Hjønnevåg <arve@android.com>

When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
wakeup_source will be active to prevent suspend. This can be used to
handle wakeup events from a driver that support poll, e.g. input, if
that driver wakes up the waitqueue passed to epoll before allowing
suspend.

The current implementation uses an extra wakeup_source when
ep_scan_ready_list runs. This can cause problems if a single thread
is polling on wakeup events and frequent non-wakeup events (events
usually arrive during thread freezing) using the same epoll file.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 fs/eventpoll.c            |   71 ++++++++++++++++++++++++++++++++++++++++++++--
 include/linux/eventpoll.h |    6 +++
 2 files changed, 74 insertions(+), 3 deletions(-)

Index: linux/fs/eventpoll.c
===================================================================
--- linux.orig/fs/eventpoll.c
+++ linux/fs/eventpoll.c
@@ -33,6 +33,7 @@
 #include <linux/bitops.h>
 #include <linux/mutex.h>
 #include <linux/anon_inodes.h>
+#include <linux/device.h>
 #include <asm/uaccess.h>
 #include <asm/io.h>
 #include <asm/mman.h>
@@ -87,7 +88,7 @@
  */
 
 /* Epoll private bits inside the event mask */
-#define EP_PRIVATE_BITS (EPOLLONESHOT | EPOLLET)
+#define EP_PRIVATE_BITS (EPOLLWAKEUP | EPOLLONESHOT | EPOLLET)
 
 /* Maximum number of nesting allowed inside epoll sets */
 #define EP_MAX_NESTS 4
@@ -154,6 +155,9 @@ struct epitem {
 	/* List header used to link this item to the "struct file" items list */
 	struct list_head fllink;
 
+	/* wakeup_source used when EPOLLWAKEUP is set */
+	struct wakeup_source *ws;
+
 	/* The structure that describe the interested events and the source fd */
 	struct epoll_event event;
 };
@@ -194,6 +198,9 @@ struct eventpoll {
 	 */
 	struct epitem *ovflist;
 
+	/* wakeup_source used when ep_scan_ready_list is running */
+	struct wakeup_source *ws;
+
 	/* The user that created the eventpoll descriptor */
 	struct user_struct *user;
 
@@ -565,6 +572,7 @@ static int ep_scan_ready_list(struct eve
 	 * in a lockless way.
 	 */
 	spin_lock_irqsave(&ep->lock, flags);
+	__pm_stay_awake(ep->ws);
 	list_splice_init(&ep->rdllist, &txlist);
 	ep->ovflist = NULL;
 	spin_unlock_irqrestore(&ep->lock, flags);
@@ -588,8 +596,10 @@ static int ep_scan_ready_list(struct eve
 		 * queued into ->ovflist but the "txlist" might already
 		 * contain them, and the list_splice() below takes care of them.
 		 */
-		if (!ep_is_linked(&epi->rdllink))
+		if (!ep_is_linked(&epi->rdllink)) {
 			list_add_tail(&epi->rdllink, &ep->rdllist);
+			__pm_stay_awake(epi->ws);
+		}
 	}
 	/*
 	 * We need to set back ep->ovflist to EP_UNACTIVE_PTR, so that after
@@ -602,6 +612,7 @@ static int ep_scan_ready_list(struct eve
 	 * Quickly re-inject items left on "txlist".
 	 */
 	list_splice(&txlist, &ep->rdllist);
+	__pm_relax(ep->ws);
 
 	if (!list_empty(&ep->rdllist)) {
 		/*
@@ -656,6 +667,9 @@ static int ep_remove(struct eventpoll *e
 		list_del_init(&epi->rdllink);
 	spin_unlock_irqrestore(&ep->lock, flags);
 
+	if (epi->ws)
+		wakeup_source_unregister(epi->ws);
+
 	/* At this point it is safe to free the eventpoll item */
 	kmem_cache_free(epi_cache, epi);
 
@@ -706,6 +720,8 @@ static void ep_free(struct eventpoll *ep
 	mutex_unlock(&epmutex);
 	mutex_destroy(&ep->mtx);
 	free_uid(ep->user);
+	if (ep->ws)
+		wakeup_source_unregister(ep->ws);
 	kfree(ep);
 }
 
@@ -737,6 +753,7 @@ static int ep_read_events_proc(struct ev
 			 * callback, but it's not actually ready, as far as
 			 * caller requested events goes. We can remove it here.
 			 */
+			__pm_relax(epi->ws);
 			list_del_init(&epi->rdllink);
 		}
 	}
@@ -932,8 +949,10 @@ static int ep_poll_callback(wait_queue_t
 	}
 
 	/* If this file is already in the ready list we exit soon */
-	if (!ep_is_linked(&epi->rdllink))
+	if (!ep_is_linked(&epi->rdllink)) {
 		list_add_tail(&epi->rdllink, &ep->rdllist);
+		__pm_stay_awake(epi->ws);
+	}
 
 	/*
 	 * Wake up ( if active ) both the eventpoll wait list and the ->poll()
@@ -1091,6 +1110,30 @@ static int reverse_path_check(void)
 	return error;
 }
 
+static int ep_create_wakeup_source(struct epitem *epi)
+{
+	const char *name;
+
+	if (!epi->ep->ws) {
+		epi->ep->ws = wakeup_source_register("eventpoll");
+		if (!epi->ep->ws)
+			return -ENOMEM;
+	}
+
+	name = epi->ffd.file->f_path.dentry->d_name.name;
+	epi->ws = wakeup_source_register(name);
+	if (!epi->ws)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void ep_destroy_wakeup_source(struct epitem *epi)
+{
+	wakeup_source_unregister(epi->ws);
+	epi->ws = NULL;
+}
+
 /*
  * Must be called with "mtx" held.
  */
@@ -1118,6 +1161,13 @@ static int ep_insert(struct eventpoll *e
 	epi->event = *event;
 	epi->nwait = 0;
 	epi->next = EP_UNACTIVE_PTR;
+	if (epi->event.events & EPOLLWAKEUP) {
+		error = ep_create_wakeup_source(epi);
+		if (error)
+			goto error_create_wakeup_source;
+	} else {
+		epi->ws = NULL;
+	}
 
 	/* Initialize the poll table using the queue callback */
 	epq.epi = epi;
@@ -1164,6 +1214,7 @@ static int ep_insert(struct eventpoll *e
 	/* If the file is already "ready" we drop it inside the ready list */
 	if ((revents & event->events) && !ep_is_linked(&epi->rdllink)) {
 		list_add_tail(&epi->rdllink, &ep->rdllist);
+		__pm_stay_awake(epi->ws);
 
 		/* Notify waiting tasks that events are available */
 		if (waitqueue_active(&ep->wq))
@@ -1204,6 +1255,10 @@ error_unregister:
 		list_del_init(&epi->rdllink);
 	spin_unlock_irqrestore(&ep->lock, flags);
 
+	if (epi->ws)
+		wakeup_source_unregister(epi->ws);
+
+error_create_wakeup_source:
 	kmem_cache_free(epi_cache, epi);
 
 	return error;
@@ -1229,6 +1284,12 @@ static int ep_modify(struct eventpoll *e
 	epi->event.events = event->events;
 	pt._key = event->events;
 	epi->event.data = event->data; /* protected by mtx */
+	if (epi->event.events & EPOLLWAKEUP) {
+		if (!epi->ws)
+			ep_create_wakeup_source(epi);
+	} else if (epi->ws) {
+		ep_destroy_wakeup_source(epi);
+	}
 
 	/*
 	 * Get current event bits. We can safely use the file* here because
@@ -1244,6 +1305,7 @@ static int ep_modify(struct eventpoll *e
 		spin_lock_irq(&ep->lock);
 		if (!ep_is_linked(&epi->rdllink)) {
 			list_add_tail(&epi->rdllink, &ep->rdllist);
+			__pm_stay_awake(epi->ws);
 
 			/* Notify waiting tasks that events are available */
 			if (waitqueue_active(&ep->wq))
@@ -1282,6 +1344,7 @@ static int ep_send_events_proc(struct ev
 	     !list_empty(head) && eventcnt < esed->maxevents;) {
 		epi = list_first_entry(head, struct epitem, rdllink);
 
+		__pm_relax(epi->ws);
 		list_del_init(&epi->rdllink);
 
 		pt._key = epi->event.events;
@@ -1298,6 +1361,7 @@ static int ep_send_events_proc(struct ev
 			if (__put_user(revents, &uevent->events) ||
 			    __put_user(epi->event.data, &uevent->data)) {
 				list_add(&epi->rdllink, head);
+				__pm_stay_awake(epi->ws);
 				return eventcnt ? eventcnt : -EFAULT;
 			}
 			eventcnt++;
@@ -1317,6 +1381,7 @@ static int ep_send_events_proc(struct ev
 				 * poll callback will queue them in ep->ovflist.
 				 */
 				list_add_tail(&epi->rdllink, &ep->rdllist);
+				__pm_stay_awake(epi->ws);
 			}
 		}
 	}
Index: linux/include/linux/eventpoll.h
===================================================================
--- linux.orig/include/linux/eventpoll.h
+++ linux/include/linux/eventpoll.h
@@ -26,6 +26,12 @@
 #define EPOLL_CTL_DEL 2
 #define EPOLL_CTL_MOD 3
 
+/*
+ * Request the handling of system wakeup events so as to prevent automatic
+ * system suspends from happening while those events are being processed.
+ */
+#define EPOLLWAKEUP (1 << 29)
+
 /* Set the One Shot behaviour for the target file descriptor */
 #define EPOLLONESHOT (1 << 30)
 


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
  2012-04-22 21:19   ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Rafael J. Wysocki
                       ` (4 preceding siblings ...)
  2012-04-22 21:22     ` [RFC][PATCH 5/8] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready Rafael J. Wysocki
@ 2012-04-22 21:23     ` Rafael J. Wysocki
  2012-04-26  3:05       ` NeilBrown
  2012-04-22 21:24     ` [RFC][PATCH 7/8] PM / Sleep: Add "prevent autosleep time" statistics to wakeup sources Rafael J. Wysocki
                       ` (2 subsequent siblings)
  8 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-22 21:23 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

From: Rafael J. Wysocki <rjw@sisk.pl>

Introduce a mechanism by which the kernel can trigger global
transitions to a sleep state chosen by user space if there are no
active wakeup sources.

It consists of a new sysfs attribute, /sys/power/autosleep, that
can be written one of the strings returned by reads from
/sys/power/state, an ordered workqueue and a work item carrying out
the "suspend" operations.  If a string representing the system's
sleep state is written to /sys/power/autosleep, the work item
triggering transitions to that state is queued up and it requeues
itself after every execution until user space writes "off" to
/sys/power/autosleep.

That work item enables the detection of wakeup events using the
functions already defined in drivers/base/power/wakeup.c (with one
small modification) and calls either pm_suspend(), or hibernate() to
put the system into a sleep state.  If a wakeup event is reported
while the transition is in progress, it will abort the transition and
the "system suspend" work item will be queued up again.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/ABI/testing/sysfs-power |   17 ++++
 drivers/base/power/wakeup.c           |   38 ++++++-----
 include/linux/suspend.h               |   13 +++
 kernel/power/Kconfig                  |    8 ++
 kernel/power/Makefile                 |    1 
 kernel/power/autosleep.c              |  113 ++++++++++++++++++++++++++++++++
 kernel/power/main.c                   |  117 ++++++++++++++++++++++++++++------
 kernel/power/power.h                  |   18 +++++
 8 files changed, 290 insertions(+), 35 deletions(-)

Index: linux/kernel/power/Makefile
===================================================================
--- linux.orig/kernel/power/Makefile
+++ linux/kernel/power/Makefile
@@ -9,5 +9,6 @@ obj-$(CONFIG_SUSPEND)		+= suspend.o
 obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend_test.o
 obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
 				   block_io.o
+obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
 
 obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
Index: linux/kernel/power/Kconfig
===================================================================
--- linux.orig/kernel/power/Kconfig
+++ linux/kernel/power/Kconfig
@@ -103,6 +103,14 @@ config PM_SLEEP_SMP
 	select HOTPLUG
 	select HOTPLUG_CPU
 
+config PM_AUTOSLEEP
+	bool "Opportunistic sleep"
+	depends on PM_SLEEP
+	default n
+	---help---
+	Allow the kernel to trigger a system transition into a global sleep
+	state automatically whenever there are no active wakeup sources.
+
 config PM_RUNTIME
 	bool "Run-time PM core functionality"
 	depends on !IA64_HP_SIM
Index: linux/kernel/power/power.h
===================================================================
--- linux.orig/kernel/power/power.h
+++ linux/kernel/power/power.h
@@ -264,3 +264,21 @@ static inline void suspend_thaw_processe
 {
 }
 #endif
+
+#ifdef CONFIG_PM_AUTOSLEEP
+
+/* kernel/power/autosleep.c */
+extern int pm_autosleep_init(void);
+extern int pm_autosleep_lock(void);
+extern void pm_autosleep_unlock(void);
+extern suspend_state_t pm_autosleep_state(void);
+extern int pm_autosleep_set_state(suspend_state_t state);
+
+#else /* !CONFIG_PM_AUTOSLEEP */
+
+static inline int pm_autosleep_init(void) { return 0; }
+static inline int pm_autosleep_lock(void) { return 0; }
+static inline void pm_autosleep_unlock(void) {}
+static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
+
+#endif /* !CONFIG_PM_AUTOSLEEP */
Index: linux/include/linux/suspend.h
===================================================================
--- linux.orig/include/linux/suspend.h
+++ linux/include/linux/suspend.h
@@ -356,7 +356,7 @@ extern int unregister_pm_notifier(struct
 extern bool events_check_enabled;
 
 extern bool pm_wakeup_pending(void);
-extern bool pm_get_wakeup_count(unsigned int *count);
+extern bool pm_get_wakeup_count(unsigned int *count, bool block);
 extern bool pm_save_wakeup_count(unsigned int count);
 
 static inline void lock_system_sleep(void)
@@ -407,6 +407,17 @@ static inline void unlock_system_sleep(v
 
 #endif /* !CONFIG_PM_SLEEP */
 
+#ifdef CONFIG_PM_AUTOSLEEP
+
+/* kernel/power/autosleep.c */
+void queue_up_suspend_work(void);
+
+#else /* !CONFIG_PM_AUTOSLEEP */
+
+static inline void queue_up_suspend_work(void) {}
+
+#endif /* !CONFIG_PM_AUTOSLEEP */
+
 #ifdef CONFIG_ARCH_SAVE_PAGE_KEYS
 /*
  * The ARCH_SAVE_PAGE_KEYS functions can be used by an architecture
Index: linux/kernel/power/autosleep.c
===================================================================
--- /dev/null
+++ linux/kernel/power/autosleep.c
@@ -0,0 +1,113 @@
+/*
+ * kernel/power/autosleep.c
+ *
+ * Opportunistic sleep support.
+ *
+ * Copyright (C) 2012 Rafael J. Wysocki <rjw@sisk.pl>
+ */
+
+#include <linux/device.h>
+#include <linux/mutex.h>
+#include <linux/pm_wakeup.h>
+
+#include "power.h"
+
+static suspend_state_t autosleep_state;
+static struct workqueue_struct *autosleep_wq;
+static DEFINE_MUTEX(autosleep_lock);
+static struct wakeup_source *autosleep_ws;
+
+static void try_to_suspend(struct work_struct *work)
+{
+	unsigned int initial_count, final_count;
+
+	if (!pm_get_wakeup_count(&initial_count, true))
+		goto out;
+
+	mutex_lock(&autosleep_lock);
+
+	if (!pm_save_wakeup_count(initial_count)) {
+		mutex_unlock(&autosleep_lock);
+		goto out;
+	}
+
+	if (autosleep_state == PM_SUSPEND_ON) {
+		mutex_unlock(&autosleep_lock);
+		return;
+	}
+	if (autosleep_state >= PM_SUSPEND_MAX)
+		hibernate();
+	else
+		pm_suspend(autosleep_state);
+
+	mutex_unlock(&autosleep_lock);
+
+	if (!pm_get_wakeup_count(&final_count, false))
+		goto out;
+
+	if (final_count == initial_count)
+		schedule_timeout(HZ / 2);
+
+ out:
+	queue_up_suspend_work();
+}
+
+static DECLARE_WORK(suspend_work, try_to_suspend);
+
+void queue_up_suspend_work(void)
+{
+	if (!work_pending(&suspend_work) && autosleep_state > PM_SUSPEND_ON)
+		queue_work(autosleep_wq, &suspend_work);
+}
+
+suspend_state_t pm_autosleep_state(void)
+{
+	return autosleep_state;
+}
+
+int pm_autosleep_lock(void)
+{
+	return mutex_lock_interruptible(&autosleep_lock);
+}
+
+void pm_autosleep_unlock(void)
+{
+	mutex_unlock(&autosleep_lock);
+}
+
+int pm_autosleep_set_state(suspend_state_t state)
+{
+
+#ifndef CONFIG_HIBERNATION
+	if (state >= PM_SUSPEND_MAX)
+		return -EINVAL;
+#endif
+
+	__pm_stay_awake(autosleep_ws);
+
+	mutex_lock(&autosleep_lock);
+
+	autosleep_state = state;
+
+	__pm_relax(autosleep_ws);
+
+	if (state > PM_SUSPEND_ON)
+		queue_up_suspend_work();
+
+	mutex_unlock(&autosleep_lock);
+	return 0;
+}
+
+int __init pm_autosleep_init(void)
+{
+	autosleep_ws = wakeup_source_register("autosleep");
+	if (!autosleep_ws)
+		return -ENOMEM;
+
+	autosleep_wq = alloc_ordered_workqueue("autosleep", 0);
+	if (autosleep_wq)
+		return 0;
+
+	wakeup_source_unregister(autosleep_ws);
+	return -ENOMEM;
+}
Index: linux/kernel/power/main.c
===================================================================
--- linux.orig/kernel/power/main.c
+++ linux/kernel/power/main.c
@@ -269,8 +269,7 @@ static ssize_t state_show(struct kobject
 	return (s - buf);
 }
 
-static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
-			   const char *buf, size_t n)
+static suspend_state_t decode_state(const char *buf, size_t n)
 {
 #ifdef CONFIG_SUSPEND
 	suspend_state_t state = PM_SUSPEND_STANDBY;
@@ -278,27 +277,48 @@ static ssize_t state_store(struct kobjec
 #endif
 	char *p;
 	int len;
-	int error = -EINVAL;
 
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
-	/* First, check if we are requested to hibernate */
-	if (len == 4 && !strncmp(buf, "disk", len)) {
-		error = hibernate();
-		goto Exit;
-	}
+	/* Check hibernation first. */
+	if (len == 4 && !strncmp(buf, "disk", len))
+		return PM_SUSPEND_MAX;
 
 #ifdef CONFIG_SUSPEND
-	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
-		if (*s && len == strlen(*s) && !strncmp(buf, *s, len)) {
-			error = pm_suspend(state);
-			break;
-		}
-	}
+	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++)
+		if (*s && len == strlen(*s) && !strncmp(buf, *s, len))
+			return state;
 #endif
 
- Exit:
+	return PM_SUSPEND_ON;
+}
+
+static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
+			   const char *buf, size_t n)
+{
+	suspend_state_t state;
+	int error;
+
+	error = pm_autosleep_lock();
+	if (error)
+		return error;
+
+	if (pm_autosleep_state() > PM_SUSPEND_ON) {
+		error = -EBUSY;
+		goto out;
+	}
+
+	state = decode_state(buf, n);
+	if (state < PM_SUSPEND_MAX)
+		error = pm_suspend(state);
+	else if (state > PM_SUSPEND_ON)
+		error = hibernate();
+	else
+		error = -EINVAL;
+
+ out:
+	pm_autosleep_unlock();
 	return error ? error : n;
 }
 
@@ -339,7 +359,8 @@ static ssize_t wakeup_count_show(struct
 {
 	unsigned int val;
 
-	return pm_get_wakeup_count(&val) ? sprintf(buf, "%u\n", val) : -EINTR;
+	return pm_get_wakeup_count(&val, true) ?
+		sprintf(buf, "%u\n", val) : -EINTR;
 }
 
 static ssize_t wakeup_count_store(struct kobject *kobj,
@@ -347,15 +368,69 @@ static ssize_t wakeup_count_store(struct
 				const char *buf, size_t n)
 {
 	unsigned int val;
+	int error;
+
+	error = pm_autosleep_lock();
+	if (error)
+		return error;
+
+	if (pm_autosleep_state() > PM_SUSPEND_ON) {
+		error = -EBUSY;
+		goto out;
+	}
 
 	if (sscanf(buf, "%u", &val) == 1) {
 		if (pm_save_wakeup_count(val))
 			return n;
 	}
-	return -EINVAL;
+	error = -EINVAL;
+
+ out:
+	pm_autosleep_unlock();
+	return error;
 }
 
 power_attr(wakeup_count);
+
+#ifdef CONFIG_PM_AUTOSLEEP
+static ssize_t autosleep_show(struct kobject *kobj,
+			      struct kobj_attribute *attr,
+			      char *buf)
+{
+	suspend_state_t state = pm_autosleep_state();
+
+	if (state == PM_SUSPEND_ON)
+		return sprintf(buf, "off\n");
+
+#ifdef CONFIG_SUSPEND
+	if (state < PM_SUSPEND_MAX)
+		return sprintf(buf, "%s\n", valid_state(state) ?
+						pm_states[state] : "error");
+#endif
+#ifdef CONFIG_HIBERNATION
+	return sprintf(buf, "disk\n");
+#else
+	return sprintf(buf, "error");
+#endif
+}
+
+static ssize_t autosleep_store(struct kobject *kobj,
+			       struct kobj_attribute *attr,
+			       const char *buf, size_t n)
+{
+	suspend_state_t state = decode_state(buf, n);
+	int error;
+
+	if (state == PM_SUSPEND_ON && strncmp(buf, "off", 3)
+	    && strncmp(buf, "off\n", 4))
+		return -EINVAL;
+
+	error = pm_autosleep_set_state(state);
+	return error ? error : n;
+}
+
+power_attr(autosleep);
+#endif /* CONFIG_PM_AUTOSLEEP */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_TRACE
@@ -409,6 +484,9 @@ static struct attribute * g[] = {
 #ifdef CONFIG_PM_SLEEP
 	&pm_async_attr.attr,
 	&wakeup_count_attr.attr,
+#ifdef CONFIG_PM_AUTOSLEEP
+	&autosleep_attr.attr,
+#endif
 #ifdef CONFIG_PM_DEBUG
 	&pm_test_attr.attr,
 #endif
@@ -444,7 +522,10 @@ static int __init pm_init(void)
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
-	return sysfs_create_group(power_kobj, &attr_group);
+	error = sysfs_create_group(power_kobj, &attr_group);
+	if (error)
+		return error;
+	return pm_autosleep_init();
 }
 
 core_initcall(pm_init);
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -498,8 +498,10 @@ static void wakeup_source_deactivate(str
 	trace_wakeup_source_deactivate(ws->name, cec);
 
 	split_counters(&cnt, &inpr);
-	if (!inpr && waitqueue_active(&wakeup_count_wait_queue))
+	if (!inpr && waitqueue_active(&wakeup_count_wait_queue)) {
 		wake_up(&wakeup_count_wait_queue);
+		queue_up_suspend_work();
+	}
 }
 
 /**
@@ -660,29 +662,33 @@ bool pm_wakeup_pending(void)
 /**
  * pm_get_wakeup_count - Read the number of registered wakeup events.
  * @count: Address to store the value at.
+ * @block: Whether or not to block.
  *
- * Store the number of registered wakeup events at the address in @count.  Block
- * if the current number of wakeup events being processed is nonzero.
+ * Store the number of registered wakeup events at the address in @count.  If
+ * @block is set, block until the current number of wakeup events being
+ * processed is zero.
  *
- * Return 'false' if the wait for the number of wakeup events being processed to
- * drop down to zero has been interrupted by a signal (and the current number
- * of wakeup events being processed is still nonzero).  Otherwise return 'true'.
+ * Return 'false' if the current number of wakeup events being processed is
+ * nonzero.  Otherwise return 'true'.
  */
-bool pm_get_wakeup_count(unsigned int *count)
+bool pm_get_wakeup_count(unsigned int *count, bool block)
 {
 	unsigned int cnt, inpr;
-	DEFINE_WAIT(wait);
 
-	for (;;) {
-		prepare_to_wait(&wakeup_count_wait_queue, &wait,
-				TASK_INTERRUPTIBLE);
-		split_counters(&cnt, &inpr);
-		if (inpr == 0 || signal_pending(current))
-			break;
+	if (block) {
+		DEFINE_WAIT(wait);
 
-		schedule();
+		for (;;) {
+			prepare_to_wait(&wakeup_count_wait_queue, &wait,
+					TASK_INTERRUPTIBLE);
+			split_counters(&cnt, &inpr);
+			if (inpr == 0 || signal_pending(current))
+				break;
+
+			schedule();
+		}
+		finish_wait(&wakeup_count_wait_queue, &wait);
 	}
-	finish_wait(&wakeup_count_wait_queue, &wait);
 
 	split_counters(&cnt, &inpr);
 	*count = cnt;
Index: linux/Documentation/ABI/testing/sysfs-power
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-power
+++ linux/Documentation/ABI/testing/sysfs-power
@@ -172,3 +172,20 @@ Description:
 
 		Reading from this file will display the current value, which is
 		set to 1 MB by default.
+
+What:		/sys/power/autosleep
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/autosleep file can be written one of the strings
+		returned by reads from /sys/power/state.  If that happens, a
+		work item attempting to trigger a transition of the system to
+		the sleep state represented by that string is queued up.  This
+		attempt will only succeed if there are no active wakeup sources
+		in the system at that time.  After evey execution, regardless
+		of whether or not the attempt to put the system to sleep has
+		succeeded, the work item requeues itself until user space
+		writes "off" to /sys/power/autosleep.
+
+		Reading from this file causes the last string successfully
+		written to it to be displayed.


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 7/8] PM / Sleep: Add "prevent autosleep time" statistics to wakeup sources
  2012-04-22 21:19   ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Rafael J. Wysocki
                       ` (5 preceding siblings ...)
  2012-04-22 21:23     ` [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep Rafael J. Wysocki
@ 2012-04-22 21:24     ` Rafael J. Wysocki
  2012-04-22 21:24     ` [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating " Rafael J. Wysocki
  2012-04-23 16:49     ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Greg KH
  8 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-22 21:24 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

From: Rafael J. Wysocki <rjw@sisk.pl>

Android uses one wakelock statistics that is only necessary for
opportunistic sleep.  Namely, the prevent_suspend_time field
accumulates the total time the given wakelock has been locked
while "automatic suspend" was enabled.  Add an analogous field,
prevent_sleep_time, to wakeup sources and make it behave in a similar
way.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/ABI/testing/sysfs-devices-power |   11 ++++
 drivers/base/power/sysfs.c                    |   24 ++++++++++
 drivers/base/power/wakeup.c                   |   61 ++++++++++++++++++++++++--
 include/linux/pm_wakeup.h                     |    4 +
 include/linux/suspend.h                       |    1 
 kernel/power/autosleep.c                      |    6 ++
 6 files changed, 102 insertions(+), 5 deletions(-)

Index: linux/include/linux/pm_wakeup.h
===================================================================
--- linux.orig/include/linux/pm_wakeup.h
+++ linux/include/linux/pm_wakeup.h
@@ -34,6 +34,7 @@
  * @total_time: Total time this wakeup source has been active.
  * @max_time: Maximum time this wakeup source has been continuously active.
  * @last_time: Monotonic clock when the wakeup source's was touched last time.
+ * @prevent_sleep_time: Total time this source has been preventing autosleep.
  * @event_count: Number of signaled wakeup events.
  * @active_count: Number of times the wakeup sorce was activated.
  * @relax_count: Number of times the wakeup sorce was deactivated.
@@ -51,12 +52,15 @@ struct wakeup_source {
 	ktime_t total_time;
 	ktime_t max_time;
 	ktime_t last_time;
+	ktime_t start_prevent_time;
+	ktime_t prevent_sleep_time;
 	unsigned long		event_count;
 	unsigned long		active_count;
 	unsigned long		relax_count;
 	unsigned long		expire_count;
 	unsigned long		wakeup_count;
 	bool			active:1;
+	bool			autosleep_enabled:1;
 };
 
 #ifdef CONFIG_PM_SLEEP
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -380,6 +380,8 @@ static void wakeup_source_activate(struc
 	ws->active = true;
 	ws->active_count++;
 	ws->last_time = ktime_get();
+	if (ws->autosleep_enabled)
+		ws->start_prevent_time = ws->last_time;
 
 	/* Increment the counter of events in progress. */
 	cec = atomic_inc_return(&combined_event_count);
@@ -449,6 +451,17 @@ void pm_stay_awake(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(pm_stay_awake);
 
+#ifdef CONFIG_PM_AUTOSLEEP
+static void update_prevent_sleep_time(struct wakeup_source *ws, ktime_t now)
+{
+	ktime_t delta = ktime_sub(now, ws->start_prevent_time);
+	ws->prevent_sleep_time = ktime_add(ws->prevent_sleep_time, delta);
+}
+#else
+static inline void update_prevent_sleep_time(struct wakeup_source *ws,
+					     ktime_t now) {}
+#endif
+
 /**
  * wakup_source_deactivate - Mark given wakeup source as inactive.
  * @ws: Wakeup source to handle.
@@ -490,6 +503,9 @@ static void wakeup_source_deactivate(str
 	del_timer(&ws->timer);
 	ws->timer_expires = 0;
 
+	if (ws->autosleep_enabled)
+		update_prevent_sleep_time(ws, now);
+
 	/*
 	 * Increment the counter of registered wakeup events and decrement the
 	 * couter of wakeup events in progress simultaneously.
@@ -720,6 +736,34 @@ bool pm_save_wakeup_count(unsigned int c
 	return events_check_enabled;
 }
 
+#ifdef CONFIG_PM_AUTOSLEEP
+/**
+ * pm_wakep_autosleep_enabled - Modify autosleep_enabled for all wakeup sources.
+ * @enabled: Whether to set or to clear the autosleep_enabled flags.
+ */
+void pm_wakep_autosleep_enabled(bool set)
+{
+	struct wakeup_source *ws;
+	ktime_t now = ktime_get();
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(ws, &wakeup_sources, entry) {
+		spin_lock_irq(&ws->lock);
+		if (ws->autosleep_enabled != set) {
+			ws->autosleep_enabled = set;
+			if (ws->active) {
+				if (set)
+					ws->start_prevent_time = now;
+				else
+					update_prevent_sleep_time(ws, now);
+			}
+		}
+		spin_unlock_irq(&ws->lock);
+	}
+	rcu_read_unlock();
+}
+#endif /* CONFIG_PM_AUTOSLEEP */
+
 static struct dentry *wakeup_sources_stats_dentry;
 
 /**
@@ -735,28 +779,37 @@ static int print_wakeup_source_stats(str
 	ktime_t max_time;
 	unsigned long active_count;
 	ktime_t active_time;
+	ktime_t prevent_sleep_time;
 	int ret;
 
 	spin_lock_irqsave(&ws->lock, flags);
 
 	total_time = ws->total_time;
 	max_time = ws->max_time;
+	prevent_sleep_time = ws->prevent_sleep_time;
 	active_count = ws->active_count;
 	if (ws->active) {
-		active_time = ktime_sub(ktime_get(), ws->last_time);
+		ktime_t now = ktime_get();
+
+		active_time = ktime_sub(now, ws->last_time);
 		total_time = ktime_add(total_time, active_time);
 		if (active_time.tv64 > max_time.tv64)
 			max_time = active_time;
+
+		if (ws->autosleep_enabled)
+			prevent_sleep_time = ktime_add(prevent_sleep_time,
+				ktime_sub(now, ws->start_prevent_time));
 	} else {
 		active_time = ktime_set(0, 0);
 	}
 
 	ret = seq_printf(m, "%-12s\t%lu\t\t%lu\t\t%lu\t\t%lu\t\t"
-			"%lld\t\t%lld\t\t%lld\t\t%lld\n",
+			"%lld\t\t%lld\t\t%lld\t\t%lld\t\t%lld\n",
 			ws->name, active_count, ws->event_count,
 			ws->wakeup_count, ws->expire_count,
 			ktime_to_ms(active_time), ktime_to_ms(total_time),
-			ktime_to_ms(max_time), ktime_to_ms(ws->last_time));
+			ktime_to_ms(max_time), ktime_to_ms(ws->last_time),
+			ktime_to_ms(prevent_sleep_time));
 
 	spin_unlock_irqrestore(&ws->lock, flags);
 
@@ -773,7 +826,7 @@ static int wakeup_sources_stats_show(str
 
 	seq_puts(m, "name\t\tactive_count\tevent_count\twakeup_count\t"
 		"expire_count\tactive_since\ttotal_time\tmax_time\t"
-		"last_change\n");
+		"last_change\tprevent_suspend_time\n");
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(ws, &wakeup_sources, entry)
Index: linux/include/linux/suspend.h
===================================================================
--- linux.orig/include/linux/suspend.h
+++ linux/include/linux/suspend.h
@@ -358,6 +358,7 @@ extern bool events_check_enabled;
 extern bool pm_wakeup_pending(void);
 extern bool pm_get_wakeup_count(unsigned int *count, bool block);
 extern bool pm_save_wakeup_count(unsigned int count);
+extern void pm_wakep_autosleep_enabled(bool set);
 
 static inline void lock_system_sleep(void)
 {
Index: linux/drivers/base/power/sysfs.c
===================================================================
--- linux.orig/drivers/base/power/sysfs.c
+++ linux/drivers/base/power/sysfs.c
@@ -417,6 +417,27 @@ static ssize_t wakeup_last_time_show(str
 }
 
 static DEVICE_ATTR(wakeup_last_time_ms, 0444, wakeup_last_time_show, NULL);
+
+#ifdef CONFIG_PM_AUTOSLEEP
+static ssize_t wakeup_prevent_sleep_time_show(struct device *dev,
+					      struct device_attribute *attr,
+					      char *buf)
+{
+	s64 msec = 0;
+	bool enabled = false;
+
+	spin_lock_irq(&dev->power.lock);
+	if (dev->power.wakeup) {
+		msec = ktime_to_ms(dev->power.wakeup->prevent_sleep_time);
+		enabled = true;
+	}
+	spin_unlock_irq(&dev->power.lock);
+	return enabled ? sprintf(buf, "%lld\n", msec) : sprintf(buf, "\n");
+}
+
+static DEVICE_ATTR(wakeup_prevent_sleep_time_ms, 0444,
+		   wakeup_prevent_sleep_time_show, NULL);
+#endif /* CONFIG_PM_AUTOSLEEP */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_ADVANCED_DEBUG
@@ -511,6 +532,9 @@ static struct attribute *wakeup_attrs[]
 	&dev_attr_wakeup_total_time_ms.attr,
 	&dev_attr_wakeup_max_time_ms.attr,
 	&dev_attr_wakeup_last_time_ms.attr,
+#ifdef CONFIG_PM_AUTOSLEEP
+	&dev_attr_wakeup_prevent_sleep_time_ms.attr,
+#endif
 #endif
 	NULL,
 };
Index: linux/Documentation/ABI/testing/sysfs-devices-power
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-devices-power
+++ linux/Documentation/ABI/testing/sysfs-devices-power
@@ -158,6 +158,17 @@ Description:
 		not enabled to wake up the system from sleep states, this
 		attribute is not present.
 
+What:		/sys/devices/.../power/wakeup_prevent_sleep_time_ms
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/devices/.../wakeup_prevent_sleep_time_ms attribute
+		contains the total time the device has been preventing
+		opportunistic transitions to sleep states from occuring.
+		This attribute is read-only.  If the device is not enabled to
+		wake up the system from sleep states, this attribute is not
+		present.
+
 What:		/sys/devices/.../power/autosuspend_delay_ms
 Date:		September 2010
 Contact:	Alan Stern <stern@rowland.harvard.edu>
Index: linux/kernel/power/autosleep.c
===================================================================
--- linux.orig/kernel/power/autosleep.c
+++ linux/kernel/power/autosleep.c
@@ -91,8 +91,12 @@ int pm_autosleep_set_state(suspend_state
 
 	__pm_relax(autosleep_ws);
 
-	if (state > PM_SUSPEND_ON)
+	if (state > PM_SUSPEND_ON) {
+		pm_wakep_autosleep_enabled(true);
 		queue_up_suspend_work();
+	} else {
+		pm_wakep_autosleep_enabled(false);
+	}
 
 	mutex_unlock(&autosleep_lock);
 	return 0;


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating wakeup sources
  2012-04-22 21:19   ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Rafael J. Wysocki
                       ` (6 preceding siblings ...)
  2012-04-22 21:24     ` [RFC][PATCH 7/8] PM / Sleep: Add "prevent autosleep time" statistics to wakeup sources Rafael J. Wysocki
@ 2012-04-22 21:24     ` Rafael J. Wysocki
  2012-04-24  1:35       ` John Stultz
  2012-04-23 16:49     ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Greg KH
  8 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-22 21:24 UTC (permalink / raw)
  To: Linux PM list
  Cc: LKML, Magnus Damm, markgross, Matthew Garrett, Greg KH,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

From: Rafael J. Wysocki <rjw@sisk.pl>

Android allows user space to manipulate wakelocks using two
sysfs file located in /sys/power/, wake_lock and wake_unlock.
Writing a wakelock name and optionally a timeout to the wake_lock
file causes the wakelock whose name was written to be acquired (it
is created before is necessary), optionally with the given timeout.
Writing the name of a wakelock to wake_unlock causes that wakelock
to be released.

Implement an analogous interface for user space using wakeup sources.
Add the /sys/power/wake_lock and /sys/power/wake_unlock files
allowing user space to create, activate and deactivate wakeup
sources, such that writing a name and optionally a timeout to
wake_lock causes the wakeup source of that name to be activated,
optionally with the given timeout.  If that wakeup source doesn't
exist, it will be created and then activated.  Writing a name to
wake_unlock causes the wakeup source of that name, if there is one,
to be deactivated.  Wakeup sources created with the help of
wake_lock that haven't been used for more than 5 minutes are garbage
collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
wakeup sources created with the help of wake_lock present at a time.

The data type used to track wakeup sources created by user space is
called "struct wakelock" to indicate the origins of this feature.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 Documentation/ABI/testing/sysfs-power |   42 ++++++
 drivers/base/power/wakeup.c           |    1 
 kernel/power/Kconfig                  |    8 +
 kernel/power/Makefile                 |    1 
 kernel/power/main.c                   |   41 ++++++
 kernel/power/power.h                  |    9 +
 kernel/power/wakelock.c               |  218 ++++++++++++++++++++++++++++++++++
 7 files changed, 320 insertions(+)

Index: linux/kernel/power/main.c
===================================================================
--- linux.orig/kernel/power/main.c
+++ linux/kernel/power/main.c
@@ -431,6 +431,43 @@ static ssize_t autosleep_store(struct ko
 
 power_attr(autosleep);
 #endif /* CONFIG_PM_AUTOSLEEP */
+
+#ifdef CONFIG_PM_WAKELOCKS
+static ssize_t wake_lock_show(struct kobject *kobj,
+			      struct kobj_attribute *attr,
+			      char *buf)
+{
+	return pm_show_wakelocks(buf, true);
+}
+
+static ssize_t wake_lock_store(struct kobject *kobj,
+			       struct kobj_attribute *attr,
+			       const char *buf, size_t n)
+{
+	int error = pm_wake_lock(buf);
+	return error ? error : n;
+}
+
+power_attr(wake_lock);
+
+static ssize_t wake_unlock_show(struct kobject *kobj,
+				struct kobj_attribute *attr,
+				char *buf)
+{
+	return pm_show_wakelocks(buf, false);
+}
+
+static ssize_t wake_unlock_store(struct kobject *kobj,
+				 struct kobj_attribute *attr,
+				 const char *buf, size_t n)
+{
+	int error = pm_wake_unlock(buf);
+	return error ? error : n;
+}
+
+power_attr(wake_unlock);
+
+#endif /* CONFIG_PM_WAKELOCKS */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_TRACE
@@ -487,6 +524,10 @@ static struct attribute * g[] = {
 #ifdef CONFIG_PM_AUTOSLEEP
 	&autosleep_attr.attr,
 #endif
+#ifdef CONFIG_PM_WAKELOCKS
+	&wake_lock_attr.attr,
+	&wake_unlock_attr.attr,
+#endif
 #ifdef CONFIG_PM_DEBUG
 	&pm_test_attr.attr,
 #endif
Index: linux/kernel/power/power.h
===================================================================
--- linux.orig/kernel/power/power.h
+++ linux/kernel/power/power.h
@@ -282,3 +282,12 @@ static inline void pm_autosleep_unlock(v
 static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
 
 #endif /* !CONFIG_PM_AUTOSLEEP */
+
+#ifdef CONFIG_PM_WAKELOCKS
+
+/* kernel/power/wakelock.c */
+extern ssize_t pm_show_wakelocks(char *buf, bool show_active);
+extern int pm_wake_lock(const char *buf);
+extern int pm_wake_unlock(const char *buf);
+
+#endif /* !CONFIG_PM_WAKELOCKS */
Index: linux/kernel/power/Kconfig
===================================================================
--- linux.orig/kernel/power/Kconfig
+++ linux/kernel/power/Kconfig
@@ -111,6 +111,14 @@ config PM_AUTOSLEEP
 	Allow the kernel to trigger a system transition into a global sleep
 	state automatically whenever there are no active wakeup sources.
 
+config PM_WAKELOCKS
+	bool "User space wakeup sources interface"
+	depends on PM_SLEEP
+	default n
+	---help---
+	Allow user space to create, activate and deactivate wakeup source
+	objects with the help of a sysfs-based interface.
+
 config PM_RUNTIME
 	bool "Run-time PM core functionality"
 	depends on !IA64_HP_SIM
Index: linux/kernel/power/wakelock.c
===================================================================
--- /dev/null
+++ linux/kernel/power/wakelock.c
@@ -0,0 +1,218 @@
+/*
+ * kernel/power/wakelock.c
+ *
+ * User space wakeup sources support.
+ *
+ * Copyright (C) 2012 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This code is based on the analogous interface allowing user space to
+ * manipulate wakelocks on Android.
+ */
+
+#include <linux/ctype.h>
+#include <linux/device.h>
+#include <linux/err.h>
+#include <linux/hrtimer.h>
+#include <linux/list.h>
+#include <linux/rbtree.h>
+#include <linux/slab.h>
+
+#define WL_NUMBER_LIMIT	100
+#define WL_GC_COUNT_MAX	100
+#define WL_GC_TIME_SEC	300
+
+static DEFINE_MUTEX(wakelocks_lock);
+
+struct wakelock {
+	char			*name;
+	struct rb_node		node;
+	struct wakeup_source	ws;
+	struct list_head	lru;
+};
+
+static struct rb_root wakelocks_tree = RB_ROOT;
+static LIST_HEAD(wakelocks_lru_list);
+static unsigned int number_of_wakelocks;
+static unsigned int wakelocks_gc_count;
+
+ssize_t pm_show_wakelocks(char *buf, bool show_active)
+{
+	struct rb_node *node;
+	struct wakelock *wl;
+	char *str = buf;
+	char *end = buf + PAGE_SIZE;
+
+	mutex_lock(&wakelocks_lock);
+
+	for (node = rb_first(&wakelocks_tree); node; node = rb_next(node)) {
+		bool active;
+
+		wl = rb_entry(node, struct wakelock, node);
+		spin_lock_irq(&wl->ws.lock);
+		active = wl->ws.active;
+		spin_unlock_irq(&wl->ws.lock);
+		if (active == show_active)
+			str += scnprintf(str, end - str, "%s ", wl->name);
+	}
+	str += scnprintf(str, end - str, "\n");
+
+	mutex_unlock(&wakelocks_lock);
+	return (str - buf);
+}
+
+static struct wakelock *wakelock_lookup_add(const char *name, size_t len,
+					    bool add_if_not_found)
+{
+	struct rb_node **node = &wakelocks_tree.rb_node;
+	struct rb_node *parent = *node;
+	struct wakelock *wl;
+
+	while (*node) {
+		int diff;
+
+		wl = rb_entry(*node, struct wakelock, node);
+		diff = strncmp(name, wl->name, len);
+		if (diff == 0) {
+			if (wl->name[len])
+				diff = -1;
+			else
+				return wl;
+		}
+		if (diff < 0)
+			node = &(*node)->rb_left;
+		else
+			node = &(*node)->rb_right;
+
+		parent = *node;
+	}
+	if (!add_if_not_found)
+		return ERR_PTR(-EINVAL);
+
+	if (number_of_wakelocks > WL_NUMBER_LIMIT)
+		return ERR_PTR(-ENOSPC);
+
+	/* Not found, we have to add a new one. */
+	wl = kzalloc(sizeof(*wl), GFP_KERNEL);
+	if (!wl)
+		return ERR_PTR(-ENOMEM);
+
+	wl->name = kstrndup(name, len, GFP_KERNEL);
+	if (!wl->name) {
+		kfree(wl);
+		return ERR_PTR(-ENOMEM);
+	}
+	wl->ws.name = wl->name;
+	wakeup_source_add(&wl->ws);
+	rb_link_node(&wl->node, parent, node);
+	rb_insert_color(&wl->node, &wakelocks_tree);
+	list_add(&wl->lru, &wakelocks_lru_list);
+	number_of_wakelocks++;
+	return wl;
+}
+
+int pm_wake_lock(const char *buf)
+{
+	const char *str = buf;
+	struct wakelock *wl;
+	u64 timeout_ns = 0;
+	size_t len;
+	int ret = 0;
+
+	while (*str && !isspace(*str))
+		str++;
+
+	len = str - buf;
+	if (!len)
+		return -EINVAL;
+
+	if (*str && *str != '\n') {
+		/* Find out if there's a valid timeout string appended. */
+		ret = kstrtou64(skip_spaces(str), 10, &timeout_ns);
+		if (ret)
+			return -EINVAL;
+	}
+
+	mutex_lock(&wakelocks_lock);
+
+	wl = wakelock_lookup_add(buf, len, true);
+	if (IS_ERR(wl)) {
+		ret = PTR_ERR(wl);
+		goto out;
+	}
+	if (timeout_ns) {
+		u64 timeout_ms = timeout_ns + NSEC_PER_MSEC - 1;
+
+		do_div(timeout_ms, NSEC_PER_MSEC);
+		__pm_wakeup_event(&wl->ws, timeout_ms);
+	} else {
+		__pm_stay_awake(&wl->ws);
+	}
+
+	list_move(&wl->lru, &wakelocks_lru_list);
+
+ out:
+	mutex_unlock(&wakelocks_lock);
+	return ret;
+}
+
+static void wakelocks_gc(void)
+{
+	struct wakelock *wl, *aux;
+	ktime_t now = ktime_get();
+
+	list_for_each_entry_safe_reverse(wl, aux, &wakelocks_lru_list, lru) {
+		u64 idle_time_ns;
+		bool active;
+
+		spin_lock_irq(&wl->ws.lock);
+		idle_time_ns = ktime_to_ns(ktime_sub(now, wl->ws.last_time));
+		active = wl->ws.active;
+		spin_unlock_irq(&wl->ws.lock);
+
+		if (idle_time_ns < ((u64)WL_GC_TIME_SEC * NSEC_PER_SEC))
+			break;
+
+		if (!active) {
+			wakeup_source_remove(&wl->ws);
+			rb_erase(&wl->node, &wakelocks_tree);
+			list_del(&wl->lru);
+			kfree(wl->name);
+			kfree(wl);
+			number_of_wakelocks--;
+		}
+	}
+	wakelocks_gc_count = 0;
+}
+
+int pm_wake_unlock(const char *buf)
+{
+	struct wakelock *wl;
+	size_t len;
+	int ret = 0;
+
+	len = strlen(buf);
+	if (!len)
+		return -EINVAL;
+
+	if (buf[len-1] == '\n')
+		len--;
+
+	if (!len)
+		return -EINVAL;
+
+	mutex_lock(&wakelocks_lock);
+
+	wl = wakelock_lookup_add(buf, len, false);
+	if (IS_ERR(wl)) {
+		ret = PTR_ERR(wl);
+		goto out;
+	}
+	__pm_relax(&wl->ws);
+	list_move(&wl->lru, &wakelocks_lru_list);
+	if (++wakelocks_gc_count > WL_GC_COUNT_MAX)
+		wakelocks_gc();
+
+ out:
+	mutex_unlock(&wakelocks_lock);
+	return ret;
+}
Index: linux/kernel/power/Makefile
===================================================================
--- linux.orig/kernel/power/Makefile
+++ linux/kernel/power/Makefile
@@ -10,5 +10,6 @@ obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend
 obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
 				   block_io.o
 obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
+obj-$(CONFIG_PM_WAKELOCKS)	+= wakelock.o
 
 obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -133,6 +133,7 @@ void wakeup_source_add(struct wakeup_sou
 	spin_lock_init(&ws->lock);
 	setup_timer(&ws->timer, pm_wakeup_timer_fn, (unsigned long)ws);
 	ws->active = false;
+	ws->last_time = ktime_get();
 
 	spin_lock_irq(&events_lock);
 	list_add_rcu(&ws->entry, &wakeup_sources);
Index: linux/Documentation/ABI/testing/sysfs-power
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-power
+++ linux/Documentation/ABI/testing/sysfs-power
@@ -189,3 +189,45 @@ Description:
 
 		Reading from this file causes the last string successfully
 		written to it to be displayed.
+
+What:		/sys/power/wake_lock
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/wake_lock file allows user space to create
+		wakeup source objects and activate them on demand (if one of
+		those wakeup sources is active, reads from the
+		/sys/power/wakeup_count file block or return false).  When a
+		string without white space is written to /sys/power/wake_lock,
+		it will be assumed to represent a wakeup source name.  If there
+		is a wakeup source object with that name, it will be activated
+		(unless active already).  Otherwise, a new wakeup source object
+		will be registered, assigned the given name and activated.
+		If a string written to /sys/power/wake_lock contains white
+		space, the part of the string preceding the white space will be
+		regarded as a wakeup source name and handled as descrived above.
+		The other part of the string will be regarded as a timeout (in
+		nanoseconds) such that the wakeup source will be automatically
+		deactivated after it has expired.  The timeout, if present, is
+		set regardless of the current state of the wakeup source object
+		in question.
+
+		Reads from this file return a string consisting of the names of
+		wakeup sources created with the help of it that are active at
+		the moment, separated with spaces.
+
+
+What:		/sys/power/wake_unlock
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/wake_unlock file allows user space to deactivate
+		wakeup sources created with the help of /sys/power/wake_lock.
+		When a string is written to /sys/power/wake_unlock, it will be
+		assumed to represent the name of a wakeup source to deactivate.
+		If a wakeup source object of that name exists and is active at
+		the moment, it will be deactivated.
+
+		Reads from this file return a string consisting of the names of
+		wakeup sources created with the help of /sys/power/wake_lock
+		that are inactive at the moment, separated with spaces.


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 2/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress"
  2012-04-22 21:20     ` [PATCH 2/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress" Rafael J. Wysocki
@ 2012-04-23  4:01       ` mark gross
  0 siblings, 0 replies; 129+ messages in thread
From: mark gross @ 2012-04-23  4:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Sun, Apr 22, 2012 at 11:20:39PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> The current wakeup source deactivation code doesn't do anything when
> the counter of wakeup events in progress goes down to zero, which
> requires pm_get_wakeup_count() to poll that counter periodically.
> Although this reduces the average time it takes to deactivate a
> wakeup source, it also may lead to a substantial amount of unnecessary
> polling if there are extended periods of wakeup activity.  Thus it
> seems reasonable to use a wait queue for signaling the "no wakeup
> events in progress" condition and remove the polling.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
>  drivers/base/power/wakeup.c |   16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
> 
> Index: linux/drivers/base/power/wakeup.c
> ===================================================================
> --- linux.orig/drivers/base/power/wakeup.c
> +++ linux/drivers/base/power/wakeup.c
> @@ -17,8 +17,6 @@
>  
>  #include "power.h"
>  
> -#define TIMEOUT		100
> -
>  /*
>   * If set, the suspend/hibernate code will abort transitions to a sleep state
>   * if wakeup events are registered during or immediately before the transition.
> @@ -52,6 +50,8 @@ static void pm_wakeup_timer_fn(unsigned
>  
>  static LIST_HEAD(wakeup_sources);
>  
> +static DECLARE_WAIT_QUEUE_HEAD(wakeup_count_wait_queue);
> +
>  /**
>   * wakeup_source_prepare - Prepare a new wakeup source for initialization.
>   * @ws: Wakeup source to prepare.
> @@ -442,6 +442,7 @@ EXPORT_SYMBOL_GPL(pm_stay_awake);
>   */
>  static void wakeup_source_deactivate(struct wakeup_source *ws)
>  {
> +	unsigned int cnt, inpr;
>  	ktime_t duration;
>  	ktime_t now;
>  
> @@ -476,6 +477,10 @@ static void wakeup_source_deactivate(str
>  	 * couter of wakeup events in progress simultaneously.
>  	 */
>  	atomic_add(MAX_IN_PROGRESS, &combined_event_count);
> +
> +	split_counters(&cnt, &inpr);
> +	if (!inpr && waitqueue_active(&wakeup_count_wait_queue))
> +		wake_up(&wakeup_count_wait_queue);
>  }
>  
>  /**
> @@ -667,14 +672,19 @@ bool pm_wakeup_pending(void)
>  bool pm_get_wakeup_count(unsigned int *count)
>  {
>  	unsigned int cnt, inpr;
> +	DEFINE_WAIT(wait);
>  
>  	for (;;) {
> +		prepare_to_wait(&wakeup_count_wait_queue, &wait,
> +				TASK_INTERRUPTIBLE);
>  		split_counters(&cnt, &inpr);
>  		if (inpr == 0 || signal_pending(current))
>  			break;
>  		pm_wakeup_update_hit_counts();
> -		schedule_timeout_interruptible(msecs_to_jiffies(TIMEOUT));
> +
> +		schedule();
>  	}
> +	finish_wait(&wakeup_count_wait_queue, &wait);
>  
>  	split_counters(&cnt, &inpr);
>  	*count = cnt;

Acked-by: mark gross <markgross@thegnar.org>


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3
  2012-04-22 21:19   ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Rafael J. Wysocki
                       ` (7 preceding siblings ...)
  2012-04-22 21:24     ` [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating " Rafael J. Wysocki
@ 2012-04-23 16:49     ` Greg KH
  2012-04-23 19:51       ` Rafael J. Wysocki
  8 siblings, 1 reply; 129+ messages in thread
From: Greg KH @ 2012-04-23 16:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Sun, Apr 22, 2012 at 11:19:01PM +0200, Rafael J. Wysocki wrote:
> Hi all,
> 
> Following is the third update of the autosleep patchset.
> 
> Patches [1-4/8] are regarded as v3.5 material, the rest - depending on
> the feedback I get (lack of feedback will be understood as no objections,
> though).

This all looks great to me, thanks for continuing to push this:
	Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3
  2012-04-23 16:49     ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Greg KH
@ 2012-04-23 19:51       ` Rafael J. Wysocki
  0 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-23 19:51 UTC (permalink / raw)
  To: Greg KH
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Monday, April 23, 2012, Greg KH wrote:
> On Sun, Apr 22, 2012 at 11:19:01PM +0200, Rafael J. Wysocki wrote:
> > Hi all,
> > 
> > Following is the third update of the autosleep patchset.
> > 
> > Patches [1-4/8] are regarded as v3.5 material, the rest - depending on
> > the feedback I get (lack of feedback will be understood as no objections,
> > though).
> 
> This all looks great to me, thanks for continuing to push this:
> 	Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Thanks a lot!

Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating wakeup sources
  2012-04-22 21:24     ` [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating " Rafael J. Wysocki
@ 2012-04-24  1:35       ` John Stultz
  2012-04-24 21:27         ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: John Stultz @ 2012-04-24  1:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On 04/22/2012 02:24 PM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki<rjw@sisk.pl>
>
> Android allows user space to manipulate wakelocks using two
> sysfs file located in /sys/power/, wake_lock and wake_unlock.
> Writing a wakelock name and optionally a timeout to the wake_lock
> file causes the wakelock whose name was written to be acquired (it
> is created before is necessary), optionally with the given timeout.
> Writing the name of a wakelock to wake_unlock causes that wakelock
> to be released.
>
> Implement an analogous interface for user space using wakeup sources.
> Add the /sys/power/wake_lock and /sys/power/wake_unlock files
> allowing user space to create, activate and deactivate wakeup
> sources, such that writing a name and optionally a timeout to
> wake_lock causes the wakeup source of that name to be activated,
> optionally with the given timeout.  If that wakeup source doesn't
> exist, it will be created and then activated.  Writing a name to
> wake_unlock causes the wakeup source of that name, if there is one,
> to be deactivated.  Wakeup sources created with the help of
> wake_lock that haven't been used for more than 5 minutes are garbage
> collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
> wakeup sources created with the help of wake_lock present at a time.
>
> The data type used to track wakeup sources created by user space is
> called "struct wakelock" to indicate the origins of this feature.
>
> Signed-off-by: Rafael J. Wysocki<rjw@sisk.pl>
> ---
One small bug.  In wakelock_lookup_add, you're assigning parent after 
you assign node, so at loop exit the parent might be null.
This resulted in some strange cases where I'd add two wakelocks and 
everything would be fine, but then adding the third would cause the 
first two to get lost.

The following patch seems to fix it.

thanks
-john

diff --git a/kernel/power/wakelock.c b/kernel/power/wakelock.c
index 2f99f02..f950cc2 100644
--- a/kernel/power/wakelock.c
+++ b/kernel/power/wakelock.c
@@ -70,6 +70,7 @@ static struct wakelock *wakelock_lookup_add(const char *name, size_t len,
  	while (*node) {
  		int diff;

+		parent = *node;
  		wl = rb_entry(*node, struct wakelock, node);
  		diff = strncmp(name, wl->name, len);
  		if (diff == 0) {
@@ -82,8 +83,6 @@ static struct wakelock *wakelock_lookup_add(const char *name, size_t len,
  			node =&(*node)->rb_left;
  		else
  			node =&(*node)->rb_right;
-
-		parent = *node;
  	}
  	if (!add_if_not_found)
  		return ERR_PTR(-EINVAL);



^ permalink raw reply related	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating wakeup sources
  2012-04-24  1:35       ` John Stultz
@ 2012-04-24 21:27         ` Rafael J. Wysocki
  2012-04-26  6:31           ` NeilBrown
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-24 21:27 UTC (permalink / raw)
  To: John Stultz
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Neil Brown, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Tuesday, April 24, 2012, John Stultz wrote:
> On 04/22/2012 02:24 PM, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki<rjw@sisk.pl>
> >
> > Android allows user space to manipulate wakelocks using two
> > sysfs file located in /sys/power/, wake_lock and wake_unlock.
> > Writing a wakelock name and optionally a timeout to the wake_lock
> > file causes the wakelock whose name was written to be acquired (it
> > is created before is necessary), optionally with the given timeout.
> > Writing the name of a wakelock to wake_unlock causes that wakelock
> > to be released.
> >
> > Implement an analogous interface for user space using wakeup sources.
> > Add the /sys/power/wake_lock and /sys/power/wake_unlock files
> > allowing user space to create, activate and deactivate wakeup
> > sources, such that writing a name and optionally a timeout to
> > wake_lock causes the wakeup source of that name to be activated,
> > optionally with the given timeout.  If that wakeup source doesn't
> > exist, it will be created and then activated.  Writing a name to
> > wake_unlock causes the wakeup source of that name, if there is one,
> > to be deactivated.  Wakeup sources created with the help of
> > wake_lock that haven't been used for more than 5 minutes are garbage
> > collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
> > wakeup sources created with the help of wake_lock present at a time.
> >
> > The data type used to track wakeup sources created by user space is
> > called "struct wakelock" to indicate the origins of this feature.
> >
> > Signed-off-by: Rafael J. Wysocki<rjw@sisk.pl>
> > ---
> One small bug.  In wakelock_lookup_add, you're assigning parent after 
> you assign node, so at loop exit the parent might be null.
> This resulted in some strange cases where I'd add two wakelocks and 
> everything would be fine, but then adding the third would cause the 
> first two to get lost.
> 
> The following patch seems to fix it.

Thanks a lot for the fix!

I have folded it into the $subject patch and the new version is appended.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM / Sleep: Add user space interface for manipulating wakeup sources, v2

Android allows user space to manipulate wakelocks using two
sysfs file located in /sys/power/, wake_lock and wake_unlock.
Writing a wakelock name and optionally a timeout to the wake_lock
file causes the wakelock whose name was written to be acquired (it
is created before is necessary), optionally with the given timeout.
Writing the name of a wakelock to wake_unlock causes that wakelock
to be released.

Implement an analogous interface for user space using wakeup sources.
Add the /sys/power/wake_lock and /sys/power/wake_unlock files
allowing user space to create, activate and deactivate wakeup
sources, such that writing a name and optionally a timeout to
wake_lock causes the wakeup source of that name to be activated,
optionally with the given timeout.  If that wakeup source doesn't
exist, it will be created and then activated.  Writing a name to
wake_unlock causes the wakeup source of that name, if there is one,
to be deactivated.  Wakeup sources created with the help of
wake_lock that haven't been used for more than 5 minutes are garbage
collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
wakeup sources created with the help of wake_lock present at a time.

The data type used to track wakeup sources created by user space is
called "struct wakelock" to indicate the origins of this feature.

This version of the patch includes an rbtree manipulation fix from John Stultz.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/ABI/testing/sysfs-power |   42 ++++++
 drivers/base/power/wakeup.c           |    1 
 kernel/power/Kconfig                  |    8 +
 kernel/power/Makefile                 |    1 
 kernel/power/main.c                   |   41 ++++++
 kernel/power/power.h                  |    9 +
 kernel/power/wakelock.c               |  217 ++++++++++++++++++++++++++++++++++
 7 files changed, 319 insertions(+)

Index: linux/kernel/power/main.c
===================================================================
--- linux.orig/kernel/power/main.c
+++ linux/kernel/power/main.c
@@ -431,6 +431,43 @@ static ssize_t autosleep_store(struct ko
 
 power_attr(autosleep);
 #endif /* CONFIG_PM_AUTOSLEEP */
+
+#ifdef CONFIG_PM_WAKELOCKS
+static ssize_t wake_lock_show(struct kobject *kobj,
+			      struct kobj_attribute *attr,
+			      char *buf)
+{
+	return pm_show_wakelocks(buf, true);
+}
+
+static ssize_t wake_lock_store(struct kobject *kobj,
+			       struct kobj_attribute *attr,
+			       const char *buf, size_t n)
+{
+	int error = pm_wake_lock(buf);
+	return error ? error : n;
+}
+
+power_attr(wake_lock);
+
+static ssize_t wake_unlock_show(struct kobject *kobj,
+				struct kobj_attribute *attr,
+				char *buf)
+{
+	return pm_show_wakelocks(buf, false);
+}
+
+static ssize_t wake_unlock_store(struct kobject *kobj,
+				 struct kobj_attribute *attr,
+				 const char *buf, size_t n)
+{
+	int error = pm_wake_unlock(buf);
+	return error ? error : n;
+}
+
+power_attr(wake_unlock);
+
+#endif /* CONFIG_PM_WAKELOCKS */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_TRACE
@@ -487,6 +524,10 @@ static struct attribute * g[] = {
 #ifdef CONFIG_PM_AUTOSLEEP
 	&autosleep_attr.attr,
 #endif
+#ifdef CONFIG_PM_WAKELOCKS
+	&wake_lock_attr.attr,
+	&wake_unlock_attr.attr,
+#endif
 #ifdef CONFIG_PM_DEBUG
 	&pm_test_attr.attr,
 #endif
Index: linux/kernel/power/power.h
===================================================================
--- linux.orig/kernel/power/power.h
+++ linux/kernel/power/power.h
@@ -282,3 +282,12 @@ static inline void pm_autosleep_unlock(v
 static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
 
 #endif /* !CONFIG_PM_AUTOSLEEP */
+
+#ifdef CONFIG_PM_WAKELOCKS
+
+/* kernel/power/wakelock.c */
+extern ssize_t pm_show_wakelocks(char *buf, bool show_active);
+extern int pm_wake_lock(const char *buf);
+extern int pm_wake_unlock(const char *buf);
+
+#endif /* !CONFIG_PM_WAKELOCKS */
Index: linux/kernel/power/Kconfig
===================================================================
--- linux.orig/kernel/power/Kconfig
+++ linux/kernel/power/Kconfig
@@ -111,6 +111,14 @@ config PM_AUTOSLEEP
 	Allow the kernel to trigger a system transition into a global sleep
 	state automatically whenever there are no active wakeup sources.
 
+config PM_WAKELOCKS
+	bool "User space wakeup sources interface"
+	depends on PM_SLEEP
+	default n
+	---help---
+	Allow user space to create, activate and deactivate wakeup source
+	objects with the help of a sysfs-based interface.
+
 config PM_RUNTIME
 	bool "Run-time PM core functionality"
 	depends on !IA64_HP_SIM
Index: linux/kernel/power/wakelock.c
===================================================================
--- /dev/null
+++ linux/kernel/power/wakelock.c
@@ -0,0 +1,217 @@
+/*
+ * kernel/power/wakelock.c
+ *
+ * User space wakeup sources support.
+ *
+ * Copyright (C) 2012 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This code is based on the analogous interface allowing user space to
+ * manipulate wakelocks on Android.
+ */
+
+#include <linux/ctype.h>
+#include <linux/device.h>
+#include <linux/err.h>
+#include <linux/hrtimer.h>
+#include <linux/list.h>
+#include <linux/rbtree.h>
+#include <linux/slab.h>
+
+#define WL_NUMBER_LIMIT	100
+#define WL_GC_COUNT_MAX	100
+#define WL_GC_TIME_SEC	300
+
+static DEFINE_MUTEX(wakelocks_lock);
+
+struct wakelock {
+	char			*name;
+	struct rb_node		node;
+	struct wakeup_source	ws;
+	struct list_head	lru;
+};
+
+static struct rb_root wakelocks_tree = RB_ROOT;
+static LIST_HEAD(wakelocks_lru_list);
+static unsigned int number_of_wakelocks;
+static unsigned int wakelocks_gc_count;
+
+ssize_t pm_show_wakelocks(char *buf, bool show_active)
+{
+	struct rb_node *node;
+	struct wakelock *wl;
+	char *str = buf;
+	char *end = buf + PAGE_SIZE;
+
+	mutex_lock(&wakelocks_lock);
+
+	for (node = rb_first(&wakelocks_tree); node; node = rb_next(node)) {
+		bool active;
+
+		wl = rb_entry(node, struct wakelock, node);
+		spin_lock_irq(&wl->ws.lock);
+		active = wl->ws.active;
+		spin_unlock_irq(&wl->ws.lock);
+		if (active == show_active)
+			str += scnprintf(str, end - str, "%s ", wl->name);
+	}
+	str += scnprintf(str, end - str, "\n");
+
+	mutex_unlock(&wakelocks_lock);
+	return (str - buf);
+}
+
+static struct wakelock *wakelock_lookup_add(const char *name, size_t len,
+					    bool add_if_not_found)
+{
+	struct rb_node **node = &wakelocks_tree.rb_node;
+	struct rb_node *parent = *node;
+	struct wakelock *wl;
+
+	while (*node) {
+		int diff;
+
+		parent = *node;
+		wl = rb_entry(*node, struct wakelock, node);
+		diff = strncmp(name, wl->name, len);
+		if (diff == 0) {
+			if (wl->name[len])
+				diff = -1;
+			else
+				return wl;
+		}
+		if (diff < 0)
+			node = &(*node)->rb_left;
+		else
+			node = &(*node)->rb_right;
+	}
+	if (!add_if_not_found)
+		return ERR_PTR(-EINVAL);
+
+	if (number_of_wakelocks > WL_NUMBER_LIMIT)
+		return ERR_PTR(-ENOSPC);
+
+	/* Not found, we have to add a new one. */
+	wl = kzalloc(sizeof(*wl), GFP_KERNEL);
+	if (!wl)
+		return ERR_PTR(-ENOMEM);
+
+	wl->name = kstrndup(name, len, GFP_KERNEL);
+	if (!wl->name) {
+		kfree(wl);
+		return ERR_PTR(-ENOMEM);
+	}
+	wl->ws.name = wl->name;
+	wakeup_source_add(&wl->ws);
+	rb_link_node(&wl->node, parent, node);
+	rb_insert_color(&wl->node, &wakelocks_tree);
+	list_add(&wl->lru, &wakelocks_lru_list);
+	number_of_wakelocks++;
+	return wl;
+}
+
+int pm_wake_lock(const char *buf)
+{
+	const char *str = buf;
+	struct wakelock *wl;
+	u64 timeout_ns = 0;
+	size_t len;
+	int ret = 0;
+
+	while (*str && !isspace(*str))
+		str++;
+
+	len = str - buf;
+	if (!len)
+		return -EINVAL;
+
+	if (*str && *str != '\n') {
+		/* Find out if there's a valid timeout string appended. */
+		ret = kstrtou64(skip_spaces(str), 10, &timeout_ns);
+		if (ret)
+			return -EINVAL;
+	}
+
+	mutex_lock(&wakelocks_lock);
+
+	wl = wakelock_lookup_add(buf, len, true);
+	if (IS_ERR(wl)) {
+		ret = PTR_ERR(wl);
+		goto out;
+	}
+	if (timeout_ns) {
+		u64 timeout_ms = timeout_ns + NSEC_PER_MSEC - 1;
+
+		do_div(timeout_ms, NSEC_PER_MSEC);
+		__pm_wakeup_event(&wl->ws, timeout_ms);
+	} else {
+		__pm_stay_awake(&wl->ws);
+	}
+
+	list_move(&wl->lru, &wakelocks_lru_list);
+
+ out:
+	mutex_unlock(&wakelocks_lock);
+	return ret;
+}
+
+static void wakelocks_gc(void)
+{
+	struct wakelock *wl, *aux;
+	ktime_t now = ktime_get();
+
+	list_for_each_entry_safe_reverse(wl, aux, &wakelocks_lru_list, lru) {
+		u64 idle_time_ns;
+		bool active;
+
+		spin_lock_irq(&wl->ws.lock);
+		idle_time_ns = ktime_to_ns(ktime_sub(now, wl->ws.last_time));
+		active = wl->ws.active;
+		spin_unlock_irq(&wl->ws.lock);
+
+		if (idle_time_ns < ((u64)WL_GC_TIME_SEC * NSEC_PER_SEC))
+			break;
+
+		if (!active) {
+			wakeup_source_remove(&wl->ws);
+			rb_erase(&wl->node, &wakelocks_tree);
+			list_del(&wl->lru);
+			kfree(wl->name);
+			kfree(wl);
+			number_of_wakelocks--;
+		}
+	}
+	wakelocks_gc_count = 0;
+}
+
+int pm_wake_unlock(const char *buf)
+{
+	struct wakelock *wl;
+	size_t len;
+	int ret = 0;
+
+	len = strlen(buf);
+	if (!len)
+		return -EINVAL;
+
+	if (buf[len-1] == '\n')
+		len--;
+
+	if (!len)
+		return -EINVAL;
+
+	mutex_lock(&wakelocks_lock);
+
+	wl = wakelock_lookup_add(buf, len, false);
+	if (IS_ERR(wl)) {
+		ret = PTR_ERR(wl);
+		goto out;
+	}
+	__pm_relax(&wl->ws);
+	list_move(&wl->lru, &wakelocks_lru_list);
+	if (++wakelocks_gc_count > WL_GC_COUNT_MAX)
+		wakelocks_gc();
+
+ out:
+	mutex_unlock(&wakelocks_lock);
+	return ret;
+}
Index: linux/kernel/power/Makefile
===================================================================
--- linux.orig/kernel/power/Makefile
+++ linux/kernel/power/Makefile
@@ -10,5 +10,6 @@ obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend
 obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
 				   block_io.o
 obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
+obj-$(CONFIG_PM_WAKELOCKS)	+= wakelock.o
 
 obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -133,6 +133,7 @@ void wakeup_source_add(struct wakeup_sou
 	spin_lock_init(&ws->lock);
 	setup_timer(&ws->timer, pm_wakeup_timer_fn, (unsigned long)ws);
 	ws->active = false;
+	ws->last_time = ktime_get();
 
 	spin_lock_irq(&events_lock);
 	list_add_rcu(&ws->entry, &wakeup_sources);
Index: linux/Documentation/ABI/testing/sysfs-power
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-power
+++ linux/Documentation/ABI/testing/sysfs-power
@@ -189,3 +189,45 @@ Description:
 
 		Reading from this file causes the last string successfully
 		written to it to be displayed.
+
+What:		/sys/power/wake_lock
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/wake_lock file allows user space to create
+		wakeup source objects and activate them on demand (if one of
+		those wakeup sources is active, reads from the
+		/sys/power/wakeup_count file block or return false).  When a
+		string without white space is written to /sys/power/wake_lock,
+		it will be assumed to represent a wakeup source name.  If there
+		is a wakeup source object with that name, it will be activated
+		(unless active already).  Otherwise, a new wakeup source object
+		will be registered, assigned the given name and activated.
+		If a string written to /sys/power/wake_lock contains white
+		space, the part of the string preceding the white space will be
+		regarded as a wakeup source name and handled as descrived above.
+		The other part of the string will be regarded as a timeout (in
+		nanoseconds) such that the wakeup source will be automatically
+		deactivated after it has expired.  The timeout, if present, is
+		set regardless of the current state of the wakeup source object
+		in question.
+
+		Reads from this file return a string consisting of the names of
+		wakeup sources created with the help of it that are active at
+		the moment, separated with spaces.
+
+
+What:		/sys/power/wake_unlock
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/wake_unlock file allows user space to deactivate
+		wakeup sources created with the help of /sys/power/wake_lock.
+		When a string is written to /sys/power/wake_unlock, it will be
+		assumed to represent the name of a wakeup source to deactivate.
+		If a wakeup source object of that name exists and is active at
+		the moment, it will be deactivated.
+
+		Reads from this file return a string consisting of the names of
+		wakeup sources created with the help of /sys/power/wake_lock
+		that are inactive at the moment, separated with spaces.

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
  2012-04-22 21:23     ` [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep Rafael J. Wysocki
@ 2012-04-26  3:05       ` NeilBrown
  2012-04-26 21:52         ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: NeilBrown @ 2012-04-26  3:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

[-- Attachment #1: Type: text/plain, Size: 8109 bytes --]

On Sun, 22 Apr 2012 23:23:23 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> From: "Rafael J. Wysocki" <rjw@sisk.pl>
> To: Linux PM list <linux-pm@vger.kernel.org>
> Cc: LKML <linux-kernel@vger.kernel.org>, Magnus Damm <magnus.damm@gmail.com>, markgross@thegnar.org, Matthew Garrett <mjg@redhat.com>, Greg KH <gregkh@linuxfoundation.org>, Arve Hjønnevåg <arve@android.com>, John Stultz <john.stultz@linaro.org>, Brian Swetland <swetland@google.com>, Neil Brown <neilb@suse.de>, Alan Stern <stern@rowland.harvard.edu>, Dmitry Torokhov <dmitry.torokhov@gmail.com>, "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
> Subject: [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
> Date: Sun, 22 Apr 2012 23:23:23 +0200
> Sender: linux-kernel-owner@vger.kernel.org
> User-Agent: KMail/1.13.6 (Linux/3.4.0-rc3+; KDE/4.6.0; x86_64; ; )
> 
> From: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Introduce a mechanism by which the kernel can trigger global
> transitions to a sleep state chosen by user space if there are no
> active wakeup sources.

Hi Rafael,
 just a few little issues below.  Over all I think that if we have to have
 auto-sleep in the kernel, then this is a good way to do it.

> +static void try_to_suspend(struct work_struct *work)
> +{
> +	unsigned int initial_count, final_count;
> +
> +	if (!pm_get_wakeup_count(&initial_count, true))
> +		goto out;
> +
> +	mutex_lock(&autosleep_lock);
> +
> +	if (!pm_save_wakeup_count(initial_count)) {
> +		mutex_unlock(&autosleep_lock);
> +		goto out;
> +	}
> +
> +	if (autosleep_state == PM_SUSPEND_ON) {
> +		mutex_unlock(&autosleep_lock);
> +		return;
> +	}
> +	if (autosleep_state >= PM_SUSPEND_MAX)
> +		hibernate();
> +	else
> +		pm_suspend(autosleep_state);
> +
> +	mutex_unlock(&autosleep_lock);
> +
> +	if (!pm_get_wakeup_count(&final_count, false))
> +		goto out;
> +
> +	if (final_count == initial_count)
> +		schedule_timeout(HZ / 2);

This doesn't do what you seem to expect it to do.
You need to set current->state to something like TASK_UNINTERRUPTIBLE
before calling schedule_timeout, otherwise it is effectily a no-op.
schedule_timeout_uninterruptible(), for example, will do this for you.

However the value of this isn't clear to me, so a comment would probably be a
good thing.
This continue presumably fires if we wake up without any wakeup sources
being activated.  In that case you want to delay for 500ms - presumably to
avoid a tight suspend/resume loop if something goes wrong?

I have occasionally seen a stray/uninteresting interrupt wake from suspend
immediately after entering suspend and the next attempt succeeds.  Maybe this
is a bug in some driver somewhere, but not a big one.  I think I would rather
in that case that we attempt to re-enter suspend immediately.  Maybe after a
few failed attempts it makes sense to back off.

The other question is: if we want to back-off, is 500ms really enough?  What
will be gained by, or could be achieved in, that time?  An exponential
back-off might be defensible, but I can't see the value of a 500ms fixed
back-off.
However if you can, I'd love to see a comment in there explaining it.


> +
> + out:
> +	queue_up_suspend_work();
> +}
> +


> +
> +int pm_autosleep_set_state(suspend_state_t state)
> +{
> +
> +#ifndef CONFIG_HIBERNATION
> +	if (state >= PM_SUSPEND_MAX)
> +		return -EINVAL;
> +#endif
> +
> +	__pm_stay_awake(autosleep_ws);
> +
> +	mutex_lock(&autosleep_lock);
> +
> +	autosleep_state = state;
> +
> +	__pm_relax(autosleep_ws);

I'm struggling to see the point of the autosleep_ws.

A suspend cannot actually happen while this code is running (can it?) because
it will wait for the process to enter the freezer.
So the only effect of this is:
  1/ cause the current auto-sleep cycle to abort and
  2/ maybe add some accounting number is the autosleep_ws.
Is that right?
Which of these is needed?

I would imagine that any process writing to /sys/power/autosleep would be
holding a wakelock, and if it didn't it should expect things to be racy...

Am I missing something?


> +
> +	if (state > PM_SUSPEND_ON)
> +		queue_up_suspend_work();

The test here is superfluous as queue_up_suspend_work() itself tests that
'state' is > PM_SUSPEND_ON.  However maybe it is more readable this way, so I
won't object it you like it.


> +
> +	mutex_unlock(&autosleep_lock);
> +	return 0;
> +}


> @@ -339,7 +359,8 @@ static ssize_t wakeup_count_show(struct
>  {
>  	unsigned int val;
>  
> -	return pm_get_wakeup_count(&val) ? sprintf(buf, "%u\n", val) : -EINTR;
> +	return pm_get_wakeup_count(&val, true) ?
> +		sprintf(buf, "%u\n", val) : -EINTR;
>  }

I think it would be really nice for user-space auto-suspend if the 'block'
flag to be settable from the O_NONBLOCK setting.  And for poll() to work
on /sys/power/wakeup-count.  However this would require a bit of surgery on
sysfs.  So that is a "maybe later", but having the 'block' flag in there is
a step in the right direction.


>  
>  static ssize_t wakeup_count_store(struct kobject *kobj,
> @@ -347,15 +368,69 @@ static ssize_t wakeup_count_store(struct
>  				const char *buf, size_t n)
>  {
>  	unsigned int val;
> +	int error;
> +
> +	error = pm_autosleep_lock();
> +	if (error)
> +		return error;
> +
> +	if (pm_autosleep_state() > PM_SUSPEND_ON) {
> +		error = -EBUSY;
> +		goto out;
> +	}
>  
>  	if (sscanf(buf, "%u", &val) == 1) {
>  		if (pm_save_wakeup_count(val))
>  			return n;

You need a 'pm_autosleep_unlock() in there - or possibly
  error = n; goto out;


>  	}
> -	return -EINVAL;
> +	error = -EINVAL;
> +
> + out:
> +	pm_autosleep_unlock();
> +	return error;
>  }

>  core_initcall(pm_init);
> Index: linux/drivers/base/power/wakeup.c
> ===================================================================
> --- linux.orig/drivers/base/power/wakeup.c
> +++ linux/drivers/base/power/wakeup.c
> @@ -498,8 +498,10 @@ static void wakeup_source_deactivate(str
>  	trace_wakeup_source_deactivate(ws->name, cec);
>  
>  	split_counters(&cnt, &inpr);
> -	if (!inpr && waitqueue_active(&wakeup_count_wait_queue))
> +	if (!inpr && waitqueue_active(&wakeup_count_wait_queue)) {
>  		wake_up(&wakeup_count_wait_queue);
> +		queue_up_suspend_work();
> +	}

This doesn't look right.  suspend_work always requeues itself unless
autosleep_state == PM_SUSPEND_ON, and whenver autosleep_state is set we
already call queue_up_suspend_work().  So there is no need to call it here.

> Index: linux/Documentation/ABI/testing/sysfs-power
> ===================================================================
> --- linux.orig/Documentation/ABI/testing/sysfs-power
> +++ linux/Documentation/ABI/testing/sysfs-power
> @@ -172,3 +172,20 @@ Description:
>  
>  		Reading from this file will display the current value, which is
>  		set to 1 MB by default.
> +
> +What:		/sys/power/autosleep
> +Date:		February 2012
> +Contact:	Rafael J. Wysocki <rjw@sisk.pl>
> +Description:
> +		The /sys/power/autosleep file can be written one of the strings

"To the .. file can be written..." or
"The .. file can have written ..." or
"One of the strings returned by (reads from) /sys/power/state can be written
to the file ..."
??
> +		returned by reads from /sys/power/state.  If that happens, a
> +		work item attempting to trigger a transition of the system to
> +		the sleep state represented by that string is queued up.  This
> +		attempt will only succeed if there are no active wakeup sources
> +		in the system at that time.  After evey execution, regardless
                                                   ^^^^
  "every"

> +		of whether or not the attempt to put the system to sleep has
> +		succeeded, the work item requeues itself until user space
> +		writes "off" to /sys/power/autosleep.
> +
> +		Reading from this file causes the last string successfully
> +		written to it to be displayed.
                                    ^^^^^^^^^  "returned".


Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 5/8] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-04-22 21:22     ` [RFC][PATCH 5/8] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready Rafael J. Wysocki
@ 2012-04-26  4:03       ` NeilBrown
  2012-04-26 20:40         ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: NeilBrown @ 2012-04-26  4:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

[-- Attachment #1: Type: text/plain, Size: 3740 bytes --]

On Sun, 22 Apr 2012 23:22:43 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> From: Arve Hjønnevåg <arve@android.com>
> 
> When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
> wakeup_source will be active to prevent suspend. This can be used to
> handle wakeup events from a driver that support poll, e.g. input, if
> that driver wakes up the waitqueue passed to epoll before allowing
> suspend.
> 
> The current implementation uses an extra wakeup_source when
> ep_scan_ready_list runs. This can cause problems if a single thread
> is polling on wakeup events and frequent non-wakeup events (events
> usually arrive during thread freezing) using the same epoll file.

This is quite neat.

If I understand it correctly, you register file descriptors with epoll_ctl()
on an fd created with epoll_create(), and set the new EPOLLWAKEUP flag.
Then when a regular 'poll' or 'select' on the epoll fd reports that it is
readable you:
  - get a wakelock
  - use epoll_wait to collect the events
  - process the events
  - release your wakelock
  - go back to poll() or select() on the epoll fd.
Correct?  As long as there are ready events with EPOLLWAKEUP set a
wakeup_source is held active and the system won't go to sleep.

My concern with this is about permissions.  It appears that any process could
wait of some fd (maybe a pipe they created themselves) with EPOLLWAKEUP, and
then simply never epoll_wait() for the event.  Then they would be keeping
the system awake.  I don't think that is acceptable.

So there needs to be some way to limit who can effectively block suspend by
using EPOLLWAKEUP.
(This is one of the reasons I like an all-user-space solution.  Policy issues
like this can easily be decided in user-space but are clumsy to put into the
kernel).

Also, I'm having trouble understanding the ep->ws wakeup_source.
The epi->ws makes lots of sense and I think I understand it all.
However I don't see why you need a wakeup_source for the 'struct eventpoll'.

Every time that 'poll' decides to call the ->poll fop for the eventpoll, this
wakeup_source will be activated and deactivated which will abort any current
suspend cycle even if there are no events to report.

I suspect it can just go away.


One last item that doesn't really belong here - but it is in context.

This mechanism is elegant because it provides a single implementation that
provides wakeup_source for almost any sort of device.  I would like to do the
same thing for interrupts.
Most (maybe all) of the wakeup device on my phone have an interrupt where the
body is run in a thread.  When the thread has done it's work the event is
visible to userspace so the EPOLLWAKEUP mechanism is all that is needed to
complete the path to user-space (or for my user-space solution, nothing else
is needed once it is visible to user-space).
So we just need to ensure a clear path from the "top half" interrupt handler
to the threaded handler.
So I imagine attaching a wakeup source to every interrupt for which 'wakeup'
is enabled, activating it when the top-half starts and relaxing it when the
bottom-half completes.  With this in place, almost all drivers would get
wakeup_source handling for free.
Does this seem reasonable to you.  I'm afraid I don't have code yet, but hope
to find time in a few weeks.

One difficulty with that is that I have noticed a number of drivers that
potentially enable_irq_wake just before suspend and disable_irq_wake
immediately after (e.g. gpio_keys.c).  Allocating a wakeup_source on each
enable_irq_wake would be an unfortunate overhead.  Maybe we just allocate it
the first time enable_irq_wake is called ....


Thanks,
NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating wakeup sources
  2012-04-24 21:27         ` Rafael J. Wysocki
@ 2012-04-26  6:31           ` NeilBrown
  2012-04-26 22:04             ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: NeilBrown @ 2012-04-26  6:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: John Stultz, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg, John Stultz,
	Brian Swetland, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

[-- Attachment #1: Type: text/plain, Size: 2991 bytes --]

On Tue, 24 Apr 2012 23:27:17 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> From: Rafael J. Wysocki <rjw@sisk.pl>
> Subject: PM / Sleep: Add user space interface for manipulating wakeup sources, v2
> 
> Android allows user space to manipulate wakelocks using two
> sysfs file located in /sys/power/, wake_lock and wake_unlock.
> Writing a wakelock name and optionally a timeout to the wake_lock
> file causes the wakelock whose name was written to be acquired (it
> is created before is necessary), optionally with the given timeout.
> Writing the name of a wakelock to wake_unlock causes that wakelock
> to be released.
> 
> Implement an analogous interface for user space using wakeup sources.
> Add the /sys/power/wake_lock and /sys/power/wake_unlock files
> allowing user space to create, activate and deactivate wakeup
> sources, such that writing a name and optionally a timeout to
> wake_lock causes the wakeup source of that name to be activated,
> optionally with the given timeout.  If that wakeup source doesn't
> exist, it will be created and then activated.  Writing a name to
> wake_unlock causes the wakeup source of that name, if there is one,
> to be deactivated.  Wakeup sources created with the help of
> wake_lock that haven't been used for more than 5 minutes are garbage
> collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
> wakeup sources created with the help of wake_lock present at a time.
> 
> The data type used to track wakeup sources created by user space is
> called "struct wakelock" to indicate the origins of this feature.
> 
> This version of the patch includes an rbtree manipulation fix from John Stultz.

Looks good.  Just a couple of minor suggestions.


> +ssize_t pm_show_wakelocks(char *buf, bool show_active)
> +{
> +	struct rb_node *node;
> +	struct wakelock *wl;
> +	char *str = buf;
> +	char *end = buf + PAGE_SIZE;
> +
> +	mutex_lock(&wakelocks_lock);
> +
> +	for (node = rb_first(&wakelocks_tree); node; node = rb_next(node)) {
> +		bool active;
> +
> +		wl = rb_entry(node, struct wakelock, node);
> +		spin_lock_irq(&wl->ws.lock);
> +		active = wl->ws.active;
> +		spin_unlock_irq(&wl->ws.lock);

I don't think the spin_lock is needed.  We are just reading one value and it
is either 0 or not.  So there is no possibility for any inconsistency.
                if (wl->ws.active == show_active)
?

> +		if (active == show_active)
> +			str += scnprintf(str, end - str, "%s ", wl->name);

Arg.  Extra space on the end of the line!!  :-)

I would suggest the entries be terminated by '\n' rather than separate by
space.
one-item-per-line is much more common in Unix in general.  'grep' allows
you to find things more easily etc.
   while read a
   do echo $a > wake_unlock
   done < wake_lock



> +	}
> +	str += scnprintf(str, end - str, "\n");
> +
> +	mutex_unlock(&wakelocks_lock);
> +	return (str - buf);
> +}
> +


Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 5/8] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-04-26  4:03       ` NeilBrown
@ 2012-04-26 20:40         ` Rafael J. Wysocki
  2012-04-27  3:49           ` Arve Hjønnevåg
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-26 20:40 UTC (permalink / raw)
  To: NeilBrown, Arve Hjønnevåg
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, John Stultz, Brian Swetland, Alan Stern,
	Dmitry Torokhov, Srivatsa S. Bhat

On Thursday, April 26, 2012, NeilBrown wrote:
> On Sun, 22 Apr 2012 23:22:43 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > From: Arve Hjønnevåg <arve@android.com>
> > 
> > When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
> > wakeup_source will be active to prevent suspend. This can be used to
> > handle wakeup events from a driver that support poll, e.g. input, if
> > that driver wakes up the waitqueue passed to epoll before allowing
> > suspend.
> > 
> > The current implementation uses an extra wakeup_source when
> > ep_scan_ready_list runs. This can cause problems if a single thread
> > is polling on wakeup events and frequent non-wakeup events (events
> > usually arrive during thread freezing) using the same epoll file.
> 
> This is quite neat.
> 
> If I understand it correctly, you register file descriptors with epoll_ctl()
> on an fd created with epoll_create(), and set the new EPOLLWAKEUP flag.
> Then when a regular 'poll' or 'select' on the epoll fd reports that it is
> readable you:
>   - get a wakelock
>   - use epoll_wait to collect the events
>   - process the events
>   - release your wakelock
>   - go back to poll() or select() on the epoll fd.
> Correct?  As long as there are ready events with EPOLLWAKEUP set a
> wakeup_source is held active and the system won't go to sleep.
> 
> My concern with this is about permissions.  It appears that any process could
> wait of some fd (maybe a pipe they created themselves) with EPOLLWAKEUP, and
> then simply never epoll_wait() for the event.  Then they would be keeping
> the system awake.  I don't think that is acceptable.

I wonder what Arve has to say to that, but let me say that on systems without
autosleep every process can go into an infinite busy loop which is going to
drain battery relatively quickly just as well and I don't see why that's so
much different.

> So there needs to be some way to limit who can effectively block suspend by
> using EPOLLWAKEUP.
> (This is one of the reasons I like an all-user-space solution.  Policy issues
> like this can easily be decided in user-space but are clumsy to put into the
> kernel).
> 
> Also, I'm having trouble understanding the ep->ws wakeup_source.
> The epi->ws makes lots of sense and I think I understand it all.
> However I don't see why you need a wakeup_source for the 'struct eventpoll'.
> 
> Every time that 'poll' decides to call the ->poll fop for the eventpoll, this
> wakeup_source will be activated and deactivated which will abort any current
> suspend cycle even if there are no events to report.
> 
> I suspect it can just go away.

I'll leave this one entirely to Arve, if you don't mind. :-)

> One last item that doesn't really belong here - but it is in context.
> 
> This mechanism is elegant because it provides a single implementation that
> provides wakeup_source for almost any sort of device.  I would like to do the
> same thing for interrupts.
> Most (maybe all) of the wakeup device on my phone have an interrupt where the
> body is run in a thread.  When the thread has done it's work the event is
> visible to userspace so the EPOLLWAKEUP mechanism is all that is needed to
> complete the path to user-space (or for my user-space solution, nothing else
> is needed once it is visible to user-space).
> So we just need to ensure a clear path from the "top half" interrupt handler
> to the threaded handler.
> So I imagine attaching a wakeup source to every interrupt for which 'wakeup'
> is enabled, activating it when the top-half starts and relaxing it when the
> bottom-half completes.  With this in place, almost all drivers would get
> wakeup_source handling for free.
> Does this seem reasonable to you.

Yes, it does.

Wakeup devices have their own wakeup source objects anyway, so perhaps they may
be used for this purpose somehow (just wondering).

> I'm afraid I don't have code yet, but hope to find time in a few weeks.
> 
> One difficulty with that is that I have noticed a number of drivers that
> potentially enable_irq_wake just before suspend and disable_irq_wake
> immediately after (e.g. gpio_keys.c).  Allocating a wakeup_source on each
> enable_irq_wake would be an unfortunate overhead.  Maybe we just allocate it
> the first time enable_irq_wake is called ....

I guess we can do something in analogy with device_wakeup_enable()?

Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
  2012-04-26  3:05       ` NeilBrown
@ 2012-04-26 21:52         ` Rafael J. Wysocki
  2012-04-27  0:39           ` NeilBrown
  2012-05-03  0:23           ` Arve Hjønnevåg
  0 siblings, 2 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-26 21:52 UTC (permalink / raw)
  To: NeilBrown
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Thursday, April 26, 2012, NeilBrown wrote:
> On Sun, 22 Apr 2012 23:23:23 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > From: "Rafael J. Wysocki" <rjw@sisk.pl>
> > To: Linux PM list <linux-pm@vger.kernel.org>
> > Cc: LKML <linux-kernel@vger.kernel.org>, Magnus Damm <magnus.damm@gmail.com>, markgross@thegnar.org, Matthew Garrett <mjg@redhat.com>, Greg KH <gregkh@linuxfoundation.org>, Arve Hjønnevåg <arve@android.com>, John Stultz <john.stultz@linaro.org>, Brian Swetland <swetland@google.com>, Neil Brown <neilb@suse.de>, Alan Stern <stern@rowland.harvard.edu>, Dmitry Torokhov <dmitry.torokhov@gmail.com>, "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
> > Subject: [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
> > Date: Sun, 22 Apr 2012 23:23:23 +0200
> > Sender: linux-kernel-owner@vger.kernel.org
> > User-Agent: KMail/1.13.6 (Linux/3.4.0-rc3+; KDE/4.6.0; x86_64; ; )
> > 
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > 
> > Introduce a mechanism by which the kernel can trigger global
> > transitions to a sleep state chosen by user space if there are no
> > active wakeup sources.
> 
> Hi Rafael,

Hi,

>  just a few little issues below.  Over all I think that if we have to have
>  auto-sleep in the kernel, then this is a good way to do it.

Good, we seem to agree in principle, then. :-)

> > +static void try_to_suspend(struct work_struct *work)
> > +{
> > +	unsigned int initial_count, final_count;
> > +
> > +	if (!pm_get_wakeup_count(&initial_count, true))
> > +		goto out;
> > +
> > +	mutex_lock(&autosleep_lock);
> > +
> > +	if (!pm_save_wakeup_count(initial_count)) {
> > +		mutex_unlock(&autosleep_lock);
> > +		goto out;
> > +	}
> > +
> > +	if (autosleep_state == PM_SUSPEND_ON) {
> > +		mutex_unlock(&autosleep_lock);
> > +		return;
> > +	}
> > +	if (autosleep_state >= PM_SUSPEND_MAX)
> > +		hibernate();
> > +	else
> > +		pm_suspend(autosleep_state);
> > +
> > +	mutex_unlock(&autosleep_lock);
> > +
> > +	if (!pm_get_wakeup_count(&final_count, false))
> > +		goto out;
> > +
> > +	if (final_count == initial_count)
> > +		schedule_timeout(HZ / 2);
> 
> This doesn't do what you seem to expect it to do.
> You need to set current->state to something like TASK_UNINTERRUPTIBLE
> before calling schedule_timeout, otherwise it is effectily a no-op.
> schedule_timeout_uninterruptible(), for example, will do this for you.

Right.  I obviously overlooked the missing state change.

> However the value of this isn't clear to me, so a comment would probably be a
> good thing.
> This continue presumably fires if we wake up without any wakeup sources
> being activated.  In that case you want to delay for 500ms - presumably to
> avoid a tight suspend/resume loop if something goes wrong?

Yes.

> I have occasionally seen a stray/uninteresting interrupt wake from suspend
> immediately after entering suspend and the next attempt succeeds.  Maybe this
> is a bug in some driver somewhere, but not a big one.  I think I would rather
> in that case that we attempt to re-enter suspend immediately.  Maybe after a
> few failed attempts it makes sense to back off.

Perhaps.  We can adjust this particular thing later, I think.

> The other question is: if we want to back-off, is 500ms really enough?  What
> will be gained by, or could be achieved in, that time?  An exponential
> back-off might be defensible, but I can't see the value of a 500ms fixed
> back-off.
> However if you can, I'd love to see a comment in there explaining it.

Sure.

> > +
> > + out:
> > +	queue_up_suspend_work();
> > +}
> > +
> 
> 
> > +
> > +int pm_autosleep_set_state(suspend_state_t state)
> > +{
> > +
> > +#ifndef CONFIG_HIBERNATION
> > +	if (state >= PM_SUSPEND_MAX)
> > +		return -EINVAL;
> > +#endif
> > +
> > +	__pm_stay_awake(autosleep_ws);
> > +
> > +	mutex_lock(&autosleep_lock);
> > +
> > +	autosleep_state = state;
> > +
> > +	__pm_relax(autosleep_ws);
> 
> I'm struggling to see the point of the autosleep_ws.
> 
> A suspend cannot actually happen while this code is running (can it?) because
> it will wait for the process to enter the freezer.
> So the only effect of this is:
>   1/ cause the current auto-sleep cycle to abort and
>   2/ maybe add some accounting number is the autosleep_ws.
> Is that right?
> Which of these is needed?

This is to solve a problem when user space attempts to echo "off" to
/sys/power/autosleep exactly when pm_suspend() is initiated as a part
of autosleep under the autosleep lock.  In that case, if autosleep_ws is not
there, the process wanting to disable autosleep will have to wait for the
pm_suspend() to complete (unless it holds a wakelock), which is suboptimal.

> I would imagine that any process writing to /sys/power/autosleep would be
> holding a wakelock, and if it didn't it should expect things to be racy...
> 
> Am I missing something?

The assumption above is kind of optimistic in my opinion.  That process
very well may be a system administrator's bash, for example. :-)

> > +
> > +	if (state > PM_SUSPEND_ON)
> > +		queue_up_suspend_work();
> 
> The test here is superfluous as queue_up_suspend_work() itself tests that
> 'state' is > PM_SUSPEND_ON.  However maybe it is more readable this way, so I
> won't object it you like it.

Well, patch [7/8] adds the second statement under this conditional,
so I'd prefer to keep it the current way.

> > +
> > +	mutex_unlock(&autosleep_lock);
> > +	return 0;
> > +}
> 
> 
> > @@ -339,7 +359,8 @@ static ssize_t wakeup_count_show(struct
> >  {
> >  	unsigned int val;
> >  
> > -	return pm_get_wakeup_count(&val) ? sprintf(buf, "%u\n", val) : -EINTR;
> > +	return pm_get_wakeup_count(&val, true) ?
> > +		sprintf(buf, "%u\n", val) : -EINTR;
> >  }
> 
> I think it would be really nice for user-space auto-suspend if the 'block'
> flag to be settable from the O_NONBLOCK setting.  And for poll() to work
> on /sys/power/wakeup-count.  However this would require a bit of surgery on
> sysfs.  So that is a "maybe later", but having the 'block' flag in there is
> a step in the right direction.

Yes, "maybe later" is what I think about that too. :-)

> >  
> >  static ssize_t wakeup_count_store(struct kobject *kobj,
> > @@ -347,15 +368,69 @@ static ssize_t wakeup_count_store(struct
> >  				const char *buf, size_t n)
> >  {
> >  	unsigned int val;
> > +	int error;
> > +
> > +	error = pm_autosleep_lock();
> > +	if (error)
> > +		return error;
> > +
> > +	if (pm_autosleep_state() > PM_SUSPEND_ON) {
> > +		error = -EBUSY;
> > +		goto out;
> > +	}
> >  
> >  	if (sscanf(buf, "%u", &val) == 1) {
> >  		if (pm_save_wakeup_count(val))
> >  			return n;
> 
> You need a 'pm_autosleep_unlock() in there - or possibly
>   error = n; goto out;

Right, thanks for spotting this!

> >  	}
> > -	return -EINVAL;
> > +	error = -EINVAL;
> > +
> > + out:
> > +	pm_autosleep_unlock();
> > +	return error;
> >  }
> 
> >  core_initcall(pm_init);
> > Index: linux/drivers/base/power/wakeup.c
> > ===================================================================
> > --- linux.orig/drivers/base/power/wakeup.c
> > +++ linux/drivers/base/power/wakeup.c
> > @@ -498,8 +498,10 @@ static void wakeup_source_deactivate(str
> >  	trace_wakeup_source_deactivate(ws->name, cec);
> >  
> >  	split_counters(&cnt, &inpr);
> > -	if (!inpr && waitqueue_active(&wakeup_count_wait_queue))
> > +	if (!inpr && waitqueue_active(&wakeup_count_wait_queue)) {
> >  		wake_up(&wakeup_count_wait_queue);
> > +		queue_up_suspend_work();
> > +	}
> 
> This doesn't look right.  suspend_work always requeues itself unless
> autosleep_state == PM_SUSPEND_ON, and whenver autosleep_state is set we
> already call queue_up_suspend_work().  So there is no need to call it here.

OK, I agree.  Good, I don't have to add more code to wakeup_source_deactivate(). :-)

> > Index: linux/Documentation/ABI/testing/sysfs-power
> > ===================================================================
> > --- linux.orig/Documentation/ABI/testing/sysfs-power
> > +++ linux/Documentation/ABI/testing/sysfs-power
> > @@ -172,3 +172,20 @@ Description:
> >  
> >  		Reading from this file will display the current value, which is
> >  		set to 1 MB by default.
> > +
> > +What:		/sys/power/autosleep
> > +Date:		February 2012
> > +Contact:	Rafael J. Wysocki <rjw@sisk.pl>
> > +Description:
> > +		The /sys/power/autosleep file can be written one of the strings
> 
> "To the .. file can be written..." or
> "The .. file can have written ..." or
> "One of the strings returned by (reads from) /sys/power/state can be written
> to the file ..."
> ??
> > +		returned by reads from /sys/power/state.  If that happens, a
> > +		work item attempting to trigger a transition of the system to
> > +		the sleep state represented by that string is queued up.  This
> > +		attempt will only succeed if there are no active wakeup sources
> > +		in the system at that time.  After evey execution, regardless
>                                                    ^^^^
>   "every"
> 
> > +		of whether or not the attempt to put the system to sleep has
> > +		succeeded, the work item requeues itself until user space
> > +		writes "off" to /sys/power/autosleep.
> > +
> > +		Reading from this file causes the last string successfully
> > +		written to it to be displayed.
>                                     ^^^^^^^^^  "returned".

Well spotted, thanks!

Below is an updated patch hopefully addressing your comments.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM / Sleep: Implement opportunistic sleep, v2

Introduce a mechanism by which the kernel can trigger global
transitions to a sleep state chosen by user space if there are no
active wakeup sources.

It consists of a new sysfs attribute, /sys/power/autosleep, that
can be written one of the strings returned by reads from
/sys/power/state, an ordered workqueue and a work item carrying out
the "suspend" operations.  If a string representing the system's
sleep state is written to /sys/power/autosleep, the work item
triggering transitions to that state is queued up and it requeues
itself after every execution until user space writes "off" to
/sys/power/autosleep.

That work item enables the detection of wakeup events using the
functions already defined in drivers/base/power/wakeup.c (with one
small modification) and calls either pm_suspend(), or hibernate() to
put the system into a sleep state.  If a wakeup event is reported
while the transition is in progress, it will abort the transition and
the "system suspend" work item will be queued up again.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/ABI/testing/sysfs-power |   17 ++++
 drivers/base/power/wakeup.c           |   34 +++++----
 include/linux/suspend.h               |   13 +++
 kernel/power/Kconfig                  |    8 ++
 kernel/power/Makefile                 |    1 
 kernel/power/autosleep.c              |  117 +++++++++++++++++++++++++++++++++
 kernel/power/main.c                   |  119 ++++++++++++++++++++++++++++------
 kernel/power/power.h                  |   18 +++++
 8 files changed, 292 insertions(+), 35 deletions(-)

Index: linux/kernel/power/Makefile
===================================================================
--- linux.orig/kernel/power/Makefile
+++ linux/kernel/power/Makefile
@@ -9,5 +9,6 @@ obj-$(CONFIG_SUSPEND)		+= suspend.o
 obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend_test.o
 obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
 				   block_io.o
+obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
 
 obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
Index: linux/kernel/power/Kconfig
===================================================================
--- linux.orig/kernel/power/Kconfig
+++ linux/kernel/power/Kconfig
@@ -103,6 +103,14 @@ config PM_SLEEP_SMP
 	select HOTPLUG
 	select HOTPLUG_CPU
 
+config PM_AUTOSLEEP
+	bool "Opportunistic sleep"
+	depends on PM_SLEEP
+	default n
+	---help---
+	Allow the kernel to trigger a system transition into a global sleep
+	state automatically whenever there are no active wakeup sources.
+
 config PM_RUNTIME
 	bool "Run-time PM core functionality"
 	depends on !IA64_HP_SIM
Index: linux/kernel/power/power.h
===================================================================
--- linux.orig/kernel/power/power.h
+++ linux/kernel/power/power.h
@@ -264,3 +264,21 @@ static inline void suspend_thaw_processe
 {
 }
 #endif
+
+#ifdef CONFIG_PM_AUTOSLEEP
+
+/* kernel/power/autosleep.c */
+extern int pm_autosleep_init(void);
+extern int pm_autosleep_lock(void);
+extern void pm_autosleep_unlock(void);
+extern suspend_state_t pm_autosleep_state(void);
+extern int pm_autosleep_set_state(suspend_state_t state);
+
+#else /* !CONFIG_PM_AUTOSLEEP */
+
+static inline int pm_autosleep_init(void) { return 0; }
+static inline int pm_autosleep_lock(void) { return 0; }
+static inline void pm_autosleep_unlock(void) {}
+static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
+
+#endif /* !CONFIG_PM_AUTOSLEEP */
Index: linux/include/linux/suspend.h
===================================================================
--- linux.orig/include/linux/suspend.h
+++ linux/include/linux/suspend.h
@@ -356,7 +356,7 @@ extern int unregister_pm_notifier(struct
 extern bool events_check_enabled;
 
 extern bool pm_wakeup_pending(void);
-extern bool pm_get_wakeup_count(unsigned int *count);
+extern bool pm_get_wakeup_count(unsigned int *count, bool block);
 extern bool pm_save_wakeup_count(unsigned int count);
 
 static inline void lock_system_sleep(void)
@@ -407,6 +407,17 @@ static inline void unlock_system_sleep(v
 
 #endif /* !CONFIG_PM_SLEEP */
 
+#ifdef CONFIG_PM_AUTOSLEEP
+
+/* kernel/power/autosleep.c */
+void queue_up_suspend_work(void);
+
+#else /* !CONFIG_PM_AUTOSLEEP */
+
+static inline void queue_up_suspend_work(void) {}
+
+#endif /* !CONFIG_PM_AUTOSLEEP */
+
 #ifdef CONFIG_ARCH_SAVE_PAGE_KEYS
 /*
  * The ARCH_SAVE_PAGE_KEYS functions can be used by an architecture
Index: linux/kernel/power/autosleep.c
===================================================================
--- /dev/null
+++ linux/kernel/power/autosleep.c
@@ -0,0 +1,117 @@
+/*
+ * kernel/power/autosleep.c
+ *
+ * Opportunistic sleep support.
+ *
+ * Copyright (C) 2012 Rafael J. Wysocki <rjw@sisk.pl>
+ */
+
+#include <linux/device.h>
+#include <linux/mutex.h>
+#include <linux/pm_wakeup.h>
+
+#include "power.h"
+
+static suspend_state_t autosleep_state;
+static struct workqueue_struct *autosleep_wq;
+static DEFINE_MUTEX(autosleep_lock);
+static struct wakeup_source *autosleep_ws;
+
+static void try_to_suspend(struct work_struct *work)
+{
+	unsigned int initial_count, final_count;
+
+	if (!pm_get_wakeup_count(&initial_count, true))
+		goto out;
+
+	mutex_lock(&autosleep_lock);
+
+	if (!pm_save_wakeup_count(initial_count)) {
+		mutex_unlock(&autosleep_lock);
+		goto out;
+	}
+
+	if (autosleep_state == PM_SUSPEND_ON) {
+		mutex_unlock(&autosleep_lock);
+		return;
+	}
+	if (autosleep_state >= PM_SUSPEND_MAX)
+		hibernate();
+	else
+		pm_suspend(autosleep_state);
+
+	mutex_unlock(&autosleep_lock);
+
+	if (!pm_get_wakeup_count(&final_count, false))
+		goto out;
+
+	/*
+	 * If the wakeup occured for an unknown reason, wait to prevent the
+	 * system from trying to suspend and waking up in a tight loop.
+	 */
+	if (final_count == initial_count)
+		schedule_timeout_uninterruptible(HZ / 2);
+
+ out:
+	queue_up_suspend_work();
+}
+
+static DECLARE_WORK(suspend_work, try_to_suspend);
+
+void queue_up_suspend_work(void)
+{
+	if (!work_pending(&suspend_work) && autosleep_state > PM_SUSPEND_ON)
+		queue_work(autosleep_wq, &suspend_work);
+}
+
+suspend_state_t pm_autosleep_state(void)
+{
+	return autosleep_state;
+}
+
+int pm_autosleep_lock(void)
+{
+	return mutex_lock_interruptible(&autosleep_lock);
+}
+
+void pm_autosleep_unlock(void)
+{
+	mutex_unlock(&autosleep_lock);
+}
+
+int pm_autosleep_set_state(suspend_state_t state)
+{
+
+#ifndef CONFIG_HIBERNATION
+	if (state >= PM_SUSPEND_MAX)
+		return -EINVAL;
+#endif
+
+	__pm_stay_awake(autosleep_ws);
+
+	mutex_lock(&autosleep_lock);
+
+	autosleep_state = state;
+
+	__pm_relax(autosleep_ws);
+
+	if (state > PM_SUSPEND_ON)
+		queue_up_suspend_work();
+
+	mutex_unlock(&autosleep_lock);
+	return 0;
+}
+
+int __init pm_autosleep_init(void)
+{
+	autosleep_ws = wakeup_source_register("autosleep");
+	if (!autosleep_ws)
+		return -ENOMEM;
+
+	autosleep_wq = alloc_ordered_workqueue("autosleep", 0);
+	if (autosleep_wq)
+		return 0;
+
+	wakeup_source_unregister(autosleep_ws);
+	return -ENOMEM;
+}
Index: linux/kernel/power/main.c
===================================================================
--- linux.orig/kernel/power/main.c
+++ linux/kernel/power/main.c
@@ -269,8 +269,7 @@ static ssize_t state_show(struct kobject
 	return (s - buf);
 }
 
-static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
-			   const char *buf, size_t n)
+static suspend_state_t decode_state(const char *buf, size_t n)
 {
 #ifdef CONFIG_SUSPEND
 	suspend_state_t state = PM_SUSPEND_STANDBY;
@@ -278,27 +277,48 @@ static ssize_t state_store(struct kobjec
 #endif
 	char *p;
 	int len;
-	int error = -EINVAL;
 
 	p = memchr(buf, '\n', n);
 	len = p ? p - buf : n;
 
-	/* First, check if we are requested to hibernate */
-	if (len == 4 && !strncmp(buf, "disk", len)) {
-		error = hibernate();
-		goto Exit;
-	}
+	/* Check hibernation first. */
+	if (len == 4 && !strncmp(buf, "disk", len))
+		return PM_SUSPEND_MAX;
 
 #ifdef CONFIG_SUSPEND
-	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++) {
-		if (*s && len == strlen(*s) && !strncmp(buf, *s, len)) {
-			error = pm_suspend(state);
-			break;
-		}
-	}
+	for (s = &pm_states[state]; state < PM_SUSPEND_MAX; s++, state++)
+		if (*s && len == strlen(*s) && !strncmp(buf, *s, len))
+			return state;
 #endif
 
- Exit:
+	return PM_SUSPEND_ON;
+}
+
+static ssize_t state_store(struct kobject *kobj, struct kobj_attribute *attr,
+			   const char *buf, size_t n)
+{
+	suspend_state_t state;
+	int error;
+
+	error = pm_autosleep_lock();
+	if (error)
+		return error;
+
+	if (pm_autosleep_state() > PM_SUSPEND_ON) {
+		error = -EBUSY;
+		goto out;
+	}
+
+	state = decode_state(buf, n);
+	if (state < PM_SUSPEND_MAX)
+		error = pm_suspend(state);
+	else if (state == PM_SUSPEND_MAX)
+		error = hibernate();
+	else
+		error = -EINVAL;
+
+ out:
+	pm_autosleep_unlock();
 	return error ? error : n;
 }
 
@@ -339,7 +359,8 @@ static ssize_t wakeup_count_show(struct
 {
 	unsigned int val;
 
-	return pm_get_wakeup_count(&val) ? sprintf(buf, "%u\n", val) : -EINTR;
+	return pm_get_wakeup_count(&val, true) ?
+		sprintf(buf, "%u\n", val) : -EINTR;
 }
 
 static ssize_t wakeup_count_store(struct kobject *kobj,
@@ -347,15 +368,69 @@ static ssize_t wakeup_count_store(struct
 				const char *buf, size_t n)
 {
 	unsigned int val;
+	int error;
+
+	error = pm_autosleep_lock();
+	if (error)
+		return error;
+
+	if (pm_autosleep_state() > PM_SUSPEND_ON) {
+		error = -EBUSY;
+		goto out;
+	}
 
+	error = -EINVAL;
 	if (sscanf(buf, "%u", &val) == 1) {
 		if (pm_save_wakeup_count(val))
-			return n;
+			error = n;
 	}
-	return -EINVAL;
+
+ out:
+	pm_autosleep_unlock();
+	return error;
 }
 
 power_attr(wakeup_count);
+
+#ifdef CONFIG_PM_AUTOSLEEP
+static ssize_t autosleep_show(struct kobject *kobj,
+			      struct kobj_attribute *attr,
+			      char *buf)
+{
+	suspend_state_t state = pm_autosleep_state();
+
+	if (state == PM_SUSPEND_ON)
+		return sprintf(buf, "off\n");
+
+#ifdef CONFIG_SUSPEND
+	if (state < PM_SUSPEND_MAX)
+		return sprintf(buf, "%s\n", valid_state(state) ?
+						pm_states[state] : "error");
+#endif
+#ifdef CONFIG_HIBERNATION
+	return sprintf(buf, "disk\n");
+#else
+	return sprintf(buf, "error");
+#endif
+}
+
+static ssize_t autosleep_store(struct kobject *kobj,
+			       struct kobj_attribute *attr,
+			       const char *buf, size_t n)
+{
+	suspend_state_t state = decode_state(buf, n);
+	int error;
+
+	if (state == PM_SUSPEND_ON
+	    && !(strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4)))
+		return -EINVAL;
+
+	error = pm_autosleep_set_state(state);
+	return error ? error : n;
+}
+
+power_attr(autosleep);
+#endif /* CONFIG_PM_AUTOSLEEP */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_TRACE
@@ -409,6 +484,9 @@ static struct attribute * g[] = {
 #ifdef CONFIG_PM_SLEEP
 	&pm_async_attr.attr,
 	&wakeup_count_attr.attr,
+#ifdef CONFIG_PM_AUTOSLEEP
+	&autosleep_attr.attr,
+#endif
 #ifdef CONFIG_PM_DEBUG
 	&pm_test_attr.attr,
 #endif
@@ -444,7 +522,10 @@ static int __init pm_init(void)
 	power_kobj = kobject_create_and_add("power", NULL);
 	if (!power_kobj)
 		return -ENOMEM;
-	return sysfs_create_group(power_kobj, &attr_group);
+	error = sysfs_create_group(power_kobj, &attr_group);
+	if (error)
+		return error;
+	return pm_autosleep_init();
 }
 
 core_initcall(pm_init);
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -660,29 +660,33 @@ bool pm_wakeup_pending(void)
 /**
  * pm_get_wakeup_count - Read the number of registered wakeup events.
  * @count: Address to store the value at.
+ * @block: Whether or not to block.
  *
- * Store the number of registered wakeup events at the address in @count.  Block
- * if the current number of wakeup events being processed is nonzero.
+ * Store the number of registered wakeup events at the address in @count.  If
+ * @block is set, block until the current number of wakeup events being
+ * processed is zero.
  *
- * Return 'false' if the wait for the number of wakeup events being processed to
- * drop down to zero has been interrupted by a signal (and the current number
- * of wakeup events being processed is still nonzero).  Otherwise return 'true'.
+ * Return 'false' if the current number of wakeup events being processed is
+ * nonzero.  Otherwise return 'true'.
  */
-bool pm_get_wakeup_count(unsigned int *count)
+bool pm_get_wakeup_count(unsigned int *count, bool block)
 {
 	unsigned int cnt, inpr;
-	DEFINE_WAIT(wait);
 
-	for (;;) {
-		prepare_to_wait(&wakeup_count_wait_queue, &wait,
-				TASK_INTERRUPTIBLE);
-		split_counters(&cnt, &inpr);
-		if (inpr == 0 || signal_pending(current))
-			break;
+	if (block) {
+		DEFINE_WAIT(wait);
 
-		schedule();
+		for (;;) {
+			prepare_to_wait(&wakeup_count_wait_queue, &wait,
+					TASK_INTERRUPTIBLE);
+			split_counters(&cnt, &inpr);
+			if (inpr == 0 || signal_pending(current))
+				break;
+
+			schedule();
+		}
+		finish_wait(&wakeup_count_wait_queue, &wait);
 	}
-	finish_wait(&wakeup_count_wait_queue, &wait);
 
 	split_counters(&cnt, &inpr);
 	*count = cnt;
Index: linux/Documentation/ABI/testing/sysfs-power
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-power
+++ linux/Documentation/ABI/testing/sysfs-power
@@ -172,3 +172,20 @@ Description:
 
 		Reading from this file will display the current value, which is
 		set to 1 MB by default.
+
+What:		/sys/power/autosleep
+Date:		April 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/autosleep file can be written one of the strings
+		returned by reads from /sys/power/state.  If that happens, a
+		work item attempting to trigger a transition of the system to
+		the sleep state represented by that string is queued up.  This
+		attempt will only succeed if there are no active wakeup sources
+		in the system at that time.  After every execution, regardless
+		of whether or not the attempt to put the system to sleep has
+		succeeded, the work item requeues itself until user space
+		writes "off" to /sys/power/autosleep.
+
+		Reading from this file causes the last string successfully
+		written to it to be returned.

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating wakeup sources
  2012-04-26  6:31           ` NeilBrown
@ 2012-04-26 22:04             ` Rafael J. Wysocki
  2012-04-27  0:07               ` NeilBrown
  2012-04-27  3:57               ` Arve Hjønnevåg
  0 siblings, 2 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-26 22:04 UTC (permalink / raw)
  To: NeilBrown
  Cc: John Stultz, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg, John Stultz,
	Brian Swetland, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Thursday, April 26, 2012, NeilBrown wrote:
> On Tue, 24 Apr 2012 23:27:17 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > Subject: PM / Sleep: Add user space interface for manipulating wakeup sources, v2
> > 
> > Android allows user space to manipulate wakelocks using two
> > sysfs file located in /sys/power/, wake_lock and wake_unlock.
> > Writing a wakelock name and optionally a timeout to the wake_lock
> > file causes the wakelock whose name was written to be acquired (it
> > is created before is necessary), optionally with the given timeout.
> > Writing the name of a wakelock to wake_unlock causes that wakelock
> > to be released.
> > 
> > Implement an analogous interface for user space using wakeup sources.
> > Add the /sys/power/wake_lock and /sys/power/wake_unlock files
> > allowing user space to create, activate and deactivate wakeup
> > sources, such that writing a name and optionally a timeout to
> > wake_lock causes the wakeup source of that name to be activated,
> > optionally with the given timeout.  If that wakeup source doesn't
> > exist, it will be created and then activated.  Writing a name to
> > wake_unlock causes the wakeup source of that name, if there is one,
> > to be deactivated.  Wakeup sources created with the help of
> > wake_lock that haven't been used for more than 5 minutes are garbage
> > collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
> > wakeup sources created with the help of wake_lock present at a time.
> > 
> > The data type used to track wakeup sources created by user space is
> > called "struct wakelock" to indicate the origins of this feature.
> > 
> > This version of the patch includes an rbtree manipulation fix from John Stultz.
> 
> Looks good.  Just a couple of minor suggestions.
> 
> 
> > +ssize_t pm_show_wakelocks(char *buf, bool show_active)
> > +{
> > +	struct rb_node *node;
> > +	struct wakelock *wl;
> > +	char *str = buf;
> > +	char *end = buf + PAGE_SIZE;
> > +
> > +	mutex_lock(&wakelocks_lock);
> > +
> > +	for (node = rb_first(&wakelocks_tree); node; node = rb_next(node)) {
> > +		bool active;
> > +
> > +		wl = rb_entry(node, struct wakelock, node);
> > +		spin_lock_irq(&wl->ws.lock);
> > +		active = wl->ws.active;
> > +		spin_unlock_irq(&wl->ws.lock);
> 
> I don't think the spin_lock is needed.  We are just reading one value and it
> is either 0 or not.  So there is no possibility for any inconsistency.
>                 if (wl->ws.active == show_active)
> ?

Good point.

> > +		if (active == show_active)
> > +			str += scnprintf(str, end - str, "%s ", wl->name);
> 
> Arg.  Extra space on the end of the line!!  :-)

Well, it's not too difficult to get rid of it (as in the patch below).

> I would suggest the entries be terminated by '\n' rather than separate by
> space.
> one-item-per-line is much more common in Unix in general.  'grep' allows
> you to find things more easily etc.
>    while read a
>    do echo $a > wake_unlock
>    done < wake_lock

I know, but this follows the general convention of the files under /sys/power/.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM / Sleep: Add user space interface for manipulating wakeup sources, v3

Android allows user space to manipulate wakelocks using two
sysfs file located in /sys/power/, wake_lock and wake_unlock.
Writing a wakelock name and optionally a timeout to the wake_lock
file causes the wakelock whose name was written to be acquired (it
is created before is necessary), optionally with the given timeout.
Writing the name of a wakelock to wake_unlock causes that wakelock
to be released.

Implement an analogous interface for user space using wakeup sources.
Add the /sys/power/wake_lock and /sys/power/wake_unlock files
allowing user space to create, activate and deactivate wakeup
sources, such that writing a name and optionally a timeout to
wake_lock causes the wakeup source of that name to be activated,
optionally with the given timeout.  If that wakeup source doesn't
exist, it will be created and then activated.  Writing a name to
wake_unlock causes the wakeup source of that name, if there is one,
to be deactivated.  Wakeup sources created with the help of
wake_lock that haven't been used for more than 5 minutes are garbage
collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
wakeup sources created with the help of wake_lock present at a time.

The data type used to track wakeup sources created by user space is
called "struct wakelock" to indicate the origins of this feature.

This version of the patch includes an rbtree manipulation fix from John Stultz.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/ABI/testing/sysfs-power |   42 ++++++
 drivers/base/power/wakeup.c           |    1 
 kernel/power/Kconfig                  |    8 +
 kernel/power/Makefile                 |    1 
 kernel/power/main.c                   |   41 ++++++
 kernel/power/power.h                  |    9 +
 kernel/power/wakelock.c               |  215 ++++++++++++++++++++++++++++++++++
 7 files changed, 317 insertions(+)

Index: linux/kernel/power/main.c
===================================================================
--- linux.orig/kernel/power/main.c
+++ linux/kernel/power/main.c
@@ -431,6 +431,43 @@ static ssize_t autosleep_store(struct ko
 
 power_attr(autosleep);
 #endif /* CONFIG_PM_AUTOSLEEP */
+
+#ifdef CONFIG_PM_WAKELOCKS
+static ssize_t wake_lock_show(struct kobject *kobj,
+			      struct kobj_attribute *attr,
+			      char *buf)
+{
+	return pm_show_wakelocks(buf, true);
+}
+
+static ssize_t wake_lock_store(struct kobject *kobj,
+			       struct kobj_attribute *attr,
+			       const char *buf, size_t n)
+{
+	int error = pm_wake_lock(buf);
+	return error ? error : n;
+}
+
+power_attr(wake_lock);
+
+static ssize_t wake_unlock_show(struct kobject *kobj,
+				struct kobj_attribute *attr,
+				char *buf)
+{
+	return pm_show_wakelocks(buf, false);
+}
+
+static ssize_t wake_unlock_store(struct kobject *kobj,
+				 struct kobj_attribute *attr,
+				 const char *buf, size_t n)
+{
+	int error = pm_wake_unlock(buf);
+	return error ? error : n;
+}
+
+power_attr(wake_unlock);
+
+#endif /* CONFIG_PM_WAKELOCKS */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_TRACE
@@ -487,6 +524,10 @@ static struct attribute * g[] = {
 #ifdef CONFIG_PM_AUTOSLEEP
 	&autosleep_attr.attr,
 #endif
+#ifdef CONFIG_PM_WAKELOCKS
+	&wake_lock_attr.attr,
+	&wake_unlock_attr.attr,
+#endif
 #ifdef CONFIG_PM_DEBUG
 	&pm_test_attr.attr,
 #endif
Index: linux/kernel/power/power.h
===================================================================
--- linux.orig/kernel/power/power.h
+++ linux/kernel/power/power.h
@@ -282,3 +282,12 @@ static inline void pm_autosleep_unlock(v
 static inline suspend_state_t pm_autosleep_state(void) { return PM_SUSPEND_ON; }
 
 #endif /* !CONFIG_PM_AUTOSLEEP */
+
+#ifdef CONFIG_PM_WAKELOCKS
+
+/* kernel/power/wakelock.c */
+extern ssize_t pm_show_wakelocks(char *buf, bool show_active);
+extern int pm_wake_lock(const char *buf);
+extern int pm_wake_unlock(const char *buf);
+
+#endif /* !CONFIG_PM_WAKELOCKS */
Index: linux/kernel/power/Kconfig
===================================================================
--- linux.orig/kernel/power/Kconfig
+++ linux/kernel/power/Kconfig
@@ -111,6 +111,14 @@ config PM_AUTOSLEEP
 	Allow the kernel to trigger a system transition into a global sleep
 	state automatically whenever there are no active wakeup sources.
 
+config PM_WAKELOCKS
+	bool "User space wakeup sources interface"
+	depends on PM_SLEEP
+	default n
+	---help---
+	Allow user space to create, activate and deactivate wakeup source
+	objects with the help of a sysfs-based interface.
+
 config PM_RUNTIME
 	bool "Run-time PM core functionality"
 	depends on !IA64_HP_SIM
Index: linux/kernel/power/wakelock.c
===================================================================
--- /dev/null
+++ linux/kernel/power/wakelock.c
@@ -0,0 +1,215 @@
+/*
+ * kernel/power/wakelock.c
+ *
+ * User space wakeup sources support.
+ *
+ * Copyright (C) 2012 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This code is based on the analogous interface allowing user space to
+ * manipulate wakelocks on Android.
+ */
+
+#include <linux/ctype.h>
+#include <linux/device.h>
+#include <linux/err.h>
+#include <linux/hrtimer.h>
+#include <linux/list.h>
+#include <linux/rbtree.h>
+#include <linux/slab.h>
+
+#define WL_NUMBER_LIMIT	100
+#define WL_GC_COUNT_MAX	100
+#define WL_GC_TIME_SEC	300
+
+static DEFINE_MUTEX(wakelocks_lock);
+
+struct wakelock {
+	char			*name;
+	struct rb_node		node;
+	struct wakeup_source	ws;
+	struct list_head	lru;
+};
+
+static struct rb_root wakelocks_tree = RB_ROOT;
+static LIST_HEAD(wakelocks_lru_list);
+static unsigned int number_of_wakelocks;
+static unsigned int wakelocks_gc_count;
+
+ssize_t pm_show_wakelocks(char *buf, bool show_active)
+{
+	struct rb_node *node;
+	struct wakelock *wl;
+	char *str = buf;
+	char *end = buf + PAGE_SIZE;
+
+	mutex_lock(&wakelocks_lock);
+
+	for (node = rb_first(&wakelocks_tree); node; node = rb_next(node)) {
+		wl = rb_entry(node, struct wakelock, node);
+		if (wl->ws.active == show_active)
+			str += scnprintf(str, end - str, "%s ", wl->name);
+	}
+	if (str > buf)
+		str--;
+
+	str += scnprintf(str, end - str, "\n");
+
+	mutex_unlock(&wakelocks_lock);
+	return (str - buf);
+}
+
+static struct wakelock *wakelock_lookup_add(const char *name, size_t len,
+					    bool add_if_not_found)
+{
+	struct rb_node **node = &wakelocks_tree.rb_node;
+	struct rb_node *parent = *node;
+	struct wakelock *wl;
+
+	while (*node) {
+		int diff;
+
+		parent = *node;
+		wl = rb_entry(*node, struct wakelock, node);
+		diff = strncmp(name, wl->name, len);
+		if (diff == 0) {
+			if (wl->name[len])
+				diff = -1;
+			else
+				return wl;
+		}
+		if (diff < 0)
+			node = &(*node)->rb_left;
+		else
+			node = &(*node)->rb_right;
+	}
+	if (!add_if_not_found)
+		return ERR_PTR(-EINVAL);
+
+	if (number_of_wakelocks > WL_NUMBER_LIMIT)
+		return ERR_PTR(-ENOSPC);
+
+	/* Not found, we have to add a new one. */
+	wl = kzalloc(sizeof(*wl), GFP_KERNEL);
+	if (!wl)
+		return ERR_PTR(-ENOMEM);
+
+	wl->name = kstrndup(name, len, GFP_KERNEL);
+	if (!wl->name) {
+		kfree(wl);
+		return ERR_PTR(-ENOMEM);
+	}
+	wl->ws.name = wl->name;
+	wakeup_source_add(&wl->ws);
+	rb_link_node(&wl->node, parent, node);
+	rb_insert_color(&wl->node, &wakelocks_tree);
+	list_add(&wl->lru, &wakelocks_lru_list);
+	number_of_wakelocks++;
+	return wl;
+}
+
+int pm_wake_lock(const char *buf)
+{
+	const char *str = buf;
+	struct wakelock *wl;
+	u64 timeout_ns = 0;
+	size_t len;
+	int ret = 0;
+
+	while (*str && !isspace(*str))
+		str++;
+
+	len = str - buf;
+	if (!len)
+		return -EINVAL;
+
+	if (*str && *str != '\n') {
+		/* Find out if there's a valid timeout string appended. */
+		ret = kstrtou64(skip_spaces(str), 10, &timeout_ns);
+		if (ret)
+			return -EINVAL;
+	}
+
+	mutex_lock(&wakelocks_lock);
+
+	wl = wakelock_lookup_add(buf, len, true);
+	if (IS_ERR(wl)) {
+		ret = PTR_ERR(wl);
+		goto out;
+	}
+	if (timeout_ns) {
+		u64 timeout_ms = timeout_ns + NSEC_PER_MSEC - 1;
+
+		do_div(timeout_ms, NSEC_PER_MSEC);
+		__pm_wakeup_event(&wl->ws, timeout_ms);
+	} else {
+		__pm_stay_awake(&wl->ws);
+	}
+
+	list_move(&wl->lru, &wakelocks_lru_list);
+
+ out:
+	mutex_unlock(&wakelocks_lock);
+	return ret;
+}
+
+static void wakelocks_gc(void)
+{
+	struct wakelock *wl, *aux;
+	ktime_t now = ktime_get();
+
+	list_for_each_entry_safe_reverse(wl, aux, &wakelocks_lru_list, lru) {
+		u64 idle_time_ns;
+		bool active;
+
+		spin_lock_irq(&wl->ws.lock);
+		idle_time_ns = ktime_to_ns(ktime_sub(now, wl->ws.last_time));
+		active = wl->ws.active;
+		spin_unlock_irq(&wl->ws.lock);
+
+		if (idle_time_ns < ((u64)WL_GC_TIME_SEC * NSEC_PER_SEC))
+			break;
+
+		if (!active) {
+			wakeup_source_remove(&wl->ws);
+			rb_erase(&wl->node, &wakelocks_tree);
+			list_del(&wl->lru);
+			kfree(wl->name);
+			kfree(wl);
+			number_of_wakelocks--;
+		}
+	}
+	wakelocks_gc_count = 0;
+}
+
+int pm_wake_unlock(const char *buf)
+{
+	struct wakelock *wl;
+	size_t len;
+	int ret = 0;
+
+	len = strlen(buf);
+	if (!len)
+		return -EINVAL;
+
+	if (buf[len-1] == '\n')
+		len--;
+
+	if (!len)
+		return -EINVAL;
+
+	mutex_lock(&wakelocks_lock);
+
+	wl = wakelock_lookup_add(buf, len, false);
+	if (IS_ERR(wl)) {
+		ret = PTR_ERR(wl);
+		goto out;
+	}
+	__pm_relax(&wl->ws);
+	list_move(&wl->lru, &wakelocks_lru_list);
+	if (++wakelocks_gc_count > WL_GC_COUNT_MAX)
+		wakelocks_gc();
+
+ out:
+	mutex_unlock(&wakelocks_lock);
+	return ret;
+}
Index: linux/kernel/power/Makefile
===================================================================
--- linux.orig/kernel/power/Makefile
+++ linux/kernel/power/Makefile
@@ -10,5 +10,6 @@ obj-$(CONFIG_PM_TEST_SUSPEND)	+= suspend
 obj-$(CONFIG_HIBERNATION)	+= hibernate.o snapshot.o swap.o user.o \
 				   block_io.o
 obj-$(CONFIG_PM_AUTOSLEEP)	+= autosleep.o
+obj-$(CONFIG_PM_WAKELOCKS)	+= wakelock.o
 
 obj-$(CONFIG_MAGIC_SYSRQ)	+= poweroff.o
Index: linux/drivers/base/power/wakeup.c
===================================================================
--- linux.orig/drivers/base/power/wakeup.c
+++ linux/drivers/base/power/wakeup.c
@@ -133,6 +133,7 @@ void wakeup_source_add(struct wakeup_sou
 	spin_lock_init(&ws->lock);
 	setup_timer(&ws->timer, pm_wakeup_timer_fn, (unsigned long)ws);
 	ws->active = false;
+	ws->last_time = ktime_get();
 
 	spin_lock_irq(&events_lock);
 	list_add_rcu(&ws->entry, &wakeup_sources);
Index: linux/Documentation/ABI/testing/sysfs-power
===================================================================
--- linux.orig/Documentation/ABI/testing/sysfs-power
+++ linux/Documentation/ABI/testing/sysfs-power
@@ -189,3 +189,45 @@ Description:
 
 		Reading from this file causes the last string successfully
 		written to it to be returned.
+
+What:		/sys/power/wake_lock
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/wake_lock file allows user space to create
+		wakeup source objects and activate them on demand (if one of
+		those wakeup sources is active, reads from the
+		/sys/power/wakeup_count file block or return false).  When a
+		string without white space is written to /sys/power/wake_lock,
+		it will be assumed to represent a wakeup source name.  If there
+		is a wakeup source object with that name, it will be activated
+		(unless active already).  Otherwise, a new wakeup source object
+		will be registered, assigned the given name and activated.
+		If a string written to /sys/power/wake_lock contains white
+		space, the part of the string preceding the white space will be
+		regarded as a wakeup source name and handled as descrived above.
+		The other part of the string will be regarded as a timeout (in
+		nanoseconds) such that the wakeup source will be automatically
+		deactivated after it has expired.  The timeout, if present, is
+		set regardless of the current state of the wakeup source object
+		in question.
+
+		Reads from this file return a string consisting of the names of
+		wakeup sources created with the help of it that are active at
+		the moment, separated with spaces.
+
+
+What:		/sys/power/wake_unlock
+Date:		February 2012
+Contact:	Rafael J. Wysocki <rjw@sisk.pl>
+Description:
+		The /sys/power/wake_unlock file allows user space to deactivate
+		wakeup sources created with the help of /sys/power/wake_lock.
+		When a string is written to /sys/power/wake_unlock, it will be
+		assumed to represent the name of a wakeup source to deactivate.
+		If a wakeup source object of that name exists and is active at
+		the moment, it will be deactivated.
+
+		Reads from this file return a string consisting of the names of
+		wakeup sources created with the help of /sys/power/wake_lock
+		that are inactive at the moment, separated with spaces.

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating wakeup sources
  2012-04-26 22:04             ` Rafael J. Wysocki
@ 2012-04-27  0:07               ` NeilBrown
  2012-04-27 21:15                 ` Rafael J. Wysocki
  2012-04-27  3:57               ` Arve Hjønnevåg
  1 sibling, 1 reply; 129+ messages in thread
From: NeilBrown @ 2012-04-27  0:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: John Stultz, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg, John Stultz,
	Brian Swetland, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

[-- Attachment #1: Type: text/plain, Size: 278 bytes --]

On Fri, 27 Apr 2012 00:04:27 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> ---
> From: Rafael J. Wysocki <rjw@sisk.pl>
> Subject: PM / Sleep: Add user space interface for manipulating wakeup sources, v3

Reviewed-by: NeilBrown <neilb@suse.de>

Thanks.
NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
  2012-04-26 21:52         ` Rafael J. Wysocki
@ 2012-04-27  0:39           ` NeilBrown
  2012-04-27 21:22             ` Rafael J. Wysocki
  2012-05-03  0:23           ` Arve Hjønnevåg
  1 sibling, 1 reply; 129+ messages in thread
From: NeilBrown @ 2012-04-27  0:39 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

[-- Attachment #1: Type: text/plain, Size: 7217 bytes --]

On Thu, 26 Apr 2012 23:52:42 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:

> On Thursday, April 26, 2012, NeilBrown wrote:
> > On Sun, 22 Apr 2012 23:23:23 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > 
> > > From: "Rafael J. Wysocki" <rjw@sisk.pl>
> > > To: Linux PM list <linux-pm@vger.kernel.org>
> > > Cc: LKML <linux-kernel@vger.kernel.org>, Magnus Damm <magnus.damm@gmail.com>, markgross@thegnar.org, Matthew Garrett <mjg@redhat.com>, Greg KH <gregkh@linuxfoundation.org>, Arve Hjønnevåg <arve@android.com>, John Stultz <john.stultz@linaro.org>, Brian Swetland <swetland@google.com>, Neil Brown <neilb@suse.de>, Alan Stern <stern@rowland.harvard.edu>, Dmitry Torokhov <dmitry.torokhov@gmail.com>, "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
> > > Subject: [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
> > > Date: Sun, 22 Apr 2012 23:23:23 +0200
> > > Sender: linux-kernel-owner@vger.kernel.org
> > > User-Agent: KMail/1.13.6 (Linux/3.4.0-rc3+; KDE/4.6.0; x86_64; ; )
> > > 
> > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > 
> > > Introduce a mechanism by which the kernel can trigger global
> > > transitions to a sleep state chosen by user space if there are no
> > > active wakeup sources.
> > 
> > Hi Rafael,
> 
> Hi,
> 
> >  just a few little issues below.  Over all I think that if we have to have
> >  auto-sleep in the kernel, then this is a good way to do it.
> 
> Good, we seem to agree in principle, then. :-)
> 
> > > +static void try_to_suspend(struct work_struct *work)
> > > +{
> > > +	unsigned int initial_count, final_count;
> > > +
> > > +	if (!pm_get_wakeup_count(&initial_count, true))
> > > +		goto out;
> > > +
> > > +	mutex_lock(&autosleep_lock);
> > > +
> > > +	if (!pm_save_wakeup_count(initial_count)) {
> > > +		mutex_unlock(&autosleep_lock);
> > > +		goto out;
> > > +	}
> > > +
> > > +	if (autosleep_state == PM_SUSPEND_ON) {
> > > +		mutex_unlock(&autosleep_lock);
> > > +		return;
> > > +	}
> > > +	if (autosleep_state >= PM_SUSPEND_MAX)
> > > +		hibernate();
> > > +	else
> > > +		pm_suspend(autosleep_state);
> > > +
> > > +	mutex_unlock(&autosleep_lock);
> > > +
> > > +	if (!pm_get_wakeup_count(&final_count, false))
> > > +		goto out;
> > > +
> > > +	if (final_count == initial_count)
> > > +		schedule_timeout(HZ / 2);
> > 
> > This doesn't do what you seem to expect it to do.
> > You need to set current->state to something like TASK_UNINTERRUPTIBLE
> > before calling schedule_timeout, otherwise it is effectily a no-op.
> > schedule_timeout_uninterruptible(), for example, will do this for you.
> 
> Right.  I obviously overlooked the missing state change.
> 
> > However the value of this isn't clear to me, so a comment would probably be a
> > good thing.
> > This continue presumably fires if we wake up without any wakeup sources
> > being activated.  In that case you want to delay for 500ms - presumably to
> > avoid a tight suspend/resume loop if something goes wrong?
> 
> Yes.
> 
> > I have occasionally seen a stray/uninteresting interrupt wake from suspend
> > immediately after entering suspend and the next attempt succeeds.  Maybe this
> > is a bug in some driver somewhere, but not a big one.  I think I would rather
> > in that case that we attempt to re-enter suspend immediately.  Maybe after a
> > few failed attempts it makes sense to back off.
> 
> Perhaps.  We can adjust this particular thing later, I think.
> 
> > The other question is: if we want to back-off, is 500ms really enough?  What
> > will be gained by, or could be achieved in, that time?  An exponential
> > back-off might be defensible, but I can't see the value of a 500ms fixed
> > back-off.
> > However if you can, I'd love to see a comment in there explaining it.
> 
> Sure.
> 
> > > +
> > > + out:
> > > +	queue_up_suspend_work();
> > > +}
> > > +
> > 
> > 
> > > +
> > > +int pm_autosleep_set_state(suspend_state_t state)
> > > +{
> > > +
> > > +#ifndef CONFIG_HIBERNATION
> > > +	if (state >= PM_SUSPEND_MAX)
> > > +		return -EINVAL;
> > > +#endif
> > > +
> > > +	__pm_stay_awake(autosleep_ws);
> > > +
> > > +	mutex_lock(&autosleep_lock);
> > > +
> > > +	autosleep_state = state;
> > > +
> > > +	__pm_relax(autosleep_ws);
> > 
> > I'm struggling to see the point of the autosleep_ws.
> > 
> > A suspend cannot actually happen while this code is running (can it?) because
> > it will wait for the process to enter the freezer.
> > So the only effect of this is:
> >   1/ cause the current auto-sleep cycle to abort and
> >   2/ maybe add some accounting number is the autosleep_ws.
> > Is that right?
> > Which of these is needed?
> 
> This is to solve a problem when user space attempts to echo "off" to
> /sys/power/autosleep exactly when pm_suspend() is initiated as a part
> of autosleep under the autosleep lock.  In that case, if autosleep_ws is not
> there, the process wanting to disable autosleep will have to wait for the
> pm_suspend() to complete (unless it holds a wakelock), which is suboptimal.
> 
> > I would imagine that any process writing to /sys/power/autosleep would be
> > holding a wakelock, and if it didn't it should expect things to be racy...
> > 
> > Am I missing something?
> 
> The assumption above is kind of optimistic in my opinion.  That process
> very well may be a system administrator's bash, for example. :-)

If it is, then presumably the auto-sleep could kick in between any pair of
keystrokes that the sysadmin types.  Or between the final 'enter' and when the
write() system call begins.  All that autosleep_ws seems to provide is
certainty that when the write() system call completes, autosleep will be
fully disabled.
I don't think that is really worth anything.

However, something did occur to me that I would like clarified.
What happens if try_to_suspend() gets the autosleep_lock just before
wakeup_count_store(), state_store() or pm_autosleep_set_state()
try to get it?
For pm_autosleep_set_state() the try_to_suspend() attempt will abort because
it is holding autosleep_ws, so it will drop the lock and
pm_autosleep_set_state() will continue happily.
For the other two, what will happen (if there are no active wakesources and
autosleep is enabled).
I'm guessing that try_to_suspend will try to freeze all the process, which
sends a pseudo signal to all processes, so the mutex_lock_interruptible will
fail and the suspend will complete.
Then will the aborted write() system call be re-attempted?

If that is right, then here is a very clear need to autosleep_ws:  it prevents
a deadlock.
So it appears there is a very real need for autosleep_ws that even I can
agree with.  It seems subtle though and could usefully be documented:

/* Note: it is only safe to mutex_lock(&autosleep_lock) if a wakeup_source
 * is active, otherwise a deadlock with try_to_suspend() is possible.
 * Alternatively mutex_lock_interruptible() can be used.  This will then fail
 * if an auto_sleep cycle tries to freeze processes.
 */
static DEFINE_MUTEX(autosleep_lock);

So:
  Reviewed-by: NeilBrown <neilb@suse.de>

Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 5/8] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-04-26 20:40         ` Rafael J. Wysocki
@ 2012-04-27  3:49           ` Arve Hjønnevåg
  2012-04-27 21:18             ` Rafael J. Wysocki
  2012-04-30  1:58             ` [RFC][PATCH 5/8] " NeilBrown
  0 siblings, 2 replies; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-04-27  3:49 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: NeilBrown, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

2012/4/26 Rafael J. Wysocki <rjw@sisk.pl>:
> On Thursday, April 26, 2012, NeilBrown wrote:
>> On Sun, 22 Apr 2012 23:22:43 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
>>
>> > From: Arve Hjønnevåg <arve@android.com>
>> >
>> > When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
>> > wakeup_source will be active to prevent suspend. This can be used to
>> > handle wakeup events from a driver that support poll, e.g. input, if
>> > that driver wakes up the waitqueue passed to epoll before allowing
>> > suspend.
>> >
>> > The current implementation uses an extra wakeup_source when
>> > ep_scan_ready_list runs. This can cause problems if a single thread
>> > is polling on wakeup events and frequent non-wakeup events (events
>> > usually arrive during thread freezing) using the same epoll file.
>>
>> This is quite neat.
>>
>> If I understand it correctly, you register file descriptors with epoll_ctl()
>> on an fd created with epoll_create(), and set the new EPOLLWAKEUP flag.
>> Then when a regular 'poll' or 'select' on the epoll fd reports that it is
>> readable you:

I think it makes more sense to use epoll_wait than mixing this with
select or poll.

>>   - get a wakelock
This may not be needed, since epoll does not reevaluate its state
until you call into it again (at least using epoll_wait).

>>   - use epoll_wait to collect the events
>>   - process the events
>>   - release your wakelock
>>   - go back to poll() or select() on the epoll fd.
>> Correct?  As long as there are ready events with EPOLLWAKEUP set a
>> wakeup_source is held active and the system won't go to sleep.
>>
>> My concern with this is about permissions.  It appears that any process could
>> wait of some fd (maybe a pipe they created themselves) with EPOLLWAKEUP, and
>> then simply never epoll_wait() for the event.  Then they would be keeping
>> the system awake.  I don't think that is acceptable.
>
> I wonder what Arve has to say to that, but let me say that on systems without
> autosleep every process can go into an infinite busy loop which is going to
> drain battery relatively quickly just as well and I don't see why that's so
> much different.
>

I still think is useful to limit access to this feature. On a phone, a
process stuck in an infinite loop will increase battery drain, but if
this process does not have permission to prevent suspend, then this is
only catastrophic if another process that have that permission is
preventing suspend. I think we should add a capability for this.
Assuming you agree, do want me to create a separate patch for that
adds a capability, or roll it into this one.

>> So there needs to be some way to limit who can effectively block suspend by
>> using EPOLLWAKEUP.
>> (This is one of the reasons I like an all-user-space solution.  Policy issues
>> like this can easily be decided in user-space but are clumsy to put into the
>> kernel).
>>
>> Also, I'm having trouble understanding the ep->ws wakeup_source.
>> The epi->ws makes lots of sense and I think I understand it all.
>> However I don't see why you need a wakeup_source for the 'struct eventpoll'.
>>
>> Every time that 'poll' decides to call the ->poll fop for the eventpoll, this
>> wakeup_source will be activated and deactivated which will abort any current
>> suspend cycle even if there are no events to report.
>>
>> I suspect it can just go away.
>
> I'll leave this one entirely to Arve, if you don't mind. :-)
>

I keep the wakeup-source active whenever the epitem is on a list
(ep->rdllist or the local txlist). The temporary txlist is modified
without holding the lock that protects ep->rdllist. It is easier to
use a separate wakeup source to prevent suspend while this list is
manipulated than trying to maintain the wakeup-source state in a
different way than the existing eventpoll state. I think this only
causes real problems if the same epoll file is used for frequent
non-wakeup events (e.g. a gyro) and wakeup events. You should be able
to work around this by using two epoll files.

>> One last item that doesn't really belong here - but it is in context.
>>
>> This mechanism is elegant because it provides a single implementation that
>> provides wakeup_source for almost any sort of device.  I would like to do the
>> same thing for interrupts.
>> Most (maybe all) of the wakeup device on my phone have an interrupt where the
>> body is run in a thread.  When the thread has done it's work the event is
>> visible to userspace so the EPOLLWAKEUP mechanism is all that is needed to
>> complete the path to user-space (or for my user-space solution, nothing else
>> is needed once it is visible to user-space).
>> So we just need to ensure a clear path from the "top half" interrupt handler
>> to the threaded handler.
>> So I imagine attaching a wakeup source to every interrupt for which 'wakeup'
>> is enabled, activating it when the top-half starts and relaxing it when the
>> bottom-half completes.  With this in place, almost all drivers would get
>> wakeup_source handling for free.
>> Does this seem reasonable to you.
>
> Yes, it does.
>

How useful is that? Suspend already synchronizes with interrupt
handlers and will not proceed until they have returned. Are threaded
interrupts handlers not always run at that stage? For drivers that use
work-queues instead of a threaded interrupt handler, I think the
suspend-blocking work-queue patch I wrote a while back is convenient.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating wakeup sources
  2012-04-26 22:04             ` Rafael J. Wysocki
  2012-04-27  0:07               ` NeilBrown
@ 2012-04-27  3:57               ` Arve Hjønnevåg
  2012-04-27 21:14                 ` Rafael J. Wysocki
  1 sibling, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-04-27  3:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: NeilBrown, John Stultz, Linux PM list, LKML, Magnus Damm,
	markgross, Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

2012/4/26 Rafael J. Wysocki <rjw@sisk.pl>:
...
> ---
> From: Rafael J. Wysocki <rjw@sisk.pl>
> Subject: PM / Sleep: Add user space interface for manipulating wakeup sources, v3
>
> Android allows user space to manipulate wakelocks using two
> sysfs file located in /sys/power/, wake_lock and wake_unlock.
> Writing a wakelock name and optionally a timeout to the wake_lock
> file causes the wakelock whose name was written to be acquired (it
> is created before is necessary), optionally with the given timeout.
> Writing the name of a wakelock to wake_unlock causes that wakelock
> to be released.
>
> Implement an analogous interface for user space using wakeup sources.
> Add the /sys/power/wake_lock and /sys/power/wake_unlock files
> allowing user space to create, activate and deactivate wakeup
> sources, such that writing a name and optionally a timeout to
> wake_lock causes the wakeup source of that name to be activated,
> optionally with the given timeout.  If that wakeup source doesn't
> exist, it will be created and then activated.  Writing a name to
> wake_unlock causes the wakeup source of that name, if there is one,
> to be deactivated.  Wakeup sources created with the help of
> wake_lock that haven't been used for more than 5 minutes are garbage
> collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT

I think it would be better if the garbage collection and limit was
configurable and optional. I would probably turn both features off
since I do not want to chase down bugs because a wakelock was ignored,
and I think the garbage collection will erase stats that we care
about.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating wakeup sources
  2012-04-27  3:57               ` Arve Hjønnevåg
@ 2012-04-27 21:14                 ` Rafael J. Wysocki
  2012-04-27 21:17                   ` Arve Hjønnevåg
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-27 21:14 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: NeilBrown, John Stultz, Linux PM list, LKML, Magnus Damm,
	markgross, Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Friday, April 27, 2012, Arve Hjønnevåg wrote:
> 2012/4/26 Rafael J. Wysocki <rjw@sisk.pl>:
> ...
> > ---
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > Subject: PM / Sleep: Add user space interface for manipulating wakeup sources, v3
> >
> > Android allows user space to manipulate wakelocks using two
> > sysfs file located in /sys/power/, wake_lock and wake_unlock.
> > Writing a wakelock name and optionally a timeout to the wake_lock
> > file causes the wakelock whose name was written to be acquired (it
> > is created before is necessary), optionally with the given timeout.
> > Writing the name of a wakelock to wake_unlock causes that wakelock
> > to be released.
> >
> > Implement an analogous interface for user space using wakeup sources.
> > Add the /sys/power/wake_lock and /sys/power/wake_unlock files
> > allowing user space to create, activate and deactivate wakeup
> > sources, such that writing a name and optionally a timeout to
> > wake_lock causes the wakeup source of that name to be activated,
> > optionally with the given timeout.  If that wakeup source doesn't
> > exist, it will be created and then activated.  Writing a name to
> > wake_unlock causes the wakeup source of that name, if there is one,
> > to be deactivated.  Wakeup sources created with the help of
> > wake_lock that haven't been used for more than 5 minutes are garbage
> > collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
> 
> I think it would be better if the garbage collection and limit was
> configurable and optional. I would probably turn both features off
> since I do not want to chase down bugs because a wakelock was ignored,
> and I think the garbage collection will erase stats that we care
> about.

OK, but would you mind if I added the configurability as a separate incremental
patch?

Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating wakeup sources
  2012-04-27  0:07               ` NeilBrown
@ 2012-04-27 21:15                 ` Rafael J. Wysocki
  0 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-27 21:15 UTC (permalink / raw)
  To: NeilBrown
  Cc: John Stultz, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, Arve Hjønnevåg, John Stultz,
	Brian Swetland, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Friday, April 27, 2012, NeilBrown wrote:
> On Fri, 27 Apr 2012 00:04:27 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > ---
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > Subject: PM / Sleep: Add user space interface for manipulating wakeup sources, v3
> 
> Reviewed-by: NeilBrown <neilb@suse.de>

Thanks!

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating wakeup sources
  2012-04-27 21:14                 ` Rafael J. Wysocki
@ 2012-04-27 21:17                   ` Arve Hjønnevåg
  2012-04-27 21:34                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-04-27 21:17 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: NeilBrown, John Stultz, Linux PM list, LKML, Magnus Damm,
	markgross, Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

2012/4/27 Rafael J. Wysocki <rjw@sisk.pl>:
> On Friday, April 27, 2012, Arve Hjønnevåg wrote:
>> 2012/4/26 Rafael J. Wysocki <rjw@sisk.pl>:
>> ...
>> > ---
>> > From: Rafael J. Wysocki <rjw@sisk.pl>
>> > Subject: PM / Sleep: Add user space interface for manipulating wakeup sources, v3
>> >
>> > Android allows user space to manipulate wakelocks using two
>> > sysfs file located in /sys/power/, wake_lock and wake_unlock.
>> > Writing a wakelock name and optionally a timeout to the wake_lock
>> > file causes the wakelock whose name was written to be acquired (it
>> > is created before is necessary), optionally with the given timeout.
>> > Writing the name of a wakelock to wake_unlock causes that wakelock
>> > to be released.
>> >
>> > Implement an analogous interface for user space using wakeup sources.
>> > Add the /sys/power/wake_lock and /sys/power/wake_unlock files
>> > allowing user space to create, activate and deactivate wakeup
>> > sources, such that writing a name and optionally a timeout to
>> > wake_lock causes the wakeup source of that name to be activated,
>> > optionally with the given timeout.  If that wakeup source doesn't
>> > exist, it will be created and then activated.  Writing a name to
>> > wake_unlock causes the wakeup source of that name, if there is one,
>> > to be deactivated.  Wakeup sources created with the help of
>> > wake_lock that haven't been used for more than 5 minutes are garbage
>> > collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
>>
>> I think it would be better if the garbage collection and limit was
>> configurable and optional. I would probably turn both features off
>> since I do not want to chase down bugs because a wakelock was ignored,
>> and I think the garbage collection will erase stats that we care
>> about.
>
> OK, but would you mind if I added the configurability as a separate incremental
> patch?
>

That is fine with me.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 5/8] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-04-27  3:49           ` Arve Hjønnevåg
@ 2012-04-27 21:18             ` Rafael J. Wysocki
  2012-04-27 23:26               ` [PATCH] " Arve Hjønnevåg
  2012-04-30  1:58             ` [RFC][PATCH 5/8] " NeilBrown
  1 sibling, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-27 21:18 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: NeilBrown, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Friday, April 27, 2012, Arve Hjønnevåg wrote:
> 2012/4/26 Rafael J. Wysocki <rjw@sisk.pl>:
> > On Thursday, April 26, 2012, NeilBrown wrote:
> >> On Sun, 22 Apr 2012 23:22:43 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> >>
> >> > From: Arve Hjønnevåg <arve@android.com>
> >> >
> >> > When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
> >> > wakeup_source will be active to prevent suspend. This can be used to
> >> > handle wakeup events from a driver that support poll, e.g. input, if
> >> > that driver wakes up the waitqueue passed to epoll before allowing
> >> > suspend.
> >> >
> >> > The current implementation uses an extra wakeup_source when
> >> > ep_scan_ready_list runs. This can cause problems if a single thread
> >> > is polling on wakeup events and frequent non-wakeup events (events
> >> > usually arrive during thread freezing) using the same epoll file.
> >>
> >> This is quite neat.
> >>
> >> If I understand it correctly, you register file descriptors with epoll_ctl()
> >> on an fd created with epoll_create(), and set the new EPOLLWAKEUP flag.
> >> Then when a regular 'poll' or 'select' on the epoll fd reports that it is
> >> readable you:
> 
> I think it makes more sense to use epoll_wait than mixing this with
> select or poll.
> 
> >>   - get a wakelock
> This may not be needed, since epoll does not reevaluate its state
> until you call into it again (at least using epoll_wait).
> 
> >>   - use epoll_wait to collect the events
> >>   - process the events
> >>   - release your wakelock
> >>   - go back to poll() or select() on the epoll fd.
> >> Correct?  As long as there are ready events with EPOLLWAKEUP set a
> >> wakeup_source is held active and the system won't go to sleep.
> >>
> >> My concern with this is about permissions.  It appears that any process could
> >> wait of some fd (maybe a pipe they created themselves) with EPOLLWAKEUP, and
> >> then simply never epoll_wait() for the event.  Then they would be keeping
> >> the system awake.  I don't think that is acceptable.
> >
> > I wonder what Arve has to say to that, but let me say that on systems without
> > autosleep every process can go into an infinite busy loop which is going to
> > drain battery relatively quickly just as well and I don't see why that's so
> > much different.
> >
> 
> I still think is useful to limit access to this feature. On a phone, a
> process stuck in an infinite loop will increase battery drain, but if
> this process does not have permission to prevent suspend, then this is
> only catastrophic if another process that have that permission is
> preventing suspend. I think we should add a capability for this.
> Assuming you agree,

I do.

> do want me to create a separate patch for that
> adds a capability, or roll it into this one.

Please roll it into this one, if that's not a problem.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
  2012-04-27  0:39           ` NeilBrown
@ 2012-04-27 21:22             ` Rafael J. Wysocki
  0 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-27 21:22 UTC (permalink / raw)
  To: NeilBrown
  Cc: Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	Greg KH, Arve Hjønnevåg, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Friday, April 27, 2012, NeilBrown wrote:
> On Thu, 26 Apr 2012 23:52:42 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> 
> > On Thursday, April 26, 2012, NeilBrown wrote:
> > > On Sun, 22 Apr 2012 23:23:23 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > > 
> > > > From: "Rafael J. Wysocki" <rjw@sisk.pl>
> > > > To: Linux PM list <linux-pm@vger.kernel.org>
> > > > Cc: LKML <linux-kernel@vger.kernel.org>, Magnus Damm <magnus.damm@gmail.com>, markgross@thegnar.org, Matthew Garrett <mjg@redhat.com>, Greg KH <gregkh@linuxfoundation.org>, Arve Hjønnevåg <arve@android.com>, John Stultz <john.stultz@linaro.org>, Brian Swetland <swetland@google.com>, Neil Brown <neilb@suse.de>, Alan Stern <stern@rowland.harvard.edu>, Dmitry Torokhov <dmitry.torokhov@gmail.com>, "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
> > > > Subject: [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
> > > > Date: Sun, 22 Apr 2012 23:23:23 +0200
> > > > Sender: linux-kernel-owner@vger.kernel.org
> > > > User-Agent: KMail/1.13.6 (Linux/3.4.0-rc3+; KDE/4.6.0; x86_64; ; )
> > > > 
> > > > From: Rafael J. Wysocki <rjw@sisk.pl>
> > > > 
> > > > Introduce a mechanism by which the kernel can trigger global
> > > > transitions to a sleep state chosen by user space if there are no
> > > > active wakeup sources.
> > > 
> > > Hi Rafael,
> > 
> > Hi,
> > 
> > >  just a few little issues below.  Over all I think that if we have to have
> > >  auto-sleep in the kernel, then this is a good way to do it.
> > 
> > Good, we seem to agree in principle, then. :-)
> > 
> > > > +static void try_to_suspend(struct work_struct *work)
> > > > +{
> > > > +	unsigned int initial_count, final_count;
> > > > +
> > > > +	if (!pm_get_wakeup_count(&initial_count, true))
> > > > +		goto out;
> > > > +
> > > > +	mutex_lock(&autosleep_lock);
> > > > +
> > > > +	if (!pm_save_wakeup_count(initial_count)) {
> > > > +		mutex_unlock(&autosleep_lock);
> > > > +		goto out;
> > > > +	}
> > > > +
> > > > +	if (autosleep_state == PM_SUSPEND_ON) {
> > > > +		mutex_unlock(&autosleep_lock);
> > > > +		return;
> > > > +	}
> > > > +	if (autosleep_state >= PM_SUSPEND_MAX)
> > > > +		hibernate();
> > > > +	else
> > > > +		pm_suspend(autosleep_state);
> > > > +
> > > > +	mutex_unlock(&autosleep_lock);
> > > > +
> > > > +	if (!pm_get_wakeup_count(&final_count, false))
> > > > +		goto out;
> > > > +
> > > > +	if (final_count == initial_count)
> > > > +		schedule_timeout(HZ / 2);
> > > 
> > > This doesn't do what you seem to expect it to do.
> > > You need to set current->state to something like TASK_UNINTERRUPTIBLE
> > > before calling schedule_timeout, otherwise it is effectily a no-op.
> > > schedule_timeout_uninterruptible(), for example, will do this for you.
> > 
> > Right.  I obviously overlooked the missing state change.
> > 
> > > However the value of this isn't clear to me, so a comment would probably be a
> > > good thing.
> > > This continue presumably fires if we wake up without any wakeup sources
> > > being activated.  In that case you want to delay for 500ms - presumably to
> > > avoid a tight suspend/resume loop if something goes wrong?
> > 
> > Yes.
> > 
> > > I have occasionally seen a stray/uninteresting interrupt wake from suspend
> > > immediately after entering suspend and the next attempt succeeds.  Maybe this
> > > is a bug in some driver somewhere, but not a big one.  I think I would rather
> > > in that case that we attempt to re-enter suspend immediately.  Maybe after a
> > > few failed attempts it makes sense to back off.
> > 
> > Perhaps.  We can adjust this particular thing later, I think.
> > 
> > > The other question is: if we want to back-off, is 500ms really enough?  What
> > > will be gained by, or could be achieved in, that time?  An exponential
> > > back-off might be defensible, but I can't see the value of a 500ms fixed
> > > back-off.
> > > However if you can, I'd love to see a comment in there explaining it.
> > 
> > Sure.
> > 
> > > > +
> > > > + out:
> > > > +	queue_up_suspend_work();
> > > > +}
> > > > +
> > > 
> > > 
> > > > +
> > > > +int pm_autosleep_set_state(suspend_state_t state)
> > > > +{
> > > > +
> > > > +#ifndef CONFIG_HIBERNATION
> > > > +	if (state >= PM_SUSPEND_MAX)
> > > > +		return -EINVAL;
> > > > +#endif
> > > > +
> > > > +	__pm_stay_awake(autosleep_ws);
> > > > +
> > > > +	mutex_lock(&autosleep_lock);
> > > > +
> > > > +	autosleep_state = state;
> > > > +
> > > > +	__pm_relax(autosleep_ws);
> > > 
> > > I'm struggling to see the point of the autosleep_ws.
> > > 
> > > A suspend cannot actually happen while this code is running (can it?) because
> > > it will wait for the process to enter the freezer.
> > > So the only effect of this is:
> > >   1/ cause the current auto-sleep cycle to abort and
> > >   2/ maybe add some accounting number is the autosleep_ws.
> > > Is that right?
> > > Which of these is needed?
> > 
> > This is to solve a problem when user space attempts to echo "off" to
> > /sys/power/autosleep exactly when pm_suspend() is initiated as a part
> > of autosleep under the autosleep lock.  In that case, if autosleep_ws is not
> > there, the process wanting to disable autosleep will have to wait for the
> > pm_suspend() to complete (unless it holds a wakelock), which is suboptimal.
> > 
> > > I would imagine that any process writing to /sys/power/autosleep would be
> > > holding a wakelock, and if it didn't it should expect things to be racy...
> > > 
> > > Am I missing something?
> > 
> > The assumption above is kind of optimistic in my opinion.  That process
> > very well may be a system administrator's bash, for example. :-)
> 
> If it is, then presumably the auto-sleep could kick in between any pair of
> keystrokes that the sysadmin types.  Or between the final 'enter' and when the
> write() system call begins.  All that autosleep_ws seems to provide is
> certainty that when the write() system call completes, autosleep will be
> fully disabled.
> I don't think that is really worth anything.
> 
> However, something did occur to me that I would like clarified.
> What happens if try_to_suspend() gets the autosleep_lock just before
> wakeup_count_store(), state_store() or pm_autosleep_set_state()
> try to get it?
> For pm_autosleep_set_state() the try_to_suspend() attempt will abort because
> it is holding autosleep_ws, so it will drop the lock and
> pm_autosleep_set_state() will continue happily.
> For the other two, what will happen (if there are no active wakesources and
> autosleep is enabled).
> I'm guessing that try_to_suspend will try to freeze all the process, which
> sends a pseudo signal to all processes, so the mutex_lock_interruptible will
> fail and the suspend will complete.
> Then will the aborted write() system call be re-attempted?
> 
> If that is right, then here is a very clear need to autosleep_ws:  it prevents
> a deadlock.

Yes, I think that this is the case.

> So it appears there is a very real need for autosleep_ws that even I can
> agree with.  It seems subtle though and could usefully be documented:
> 
> /* Note: it is only safe to mutex_lock(&autosleep_lock) if a wakeup_source
>  * is active, otherwise a deadlock with try_to_suspend() is possible.
>  * Alternatively mutex_lock_interruptible() can be used.  This will then fail
>  * if an auto_sleep cycle tries to freeze processes.
>  */

I'll add the comment above if you don't mind. :-)

> static DEFINE_MUTEX(autosleep_lock);
> 
> So:
>   Reviewed-by: NeilBrown <neilb@suse.de>

Thanks!

Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating wakeup sources
  2012-04-27 21:17                   ` Arve Hjønnevåg
@ 2012-04-27 21:34                     ` Rafael J. Wysocki
  2012-05-03 19:29                       ` [PATCH 0/2]: Kconfig options for wakelocks limit and gc (was: Re: [RFC][PATCH 8/8] PM / Sleep: Add user space ...) Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-04-27 21:34 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: NeilBrown, John Stultz, Linux PM list, LKML, Magnus Damm,
	markgross, Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Friday, April 27, 2012, Arve Hjønnevåg wrote:
> 2012/4/27 Rafael J. Wysocki <rjw@sisk.pl>:
> > On Friday, April 27, 2012, Arve Hjønnevåg wrote:
> >> 2012/4/26 Rafael J. Wysocki <rjw@sisk.pl>:
> >> ...
> >> > ---
> >> > From: Rafael J. Wysocki <rjw@sisk.pl>
> >> > Subject: PM / Sleep: Add user space interface for manipulating wakeup sources, v3
> >> >
> >> > Android allows user space to manipulate wakelocks using two
> >> > sysfs file located in /sys/power/, wake_lock and wake_unlock.
> >> > Writing a wakelock name and optionally a timeout to the wake_lock
> >> > file causes the wakelock whose name was written to be acquired (it
> >> > is created before is necessary), optionally with the given timeout.
> >> > Writing the name of a wakelock to wake_unlock causes that wakelock
> >> > to be released.
> >> >
> >> > Implement an analogous interface for user space using wakeup sources.
> >> > Add the /sys/power/wake_lock and /sys/power/wake_unlock files
> >> > allowing user space to create, activate and deactivate wakeup
> >> > sources, such that writing a name and optionally a timeout to
> >> > wake_lock causes the wakeup source of that name to be activated,
> >> > optionally with the given timeout.  If that wakeup source doesn't
> >> > exist, it will be created and then activated.  Writing a name to
> >> > wake_unlock causes the wakeup source of that name, if there is one,
> >> > to be deactivated.  Wakeup sources created with the help of
> >> > wake_lock that haven't been used for more than 5 minutes are garbage
> >> > collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
> >>
> >> I think it would be better if the garbage collection and limit was
> >> configurable and optional. I would probably turn both features off
> >> since I do not want to chase down bugs because a wakelock was ignored,
> >> and I think the garbage collection will erase stats that we care
> >> about.
> >
> > OK, but would you mind if I added the configurability as a separate incremental
> > patch?
> >
> 
> That is fine with me.

Cool, thanks!

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-04-27 21:18             ` Rafael J. Wysocki
@ 2012-04-27 23:26               ` Arve Hjønnevåg
  0 siblings, 0 replies; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-04-27 23:26 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Arve Hjønnevåg, NeilBrown, Linux PM list, LKML,
	Magnus Damm, markgross, Matthew Garrett, Greg KH, John Stultz,
	Brian Swetland, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
wakeup_source will be active to prevent suspend. This can be used to
handle wakeup events from a driver that support poll, e.g. input, if
that driver wakes up the waitqueue passed to epoll before allowing
suspend.

The current implementation uses an extra wakeup_source when
ep_scan_ready_list runs. This can cause problems if a single thread
is polling on wakeup events and frequent non-wakeup events (events
usually arrive during thread freezing) using the same epoll file.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 fs/eventpoll.c             |   75 ++++++++++++++++++++++++++++++++++++++++++--
 include/linux/capability.h |    5 ++-
 include/linux/eventpoll.h  |    6 +++
 3 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 739b098..16718f6 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -33,6 +33,7 @@
 #include <linux/bitops.h>
 #include <linux/mutex.h>
 #include <linux/anon_inodes.h>
+#include <linux/device.h>
 #include <asm/uaccess.h>
 #include <asm/io.h>
 #include <asm/mman.h>
@@ -87,7 +88,7 @@
  */
 
 /* Epoll private bits inside the event mask */
-#define EP_PRIVATE_BITS (EPOLLONESHOT | EPOLLET)
+#define EP_PRIVATE_BITS (EPOLLWAKEUP | EPOLLONESHOT | EPOLLET)
 
 /* Maximum number of nesting allowed inside epoll sets */
 #define EP_MAX_NESTS 4
@@ -154,6 +155,9 @@ struct epitem {
 	/* List header used to link this item to the "struct file" items list */
 	struct list_head fllink;
 
+	/* wakeup_source used when EPOLLWAKEUP is set */
+	struct wakeup_source *ws;
+
 	/* The structure that describe the interested events and the source fd */
 	struct epoll_event event;
 };
@@ -194,6 +198,9 @@ struct eventpoll {
 	 */
 	struct epitem *ovflist;
 
+	/* wakeup_source used when ep_scan_ready_list is running */
+	struct wakeup_source *ws;
+
 	/* The user that created the eventpoll descriptor */
 	struct user_struct *user;
 
@@ -565,6 +572,7 @@ static int ep_scan_ready_list(struct eventpoll *ep,
 	 * in a lockless way.
 	 */
 	spin_lock_irqsave(&ep->lock, flags);
+	__pm_stay_awake(ep->ws);
 	list_splice_init(&ep->rdllist, &txlist);
 	ep->ovflist = NULL;
 	spin_unlock_irqrestore(&ep->lock, flags);
@@ -588,8 +596,10 @@ static int ep_scan_ready_list(struct eventpoll *ep,
 		 * queued into ->ovflist but the "txlist" might already
 		 * contain them, and the list_splice() below takes care of them.
 		 */
-		if (!ep_is_linked(&epi->rdllink))
+		if (!ep_is_linked(&epi->rdllink)) {
 			list_add_tail(&epi->rdllink, &ep->rdllist);
+			__pm_stay_awake(epi->ws);
+		}
 	}
 	/*
 	 * We need to set back ep->ovflist to EP_UNACTIVE_PTR, so that after
@@ -602,6 +612,7 @@ static int ep_scan_ready_list(struct eventpoll *ep,
 	 * Quickly re-inject items left on "txlist".
 	 */
 	list_splice(&txlist, &ep->rdllist);
+	__pm_relax(ep->ws);
 
 	if (!list_empty(&ep->rdllist)) {
 		/*
@@ -656,6 +667,9 @@ static int ep_remove(struct eventpoll *ep, struct epitem *epi)
 		list_del_init(&epi->rdllink);
 	spin_unlock_irqrestore(&ep->lock, flags);
 
+	if (epi->ws)
+		wakeup_source_unregister(epi->ws);
+
 	/* At this point it is safe to free the eventpoll item */
 	kmem_cache_free(epi_cache, epi);
 
@@ -706,6 +720,8 @@ static void ep_free(struct eventpoll *ep)
 	mutex_unlock(&epmutex);
 	mutex_destroy(&ep->mtx);
 	free_uid(ep->user);
+	if (ep->ws)
+		wakeup_source_unregister(ep->ws);
 	kfree(ep);
 }
 
@@ -737,6 +753,7 @@ static int ep_read_events_proc(struct eventpoll *ep, struct list_head *head,
 			 * callback, but it's not actually ready, as far as
 			 * caller requested events goes. We can remove it here.
 			 */
+			__pm_relax(epi->ws);
 			list_del_init(&epi->rdllink);
 		}
 	}
@@ -932,8 +949,10 @@ static int ep_poll_callback(wait_queue_t *wait, unsigned mode, int sync, void *k
 	}
 
 	/* If this file is already in the ready list we exit soon */
-	if (!ep_is_linked(&epi->rdllink))
+	if (!ep_is_linked(&epi->rdllink)) {
 		list_add_tail(&epi->rdllink, &ep->rdllist);
+		__pm_stay_awake(epi->ws);
+	}
 
 	/*
 	 * Wake up ( if active ) both the eventpoll wait list and the ->poll()
@@ -1091,6 +1110,30 @@ static int reverse_path_check(void)
 	return error;
 }
 
+static int ep_create_wakeup_source(struct epitem *epi)
+{
+	const char *name;
+
+	if (!epi->ep->ws) {
+		epi->ep->ws = wakeup_source_register("eventpoll");
+		if (!epi->ep->ws)
+			return -ENOMEM;
+	}
+
+	name = epi->ffd.file->f_path.dentry->d_name.name;
+	epi->ws = wakeup_source_register(name);
+	if (!epi->ws)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void ep_destroy_wakeup_source(struct epitem *epi)
+{
+	wakeup_source_unregister(epi->ws);
+	epi->ws = NULL;
+}
+
 /*
  * Must be called with "mtx" held.
  */
@@ -1118,6 +1161,13 @@ static int ep_insert(struct eventpoll *ep, struct epoll_event *event,
 	epi->event = *event;
 	epi->nwait = 0;
 	epi->next = EP_UNACTIVE_PTR;
+	if (epi->event.events & EPOLLWAKEUP) {
+		error = ep_create_wakeup_source(epi);
+		if (error)
+			goto error_create_wakeup_source;
+	} else {
+		epi->ws = NULL;
+	}
 
 	/* Initialize the poll table using the queue callback */
 	epq.epi = epi;
@@ -1164,6 +1214,7 @@ static int ep_insert(struct eventpoll *ep, struct epoll_event *event,
 	/* If the file is already "ready" we drop it inside the ready list */
 	if ((revents & event->events) && !ep_is_linked(&epi->rdllink)) {
 		list_add_tail(&epi->rdllink, &ep->rdllist);
+		__pm_stay_awake(epi->ws);
 
 		/* Notify waiting tasks that events are available */
 		if (waitqueue_active(&ep->wq))
@@ -1204,6 +1255,10 @@ error_unregister:
 		list_del_init(&epi->rdllink);
 	spin_unlock_irqrestore(&ep->lock, flags);
 
+	if (epi->ws)
+		wakeup_source_unregister(epi->ws);
+
+error_create_wakeup_source:
 	kmem_cache_free(epi_cache, epi);
 
 	return error;
@@ -1229,6 +1284,12 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_even
 	epi->event.events = event->events;
 	pt._key = event->events;
 	epi->event.data = event->data; /* protected by mtx */
+	if (epi->event.events & EPOLLWAKEUP) {
+		if (!epi->ws)
+			ep_create_wakeup_source(epi);
+	} else if (epi->ws) {
+		ep_destroy_wakeup_source(epi);
+	}
 
 	/*
 	 * Get current event bits. We can safely use the file* here because
@@ -1244,6 +1305,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_even
 		spin_lock_irq(&ep->lock);
 		if (!ep_is_linked(&epi->rdllink)) {
 			list_add_tail(&epi->rdllink, &ep->rdllist);
+			__pm_stay_awake(epi->ws);
 
 			/* Notify waiting tasks that events are available */
 			if (waitqueue_active(&ep->wq))
@@ -1282,6 +1344,7 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
 	     !list_empty(head) && eventcnt < esed->maxevents;) {
 		epi = list_first_entry(head, struct epitem, rdllink);
 
+		__pm_relax(epi->ws);
 		list_del_init(&epi->rdllink);
 
 		pt._key = epi->event.events;
@@ -1298,6 +1361,7 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
 			if (__put_user(revents, &uevent->events) ||
 			    __put_user(epi->event.data, &uevent->data)) {
 				list_add(&epi->rdllink, head);
+				__pm_stay_awake(epi->ws);
 				return eventcnt ? eventcnt : -EFAULT;
 			}
 			eventcnt++;
@@ -1317,6 +1381,7 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
 				 * poll callback will queue them in ep->ovflist.
 				 */
 				list_add_tail(&epi->rdllink, &ep->rdllist);
+				__pm_stay_awake(epi->ws);
 			}
 		}
 	}
@@ -1629,6 +1694,10 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd,
 	if (!tfile->f_op || !tfile->f_op->poll)
 		goto error_tgt_fput;
 
+	/* Check if EPOLLWAKEUP is allowed */
+	if ((epds.events & EPOLLWAKEUP) && !capable(CAP_EPOLLWAKEUP))
+		goto error_tgt_fput;
+
 	/*
 	 * We have to check that the file structure underneath the file descriptor
 	 * the user passed to us _is_ an eventpoll file. And also we do not permit
diff --git a/include/linux/capability.h b/include/linux/capability.h
index 12d52de..222974a 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -360,8 +360,11 @@ struct cpu_vfs_cap_data {
 
 #define CAP_WAKE_ALARM            35
 
+/* Allow preventing automatic system suspends while epoll events are pending */
 
-#define CAP_LAST_CAP         CAP_WAKE_ALARM
+#define CAP_EPOLLWAKEUP      36
+
+#define CAP_LAST_CAP         CAP_EPOLLWAKEUP
 
 #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
 
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 657ab55..520a57c 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -26,6 +26,12 @@
 #define EPOLL_CTL_DEL 2
 #define EPOLL_CTL_MOD 3
 
+/*
+ * Request the handling of system wakeup events so as to prevent automatic
+ * system suspends from happening while those events are being processed.
+ */
+#define EPOLLWAKEUP (1 << 29)
+
 /* Set the One Shot behaviour for the target file descriptor */
 #define EPOLLONESHOT (1 << 30)
 
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 5/8] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-04-27  3:49           ` Arve Hjønnevåg
  2012-04-27 21:18             ` Rafael J. Wysocki
@ 2012-04-30  1:58             ` NeilBrown
  2012-05-01  0:52               ` Arve Hjønnevåg
  1 sibling, 1 reply; 129+ messages in thread
From: NeilBrown @ 2012-04-30  1:58 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Rafael J. Wysocki, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

[-- Attachment #1: Type: text/plain, Size: 9270 bytes --]

On Thu, 26 Apr 2012 20:49:51 -0700 Arve Hjønnevåg <arve@android.com> wrote:

> 2012/4/26 Rafael J. Wysocki <rjw@sisk.pl>:
> > On Thursday, April 26, 2012, NeilBrown wrote:
> >> On Sun, 22 Apr 2012 23:22:43 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> >>
> >> > From: Arve Hjønnevåg <arve@android.com>
> >> >
> >> > When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
> >> > wakeup_source will be active to prevent suspend. This can be used to
> >> > handle wakeup events from a driver that support poll, e.g. input, if
> >> > that driver wakes up the waitqueue passed to epoll before allowing
> >> > suspend.
> >> >
> >> > The current implementation uses an extra wakeup_source when
> >> > ep_scan_ready_list runs. This can cause problems if a single thread
> >> > is polling on wakeup events and frequent non-wakeup events (events
> >> > usually arrive during thread freezing) using the same epoll file.
> >>
> >> This is quite neat.
> >>
> >> If I understand it correctly, you register file descriptors with epoll_ctl()
> >> on an fd created with epoll_create(), and set the new EPOLLWAKEUP flag.
> >> Then when a regular 'poll' or 'select' on the epoll fd reports that it is
> >> readable you:
> 
> I think it makes more sense to use epoll_wait than mixing this with
> select or poll.
> 
> >>   - get a wakelock
> This may not be needed, since epoll does not reevaluate its state
> until you call into it again (at least using epoll_wait).
> 
> >>   - use epoll_wait to collect the events
> >>   - process the events
> >>   - release your wakelock
> >>   - go back to poll() or select() on the epoll fd.
> >> Correct?  As long as there are ready events with EPOLLWAKEUP set a
> >> wakeup_source is held active and the system won't go to sleep.
> >>
> >> My concern with this is about permissions.  It appears that any process could
> >> wait of some fd (maybe a pipe they created themselves) with EPOLLWAKEUP, and
> >> then simply never epoll_wait() for the event.  Then they would be keeping
> >> the system awake.  I don't think that is acceptable.
> >
> > I wonder what Arve has to say to that, but let me say that on systems without
> > autosleep every process can go into an infinite busy loop which is going to
> > drain battery relatively quickly just as well and I don't see why that's so
> > much different.
> >
> 
> I still think is useful to limit access to this feature. On a phone, a
> process stuck in an infinite loop will increase battery drain, but if
> this process does not have permission to prevent suspend, then this is
> only catastrophic if another process that have that permission is
> preventing suspend. I think we should add a capability for this.
> Assuming you agree, do want me to create a separate patch for that
> adds a capability, or roll it into this one.
> 
> >> So there needs to be some way to limit who can effectively block suspend by
> >> using EPOLLWAKEUP.
> >> (This is one of the reasons I like an all-user-space solution.  Policy issues
> >> like this can easily be decided in user-space but are clumsy to put into the
> >> kernel).
> >>
> >> Also, I'm having trouble understanding the ep->ws wakeup_source.
> >> The epi->ws makes lots of sense and I think I understand it all.
> >> However I don't see why you need a wakeup_source for the 'struct eventpoll'.
> >>
> >> Every time that 'poll' decides to call the ->poll fop for the eventpoll, this
> >> wakeup_source will be activated and deactivated which will abort any current
> >> suspend cycle even if there are no events to report.
> >>
> >> I suspect it can just go away.
> >
> > I'll leave this one entirely to Arve, if you don't mind. :-)
> >
> 
> I keep the wakeup-source active whenever the epitem is on a list
> (ep->rdllist or the local txlist). The temporary txlist is modified
> without holding the lock that protects ep->rdllist. It is easier to
> use a separate wakeup source to prevent suspend while this list is
> manipulated than trying to maintain the wakeup-source state in a
> different way than the existing eventpoll state. I think this only
> causes real problems if the same epoll file is used for frequent
> non-wakeup events (e.g. a gyro) and wakeup events. You should be able
> to work around this by using two epoll files.

Thanks for the explanation.  I can now see more clearly how your patch works.
I can also see why you might need the ep->ws wakeup_source.  However I don't
like it.

If it acted purely as a lock and prevented suspend while it was active then
it would be fine.  However it doesn't.  It also aborts any current suspend
attempt - so it is externally visible.
The way your code it written, *any* call to epoll_wait will abort the current
suspend cycle, even if it is called by a completely non-privileged user.
That may not obviously be harmful, but it makes the precise semantics of the
system call something quite non-obvious, and it is much better to have a very
clean semantic.
As you say, it can probably be worked-around but code is much safer when you
don't need to work-around things.

I see two alternatives:
1/ set the 'wakeup' flag on the whole epoll-fd, not on the individual events
   that it is asked to monitor.  i.e. add a new flag to epoll_create1()
   instead of to epoll_ctl events.
   Then you just need a single wakeup_source for the fd which is active
   whenever any event is ready.

   This interface might be generally nicer, I'm not sure.

2/ Find a way to get rid of ep->ws.
   Thinking about it more, I again think it isn't needed.
   The reason is that suspend is already exclusive with any process running in
   kernel context.
   One of the first things suspend does is to freeze all process and (for
   regular non-kernel-thread processes) this happens by sending a virtual
   signal which is acted up when the process returns from a system call or
   returns from a context switch.  So while any given system call is running
   (e.g. epoll_wait) suspend is blocked.  When epoll_wait sets
   TASK_INTERRUPTIBLE the 'freeze' signal will interrupt it of course, but
   this is the only point where suspend can interfere with epoll_wait, and you
   aren't holding ep->ws then anyway.
   Hopefully Rafael will correct me if I got that outline wrong.  But even if 
   I did, I think we need to get rid of ep->ws.

Also, I think it is important to clearly document how to use this safely.
You suggested that if any EPOLLWAKEUP event is ready, then suspend will
remain disabled not only until the event is handled, but also until the next
call to epoll_wait.  That sounds like very useful semantics, but it isn't at
all explicit in the patch.  I think it should be made very clear in
eventpoll.h how the flag can be used. (and then eventually get this into a
man page of course).

> 
> >> One last item that doesn't really belong here - but it is in context.
> >>
> >> This mechanism is elegant because it provides a single implementation that
> >> provides wakeup_source for almost any sort of device.  I would like to do the
> >> same thing for interrupts.
> >> Most (maybe all) of the wakeup device on my phone have an interrupt where the
> >> body is run in a thread.  When the thread has done it's work the event is
> >> visible to userspace so the EPOLLWAKEUP mechanism is all that is needed to
> >> complete the path to user-space (or for my user-space solution, nothing else
> >> is needed once it is visible to user-space).
> >> So we just need to ensure a clear path from the "top half" interrupt handler
> >> to the threaded handler.
> >> So I imagine attaching a wakeup source to every interrupt for which 'wakeup'
> >> is enabled, activating it when the top-half starts and relaxing it when the
> >> bottom-half completes.  With this in place, almost all drivers would get
> >> wakeup_source handling for free.
> >> Does this seem reasonable to you.
> >
> > Yes, it does.
> >
> 
> How useful is that? Suspend already synchronizes with interrupt
> handlers and will not proceed until they have returned. Are threaded
> interrupts handlers not always run at that stage? For drivers that use
> work-queues instead of a threaded interrupt handler, I think the
> suspend-blocking work-queue patch I wrote a while back is convenient.
> 

Maybe it isn't useful at all - I'm still working this stuff out.

Yes, threaded interrupts are run "straight away", but what exactly does that
mean?  And in particular, is there any interlocking to ensure they run
before suspend gets stop the CPU?  Maybe the scheduling priority of the
different threads is enough to make sure this works, as irq_threads are
SCHED_FIFO and  the suspending thread almost certainly isn't.  But is that
still a guarantee on an SMP machine?  irq_threads aren't freezable so suspend
won't block on them for that reason..

I really just want to be sure that some interlock is in place to ensure that
the threaded interrupt handler runs before suspend absolutely commits to
suspending.  If that is already the case, when what I suggest isn't needed as
you suggest.  Do you know of such an interlock?

Thanks,
NeilBrown



[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 5/8] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-04-30  1:58             ` [RFC][PATCH 5/8] " NeilBrown
@ 2012-05-01  0:52               ` Arve Hjønnevåg
  2012-05-01  2:18                 ` NeilBrown
  2012-05-01  5:33                 ` [PATCH] " Arve Hjønnevåg
  0 siblings, 2 replies; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-05-01  0:52 UTC (permalink / raw)
  To: NeilBrown
  Cc: Rafael J. Wysocki, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Sun, Apr 29, 2012 at 6:58 PM, NeilBrown <neilb@suse.de> wrote:
> On Thu, 26 Apr 2012 20:49:51 -0700 Arve Hjønnevåg <arve@android.com> wrote:
...
>> I keep the wakeup-source active whenever the epitem is on a list
>> (ep->rdllist or the local txlist). The temporary txlist is modified
>> without holding the lock that protects ep->rdllist. It is easier to
>> use a separate wakeup source to prevent suspend while this list is
>> manipulated than trying to maintain the wakeup-source state in a
>> different way than the existing eventpoll state. I think this only
>> causes real problems if the same epoll file is used for frequent
>> non-wakeup events (e.g. a gyro) and wakeup events. You should be able
>> to work around this by using two epoll files.
>
> Thanks for the explanation.  I can now see more clearly how your patch works.
> I can also see why you might need the ep->ws wakeup_source.  However I don't
> like it.
>
> If it acted purely as a lock and prevented suspend while it was active then
> it would be fine.  However it doesn't.  It also aborts any current suspend
> attempt - so it is externally visible.
> The way your code it written, *any* call to epoll_wait will abort the current
> suspend cycle, even if it is called by a completely non-privileged user.

With the patch I posted Friday, a non-privileged user will not be able
to pass EPOLLWAKEUP and have the wakeup-source created.

> That may not obviously be harmful, but it makes the precise semantics of the
> system call something quite non-obvious, and it is much better to have a very
> clean semantic.
> As you say, it can probably be worked-around but code is much safer when you
> don't need to work-around things.
>
> I see two alternatives:
> 1/ set the 'wakeup' flag on the whole epoll-fd, not on the individual events
>   that it is asked to monitor.  i.e. add a new flag to epoll_create1()
>   instead of to epoll_ctl events.
>   Then you just need a single wakeup_source for the fd which is active
>   whenever any event is ready.
>
>   This interface might be generally nicer, I'm not sure.
>
> 2/ Find a way to get rid of ep->ws.
>   Thinking about it more, I again think it isn't needed.
>   The reason is that suspend is already exclusive with any process running in
>   kernel context.
>   One of the first things suspend does is to freeze all process and (for
>   regular non-kernel-thread processes) this happens by sending a virtual
>   signal which is acted up when the process returns from a system call or
>   returns from a context switch.  So while any given system call is running
>   (e.g. epoll_wait) suspend is blocked.  When epoll_wait sets
>   TASK_INTERRUPTIBLE the 'freeze' signal will interrupt it of course, but
>   this is the only point where suspend can interfere with epoll_wait, and you
>   aren't holding ep->ws then anyway.
>   Hopefully Rafael will correct me if I got that outline wrong.  But even if
>   I did, I think we need to get rid of ep->ws.
>

If ep_scan_ready_list is only called from freezable threads, then
ep->ws is not strictly needed, but without it another suspend attempt
will be triggered if there are not other wakeup-sources active. I'm
also not sure if it could get called from a non-freezable thread since
other subsystems can call it through the poll hook.

A third option is to only activate ep->ws when needed. This may may work:
---
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 16718f6..beb7138 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -572,7 +572,6 @@ static int ep_scan_ready_list(struct eventpoll *ep,
 	 * in a lockless way.
 	 */
 	spin_lock_irqsave(&ep->lock, flags);
-	__pm_stay_awake(ep->ws);
 	list_splice_init(&ep->rdllist, &txlist);
 	ep->ovflist = NULL;
 	spin_unlock_irqrestore(&ep->lock, flags);
@@ -753,6 +752,8 @@ static int ep_read_events_proc(struct eventpoll
*ep, struct list_head *head,
 			 * callback, but it's not actually ready, as far as
 			 * caller requested events goes. We can remove it here.
 			 */
+			if (epi->ws && epi->ws->active)
+				__pm_stay_awake(ep->ws);
 			__pm_relax(epi->ws);
 			list_del_init(&epi->rdllink);
 		}
@@ -1344,6 +1345,8 @@ static int ep_send_events_proc(struct eventpoll
*ep, struct list_head *head,
 	     !list_empty(head) && eventcnt < esed->maxevents;) {
 		epi = list_first_entry(head, struct epitem, rdllink);

+		if (epi->ws && epi->ws->active)
+			__pm_stay_awake(ep->ws);
 		__pm_relax(epi->ws);
 		list_del_init(&epi->rdllink);

---


> Also, I think it is important to clearly document how to use this safely.
> You suggested that if any EPOLLWAKEUP event is ready, then suspend will
> remain disabled not only until the event is handled, but also until the next
> call to epoll_wait.  That sounds like very useful semantics, but it isn't at
> all explicit in the patch.  I think it should be made very clear in
> eventpoll.h how the flag can be used. (and then eventually get this into a
> man page of course).
>

OK

>>
>> >> One last item that doesn't really belong here - but it is in context.
>> >>
>> >> This mechanism is elegant because it provides a single implementation that
>> >> provides wakeup_source for almost any sort of device.  I would like to do the
>> >> same thing for interrupts.
>> >> Most (maybe all) of the wakeup device on my phone have an interrupt where the
>> >> body is run in a thread.  When the thread has done it's work the event is
>> >> visible to userspace so the EPOLLWAKEUP mechanism is all that is needed to
>> >> complete the path to user-space (or for my user-space solution, nothing else
>> >> is needed once it is visible to user-space).
>> >> So we just need to ensure a clear path from the "top half" interrupt handler
>> >> to the threaded handler.
>> >> So I imagine attaching a wakeup source to every interrupt for which 'wakeup'
>> >> is enabled, activating it when the top-half starts and relaxing it when the
>> >> bottom-half completes.  With this in place, almost all drivers would get
>> >> wakeup_source handling for free.
>> >> Does this seem reasonable to you.
>> >
>> > Yes, it does.
>> >
>>
>> How useful is that? Suspend already synchronizes with interrupt
>> handlers and will not proceed until they have returned. Are threaded
>> interrupts handlers not always run at that stage? For drivers that use
>> work-queues instead of a threaded interrupt handler, I think the
>> suspend-blocking work-queue patch I wrote a while back is convenient.
>>
>
> Maybe it isn't useful at all - I'm still working this stuff out.
>
> Yes, threaded interrupts are run "straight away", but what exactly does that
> mean?  And in particular, is there any interlocking to ensure they run
> before suspend gets stop the CPU?  Maybe the scheduling priority of the
> different threads is enough to make sure this works, as irq_threads are
> SCHED_FIFO and  the suspending thread almost certainly isn't.  But is that
> still a guarantee on an SMP machine?  irq_threads aren't freezable so suspend
> won't block on them for that reason..
>
> I really just want to be sure that some interlock is in place to ensure that
> the threaded interrupt handler runs before suspend absolutely commits to
> suspending.  If that is already the case, when what I suggest isn't needed as
> you suggest.  Do you know of such an interlock?
>

Normal interrupts are disabled during suspend. This synchronizes with
the interrupt handler, and pending wakeup interrupts abort suspend. I
have not looked at this code since threaded interrupt handlers were
added, so there could be bugs there.

-- 
Arve Hjønnevåg

^ permalink raw reply related	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 5/8] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-05-01  0:52               ` Arve Hjønnevåg
@ 2012-05-01  2:18                 ` NeilBrown
  2012-05-01  5:33                 ` [PATCH] " Arve Hjønnevåg
  1 sibling, 0 replies; 129+ messages in thread
From: NeilBrown @ 2012-05-01  2:18 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Rafael J. Wysocki, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

[-- Attachment #1: Type: text/plain, Size: 8758 bytes --]

On Mon, 30 Apr 2012 17:52:08 -0700 Arve Hjønnevåg <arve@android.com> wrote:

> On Sun, Apr 29, 2012 at 6:58 PM, NeilBrown <neilb@suse.de> wrote:
> > On Thu, 26 Apr 2012 20:49:51 -0700 Arve Hjønnevåg <arve@android.com> wrote:
> ...
> >> I keep the wakeup-source active whenever the epitem is on a list
> >> (ep->rdllist or the local txlist). The temporary txlist is modified
> >> without holding the lock that protects ep->rdllist. It is easier to
> >> use a separate wakeup source to prevent suspend while this list is
> >> manipulated than trying to maintain the wakeup-source state in a
> >> different way than the existing eventpoll state. I think this only
> >> causes real problems if the same epoll file is used for frequent
> >> non-wakeup events (e.g. a gyro) and wakeup events. You should be able
> >> to work around this by using two epoll files.
> >
> > Thanks for the explanation.  I can now see more clearly how your patch works.
> > I can also see why you might need the ep->ws wakeup_source.  However I don't
> > like it.
> >
> > If it acted purely as a lock and prevented suspend while it was active then
> > it would be fine.  However it doesn't.  It also aborts any current suspend
> > attempt - so it is externally visible.
> > The way your code it written, *any* call to epoll_wait will abort the current
> > suspend cycle, even if it is called by a completely non-privileged user.
> 
> With the patch I posted Friday, a non-privileged user will not be able
> to pass EPOLLWAKEUP and have the wakeup-source created.

Ahhh yes.  I hadn't noticed that you only create ep->ws the first time that
an epi->ws is created.  So that aspect is fine, thanks.


> 
> > That may not obviously be harmful, but it makes the precise semantics of the
> > system call something quite non-obvious, and it is much better to have a very
> > clean semantic.
> > As you say, it can probably be worked-around but code is much safer when you
> > don't need to work-around things.
> >
> > I see two alternatives:
> > 1/ set the 'wakeup' flag on the whole epoll-fd, not on the individual events
> >   that it is asked to monitor.  i.e. add a new flag to epoll_create1()
> >   instead of to epoll_ctl events.
> >   Then you just need a single wakeup_source for the fd which is active
> >   whenever any event is ready.
> >
> >   This interface might be generally nicer, I'm not sure.
> >
> > 2/ Find a way to get rid of ep->ws.
> >   Thinking about it more, I again think it isn't needed.
> >   The reason is that suspend is already exclusive with any process running in
> >   kernel context.
> >   One of the first things suspend does is to freeze all process and (for
> >   regular non-kernel-thread processes) this happens by sending a virtual
> >   signal which is acted up when the process returns from a system call or
> >   returns from a context switch.  So while any given system call is running
> >   (e.g. epoll_wait) suspend is blocked.  When epoll_wait sets
> >   TASK_INTERRUPTIBLE the 'freeze' signal will interrupt it of course, but
> >   this is the only point where suspend can interfere with epoll_wait, and you
> >   aren't holding ep->ws then anyway.
> >   Hopefully Rafael will correct me if I got that outline wrong.  But even if
> >   I did, I think we need to get rid of ep->ws.
> >
> 
> If ep_scan_ready_list is only called from freezable threads, then
> ep->ws is not strictly needed, but without it another suspend attempt
> will be triggered if there are not other wakeup-sources active. I'm
> also not sure if it could get called from a non-freezable thread since
> other subsystems can call it through the poll hook.

I can see that triggering another suspend cycle that we know will fail is not
ideal - every thread needs to move from 'idle' to 'frozen' and back again.
I wonder if a lighter-weight mechanism might achieve that better.  It feels
like wakeup_source is being used for two very different purposes here - one
to disable suspend while some event is pending, one to lock-out suspend
during a critical piece of code.  I feel it would be neater and more
transparent if they were two different mechanisms.
If there were a second mechanism, I would like to see it used in
pm_autosleep_set_state in place of autosleep_ws as well - it seems like a
similar situation.

Maybe adding a second locking mechanism can be added later - I don't think
the current code is wrong, it just seems inelegant.


> 
> A third option is to only activate ep->ws when needed. This may may work:

I think this is an improvement if we stay with using a wakeup_source to
protect the critical code section.  If we come up with a different mechanism,
them it is not necessary.


> 
> > Also, I think it is important to clearly document how to use this safely.
> > You suggested that if any EPOLLWAKEUP event is ready, then suspend will
> > remain disabled not only until the event is handled, but also until the next
> > call to epoll_wait.  That sounds like very useful semantics, but it isn't at
> > all explicit in the patch.  I think it should be made very clear in
> > eventpoll.h how the flag can be used. (and then eventually get this into a
> > man page of course).
> >
> 
> OK
> 
> >>
> >> >> One last item that doesn't really belong here - but it is in context.
> >> >>
> >> >> This mechanism is elegant because it provides a single implementation that
> >> >> provides wakeup_source for almost any sort of device.  I would like to do the
> >> >> same thing for interrupts.
> >> >> Most (maybe all) of the wakeup device on my phone have an interrupt where the
> >> >> body is run in a thread.  When the thread has done it's work the event is
> >> >> visible to userspace so the EPOLLWAKEUP mechanism is all that is needed to
> >> >> complete the path to user-space (or for my user-space solution, nothing else
> >> >> is needed once it is visible to user-space).
> >> >> So we just need to ensure a clear path from the "top half" interrupt handler
> >> >> to the threaded handler.
> >> >> So I imagine attaching a wakeup source to every interrupt for which 'wakeup'
> >> >> is enabled, activating it when the top-half starts and relaxing it when the
> >> >> bottom-half completes.  With this in place, almost all drivers would get
> >> >> wakeup_source handling for free.
> >> >> Does this seem reasonable to you.
> >> >
> >> > Yes, it does.
> >> >
> >>
> >> How useful is that? Suspend already synchronizes with interrupt
> >> handlers and will not proceed until they have returned. Are threaded
> >> interrupts handlers not always run at that stage? For drivers that use
> >> work-queues instead of a threaded interrupt handler, I think the
> >> suspend-blocking work-queue patch I wrote a while back is convenient.
> >>
> >
> > Maybe it isn't useful at all - I'm still working this stuff out.
> >
> > Yes, threaded interrupts are run "straight away", but what exactly does that
> > mean?  And in particular, is there any interlocking to ensure they run
> > before suspend gets stop the CPU?  Maybe the scheduling priority of the
> > different threads is enough to make sure this works, as irq_threads are
> > SCHED_FIFO and  the suspending thread almost certainly isn't.  But is that
> > still a guarantee on an SMP machine?  irq_threads aren't freezable so suspend
> > won't block on them for that reason..
> >
> > I really just want to be sure that some interlock is in place to ensure that
> > the threaded interrupt handler runs before suspend absolutely commits to
> > suspending.  If that is already the case, when what I suggest isn't needed as
> > you suggest.  Do you know of such an interlock?
> >
> 
> Normal interrupts are disabled during suspend. This synchronizes with
> the interrupt handler, and pending wakeup interrupts abort suspend. I
> have not looked at this code since threaded interrupt handlers were
> added, so there could be bugs there.
> 

If disabling an interrupt ensured that the interrupt thread was idle, and
thus synchronised with both interrupt handlers, that would be good...

And it does.  suspend_device_irqs() calls synchronize_irq() on each suspended
irq which waits for any irq thread to be idle.  So it's all good.

i.e. in an interrupt is marked for WAKEUP, and its handler (whether threaded
or not)  calls wake_up on a wait_queue_head that is used by 'poll' to detect
activity, then using EPOLLWAKEUP is sufficient to collect those events
without racing with suspend.  So we don't need to teach multiple subsystems
about wakeup_sources - eventpoll and the interrupt handling can do it all.
Excellent.

Thanks,
NeilBrown



[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-05-01  0:52               ` Arve Hjønnevåg
  2012-05-01  2:18                 ` NeilBrown
@ 2012-05-01  5:33                 ` Arve Hjønnevåg
  2012-05-01  6:28                   ` NeilBrown
  2012-07-16  6:38                   ` Michael Kerrisk
  1 sibling, 2 replies; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-05-01  5:33 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Arve Hjønnevåg, NeilBrown, Linux PM list, LKML,
	Magnus Damm, markgross, Matthew Garrett, Greg KH, John Stultz,
	Brian Swetland, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
wakeup_source will be active to prevent suspend. This can be used to
handle wakeup events from a driver that support poll, e.g. input, if
that driver wakes up the waitqueue passed to epoll before allowing
suspend.

Signed-off-by: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 fs/eventpoll.c             |   90 ++++++++++++++++++++++++++++++++++++++++++-
 include/linux/capability.h |    5 ++-
 include/linux/eventpoll.h  |   12 ++++++
 3 files changed, 103 insertions(+), 4 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 739b098..1abed50 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -33,6 +33,7 @@
 #include <linux/bitops.h>
 #include <linux/mutex.h>
 #include <linux/anon_inodes.h>
+#include <linux/device.h>
 #include <asm/uaccess.h>
 #include <asm/io.h>
 #include <asm/mman.h>
@@ -87,7 +88,7 @@
  */
 
 /* Epoll private bits inside the event mask */
-#define EP_PRIVATE_BITS (EPOLLONESHOT | EPOLLET)
+#define EP_PRIVATE_BITS (EPOLLWAKEUP | EPOLLONESHOT | EPOLLET)
 
 /* Maximum number of nesting allowed inside epoll sets */
 #define EP_MAX_NESTS 4
@@ -154,6 +155,9 @@ struct epitem {
 	/* List header used to link this item to the "struct file" items list */
 	struct list_head fllink;
 
+	/* wakeup_source used when EPOLLWAKEUP is set */
+	struct wakeup_source *ws;
+
 	/* The structure that describe the interested events and the source fd */
 	struct epoll_event event;
 };
@@ -194,6 +198,9 @@ struct eventpoll {
 	 */
 	struct epitem *ovflist;
 
+	/* wakeup_source used when ep_scan_ready_list is running */
+	struct wakeup_source *ws;
+
 	/* The user that created the eventpoll descriptor */
 	struct user_struct *user;
 
@@ -588,8 +595,10 @@ static int ep_scan_ready_list(struct eventpoll *ep,
 		 * queued into ->ovflist but the "txlist" might already
 		 * contain them, and the list_splice() below takes care of them.
 		 */
-		if (!ep_is_linked(&epi->rdllink))
+		if (!ep_is_linked(&epi->rdllink)) {
 			list_add_tail(&epi->rdllink, &ep->rdllist);
+			__pm_stay_awake(epi->ws);
+		}
 	}
 	/*
 	 * We need to set back ep->ovflist to EP_UNACTIVE_PTR, so that after
@@ -602,6 +611,7 @@ static int ep_scan_ready_list(struct eventpoll *ep,
 	 * Quickly re-inject items left on "txlist".
 	 */
 	list_splice(&txlist, &ep->rdllist);
+	__pm_relax(ep->ws);
 
 	if (!list_empty(&ep->rdllist)) {
 		/*
@@ -656,6 +666,8 @@ static int ep_remove(struct eventpoll *ep, struct epitem *epi)
 		list_del_init(&epi->rdllink);
 	spin_unlock_irqrestore(&ep->lock, flags);
 
+	wakeup_source_unregister(epi->ws);
+
 	/* At this point it is safe to free the eventpoll item */
 	kmem_cache_free(epi_cache, epi);
 
@@ -706,6 +718,7 @@ static void ep_free(struct eventpoll *ep)
 	mutex_unlock(&epmutex);
 	mutex_destroy(&ep->mtx);
 	free_uid(ep->user);
+	wakeup_source_unregister(ep->ws);
 	kfree(ep);
 }
 
@@ -737,6 +750,7 @@ static int ep_read_events_proc(struct eventpoll *ep, struct list_head *head,
 			 * callback, but it's not actually ready, as far as
 			 * caller requested events goes. We can remove it here.
 			 */
+			__pm_relax(epi->ws);
 			list_del_init(&epi->rdllink);
 		}
 	}
@@ -927,13 +941,23 @@ static int ep_poll_callback(wait_queue_t *wait, unsigned mode, int sync, void *k
 		if (epi->next == EP_UNACTIVE_PTR) {
 			epi->next = ep->ovflist;
 			ep->ovflist = epi;
+			if (epi->ws) {
+				/*
+				 * Activate ep->ws since epi->ws may get
+				 * deactivated at any time.
+				 */
+				__pm_stay_awake(ep->ws);
+			}
+
 		}
 		goto out_unlock;
 	}
 
 	/* If this file is already in the ready list we exit soon */
-	if (!ep_is_linked(&epi->rdllink))
+	if (!ep_is_linked(&epi->rdllink)) {
 		list_add_tail(&epi->rdllink, &ep->rdllist);
+		__pm_stay_awake(epi->ws);
+	}
 
 	/*
 	 * Wake up ( if active ) both the eventpoll wait list and the ->poll()
@@ -1091,6 +1115,30 @@ static int reverse_path_check(void)
 	return error;
 }
 
+static int ep_create_wakeup_source(struct epitem *epi)
+{
+	const char *name;
+
+	if (!epi->ep->ws) {
+		epi->ep->ws = wakeup_source_register("eventpoll");
+		if (!epi->ep->ws)
+			return -ENOMEM;
+	}
+
+	name = epi->ffd.file->f_path.dentry->d_name.name;
+	epi->ws = wakeup_source_register(name);
+	if (!epi->ws)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void ep_destroy_wakeup_source(struct epitem *epi)
+{
+	wakeup_source_unregister(epi->ws);
+	epi->ws = NULL;
+}
+
 /*
  * Must be called with "mtx" held.
  */
@@ -1118,6 +1166,13 @@ static int ep_insert(struct eventpoll *ep, struct epoll_event *event,
 	epi->event = *event;
 	epi->nwait = 0;
 	epi->next = EP_UNACTIVE_PTR;
+	if (epi->event.events & EPOLLWAKEUP) {
+		error = ep_create_wakeup_source(epi);
+		if (error)
+			goto error_create_wakeup_source;
+	} else {
+		epi->ws = NULL;
+	}
 
 	/* Initialize the poll table using the queue callback */
 	epq.epi = epi;
@@ -1164,6 +1219,7 @@ static int ep_insert(struct eventpoll *ep, struct epoll_event *event,
 	/* If the file is already "ready" we drop it inside the ready list */
 	if ((revents & event->events) && !ep_is_linked(&epi->rdllink)) {
 		list_add_tail(&epi->rdllink, &ep->rdllist);
+		__pm_stay_awake(epi->ws);
 
 		/* Notify waiting tasks that events are available */
 		if (waitqueue_active(&ep->wq))
@@ -1204,6 +1260,9 @@ error_unregister:
 		list_del_init(&epi->rdllink);
 	spin_unlock_irqrestore(&ep->lock, flags);
 
+	wakeup_source_unregister(epi->ws);
+
+error_create_wakeup_source:
 	kmem_cache_free(epi_cache, epi);
 
 	return error;
@@ -1229,6 +1288,12 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_even
 	epi->event.events = event->events;
 	pt._key = event->events;
 	epi->event.data = event->data; /* protected by mtx */
+	if (epi->event.events & EPOLLWAKEUP) {
+		if (!epi->ws)
+			ep_create_wakeup_source(epi);
+	} else if (epi->ws) {
+		ep_destroy_wakeup_source(epi);
+	}
 
 	/*
 	 * Get current event bits. We can safely use the file* here because
@@ -1244,6 +1309,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_even
 		spin_lock_irq(&ep->lock);
 		if (!ep_is_linked(&epi->rdllink)) {
 			list_add_tail(&epi->rdllink, &ep->rdllist);
+			__pm_stay_awake(epi->ws);
 
 			/* Notify waiting tasks that events are available */
 			if (waitqueue_active(&ep->wq))
@@ -1282,6 +1348,18 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
 	     !list_empty(head) && eventcnt < esed->maxevents;) {
 		epi = list_first_entry(head, struct epitem, rdllink);
 
+		/*
+		 * Activate ep->ws before deactivating epi->ws to prevent
+		 * triggering auto-suspend here (in case we reactive epi->ws
+		 * below).
+		 *
+		 * This could be rearranged to delay the deactivation of epi->ws
+		 * instead, but then epi->ws would temporarily be out of sync
+		 * with ep_is_linked().
+		 */
+		if (epi->ws && epi->ws->active)
+			__pm_stay_awake(ep->ws);
+		__pm_relax(epi->ws);
 		list_del_init(&epi->rdllink);
 
 		pt._key = epi->event.events;
@@ -1298,6 +1376,7 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
 			if (__put_user(revents, &uevent->events) ||
 			    __put_user(epi->event.data, &uevent->data)) {
 				list_add(&epi->rdllink, head);
+				__pm_stay_awake(epi->ws);
 				return eventcnt ? eventcnt : -EFAULT;
 			}
 			eventcnt++;
@@ -1317,6 +1396,7 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
 				 * poll callback will queue them in ep->ovflist.
 				 */
 				list_add_tail(&epi->rdllink, &ep->rdllist);
+				__pm_stay_awake(epi->ws);
 			}
 		}
 	}
@@ -1629,6 +1709,10 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd,
 	if (!tfile->f_op || !tfile->f_op->poll)
 		goto error_tgt_fput;
 
+	/* Check if EPOLLWAKEUP is allowed */
+	if ((epds.events & EPOLLWAKEUP) && !capable(CAP_EPOLLWAKEUP))
+		goto error_tgt_fput;
+
 	/*
 	 * We have to check that the file structure underneath the file descriptor
 	 * the user passed to us _is_ an eventpoll file. And also we do not permit
diff --git a/include/linux/capability.h b/include/linux/capability.h
index 12d52de..222974a 100644
--- a/include/linux/capability.h
+++ b/include/linux/capability.h
@@ -360,8 +360,11 @@ struct cpu_vfs_cap_data {
 
 #define CAP_WAKE_ALARM            35
 
+/* Allow preventing automatic system suspends while epoll events are pending */
 
-#define CAP_LAST_CAP         CAP_WAKE_ALARM
+#define CAP_EPOLLWAKEUP      36
+
+#define CAP_LAST_CAP         CAP_EPOLLWAKEUP
 
 #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
 
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 657ab55..5b591fb 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -26,6 +26,18 @@
 #define EPOLL_CTL_DEL 2
 #define EPOLL_CTL_MOD 3
 
+/*
+ * Request the handling of system wakeup events so as to prevent automatic
+ * system suspends from happening while those events are being processed.
+ *
+ * Assuming neither EPOLLET nor EPOLLONESHOT is set, automatic system suspends
+ * will not be re-allowed until epoll_wait is called again after consuming the
+ * wakeup event(s).
+ *
+ * Requires CAP_EPOLLWAKEUP
+ */
+#define EPOLLWAKEUP (1 << 29)
+
 /* Set the One Shot behaviour for the target file descriptor */
 #define EPOLLONESHOT (1 << 30)
 
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 129+ messages in thread

* Re: [PATCH] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-05-01  5:33                 ` [PATCH] " Arve Hjønnevåg
@ 2012-05-01  6:28                   ` NeilBrown
  2012-05-01 13:51                     ` Rafael J. Wysocki
  2012-07-16  6:38                   ` Michael Kerrisk
  1 sibling, 1 reply; 129+ messages in thread
From: NeilBrown @ 2012-05-01  6:28 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Rafael J. Wysocki, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

[-- Attachment #1: Type: text/plain, Size: 10684 bytes --]

On Mon, 30 Apr 2012 22:33:48 -0700 Arve Hjønnevåg <arve@android.com> wrote:

> When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
> wakeup_source will be active to prevent suspend. This can be used to
> handle wakeup events from a driver that support poll, e.g. input, if
> that driver wakes up the waitqueue passed to epoll before allowing
> suspend.
> 
> Signed-off-by: Arve Hjønnevåg <arve@android.com>
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

Thanks.
 Reviewed-by: NeilBrown <neilb@suse.de>

However:
1/ I think all references to "automatic system suspend" can be replaced with
   "system suspend" as an active wakeup_source disables any suspend, no matter
   it's source
2/ I reserve to right to submit for discussion a later patch which removes
   the ep->ws in favour or some other exclusion mechanism :-)

NeilBrown



> ---
>  fs/eventpoll.c             |   90 ++++++++++++++++++++++++++++++++++++++++++-
>  include/linux/capability.h |    5 ++-
>  include/linux/eventpoll.h  |   12 ++++++
>  3 files changed, 103 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 739b098..1abed50 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -33,6 +33,7 @@
>  #include <linux/bitops.h>
>  #include <linux/mutex.h>
>  #include <linux/anon_inodes.h>
> +#include <linux/device.h>
>  #include <asm/uaccess.h>
>  #include <asm/io.h>
>  #include <asm/mman.h>
> @@ -87,7 +88,7 @@
>   */
>  
>  /* Epoll private bits inside the event mask */
> -#define EP_PRIVATE_BITS (EPOLLONESHOT | EPOLLET)
> +#define EP_PRIVATE_BITS (EPOLLWAKEUP | EPOLLONESHOT | EPOLLET)
>  
>  /* Maximum number of nesting allowed inside epoll sets */
>  #define EP_MAX_NESTS 4
> @@ -154,6 +155,9 @@ struct epitem {
>  	/* List header used to link this item to the "struct file" items list */
>  	struct list_head fllink;
>  
> +	/* wakeup_source used when EPOLLWAKEUP is set */
> +	struct wakeup_source *ws;
> +
>  	/* The structure that describe the interested events and the source fd */
>  	struct epoll_event event;
>  };
> @@ -194,6 +198,9 @@ struct eventpoll {
>  	 */
>  	struct epitem *ovflist;
>  
> +	/* wakeup_source used when ep_scan_ready_list is running */
> +	struct wakeup_source *ws;
> +
>  	/* The user that created the eventpoll descriptor */
>  	struct user_struct *user;
>  
> @@ -588,8 +595,10 @@ static int ep_scan_ready_list(struct eventpoll *ep,
>  		 * queued into ->ovflist but the "txlist" might already
>  		 * contain them, and the list_splice() below takes care of them.
>  		 */
> -		if (!ep_is_linked(&epi->rdllink))
> +		if (!ep_is_linked(&epi->rdllink)) {
>  			list_add_tail(&epi->rdllink, &ep->rdllist);
> +			__pm_stay_awake(epi->ws);
> +		}
>  	}
>  	/*
>  	 * We need to set back ep->ovflist to EP_UNACTIVE_PTR, so that after
> @@ -602,6 +611,7 @@ static int ep_scan_ready_list(struct eventpoll *ep,
>  	 * Quickly re-inject items left on "txlist".
>  	 */
>  	list_splice(&txlist, &ep->rdllist);
> +	__pm_relax(ep->ws);
>  
>  	if (!list_empty(&ep->rdllist)) {
>  		/*
> @@ -656,6 +666,8 @@ static int ep_remove(struct eventpoll *ep, struct epitem *epi)
>  		list_del_init(&epi->rdllink);
>  	spin_unlock_irqrestore(&ep->lock, flags);
>  
> +	wakeup_source_unregister(epi->ws);
> +
>  	/* At this point it is safe to free the eventpoll item */
>  	kmem_cache_free(epi_cache, epi);
>  
> @@ -706,6 +718,7 @@ static void ep_free(struct eventpoll *ep)
>  	mutex_unlock(&epmutex);
>  	mutex_destroy(&ep->mtx);
>  	free_uid(ep->user);
> +	wakeup_source_unregister(ep->ws);
>  	kfree(ep);
>  }
>  
> @@ -737,6 +750,7 @@ static int ep_read_events_proc(struct eventpoll *ep, struct list_head *head,
>  			 * callback, but it's not actually ready, as far as
>  			 * caller requested events goes. We can remove it here.
>  			 */
> +			__pm_relax(epi->ws);
>  			list_del_init(&epi->rdllink);
>  		}
>  	}
> @@ -927,13 +941,23 @@ static int ep_poll_callback(wait_queue_t *wait, unsigned mode, int sync, void *k
>  		if (epi->next == EP_UNACTIVE_PTR) {
>  			epi->next = ep->ovflist;
>  			ep->ovflist = epi;
> +			if (epi->ws) {
> +				/*
> +				 * Activate ep->ws since epi->ws may get
> +				 * deactivated at any time.
> +				 */
> +				__pm_stay_awake(ep->ws);
> +			}
> +
>  		}
>  		goto out_unlock;
>  	}
>  
>  	/* If this file is already in the ready list we exit soon */
> -	if (!ep_is_linked(&epi->rdllink))
> +	if (!ep_is_linked(&epi->rdllink)) {
>  		list_add_tail(&epi->rdllink, &ep->rdllist);
> +		__pm_stay_awake(epi->ws);
> +	}
>  
>  	/*
>  	 * Wake up ( if active ) both the eventpoll wait list and the ->poll()
> @@ -1091,6 +1115,30 @@ static int reverse_path_check(void)
>  	return error;
>  }
>  
> +static int ep_create_wakeup_source(struct epitem *epi)
> +{
> +	const char *name;
> +
> +	if (!epi->ep->ws) {
> +		epi->ep->ws = wakeup_source_register("eventpoll");
> +		if (!epi->ep->ws)
> +			return -ENOMEM;
> +	}
> +
> +	name = epi->ffd.file->f_path.dentry->d_name.name;
> +	epi->ws = wakeup_source_register(name);
> +	if (!epi->ws)
> +		return -ENOMEM;
> +
> +	return 0;
> +}
> +
> +static void ep_destroy_wakeup_source(struct epitem *epi)
> +{
> +	wakeup_source_unregister(epi->ws);
> +	epi->ws = NULL;
> +}
> +
>  /*
>   * Must be called with "mtx" held.
>   */
> @@ -1118,6 +1166,13 @@ static int ep_insert(struct eventpoll *ep, struct epoll_event *event,
>  	epi->event = *event;
>  	epi->nwait = 0;
>  	epi->next = EP_UNACTIVE_PTR;
> +	if (epi->event.events & EPOLLWAKEUP) {
> +		error = ep_create_wakeup_source(epi);
> +		if (error)
> +			goto error_create_wakeup_source;
> +	} else {
> +		epi->ws = NULL;
> +	}
>  
>  	/* Initialize the poll table using the queue callback */
>  	epq.epi = epi;
> @@ -1164,6 +1219,7 @@ static int ep_insert(struct eventpoll *ep, struct epoll_event *event,
>  	/* If the file is already "ready" we drop it inside the ready list */
>  	if ((revents & event->events) && !ep_is_linked(&epi->rdllink)) {
>  		list_add_tail(&epi->rdllink, &ep->rdllist);
> +		__pm_stay_awake(epi->ws);
>  
>  		/* Notify waiting tasks that events are available */
>  		if (waitqueue_active(&ep->wq))
> @@ -1204,6 +1260,9 @@ error_unregister:
>  		list_del_init(&epi->rdllink);
>  	spin_unlock_irqrestore(&ep->lock, flags);
>  
> +	wakeup_source_unregister(epi->ws);
> +
> +error_create_wakeup_source:
>  	kmem_cache_free(epi_cache, epi);
>  
>  	return error;
> @@ -1229,6 +1288,12 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_even
>  	epi->event.events = event->events;
>  	pt._key = event->events;
>  	epi->event.data = event->data; /* protected by mtx */
> +	if (epi->event.events & EPOLLWAKEUP) {
> +		if (!epi->ws)
> +			ep_create_wakeup_source(epi);
> +	} else if (epi->ws) {
> +		ep_destroy_wakeup_source(epi);
> +	}
>  
>  	/*
>  	 * Get current event bits. We can safely use the file* here because
> @@ -1244,6 +1309,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_even
>  		spin_lock_irq(&ep->lock);
>  		if (!ep_is_linked(&epi->rdllink)) {
>  			list_add_tail(&epi->rdllink, &ep->rdllist);
> +			__pm_stay_awake(epi->ws);
>  
>  			/* Notify waiting tasks that events are available */
>  			if (waitqueue_active(&ep->wq))
> @@ -1282,6 +1348,18 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
>  	     !list_empty(head) && eventcnt < esed->maxevents;) {
>  		epi = list_first_entry(head, struct epitem, rdllink);
>  
> +		/*
> +		 * Activate ep->ws before deactivating epi->ws to prevent
> +		 * triggering auto-suspend here (in case we reactive epi->ws
> +		 * below).
> +		 *
> +		 * This could be rearranged to delay the deactivation of epi->ws
> +		 * instead, but then epi->ws would temporarily be out of sync
> +		 * with ep_is_linked().
> +		 */
> +		if (epi->ws && epi->ws->active)
> +			__pm_stay_awake(ep->ws);
> +		__pm_relax(epi->ws);
>  		list_del_init(&epi->rdllink);
>  
>  		pt._key = epi->event.events;
> @@ -1298,6 +1376,7 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
>  			if (__put_user(revents, &uevent->events) ||
>  			    __put_user(epi->event.data, &uevent->data)) {
>  				list_add(&epi->rdllink, head);
> +				__pm_stay_awake(epi->ws);
>  				return eventcnt ? eventcnt : -EFAULT;
>  			}
>  			eventcnt++;
> @@ -1317,6 +1396,7 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
>  				 * poll callback will queue them in ep->ovflist.
>  				 */
>  				list_add_tail(&epi->rdllink, &ep->rdllist);
> +				__pm_stay_awake(epi->ws);
>  			}
>  		}
>  	}
> @@ -1629,6 +1709,10 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd,
>  	if (!tfile->f_op || !tfile->f_op->poll)
>  		goto error_tgt_fput;
>  
> +	/* Check if EPOLLWAKEUP is allowed */
> +	if ((epds.events & EPOLLWAKEUP) && !capable(CAP_EPOLLWAKEUP))
> +		goto error_tgt_fput;
> +
>  	/*
>  	 * We have to check that the file structure underneath the file descriptor
>  	 * the user passed to us _is_ an eventpoll file. And also we do not permit
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index 12d52de..222974a 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -360,8 +360,11 @@ struct cpu_vfs_cap_data {
>  
>  #define CAP_WAKE_ALARM            35
>  
> +/* Allow preventing automatic system suspends while epoll events are pending */
>  
> -#define CAP_LAST_CAP         CAP_WAKE_ALARM
> +#define CAP_EPOLLWAKEUP      36
> +
> +#define CAP_LAST_CAP         CAP_EPOLLWAKEUP
>  
>  #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
>  
> diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
> index 657ab55..5b591fb 100644
> --- a/include/linux/eventpoll.h
> +++ b/include/linux/eventpoll.h
> @@ -26,6 +26,18 @@
>  #define EPOLL_CTL_DEL 2
>  #define EPOLL_CTL_MOD 3
>  
> +/*
> + * Request the handling of system wakeup events so as to prevent automatic
> + * system suspends from happening while those events are being processed.
> + *
> + * Assuming neither EPOLLET nor EPOLLONESHOT is set, automatic system suspends
> + * will not be re-allowed until epoll_wait is called again after consuming the
> + * wakeup event(s).
> + *
> + * Requires CAP_EPOLLWAKEUP
> + */
> +#define EPOLLWAKEUP (1 << 29)
> +
>  /* Set the One Shot behaviour for the target file descriptor */
>  #define EPOLLONESHOT (1 << 30)
>  


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-05-01  6:28                   ` NeilBrown
@ 2012-05-01 13:51                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-05-01 13:51 UTC (permalink / raw)
  To: NeilBrown
  Cc: Arve Hjønnevåg, Linux PM list, LKML, Magnus Damm,
	markgross, Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Tuesday, May 01, 2012, NeilBrown wrote:
> On Mon, 30 Apr 2012 22:33:48 -0700 Arve Hjønnevåg <arve@android.com> wrote:
> 
> > When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
> > wakeup_source will be active to prevent suspend. This can be used to
> > handle wakeup events from a driver that support poll, e.g. input, if
> > that driver wakes up the waitqueue passed to epoll before allowing
> > suspend.
> > 
> > Signed-off-by: Arve Hjønnevåg <arve@android.com>
> > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Thanks.
>  Reviewed-by: NeilBrown <neilb@suse.de>

Thanks a lot for your involvement here!

> However:
> 1/ I think all references to "automatic system suspend" can be replaced with
>    "system suspend" as an active wakeup_source disables any suspend, no matter
>    it's source

OK, I'll change that when applying the patch (although that only applies to
suspends taking the wakeup events signaling through wakeup sources into
account).

> 2/ I reserve to right to submit for discussion a later patch which removes
>    the ep->ws in favour or some other exclusion mechanism :-)

Well, you can alwyas do that. :-)  Of course, when the patch goes to Linus,
we'll have to be careful about changes visible to user space, though.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
  2012-04-26 21:52         ` Rafael J. Wysocki
  2012-04-27  0:39           ` NeilBrown
@ 2012-05-03  0:23           ` Arve Hjønnevåg
  2012-05-03 13:28             ` Rafael J. Wysocki
  1 sibling, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-05-03  0:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: NeilBrown, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Thu, Apr 26, 2012 at 2:52 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
...
> From: Rafael J. Wysocki <rjw@sisk.pl>
> Subject: PM / Sleep: Implement opportunistic sleep, v2
>
> Introduce a mechanism by which the kernel can trigger global
> transitions to a sleep state chosen by user space if there are no
> active wakeup sources.
>
> It consists of a new sysfs attribute, /sys/power/autosleep, that
> can be written one of the strings returned by reads from
> /sys/power/state, an ordered workqueue and a work item carrying out
> the "suspend" operations.  If a string representing the system's
> sleep state is written to /sys/power/autosleep, the work item
> triggering transitions to that state is queued up and it requeues
> itself after every execution until user space writes "off" to
> /sys/power/autosleep.
>

This does not work. Writing something other than "off" disabled auto
suspend for me.

...
> +static ssize_t autosleep_store(struct kobject *kobj,
> +                              struct kobj_attribute *attr,
> +                              const char *buf, size_t n)
> +{
> +       suspend_state_t state = decode_state(buf, n);
> +       int error;
> +
> +       if (state == PM_SUSPEND_ON
> +           && !(strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4)))
> +               return -EINVAL;

Did you mean:
	if (state == PM_SUSPEND_ON
	    && strcmp(buf, "off") && strcmp(buf, "off\n"))
		return -EINVAL;

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
  2012-05-03  0:23           ` Arve Hjønnevåg
@ 2012-05-03 13:28             ` Rafael J. Wysocki
  2012-05-03 21:27               ` Arve Hjønnevåg
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-05-03 13:28 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: NeilBrown, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Thursday, May 03, 2012, Arve Hjønnevåg wrote:
> On Thu, Apr 26, 2012 at 2:52 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> ...
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > Subject: PM / Sleep: Implement opportunistic sleep, v2
> >
> > Introduce a mechanism by which the kernel can trigger global
> > transitions to a sleep state chosen by user space if there are no
> > active wakeup sources.
> >
> > It consists of a new sysfs attribute, /sys/power/autosleep, that
> > can be written one of the strings returned by reads from
> > /sys/power/state, an ordered workqueue and a work item carrying out
> > the "suspend" operations.  If a string representing the system's
> > sleep state is written to /sys/power/autosleep, the work item
> > triggering transitions to that state is queued up and it requeues
> > itself after every execution until user space writes "off" to
> > /sys/power/autosleep.
> >
> 
> This does not work. Writing something other than "off" disabled auto
> suspend for me.

My bad, sorry about that.

> ...
> > +static ssize_t autosleep_store(struct kobject *kobj,
> > +                              struct kobj_attribute *attr,
> > +                              const char *buf, size_t n)
> > +{
> > +       suspend_state_t state = decode_state(buf, n);
> > +       int error;
> > +
> > +       if (state == PM_SUSPEND_ON
> > +           && !(strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4)))
> > +               return -EINVAL;
> 
> Did you mean:
> 	if (state == PM_SUSPEND_ON
> 	    && strcmp(buf, "off") && strcmp(buf, "off\n"))
> 		return -EINVAL;


Yes, I did.

I'll add the following as an incremental patch on top of the series.

Thanks,
Rafael

---
 kernel/power/main.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/kernel/power/main.c
===================================================================
--- linux.orig/kernel/power/main.c
+++ linux/kernel/power/main.c
@@ -422,7 +422,7 @@ static ssize_t autosleep_store(struct ko
 	int error;
 
 	if (state == PM_SUSPEND_ON
-	    && !(strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4)))
+	    && strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4))
 		return -EINVAL;
 
 	error = pm_autosleep_set_state(state);

^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 0/2]: Kconfig options for wakelocks limit and gc (was: Re: [RFC][PATCH 8/8] PM / Sleep: Add user space ...)
  2012-04-27 21:34                     ` Rafael J. Wysocki
@ 2012-05-03 19:29                       ` Rafael J. Wysocki
  2012-05-03 19:30                         ` [PATCH 1/2] PM / Sleep: Make the limit of user space wakeup sources configurable Rafael J. Wysocki
                                           ` (2 more replies)
  0 siblings, 3 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-05-03 19:29 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: NeilBrown, John Stultz, Linux PM list, LKML, Magnus Damm,
	markgross, Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Friday, April 27, 2012, Rafael J. Wysocki wrote:
> On Friday, April 27, 2012, Arve Hjønnevåg wrote:
> > 2012/4/27 Rafael J. Wysocki <rjw@sisk.pl>:
> > > On Friday, April 27, 2012, Arve Hjønnevåg wrote:
> > >> 2012/4/26 Rafael J. Wysocki <rjw@sisk.pl>:
> > >> ...
> > >> > ---
> > >> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > >> > Subject: PM / Sleep: Add user space interface for manipulating wakeup sources, v3
> > >> >
> > >> > Android allows user space to manipulate wakelocks using two
> > >> > sysfs file located in /sys/power/, wake_lock and wake_unlock.
> > >> > Writing a wakelock name and optionally a timeout to the wake_lock
> > >> > file causes the wakelock whose name was written to be acquired (it
> > >> > is created before is necessary), optionally with the given timeout.
> > >> > Writing the name of a wakelock to wake_unlock causes that wakelock
> > >> > to be released.
> > >> >
> > >> > Implement an analogous interface for user space using wakeup sources.
> > >> > Add the /sys/power/wake_lock and /sys/power/wake_unlock files
> > >> > allowing user space to create, activate and deactivate wakeup
> > >> > sources, such that writing a name and optionally a timeout to
> > >> > wake_lock causes the wakeup source of that name to be activated,
> > >> > optionally with the given timeout.  If that wakeup source doesn't
> > >> > exist, it will be created and then activated.  Writing a name to
> > >> > wake_unlock causes the wakeup source of that name, if there is one,
> > >> > to be deactivated.  Wakeup sources created with the help of
> > >> > wake_lock that haven't been used for more than 5 minutes are garbage
> > >> > collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
> > >>
> > >> I think it would be better if the garbage collection and limit was
> > >> configurable and optional. I would probably turn both features off
> > >> since I do not want to chase down bugs because a wakelock was ignored,
> > >> and I think the garbage collection will erase stats that we care
> > >> about.
> > >
> > > OK, but would you mind if I added the configurability as a separate incremental
> > > patch?
> > >
> > 
> > That is fine with me.
> 
> Cool, thanks!

The following two patches add the configuration options for the limit and
garbage collector.  Please let me know if they are OK with you.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 1/2] PM / Sleep: Make the limit of user space wakeup sources configurable
  2012-05-03 19:29                       ` [PATCH 0/2]: Kconfig options for wakelocks limit and gc (was: Re: [RFC][PATCH 8/8] PM / Sleep: Add user space ...) Rafael J. Wysocki
@ 2012-05-03 19:30                         ` Rafael J. Wysocki
  2012-05-03 19:34                         ` [PATCH 2/2] PM / Sleep: User space wakeup sources garbage collector Kconfig option Rafael J. Wysocki
  2012-05-03 22:14                         ` [PATCH 0/2]: Kconfig options for wakelocks limit and gc (was: Re: [RFC][PATCH 8/8] PM / Sleep: Add user space ...) Arve Hjønnevåg
  2 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-05-03 19:30 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: NeilBrown, John Stultz, Linux PM list, LKML, Magnus Damm,
	markgross, Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

From: Rafael J. Wysocki <rjw@sisk.pl>

Make it possible to configure out the check against the limit of
user space wakeup sources for debugging and default Android builds.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/Kconfig    |    6 ++++++
 kernel/power/wakelock.c |   31 ++++++++++++++++++++++++++-----
 2 files changed, 32 insertions(+), 5 deletions(-)

Index: linux/kernel/power/Kconfig
===================================================================
--- linux.orig/kernel/power/Kconfig
+++ linux/kernel/power/Kconfig
@@ -119,6 +119,12 @@ config PM_WAKELOCKS
 	Allow user space to create, activate and deactivate wakeup source
 	objects with the help of a sysfs-based interface.
 
+config PM_WAKELOCKS_LIMIT
+	int "Maximum number of user space wakeup sources (0 = no limit)"
+	range 0 100000
+	default 100
+	depends on PM_WAKELOCKS
+
 config PM_RUNTIME
 	bool "Run-time PM core functionality"
 	depends on !IA64_HP_SIM
Index: linux/kernel/power/wakelock.c
===================================================================
--- linux.orig/kernel/power/wakelock.c
+++ linux/kernel/power/wakelock.c
@@ -17,7 +17,6 @@
 #include <linux/rbtree.h>
 #include <linux/slab.h>
 
-#define WL_NUMBER_LIMIT	100
 #define WL_GC_COUNT_MAX	100
 #define WL_GC_TIME_SEC	300
 
@@ -32,7 +31,6 @@ struct wakelock {
 
 static struct rb_root wakelocks_tree = RB_ROOT;
 static LIST_HEAD(wakelocks_lru_list);
-static unsigned int number_of_wakelocks;
 static unsigned int wakelocks_gc_count;
 
 ssize_t pm_show_wakelocks(char *buf, bool show_active)
@@ -58,6 +56,29 @@ ssize_t pm_show_wakelocks(char *buf, boo
 	return (str - buf);
 }
 
+#if CONFIG_PM_WAKELOCKS_LIMIT > 0
+static unsigned int number_of_wakelocks;
+
+static inline bool wakelocks_limit_exceeded(void)
+{
+	return number_of_wakelocks > CONFIG_PM_WAKELOCKS_LIMIT;
+}
+
+static inline void increment_wakelocks_number(void)
+{
+	number_of_wakelocks++;
+}
+
+static inline void decrement_wakelocks_number(void)
+{
+	number_of_wakelocks--;
+}
+#else /* CONFIG_PM_WAKELOCKS_LIMIT = 0 */
+static inline bool wakelocks_limit_exceeded(void) { return false; }
+static inline void increment_wakelocks_number(void) {}
+static inline void decrement_wakelocks_number(void) {}
+#endif /* CONFIG_PM_WAKELOCKS_LIMIT */
+
 static struct wakelock *wakelock_lookup_add(const char *name, size_t len,
 					    bool add_if_not_found)
 {
@@ -85,7 +106,7 @@ static struct wakelock *wakelock_lookup_
 	if (!add_if_not_found)
 		return ERR_PTR(-EINVAL);
 
-	if (number_of_wakelocks > WL_NUMBER_LIMIT)
+	if (wakelocks_limit_exceeded())
 		return ERR_PTR(-ENOSPC);
 
 	/* Not found, we have to add a new one. */
@@ -103,7 +124,7 @@ static struct wakelock *wakelock_lookup_
 	rb_link_node(&wl->node, parent, node);
 	rb_insert_color(&wl->node, &wakelocks_tree);
 	list_add(&wl->lru, &wakelocks_lru_list);
-	number_of_wakelocks++;
+	increment_wakelocks_number();
 	return wl;
 }
 
@@ -175,7 +196,7 @@ static void wakelocks_gc(void)
 			list_del(&wl->lru);
 			kfree(wl->name);
 			kfree(wl);
-			number_of_wakelocks--;
+			decrement_wakelocks_number();
 		}
 	}
 	wakelocks_gc_count = 0;


^ permalink raw reply	[flat|nested] 129+ messages in thread

* [PATCH 2/2] PM / Sleep: User space wakeup sources garbage collector Kconfig option
  2012-05-03 19:29                       ` [PATCH 0/2]: Kconfig options for wakelocks limit and gc (was: Re: [RFC][PATCH 8/8] PM / Sleep: Add user space ...) Rafael J. Wysocki
  2012-05-03 19:30                         ` [PATCH 1/2] PM / Sleep: Make the limit of user space wakeup sources configurable Rafael J. Wysocki
@ 2012-05-03 19:34                         ` Rafael J. Wysocki
  2012-05-03 22:14                         ` [PATCH 0/2]: Kconfig options for wakelocks limit and gc (was: Re: [RFC][PATCH 8/8] PM / Sleep: Add user space ...) Arve Hjønnevåg
  2 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-05-03 19:34 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: NeilBrown, John Stultz, Linux PM list, LKML, Magnus Damm,
	markgross, Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

From: Rafael J. Wysocki <rjw@sisk.pl>

Make it possible to configure out the user space wakeup sources
garbage collector for debugging and default Android builds.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/Kconfig    |    5 ++
 kernel/power/wakelock.c |  101 +++++++++++++++++++++++++++++-------------------
 2 files changed, 67 insertions(+), 39 deletions(-)

Index: linux/kernel/power/Kconfig
===================================================================
--- linux.orig/kernel/power/Kconfig
+++ linux/kernel/power/Kconfig
@@ -125,6 +125,11 @@ config PM_WAKELOCKS_LIMIT
 	default 100
 	depends on PM_WAKELOCKS
 
+config PM_WAKELOCKS_GC
+	bool "Garbage collector for user space wakeup sources"
+	depends on PM_WAKELOCKS
+	default y
+
 config PM_RUNTIME
 	bool "Run-time PM core functionality"
 	depends on !IA64_HP_SIM
Index: linux/kernel/power/wakelock.c
===================================================================
--- linux.orig/kernel/power/wakelock.c
+++ linux/kernel/power/wakelock.c
@@ -17,21 +17,18 @@
 #include <linux/rbtree.h>
 #include <linux/slab.h>
 
-#define WL_GC_COUNT_MAX	100
-#define WL_GC_TIME_SEC	300
-
 static DEFINE_MUTEX(wakelocks_lock);
 
 struct wakelock {
 	char			*name;
 	struct rb_node		node;
 	struct wakeup_source	ws;
+#ifdef CONFIG_PM_WAKELOCKS_GC
 	struct list_head	lru;
+#endif
 };
 
 static struct rb_root wakelocks_tree = RB_ROOT;
-static LIST_HEAD(wakelocks_lru_list);
-static unsigned int wakelocks_gc_count;
 
 ssize_t pm_show_wakelocks(char *buf, bool show_active)
 {
@@ -79,6 +76,61 @@ static inline void increment_wakelocks_n
 static inline void decrement_wakelocks_number(void) {}
 #endif /* CONFIG_PM_WAKELOCKS_LIMIT */
 
+#ifdef CONFIG_PM_WAKELOCKS_GC
+#define WL_GC_COUNT_MAX	100
+#define WL_GC_TIME_SEC	300
+
+static LIST_HEAD(wakelocks_lru_list);
+static unsigned int wakelocks_gc_count;
+
+static inline void wakelocks_lru_add(struct wakelock *wl)
+{
+	list_add(&wl->lru, &wakelocks_lru_list);
+}
+
+static inline void wakelocks_lru_most_recent(struct wakelock *wl)
+{
+	list_move(&wl->lru, &wakelocks_lru_list);
+}
+
+static void wakelocks_gc(void)
+{
+	struct wakelock *wl, *aux;
+	ktime_t now;
+
+	if (++wakelocks_gc_count <= WL_GC_COUNT_MAX)
+		return;
+
+	now = ktime_get();
+	list_for_each_entry_safe_reverse(wl, aux, &wakelocks_lru_list, lru) {
+		u64 idle_time_ns;
+		bool active;
+
+		spin_lock_irq(&wl->ws.lock);
+		idle_time_ns = ktime_to_ns(ktime_sub(now, wl->ws.last_time));
+		active = wl->ws.active;
+		spin_unlock_irq(&wl->ws.lock);
+
+		if (idle_time_ns < ((u64)WL_GC_TIME_SEC * NSEC_PER_SEC))
+			break;
+
+		if (!active) {
+			wakeup_source_remove(&wl->ws);
+			rb_erase(&wl->node, &wakelocks_tree);
+			list_del(&wl->lru);
+			kfree(wl->name);
+			kfree(wl);
+			decrement_wakelocks_number();
+		}
+	}
+	wakelocks_gc_count = 0;
+}
+#else /* !CONFIG_PM_WAKELOCKS_GC */
+static inline void wakelocks_lru_add(struct wakelock *wl) {}
+static inline void wakelocks_lru_most_recent(struct wakelock *wl) {}
+static inline void wakelocks_gc(void) {}
+#endif /* !CONFIG_PM_WAKELOCKS_GC */
+
 static struct wakelock *wakelock_lookup_add(const char *name, size_t len,
 					    bool add_if_not_found)
 {
@@ -123,7 +175,7 @@ static struct wakelock *wakelock_lookup_
 	wakeup_source_add(&wl->ws);
 	rb_link_node(&wl->node, parent, node);
 	rb_insert_color(&wl->node, &wakelocks_tree);
-	list_add(&wl->lru, &wakelocks_lru_list);
+	wakelocks_lru_add(wl);
 	increment_wakelocks_number();
 	return wl;
 }
@@ -166,42 +218,13 @@ int pm_wake_lock(const char *buf)
 		__pm_stay_awake(&wl->ws);
 	}
 
-	list_move(&wl->lru, &wakelocks_lru_list);
+	wakelocks_lru_most_recent(wl);
 
  out:
 	mutex_unlock(&wakelocks_lock);
 	return ret;
 }
 
-static void wakelocks_gc(void)
-{
-	struct wakelock *wl, *aux;
-	ktime_t now = ktime_get();
-
-	list_for_each_entry_safe_reverse(wl, aux, &wakelocks_lru_list, lru) {
-		u64 idle_time_ns;
-		bool active;
-
-		spin_lock_irq(&wl->ws.lock);
-		idle_time_ns = ktime_to_ns(ktime_sub(now, wl->ws.last_time));
-		active = wl->ws.active;
-		spin_unlock_irq(&wl->ws.lock);
-
-		if (idle_time_ns < ((u64)WL_GC_TIME_SEC * NSEC_PER_SEC))
-			break;
-
-		if (!active) {
-			wakeup_source_remove(&wl->ws);
-			rb_erase(&wl->node, &wakelocks_tree);
-			list_del(&wl->lru);
-			kfree(wl->name);
-			kfree(wl);
-			decrement_wakelocks_number();
-		}
-	}
-	wakelocks_gc_count = 0;
-}
-
 int pm_wake_unlock(const char *buf)
 {
 	struct wakelock *wl;
@@ -226,9 +249,9 @@ int pm_wake_unlock(const char *buf)
 		goto out;
 	}
 	__pm_relax(&wl->ws);
-	list_move(&wl->lru, &wakelocks_lru_list);
-	if (++wakelocks_gc_count > WL_GC_COUNT_MAX)
-		wakelocks_gc();
+
+	wakelocks_lru_most_recent(wl);
+	wakelocks_gc();
 
  out:
 	mutex_unlock(&wakelocks_lock);


^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
  2012-05-03 13:28             ` Rafael J. Wysocki
@ 2012-05-03 21:27               ` Arve Hjønnevåg
  2012-05-03 22:20                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-05-03 21:27 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: NeilBrown, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Thu, May 3, 2012 at 6:28 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Thursday, May 03, 2012, Arve Hjønnevåg wrote:
>> On Thu, Apr 26, 2012 at 2:52 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> ...
>> > From: Rafael J. Wysocki <rjw@sisk.pl>
>> > Subject: PM / Sleep: Implement opportunistic sleep, v2
>> >
>> > Introduce a mechanism by which the kernel can trigger global
>> > transitions to a sleep state chosen by user space if there are no
>> > active wakeup sources.
>> >
>> > It consists of a new sysfs attribute, /sys/power/autosleep, that
>> > can be written one of the strings returned by reads from
>> > /sys/power/state, an ordered workqueue and a work item carrying out
>> > the "suspend" operations.  If a string representing the system's
>> > sleep state is written to /sys/power/autosleep, the work item
>> > triggering transitions to that state is queued up and it requeues
>> > itself after every execution until user space writes "off" to
>> > /sys/power/autosleep.
>> >
>>
>> This does not work. Writing something other than "off" disabled auto
>> suspend for me.
>
> My bad, sorry about that.
>
>> ...
>> > +static ssize_t autosleep_store(struct kobject *kobj,
>> > +                              struct kobj_attribute *attr,
>> > +                              const char *buf, size_t n)
>> > +{
>> > +       suspend_state_t state = decode_state(buf, n);
>> > +       int error;
>> > +
>> > +       if (state == PM_SUSPEND_ON
>> > +           && !(strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4)))
>> > +               return -EINVAL;
>>
>> Did you mean:
>>       if (state == PM_SUSPEND_ON
>>           && strcmp(buf, "off") && strcmp(buf, "off\n"))
>>               return -EINVAL;
>
>
> Yes, I did.
>
> I'll add the following as an incremental patch on top of the series.
>
> Thanks,
> Rafael
>
> ---
>  kernel/power/main.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux/kernel/power/main.c
> ===================================================================
> --- linux.orig/kernel/power/main.c
> +++ linux/kernel/power/main.c
> @@ -422,7 +422,7 @@ static ssize_t autosleep_store(struct ko
>        int error;
>
>        if (state == PM_SUSPEND_ON
> -           && !(strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4)))
> +           && strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4))
>                return -EINVAL;
>
>        error = pm_autosleep_set_state(state);

You still use strncmp here, so anything that starts with "off" is
allowed (and the second strncmp is redundant).

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 0/2]: Kconfig options for wakelocks limit and gc (was: Re: [RFC][PATCH 8/8] PM / Sleep: Add user space ...)
  2012-05-03 19:29                       ` [PATCH 0/2]: Kconfig options for wakelocks limit and gc (was: Re: [RFC][PATCH 8/8] PM / Sleep: Add user space ...) Rafael J. Wysocki
  2012-05-03 19:30                         ` [PATCH 1/2] PM / Sleep: Make the limit of user space wakeup sources configurable Rafael J. Wysocki
  2012-05-03 19:34                         ` [PATCH 2/2] PM / Sleep: User space wakeup sources garbage collector Kconfig option Rafael J. Wysocki
@ 2012-05-03 22:14                         ` Arve Hjønnevåg
  2012-05-03 22:20                           ` Rafael J. Wysocki
  2 siblings, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-05-03 22:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: NeilBrown, John Stultz, Linux PM list, LKML, Magnus Damm,
	markgross, Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Thu, May 3, 2012 at 12:29 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Friday, April 27, 2012, Rafael J. Wysocki wrote:
>> On Friday, April 27, 2012, Arve Hjønnevåg wrote:
>> > 2012/4/27 Rafael J. Wysocki <rjw@sisk.pl>:
>> > > On Friday, April 27, 2012, Arve Hjønnevåg wrote:
>> > >> 2012/4/26 Rafael J. Wysocki <rjw@sisk.pl>:
>> > >> ...
>> > >> > ---
>> > >> > From: Rafael J. Wysocki <rjw@sisk.pl>
>> > >> > Subject: PM / Sleep: Add user space interface for manipulating wakeup sources, v3
>> > >> >
>> > >> > Android allows user space to manipulate wakelocks using two
>> > >> > sysfs file located in /sys/power/, wake_lock and wake_unlock.
>> > >> > Writing a wakelock name and optionally a timeout to the wake_lock
>> > >> > file causes the wakelock whose name was written to be acquired (it
>> > >> > is created before is necessary), optionally with the given timeout.
>> > >> > Writing the name of a wakelock to wake_unlock causes that wakelock
>> > >> > to be released.
>> > >> >
>> > >> > Implement an analogous interface for user space using wakeup sources.
>> > >> > Add the /sys/power/wake_lock and /sys/power/wake_unlock files
>> > >> > allowing user space to create, activate and deactivate wakeup
>> > >> > sources, such that writing a name and optionally a timeout to
>> > >> > wake_lock causes the wakeup source of that name to be activated,
>> > >> > optionally with the given timeout.  If that wakeup source doesn't
>> > >> > exist, it will be created and then activated.  Writing a name to
>> > >> > wake_unlock causes the wakeup source of that name, if there is one,
>> > >> > to be deactivated.  Wakeup sources created with the help of
>> > >> > wake_lock that haven't been used for more than 5 minutes are garbage
>> > >> > collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
>> > >>
>> > >> I think it would be better if the garbage collection and limit was
>> > >> configurable and optional. I would probably turn both features off
>> > >> since I do not want to chase down bugs because a wakelock was ignored,
>> > >> and I think the garbage collection will erase stats that we care
>> > >> about.
>> > >
>> > > OK, but would you mind if I added the configurability as a separate incremental
>> > > patch?
>> > >
>> >
>> > That is fine with me.
>>
>> Cool, thanks!
>
> The following two patches add the configuration options for the limit and
> garbage collector.  Please let me know if they are OK with you.
>

Yes.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
  2012-05-03 22:20                 ` Rafael J. Wysocki
@ 2012-05-03 22:16                   ` Arve Hjønnevåg
  2012-05-03 22:24                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-05-03 22:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: NeilBrown, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Thu, May 3, 2012 at 3:20 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Thursday, May 03, 2012, Arve Hjønnevåg wrote:
>> On Thu, May 3, 2012 at 6:28 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> > On Thursday, May 03, 2012, Arve Hjønnevåg wrote:
>> >> On Thu, Apr 26, 2012 at 2:52 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> >> ...
>> >> > From: Rafael J. Wysocki <rjw@sisk.pl>
>> >> > Subject: PM / Sleep: Implement opportunistic sleep, v2
>> >> >
>> >> > Introduce a mechanism by which the kernel can trigger global
>> >> > transitions to a sleep state chosen by user space if there are no
>> >> > active wakeup sources.
>> >> >
>> >> > It consists of a new sysfs attribute, /sys/power/autosleep, that
>> >> > can be written one of the strings returned by reads from
>> >> > /sys/power/state, an ordered workqueue and a work item carrying out
>> >> > the "suspend" operations.  If a string representing the system's
>> >> > sleep state is written to /sys/power/autosleep, the work item
>> >> > triggering transitions to that state is queued up and it requeues
>> >> > itself after every execution until user space writes "off" to
>> >> > /sys/power/autosleep.
>> >> >
>> >>
>> >> This does not work. Writing something other than "off" disabled auto
>> >> suspend for me.
>> >
>> > My bad, sorry about that.
>> >
>> >> ...
>> >> > +static ssize_t autosleep_store(struct kobject *kobj,
>> >> > +                              struct kobj_attribute *attr,
>> >> > +                              const char *buf, size_t n)
>> >> > +{
>> >> > +       suspend_state_t state = decode_state(buf, n);
>> >> > +       int error;
>> >> > +
>> >> > +       if (state == PM_SUSPEND_ON
>> >> > +           && !(strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4)))
>> >> > +               return -EINVAL;
>> >>
>> >> Did you mean:
>> >>       if (state == PM_SUSPEND_ON
>> >>           && strcmp(buf, "off") && strcmp(buf, "off\n"))
>> >>               return -EINVAL;
>> >
>> >
>> > Yes, I did.
>> >
>> > I'll add the following as an incremental patch on top of the series.
>> >
>> > Thanks,
>> > Rafael
>> >
>> > ---
>> >  kernel/power/main.c |    2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > Index: linux/kernel/power/main.c
>> > ===================================================================
>> > --- linux.orig/kernel/power/main.c
>> > +++ linux/kernel/power/main.c
>> > @@ -422,7 +422,7 @@ static ssize_t autosleep_store(struct ko
>> >        int error;
>> >
>> >        if (state == PM_SUSPEND_ON
>> > -           && !(strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4)))
>> > +           && strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4))
>> >                return -EINVAL;
>> >
>> >        error = pm_autosleep_set_state(state);
>>
>> You still use strncmp here, so anything that starts with "off" is
>> allowed (and the second strncmp is redundant).
>
> Good point.  So I'm going to add the patch below after all.
> OK to add your sign-off to it?
>

Yes.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
  2012-05-03 21:27               ` Arve Hjønnevåg
@ 2012-05-03 22:20                 ` Rafael J. Wysocki
  2012-05-03 22:16                   ` Arve Hjønnevåg
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-05-03 22:20 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: NeilBrown, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Thursday, May 03, 2012, Arve Hjønnevåg wrote:
> On Thu, May 3, 2012 at 6:28 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Thursday, May 03, 2012, Arve Hjønnevåg wrote:
> >> On Thu, Apr 26, 2012 at 2:52 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> ...
> >> > From: Rafael J. Wysocki <rjw@sisk.pl>
> >> > Subject: PM / Sleep: Implement opportunistic sleep, v2
> >> >
> >> > Introduce a mechanism by which the kernel can trigger global
> >> > transitions to a sleep state chosen by user space if there are no
> >> > active wakeup sources.
> >> >
> >> > It consists of a new sysfs attribute, /sys/power/autosleep, that
> >> > can be written one of the strings returned by reads from
> >> > /sys/power/state, an ordered workqueue and a work item carrying out
> >> > the "suspend" operations.  If a string representing the system's
> >> > sleep state is written to /sys/power/autosleep, the work item
> >> > triggering transitions to that state is queued up and it requeues
> >> > itself after every execution until user space writes "off" to
> >> > /sys/power/autosleep.
> >> >
> >>
> >> This does not work. Writing something other than "off" disabled auto
> >> suspend for me.
> >
> > My bad, sorry about that.
> >
> >> ...
> >> > +static ssize_t autosleep_store(struct kobject *kobj,
> >> > +                              struct kobj_attribute *attr,
> >> > +                              const char *buf, size_t n)
> >> > +{
> >> > +       suspend_state_t state = decode_state(buf, n);
> >> > +       int error;
> >> > +
> >> > +       if (state == PM_SUSPEND_ON
> >> > +           && !(strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4)))
> >> > +               return -EINVAL;
> >>
> >> Did you mean:
> >>       if (state == PM_SUSPEND_ON
> >>           && strcmp(buf, "off") && strcmp(buf, "off\n"))
> >>               return -EINVAL;
> >
> >
> > Yes, I did.
> >
> > I'll add the following as an incremental patch on top of the series.
> >
> > Thanks,
> > Rafael
> >
> > ---
> >  kernel/power/main.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > Index: linux/kernel/power/main.c
> > ===================================================================
> > --- linux.orig/kernel/power/main.c
> > +++ linux/kernel/power/main.c
> > @@ -422,7 +422,7 @@ static ssize_t autosleep_store(struct ko
> >        int error;
> >
> >        if (state == PM_SUSPEND_ON
> > -           && !(strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4)))
> > +           && strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4))
> >                return -EINVAL;
> >
> >        error = pm_autosleep_set_state(state);
> 
> You still use strncmp here, so anything that starts with "off" is
> allowed (and the second strncmp is redundant).

Good point.  So I'm going to add the patch below after all.
OK to add your sign-off to it?

Rafael


---
 kernel/power/main.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/kernel/power/main.c
===================================================================
--- linux.orig/kernel/power/main.c
+++ linux/kernel/power/main.c
@@ -422,7 +422,7 @@ static ssize_t autosleep_store(struct ko
 	int error;
 
 	if (state == PM_SUSPEND_ON
-	    && !(strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4)))
+	    && strcmp(buf, "off") && strcmp(buf, "off\n"))
 		return -EINVAL;
 
 	error = pm_autosleep_set_state(state);

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH 0/2]: Kconfig options for wakelocks limit and gc (was: Re: [RFC][PATCH 8/8] PM / Sleep: Add user space ...)
  2012-05-03 22:14                         ` [PATCH 0/2]: Kconfig options for wakelocks limit and gc (was: Re: [RFC][PATCH 8/8] PM / Sleep: Add user space ...) Arve Hjønnevåg
@ 2012-05-03 22:20                           ` Rafael J. Wysocki
  0 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-05-03 22:20 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: NeilBrown, John Stultz, Linux PM list, LKML, Magnus Damm,
	markgross, Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Friday, May 04, 2012, Arve Hjønnevåg wrote:
> On Thu, May 3, 2012 at 12:29 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Friday, April 27, 2012, Rafael J. Wysocki wrote:
> >> On Friday, April 27, 2012, Arve Hjønnevåg wrote:
> >> > 2012/4/27 Rafael J. Wysocki <rjw@sisk.pl>:
> >> > > On Friday, April 27, 2012, Arve Hjønnevåg wrote:
> >> > >> 2012/4/26 Rafael J. Wysocki <rjw@sisk.pl>:
> >> > >> ...
> >> > >> > ---
> >> > >> > From: Rafael J. Wysocki <rjw@sisk.pl>
> >> > >> > Subject: PM / Sleep: Add user space interface for manipulating wakeup sources, v3
> >> > >> >
> >> > >> > Android allows user space to manipulate wakelocks using two
> >> > >> > sysfs file located in /sys/power/, wake_lock and wake_unlock.
> >> > >> > Writing a wakelock name and optionally a timeout to the wake_lock
> >> > >> > file causes the wakelock whose name was written to be acquired (it
> >> > >> > is created before is necessary), optionally with the given timeout.
> >> > >> > Writing the name of a wakelock to wake_unlock causes that wakelock
> >> > >> > to be released.
> >> > >> >
> >> > >> > Implement an analogous interface for user space using wakeup sources.
> >> > >> > Add the /sys/power/wake_lock and /sys/power/wake_unlock files
> >> > >> > allowing user space to create, activate and deactivate wakeup
> >> > >> > sources, such that writing a name and optionally a timeout to
> >> > >> > wake_lock causes the wakeup source of that name to be activated,
> >> > >> > optionally with the given timeout.  If that wakeup source doesn't
> >> > >> > exist, it will be created and then activated.  Writing a name to
> >> > >> > wake_unlock causes the wakeup source of that name, if there is one,
> >> > >> > to be deactivated.  Wakeup sources created with the help of
> >> > >> > wake_lock that haven't been used for more than 5 minutes are garbage
> >> > >> > collected and destroyed.  Moreover, there can be only WL_NUMBER_LIMIT
> >> > >>
> >> > >> I think it would be better if the garbage collection and limit was
> >> > >> configurable and optional. I would probably turn both features off
> >> > >> since I do not want to chase down bugs because a wakelock was ignored,
> >> > >> and I think the garbage collection will erase stats that we care
> >> > >> about.
> >> > >
> >> > > OK, but would you mind if I added the configurability as a separate incremental
> >> > > patch?
> >> > >
> >> >
> >> > That is fine with me.
> >>
> >> Cool, thanks!
> >
> > The following two patches add the configuration options for the limit and
> > garbage collector.  Please let me know if they are OK with you.
> >
> 
> Yes.

Good, thanks!

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep
  2012-05-03 22:16                   ` Arve Hjønnevåg
@ 2012-05-03 22:24                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-05-03 22:24 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: NeilBrown, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat

On Friday, May 04, 2012, Arve Hjønnevåg wrote:
> On Thu, May 3, 2012 at 3:20 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Thursday, May 03, 2012, Arve Hjønnevåg wrote:
> >> On Thu, May 3, 2012 at 6:28 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> > On Thursday, May 03, 2012, Arve Hjønnevåg wrote:
> >> >> On Thu, Apr 26, 2012 at 2:52 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> >> ...
> >> >> > From: Rafael J. Wysocki <rjw@sisk.pl>
> >> >> > Subject: PM / Sleep: Implement opportunistic sleep, v2
> >> >> >
> >> >> > Introduce a mechanism by which the kernel can trigger global
> >> >> > transitions to a sleep state chosen by user space if there are no
> >> >> > active wakeup sources.
> >> >> >
> >> >> > It consists of a new sysfs attribute, /sys/power/autosleep, that
> >> >> > can be written one of the strings returned by reads from
> >> >> > /sys/power/state, an ordered workqueue and a work item carrying out
> >> >> > the "suspend" operations.  If a string representing the system's
> >> >> > sleep state is written to /sys/power/autosleep, the work item
> >> >> > triggering transitions to that state is queued up and it requeues
> >> >> > itself after every execution until user space writes "off" to
> >> >> > /sys/power/autosleep.
> >> >> >
> >> >>
> >> >> This does not work. Writing something other than "off" disabled auto
> >> >> suspend for me.
> >> >
> >> > My bad, sorry about that.
> >> >
> >> >> ...
> >> >> > +static ssize_t autosleep_store(struct kobject *kobj,
> >> >> > +                              struct kobj_attribute *attr,
> >> >> > +                              const char *buf, size_t n)
> >> >> > +{
> >> >> > +       suspend_state_t state = decode_state(buf, n);
> >> >> > +       int error;
> >> >> > +
> >> >> > +       if (state == PM_SUSPEND_ON
> >> >> > +           && !(strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4)))
> >> >> > +               return -EINVAL;
> >> >>
> >> >> Did you mean:
> >> >>       if (state == PM_SUSPEND_ON
> >> >>           && strcmp(buf, "off") && strcmp(buf, "off\n"))
> >> >>               return -EINVAL;
> >> >
> >> >
> >> > Yes, I did.
> >> >
> >> > I'll add the following as an incremental patch on top of the series.
> >> >
> >> > Thanks,
> >> > Rafael
> >> >
> >> > ---
> >> >  kernel/power/main.c |    2 +-
> >> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >> >
> >> > Index: linux/kernel/power/main.c
> >> > ===================================================================
> >> > --- linux.orig/kernel/power/main.c
> >> > +++ linux/kernel/power/main.c
> >> > @@ -422,7 +422,7 @@ static ssize_t autosleep_store(struct ko
> >> >        int error;
> >> >
> >> >        if (state == PM_SUSPEND_ON
> >> > -           && !(strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4)))
> >> > +           && strncmp(buf, "off", 3) && strncmp(buf, "off\n", 4))
> >> >                return -EINVAL;
> >> >
> >> >        error = pm_autosleep_set_state(state);
> >>
> >> You still use strncmp here, so anything that starts with "off" is
> >> allowed (and the second strncmp is redundant).
> >
> > Good point.  So I'm going to add the patch below after all.
> > OK to add your sign-off to it?
> >
> 
> Yes.

OK, done.

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-05-01  5:33                 ` [PATCH] " Arve Hjønnevåg
  2012-05-01  6:28                   ` NeilBrown
@ 2012-07-16  6:38                   ` Michael Kerrisk
  2012-07-16 11:00                     ` Rafael J. Wysocki
  1 sibling, 1 reply; 129+ messages in thread
From: Michael Kerrisk @ 2012-07-16  6:38 UTC (permalink / raw)
  To: Rafael J. Wysocki, Arve Hjønnevåg
  Cc: NeilBrown, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat, Michael Kerrisk,
	mtk

Arve, Rafael,

On Tue, May 1, 2012 at 7:33 AM, Arve Hjønnevåg <arve@android.com> wrote:
> When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
> wakeup_source will be active to prevent suspend. This can be used to
> handle wakeup events from a driver that support poll, e.g. input, if
> that driver wakes up the waitqueue passed to epoll before allowing
> suspend.

It's late it the -rc series, but it strikes me that CAP_EPOLLWAKEUP is
a poor name for the capability that governs the use of EPOLLWAKEUP.
While on the one hand some capabilities are overloaded
(https://lwn.net/Articles/486306/), on the other hand we should avoid
adding individual capabilities for each new API feature (otherwise
capabilities become administratively unwieldy).

This capability is not really about "EPOLL". It's about the ability to
block system suspend. Therefore, IMO, a better name would be something
like: CAP_BLOCK_SUSPEND. This name is better because there might be
some other API feature that is later added that also has the effect of
preventing system suspends, and we could reasonably govern that
feature with the same capability.

Does that seem sensible to you? I can send a patch for the name change.

Thanks,

Michael



> Signed-off-by: Arve Hjønnevåg <arve@android.com>
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
>  fs/eventpoll.c             |   90 ++++++++++++++++++++++++++++++++++++++++++-
>  include/linux/capability.h |    5 ++-
>  include/linux/eventpoll.h  |   12 ++++++
>  3 files changed, 103 insertions(+), 4 deletions(-)
>
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 739b098..1abed50 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -33,6 +33,7 @@
>  #include <linux/bitops.h>
>  #include <linux/mutex.h>
>  #include <linux/anon_inodes.h>
> +#include <linux/device.h>
>  #include <asm/uaccess.h>
>  #include <asm/io.h>
>  #include <asm/mman.h>
> @@ -87,7 +88,7 @@
>   */
>
>  /* Epoll private bits inside the event mask */
> -#define EP_PRIVATE_BITS (EPOLLONESHOT | EPOLLET)
> +#define EP_PRIVATE_BITS (EPOLLWAKEUP | EPOLLONESHOT | EPOLLET)
>
>  /* Maximum number of nesting allowed inside epoll sets */
>  #define EP_MAX_NESTS 4
> @@ -154,6 +155,9 @@ struct epitem {
>         /* List header used to link this item to the "struct file" items list */
>         struct list_head fllink;
>
> +       /* wakeup_source used when EPOLLWAKEUP is set */
> +       struct wakeup_source *ws;
> +
>         /* The structure that describe the interested events and the source fd */
>         struct epoll_event event;
>  };
> @@ -194,6 +198,9 @@ struct eventpoll {
>          */
>         struct epitem *ovflist;
>
> +       /* wakeup_source used when ep_scan_ready_list is running */
> +       struct wakeup_source *ws;
> +
>         /* The user that created the eventpoll descriptor */
>         struct user_struct *user;
>
> @@ -588,8 +595,10 @@ static int ep_scan_ready_list(struct eventpoll *ep,
>                  * queued into ->ovflist but the "txlist" might already
>                  * contain them, and the list_splice() below takes care of them.
>                  */
> -               if (!ep_is_linked(&epi->rdllink))
> +               if (!ep_is_linked(&epi->rdllink)) {
>                         list_add_tail(&epi->rdllink, &ep->rdllist);
> +                       __pm_stay_awake(epi->ws);
> +               }
>         }
>         /*
>          * We need to set back ep->ovflist to EP_UNACTIVE_PTR, so that after
> @@ -602,6 +611,7 @@ static int ep_scan_ready_list(struct eventpoll *ep,
>          * Quickly re-inject items left on "txlist".
>          */
>         list_splice(&txlist, &ep->rdllist);
> +       __pm_relax(ep->ws);
>
>         if (!list_empty(&ep->rdllist)) {
>                 /*
> @@ -656,6 +666,8 @@ static int ep_remove(struct eventpoll *ep, struct epitem *epi)
>                 list_del_init(&epi->rdllink);
>         spin_unlock_irqrestore(&ep->lock, flags);
>
> +       wakeup_source_unregister(epi->ws);
> +
>         /* At this point it is safe to free the eventpoll item */
>         kmem_cache_free(epi_cache, epi);
>
> @@ -706,6 +718,7 @@ static void ep_free(struct eventpoll *ep)
>         mutex_unlock(&epmutex);
>         mutex_destroy(&ep->mtx);
>         free_uid(ep->user);
> +       wakeup_source_unregister(ep->ws);
>         kfree(ep);
>  }
>
> @@ -737,6 +750,7 @@ static int ep_read_events_proc(struct eventpoll *ep, struct list_head *head,
>                          * callback, but it's not actually ready, as far as
>                          * caller requested events goes. We can remove it here.
>                          */
> +                       __pm_relax(epi->ws);
>                         list_del_init(&epi->rdllink);
>                 }
>         }
> @@ -927,13 +941,23 @@ static int ep_poll_callback(wait_queue_t *wait, unsigned mode, int sync, void *k
>                 if (epi->next == EP_UNACTIVE_PTR) {
>                         epi->next = ep->ovflist;
>                         ep->ovflist = epi;
> +                       if (epi->ws) {
> +                               /*
> +                                * Activate ep->ws since epi->ws may get
> +                                * deactivated at any time.
> +                                */
> +                               __pm_stay_awake(ep->ws);
> +                       }
> +
>                 }
>                 goto out_unlock;
>         }
>
>         /* If this file is already in the ready list we exit soon */
> -       if (!ep_is_linked(&epi->rdllink))
> +       if (!ep_is_linked(&epi->rdllink)) {
>                 list_add_tail(&epi->rdllink, &ep->rdllist);
> +               __pm_stay_awake(epi->ws);
> +       }
>
>         /*
>          * Wake up ( if active ) both the eventpoll wait list and the ->poll()
> @@ -1091,6 +1115,30 @@ static int reverse_path_check(void)
>         return error;
>  }
>
> +static int ep_create_wakeup_source(struct epitem *epi)
> +{
> +       const char *name;
> +
> +       if (!epi->ep->ws) {
> +               epi->ep->ws = wakeup_source_register("eventpoll");
> +               if (!epi->ep->ws)
> +                       return -ENOMEM;
> +       }
> +
> +       name = epi->ffd.file->f_path.dentry->d_name.name;
> +       epi->ws = wakeup_source_register(name);
> +       if (!epi->ws)
> +               return -ENOMEM;
> +
> +       return 0;
> +}
> +
> +static void ep_destroy_wakeup_source(struct epitem *epi)
> +{
> +       wakeup_source_unregister(epi->ws);
> +       epi->ws = NULL;
> +}
> +
>  /*
>   * Must be called with "mtx" held.
>   */
> @@ -1118,6 +1166,13 @@ static int ep_insert(struct eventpoll *ep, struct epoll_event *event,
>         epi->event = *event;
>         epi->nwait = 0;
>         epi->next = EP_UNACTIVE_PTR;
> +       if (epi->event.events & EPOLLWAKEUP) {
> +               error = ep_create_wakeup_source(epi);
> +               if (error)
> +                       goto error_create_wakeup_source;
> +       } else {
> +               epi->ws = NULL;
> +       }
>
>         /* Initialize the poll table using the queue callback */
>         epq.epi = epi;
> @@ -1164,6 +1219,7 @@ static int ep_insert(struct eventpoll *ep, struct epoll_event *event,
>         /* If the file is already "ready" we drop it inside the ready list */
>         if ((revents & event->events) && !ep_is_linked(&epi->rdllink)) {
>                 list_add_tail(&epi->rdllink, &ep->rdllist);
> +               __pm_stay_awake(epi->ws);
>
>                 /* Notify waiting tasks that events are available */
>                 if (waitqueue_active(&ep->wq))
> @@ -1204,6 +1260,9 @@ error_unregister:
>                 list_del_init(&epi->rdllink);
>         spin_unlock_irqrestore(&ep->lock, flags);
>
> +       wakeup_source_unregister(epi->ws);
> +
> +error_create_wakeup_source:
>         kmem_cache_free(epi_cache, epi);
>
>         return error;
> @@ -1229,6 +1288,12 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_even
>         epi->event.events = event->events;
>         pt._key = event->events;
>         epi->event.data = event->data; /* protected by mtx */
> +       if (epi->event.events & EPOLLWAKEUP) {
> +               if (!epi->ws)
> +                       ep_create_wakeup_source(epi);
> +       } else if (epi->ws) {
> +               ep_destroy_wakeup_source(epi);
> +       }
>
>         /*
>          * Get current event bits. We can safely use the file* here because
> @@ -1244,6 +1309,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_even
>                 spin_lock_irq(&ep->lock);
>                 if (!ep_is_linked(&epi->rdllink)) {
>                         list_add_tail(&epi->rdllink, &ep->rdllist);
> +                       __pm_stay_awake(epi->ws);
>
>                         /* Notify waiting tasks that events are available */
>                         if (waitqueue_active(&ep->wq))
> @@ -1282,6 +1348,18 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
>              !list_empty(head) && eventcnt < esed->maxevents;) {
>                 epi = list_first_entry(head, struct epitem, rdllink);
>
> +               /*
> +                * Activate ep->ws before deactivating epi->ws to prevent
> +                * triggering auto-suspend here (in case we reactive epi->ws
> +                * below).
> +                *
> +                * This could be rearranged to delay the deactivation of epi->ws
> +                * instead, but then epi->ws would temporarily be out of sync
> +                * with ep_is_linked().
> +                */
> +               if (epi->ws && epi->ws->active)
> +                       __pm_stay_awake(ep->ws);
> +               __pm_relax(epi->ws);
>                 list_del_init(&epi->rdllink);
>
>                 pt._key = epi->event.events;
> @@ -1298,6 +1376,7 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
>                         if (__put_user(revents, &uevent->events) ||
>                             __put_user(epi->event.data, &uevent->data)) {
>                                 list_add(&epi->rdllink, head);
> +                               __pm_stay_awake(epi->ws);
>                                 return eventcnt ? eventcnt : -EFAULT;
>                         }
>                         eventcnt++;
> @@ -1317,6 +1396,7 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
>                                  * poll callback will queue them in ep->ovflist.
>                                  */
>                                 list_add_tail(&epi->rdllink, &ep->rdllist);
> +                               __pm_stay_awake(epi->ws);
>                         }
>                 }
>         }
> @@ -1629,6 +1709,10 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd,
>         if (!tfile->f_op || !tfile->f_op->poll)
>                 goto error_tgt_fput;
>
> +       /* Check if EPOLLWAKEUP is allowed */
> +       if ((epds.events & EPOLLWAKEUP) && !capable(CAP_EPOLLWAKEUP))
> +               goto error_tgt_fput;
> +
>         /*
>          * We have to check that the file structure underneath the file descriptor
>          * the user passed to us _is_ an eventpoll file. And also we do not permit
> diff --git a/include/linux/capability.h b/include/linux/capability.h
> index 12d52de..222974a 100644
> --- a/include/linux/capability.h
> +++ b/include/linux/capability.h
> @@ -360,8 +360,11 @@ struct cpu_vfs_cap_data {
>
>  #define CAP_WAKE_ALARM            35
>
> +/* Allow preventing automatic system suspends while epoll events are pending */
>
> -#define CAP_LAST_CAP         CAP_WAKE_ALARM
> +#define CAP_EPOLLWAKEUP      36
> +
> +#define CAP_LAST_CAP         CAP_EPOLLWAKEUP
>
>  #define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
>
> diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
> index 657ab55..5b591fb 100644
> --- a/include/linux/eventpoll.h
> +++ b/include/linux/eventpoll.h
> @@ -26,6 +26,18 @@
>  #define EPOLL_CTL_DEL 2
>  #define EPOLL_CTL_MOD 3
>
> +/*
> + * Request the handling of system wakeup events so as to prevent automatic
> + * system suspends from happening while those events are being processed.
> + *
> + * Assuming neither EPOLLET nor EPOLLONESHOT is set, automatic system suspends
> + * will not be re-allowed until epoll_wait is called again after consuming the
> + * wakeup event(s).
> + *
> + * Requires CAP_EPOLLWAKEUP
> + */
> +#define EPOLLWAKEUP (1 << 29)
> +
>  /* Set the One Shot behaviour for the target file descriptor */
>  #define EPOLLONESHOT (1 << 30)
>
> --
> 1.7.7.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



-- 
Michael Kerrisk Linux man-pages maintainer;
http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface", http://blog.man7.org/

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-07-16  6:38                   ` Michael Kerrisk
@ 2012-07-16 11:00                     ` Rafael J. Wysocki
  2012-07-16 22:04                       ` Arve Hjønnevåg
  0 siblings, 1 reply; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-07-16 11:00 UTC (permalink / raw)
  To: Michael Kerrisk, Arve Hjønnevåg
  Cc: NeilBrown, Linux PM list, LKML, Magnus Damm, markgross,
	Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat, mtk

On Monday, July 16, 2012, Michael Kerrisk wrote:
> Arve, Rafael,
> 
> On Tue, May 1, 2012 at 7:33 AM, Arve Hjønnevåg <arve@android.com> wrote:
> > When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
> > wakeup_source will be active to prevent suspend. This can be used to
> > handle wakeup events from a driver that support poll, e.g. input, if
> > that driver wakes up the waitqueue passed to epoll before allowing
> > suspend.
> 
> It's late it the -rc series,

Well, exactly. :-)

> but it strikes me that CAP_EPOLLWAKEUP is
> a poor name for the capability that governs the use of EPOLLWAKEUP.
> While on the one hand some capabilities are overloaded
> (https://lwn.net/Articles/486306/), on the other hand we should avoid
> adding individual capabilities for each new API feature (otherwise
> capabilities become administratively unwieldy).
> 
> This capability is not really about "EPOLL". It's about the ability to
> block system suspend. Therefore, IMO, a better name would be something
> like: CAP_BLOCK_SUSPEND. This name is better because there might be
> some other API feature that is later added that also has the effect of
> preventing system suspends, and we could reasonably govern that
> feature with the same capability.
> 
> Does that seem sensible to you? I can send a patch for the name change.

I'm not sure what Arve thinks about that, but I'd be fine with that.

Arve, what do you think?

Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-07-16 11:00                     ` Rafael J. Wysocki
@ 2012-07-16 22:04                       ` Arve Hjønnevåg
  2012-07-17  5:14                         ` Michael Kerrisk
  0 siblings, 1 reply; 129+ messages in thread
From: Arve Hjønnevåg @ 2012-07-16 22:04 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Michael Kerrisk, NeilBrown, Linux PM list, LKML, Magnus Damm,
	markgross, Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat, mtk

On Mon, Jul 16, 2012 at 4:00 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Monday, July 16, 2012, Michael Kerrisk wrote:
>> Arve, Rafael,
>>
>> On Tue, May 1, 2012 at 7:33 AM, Arve Hjønnevåg <arve@android.com> wrote:
>> > When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
>> > wakeup_source will be active to prevent suspend. This can be used to
>> > handle wakeup events from a driver that support poll, e.g. input, if
>> > that driver wakes up the waitqueue passed to epoll before allowing
>> > suspend.
>>
>> It's late it the -rc series,
>
> Well, exactly. :-)
>
>> but it strikes me that CAP_EPOLLWAKEUP is
>> a poor name for the capability that governs the use of EPOLLWAKEUP.
>> While on the one hand some capabilities are overloaded
>> (https://lwn.net/Articles/486306/), on the other hand we should avoid
>> adding individual capabilities for each new API feature (otherwise
>> capabilities become administratively unwieldy).
>>
>> This capability is not really about "EPOLL". It's about the ability to
>> block system suspend. Therefore, IMO, a better name would be something
>> like: CAP_BLOCK_SUSPEND. This name is better because there might be
>> some other API feature that is later added that also has the effect of
>> preventing system suspends, and we could reasonably govern that
>> feature with the same capability.

We already have another api, "/sys/power/wake_lock", that allow
user-space to block suspend. Do we want to apply this capability that
api as well, or only to apis that do not have other ways to restrict
access?

>>
>> Does that seem sensible to you? I can send a patch for the name change.
>
> I'm not sure what Arve thinks about that, but I'd be fine with that.
>
> Arve, what do you think?
>

CAP_BLOCK_SUSPEND is fine with me, but if it does not apply to the
sysfs interface, then the comment should probably mention this.

-- 
Arve Hjønnevåg

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-07-16 22:04                       ` Arve Hjønnevåg
@ 2012-07-17  5:14                         ` Michael Kerrisk
  2012-07-17 19:22                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 129+ messages in thread
From: Michael Kerrisk @ 2012-07-17  5:14 UTC (permalink / raw)
  To: Arve Hjønnevåg
  Cc: Rafael J. Wysocki, NeilBrown, Linux PM list, LKML, Magnus Damm,
	markgross, Matthew Garrett, Greg KH, John Stultz, Brian Swetland,
	Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat, mtk

On Tue, Jul 17, 2012 at 12:04 AM, Arve Hjønnevåg <arve@android.com> wrote:
> On Mon, Jul 16, 2012 at 4:00 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> On Monday, July 16, 2012, Michael Kerrisk wrote:
>>> Arve, Rafael,
>>>
>>> On Tue, May 1, 2012 at 7:33 AM, Arve Hjønnevåg <arve@android.com> wrote:
>>> > When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
>>> > wakeup_source will be active to prevent suspend. This can be used to
>>> > handle wakeup events from a driver that support poll, e.g. input, if
>>> > that driver wakes up the waitqueue passed to epoll before allowing
>>> > suspend.
>>>
>>> It's late it the -rc series,
>>
>> Well, exactly. :-)

If someone had CCed linux-api@ along the way (as per
Documentation/SubmitChecklist), it might have helped ;-)

>>
>>> but it strikes me that CAP_EPOLLWAKEUP is
>>> a poor name for the capability that governs the use of EPOLLWAKEUP.
>>> While on the one hand some capabilities are overloaded
>>> (https://lwn.net/Articles/486306/), on the other hand we should avoid
>>> adding individual capabilities for each new API feature (otherwise
>>> capabilities become administratively unwieldy).
>>>
>>> This capability is not really about "EPOLL". It's about the ability to
>>> block system suspend. Therefore, IMO, a better name would be something
>>> like: CAP_BLOCK_SUSPEND. This name is better because there might be
>>> some other API feature that is later added that also has the effect of
>>> preventing system suspends, and we could reasonably govern that
>>> feature with the same capability.
>
> We already have another api, "/sys/power/wake_lock", that allow
> user-space to block suspend. Do we want to apply this capability that
> api as well, or only to apis that do not have other ways to restrict
> access?

Well, the question is: is there a governor on the use of
/sys/power/wake_lock? It makes sense either they are both governed
(preferably by the same mechanism, I would have thought), or neither
is.

>>> Does that seem sensible to you? I can send a patch for the name change.
>>
>> I'm not sure what Arve thinks about that, but I'd be fine with that.
>>
>> Arve, what do you think?
>>
>
> CAP_BLOCK_SUSPEND is fine with me, but if it does not apply to the
> sysfs interface, then the comment should probably mention this.

I've sent a patch, but omitted mention of API details in the comments.
Maybe that can be changed afterward, when a decision has been reached
about governing /sys/power/wake_lock.

Thanks,

Michael

-- 
Michael Kerrisk Linux man-pages maintainer;
http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface", http://blog.man7.org/

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-07-17  5:14                         ` Michael Kerrisk
@ 2012-07-17 19:22                           ` Rafael J. Wysocki
  2012-07-17 19:36                             ` Greg KH
  2012-07-18  6:41                             ` Michael Kerrisk (man-pages)
  0 siblings, 2 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-07-17 19:22 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Arve Hjønnevåg, NeilBrown, Linux PM list, LKML,
	Magnus Damm, markgross, Matthew Garrett, Greg KH, John Stultz,
	Brian Swetland, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat,
	mtk

On Tuesday, July 17, 2012, Michael Kerrisk wrote:
> On Tue, Jul 17, 2012 at 12:04 AM, Arve Hjønnevåg <arve@android.com> wrote:
> > On Mon, Jul 16, 2012 at 4:00 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> On Monday, July 16, 2012, Michael Kerrisk wrote:
> >>> Arve, Rafael,
> >>>
> >>> On Tue, May 1, 2012 at 7:33 AM, Arve Hjønnevåg <arve@android.com> wrote:
> >>> > When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
> >>> > wakeup_source will be active to prevent suspend. This can be used to
> >>> > handle wakeup events from a driver that support poll, e.g. input, if
> >>> > that driver wakes up the waitqueue passed to epoll before allowing
> >>> > suspend.
> >>>
> >>> It's late it the -rc series,
> >>
> >> Well, exactly. :-)
> 
> If someone had CCed linux-api@ along the way (as per
> Documentation/SubmitChecklist), it might have helped ;-)

Well, it still _is_ late.

> >>> but it strikes me that CAP_EPOLLWAKEUP is
> >>> a poor name for the capability that governs the use of EPOLLWAKEUP.
> >>> While on the one hand some capabilities are overloaded
> >>> (https://lwn.net/Articles/486306/), on the other hand we should avoid
> >>> adding individual capabilities for each new API feature (otherwise
> >>> capabilities become administratively unwieldy).
> >>>
> >>> This capability is not really about "EPOLL". It's about the ability to
> >>> block system suspend. Therefore, IMO, a better name would be something
> >>> like: CAP_BLOCK_SUSPEND. This name is better because there might be
> >>> some other API feature that is later added that also has the effect of
> >>> preventing system suspends, and we could reasonably govern that
> >>> feature with the same capability.
> >
> > We already have another api, "/sys/power/wake_lock", that allow
> > user-space to block suspend. Do we want to apply this capability that
> > api as well, or only to apis that do not have other ways to restrict
> > access?
> 
> Well, the question is: is there a governor on the use of
> /sys/power/wake_lock? It makes sense either they are both governed
> (preferably by the same mechanism, I would have thought), or neither
> is.
> 
> >>> Does that seem sensible to you? I can send a patch for the name change.
> >>
> >> I'm not sure what Arve thinks about that, but I'd be fine with that.
> >>
> >> Arve, what do you think?
> >>
> >
> > CAP_BLOCK_SUSPEND is fine with me, but if it does not apply to the
> > sysfs interface, then the comment should probably mention this.
> 
> I've sent a patch, but omitted mention of API details in the comments.
> Maybe that can be changed afterward, when a decision has been reached
> about governing /sys/power/wake_lock.

I'm going to push your patch for v3.5, but then I'm considering the following
one for v3.6.  I wouldn't like to make more changes in v3.5-rc at this point,
if possible.

Thanks,
Rafael

---
From: Rafael J. Wysocki <rjw@sisk.pl>
Subject: PM / Sleep: Require CAP_BLOCK_SUSPEND to use wake_lock/wake_unlock

Require processes wanting to use the wake_lock/wake_unlock sysfs
files to have the CAP_BLOCK_SUSPEND capability, which also is
required for the eventpoll EPOLLWAKEUP flag to be effective, so that
all interfaces related to blocking autosleep depend on the same
capability.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
---
 kernel/power/wakelock.c |    7 +++++++
 1 file changed, 7 insertions(+)

Index: linux/kernel/power/wakelock.c
===================================================================
--- linux.orig/kernel/power/wakelock.c
+++ linux/kernel/power/wakelock.c
@@ -9,6 +9,7 @@
  * manipulate wakelocks on Android.
  */
 
+#include <linux/capability.h>
 #include <linux/ctype.h>
 #include <linux/device.h>
 #include <linux/err.h>
@@ -188,6 +189,9 @@ int pm_wake_lock(const char *buf)
 	size_t len;
 	int ret = 0;
 
+	if (!capable(CAP_BLOCK_SUSPEND))
+		return -EPERM;
+
 	while (*str && !isspace(*str))
 		str++;
 
@@ -231,6 +235,9 @@ int pm_wake_unlock(const char *buf)
 	size_t len;
 	int ret = 0;
 
+	if (!capable(CAP_BLOCK_SUSPEND))
+		return -EPERM;
+
 	len = strlen(buf);
 	if (!len)
 		return -EINVAL;

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-07-17 19:22                           ` Rafael J. Wysocki
@ 2012-07-17 19:36                             ` Greg KH
  2012-07-17 19:55                               ` Rafael J. Wysocki
  2012-07-18  6:41                             ` Michael Kerrisk (man-pages)
  1 sibling, 1 reply; 129+ messages in thread
From: Greg KH @ 2012-07-17 19:36 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Michael Kerrisk, Arve Hjønnevåg, NeilBrown,
	Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	John Stultz, Brian Swetland, Alan Stern, Dmitry Torokhov,
	Srivatsa S. Bhat, mtk

On Tue, Jul 17, 2012 at 09:22:25PM +0200, Rafael J. Wysocki wrote:
> On Tuesday, July 17, 2012, Michael Kerrisk wrote:
> > On Tue, Jul 17, 2012 at 12:04 AM, Arve Hjønnevåg <arve@android.com> wrote:
> > > On Mon, Jul 16, 2012 at 4:00 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > >> On Monday, July 16, 2012, Michael Kerrisk wrote:
> > >>> Arve, Rafael,
> > >>>
> > >>> On Tue, May 1, 2012 at 7:33 AM, Arve Hjønnevåg <arve@android.com> wrote:
> > >>> > When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
> > >>> > wakeup_source will be active to prevent suspend. This can be used to
> > >>> > handle wakeup events from a driver that support poll, e.g. input, if
> > >>> > that driver wakes up the waitqueue passed to epoll before allowing
> > >>> > suspend.
> > >>>
> > >>> It's late it the -rc series,
> > >>
> > >> Well, exactly. :-)
> > 
> > If someone had CCed linux-api@ along the way (as per
> > Documentation/SubmitChecklist), it might have helped ;-)
> 
> Well, it still _is_ late.
> 
> > >>> but it strikes me that CAP_EPOLLWAKEUP is
> > >>> a poor name for the capability that governs the use of EPOLLWAKEUP.
> > >>> While on the one hand some capabilities are overloaded
> > >>> (https://lwn.net/Articles/486306/), on the other hand we should avoid
> > >>> adding individual capabilities for each new API feature (otherwise
> > >>> capabilities become administratively unwieldy).
> > >>>
> > >>> This capability is not really about "EPOLL". It's about the ability to
> > >>> block system suspend. Therefore, IMO, a better name would be something
> > >>> like: CAP_BLOCK_SUSPEND. This name is better because there might be
> > >>> some other API feature that is later added that also has the effect of
> > >>> preventing system suspends, and we could reasonably govern that
> > >>> feature with the same capability.
> > >
> > > We already have another api, "/sys/power/wake_lock", that allow
> > > user-space to block suspend. Do we want to apply this capability that
> > > api as well, or only to apis that do not have other ways to restrict
> > > access?
> > 
> > Well, the question is: is there a governor on the use of
> > /sys/power/wake_lock? It makes sense either they are both governed
> > (preferably by the same mechanism, I would have thought), or neither
> > is.
> > 
> > >>> Does that seem sensible to you? I can send a patch for the name change.
> > >>
> > >> I'm not sure what Arve thinks about that, but I'd be fine with that.
> > >>
> > >> Arve, what do you think?
> > >>
> > >
> > > CAP_BLOCK_SUSPEND is fine with me, but if it does not apply to the
> > > sysfs interface, then the comment should probably mention this.
> > 
> > I've sent a patch, but omitted mention of API details in the comments.
> > Maybe that can be changed afterward, when a decision has been reached
> > about governing /sys/power/wake_lock.
> 
> I'm going to push your patch for v3.5, but then I'm considering the following
> one for v3.6.  I wouldn't like to make more changes in v3.5-rc at this point,
> if possible.
> 
> Thanks,
> Rafael
> 
> ---
> From: Rafael J. Wysocki <rjw@sisk.pl>
> Subject: PM / Sleep: Require CAP_BLOCK_SUSPEND to use wake_lock/wake_unlock
> 
> Require processes wanting to use the wake_lock/wake_unlock sysfs
> files to have the CAP_BLOCK_SUSPEND capability, which also is
> required for the eventpoll EPOLLWAKEUP flag to be effective, so that
> all interfaces related to blocking autosleep depend on the same
> capability.
> 
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

Care to mark that for -stable as well?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-07-17 19:36                             ` Greg KH
@ 2012-07-17 19:55                               ` Rafael J. Wysocki
  0 siblings, 0 replies; 129+ messages in thread
From: Rafael J. Wysocki @ 2012-07-17 19:55 UTC (permalink / raw)
  To: Greg KH
  Cc: Michael Kerrisk, Arve Hjønnevåg, NeilBrown,
	Linux PM list, LKML, Magnus Damm, markgross, Matthew Garrett,
	John Stultz, Brian Swetland, Alan Stern, Dmitry Torokhov,
	Srivatsa S. Bhat, mtk

On Tuesday, July 17, 2012, Greg KH wrote:
> On Tue, Jul 17, 2012 at 09:22:25PM +0200, Rafael J. Wysocki wrote:
> > On Tuesday, July 17, 2012, Michael Kerrisk wrote:
> > > On Tue, Jul 17, 2012 at 12:04 AM, Arve Hjønnevåg <arve@android.com> wrote:
> > > > On Mon, Jul 16, 2012 at 4:00 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > > >> On Monday, July 16, 2012, Michael Kerrisk wrote:
> > > >>> Arve, Rafael,
> > > >>>
> > > >>> On Tue, May 1, 2012 at 7:33 AM, Arve Hjønnevåg <arve@android.com> wrote:
> > > >>> > When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a
> > > >>> > wakeup_source will be active to prevent suspend. This can be used to
> > > >>> > handle wakeup events from a driver that support poll, e.g. input, if
> > > >>> > that driver wakes up the waitqueue passed to epoll before allowing
> > > >>> > suspend.
> > > >>>
> > > >>> It's late it the -rc series,
> > > >>
> > > >> Well, exactly. :-)
> > > 
> > > If someone had CCed linux-api@ along the way (as per
> > > Documentation/SubmitChecklist), it might have helped ;-)
> > 
> > Well, it still _is_ late.
> > 
> > > >>> but it strikes me that CAP_EPOLLWAKEUP is
> > > >>> a poor name for the capability that governs the use of EPOLLWAKEUP.
> > > >>> While on the one hand some capabilities are overloaded
> > > >>> (https://lwn.net/Articles/486306/), on the other hand we should avoid
> > > >>> adding individual capabilities for each new API feature (otherwise
> > > >>> capabilities become administratively unwieldy).
> > > >>>
> > > >>> This capability is not really about "EPOLL". It's about the ability to
> > > >>> block system suspend. Therefore, IMO, a better name would be something
> > > >>> like: CAP_BLOCK_SUSPEND. This name is better because there might be
> > > >>> some other API feature that is later added that also has the effect of
> > > >>> preventing system suspends, and we could reasonably govern that
> > > >>> feature with the same capability.
> > > >
> > > > We already have another api, "/sys/power/wake_lock", that allow
> > > > user-space to block suspend. Do we want to apply this capability that
> > > > api as well, or only to apis that do not have other ways to restrict
> > > > access?
> > > 
> > > Well, the question is: is there a governor on the use of
> > > /sys/power/wake_lock? It makes sense either they are both governed
> > > (preferably by the same mechanism, I would have thought), or neither
> > > is.
> > > 
> > > >>> Does that seem sensible to you? I can send a patch for the name change.
> > > >>
> > > >> I'm not sure what Arve thinks about that, but I'd be fine with that.
> > > >>
> > > >> Arve, what do you think?
> > > >>
> > > >
> > > > CAP_BLOCK_SUSPEND is fine with me, but if it does not apply to the
> > > > sysfs interface, then the comment should probably mention this.
> > > 
> > > I've sent a patch, but omitted mention of API details in the comments.
> > > Maybe that can be changed afterward, when a decision has been reached
> > > about governing /sys/power/wake_lock.
> > 
> > I'm going to push your patch for v3.5, but then I'm considering the following
> > one for v3.6.  I wouldn't like to make more changes in v3.5-rc at this point,
> > if possible.
> > 
> > Thanks,
> > Rafael
> > 
> > ---
> > From: Rafael J. Wysocki <rjw@sisk.pl>
> > Subject: PM / Sleep: Require CAP_BLOCK_SUSPEND to use wake_lock/wake_unlock
> > 
> > Require processes wanting to use the wake_lock/wake_unlock sysfs
> > files to have the CAP_BLOCK_SUSPEND capability, which also is
> > required for the eventpoll EPOLLWAKEUP flag to be effective, so that
> > all interfaces related to blocking autosleep depend on the same
> > capability.
> > 
> > Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> 
> Care to mark that for -stable as well?

I will.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 129+ messages in thread

* Re: [PATCH] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready
  2012-07-17 19:22                           ` Rafael J. Wysocki
  2012-07-17 19:36                             ` Greg KH
@ 2012-07-18  6:41                             ` Michael Kerrisk (man-pages)
  1 sibling, 0 replies; 129+ messages in thread
From: Michael Kerrisk (man-pages) @ 2012-07-18  6:41 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Arve Hjønnevåg, NeilBrown, Linux PM list, LKML,
	Magnus Damm, markgross, Matthew Garrett, Greg KH, John Stultz,
	Brian Swetland, Alan Stern, Dmitry Torokhov, Srivatsa S. Bhat,
	mtk

On Tue, Jul 17, 2012 at 9:22 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:

[...]

> I'm going to push your patch for v3.5,

Thanks.

> but then I'm considering the following
> one for v3.6.  I wouldn't like to make more changes in v3.5-rc at this point,

Acked-by: Michael Kerrisk <mtk.man-pages@gmail.com>

Thanks,

Michael

> ---
> From: Rafael J. Wysocki <rjw@sisk.pl>
> Subject: PM / Sleep: Require CAP_BLOCK_SUSPEND to use wake_lock/wake_unlock
>
> Require processes wanting to use the wake_lock/wake_unlock sysfs
> files to have the CAP_BLOCK_SUSPEND capability, which also is
> required for the eventpoll EPOLLWAKEUP flag to be effective, so that
> all interfaces related to blocking autosleep depend on the same
> capability.
>
> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
>  kernel/power/wakelock.c |    7 +++++++
>  1 file changed, 7 insertions(+)
>
> Index: linux/kernel/power/wakelock.c
> ===================================================================
> --- linux.orig/kernel/power/wakelock.c
> +++ linux/kernel/power/wakelock.c
> @@ -9,6 +9,7 @@
>   * manipulate wakelocks on Android.
>   */
>
> +#include <linux/capability.h>
>  #include <linux/ctype.h>
>  #include <linux/device.h>
>  #include <linux/err.h>
> @@ -188,6 +189,9 @@ int pm_wake_lock(const char *buf)
>         size_t len;
>         int ret = 0;
>
> +       if (!capable(CAP_BLOCK_SUSPEND))
> +               return -EPERM;
> +
>         while (*str && !isspace(*str))
>                 str++;
>
> @@ -231,6 +235,9 @@ int pm_wake_unlock(const char *buf)
>         size_t len;
>         int ret = 0;
>
> +       if (!capable(CAP_BLOCK_SUSPEND))
> +               return -EPERM;
> +
>         len = strlen(buf);
>         if (!len)
>                 return -EINVAL;



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface"; http://man7.org/tlpi/

^ permalink raw reply	[flat|nested] 129+ messages in thread

end of thread, other threads:[~2012-07-18  6:42 UTC | newest]

Thread overview: 129+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-07  1:00 [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
2012-02-07  1:01 ` [PATCH 1/8] PM / Sleep: Initialize wakeup source locks in wakeup_source_add() Rafael J. Wysocki
2012-02-07 22:29   ` John Stultz
2012-02-07 22:41     ` Rafael J. Wysocki
2012-02-07  1:03 ` [PATCH 2/8] PM / Sleep: Do not check wakeup too often in try_to_freeze_tasks() Rafael J. Wysocki
2012-02-07  1:03 ` [PATCH 3/8] PM / Sleep: Look for wakeup events in later stages of device suspend Rafael J. Wysocki
2012-02-07  1:04 ` [PATCH 4/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress" Rafael J. Wysocki
2012-02-08 23:10   ` NeilBrown
2012-02-09  0:05     ` Rafael J. Wysocki
2012-02-12  1:27   ` mark gross
2012-02-07  1:05 ` [RFC][PATCH 5/8] PM / Sleep: Change wakeup statistics Rafael J. Wysocki
2012-02-15  6:15   ` Arve Hjønnevåg
2012-02-15 22:37     ` Rafael J. Wysocki
2012-02-17  2:11       ` Arve Hjønnevåg
2012-02-07  1:06 ` [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep Rafael J. Wysocki
2012-02-07 22:49   ` [Update][RFC][PATCH " Rafael J. Wysocki
2012-02-07  1:06 ` [RFC][PATCH 7/8] PM / Sleep: Add "prevent autosleep time" statistics to wakeup sources Rafael J. Wysocki
2012-02-07  1:07 ` [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating " Rafael J. Wysocki
2012-02-07  1:13 ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Rafael J. Wysocki
2012-02-08 23:57 ` NeilBrown
2012-02-10  0:44   ` Rafael J. Wysocki
2012-02-12  2:05     ` mark gross
2012-02-12 21:32       ` Rafael J. Wysocki
2012-02-14  0:11         ` Arve Hjønnevåg
2012-02-15 15:28           ` mark gross
2012-02-12  1:54   ` mark gross
2012-02-12  1:19 ` mark gross
2012-02-14  2:07 ` Arve Hjønnevåg
2012-02-14 23:22   ` Rafael J. Wysocki
2012-02-15  5:57     ` Arve Hjønnevåg
2012-02-15 23:07       ` Rafael J. Wysocki
2012-02-16 22:22         ` Rafael J. Wysocki
2012-02-17  3:56           ` Arve Hjønnevåg
2012-02-17 23:02             ` [PATCH] PM / Sleep: Add more wakeup source initialization routines Rafael J. Wysocki
2012-02-18 23:50               ` [Update][PATCH] " Rafael J. Wysocki
2012-02-20 23:04                 ` [Update 2x][PATCH] " Rafael J. Wysocki
2012-02-17  3:55         ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks" Arve Hjønnevåg
2012-02-17 20:57           ` Rafael J. Wysocki
2012-02-21 23:31 ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 Rafael J. Wysocki
2012-02-21 23:32   ` [RFC][PATCH 1/7] PM / Sleep: Look for wakeup events in later stages of device suspend Rafael J. Wysocki
2012-02-21 23:33   ` [RFC][PATCH 2/7] PM / Sleep: Use wait queue to signal "no wakeup events in progress" Rafael J. Wysocki
2012-02-21 23:34   ` [RFC][PATCH 3/7] PM / Sleep: Change wakeup source statistics to follow Android Rafael J. Wysocki
2012-02-21 23:34   ` [RFC][PATCH 4/7] Input / PM: Add ioctl to block suspend while event queue is not empty Rafael J. Wysocki
2012-02-24  5:16     ` Matt Helsley
2012-02-25  4:25       ` Arve Hjønnevåg
2012-02-25 23:33         ` Rafael J. Wysocki
2012-02-28  0:19         ` Matt Helsley
2012-02-26 20:57       ` Rafael J. Wysocki
2012-02-27 22:18         ` Matt Helsley
2012-02-28  1:17           ` Rafael J. Wysocki
2012-02-28  5:58         ` Arve Hjønnevåg
2012-03-04 22:56           ` Rafael J. Wysocki
2012-03-06  1:04             ` [PATCH 1/2] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready Arve Hjønnevåg
2012-03-06  1:04               ` [PATCH 2/2] PM / Sleep: Add wakeup_source_activate and wakeup_source_deactivate tracepoints Arve Hjønnevåg
2012-02-21 23:35   ` [RFC][PATCH 5/7] PM / Sleep: Implement opportunistic sleep Rafael J. Wysocki
2012-02-22  8:45     ` Srivatsa S. Bhat
2012-02-22 22:10       ` Rafael J. Wysocki
2012-02-23  5:35         ` Srivatsa S. Bhat
2012-02-21 23:36   ` [RFC][PATCH 6/7] PM / Sleep: Add "prevent autosleep time" statistics to wakeup sources Rafael J. Wysocki
2012-02-21 23:37   ` [RFC][PATCH 7/7] PM / Sleep: Add user space interface for manipulating " Rafael J. Wysocki
2012-02-22  4:49   ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take 2 John Stultz
2012-02-22  8:44     ` Srivatsa S. Bhat
2012-02-22 22:10       ` [RFC][PATCH 0/7] PM: Implement autosleep and "wake locks", take2 Rafael J. Wysocki
2012-02-23  6:25         ` Srivatsa S. Bhat
2012-02-23 21:26           ` Rafael J. Wysocki
2012-02-23 21:32             ` Rafael J. Wysocki
2012-02-24  4:44               ` Srivatsa S. Bhat
2012-02-24 23:21                 ` Rafael J. Wysocki
2012-02-25  4:43                   ` Arve Hjønnevåg
2012-02-25 20:43                     ` Rafael J. Wysocki
2012-02-25 19:20                   ` Srivatsa S. Bhat
2012-02-25 21:01                     ` Rafael J. Wysocki
2012-02-28 10:24                       ` Srivatsa S. Bhat
2012-04-22 21:19   ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Rafael J. Wysocki
2012-04-22 21:19     ` [PATCH 1/8] PM / Sleep: Look for wakeup events in later stages of device suspend Rafael J. Wysocki
2012-04-22 21:20     ` [PATCH 2/8] PM / Sleep: Use wait queue to signal "no wakeup events in progress" Rafael J. Wysocki
2012-04-23  4:01       ` mark gross
2012-04-22 21:21     ` [PATCH 3/8] PM / Sleep: Change wakeup source statistics to follow Android Rafael J. Wysocki
2012-04-22 21:21     ` [PATCH 4/8] PM / Sleep: Add wakeup_source_activate and wakeup_source_deactivate tracepoints Rafael J. Wysocki
2012-04-22 21:22     ` [RFC][PATCH 5/8] epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready Rafael J. Wysocki
2012-04-26  4:03       ` NeilBrown
2012-04-26 20:40         ` Rafael J. Wysocki
2012-04-27  3:49           ` Arve Hjønnevåg
2012-04-27 21:18             ` Rafael J. Wysocki
2012-04-27 23:26               ` [PATCH] " Arve Hjønnevåg
2012-04-30  1:58             ` [RFC][PATCH 5/8] " NeilBrown
2012-05-01  0:52               ` Arve Hjønnevåg
2012-05-01  2:18                 ` NeilBrown
2012-05-01  5:33                 ` [PATCH] " Arve Hjønnevåg
2012-05-01  6:28                   ` NeilBrown
2012-05-01 13:51                     ` Rafael J. Wysocki
2012-07-16  6:38                   ` Michael Kerrisk
2012-07-16 11:00                     ` Rafael J. Wysocki
2012-07-16 22:04                       ` Arve Hjønnevåg
2012-07-17  5:14                         ` Michael Kerrisk
2012-07-17 19:22                           ` Rafael J. Wysocki
2012-07-17 19:36                             ` Greg KH
2012-07-17 19:55                               ` Rafael J. Wysocki
2012-07-18  6:41                             ` Michael Kerrisk (man-pages)
2012-04-22 21:23     ` [RFC][PATCH 6/8] PM / Sleep: Implement opportunistic sleep Rafael J. Wysocki
2012-04-26  3:05       ` NeilBrown
2012-04-26 21:52         ` Rafael J. Wysocki
2012-04-27  0:39           ` NeilBrown
2012-04-27 21:22             ` Rafael J. Wysocki
2012-05-03  0:23           ` Arve Hjønnevåg
2012-05-03 13:28             ` Rafael J. Wysocki
2012-05-03 21:27               ` Arve Hjønnevåg
2012-05-03 22:20                 ` Rafael J. Wysocki
2012-05-03 22:16                   ` Arve Hjønnevåg
2012-05-03 22:24                     ` Rafael J. Wysocki
2012-04-22 21:24     ` [RFC][PATCH 7/8] PM / Sleep: Add "prevent autosleep time" statistics to wakeup sources Rafael J. Wysocki
2012-04-22 21:24     ` [RFC][PATCH 8/8] PM / Sleep: Add user space interface for manipulating " Rafael J. Wysocki
2012-04-24  1:35       ` John Stultz
2012-04-24 21:27         ` Rafael J. Wysocki
2012-04-26  6:31           ` NeilBrown
2012-04-26 22:04             ` Rafael J. Wysocki
2012-04-27  0:07               ` NeilBrown
2012-04-27 21:15                 ` Rafael J. Wysocki
2012-04-27  3:57               ` Arve Hjønnevåg
2012-04-27 21:14                 ` Rafael J. Wysocki
2012-04-27 21:17                   ` Arve Hjønnevåg
2012-04-27 21:34                     ` Rafael J. Wysocki
2012-05-03 19:29                       ` [PATCH 0/2]: Kconfig options for wakelocks limit and gc (was: Re: [RFC][PATCH 8/8] PM / Sleep: Add user space ...) Rafael J. Wysocki
2012-05-03 19:30                         ` [PATCH 1/2] PM / Sleep: Make the limit of user space wakeup sources configurable Rafael J. Wysocki
2012-05-03 19:34                         ` [PATCH 2/2] PM / Sleep: User space wakeup sources garbage collector Kconfig option Rafael J. Wysocki
2012-05-03 22:14                         ` [PATCH 0/2]: Kconfig options for wakelocks limit and gc (was: Re: [RFC][PATCH 8/8] PM / Sleep: Add user space ...) Arve Hjønnevåg
2012-05-03 22:20                           ` Rafael J. Wysocki
2012-04-23 16:49     ` [RFC][PATCH 0/8] PM: Implement autosleep and "wake locks", take 3 Greg KH
2012-04-23 19:51       ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).