[PATCH 0/2] Extend timer_slack_ns to u64 on 32bit systems & add /proc/<pid>/timerslack

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/2] Extend timer_slack_ns to u64 on 32bit systems & add /proc/<pid>/timerslack_ns
@ 2016-02-17  1:06 John Stultz
  2016-02-17  1:06 ` [PATCH 1/2] timer: Convert timer_slack_ns from unsigned long to u64 John Stultz
  2016-02-17  1:06 ` [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface John Stultz
  0 siblings, 2 replies; 27+ messages in thread
From: John Stultz @ 2016-02-17  1:06 UTC (permalink / raw)
  To: Andrew Morton, Thomas Gleixner, Arjan van de Ven
  Cc: lkml, John Stultz, Oren Laadan, Ruchi Kandoi, Rom Lemarchand,
	Kees Cook, Android Kernel Team

I didn't get any negative feedback on the last round of this,
so I figured I'd send it along w/o the RFC header this time.

This patchset introduces a /proc/<pid>/timerslack_ns interface
which would allow controlling processes to be able to set the
timerslack value on other processes in order to save power by
avoiding wakeups (Something Android currently does via
out-of-tree patches).

The first patch tries to fix the internal timer_slack_ns usage
which was defined as a long, which limits the slack range to
~4 seconds on 32bit systems. It converts it to a u64, which
provides the same basically unlimited slack (500 years) on both
32bit and 64bit machines.

The second patch introduces the /proc/<pid>/timerslack_ns
interface which allows the full 64bit slack range for a task
to be read or set on both 32bit and 64bit machines.

With these two patches, on a 32bit machine, after setting the
slack on bash to 10 seconds:
$ time sleep 1

real    0m10.747s
user    0m0.001s
sys     0m0.005s

The first patch is a little ugly, since I had to chase the slack
delta arguments through a number of functions converting them to
u64s. Let me know if it makes sense to break that up more or not.

Other then that things are fairly straight forward.

Feedback and thoughts would be greatly appreciated!

thanks
-john

Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Android Kernel Team <kernel-team@android.com>

John Stultz (2):
  timer: Convert timer_slack_ns from unsigned long to u64
  proc: Add /proc/<pid>/timerslack_ns interface

 Documentation/filesystems/proc.txt | 18 ++++++++++
 fs/eventpoll.c                     |  2 +-
 fs/proc/base.c                     | 69 ++++++++++++++++++++++++++++++++++++++
 fs/select.c                        |  8 ++---
 include/linux/freezer.h            |  2 +-
 include/linux/hrtimer.h            | 12 ++++---
 include/linux/poll.h               |  2 +-
 include/linux/sched.h              |  4 +--
 kernel/sys.c                       |  5 ++-
 kernel/time/hrtimer.c              |  8 ++---
 kernel/time/timer.c                |  4 +--
 11 files changed, 113 insertions(+), 21 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/2] timer: Convert timer_slack_ns from unsigned long to u64
  2016-02-17  1:06 [PATCH 0/2] Extend timer_slack_ns to u64 on 32bit systems & add /proc/<pid>/timerslack_ns John Stultz
@ 2016-02-17  1:06 ` John Stultz
  2016-02-17  1:06 ` [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface John Stultz
  1 sibling, 0 replies; 27+ messages in thread
From: John Stultz @ 2016-02-17  1:06 UTC (permalink / raw)
  To: Andrew Morton, Thomas Gleixner, Arjan van de Ven
  Cc: lkml, John Stultz, Oren Laadan, Ruchi Kandoi, Rom Lemarchand,
	Kees Cook, Android Kernel Team

The timer_slack_ns value in the task struct is currently a
unsigned long. This means that on 32bit applications, the
maximum slack is just over 4 seconds. However, on 64bit
machines, its much much larger (~500 years).

This disparity could make application development a little
difficult, so this patch extends the timer_slack_ns entry
(as well as the default_slack) to a u64. This means both 32bit
and 64bit systems have the same effective internal slack range.

Now the existing ABI via PR_GET_TIMERSLACK and PR_SET_TIMERSLACK
specify the interface as a unsigned long, so we preserve that
limitation on 32bit systems, where SET_TIMERSLACK can only set
the slack to a unsigned long value, and GET_TIMERSLACK will
return ULONG_MAX if the slack is actually larger then what can
be stored by an unsigned long.

This patch also modifies hrtimer functions which specified the
slack delta as a unsigned long.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 fs/eventpoll.c          |  2 +-
 fs/select.c             |  8 ++++----
 include/linux/freezer.h |  2 +-
 include/linux/hrtimer.h | 12 +++++++-----
 include/linux/poll.h    |  2 +-
 include/linux/sched.h   |  4 ++--
 kernel/sys.c            |  5 ++++-
 kernel/time/hrtimer.c   |  8 ++++----
 kernel/time/timer.c     |  4 ++--
 9 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index cde6074..8a74a2a 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1616,7 +1616,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 {
 	int res = 0, eavail, timed_out = 0;
 	unsigned long flags;
-	long slack = 0;
+	u64 slack = 0;
 	wait_queue_t wait;
 	ktime_t expires, *to = NULL;
 
diff --git a/fs/select.c b/fs/select.c
index 79d0d49..8692939 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -70,9 +70,9 @@ static long __estimate_accuracy(struct timespec *tv)
 	return slack;
 }
 
-long select_estimate_accuracy(struct timespec *tv)
+u64 select_estimate_accuracy(struct timespec *tv)
 {
-	unsigned long ret;
+	u64 ret;
 	struct timespec now;
 
 	/*
@@ -402,7 +402,7 @@ int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
 	struct poll_wqueues table;
 	poll_table *wait;
 	int retval, i, timed_out = 0;
-	unsigned long slack = 0;
+	u64 slack = 0;
 	unsigned int busy_flag = net_busy_loop_on() ? POLL_BUSY_LOOP : 0;
 	unsigned long busy_end = 0;
 
@@ -784,7 +784,7 @@ static int do_poll(struct poll_list *list, struct poll_wqueues *wait,
 	poll_table* pt = &wait->pt;
 	ktime_t expire, *to = NULL;
 	int timed_out = 0, count = 0;
-	unsigned long slack = 0;
+	u64 slack = 0;
 	unsigned int busy_flag = net_busy_loop_on() ? POLL_BUSY_LOOP : 0;
 	unsigned long busy_end = 0;
 
diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index 6b7fd9c..dd03e83 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -231,7 +231,7 @@ static inline long freezable_schedule_timeout_killable_unsafe(long timeout)
  * call this with locks held.
  */
 static inline int freezable_schedule_hrtimeout_range(ktime_t *expires,
-		unsigned long delta, const enum hrtimer_mode mode)
+		u64 delta, const enum hrtimer_mode mode)
 {
 	int __retval;
 	freezer_do_not_count();
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 2ead22d..c98c653 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -220,7 +220,7 @@ static inline void hrtimer_set_expires_range(struct hrtimer *timer, ktime_t time
 	timer->node.expires = ktime_add_safe(time, delta);
 }
 
-static inline void hrtimer_set_expires_range_ns(struct hrtimer *timer, ktime_t time, unsigned long delta)
+static inline void hrtimer_set_expires_range_ns(struct hrtimer *timer, ktime_t time, u64 delta)
 {
 	timer->_softexpires = time;
 	timer->node.expires = ktime_add_safe(time, ns_to_ktime(delta));
@@ -378,7 +378,7 @@ static inline void destroy_hrtimer_on_stack(struct hrtimer *timer) { }
 
 /* Basic timer operations: */
 extern void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
-			unsigned long range_ns, const enum hrtimer_mode mode);
+				   u64 range_ns, const enum hrtimer_mode mode);
 
 /**
  * hrtimer_start - (re)start an hrtimer on the current CPU
@@ -399,7 +399,7 @@ extern int hrtimer_try_to_cancel(struct hrtimer *timer);
 static inline void hrtimer_start_expires(struct hrtimer *timer,
 					 enum hrtimer_mode mode)
 {
-	unsigned long delta;
+	u64 delta;
 	ktime_t soft, hard;
 	soft = hrtimer_get_softexpires(timer);
 	hard = hrtimer_get_expires(timer);
@@ -477,10 +477,12 @@ extern long hrtimer_nanosleep_restart(struct restart_block *restart_block);
 extern void hrtimer_init_sleeper(struct hrtimer_sleeper *sl,
 				 struct task_struct *tsk);
 
-extern int schedule_hrtimeout_range(ktime_t *expires, unsigned long delta,
+extern int schedule_hrtimeout_range(ktime_t *expires, u64 delta,
 						const enum hrtimer_mode mode);
 extern int schedule_hrtimeout_range_clock(ktime_t *expires,
-		unsigned long delta, const enum hrtimer_mode mode, int clock);
+					  u64 delta,
+					  const enum hrtimer_mode mode,
+					  int clock);
 extern int schedule_hrtimeout(ktime_t *expires, const enum hrtimer_mode mode);
 
 /* Soft interrupt function to run the hrtimer queues: */
diff --git a/include/linux/poll.h b/include/linux/poll.h
index c08386f..9fb4f40 100644
--- a/include/linux/poll.h
+++ b/include/linux/poll.h
@@ -96,7 +96,7 @@ extern void poll_initwait(struct poll_wqueues *pwq);
 extern void poll_freewait(struct poll_wqueues *pwq);
 extern int poll_schedule_timeout(struct poll_wqueues *pwq, int state,
 				 ktime_t *expires, unsigned long slack);
-extern long select_estimate_accuracy(struct timespec *tv);
+extern u64 select_estimate_accuracy(struct timespec *tv);
 
 
 static inline int poll_schedule(struct poll_wqueues *pwq, int state)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index a10494a..1cc4a4f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1784,8 +1784,8 @@ struct task_struct {
 	 * time slack values; these are used to round up poll() and
 	 * select() etc timeout values. These are in nanoseconds.
 	 */
-	unsigned long timer_slack_ns;
-	unsigned long default_timer_slack_ns;
+	u64 timer_slack_ns;
+	u64 default_timer_slack_ns;
 
 #ifdef CONFIG_KASAN
 	unsigned int kasan_depth;
diff --git a/kernel/sys.c b/kernel/sys.c
index 78947de..cf8ba54 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2169,7 +2169,10 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 		error = perf_event_task_enable();
 		break;
 	case PR_GET_TIMERSLACK:
-		error = current->timer_slack_ns;
+		if (current->timer_slack_ns > ULONG_MAX)
+			error = ULONG_MAX;
+		else
+			error = current->timer_slack_ns;
 		break;
 	case PR_SET_TIMERSLACK:
 		if (arg2 <= 0)
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index fa909f9..58a321c 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -979,7 +979,7 @@ static inline ktime_t hrtimer_update_lowres(struct hrtimer *timer, ktime_t tim,
  *		relative (HRTIMER_MODE_REL)
  */
 void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
-			    unsigned long delta_ns, const enum hrtimer_mode mode)
+			    u64 delta_ns, const enum hrtimer_mode mode)
 {
 	struct hrtimer_clock_base *base, *new_base;
 	unsigned long flags;
@@ -1548,7 +1548,7 @@ long hrtimer_nanosleep(struct timespec *rqtp, struct timespec __user *rmtp,
 	struct restart_block *restart;
 	struct hrtimer_sleeper t;
 	int ret = 0;
-	unsigned long slack;
+	u64 slack;
 
 	slack = current->timer_slack_ns;
 	if (dl_task(current) || rt_task(current))
@@ -1724,7 +1724,7 @@ void __init hrtimers_init(void)
  * @clock:	timer clock, CLOCK_MONOTONIC or CLOCK_REALTIME
  */
 int __sched
-schedule_hrtimeout_range_clock(ktime_t *expires, unsigned long delta,
+schedule_hrtimeout_range_clock(ktime_t *expires, u64 delta,
 			       const enum hrtimer_mode mode, int clock)
 {
 	struct hrtimer_sleeper t;
@@ -1792,7 +1792,7 @@ schedule_hrtimeout_range_clock(ktime_t *expires, unsigned long delta,
  *
  * Returns 0 when the timer has expired otherwise -EINTR
  */
-int __sched schedule_hrtimeout_range(ktime_t *expires, unsigned long delta,
+int __sched schedule_hrtimeout_range(ktime_t *expires, u64 delta,
 				     const enum hrtimer_mode mode)
 {
 	return schedule_hrtimeout_range_clock(expires, delta, mode,
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index bbc5d11..d1798fa 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1698,10 +1698,10 @@ EXPORT_SYMBOL(msleep_interruptible);
 static void __sched do_usleep_range(unsigned long min, unsigned long max)
 {
 	ktime_t kmin;
-	unsigned long delta;
+	u64 delta;
 
 	kmin = ktime_set(0, min * NSEC_PER_USEC);
-	delta = (max - min) * NSEC_PER_USEC;
+	delta = (u64)(max - min) * NSEC_PER_USEC;
 	schedule_hrtimeout_range(&kmin, delta, HRTIMER_MODE_REL);
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-02-17  1:06 [PATCH 0/2] Extend timer_slack_ns to u64 on 32bit systems & add /proc/<pid>/timerslack_ns John Stultz
  2016-02-17  1:06 ` [PATCH 1/2] timer: Convert timer_slack_ns from unsigned long to u64 John Stultz
@ 2016-02-17  1:06 ` John Stultz
  2016-02-17 19:35   ` Andrew Morton
                     ` (2 more replies)
  1 sibling, 3 replies; 27+ messages in thread
From: John Stultz @ 2016-02-17  1:06 UTC (permalink / raw)
  To: Andrew Morton, Thomas Gleixner, Arjan van de Ven
  Cc: lkml, John Stultz, Oren Laadan, Ruchi Kandoi, Rom Lemarchand,
	Kees Cook, Android Kernel Team

This patch provides a proc/PID/timerslack_ns interface which
exposes a task's timerslack value in nanoseconds and allows it
to be changed.

This allows power/performance management software to set timer
slack for other threads according to its policy for the thread
(such as when the thread is designated foreground vs. background
activity)

If the value written is non-zero, slack is set to that value.
Otherwise sets it to the default for the thread.

This interface checks that the calling task has permissions to
to use PTRACE_MODE_ATTACH_FSCREDS on the target task, so that we
can ensure arbitrary apps do not change the timer slack for other
apps.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 Documentation/filesystems/proc.txt | 18 ++++++++++
 fs/proc/base.c                     | 69 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 87 insertions(+)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 843b045b..7f5607a 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -43,6 +43,7 @@ Table of Contents
   3.7   /proc/<pid>/task/<tid>/children - Information about task children
   3.8   /proc/<pid>/fdinfo/<fd> - Information about opened file
   3.9   /proc/<pid>/map_files - Information about memory mapped files
+  3.10  /proc/<pid>/timerslack_ns - Task timerslack value
 
   4	Configuring procfs
   4.1	Mount options
@@ -1862,6 +1863,23 @@ time one can open(2) mappings from the listings of two processes and
 comparing their inode numbers to figure out which anonymous memory areas
 are actually shared.
 
+3.10	/proc/<pid>/timerslack_ns - Task timerslack value
+---------------------------------------------------------
+This file provides the value of the task's timerslack value in nanoseconds.
+This value specifies a amount of time that normal timers may be deferred
+in order to coalesce timers and avoid unnecessary wakeups.
+
+This allows a task's interactivity vs power consumption trade off to be
+adjusted.
+
+Writing 0 to the file will set the tasks timerslack to the default value.
+
+Valid values are from 0 - ULLONG_MAX
+
+An application setting the value must have PTRACE_MODE_ATTACH_FSCREDS level
+permissions on the task specified to change its timerslack_ns value.
+
+
 ------------------------------------------------------------------------------
 Configuring procfs
 ------------------------------------------------------------------------------
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 4f764c2..d7c51ca 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2257,6 +2257,74 @@ static const struct file_operations proc_timers_operations = {
 	.release	= seq_release_private,
 };
 
+static ssize_t timerslack_ns_write(struct file *file, const char __user *buf,
+					size_t count, loff_t *offset)
+{
+	struct inode *inode = file_inode(file);
+	struct task_struct *p;
+	char buffer[PROC_NUMBUF];
+	u64 slack_ns;
+	int err;
+
+	memset(buffer, 0, sizeof(buffer));
+	if (count > sizeof(buffer) - 1)
+		count = sizeof(buffer) - 1;
+
+	if (copy_from_user(buffer, buf, count))
+		return -EFAULT;
+
+	err = kstrtoull(strstrip(buffer), 10, &slack_ns);
+	if (err < 0)
+		return err;
+
+	p = get_proc_task(inode);
+	if (!p)
+		return -ESRCH;
+
+	if (ptrace_may_access(p, PTRACE_MODE_ATTACH_FSCREDS)) {
+		if (slack_ns == 0)
+			p->timer_slack_ns = p->default_timer_slack_ns;
+		else
+			p->timer_slack_ns = slack_ns;
+	} else
+		count = -EINVAL;
+
+	put_task_struct(p);
+
+	return count;
+}
+
+static int timerslack_ns_show(struct seq_file *m, void *v)
+{
+	struct inode *inode = m->private;
+	struct task_struct *p;
+
+	p = get_proc_task(inode);
+	if (!p)
+		return -ESRCH;
+
+	task_lock(p);
+	seq_printf(m, "%llu\n", p->timer_slack_ns);
+	task_unlock(p);
+
+	put_task_struct(p);
+
+	return 0;
+}
+
+static int timerslack_ns_open(struct inode *inode, struct file *filp)
+{
+	return single_open(filp, timerslack_ns_show, inode);
+}
+
+static const struct file_operations proc_pid_set_timerslack_ns_operations = {
+	.open		= timerslack_ns_open,
+	.read		= seq_read,
+	.write		= timerslack_ns_write,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
 static int proc_pident_instantiate(struct inode *dir,
 	struct dentry *dentry, struct task_struct *task, const void *ptr)
 {
@@ -2831,6 +2899,7 @@ static const struct pid_entry tgid_base_stuff[] = {
 #ifdef CONFIG_CHECKPOINT_RESTORE
 	REG("timers",	  S_IRUGO, proc_timers_operations),
 #endif
+	REG("timerslack_ns", S_IRUGO|S_IWUSR, proc_pid_set_timerslack_ns_operations),
 };
 
 static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-02-17  1:06 ` [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface John Stultz
@ 2016-02-17 19:35   ` Andrew Morton
  2016-02-17 20:09     ` Kees Cook
  2016-02-17 20:49     ` John Stultz
  2016-02-18  5:59   ` [PATCH] proc: /proc/<pid>/timerslack_ns permissions fixes John Stultz
  2016-07-13 23:47   ` [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface John Stultz
  2 siblings, 2 replies; 27+ messages in thread
From: Andrew Morton @ 2016-02-17 19:35 UTC (permalink / raw)
  To: John Stultz
  Cc: Thomas Gleixner, Arjan van de Ven, lkml, Oren Laadan,
	Ruchi Kandoi, Rom Lemarchand, Kees Cook, Android Kernel Team

On Tue, 16 Feb 2016 17:06:31 -0800 John Stultz <john.stultz@linaro.org> wrote:

> This patch provides a proc/PID/timerslack_ns interface which
> exposes a task's timerslack value in nanoseconds and allows it
> to be changed.
> 
> This allows power/performance management software to set timer
> slack for other threads according to its policy for the thread
> (such as when the thread is designated foreground vs. background
> activity)
> 
> If the value written is non-zero, slack is set to that value.
> Otherwise sets it to the default for the thread.
> 
> This interface checks that the calling task has permissions to
> to use PTRACE_MODE_ATTACH_FSCREDS on the target task, so that we
> can ensure arbitrary apps do not change the timer slack for other
> apps.

hm.  What the heck is PTRACE_MODE_ATTACH_FSCREDS and why was it chosen?

The procfs file's permissions are 0644, yes?  So a process's
timer_slack is world-readable?  hm.

> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2257,6 +2257,74 @@ static const struct file_operations proc_timers_operations = {
>  	.release	= seq_release_private,
>  };
>  
> +static ssize_t timerslack_ns_write(struct file *file, const char __user *buf,
> +					size_t count, loff_t *offset)
> +{
> +	struct inode *inode = file_inode(file);
> +	struct task_struct *p;
> +	char buffer[PROC_NUMBUF];
> +	u64 slack_ns;
> +	int err;
> +
> +	memset(buffer, 0, sizeof(buffer));
> +	if (count > sizeof(buffer) - 1)
> +		count = sizeof(buffer) - 1;
> +
> +	if (copy_from_user(buffer, buf, count))
> +		return -EFAULT;
> +
> +	err = kstrtoull(strstrip(buffer), 10, &slack_ns);
> +	if (err < 0)
> +		return err;

Use kstrtoull_from_user()?

> +	p = get_proc_task(inode);
> +	if (!p)
> +		return -ESRCH;
> +
> +	if (ptrace_may_access(p, PTRACE_MODE_ATTACH_FSCREDS)) {
> +		if (slack_ns == 0)
> +			p->timer_slack_ns = p->default_timer_slack_ns;
> +		else
> +			p->timer_slack_ns = slack_ns;
> +	} else
> +		count = -EINVAL;
> +
> +	put_task_struct(p);
> +
> +	return count;
> +}
> +

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-02-17 19:35   ` Andrew Morton
@ 2016-02-17 20:09     ` Kees Cook
  2016-02-17 20:18       ` Andrew Morton
  2016-02-17 20:49     ` John Stultz
  1 sibling, 1 reply; 27+ messages in thread
From: Kees Cook @ 2016-02-17 20:09 UTC (permalink / raw)
  To: Andrew Morton
  Cc: John Stultz, Thomas Gleixner, Arjan van de Ven, lkml,
	Oren Laadan, Ruchi Kandoi, Rom Lemarchand, Android Kernel Team

On Wed, Feb 17, 2016 at 11:35 AM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Tue, 16 Feb 2016 17:06:31 -0800 John Stultz <john.stultz@linaro.org> wrote:
>
>> This patch provides a proc/PID/timerslack_ns interface which
>> exposes a task's timerslack value in nanoseconds and allows it
>> to be changed.
>>
>> This allows power/performance management software to set timer
>> slack for other threads according to its policy for the thread
>> (such as when the thread is designated foreground vs. background
>> activity)
>>
>> If the value written is non-zero, slack is set to that value.
>> Otherwise sets it to the default for the thread.
>>
>> This interface checks that the calling task has permissions to
>> to use PTRACE_MODE_ATTACH_FSCREDS on the target task, so that we
>> can ensure arbitrary apps do not change the timer slack for other
>> apps.
>
> hm.  What the heck is PTRACE_MODE_ATTACH_FSCREDS and why was it chosen?

This says the writer needs to have ptrace "attach" level of access,
and that it should be checked with fscreds, as is the standard for
most /proc things like that.

> The procfs file's permissions are 0644, yes?  So a process's
> timer_slack is world-readable?  hm.

This should be 600, IMO.

-Kees

>
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -2257,6 +2257,74 @@ static const struct file_operations proc_timers_operations = {
>>       .release        = seq_release_private,
>>  };
>>
>> +static ssize_t timerslack_ns_write(struct file *file, const char __user *buf,
>> +                                     size_t count, loff_t *offset)
>> +{
>> +     struct inode *inode = file_inode(file);
>> +     struct task_struct *p;
>> +     char buffer[PROC_NUMBUF];
>> +     u64 slack_ns;
>> +     int err;
>> +
>> +     memset(buffer, 0, sizeof(buffer));
>> +     if (count > sizeof(buffer) - 1)
>> +             count = sizeof(buffer) - 1;
>> +
>> +     if (copy_from_user(buffer, buf, count))
>> +             return -EFAULT;
>> +
>> +     err = kstrtoull(strstrip(buffer), 10, &slack_ns);
>> +     if (err < 0)
>> +             return err;
>
> Use kstrtoull_from_user()?
>
>> +     p = get_proc_task(inode);
>> +     if (!p)
>> +             return -ESRCH;
>> +
>> +     if (ptrace_may_access(p, PTRACE_MODE_ATTACH_FSCREDS)) {
>> +             if (slack_ns == 0)
>> +                     p->timer_slack_ns = p->default_timer_slack_ns;
>> +             else
>> +                     p->timer_slack_ns = slack_ns;
>> +     } else
>> +             count = -EINVAL;
>> +
>> +     put_task_struct(p);
>> +
>> +     return count;
>> +}
>> +
>



-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-02-17 20:09     ` Kees Cook
@ 2016-02-17 20:18       ` Andrew Morton
  2016-02-17 20:51         ` John Stultz
  2016-02-17 22:29         ` John Stultz
  0 siblings, 2 replies; 27+ messages in thread
From: Andrew Morton @ 2016-02-17 20:18 UTC (permalink / raw)
  To: Kees Cook
  Cc: John Stultz, Thomas Gleixner, Arjan van de Ven, lkml,
	Oren Laadan, Ruchi Kandoi, Rom Lemarchand, Android Kernel Team

On Wed, 17 Feb 2016 12:09:08 -0800 Kees Cook <keescook@chromium.org> wrote:

> On Wed, Feb 17, 2016 at 11:35 AM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> > On Tue, 16 Feb 2016 17:06:31 -0800 John Stultz <john.stultz@linaro.org> wrote:
> >
> >> This patch provides a proc/PID/timerslack_ns interface which
> >> exposes a task's timerslack value in nanoseconds and allows it
> >> to be changed.
> >>
> >> This allows power/performance management software to set timer
> >> slack for other threads according to its policy for the thread
> >> (such as when the thread is designated foreground vs. background
> >> activity)
> >>
> >> If the value written is non-zero, slack is set to that value.
> >> Otherwise sets it to the default for the thread.
> >>
> >> This interface checks that the calling task has permissions to
> >> to use PTRACE_MODE_ATTACH_FSCREDS on the target task, so that we
> >> can ensure arbitrary apps do not change the timer slack for other
> >> apps.
> >
> > hm.  What the heck is PTRACE_MODE_ATTACH_FSCREDS and why was it chosen?
> 
> This says the writer needs to have ptrace "attach" level of access,
> and that it should be checked with fscreds, as is the standard for
> most /proc things like that.

The only place where PTRACE_MODE_ATTACH_FSCREDS is used in all of Linux
is /prc/pid/stack.  Makes me curious!

> > The procfs file's permissions are 0644, yes?  So a process's
> > timer_slack is world-readable?  hm.
> 
> This should be 600, IMO.

Sounds safer.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-02-17 19:35   ` Andrew Morton
  2016-02-17 20:09     ` Kees Cook
@ 2016-02-17 20:49     ` John Stultz
  1 sibling, 0 replies; 27+ messages in thread
From: John Stultz @ 2016-02-17 20:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Thomas Gleixner, Arjan van de Ven, lkml, Oren Laadan,
	Ruchi Kandoi, Rom Lemarchand, Kees Cook, Android Kernel Team

On Wed, Feb 17, 2016 at 11:35 AM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Tue, 16 Feb 2016 17:06:31 -0800 John Stultz <john.stultz@linaro.org> wrote:
>
>> This patch provides a proc/PID/timerslack_ns interface which
>> exposes a task's timerslack value in nanoseconds and allows it
>> to be changed.
>>
>> This allows power/performance management software to set timer
>> slack for other threads according to its policy for the thread
>> (such as when the thread is designated foreground vs. background
>> activity)
>>
>> If the value written is non-zero, slack is set to that value.
>> Otherwise sets it to the default for the thread.
>>
>> This interface checks that the calling task has permissions to
>> to use PTRACE_MODE_ATTACH_FSCREDS on the target task, so that we
>> can ensure arbitrary apps do not change the timer slack for other
>> apps.
>
> hm.  What the heck is PTRACE_MODE_ATTACH_FSCREDS and why was it chosen?

Somewhat out of necessity. I found due to recent changes (sha1:
caaee6234d0)  to __ptrace_may_access() one must use FSCREDS or
REALCREDS. So PTRACE_MODE_ATTACH on its own won't work. ("What the
heck" was what I thought too when I noticed my test wasn't working :)

Since we're accessing this via a filesystem interface, I picked
FSCREDS.  But if I chose wrong here, please let me know.

>> +     err = kstrtoull(strstrip(buffer), 10, &slack_ns);
>> +     if (err < 0)
>> +             return err;
>
> Use kstrtoull_from_user()?

Ok. I'll rework it to use this.

thanks!
-john

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-02-17 20:18       ` Andrew Morton
@ 2016-02-17 20:51         ` John Stultz
  2016-02-17 21:07           ` Andrew Morton
  2016-02-17 22:29         ` John Stultz
  1 sibling, 1 reply; 27+ messages in thread
From: John Stultz @ 2016-02-17 20:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kees Cook, Thomas Gleixner, Arjan van de Ven, lkml, Oren Laadan,
	Ruchi Kandoi, Rom Lemarchand, Android Kernel Team

On Wed, Feb 17, 2016 at 12:18 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Wed, 17 Feb 2016 12:09:08 -0800 Kees Cook <keescook@chromium.org> wrote:
>
>> On Wed, Feb 17, 2016 at 11:35 AM, Andrew Morton
>> <akpm@linux-foundation.org> wrote:
>> > On Tue, 16 Feb 2016 17:06:31 -0800 John Stultz <john.stultz@linaro.org> wrote:
>> >
>> >> This patch provides a proc/PID/timerslack_ns interface which
>> >> exposes a task's timerslack value in nanoseconds and allows it
>> >> to be changed.
>> >>
>> >> This allows power/performance management software to set timer
>> >> slack for other threads according to its policy for the thread
>> >> (such as when the thread is designated foreground vs. background
>> >> activity)
>> >>
>> >> If the value written is non-zero, slack is set to that value.
>> >> Otherwise sets it to the default for the thread.
>> >>
>> >> This interface checks that the calling task has permissions to
>> >> to use PTRACE_MODE_ATTACH_FSCREDS on the target task, so that we
>> >> can ensure arbitrary apps do not change the timer slack for other
>> >> apps.
>> >
>> > hm.  What the heck is PTRACE_MODE_ATTACH_FSCREDS and why was it chosen?
>>
>> This says the writer needs to have ptrace "attach" level of access,
>> and that it should be checked with fscreds, as is the standard for
>> most /proc things like that.
>
> The only place where PTRACE_MODE_ATTACH_FSCREDS is used in all of Linux
> is /prc/pid/stack.  Makes me curious!

Other uses may be using a combination PTRACE_MODE_ATTACH|PTRACE_MODE_FSCREDS.

>> > The procfs file's permissions are 0644, yes?  So a process's
>> > timer_slack is world-readable?  hm.
>>
>> This should be 600, IMO.
>
> Sounds safer.

Ok. Reworking the patch to use that as well.


Andrew: I saw you added these to -mm already. Would you prefer a fixup
patch ontop, or should I just send out a folded down v3 of the
patchset?

thanks
-john

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-02-17 20:51         ` John Stultz
@ 2016-02-17 21:07           ` Andrew Morton
  0 siblings, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2016-02-17 21:07 UTC (permalink / raw)
  To: John Stultz
  Cc: Kees Cook, Thomas Gleixner, Arjan van de Ven, lkml, Oren Laadan,
	Ruchi Kandoi, Rom Lemarchand, Android Kernel Team

On Wed, 17 Feb 2016 12:51:17 -0800 John Stultz <john.stultz@linaro.org> wrote:

> Andrew: I saw you added these to -mm already. Would you prefer a fixup
> patch ontop, or should I just send out a folded down v3 of the
> patchset?

I'm OK with either.  I usually turn replacements into deltas so that
I (and others) can see what changed.  But it can become a bit unwieldy.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-02-17 20:18       ` Andrew Morton
  2016-02-17 20:51         ` John Stultz
@ 2016-02-17 22:29         ` John Stultz
  2016-02-17 22:45           ` Kees Cook
  2016-02-17 22:53           ` Andrew Morton
  1 sibling, 2 replies; 27+ messages in thread
From: John Stultz @ 2016-02-17 22:29 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Kees Cook, Thomas Gleixner, Arjan van de Ven, lkml, Oren Laadan,
	Ruchi Kandoi, Rom Lemarchand, Android Kernel Team

On Wed, Feb 17, 2016 at 12:18 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Wed, 17 Feb 2016 12:09:08 -0800 Kees Cook <keescook@chromium.org> wrote:
>> On Wed, Feb 17, 2016 at 11:35 AM, Andrew Morton
>> > The procfs file's permissions are 0644, yes?  So a process's
>> > timer_slack is world-readable?  hm.
>>
>> This should be 600, IMO.
>
> Sounds safer.

So I've gone ahead and addressed this and the other feedback you had.
But this bit made me realize that I may have missed a key aspect to
the interface that Android needs.

In particular, the whole point here is to allow a controlling task to
modify other tasks' timerslack to limit background tasks' power usage
(and to modify them back to normal when the background tasks become
foreground tasks). Note that on android different tasks run as
different users.

Currently, the controlling process has minimally elevated privileges
(CAP_SYS_NICE). The initial review suggested those privileges should
be higher (PTRACE_MODE_ATTACH), which I've implemented.  However, I'm
realizing that by moving to the proc interface, the filesystem
permissions here put yet another barrier in the way.

While the 600 permissions makes initial sense, it does limit these
controlling tasks with extra privileges (though not root) from
modifying the timerslack, since they cannot open the file to begin
with.

So.... Does world writable (plus the PTRACE_MODE_ATTACH_FSCREDS check)
make more sense here?  Or is there a better way for a system to tweak
the default permissions for procfs entries? (And if so, does that
render the PTRACE_MODE_ATTACH... check unnecessary?).

Apologies. I'm fighting a head-cold, so I'm not feeling particularly sharp here.

thanks
-john

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-02-17 22:29         ` John Stultz
@ 2016-02-17 22:45           ` Kees Cook
  2016-02-17 22:51             ` John Stultz
  2016-02-17 22:53           ` Andrew Morton
  1 sibling, 1 reply; 27+ messages in thread
From: Kees Cook @ 2016-02-17 22:45 UTC (permalink / raw)
  To: John Stultz
  Cc: Andrew Morton, Thomas Gleixner, Arjan van de Ven, lkml,
	Oren Laadan, Ruchi Kandoi, Rom Lemarchand, Android Kernel Team

On Wed, Feb 17, 2016 at 2:29 PM, John Stultz <john.stultz@linaro.org> wrote:
> On Wed, Feb 17, 2016 at 12:18 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
>> On Wed, 17 Feb 2016 12:09:08 -0800 Kees Cook <keescook@chromium.org> wrote:
>>> On Wed, Feb 17, 2016 at 11:35 AM, Andrew Morton
>>> > The procfs file's permissions are 0644, yes?  So a process's
>>> > timer_slack is world-readable?  hm.
>>>
>>> This should be 600, IMO.
>>
>> Sounds safer.
>
> So I've gone ahead and addressed this and the other feedback you had.
> But this bit made me realize that I may have missed a key aspect to
> the interface that Android needs.
>
> In particular, the whole point here is to allow a controlling task to
> modify other tasks' timerslack to limit background tasks' power usage
> (and to modify them back to normal when the background tasks become
> foreground tasks). Note that on android different tasks run as
> different users.
>
> Currently, the controlling process has minimally elevated privileges
> (CAP_SYS_NICE). The initial review suggested those privileges should
> be higher (PTRACE_MODE_ATTACH), which I've implemented.  However, I'm
> realizing that by moving to the proc interface, the filesystem
> permissions here put yet another barrier in the way.
>
> While the 600 permissions makes initial sense, it does limit these
> controlling tasks with extra privileges (though not root) from
> modifying the timerslack, since they cannot open the file to begin
> with.
>
> So.... Does world writable (plus the PTRACE_MODE_ATTACH_FSCREDS check)
> make more sense here?  Or is there a better way for a system to tweak
> the default permissions for procfs entries? (And if so, does that
> render the PTRACE_MODE_ATTACH... check unnecessary?).
>
> Apologies. I'm fighting a head-cold, so I'm not feeling particularly sharp here.

Is timerslack sensitive at all? You could add the ptrace test to the
_show function too, maybe. Then 0666 would solve the open issue
without leaking the timerslack.

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-02-17 22:45           ` Kees Cook
@ 2016-02-17 22:51             ` John Stultz
  0 siblings, 0 replies; 27+ messages in thread
From: John Stultz @ 2016-02-17 22:51 UTC (permalink / raw)
  To: Kees Cook
  Cc: Andrew Morton, Thomas Gleixner, Arjan van de Ven, lkml,
	Oren Laadan, Ruchi Kandoi, Rom Lemarchand, Android Kernel Team

On Wed, Feb 17, 2016 at 2:45 PM, Kees Cook <keescook@chromium.org> wrote:
> On Wed, Feb 17, 2016 at 2:29 PM, John Stultz <john.stultz@linaro.org> wrote:
>> On Wed, Feb 17, 2016 at 12:18 PM, Andrew Morton
>> <akpm@linux-foundation.org> wrote:
>>> On Wed, 17 Feb 2016 12:09:08 -0800 Kees Cook <keescook@chromium.org> wrote:
>>>> On Wed, Feb 17, 2016 at 11:35 AM, Andrew Morton
>>>> > The procfs file's permissions are 0644, yes?  So a process's
>>>> > timer_slack is world-readable?  hm.
>>>>
>>>> This should be 600, IMO.
>>>
>>> Sounds safer.
>>
>> So I've gone ahead and addressed this and the other feedback you had.
>> But this bit made me realize that I may have missed a key aspect to
>> the interface that Android needs.
>>
>> In particular, the whole point here is to allow a controlling task to
>> modify other tasks' timerslack to limit background tasks' power usage
>> (and to modify them back to normal when the background tasks become
>> foreground tasks). Note that on android different tasks run as
>> different users.
>>
>> Currently, the controlling process has minimally elevated privileges
>> (CAP_SYS_NICE). The initial review suggested those privileges should
>> be higher (PTRACE_MODE_ATTACH), which I've implemented.  However, I'm
>> realizing that by moving to the proc interface, the filesystem
>> permissions here put yet another barrier in the way.
>>
>> While the 600 permissions makes initial sense, it does limit these
>> controlling tasks with extra privileges (though not root) from
>> modifying the timerslack, since they cannot open the file to begin
>> with.
>>
>> So.... Does world writable (plus the PTRACE_MODE_ATTACH_FSCREDS check)
>> make more sense here?  Or is there a better way for a system to tweak
>> the default permissions for procfs entries? (And if so, does that
>> render the PTRACE_MODE_ATTACH... check unnecessary?).
>>
>> Apologies. I'm fighting a head-cold, so I'm not feeling particularly sharp here.
>
> Is timerslack sensitive at all? You could add the ptrace test to the
> _show function too, maybe. Then 0666 would solve the open issue
> without leaking the timerslack.

I don't see how timerslack would be sensitive, but probably many
mistakes start out that way, so not being cavalier about it seems
wise.  :)

Ok. Sounds like you and Andrew are on the same page wrt 666 +
PTRACE_MODE_ATTACH, and that seems like it would be workable.

I'll get that implemented here shortly.

Thanks so much again for the feedback! Really appreciate it!
-john

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-02-17 22:29         ` John Stultz
  2016-02-17 22:45           ` Kees Cook
@ 2016-02-17 22:53           ` Andrew Morton
  1 sibling, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2016-02-17 22:53 UTC (permalink / raw)
  To: John Stultz
  Cc: Kees Cook, Thomas Gleixner, Arjan van de Ven, lkml, Oren Laadan,
	Ruchi Kandoi, Rom Lemarchand, Android Kernel Team

On Wed, 17 Feb 2016 14:29:29 -0800 John Stultz <john.stultz@linaro.org> wrote:

> On Wed, Feb 17, 2016 at 12:18 PM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> > On Wed, 17 Feb 2016 12:09:08 -0800 Kees Cook <keescook@chromium.org> wrote:
> >> On Wed, Feb 17, 2016 at 11:35 AM, Andrew Morton
> >> > The procfs file's permissions are 0644, yes?  So a process's
> >> > timer_slack is world-readable?  hm.
> >>
> >> This should be 600, IMO.
> >
> > Sounds safer.
> 
> So I've gone ahead and addressed this and the other feedback you had.
> But this bit made me realize that I may have missed a key aspect to
> the interface that Android needs.
> 
> In particular, the whole point here is to allow a controlling task to
> modify other tasks' timerslack to limit background tasks' power usage
> (and to modify them back to normal when the background tasks become
> foreground tasks). Note that on android different tasks run as
> different users.
> 
> Currently, the controlling process has minimally elevated privileges
> (CAP_SYS_NICE). The initial review suggested those privileges should
> be higher (PTRACE_MODE_ATTACH), which I've implemented.  However, I'm
> realizing that by moving to the proc interface, the filesystem
> permissions here put yet another barrier in the way.
> 
> While the 600 permissions makes initial sense, it does limit these
> controlling tasks with extra privileges (though not root) from
> modifying the timerslack, since they cannot open the file to begin
> with.
> 
> So.... Does world writable (plus the PTRACE_MODE_ATTACH_FSCREDS check)
> make more sense here?  Or is there a better way for a system to tweak
> the default permissions for procfs entries? (And if so, does that
> render the PTRACE_MODE_ATTACH... check unnecessary?).

I can't immediately think of a problem with it.  Could we check
PTRACE_MODE_ATTACH_FSCREDS in open() to prevent bad guys from reading
our timerslack?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH] proc: /proc/<pid>/timerslack_ns permissions fixes
  2016-02-17  1:06 ` [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface John Stultz
  2016-02-17 19:35   ` Andrew Morton
@ 2016-02-18  5:59   ` John Stultz
  2016-02-18 17:52     ` Kees Cook
  2016-07-13 23:47   ` [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface John Stultz
  2 siblings, 1 reply; 27+ messages in thread
From: John Stultz @ 2016-02-18  5:59 UTC (permalink / raw)
  To: Andrew Morton, Thomas Gleixner, Arjan van de Ven
  Cc: lkml, John Stultz, Oren Laadan, Ruchi Kandoi, Rom Lemarchand,
	Kees Cook, Android Kernel Team

This patch adjusts the timerslack_ns file permissions to be
0666 but requires PTRACE_MODE_ATTACH_FSCREDS to read or write
the value.

This allows tasks with sufficient privledges (CAP_SYS_PTRACE)
to be able to modify a the timerslack for proccesses owned by
a different user.

This patch also fixes a return value from EINVAL to EPERM,
and does task locking consistently, given we're handling u64s
on 32bit systems. It also makes use of kstrtoull_from_user
which simplifies some code.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Android Kernel Team <kernel-team@android.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
This patch applies on top of the previous two patches
which Andrew already added to -mm. It can be folded
down or kept separate as desired.

I've also wired up the Android userspace side to use
this interface, and tested it there, and things seem
to be working properly ( - with some selinux noise, I
still need to figure out the selinux policy changes,
but its working with permissive mode).

 fs/proc/base.c | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index d7c51ca..35f583a 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2262,18 +2262,10 @@ static ssize_t timerslack_ns_write(struct file *file, const char __user *buf,
 {
 	struct inode *inode = file_inode(file);
 	struct task_struct *p;
-	char buffer[PROC_NUMBUF];
 	u64 slack_ns;
 	int err;
 
-	memset(buffer, 0, sizeof(buffer));
-	if (count > sizeof(buffer) - 1)
-		count = sizeof(buffer) - 1;
-
-	if (copy_from_user(buffer, buf, count))
-		return -EFAULT;
-
-	err = kstrtoull(strstrip(buffer), 10, &slack_ns);
+	err = kstrtoull_from_user(buf, count, 10, &slack_ns);
 	if (err < 0)
 		return err;
 
@@ -2282,12 +2274,14 @@ static ssize_t timerslack_ns_write(struct file *file, const char __user *buf,
 		return -ESRCH;
 
 	if (ptrace_may_access(p, PTRACE_MODE_ATTACH_FSCREDS)) {
+		task_lock(p);
 		if (slack_ns == 0)
 			p->timer_slack_ns = p->default_timer_slack_ns;
 		else
 			p->timer_slack_ns = slack_ns;
+		task_unlock(p);
 	} else
-		count = -EINVAL;
+		count = -EPERM;
 
 	put_task_struct(p);
 
@@ -2298,18 +2292,22 @@ static int timerslack_ns_show(struct seq_file *m, void *v)
 {
 	struct inode *inode = m->private;
 	struct task_struct *p;
+	int err =  0;
 
 	p = get_proc_task(inode);
 	if (!p)
 		return -ESRCH;
 
-	task_lock(p);
-	seq_printf(m, "%llu\n", p->timer_slack_ns);
-	task_unlock(p);
+	if (ptrace_may_access(p, PTRACE_MODE_ATTACH_FSCREDS)) {
+		task_lock(p);
+		seq_printf(m, "%llu\n", p->timer_slack_ns);
+		task_unlock(p);
+	} else
+		err = -EPERM;
 
 	put_task_struct(p);
 
-	return 0;
+	return err;
 }
 
 static int timerslack_ns_open(struct inode *inode, struct file *filp)
@@ -2899,7 +2897,7 @@ static const struct pid_entry tgid_base_stuff[] = {
 #ifdef CONFIG_CHECKPOINT_RESTORE
 	REG("timers",	  S_IRUGO, proc_timers_operations),
 #endif
-	REG("timerslack_ns", S_IRUGO|S_IWUSR, proc_pid_set_timerslack_ns_operations),
+	REG("timerslack_ns", S_IRUGO|S_IWUGO, proc_pid_set_timerslack_ns_operations),
 };
 
 static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH] proc: /proc/<pid>/timerslack_ns permissions fixes
  2016-02-18  5:59   ` [PATCH] proc: /proc/<pid>/timerslack_ns permissions fixes John Stultz
@ 2016-02-18 17:52     ` Kees Cook
  0 siblings, 0 replies; 27+ messages in thread
From: Kees Cook @ 2016-02-18 17:52 UTC (permalink / raw)
  To: John Stultz
  Cc: Andrew Morton, Thomas Gleixner, Arjan van de Ven, lkml,
	Oren Laadan, Ruchi Kandoi, Rom Lemarchand, Android Kernel Team

On Wed, Feb 17, 2016 at 9:59 PM, John Stultz <john.stultz@linaro.org> wrote:
> This patch adjusts the timerslack_ns file permissions to be
> 0666 but requires PTRACE_MODE_ATTACH_FSCREDS to read or write
> the value.
>
> This allows tasks with sufficient privledges (CAP_SYS_PTRACE)
> to be able to modify a the timerslack for proccesses owned by
> a different user.
>
> This patch also fixes a return value from EINVAL to EPERM,
> and does task locking consistently, given we're handling u64s
> on 32bit systems. It also makes use of kstrtoull_from_user
> which simplifies some code.
>
> Cc: Arjan van de Ven <arjan@linux.intel.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Oren Laadan <orenl@cellrox.com>
> Cc: Ruchi Kandoi <kandoiruchi@google.com>
> Cc: Rom Lemarchand <romlem@android.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Android Kernel Team <kernel-team@android.com>
> Signed-off-by: John Stultz <john.stultz@linaro.org>

Acked-by: Kees Cook <keescook@chromium.org>

> ---
> This patch applies on top of the previous two patches
> which Andrew already added to -mm. It can be folded
> down or kept separate as desired.

Probably best to fold them together.

-Kees

>
> I've also wired up the Android userspace side to use
> this interface, and tested it there, and things seem
> to be working properly ( - with some selinux noise, I
> still need to figure out the selinux policy changes,
> but its working with permissive mode).
>
>  fs/proc/base.c | 28 +++++++++++++---------------
>  1 file changed, 13 insertions(+), 15 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index d7c51ca..35f583a 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -2262,18 +2262,10 @@ static ssize_t timerslack_ns_write(struct file *file, const char __user *buf,
>  {
>         struct inode *inode = file_inode(file);
>         struct task_struct *p;
> -       char buffer[PROC_NUMBUF];
>         u64 slack_ns;
>         int err;
>
> -       memset(buffer, 0, sizeof(buffer));
> -       if (count > sizeof(buffer) - 1)
> -               count = sizeof(buffer) - 1;
> -
> -       if (copy_from_user(buffer, buf, count))
> -               return -EFAULT;
> -
> -       err = kstrtoull(strstrip(buffer), 10, &slack_ns);
> +       err = kstrtoull_from_user(buf, count, 10, &slack_ns);
>         if (err < 0)
>                 return err;
>
> @@ -2282,12 +2274,14 @@ static ssize_t timerslack_ns_write(struct file *file, const char __user *buf,
>                 return -ESRCH;
>
>         if (ptrace_may_access(p, PTRACE_MODE_ATTACH_FSCREDS)) {
> +               task_lock(p);
>                 if (slack_ns == 0)
>                         p->timer_slack_ns = p->default_timer_slack_ns;
>                 else
>                         p->timer_slack_ns = slack_ns;
> +               task_unlock(p);
>         } else
> -               count = -EINVAL;
> +               count = -EPERM;
>
>         put_task_struct(p);
>
> @@ -2298,18 +2292,22 @@ static int timerslack_ns_show(struct seq_file *m, void *v)
>  {
>         struct inode *inode = m->private;
>         struct task_struct *p;
> +       int err =  0;
>
>         p = get_proc_task(inode);
>         if (!p)
>                 return -ESRCH;
>
> -       task_lock(p);
> -       seq_printf(m, "%llu\n", p->timer_slack_ns);
> -       task_unlock(p);
> +       if (ptrace_may_access(p, PTRACE_MODE_ATTACH_FSCREDS)) {
> +               task_lock(p);
> +               seq_printf(m, "%llu\n", p->timer_slack_ns);
> +               task_unlock(p);
> +       } else
> +               err = -EPERM;
>
>         put_task_struct(p);
>
> -       return 0;
> +       return err;
>  }
>
>  static int timerslack_ns_open(struct inode *inode, struct file *filp)
> @@ -2899,7 +2897,7 @@ static const struct pid_entry tgid_base_stuff[] = {
>  #ifdef CONFIG_CHECKPOINT_RESTORE
>         REG("timers",     S_IRUGO, proc_timers_operations),
>  #endif
> -       REG("timerslack_ns", S_IRUGO|S_IWUSR, proc_pid_set_timerslack_ns_operations),
> +       REG("timerslack_ns", S_IRUGO|S_IWUGO, proc_pid_set_timerslack_ns_operations),
>  };
>
>  static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
> --
> 1.9.1
>



-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-02-17  1:06 ` [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface John Stultz
  2016-02-17 19:35   ` Andrew Morton
  2016-02-18  5:59   ` [PATCH] proc: /proc/<pid>/timerslack_ns permissions fixes John Stultz
@ 2016-07-13 23:47   ` John Stultz
  2016-07-14  3:39     ` Kees Cook
  2 siblings, 1 reply; 27+ messages in thread
From: John Stultz @ 2016-07-13 23:47 UTC (permalink / raw)
  To: Andrew Morton, Thomas Gleixner, Arjan van de Ven
  Cc: lkml, John Stultz, Oren Laadan, Ruchi Kandoi, Rom Lemarchand,
	Kees Cook, Android Kernel Team, Todd Kjos, Colin Cross,
	Nick Kralevich, Dmitry Shmidt, Elliott Hughes

On Tue, Feb 16, 2016 at 5:06 PM, John Stultz <john.stultz@linaro.org> wrote:
> This patch provides a proc/PID/timerslack_ns interface which
> exposes a task's timerslack value in nanoseconds and allows it
> to be changed.
>
> This allows power/performance management software to set timer
> slack for other threads according to its policy for the thread
> (such as when the thread is designated foreground vs. background
> activity)
>
> If the value written is non-zero, slack is set to that value.
> Otherwise sets it to the default for the thread.
>
> This interface checks that the calling task has permissions to
> to use PTRACE_MODE_ATTACH_FSCREDS on the target task, so that we
> can ensure arbitrary apps do not change the timer slack for other
> apps.

Sigh.

So I wanted to pull this thread up again, because when I originally
proposed upstreaming the PR_SET_TIMERSLACK_PID feature from the AOSP
common.git tree, the first objection from Arjan was that it only
required CAP_SYS_NICE:
   http://lkml.iu.edu/hypermail/linux/kernel/1506.3/01491.html

And reasonably, setting timerslack to very large values does have the
potential to effect applications much further then what a task could
do previously with CAP_SYS_NICE.

CAP_SYS_PTRACE was suggested instead, as that allows applications to
manipulate other tasks more drastically.

(At the time, I checked with some of the Android developers, and got
no objection to changing to use this capability.)

However, after submitting the changes to Android required to support
the upstreamed /proc/<tid>/timerslack_ns interface, I've gotten some
objections with adding CAP_SYS_PTRACE to the system_server, as this
would allow the system_server to be able to inspect and modify memory
on any task in the system. This gives the system_server privileged to
effect applications much further then what it could do previously.

So I worry I'm a bit stuck here. For general systems, CAP_SYS_NICE is
too low a level of privilege  to set a tasks timerslack, but
apparently CAP_SYS_PTRACE is too high a privilege for Android's
system_server to require just to set a tasks timerslack value.

So I wanted to ask again if we might consider backing this down to
CAP_SYS_NICE, or if we can instead introduce a new CAP_SYS_TIMERSLACK
or something to provide the needed in-between capability level.

Thoughts?

thanks
-john

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-07-13 23:47   ` [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface John Stultz
@ 2016-07-14  3:39     ` Kees Cook
  2016-07-14  5:29       ` Arjan van de Ven
  2016-07-14 12:48       ` Serge E. Hallyn
  0 siblings, 2 replies; 27+ messages in thread
From: Kees Cook @ 2016-07-14  3:39 UTC (permalink / raw)
  To: John Stultz
  Cc: Andrew Morton, Thomas Gleixner, Arjan van de Ven, lkml,
	Oren Laadan, Ruchi Kandoi, Rom Lemarchand, Android Kernel Team,
	Todd Kjos, Colin Cross, Nick Kralevich, Dmitry Shmidt,
	Elliott Hughes

On Wed, Jul 13, 2016 at 4:47 PM, John Stultz <john.stultz@linaro.org> wrote:
> On Tue, Feb 16, 2016 at 5:06 PM, John Stultz <john.stultz@linaro.org> wrote:
>> This patch provides a proc/PID/timerslack_ns interface which
>> exposes a task's timerslack value in nanoseconds and allows it
>> to be changed.
>>
>> This allows power/performance management software to set timer
>> slack for other threads according to its policy for the thread
>> (such as when the thread is designated foreground vs. background
>> activity)
>>
>> If the value written is non-zero, slack is set to that value.
>> Otherwise sets it to the default for the thread.
>>
>> This interface checks that the calling task has permissions to
>> to use PTRACE_MODE_ATTACH_FSCREDS on the target task, so that we
>> can ensure arbitrary apps do not change the timer slack for other
>> apps.
>
> Sigh.
>
> So I wanted to pull this thread up again, because when I originally
> proposed upstreaming the PR_SET_TIMERSLACK_PID feature from the AOSP
> common.git tree, the first objection from Arjan was that it only
> required CAP_SYS_NICE:
>    http://lkml.iu.edu/hypermail/linux/kernel/1506.3/01491.html
>
> And reasonably, setting timerslack to very large values does have the
> potential to effect applications much further then what a task could
> do previously with CAP_SYS_NICE.
>
> CAP_SYS_PTRACE was suggested instead, as that allows applications to
> manipulate other tasks more drastically.
>
> (At the time, I checked with some of the Android developers, and got
> no objection to changing to use this capability.)
>
> However, after submitting the changes to Android required to support
> the upstreamed /proc/<tid>/timerslack_ns interface, I've gotten some
> objections with adding CAP_SYS_PTRACE to the system_server, as this
> would allow the system_server to be able to inspect and modify memory
> on any task in the system. This gives the system_server privileged to
> effect applications much further then what it could do previously.
>
> So I worry I'm a bit stuck here. For general systems, CAP_SYS_NICE is
> too low a level of privilege  to set a tasks timerslack, but
> apparently CAP_SYS_PTRACE is too high a privilege for Android's
> system_server to require just to set a tasks timerslack value.
>
> So I wanted to ask again if we might consider backing this down to
> CAP_SYS_NICE, or if we can instead introduce a new CAP_SYS_TIMERSLACK
> or something to provide the needed in-between capability level.

Adding new capabilities appears to not really be viable (lots of
threads about this...)

I think the original CAP_SYS_NICE should be fine. A malicious
CAP_SYS_NICE process can do plenty of insane things, I don't feel like
the timer slack adds to any realistic risks.

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-07-14  3:39     ` Kees Cook
@ 2016-07-14  5:29       ` Arjan van de Ven
  2016-07-14 12:48       ` Serge E. Hallyn
  1 sibling, 0 replies; 27+ messages in thread
From: Arjan van de Ven @ 2016-07-14  5:29 UTC (permalink / raw)
  To: Kees Cook, John Stultz
  Cc: Andrew Morton, Thomas Gleixner, lkml, Oren Laadan, Ruchi Kandoi,
	Rom Lemarchand, Android Kernel Team, Todd Kjos, Colin Cross,
	Nick Kralevich, Dmitry Shmidt, Elliott Hughes

On 7/13/2016 8:39 PM, Kees Cook wrote:
>>
>> So I worry I'm a bit stuck here. For general systems, CAP_SYS_NICE is
>> too low a level of privilege  to set a tasks timerslack, but
>> apparently CAP_SYS_PTRACE is too high a privilege for Android's
>> system_server to require just to set a tasks timerslack value.
>>
>> So I wanted to ask again if we might consider backing this down to
>> CAP_SYS_NICE, or if we can instead introduce a new CAP_SYS_TIMERSLACK
>> or something to provide the needed in-between capability level.
>
> Adding new capabilities appears to not really be viable (lots of
> threads about this...)
>
> I think the original CAP_SYS_NICE should be fine. A malicious
> CAP_SYS_NICE process can do plenty of insane things, I don't feel like
> the timer slack adds to any realistic risks.

if the result is really as bad as you describe, then that is worse than
the impact of this being CAP_SYS_NICE, and thus SYS_TRACE is maybe the
purist answer, but not the pragmatic best answer; certainly I don't want
to make the overall system security worse.

I wonder how much you want to set the slack; one of the options (and I don't
know how this will work in the code, if it's horrible don't do it)
is to limit how much slack CAP_SYS_NICE can set (say, 50 or 100 msec, e.g. in the order
of a "time slice" or two if Linux had time slices, similar to what nice would do)
while CAP_SYS_TRACE  can set the full 4 seconds.
If it makes the code horrible, don't do it and just do SYS_NICE.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-07-14  3:39     ` Kees Cook
  2016-07-14  5:29       ` Arjan van de Ven
@ 2016-07-14 12:48       ` Serge E. Hallyn
  2016-07-14 13:42         ` Arjan van de Ven
  2016-07-14 16:09         ` John Stultz
  1 sibling, 2 replies; 27+ messages in thread
From: Serge E. Hallyn @ 2016-07-14 12:48 UTC (permalink / raw)
  To: Kees Cook
  Cc: John Stultz, Andrew Morton, Thomas Gleixner, Arjan van de Ven,
	lkml, Oren Laadan, Ruchi Kandoi, Rom Lemarchand,
	Android Kernel Team, Todd Kjos, Colin Cross, Nick Kralevich,
	Dmitry Shmidt, Elliott Hughes

Quoting Kees Cook (keescook@chromium.org):
> On Wed, Jul 13, 2016 at 4:47 PM, John Stultz <john.stultz@linaro.org> wrote:
> > On Tue, Feb 16, 2016 at 5:06 PM, John Stultz <john.stultz@linaro.org> wrote:
> >> This patch provides a proc/PID/timerslack_ns interface which
> >> exposes a task's timerslack value in nanoseconds and allows it
> >> to be changed.
> >>
> >> This allows power/performance management software to set timer
> >> slack for other threads according to its policy for the thread
> >> (such as when the thread is designated foreground vs. background
> >> activity)
> >>
> >> If the value written is non-zero, slack is set to that value.
> >> Otherwise sets it to the default for the thread.
> >>
> >> This interface checks that the calling task has permissions to
> >> to use PTRACE_MODE_ATTACH_FSCREDS on the target task, so that we
> >> can ensure arbitrary apps do not change the timer slack for other
> >> apps.
> >
> > Sigh.
> >
> > So I wanted to pull this thread up again, because when I originally
> > proposed upstreaming the PR_SET_TIMERSLACK_PID feature from the AOSP
> > common.git tree, the first objection from Arjan was that it only
> > required CAP_SYS_NICE:
> >    http://lkml.iu.edu/hypermail/linux/kernel/1506.3/01491.html
> >
> > And reasonably, setting timerslack to very large values does have the
> > potential to effect applications much further then what a task could
> > do previously with CAP_SYS_NICE.
> >
> > CAP_SYS_PTRACE was suggested instead, as that allows applications to
> > manipulate other tasks more drastically.
> >
> > (At the time, I checked with some of the Android developers, and got
> > no objection to changing to use this capability.)
> >
> > However, after submitting the changes to Android required to support
> > the upstreamed /proc/<tid>/timerslack_ns interface, I've gotten some
> > objections with adding CAP_SYS_PTRACE to the system_server, as this
> > would allow the system_server to be able to inspect and modify memory
> > on any task in the system. This gives the system_server privileged to
> > effect applications much further then what it could do previously.
> >
> > So I worry I'm a bit stuck here. For general systems, CAP_SYS_NICE is
> > too low a level of privilege  to set a tasks timerslack, but
> > apparently CAP_SYS_PTRACE is too high a privilege for Android's
> > system_server to require just to set a tasks timerslack value.
> >
> > So I wanted to ask again if we might consider backing this down to
> > CAP_SYS_NICE, or if we can instead introduce a new CAP_SYS_TIMERSLACK
> > or something to provide the needed in-between capability level.
> 
> Adding new capabilities appears to not really be viable (lots of
> threads about this...)

Sorry - why is this?

> I think the original CAP_SYS_NICE should be fine. A malicious
> CAP_SYS_NICE process can do plenty of insane things, I don't feel like
> the timer slack adds to any realistic risks.

Can someone give a detailed explanation of what you could do with
the new timerslack feature and compare it to what you can do with
sys_nice?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-07-14 12:48       ` Serge E. Hallyn
@ 2016-07-14 13:42         ` Arjan van de Ven
  2016-07-14 16:01           ` John Stultz
  2016-07-14 16:09         ` John Stultz
  1 sibling, 1 reply; 27+ messages in thread
From: Arjan van de Ven @ 2016-07-14 13:42 UTC (permalink / raw)
  To: Serge E. Hallyn, Kees Cook
  Cc: John Stultz, Andrew Morton, Thomas Gleixner, lkml, Oren Laadan,
	Ruchi Kandoi, Rom Lemarchand, Android Kernel Team, Todd Kjos,
	Colin Cross, Nick Kralevich, Dmitry Shmidt, Elliott Hughes

On 7/14/2016 5:48 AM, Serge E. Hallyn wrote:

> Can someone give a detailed explanation of what you could do with
> the new timerslack feature and compare it to what you can do with
> sys_nice?
>

what you can do with the timerslack feature is add upto 4 seconds of extra
time/delay on top of each select()/poll()/nanosleep()/... (basically anything that
uses hrtimers on behalf of the user), and then also control within that
4 second window exactly when that extra delay ends
(which may help a timing attack kind of scenario)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-07-14 13:42         ` Arjan van de Ven
@ 2016-07-14 16:01           ` John Stultz
  0 siblings, 0 replies; 27+ messages in thread
From: John Stultz @ 2016-07-14 16:01 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Serge E. Hallyn, Kees Cook, Andrew Morton, Thomas Gleixner, lkml,
	Oren Laadan, Ruchi Kandoi, Rom Lemarchand, Android Kernel Team,
	Todd Kjos, Colin Cross, Nick Kralevich, Dmitry Shmidt,
	Elliott Hughes

On Thu, Jul 14, 2016 at 6:42 AM, Arjan van de Ven <arjan@linux.intel.com> wrote:
> On 7/14/2016 5:48 AM, Serge E. Hallyn wrote:
>
>> Can someone give a detailed explanation of what you could do with
>> the new timerslack feature and compare it to what you can do with
>> sys_nice?
>>
>
> what you can do with the timerslack feature is add upto 4 seconds of extra
> time/delay on top of each select()/poll()/nanosleep()/... (basically
> anything that
> uses hrtimers on behalf of the user), and then also control within that
> 4 second window exactly when that extra delay ends
> (which may help a timing attack kind of scenario)

So the interface actually allows for 64bits of nanoseconds, so more or
less infinite delay if nothing else is happening.

thanks
-john

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-07-14 12:48       ` Serge E. Hallyn
  2016-07-14 13:42         ` Arjan van de Ven
@ 2016-07-14 16:09         ` John Stultz
  2016-07-14 17:45           ` Kees Cook
  1 sibling, 1 reply; 27+ messages in thread
From: John Stultz @ 2016-07-14 16:09 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Kees Cook, Andrew Morton, Thomas Gleixner, Arjan van de Ven,
	lkml, Oren Laadan, Ruchi Kandoi, Rom Lemarchand,
	Android Kernel Team, Todd Kjos, Colin Cross, Nick Kralevich,
	Dmitry Shmidt, Elliott Hughes

On Thu, Jul 14, 2016 at 5:48 AM, Serge E. Hallyn <serge@hallyn.com> wrote:
> Quoting Kees Cook (keescook@chromium.org):
>> I think the original CAP_SYS_NICE should be fine. A malicious
>> CAP_SYS_NICE process can do plenty of insane things, I don't feel like
>> the timer slack adds to any realistic risks.
>
> Can someone give a detailed explanation of what you could do with
> the new timerslack feature and compare it to what you can do with
> sys_nice?

Looking at the man page for CAP_SYS_NICE, it looks like such a task
can set a task as SCHED_FIFO, so they could fork some spinning
processes and set them all SCHED_FIFO 99, in effect delaying all other
tasks for an infinite amount of time.

So one might argue setting large timerslack vlaues isn't that
different risk wise?

thanks
-john

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-07-14 16:09         ` John Stultz
@ 2016-07-14 17:45           ` Kees Cook
  2016-07-14 17:48             ` Arjan van de Ven
  2016-07-14 17:49             ` Serge E. Hallyn
  0 siblings, 2 replies; 27+ messages in thread
From: Kees Cook @ 2016-07-14 17:45 UTC (permalink / raw)
  To: John Stultz
  Cc: Serge E. Hallyn, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven, lkml, Oren Laadan, Ruchi Kandoi,
	Rom Lemarchand, Android Kernel Team, Todd Kjos, Colin Cross,
	Nick Kralevich, Dmitry Shmidt, Elliott Hughes

On Thu, Jul 14, 2016 at 9:09 AM, John Stultz <john.stultz@linaro.org> wrote:
> On Thu, Jul 14, 2016 at 5:48 AM, Serge E. Hallyn <serge@hallyn.com> wrote:
>> Quoting Kees Cook (keescook@chromium.org):
>>> I think the original CAP_SYS_NICE should be fine. A malicious
>>> CAP_SYS_NICE process can do plenty of insane things, I don't feel like
>>> the timer slack adds to any realistic risks.
>>
>> Can someone give a detailed explanation of what you could do with
>> the new timerslack feature and compare it to what you can do with
>> sys_nice?
>
> Looking at the man page for CAP_SYS_NICE, it looks like such a task
> can set a task as SCHED_FIFO, so they could fork some spinning
> processes and set them all SCHED_FIFO 99, in effect delaying all other
> tasks for an infinite amount of time.
>
> So one might argue setting large timerslack vlaues isn't that
> different risk wise?

Right -- you can hose a system with CAP_SYS_NICE already; I don't
think timerslack realistically changes that.

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-07-14 17:45           ` Kees Cook
@ 2016-07-14 17:48             ` Arjan van de Ven
  2016-07-14 17:49             ` Serge E. Hallyn
  1 sibling, 0 replies; 27+ messages in thread
From: Arjan van de Ven @ 2016-07-14 17:48 UTC (permalink / raw)
  To: Kees Cook, John Stultz
  Cc: Serge E. Hallyn, Andrew Morton, Thomas Gleixner, lkml,
	Oren Laadan, Ruchi Kandoi, Rom Lemarchand, Android Kernel Team,
	Todd Kjos, Colin Cross, Nick Kralevich, Dmitry Shmidt,
	Elliott Hughes

On 7/14/2016 10:45 AM, Kees Cook wrote:
> On Thu, Jul 14, 2016 at 9:09 AM, John Stultz <john.stultz@linaro.org> wrote:
>> On Thu, Jul 14, 2016 at 5:48 AM, Serge E. Hallyn <serge@hallyn.com> wrote:
>>> Quoting Kees Cook (keescook@chromium.org):
>>>> I think the original CAP_SYS_NICE should be fine. A malicious
>>>> CAP_SYS_NICE process can do plenty of insane things, I don't feel like
>>>> the timer slack adds to any realistic risks.
>>>
>>> Can someone give a detailed explanation of what you could do with
>>> the new timerslack feature and compare it to what you can do with
>>> sys_nice?
>>
>> Looking at the man page for CAP_SYS_NICE, it looks like such a task
>> can set a task as SCHED_FIFO, so they could fork some spinning
>> processes and set them all SCHED_FIFO 99, in effect delaying all other
>> tasks for an infinite amount of time.
>>
>> So one might argue setting large timerslack vlaues isn't that
>> different risk wise?
>
> Right -- you can hose a system with CAP_SYS_NICE already; I don't
> think timerslack realistically changes that.

fair enough

the worry of being able to time attack things is there already with the SCHED_FIFO
so... purist objection withdrawn in favor of the pragmatic

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-07-14 17:45           ` Kees Cook
  2016-07-14 17:48             ` Arjan van de Ven
@ 2016-07-14 17:49             ` Serge E. Hallyn
  2016-07-14 17:56               ` Kees Cook
  1 sibling, 1 reply; 27+ messages in thread
From: Serge E. Hallyn @ 2016-07-14 17:49 UTC (permalink / raw)
  To: Kees Cook
  Cc: John Stultz, Serge E. Hallyn, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven, lkml, Oren Laadan, Ruchi Kandoi,
	Rom Lemarchand, Android Kernel Team, Todd Kjos, Colin Cross,
	Nick Kralevich, Dmitry Shmidt, Elliott Hughes

Quoting Kees Cook (keescook@chromium.org):
> On Thu, Jul 14, 2016 at 9:09 AM, John Stultz <john.stultz@linaro.org> wrote:
> > On Thu, Jul 14, 2016 at 5:48 AM, Serge E. Hallyn <serge@hallyn.com> wrote:
> >> Quoting Kees Cook (keescook@chromium.org):
> >>> I think the original CAP_SYS_NICE should be fine. A malicious
> >>> CAP_SYS_NICE process can do plenty of insane things, I don't feel like
> >>> the timer slack adds to any realistic risks.
> >>
> >> Can someone give a detailed explanation of what you could do with
> >> the new timerslack feature and compare it to what you can do with
> >> sys_nice?
> >
> > Looking at the man page for CAP_SYS_NICE, it looks like such a task
> > can set a task as SCHED_FIFO, so they could fork some spinning
> > processes and set them all SCHED_FIFO 99, in effect delaying all other
> > tasks for an infinite amount of time.
> >
> > So one might argue setting large timerslack vlaues isn't that
> > different risk wise?
> 
> Right -- you can hose a system with CAP_SYS_NICE already; I don't
> think timerslack realistically changes that.

Thanks - so it seems to me if we go with CAP_SYS_NICE we are giving
those who can already hose the system another vector to doing so.  But
if we require CAP_SYS_PTRACE then we are giving those who can newly hose
the system also the ability to subvert any task.  It sounds like
CAP_SYS_NICE is the winner.

Kees, you said adding a capability is hard - can you expound on that?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-07-14 17:49             ` Serge E. Hallyn
@ 2016-07-14 17:56               ` Kees Cook
  2016-07-14 20:21                 ` Serge E. Hallyn
  0 siblings, 1 reply; 27+ messages in thread
From: Kees Cook @ 2016-07-14 17:56 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: John Stultz, Andrew Morton, Thomas Gleixner, Arjan van de Ven,
	lkml, Oren Laadan, Ruchi Kandoi, Rom Lemarchand,
	Android Kernel Team, Todd Kjos, Colin Cross, Nick Kralevich,
	Dmitry Shmidt, Elliott Hughes

On Thu, Jul 14, 2016 at 10:49 AM, Serge E. Hallyn <serge@hallyn.com> wrote:
> Kees, you said adding a capability is hard - can you expound on that?

Best I can find at the moment was discussion around CAP_COMPROMISE_KERNEL:
http://thread.gmane.org/gmane.linux.kernel/1459165

Basically, adding a new capability for an interface can create
userspace compatibility problems (though perhaps in this case, it's a
new interface, so a new capability would be okay, but it's such a
narrow use-case and CAP_SYS_NICE fits fine).

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface
  2016-07-14 17:56               ` Kees Cook
@ 2016-07-14 20:21                 ` Serge E. Hallyn
  0 siblings, 0 replies; 27+ messages in thread
From: Serge E. Hallyn @ 2016-07-14 20:21 UTC (permalink / raw)
  To: Kees Cook
  Cc: Serge E. Hallyn, John Stultz, Andrew Morton, Thomas Gleixner,
	Arjan van de Ven, lkml, Oren Laadan, Ruchi Kandoi,
	Rom Lemarchand, Android Kernel Team, Todd Kjos, Colin Cross,
	Nick Kralevich, Dmitry Shmidt, Elliott Hughes

Quoting Kees Cook (keescook@chromium.org):
> On Thu, Jul 14, 2016 at 10:49 AM, Serge E. Hallyn <serge@hallyn.com> wrote:
> > Kees, you said adding a capability is hard - can you expound on that?
> 
> Best I can find at the moment was discussion around CAP_COMPROMISE_KERNEL:
> http://thread.gmane.org/gmane.linux.kernel/1459165

Hm, the last discussion I recall around that topic involved a confusing
negative capability iirc, I assume CAP_COMPROMISE_KERNEL was the revamped
version.

> Basically, adding a new capability for an interface can create
> userspace compatibility problems (though perhaps in this case, it's a
> new interface, so a new capability would be okay, but it's such a
> narrow use-case and CAP_SYS_NICE fits fine).

Right, there are two ways they can be added.  For new functionality,
no big deal.  (Of course we'd like to avoid going beyond 64 bits of cap
too soon, so don't want to go crazy).

The other is when we want to split off a more fine-grained version of
an existing capability.  Then we just have to make sure that the coarser
pre-existing capability continues to work as expected.  Breaking out
CAP_SYSLOG from CAP_SYS_ADMIN was an example of that.

-serge

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2016-07-14 20:21 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-17  1:06 [PATCH 0/2] Extend timer_slack_ns to u64 on 32bit systems & add /proc/<pid>/timerslack_ns John Stultz
2016-02-17  1:06 ` [PATCH 1/2] timer: Convert timer_slack_ns from unsigned long to u64 John Stultz
2016-02-17  1:06 ` [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface John Stultz
2016-02-17 19:35   ` Andrew Morton
2016-02-17 20:09     ` Kees Cook
2016-02-17 20:18       ` Andrew Morton
2016-02-17 20:51         ` John Stultz
2016-02-17 21:07           ` Andrew Morton
2016-02-17 22:29         ` John Stultz
2016-02-17 22:45           ` Kees Cook
2016-02-17 22:51             ` John Stultz
2016-02-17 22:53           ` Andrew Morton
2016-02-17 20:49     ` John Stultz
2016-02-18  5:59   ` [PATCH] proc: /proc/<pid>/timerslack_ns permissions fixes John Stultz
2016-02-18 17:52     ` Kees Cook
2016-07-13 23:47   ` [PATCH 2/2] proc: Add /proc/<pid>/timerslack_ns interface John Stultz
2016-07-14  3:39     ` Kees Cook
2016-07-14  5:29       ` Arjan van de Ven
2016-07-14 12:48       ` Serge E. Hallyn
2016-07-14 13:42         ` Arjan van de Ven
2016-07-14 16:01           ` John Stultz
2016-07-14 16:09         ` John Stultz
2016-07-14 17:45           ` Kees Cook
2016-07-14 17:48             ` Arjan van de Ven
2016-07-14 17:49             ` Serge E. Hallyn
2016-07-14 17:56               ` Kees Cook
2016-07-14 20:21                 ` Serge E. Hallyn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).