linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 4.9 103/107] fs/proc/base.c: use ns_capable instead of capable for timerslack_ns
       [not found] <20190128161947.57405-1-sashal@kernel.org>
@ 2019-01-28 16:19 ` Sasha Levin
  2019-01-28 16:19 ` [PATCH AUTOSEL 4.9 105/107] proc/sysctl: fix return error for proc_doulongvec_minmax() Sasha Levin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2019-01-28 16:19 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Benjamin Gordon, John Stultz, Eric W. Biederman, Kees Cook,
	Serge E. Hallyn, Thomas Gleixner, Arjan van de Ven, Oren Laadan,
	Ruchi Kandoi, Rom Lemarchand, Todd Kjos, Colin Cross,
	Nick Kralevich, Dmitry Shmidt, Elliott Hughes, Alexey Dobriyan,
	Andrew Morton, Linus Torvalds, Sasha Levin, linux-fsdevel

From: Benjamin Gordon <bmgordon@google.com>

[ Upstream commit 8da0b4f692c6d90b09c91f271517db746a22ff67 ]

Access to timerslack_ns is controlled by a process having CAP_SYS_NICE
in its effective capability set, but the current check looks in the root
namespace instead of the process' user namespace.  Since a process is
allowed to do other activities controlled by CAP_SYS_NICE inside a
namespace, it should also be able to adjust timerslack_ns.

Link: http://lkml.kernel.org/r/20181030180012.232896-1-bmgordon@google.com
Signed-off-by: Benjamin Gordon <bmgordon@google.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Oren Laadan <orenl@cellrox.com>
Cc: Ruchi Kandoi <kandoiruchi@google.com>
Cc: Rom Lemarchand <romlem@android.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Colin Cross <ccross@android.com>
Cc: Nick Kralevich <nnk@google.com>
Cc: Dmitry Shmidt <dimitrysh@google.com>
Cc: Elliott Hughes <enh@google.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/proc/base.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 79702d405ba7..f73de326c630 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2337,10 +2337,13 @@ static ssize_t timerslack_ns_write(struct file *file, const char __user *buf,
 		return -ESRCH;
 
 	if (p != current) {
-		if (!capable(CAP_SYS_NICE)) {
+		rcu_read_lock();
+		if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
+			rcu_read_unlock();
 			count = -EPERM;
 			goto out;
 		}
+		rcu_read_unlock();
 
 		err = security_task_setscheduler(p);
 		if (err) {
@@ -2373,11 +2376,14 @@ static int timerslack_ns_show(struct seq_file *m, void *v)
 		return -ESRCH;
 
 	if (p != current) {
-
-		if (!capable(CAP_SYS_NICE)) {
+		rcu_read_lock();
+		if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
+			rcu_read_unlock();
 			err = -EPERM;
 			goto out;
 		}
+		rcu_read_unlock();
+
 		err = security_task_getscheduler(p);
 		if (err)
 			goto out;
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH AUTOSEL 4.9 105/107] proc/sysctl: fix return error for proc_doulongvec_minmax()
       [not found] <20190128161947.57405-1-sashal@kernel.org>
  2019-01-28 16:19 ` [PATCH AUTOSEL 4.9 103/107] fs/proc/base.c: use ns_capable instead of capable for timerslack_ns Sasha Levin
@ 2019-01-28 16:19 ` Sasha Levin
  2019-01-28 16:19 ` [PATCH AUTOSEL 4.9 106/107] fs/epoll: drop ovflist branch prediction Sasha Levin
  2019-01-28 16:19 ` [PATCH AUTOSEL 4.9 107/107] exec: load_script: don't blindly truncate shebang string Sasha Levin
  3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2019-01-28 16:19 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Cheng Lin, Alexey Dobriyan, Andrew Morton, Linus Torvalds,
	Sasha Levin, linux-fsdevel

From: Cheng Lin <cheng.lin130@zte.com.cn>

[ Upstream commit 09be178400829dddc1189b50a7888495dd26aa84 ]

If the number of input parameters is less than the total parameters, an
EINVAL error will be returned.

For example, we use proc_doulongvec_minmax to pass up to two parameters
with kern_table:

{
	.procname       = "monitor_signals",
	.data           = &monitor_sigs,
	.maxlen         = 2*sizeof(unsigned long),
	.mode           = 0644,
	.proc_handler   = proc_doulongvec_minmax,
},

Reproduce:

When passing two parameters, it's work normal.  But passing only one
parameter, an error "Invalid argument"(EINVAL) is returned.

  [root@cl150 ~]# echo 1 2 > /proc/sys/kernel/monitor_signals
  [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
  1       2
  [root@cl150 ~]# echo 3 > /proc/sys/kernel/monitor_signals
  -bash: echo: write error: Invalid argument
  [root@cl150 ~]# echo $?
  1
  [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
  3       2
  [root@cl150 ~]#

The following is the result after apply this patch.  No error is
returned when the number of input parameters is less than the total
parameters.

  [root@cl150 ~]# echo 1 2 > /proc/sys/kernel/monitor_signals
  [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
  1       2
  [root@cl150 ~]# echo 3 > /proc/sys/kernel/monitor_signals
  [root@cl150 ~]# echo $?
  0
  [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
  3       2
  [root@cl150 ~]#

There are three processing functions dealing with digital parameters,
__do_proc_dointvec/__do_proc_douintvec/__do_proc_doulongvec_minmax.

This patch deals with __do_proc_doulongvec_minmax, just as
__do_proc_dointvec does, adding a check for parameters 'left'.  In
__do_proc_douintvec, its code implementation explicitly does not support
multiple inputs.

static int __do_proc_douintvec(...){
         ...
         /*
          * Arrays are not supported, keep this simple. *Do not* add
          * support for them.
          */
         if (vleft != 1) {
                 *lenp = 0;
                 return -EINVAL;
         }
         ...
}

So, just __do_proc_doulongvec_minmax has the problem.  And most use of
proc_doulongvec_minmax/proc_doulongvec_ms_jiffies_minmax just have one
parameter.

Link: http://lkml.kernel.org/r/1544081775-15720-1-git-send-email-cheng.lin130@zte.com.cn
Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/sysctl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 23f658d311c0..93c7b02279b9 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2503,6 +2503,8 @@ static int __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, int
 			bool neg;
 
 			left -= proc_skip_spaces(&p);
+			if (!left)
+				break;
 
 			err = proc_get_long(&p, &left, &val, &neg,
 					     proc_wspace_sep,
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH AUTOSEL 4.9 106/107] fs/epoll: drop ovflist branch prediction
       [not found] <20190128161947.57405-1-sashal@kernel.org>
  2019-01-28 16:19 ` [PATCH AUTOSEL 4.9 103/107] fs/proc/base.c: use ns_capable instead of capable for timerslack_ns Sasha Levin
  2019-01-28 16:19 ` [PATCH AUTOSEL 4.9 105/107] proc/sysctl: fix return error for proc_doulongvec_minmax() Sasha Levin
@ 2019-01-28 16:19 ` Sasha Levin
  2019-01-28 16:19 ` [PATCH AUTOSEL 4.9 107/107] exec: load_script: don't blindly truncate shebang string Sasha Levin
  3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2019-01-28 16:19 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Davidlohr Bueso, Davidlohr Bueso, Al Viro, Jason Baron,
	Andrew Morton, Linus Torvalds, Sasha Levin, linux-fsdevel

From: Davidlohr Bueso <dave@stgolabs.net>

[ Upstream commit 76699a67f3041ff4c7af6d6ee9be2bfbf1ffb671 ]

The ep->ovflist is a secondary ready-list to temporarily store events
that might occur when doing sproc without holding the ep->wq.lock.  This
accounts for every time we check for ready events and also send events
back to userspace; both callbacks, particularly the latter because of
copy_to_user, can account for a non-trivial time.

As such, the unlikely() check to see if the pointer is being used, seems
both misleading and sub-optimal.  In fact, we go to an awful lot of
trouble to sync both lists, and populating the ovflist is far from an
uncommon scenario.

For example, profiling a concurrent epoll_wait(2) benchmark, with
CONFIG_PROFILE_ANNOTATED_BRANCHES shows that for a two threads a 33%
incorrect rate was seen; and when incrementally increasing the number of
epoll instances (which is used, for example for multiple queuing load
balancing models), up to a 90% incorrect rate was seen.

Similarly, by deleting the prediction, 3% throughput boost was seen
across incremental threads.

Link: http://lkml.kernel.org/r/20181108051006.18751-4-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jason Baron <jbaron@akamai.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/eventpoll.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 3cbc30413add..a9c0bf8782f5 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1040,7 +1040,7 @@ static int ep_poll_callback(wait_queue_t *wait, unsigned mode, int sync, void *k
 	 * semantics). All the events that happen during that period of time are
 	 * chained in ep->ovflist and requeued later on.
 	 */
-	if (unlikely(ep->ovflist != EP_UNACTIVE_PTR)) {
+	if (ep->ovflist != EP_UNACTIVE_PTR) {
 		if (epi->next == EP_UNACTIVE_PTR) {
 			epi->next = ep->ovflist;
 			ep->ovflist = epi;
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH AUTOSEL 4.9 107/107] exec: load_script: don't blindly truncate shebang string
       [not found] <20190128161947.57405-1-sashal@kernel.org>
                   ` (2 preceding siblings ...)
  2019-01-28 16:19 ` [PATCH AUTOSEL 4.9 106/107] fs/epoll: drop ovflist branch prediction Sasha Levin
@ 2019-01-28 16:19 ` Sasha Levin
  3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2019-01-28 16:19 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Oleg Nesterov, Ben Woodard, Eric W. Biederman, Andrew Morton,
	Linus Torvalds, Sasha Levin, linux-fsdevel

From: Oleg Nesterov <oleg@redhat.com>

[ Upstream commit 8099b047ecc431518b9bb6bdbba3549bbecdc343 ]

load_script() simply truncates bprm->buf and this is very wrong if the
length of shebang string exceeds BINPRM_BUF_SIZE-2.  This can silently
truncate i_arg or (worse) we can execute the wrong binary if buf[2:126]
happens to be the valid executable path.

Change load_script() to return ENOEXEC if it can't find '\n' or zero in
bprm->buf.  Note that '\0' can come from either
prepare_binprm()->memset() or from kernel_read(), we do not care.

Link: http://lkml.kernel.org/r/20181112160931.GA28463@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Ben Woodard <woodard@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/binfmt_script.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/binfmt_script.c b/fs/binfmt_script.c
index afdf4e3cafc2..634bdbb23851 100644
--- a/fs/binfmt_script.c
+++ b/fs/binfmt_script.c
@@ -43,10 +43,14 @@ static int load_script(struct linux_binprm *bprm)
 	fput(bprm->file);
 	bprm->file = NULL;
 
-	bprm->buf[BINPRM_BUF_SIZE - 1] = '\0';
-	if ((cp = strchr(bprm->buf, '\n')) == NULL)
-		cp = bprm->buf+BINPRM_BUF_SIZE-1;
+	for (cp = bprm->buf+2;; cp++) {
+		if (cp >= bprm->buf + BINPRM_BUF_SIZE)
+			return -ENOEXEC;
+		if (!*cp || (*cp == '\n'))
+			break;
+	}
 	*cp = '\0';
+
 	while (cp > bprm->buf) {
 		cp--;
 		if ((*cp == ' ') || (*cp == '\t'))
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-01-28 16:41 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20190128161947.57405-1-sashal@kernel.org>
2019-01-28 16:19 ` [PATCH AUTOSEL 4.9 103/107] fs/proc/base.c: use ns_capable instead of capable for timerslack_ns Sasha Levin
2019-01-28 16:19 ` [PATCH AUTOSEL 4.9 105/107] proc/sysctl: fix return error for proc_doulongvec_minmax() Sasha Levin
2019-01-28 16:19 ` [PATCH AUTOSEL 4.9 106/107] fs/epoll: drop ovflist branch prediction Sasha Levin
2019-01-28 16:19 ` [PATCH AUTOSEL 4.9 107/107] exec: load_script: don't blindly truncate shebang string Sasha Levin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).