linux-kernel.vger.kernel.org archive mirror
From: Mandeep Singh Baines <msb@chromium.org>
To: David Rientjes <rientjes@google.com>
Cc: Mandeep Singh Baines <msb@chromium.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Rik van Riel <riel@redhat.com>, Ying Han <yinghan@google.com>,
	linux-kernel@vger.kernel.org, gspencer@chromium.org,
	piman@chromium.org, wad@chromium.org, olofj@chromium.org,
	Bodo Eggert <7eggert@web.de>
Subject: [PATCH v3] oom: allow a non-CAP_SYS_RESOURCE process to oom_score_adj down
Date: Mon, 15 Nov 2010 16:03:59 -0800
Message-ID: <20101116000359.GS7363@google.com>
In-Reply-To: <alpine.DEB.2.00.1011151404080.17082@chino.kir.corp.google.com>

We'd like to be able to adjust a process's oom_score_adj up and down as it
leaves and enters the foreground. Currently, it is not possible to lower
oom_score_adj without CAP_SYS_RESOURCE. This patch allows a task to decrease
its oom_score_adj back to the last value set by a CAP_SYS_RESOURCE thread, or
to the value it inherited at fork. Assuming the thread that forked it has an
oom_score_adj of 0, each tab process could decrease it back to 0 upon
activation unless a CAP_SYS_RESOURCE thread elevated it to something higher.
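
As an illustration of the intended use, here is a minimal userspace sketch
of a tab process adjusting its own score on focus changes. It is not part of
the patch: the helper names and the values 300 and 0 are hypothetical, and
the path is passed in as a parameter (it would normally be
/proc/self/oom_score_adj) so the helper can also be pointed at a regular
file.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical policy values; a real browser would choose its own. */
#define BACKGROUND_ADJ 300
#define FOREGROUND_ADJ 0

/*
 * Write `value` to an oom_score_adj file.
 * `path` would normally be "/proc/self/oom_score_adj".
 * Returns 0 on success or a negative errno value.
 */
static int write_oom_score_adj(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int err = 0;

	if (!f)
		return -errno;
	if (fprintf(f, "%d\n", value) < 0)
		err = -EIO;
	/* stdio buffers the data, so a rejected write shows up at close. */
	if (fclose(f) != 0 && err == 0)
		err = -errno;
	return err;
}

/* Tab lost focus: make it a more likely OOM victim. */
static int tab_enter_background(const char *path)
{
	return write_oom_score_adj(path, BACKGROUND_ADJ);
}

/* Tab regained focus: lower the score again. */
static int tab_enter_foreground(const char *path)
{
	return write_oom_score_adj(path, FOREGROUND_ADJ);
}
```

Without this patch, the tab_enter_foreground() write would be expected to
fail with EACCES for a task lacking CAP_SYS_RESOURCE, since it lowers the
current value.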

Alternatives considered:

* a setuid binary
* a daemon with CAP_SYS_RESOURCE

Since you don't want all processes to be able to reduce their
oom_score_adj, a setuid binary or daemon implementation would be complex.
The alternatives also have much higher overhead.

This patch was updated from the original patch based on feedback from
David Rientjes <rientjes@google.com>.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Acked-by: David Rientjes <rientjes@google.com>
---
 Documentation/filesystems/proc.txt |    4 ++++
 fs/proc/base.c                     |    4 +++-
 include/linux/sched.h              |    2 ++
 kernel/fork.c                      |    1 +
 4 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index e73df27..7139c50 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -1296,6 +1296,10 @@ scaled linearly with /proc/<pid>/oom_score_adj.
 Writing to /proc/<pid>/oom_score_adj or /proc/<pid>/oom_adj will change the
 other with its scaled value.
 
+The value of /proc/<pid>/oom_score_adj may be reduced no lower than the last
+value set by a CAP_SYS_RESOURCE process. To reduce the value any lower
+requires CAP_SYS_RESOURCE.
+
 NOTICE: /proc/<pid>/oom_adj is deprecated and will be removed, please see
 Documentation/feature-removal-schedule.txt.
 
diff --git a/fs/proc/base.c b/fs/proc/base.c
index f3d02ca..7b1a9df 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1164,7 +1164,7 @@ static ssize_t oom_score_adj_write(struct file *file, const char __user *buf,
 		goto err_task_lock;
 	}
 
-	if (oom_score_adj < task->signal->oom_score_adj &&
+	if (oom_score_adj < task->signal->oom_score_adj_min &&
 			!capable(CAP_SYS_RESOURCE)) {
 		err = -EACCES;
 		goto err_sighand;
@@ -1177,6 +1177,8 @@ static ssize_t oom_score_adj_write(struct file *file, const char __user *buf,
 			atomic_dec(&task->mm->oom_disable_count);
 	}
 	task->signal->oom_score_adj = oom_score_adj;
+	if (has_capability_noaudit(current, CAP_SYS_RESOURCE))
+		task->signal->oom_score_adj_min = oom_score_adj;
 	/*
 	 * Scale /proc/pid/oom_adj appropriately ensuring that OOM_DISABLE is
 	 * always attainable.
diff --git a/include/linux/sched.h b/include/linux/sched.h
index f53cdf2..2a71ee0 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -626,6 +626,8 @@ struct signal_struct {
 
 	int oom_adj;		/* OOM kill score adjustment (bit shift) */
 	int oom_score_adj;	/* OOM kill score adjustment */
+	int oom_score_adj_min;	/* OOM kill score adjustment minimum value.
+				 * Only settable by CAP_SYS_RESOURCE. */
 
 	struct mutex cred_guard_mutex;	/* guard against foreign influences on
 					 * credential calculations
diff --git a/kernel/fork.c b/kernel/fork.c
index 3b159c5..0979527 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -907,6 +907,7 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
 
 	sig->oom_adj = current->signal->oom_adj;
 	sig->oom_score_adj = current->signal->oom_score_adj;
+	sig->oom_score_adj_min = current->signal->oom_score_adj_min;
 
 	mutex_init(&sig->cred_guard_mutex);
 
-- 
1.7.3.1
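
The rule that the fs/proc/base.c and kernel/fork.c hunks above implement can
be modeled in plain userspace C. This is only a sketch for reasoning about
the semantics, not kernel code: struct oom_model, model_write() and
model_fork() are hypothetical names, and the has_cap flag stands in for the
CAP_SYS_RESOURCE check. The -1000..1000 range check mirrors the existing
bounds on oom_score_adj and is not something this patch changes.

```c
#include <errno.h>
#include <stdbool.h>

#define OOM_SCORE_ADJ_MIN (-1000)
#define OOM_SCORE_ADJ_MAX 1000

/* Hypothetical stand-in for the two fields kept in signal_struct. */
struct oom_model {
	int oom_score_adj;	/* current adjustment */
	int oom_score_adj_min;	/* floor last set by a privileged writer */
};

/*
 * Model of a write to /proc/<pid>/oom_score_adj.
 * has_cap models whether the writer holds CAP_SYS_RESOURCE.
 */
static int model_write(struct oom_model *m, int value, bool has_cap)
{
	if (value < OOM_SCORE_ADJ_MIN || value > OOM_SCORE_ADJ_MAX)
		return -EINVAL;
	/* Unprivileged writers may not go below the recorded floor. */
	if (value < m->oom_score_adj_min && !has_cap)
		return -EACCES;
	m->oom_score_adj = value;
	/* Only privileged writers move the floor. */
	if (has_cap)
		m->oom_score_adj_min = value;
	return 0;
}

/* Model of fork: the child inherits both the value and the floor. */
static struct oom_model model_fork(const struct oom_model *parent)
{
	return *parent;
}
```

In this model, a child forked from a parent with both fields at 0 can raise
itself to 300 and return to 0 without privilege, but cannot go to -1. After
a privileged write of 500, the floor becomes 500 and an unprivileged write
of 0 is refused.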

