* [PATCH 1/2] core: add lockless update_rlimit_cpu
@ 2009-09-02 9:45 Jiri Slaby
2009-09-02 9:45 ` [PATCH 2/2] core: allow setrlimit to non-current tasks Jiri Slaby
0 siblings, 1 reply; 22+ messages in thread
From: Jiri Slaby @ 2009-09-02 9:45 UTC (permalink / raw)
To: oleg; +Cc: akpm, mingo, linux-kernel, Jiri Slaby
Oleg Nesterov wrote:
> But I dislike the fact the patch uses tasklist_lock. Can't
> lock_task_sighand() work for you? (of course, in this case
> update_rlimit_cpu() should be updated too).
Yeah, I think so. I didn't know about that, sorry.
Something like these two?
--
Rename update_rlimit_cpu to __update_rlimit_cpu and create
update_rlimit_cpu wrapper with sighand lock.
The former is to be used by users who already hold the lock.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
---
include/linux/posix-timers.h | 1 +
kernel/posix-cpu-timers.c | 14 +++++++++-----
2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 3e23844..d0bac29 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -117,6 +117,7 @@ void set_process_cpu_timer(struct task_struct *task, unsigned int clock_idx,
long clock_nanosleep_restart(struct restart_block *restart_block);
+void __update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new);
void update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new);
#endif
diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index a61c625..56eb1d5 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -12,17 +12,21 @@
/*
* Called after updating RLIMIT_CPU to set timer expiration if necessary.
*/
-void update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new)
+void __update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new)
{
cputime_t cputime = secs_to_cputime(rlim_new);
struct signal_struct *const sig = task->signal;
if (cputime_eq(sig->it[CPUCLOCK_PROF].expires, cputime_zero) ||
- cputime_gt(sig->it[CPUCLOCK_PROF].expires, cputime)) {
- spin_lock_irq(&task->sighand->siglock);
+ cputime_gt(sig->it[CPUCLOCK_PROF].expires, cputime))
set_process_cpu_timer(task, CPUCLOCK_PROF, &cputime, NULL);
- spin_unlock_irq(&task->sighand->siglock);
- }
+}
+
+void update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new)
+{
+ spin_lock_irq(&task->sighand->siglock);
+ __update_rlimit_cpu(task, rlim_new);
+ spin_unlock_irq(&task->sighand->siglock);
}
static int check_clock(const clockid_t which_clock)
--
1.6.3.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 2/2] core: allow setrlimit to non-current tasks
2009-09-02 9:45 [PATCH 1/2] core: add lockless update_rlimit_cpu Jiri Slaby
@ 2009-09-02 9:45 ` Jiri Slaby
2009-09-02 9:47 ` Jiri Slaby
2009-09-02 13:50 ` Oleg Nesterov
0 siblings, 2 replies; 22+ messages in thread
From: Jiri Slaby @ 2009-09-02 9:45 UTC (permalink / raw)
To: oleg; +Cc: akpm, mingo, linux-kernel, Jiri Slaby
Add locking to allow setrlimit accept task parameter other than
current.
Namely, lock tasklist_lock for read and check whether the task
structure has signal and sighand non-null. Do all the signal
processing under that lock still held.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
---
kernel/sys.c | 25 ++++++++++++++++++-------
1 files changed, 18 insertions(+), 7 deletions(-)
diff --git a/kernel/sys.c b/kernel/sys.c
index 919cebe..4a55bef 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1240,20 +1240,28 @@ int setrlimit(struct task_struct *tsk, unsigned int resource,
struct rlimit *new_rlim)
{
struct rlimit *old_rlim;
+ unsigned long flags;
int retval;
if (new_rlim->rlim_cur > new_rlim->rlim_max)
return -EINVAL;
+
+ if (lock_task_sighand(tsk, &flags) == NULL)
+ return -ESRCH;
old_rlim = tsk->signal->rlim + resource;
if ((new_rlim->rlim_max > old_rlim->rlim_max) &&
- !capable(CAP_SYS_RESOURCE))
- return -EPERM;
- if (resource == RLIMIT_NOFILE && new_rlim->rlim_max > sysctl_nr_open)
- return -EPERM;
+ !capable(CAP_SYS_RESOURCE)) {
+ retval = -EPERM;
+ goto unlock;
+ }
+ if (resource == RLIMIT_NOFILE && new_rlim->rlim_max > sysctl_nr_open) {
+ retval = -EPERM;
+ goto unlock;
+ }
retval = security_task_setrlimit(tsk, resource, new_rlim);
if (retval)
- return retval;
+ goto unlock;
if (resource == RLIMIT_CPU && new_rlim->rlim_cur == 0) {
/*
@@ -1281,9 +1289,12 @@ int setrlimit(struct task_struct *tsk, unsigned int resource,
if (new_rlim->rlim_cur == RLIM_INFINITY)
goto out;
- update_rlimit_cpu(tsk, new_rlim->rlim_cur);
+ __update_rlimit_cpu(tsk, new_rlim->rlim_cur);
out:
- return 0;
+ retval = 0;
+unlock:
+ unlock_task_sighand(tsk, &flags);
+ return retval;
}
SYSCALL_DEFINE2(setrlimit, unsigned int, resource, struct rlimit __user *, rlim)
--
1.6.3.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] core: allow setrlimit to non-current tasks
2009-09-02 9:45 ` [PATCH 2/2] core: allow setrlimit to non-current tasks Jiri Slaby
@ 2009-09-02 9:47 ` Jiri Slaby
2009-09-02 13:50 ` Oleg Nesterov
1 sibling, 0 replies; 22+ messages in thread
From: Jiri Slaby @ 2009-09-02 9:47 UTC (permalink / raw)
To: Jiri Slaby; +Cc: oleg, akpm, mingo, linux-kernel
On 09/02/2009 11:45 AM, Jiri Slaby wrote:
> Namely, lock tasklist_lock for read and check whether the task
> structure has signal and sighand non-null. Do all the signal
> processing under that lock still held.
The description is incorrect now, indeed...
And I broke mail threads, sorry.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] core: allow setrlimit to non-current tasks
2009-09-02 9:45 ` [PATCH 2/2] core: allow setrlimit to non-current tasks Jiri Slaby
2009-09-02 9:47 ` Jiri Slaby
@ 2009-09-02 13:50 ` Oleg Nesterov
2009-09-02 18:44 ` Jiri Slaby
1 sibling, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-09-02 13:50 UTC (permalink / raw)
To: Jiri Slaby; +Cc: akpm, mingo, linux-kernel
On 09/02, Jiri Slaby wrote:
>
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -1240,20 +1240,28 @@ int setrlimit(struct task_struct *tsk, unsigned int resource,
> struct rlimit *new_rlim)
> {
> struct rlimit *old_rlim;
> + unsigned long flags;
> int retval;
>
> if (new_rlim->rlim_cur > new_rlim->rlim_max)
> return -EINVAL;
> +
> + if (lock_task_sighand(tsk, &flags) == NULL)
> + return -ESRCH;
No, sorry, this can't work.
Because we need task_lock() to update rlimits, and ->alloc_lock does not
nest under ->siglock.
Looks like we have to use tasklist_lock, but please don't use _irq, and
please do not check ->signal != NULL. Perhaps it makes sense to take
tasklist only if !same_thread_group(tsk, current) though.
Oh. We really need to make ->signal refcountable.
But there is another minor problem. If we use read_lock(ttasklist), then
the write to /proc/application_pid/limits can race with application doing
sys_setrlimits().
Nothing bad can happen, but this means that "echo ... > /proc/limits" can
be lost. Not good, if admin wants to lower ->rlim_max we should try to ensure
this always works.
Oleg.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] core: allow setrlimit to non-current tasks
2009-09-02 13:50 ` Oleg Nesterov
@ 2009-09-02 18:44 ` Jiri Slaby
2009-09-02 21:51 ` Oleg Nesterov
0 siblings, 1 reply; 22+ messages in thread
From: Jiri Slaby @ 2009-09-02 18:44 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: akpm, mingo, linux-kernel
On 09/02/2009 03:50 PM, Oleg Nesterov wrote:
> But there is another minor problem. If we use read_lock(ttasklist), then
> the write to /proc/application_pid/limits can race with application doing
> sys_setrlimits().
>
> Nothing bad can happen, but this means that "echo ... > /proc/limits" can
> be lost. Not good, if admin wants to lower ->rlim_max we should try to ensure
> this always works.
Actually, process cpu timer may be set to a wrong value. When
* somebody unrelated holds sighand->siglock
* process one stores rlim_new to rlim and gets stuck on spin_lock(siglock)
* process two does the same
* somebody releases sighand->siglock
* process one continues...
I can't think of anything else than doing all the checks and updates
under alloc_lock, introducing coarse grained static mutex in setrlimit
to protect it, or some other(I-don't-know-of)/new lock. Any ideas?
Thanks.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] core: allow setrlimit to non-current tasks
2009-09-02 18:44 ` Jiri Slaby
@ 2009-09-02 21:51 ` Oleg Nesterov
2009-09-03 13:47 ` Jiri Slaby
0 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-09-02 21:51 UTC (permalink / raw)
To: Jiri Slaby; +Cc: akpm, mingo, linux-kernel
On 09/02, Jiri Slaby wrote:
>
> On 09/02/2009 03:50 PM, Oleg Nesterov wrote:
> > But there is another minor problem. If we use read_lock(ttasklist), then
> > the write to /proc/application_pid/limits can race with application doing
> > sys_setrlimits().
> >
> > Nothing bad can happen, but this means that "echo ... > /proc/limits" can
> > be lost. Not good, if admin wants to lower ->rlim_max we should try to ensure
> > this always works.
>
> Actually, process cpu timer may be set to a wrong value. When
Yes, I thought about this too. In fact I was going to complain, but then
decided this is OK.
> * somebody unrelated holds sighand->siglock
> * process one stores rlim_new to rlim and gets stuck on spin_lock(siglock)
> * process two does the same
s/process/thread/. (I am talking about the current code). IOW, if the
application is stupid and does setrlimit() from multimple threads at
the same time - we can't help, the result is not predictable.
But, unless I missed something I think this case is fine, please see below.
> * somebody releases sighand->siglock
> * process one continues...
Now, it is possible that cputime_expires.xxx_exp does not match
->rlim[RLIMIT_CPU].rlim_cur.
But we don't care. update_rlimit_cpu() must ensure that cputime_expires.xxx_exp
is not greater than necessary, nothing else.
> I can't think of anything else than doing all the checks and updates
> under alloc_lock, introducing coarse grained static mutex in setrlimit
> to protect it,
Oh, please don't ;)
Or I missed your point?
But if you mean this series, then yes, I agree. We should try to do something
to ensure that at least rlim_max can be always lowered when admin writes to
/proc/pid/limits.
Oleg.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/2] core: allow setrlimit to non-current tasks
2009-09-02 21:51 ` Oleg Nesterov
@ 2009-09-03 13:47 ` Jiri Slaby
2009-09-03 13:52 ` [PATCH] " Jiri Slaby
` (2 more replies)
0 siblings, 3 replies; 22+ messages in thread
From: Jiri Slaby @ 2009-09-03 13:47 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: akpm, mingo, linux-kernel
On 09/02/2009 11:51 PM, Oleg Nesterov wrote:
> On 09/02, Jiri Slaby wrote:
>> I can't think of anything else than doing all the checks and updates
>> under alloc_lock, introducing coarse grained static mutex in setrlimit
>> to protect it,
>
> Oh, please don't ;)
>
> Or I missed your point?
>
>
> But if you mean this series, then yes, I agree.
Yes, I meant those. But I don't know what do you agree with :).
> We should try to do something
> to ensure that at least rlim_max can be always lowered when admin writes to
> /proc/pid/limits.
Yes, that's what I asked about when I wrote the three options which I
was able to think of above. So any other ideas about how to elegantly
protect against sys_setrlimit vs. admin+/proc/*/limits race?
Thanks a heap.
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH] core: allow setrlimit to non-current tasks
2009-09-03 13:47 ` Jiri Slaby
@ 2009-09-03 13:52 ` Jiri Slaby
2009-09-03 17:41 ` Oleg Nesterov
2009-09-03 17:20 ` [PATCH 0/1] sys_setrlimit: make sure ->rlim_max never grows Oleg Nesterov
2009-09-03 17:21 ` [PATCH 1/1] " Oleg Nesterov
2 siblings, 1 reply; 22+ messages in thread
From: Jiri Slaby @ 2009-09-03 13:52 UTC (permalink / raw)
To: oleg; +Cc: akpm, mingo, linux-kernel, Jiri Slaby
Just for clarity. The current patch looks like this:
--
Add locking to allow setrlimit accept task parameter other than
current.
Namely, lock tasklist_lock for read and check whether the task
structure has sighand non-null. Do all the signal processing under
that lock still held. We do this only for non-current tasks.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
---
kernel/sys.c | 24 ++++++++++++++++++++----
1 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/kernel/sys.c b/kernel/sys.c
index ed500c8..85d5d4d 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1240,6 +1240,7 @@ int setrlimit(struct task_struct *tsk, unsigned int resource,
struct rlimit *new_rlim)
{
struct rlimit *old_rlim;
+ unsigned int needs_locking = !same_thread_group(tsk, current);
int retval;
if (new_rlim->rlim_cur > new_rlim->rlim_max)
@@ -1247,13 +1248,24 @@ int setrlimit(struct task_struct *tsk, unsigned int resource,
if (resource == RLIMIT_NOFILE && new_rlim->rlim_max > sysctl_nr_open)
return -EPERM;
+ /* optimization: 'current' doesn't need locking, e.g. setrlimit */
+ if (needs_locking) {
+ /* protect tsk->signal and tsk->sighand from disappearing */
+ read_lock(&tasklist_lock);
+ if (!tsk->sighand) {
+ retval = -ESRCH;
+ goto unlock;
+ }
+ }
old_rlim = tsk->signal->rlim + resource;
if ((new_rlim->rlim_max > old_rlim->rlim_max) &&
- !capable(CAP_SYS_RESOURCE))
- return -EPERM;
+ !capable(CAP_SYS_RESOURCE)) {
+ retval = -EPERM;
+ goto unlock;
+ }
retval = security_task_setrlimit(tsk, resource, new_rlim);
if (retval)
- return retval;
+ goto unlock;
if (resource == RLIMIT_CPU && new_rlim->rlim_cur == 0) {
/*
@@ -1283,7 +1295,11 @@ int setrlimit(struct task_struct *tsk, unsigned int resource,
update_rlimit_cpu(tsk, new_rlim->rlim_cur);
out:
- return 0;
+ retval = 0;
+unlock:
+ if (needs_locking)
+ read_unlock(&tasklist_lock);
+ return retval;
}
SYSCALL_DEFINE2(setrlimit, unsigned int, resource, struct rlimit __user *, rlim)
--
1.6.3.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 0/1] sys_setrlimit: make sure ->rlim_max never grows
2009-09-03 13:47 ` Jiri Slaby
2009-09-03 13:52 ` [PATCH] " Jiri Slaby
@ 2009-09-03 17:20 ` Oleg Nesterov
2009-09-03 17:21 ` [PATCH 1/1] " Oleg Nesterov
2 siblings, 0 replies; 22+ messages in thread
From: Oleg Nesterov @ 2009-09-03 17:20 UTC (permalink / raw)
To: Jiri Slaby; +Cc: akpm, mingo, linux-kernel
On 09/03, Jiri Slaby wrote:
>
> On 09/02/2009 11:51 PM, Oleg Nesterov wrote:
> > On 09/02, Jiri Slaby wrote:
> >> I can't think of anything else than doing all the checks and updates
> >> under alloc_lock, introducing coarse grained static mutex in setrlimit
> >> to protect it,
> >
> > Oh, please don't ;)
> >
> > Or I missed your point?
> >
> > But if you mean this series, then yes, I agree.
>
> Yes, I meant those. But I don't know what do you agree with :).
Not sure what I agree with, but I am glad we seem to agree with each other ;)
> > We should try to do something
> > to ensure that at least rlim_max can be always lowered when admin writes to
> > /proc/pid/limits.
>
> Yes, that's what I asked about when I wrote the three options which I
> was able to think of above. So any other ideas about how to elegantly
> protect against sys_setrlimit vs. admin+/proc/*/limits race?
Perhaps we should start these change with this patch (see the next email) ?
Perhaps, before your changes, we should "fix" sys_setrlimit() first ?
Well, the patch (the next email) is not tested... What do you think?
Oleg.
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 1/1] sys_setrlimit: make sure ->rlim_max never grows
2009-09-03 13:47 ` Jiri Slaby
2009-09-03 13:52 ` [PATCH] " Jiri Slaby
2009-09-03 17:20 ` [PATCH 0/1] sys_setrlimit: make sure ->rlim_max never grows Oleg Nesterov
@ 2009-09-03 17:21 ` Oleg Nesterov
2 siblings, 0 replies; 22+ messages in thread
From: Oleg Nesterov @ 2009-09-03 17:21 UTC (permalink / raw)
To: Jiri Slaby; +Cc: akpm, mingo, linux-kernel
Mostly preparation for Jiri's changes, but probably makes sense anyway.
sys_setrlimit() checks new_rlim.rlim_max <= old_rlim->rlim_max, but when
it takes task_lock() old_rlim->rlim_max can be already lowered. Move this
check under task_lock().
Currently this is not important, we can only race with our sub-thread,
this means the application is stupid. But when we change the code to allow
the update of !current task's limits, it becomes important to make sure
->rlim_max can be lowered "reliably" even if we race with the application
doing sys_setrlimit().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
kernel/sys.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
--- WAIT/kernel/sys.c~SETRLIMIT 2009-06-17 14:11:26.000000000 +0200
+++ WAIT/kernel/sys.c 2009-09-03 19:01:48.000000000 +0200
@@ -1247,10 +1247,6 @@ SYSCALL_DEFINE2(setrlimit, unsigned int,
return -EFAULT;
if (new_rlim.rlim_cur > new_rlim.rlim_max)
return -EINVAL;
- old_rlim = current->signal->rlim + resource;
- if ((new_rlim.rlim_max > old_rlim->rlim_max) &&
- !capable(CAP_SYS_RESOURCE))
- return -EPERM;
if (resource == RLIMIT_NOFILE && new_rlim.rlim_max > sysctl_nr_open)
return -EPERM;
@@ -1268,13 +1264,17 @@ SYSCALL_DEFINE2(setrlimit, unsigned int,
new_rlim.rlim_cur = 1;
}
+ old_rlim = current->signal->rlim + resource;
task_lock(current->group_leader);
- *old_rlim = new_rlim;
+ if ((new_rlim.rlim_max <= old_rlim->rlim_max) ||
+ capable(CAP_SYS_RESOURCE))
+ *old_rlim = new_rlim;
+ else
+ retval = -EPERM;
task_unlock(current->group_leader);
- if (resource != RLIMIT_CPU)
+ if (retval || resource != RLIMIT_CPU)
goto out;
-
/*
* RLIMIT_CPU handling. Note that the kernel fails to return an error
* code if it rejected the user's attempt to set RLIMIT_CPU. This is a
@@ -1286,7 +1286,7 @@ SYSCALL_DEFINE2(setrlimit, unsigned int,
update_rlimit_cpu(new_rlim.rlim_cur);
out:
- return 0;
+ return retval;
}
/*
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH] core: allow setrlimit to non-current tasks
2009-09-03 13:52 ` [PATCH] " Jiri Slaby
@ 2009-09-03 17:41 ` Oleg Nesterov
2009-09-03 20:08 ` [PATCH v2 1/8] SECURITY: selinux, fix update_rlimit_cpu parameter Jiri Slaby
` (7 more replies)
0 siblings, 8 replies; 22+ messages in thread
From: Oleg Nesterov @ 2009-09-03 17:41 UTC (permalink / raw)
To: Jiri Slaby; +Cc: akpm, mingo, linux-kernel
On 09/03, Jiri Slaby wrote:
>
> @@ -1240,6 +1240,7 @@ int setrlimit(struct task_struct *tsk, unsigned int resource,
> struct rlimit *new_rlim)
> {
> struct rlimit *old_rlim;
> + unsigned int needs_locking = !same_thread_group(tsk, current);
> int retval;
Yes, thanks for doing this, imho this optimization is worthwhile.
But I'd suggest you to add this optimization in a separate patch
because,
> + /* optimization: 'current' doesn't need locking, e.g. setrlimit */
> + if (needs_locking) {
> + /* protect tsk->signal and tsk->sighand from disappearing */
> + read_lock(&tasklist_lock);
> + if (!tsk->sighand) {
> + retval = -ESRCH;
> + goto unlock;
I should have mentioned this before, but it is not that simple.
Even if same_thread_group(tsk, current), we must not trust tsk->sighand,
it can be NULL if our subthread is dead. (well, we need ->signal, not
->sighand but this doesn't matter because they disappear simultaneously).
Actually, perhaps same_thread_group() is not needed, perhaps it is enough
to avoid tasklist in sys_setrlimit case. So, I think optimization should
do:
retval = -ESRCH;
if (tsk != current) {
read_lock(&tasklist_lock);
if (!tsk->sighand)
goto unlock;
}
unlock:
if (tsk != current)
read_unlock(&tasklist_lock);
Or, if we use same_thread_group(),
needs_locking = !same_thread_group(tsk, current);
if (!needs_locking)
tsk = current;
else {
take tasklist, check ->sighand.
}
Oleg.
^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH v2 1/8] SECURITY: selinux, fix update_rlimit_cpu parameter
2009-09-03 17:41 ` Oleg Nesterov
@ 2009-09-03 20:08 ` Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 2/8] SECURITY: add task_struct to setrlimit Jiri Slaby
` (6 subsequent siblings)
7 siblings, 0 replies; 22+ messages in thread
From: Jiri Slaby @ 2009-09-03 20:08 UTC (permalink / raw)
To: oleg
Cc: akpm, mingo, linux-kernel, Jiri Slaby, Stephen Smalley,
Eric Paris, David Howells
OK, thanks, I ended up having this so far.
--
Don't pass rlim_cur member of RLIM_NLIMITS-1=RLIMIT_RTTIME limit
to update_rlimit_cpu() in selinux_bprm_committing_creds.
Use proper rlim[RLIMIT_CPU].rlim_cur instead.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Cc: David Howells <dhowells@redhat.com>
---
security/selinux/hooks.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index cf41988..496e626 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -2360,7 +2360,7 @@ static void selinux_bprm_committing_creds(struct linux_binprm *bprm)
initrlim = init_task.signal->rlim + i;
rlim->rlim_cur = min(rlim->rlim_max, initrlim->rlim_cur);
}
- update_rlimit_cpu(rlim->rlim_cur);
+ update_rlimit_cpu(current->signal->rlim[RLIMIT_CPU].rlim_cur);
}
}
--
1.6.3.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v2 2/8] SECURITY: add task_struct to setrlimit
2009-09-03 17:41 ` Oleg Nesterov
2009-09-03 20:08 ` [PATCH v2 1/8] SECURITY: selinux, fix update_rlimit_cpu parameter Jiri Slaby
@ 2009-09-03 20:08 ` Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 3/8] core: add task_struct to update_rlimit_cpu Jiri Slaby
` (5 subsequent siblings)
7 siblings, 0 replies; 22+ messages in thread
From: Jiri Slaby @ 2009-09-03 20:08 UTC (permalink / raw)
To: oleg; +Cc: akpm, mingo, linux-kernel, Jiri Slaby
Add task_struct to task_setrlimit of security_operations to be able to set
rlimit of different task than current.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: Eric Paris <eparis@redhat.com>
Acked-by: James Morris <jmorris@namei.org>
---
include/linux/security.h | 9 ++++++---
kernel/sys.c | 2 +-
security/capability.c | 3 ++-
security/security.c | 5 +++--
security/selinux/hooks.c | 7 ++++---
5 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/include/linux/security.h b/include/linux/security.h
index 0f81525..880a21c 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1520,7 +1520,8 @@ struct security_operations {
int (*task_setnice) (struct task_struct *p, int nice);
int (*task_setioprio) (struct task_struct *p, int ioprio);
int (*task_getioprio) (struct task_struct *p);
- int (*task_setrlimit) (unsigned int resource, struct rlimit *new_rlim);
+ int (*task_setrlimit) (struct task_struct *p, unsigned int resource,
+ struct rlimit *new_rlim);
int (*task_setscheduler) (struct task_struct *p, int policy,
struct sched_param *lp);
int (*task_getscheduler) (struct task_struct *p);
@@ -1775,7 +1776,8 @@ int security_task_setgroups(struct group_info *group_info);
int security_task_setnice(struct task_struct *p, int nice);
int security_task_setioprio(struct task_struct *p, int ioprio);
int security_task_getioprio(struct task_struct *p);
-int security_task_setrlimit(unsigned int resource, struct rlimit *new_rlim);
+int security_task_setrlimit(struct task_struct *p, unsigned int resource,
+ struct rlimit *new_rlim);
int security_task_setscheduler(struct task_struct *p,
int policy, struct sched_param *lp);
int security_task_getscheduler(struct task_struct *p);
@@ -2385,7 +2387,8 @@ static inline int security_task_getioprio(struct task_struct *p)
return 0;
}
-static inline int security_task_setrlimit(unsigned int resource,
+static inline int security_task_setrlimit(struct task_struct *p,
+ unsigned int resource,
struct rlimit *new_rlim)
{
return 0;
diff --git a/kernel/sys.c b/kernel/sys.c
index a7e36b6..ae7be2e 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1254,7 +1254,7 @@ SYSCALL_DEFINE2(setrlimit, unsigned int, resource, struct rlimit __user *, rlim)
if (resource == RLIMIT_NOFILE && new_rlim.rlim_max > sysctl_nr_open)
return -EPERM;
- retval = security_task_setrlimit(resource, &new_rlim);
+ retval = security_task_setrlimit(current, resource, &new_rlim);
if (retval)
return retval;
diff --git a/security/capability.c b/security/capability.c
index 6c38f4d..90d6daf 100644
--- a/security/capability.c
+++ b/security/capability.c
@@ -447,7 +447,8 @@ static int cap_task_getioprio(struct task_struct *p)
return 0;
}
-static int cap_task_setrlimit(unsigned int resource, struct rlimit *new_rlim)
+static int cap_task_setrlimit(struct task_struct *p, unsigned int resource,
+ struct rlimit *new_rlim)
{
return 0;
}
diff --git a/security/security.c b/security/security.c
index 0b7f9eb..b2c21dc 100644
--- a/security/security.c
+++ b/security/security.c
@@ -779,9 +779,10 @@ int security_task_getioprio(struct task_struct *p)
return security_ops->task_getioprio(p);
}
-int security_task_setrlimit(unsigned int resource, struct rlimit *new_rlim)
+int security_task_setrlimit(struct task_struct *p, unsigned int resource,
+ struct rlimit *new_rlim)
{
- return security_ops->task_setrlimit(resource, new_rlim);
+ return security_ops->task_setrlimit(p, resource, new_rlim);
}
int security_task_setscheduler(struct task_struct *p,
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 496e626..839622a 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3355,16 +3355,17 @@ static int selinux_task_getioprio(struct task_struct *p)
return current_has_perm(p, PROCESS__GETSCHED);
}
-static int selinux_task_setrlimit(unsigned int resource, struct rlimit *new_rlim)
+static int selinux_task_setrlimit(struct task_struct *p, unsigned int resource,
+ struct rlimit *new_rlim)
{
- struct rlimit *old_rlim = current->signal->rlim + resource;
+ struct rlimit *old_rlim = p->signal->rlim + resource;
/* Control the ability to change the hard limit (whether
lowering or raising it), so that the hard limit can
later be used as a safe reset point for the soft limit
upon context transitions. See selinux_bprm_committing_creds. */
if (old_rlim->rlim_max != new_rlim->rlim_max)
- return current_has_perm(current, PROCESS__SETRLIMIT);
+ return current_has_perm(p, PROCESS__SETRLIMIT);
return 0;
}
--
1.6.3.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v2 3/8] core: add task_struct to update_rlimit_cpu
2009-09-03 17:41 ` Oleg Nesterov
2009-09-03 20:08 ` [PATCH v2 1/8] SECURITY: selinux, fix update_rlimit_cpu parameter Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 2/8] SECURITY: add task_struct to setrlimit Jiri Slaby
@ 2009-09-03 20:08 ` Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 4/8] sys_setrlimit: make sure ->rlim_max never grows Jiri Slaby
` (4 subsequent siblings)
7 siblings, 0 replies; 22+ messages in thread
From: Jiri Slaby @ 2009-09-03 20:08 UTC (permalink / raw)
To: oleg; +Cc: akpm, mingo, linux-kernel, Jiri Slaby
Add task_struct as a parameter to update_rlimit_cpu to be able to set
rlimit_cpu of different task than current.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: James Morris <jmorris@namei.org>
---
include/linux/posix-timers.h | 2 +-
kernel/posix-cpu-timers.c | 10 +++++-----
kernel/sys.c | 2 +-
security/selinux/hooks.c | 3 ++-
4 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 4f71bf4..3e23844 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -117,6 +117,6 @@ void set_process_cpu_timer(struct task_struct *task, unsigned int clock_idx,
long clock_nanosleep_restart(struct restart_block *restart_block);
-void update_rlimit_cpu(unsigned long rlim_new);
+void update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new);
#endif
diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index 12161f7..a61c625 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -12,16 +12,16 @@
/*
* Called after updating RLIMIT_CPU to set timer expiration if necessary.
*/
-void update_rlimit_cpu(unsigned long rlim_new)
+void update_rlimit_cpu(struct task_struct *task, unsigned long rlim_new)
{
cputime_t cputime = secs_to_cputime(rlim_new);
- struct signal_struct *const sig = current->signal;
+ struct signal_struct *const sig = task->signal;
if (cputime_eq(sig->it[CPUCLOCK_PROF].expires, cputime_zero) ||
cputime_gt(sig->it[CPUCLOCK_PROF].expires, cputime)) {
- spin_lock_irq(¤t->sighand->siglock);
- set_process_cpu_timer(current, CPUCLOCK_PROF, &cputime, NULL);
- spin_unlock_irq(¤t->sighand->siglock);
+ spin_lock_irq(&task->sighand->siglock);
+ set_process_cpu_timer(task, CPUCLOCK_PROF, &cputime, NULL);
+ spin_unlock_irq(&task->sighand->siglock);
}
}
diff --git a/kernel/sys.c b/kernel/sys.c
index ae7be2e..d501f60 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1284,7 +1284,7 @@ SYSCALL_DEFINE2(setrlimit, unsigned int, resource, struct rlimit __user *, rlim)
if (new_rlim.rlim_cur == RLIM_INFINITY)
goto out;
- update_rlimit_cpu(new_rlim.rlim_cur);
+ update_rlimit_cpu(current, new_rlim.rlim_cur);
out:
return 0;
}
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 839622a..6903784 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -2360,7 +2360,8 @@ static void selinux_bprm_committing_creds(struct linux_binprm *bprm)
initrlim = init_task.signal->rlim + i;
rlim->rlim_cur = min(rlim->rlim_max, initrlim->rlim_cur);
}
- update_rlimit_cpu(current->signal->rlim[RLIMIT_CPU].rlim_cur);
+ update_rlimit_cpu(current,
+ current->signal->rlim[RLIMIT_CPU].rlim_cur);
}
}
--
1.6.3.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v2 4/8] sys_setrlimit: make sure ->rlim_max never grows
2009-09-03 17:41 ` Oleg Nesterov
` (2 preceding siblings ...)
2009-09-03 20:08 ` [PATCH v2 3/8] core: add task_struct to update_rlimit_cpu Jiri Slaby
@ 2009-09-03 20:08 ` Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 5/8] core: split sys_setrlimit Jiri Slaby
` (3 subsequent siblings)
7 siblings, 0 replies; 22+ messages in thread
From: Jiri Slaby @ 2009-09-03 20:08 UTC (permalink / raw)
To: oleg; +Cc: akpm, mingo, linux-kernel, Jiri Slaby
From: Oleg Nesterov <oleg@redhat.com>
Mostly preparation for Jiri's changes, but probably makes sense anyway.
sys_setrlimit() checks new_rlim.rlim_max <= old_rlim->rlim_max, but when
it takes task_lock() old_rlim->rlim_max can be already lowered. Move this
check under task_lock().
Currently this is not important, we can only race with our sub-thread,
this means the application is stupid. But when we change the code to allow
the update of !current task's limits, it becomes important to make sure
->rlim_max can be lowered "reliably" even if we race with the application
doing sys_setrlimit().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
---
kernel/sys.c | 15 ++++++++-------
1 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/kernel/sys.c b/kernel/sys.c
index d501f60..600d3e2 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1247,10 +1247,6 @@ SYSCALL_DEFINE2(setrlimit, unsigned int, resource, struct rlimit __user *, rlim)
return -EFAULT;
if (new_rlim.rlim_cur > new_rlim.rlim_max)
return -EINVAL;
- old_rlim = current->signal->rlim + resource;
- if ((new_rlim.rlim_max > old_rlim->rlim_max) &&
- !capable(CAP_SYS_RESOURCE))
- return -EPERM;
if (resource == RLIMIT_NOFILE && new_rlim.rlim_max > sysctl_nr_open)
return -EPERM;
@@ -1268,11 +1264,16 @@ SYSCALL_DEFINE2(setrlimit, unsigned int, resource, struct rlimit __user *, rlim)
new_rlim.rlim_cur = 1;
}
+ old_rlim = current->signal->rlim + resource;
task_lock(current->group_leader);
- *old_rlim = new_rlim;
+ if ((new_rlim.rlim_max <= old_rlim->rlim_max) ||
+ capable(CAP_SYS_RESOURCE))
+ *old_rlim = new_rlim;
+ else
+ retval = -EPERM;
task_unlock(current->group_leader);
- if (resource != RLIMIT_CPU)
+ if (retval || resource != RLIMIT_CPU)
goto out;
/*
@@ -1286,7 +1287,7 @@ SYSCALL_DEFINE2(setrlimit, unsigned int, resource, struct rlimit __user *, rlim)
update_rlimit_cpu(current, new_rlim.rlim_cur);
out:
- return 0;
+ return retval;
}
/*
--
1.6.3.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v2 5/8] core: split sys_setrlimit
2009-09-03 17:41 ` Oleg Nesterov
` (3 preceding siblings ...)
2009-09-03 20:08 ` [PATCH v2 4/8] sys_setrlimit: make sure ->rlim_max never grows Jiri Slaby
@ 2009-09-03 20:08 ` Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 6/8] core: allow setrlimit to non-current tasks Jiri Slaby
` (2 subsequent siblings)
7 siblings, 0 replies; 22+ messages in thread
From: Jiri Slaby @ 2009-09-03 20:08 UTC (permalink / raw)
To: oleg; +Cc: akpm, mingo, linux-kernel, Jiri Slaby
Create setrlimit from sys_setrlimit and declare setrlimit in
the resource header. This is to allow rlimits to be changed not
only by syscall, but later from proc code too.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
---
include/linux/resource.h | 2 ++
kernel/sys.c | 44 ++++++++++++++++++++++++++------------------
2 files changed, 28 insertions(+), 18 deletions(-)
diff --git a/include/linux/resource.h b/include/linux/resource.h
index 40fc7e6..4301d67 100644
--- a/include/linux/resource.h
+++ b/include/linux/resource.h
@@ -71,5 +71,7 @@ struct rlimit {
#include <asm/resource.h>
int getrusage(struct task_struct *p, int who, struct rusage __user *ru);
+int setrlimit(struct task_struct *tsk, unsigned int resource,
+ struct rlimit *new_rlim);
#endif
diff --git a/kernel/sys.c b/kernel/sys.c
index 600d3e2..655dd7b 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1236,42 +1236,39 @@ SYSCALL_DEFINE2(old_getrlimit, unsigned int, resource,
#endif
-SYSCALL_DEFINE2(setrlimit, unsigned int, resource, struct rlimit __user *, rlim)
+int setrlimit(struct task_struct *tsk, unsigned int resource,
+ struct rlimit *new_rlim)
{
- struct rlimit new_rlim, *old_rlim;
+ struct rlimit *old_rlim;
int retval;
- if (resource >= RLIM_NLIMITS)
- return -EINVAL;
- if (copy_from_user(&new_rlim, rlim, sizeof(*rlim)))
- return -EFAULT;
- if (new_rlim.rlim_cur > new_rlim.rlim_max)
+ if (new_rlim->rlim_cur > new_rlim->rlim_max)
return -EINVAL;
- if (resource == RLIMIT_NOFILE && new_rlim.rlim_max > sysctl_nr_open)
+ if (resource == RLIMIT_NOFILE && new_rlim->rlim_max > sysctl_nr_open)
return -EPERM;
- retval = security_task_setrlimit(current, resource, &new_rlim);
+ retval = security_task_setrlimit(tsk, resource, new_rlim);
if (retval)
return retval;
- if (resource == RLIMIT_CPU && new_rlim.rlim_cur == 0) {
+ if (resource == RLIMIT_CPU && new_rlim->rlim_cur == 0) {
/*
* The caller is asking for an immediate RLIMIT_CPU
* expiry. But we use the zero value to mean "it was
* never set". So let's cheat and make it one second
* instead
*/
- new_rlim.rlim_cur = 1;
+ new_rlim->rlim_cur = 1;
}
- old_rlim = current->signal->rlim + resource;
- task_lock(current->group_leader);
- if ((new_rlim.rlim_max <= old_rlim->rlim_max) ||
+ old_rlim = tsk->signal->rlim + resource;
+ task_lock(tsk->group_leader);
+ if ((new_rlim->rlim_max <= old_rlim->rlim_max) ||
capable(CAP_SYS_RESOURCE))
- *old_rlim = new_rlim;
+ *old_rlim = *new_rlim;
else
retval = -EPERM;
- task_unlock(current->group_leader);
+ task_unlock(tsk->group_leader);
if (retval || resource != RLIMIT_CPU)
goto out;
@@ -1282,14 +1279,25 @@ SYSCALL_DEFINE2(setrlimit, unsigned int, resource, struct rlimit __user *, rlim)
* very long-standing error, and fixing it now risks breakage of
* applications, so we live with it
*/
- if (new_rlim.rlim_cur == RLIM_INFINITY)
+ if (new_rlim->rlim_cur == RLIM_INFINITY)
goto out;
- update_rlimit_cpu(current, new_rlim.rlim_cur);
+ update_rlimit_cpu(tsk, new_rlim->rlim_cur);
out:
return retval;
}
+SYSCALL_DEFINE2(setrlimit, unsigned int, resource, struct rlimit __user *, rlim)
+{
+ struct rlimit new_rlim;
+
+ if (resource >= RLIM_NLIMITS)
+ return -EINVAL;
+ if (copy_from_user(&new_rlim, rlim, sizeof(*rlim)))
+ return -EFAULT;
+ return setrlimit(current, resource, &new_rlim);
+}
+
/*
* It would make sense to put struct rusage in the task_struct,
* except that would make the task_struct be *really big*. After
--
1.6.3.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v2 6/8] core: allow setrlimit to non-current tasks
2009-09-03 17:41 ` Oleg Nesterov
` (4 preceding siblings ...)
2009-09-03 20:08 ` [PATCH v2 5/8] core: split sys_setrlimit Jiri Slaby
@ 2009-09-03 20:08 ` Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 7/8] core: optimize setrlimit for current task Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 8/8] FS: proc, make limits writable Jiri Slaby
7 siblings, 0 replies; 22+ messages in thread
From: Jiri Slaby @ 2009-09-03 20:08 UTC (permalink / raw)
To: oleg; +Cc: akpm, mingo, linux-kernel, Jiri Slaby
Add locking to allow setrlimit accept task parameter other than
current.
Namely, lock tasklist_lock for read and check whether the task
structure has sighand non-null. Do all the signal processing under
that lock still held. We do this only for non-current tasks.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
---
kernel/sys.c | 10 +++++++++-
1 files changed, 9 insertions(+), 1 deletions(-)
diff --git a/kernel/sys.c b/kernel/sys.c
index 655dd7b..791b5c5 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1247,9 +1247,16 @@ int setrlimit(struct task_struct *tsk, unsigned int resource,
if (resource == RLIMIT_NOFILE && new_rlim->rlim_max > sysctl_nr_open)
return -EPERM;
+ /* protect tsk->signal and tsk->sighand from disappearing */
+ read_lock(&tasklist_lock);
+ if (!tsk->sighand) {
+ retval = -ESRCH;
+ goto out;
+ }
+
retval = security_task_setrlimit(tsk, resource, new_rlim);
if (retval)
- return retval;
+ goto out;
if (resource == RLIMIT_CPU && new_rlim->rlim_cur == 0) {
/*
@@ -1284,6 +1291,7 @@ int setrlimit(struct task_struct *tsk, unsigned int resource,
update_rlimit_cpu(tsk, new_rlim->rlim_cur);
out:
+ read_unlock(&tasklist_lock);
return retval;
}
--
1.6.3.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v2 7/8] core: optimize setrlimit for current task
2009-09-03 17:41 ` Oleg Nesterov
` (5 preceding siblings ...)
2009-09-03 20:08 ` [PATCH v2 6/8] core: allow setrlimit to non-current tasks Jiri Slaby
@ 2009-09-03 20:08 ` Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 8/8] FS: proc, make limits writable Jiri Slaby
7 siblings, 0 replies; 22+ messages in thread
From: Jiri Slaby @ 2009-09-03 20:08 UTC (permalink / raw)
To: oleg; +Cc: akpm, mingo, linux-kernel, Jiri Slaby
Don't take tasklist lock for 'current'. It's not needed, since
current->sighand/signal can't disappear.
This improves serlimit called especially via sys_setrlimit.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
---
kernel/sys.c | 16 ++++++++++------
1 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/kernel/sys.c b/kernel/sys.c
index 791b5c5..feca9d6 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1247,11 +1247,14 @@ int setrlimit(struct task_struct *tsk, unsigned int resource,
if (resource == RLIMIT_NOFILE && new_rlim->rlim_max > sysctl_nr_open)
return -EPERM;
- /* protect tsk->signal and tsk->sighand from disappearing */
- read_lock(&tasklist_lock);
- if (!tsk->sighand) {
- retval = -ESRCH;
- goto out;
+ /* optimization: 'current' doesn't need locking, e.g. setrlimit */
+ if (tsk != current) {
+ /* protect tsk->signal and tsk->sighand from disappearing */
+ read_lock(&tasklist_lock);
+ if (!tsk->sighand) {
+ retval = -ESRCH;
+ goto out;
+ }
}
retval = security_task_setrlimit(tsk, resource, new_rlim);
@@ -1291,7 +1294,8 @@ int setrlimit(struct task_struct *tsk, unsigned int resource,
update_rlimit_cpu(tsk, new_rlim->rlim_cur);
out:
- read_unlock(&tasklist_lock);
+ if (tsk != current)
+ read_unlock(&tasklist_lock);
return retval;
}
--
1.6.3.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH v2 8/8] FS: proc, make limits writable
2009-09-03 17:41 ` Oleg Nesterov
` (6 preceding siblings ...)
2009-09-03 20:08 ` [PATCH v2 7/8] core: optimize setrlimit for current task Jiri Slaby
@ 2009-09-03 20:08 ` Jiri Slaby
2009-09-04 14:26 ` Oleg Nesterov
7 siblings, 1 reply; 22+ messages in thread
From: Jiri Slaby @ 2009-09-03 20:08 UTC (permalink / raw)
To: oleg; +Cc: akpm, mingo, linux-kernel, Jiri Slaby
Allow writing strings such as
Max core file size=0:unlimited
to /proc/<pid>/limits to change limits.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
---
fs/proc/base.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 76 insertions(+), 2 deletions(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index ea71cae..0d9e3e9 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -129,6 +129,8 @@ struct pid_entry {
NULL, &proc_single_file_operations, \
{ .proc_show = show } )
+static ssize_t proc_info_read(struct file * file, char __user * buf,
+ size_t count, loff_t *ppos);
/*
* Count the number of hardlinks for the pid_entry table, excluding the .
* and .. links.
@@ -522,6 +524,74 @@ static int proc_pid_limits(struct task_struct *task, char *buffer)
return count;
}
+static ssize_t limits_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct task_struct *task = get_proc_task(file->f_path.dentry->d_inode);
+ char str[32 + 1 + 16 + 1 + 16 + 1], *delim, *next;
+ struct rlimit new_rlimit;
+ unsigned int i;
+ int ret;
+
+ if (!task) {
+ count = -ESRCH;
+ goto out;
+ }
+ if (copy_from_user(str, buf, min(count, sizeof(str) - 1))) {
+ count = -EFAULT;
+ goto put_task;
+ }
+
+ str[min(count, sizeof(str) - 1)] = 0;
+
+ delim = strchr(str, '=');
+ if (!delim) {
+ count = -EINVAL;
+ goto put_task;
+ }
+ *delim++ = 0; /* for easy 'str' usage */
+ new_rlimit.rlim_cur = simple_strtoul(delim, &next, 0);
+ if (*next != ':') {
+ if (strncmp(delim, "unlimited:", 10)) {
+ count = -EINVAL;
+ goto put_task;
+ }
+ new_rlimit.rlim_cur = RLIM_INFINITY;
+ next = delim + 9; /* move to ':' */
+ }
+ delim = next + 1;
+ new_rlimit.rlim_max = simple_strtoul(delim, &next, 0);
+ if (*next != 0) {
+ if (strcmp(delim, "unlimited")) {
+ count = -EINVAL;
+ goto put_task;
+ }
+ new_rlimit.rlim_max = RLIM_INFINITY;
+ }
+
+ for (i = 0; i < RLIM_NLIMITS; i++)
+ if (!strcmp(str, lnames[i].name))
+ break;
+ if (i >= RLIM_NLIMITS) {
+ count = -EINVAL;
+ goto put_task;
+ }
+
+ ret = setrlimit(task, i, &new_rlimit);
+ if (ret)
+ count = ret;
+
+put_task:
+ put_task_struct(task);
+out:
+ return count;
+}
+
+static const struct file_operations proc_pid_limits_operations = {
+ .read = proc_info_read,
+ .write = limits_write,
+};
+
#ifdef CONFIG_HAVE_ARCH_TRACEHOOK
static int proc_pid_syscall(struct task_struct *task, char *buffer)
{
@@ -2501,7 +2571,9 @@ static const struct pid_entry tgid_base_stuff[] = {
INF("auxv", S_IRUSR, proc_pid_auxv),
ONE("status", S_IRUGO, proc_pid_status),
ONE("personality", S_IRUSR, proc_pid_personality),
- INF("limits", S_IRUSR, proc_pid_limits),
+ NOD("limits", S_IFREG|S_IRUSR|S_IWUSR, NULL,
+ &proc_pid_limits_operations,
+ { .proc_read = proc_pid_limits }),
#ifdef CONFIG_SCHED_DEBUG
REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
#endif
@@ -2836,7 +2908,9 @@ static const struct pid_entry tid_base_stuff[] = {
INF("auxv", S_IRUSR, proc_pid_auxv),
ONE("status", S_IRUGO, proc_pid_status),
ONE("personality", S_IRUSR, proc_pid_personality),
- INF("limits", S_IRUSR, proc_pid_limits),
+ NOD("limits", S_IFREG|S_IRUSR|S_IWUSR, NULL,
+ &proc_pid_limits_operations,
+ { .proc_read = proc_pid_limits }),
#ifdef CONFIG_SCHED_DEBUG
REG("sched", S_IRUGO|S_IWUSR, proc_pid_sched_operations),
#endif
--
1.6.3.3
^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH v2 8/8] FS: proc, make limits writable
2009-09-03 20:08 ` [PATCH v2 8/8] FS: proc, make limits writable Jiri Slaby
@ 2009-09-04 14:26 ` Oleg Nesterov
2009-10-08 20:55 ` Jiri Slaby
0 siblings, 1 reply; 22+ messages in thread
From: Oleg Nesterov @ 2009-09-04 14:26 UTC (permalink / raw)
To: Jiri Slaby; +Cc: akpm, mingo, linux-kernel
On 09/03, Jiri Slaby wrote:
>
> Allow writing strings such as
> Max core file size=0:unlimited
> to /proc/<pid>/limits to change limits.
Can't review the parsing in limits_write() because I don't have enough
"C" skills, but otherwise the whole series looks correct to me.
One small nit, just to suggest the further 9/8 cleanup,
> +static const struct file_operations proc_pid_limits_operations = {
> + .read = proc_info_read,
> + .write = limits_write,
> +};
I think it makes sense to tweak proc_pid_limits() a little bit (and
rename it), so that we can do
.read = limits_read,
.write = limits_write
Then,
> @@ -2501,7 +2571,9 @@ static const struct pid_entry tgid_base_stuff[] = {
> + NOD("limits", S_IFREG|S_IRUSR|S_IWUSR, NULL,
> + &proc_pid_limits_operations,
> + { .proc_read = proc_pid_limits }),
We could use
REG("limits", S_IRUSR|S_IWUSR, &proc_pid_limits_operations),
instead, this looks a bit cleaner to me.
But as I said, we can do this later.
And another minor nit (just in case you will re-submit this series for
some reason). Perhaps the changelog in 6/8 should mention that we do
not do any security checks when tsk != current (without selinux). We
assume that either the caller is sys_setrlimit(), or the caller should
verify it has rights to change the limits: in case of limits_write()
we rely on ->mode = S_IRUSR|S_IWUSR.
Oleg.
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2 8/8] FS: proc, make limits writable
2009-09-04 14:26 ` Oleg Nesterov
@ 2009-10-08 20:55 ` Jiri Slaby
2009-10-12 15:13 ` Oleg Nesterov
0 siblings, 1 reply; 22+ messages in thread
From: Jiri Slaby @ 2009-10-08 20:55 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: akpm, mingo, linux-kernel
On 09/04/2009 04:26 PM, Oleg Nesterov wrote:
> One small nit, just to suggest the further 9/8 cleanup,
>
>> +static const struct file_operations proc_pid_limits_operations = {
>> + .read = proc_info_read,
>> + .write = limits_write,
>> +};
>
> I think it makes sense to tweak proc_pid_limits() a little bit (and
> rename it), so that we can do
>
> .read = limits_read,
> .write = limits_write
>
> Then,
>
>> @@ -2501,7 +2571,9 @@ static const struct pid_entry tgid_base_stuff[] = {
>> + NOD("limits", S_IFREG|S_IRUSR|S_IWUSR, NULL,
>> + &proc_pid_limits_operations,
>> + { .proc_read = proc_pid_limits }),
>
> We could use
>
> REG("limits", S_IRUSR|S_IWUSR, &proc_pid_limits_operations),
>
> instead, this looks a bit cleaner to me.
Hi again, nobody picked them up yet, I waited till the end of the merge
window and now I'll repost.
Did you mean here to do the proc_info_read work (get/put task, alloc
buf, simple_read) directly in limits_read?
> And another minor nit (just in case you will re-submit this series for
> some reason). Perhaps the changelog in 6/8 should mention that we do
> not do any security checks when tsk != current (without selinux). We
> assume that either the caller is sys_setrlimit(), or the caller should
> verify it has rights to change the limits: in case of limits_write()
> we rely on ->mode = S_IRUSR|S_IWUSR.
I did it as a comment by the setrlimit. I think nobody would care about
a changelog note ;).
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH v2 8/8] FS: proc, make limits writable
2009-10-08 20:55 ` Jiri Slaby
@ 2009-10-12 15:13 ` Oleg Nesterov
0 siblings, 0 replies; 22+ messages in thread
From: Oleg Nesterov @ 2009-10-12 15:13 UTC (permalink / raw)
To: Jiri Slaby; +Cc: akpm, mingo, linux-kernel
On 10/08, Jiri Slaby wrote:
>
> On 09/04/2009 04:26 PM, Oleg Nesterov wrote:
> > One small nit, just to suggest the further 9/8 cleanup,
> >
> >> +static const struct file_operations proc_pid_limits_operations = {
> >> + .read = proc_info_read,
> >> + .write = limits_write,
> >> +};
> >
> > I think it makes sense to tweak proc_pid_limits() a little bit (and
> > rename it), so that we can do
> >
> > .read = limits_read,
> > .write = limits_write
> >
> > Then,
> >
> >> @@ -2501,7 +2571,9 @@ static const struct pid_entry tgid_base_stuff[] = {
> >> + NOD("limits", S_IFREG|S_IRUSR|S_IWUSR, NULL,
> >> + &proc_pid_limits_operations,
> >> + { .proc_read = proc_pid_limits }),
> >
> > We could use
> >
> > REG("limits", S_IRUSR|S_IWUSR, &proc_pid_limits_operations),
> >
> > instead, this looks a bit cleaner to me.
>
> Hi again, nobody picked them up yet, I waited till the end of the merge
> window and now I'll repost.
>
> Did you mean here to do the proc_info_read work (get/put task, alloc
> buf, simple_read) directly in limits_read?
limits_read() has to do get_proc_task(d_inode), yes. Otherwise I don't
see any additional complications, you can use simple_read_from_buffer(),
no need to allocate the buffer but of course you can you shouldn't
write more than "size_t count" bytes.
But perhaps I missed something, and in any case this is up to you. If
you don't like this - please forget.
> > And another minor nit (just in case you will re-submit this series for
> > some reason). Perhaps the changelog in 6/8 should mention that we do
> > not do any security checks when tsk != current (without selinux). We
> > assume that either the caller is sys_setrlimit(), or the caller should
> > verify it has rights to change the limits: in case of limits_write()
> > we rely on ->mode = S_IRUSR|S_IWUSR.
>
> I did it as a comment by the setrlimit. I think nobody would care about
> a changelog note ;).
Good ;)
Oleg.
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2009-10-12 15:19 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-02 9:45 [PATCH 1/2] core: add lockless update_rlimit_cpu Jiri Slaby
2009-09-02 9:45 ` [PATCH 2/2] core: allow setrlimit to non-current tasks Jiri Slaby
2009-09-02 9:47 ` Jiri Slaby
2009-09-02 13:50 ` Oleg Nesterov
2009-09-02 18:44 ` Jiri Slaby
2009-09-02 21:51 ` Oleg Nesterov
2009-09-03 13:47 ` Jiri Slaby
2009-09-03 13:52 ` [PATCH] " Jiri Slaby
2009-09-03 17:41 ` Oleg Nesterov
2009-09-03 20:08 ` [PATCH v2 1/8] SECURITY: selinux, fix update_rlimit_cpu parameter Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 2/8] SECURITY: add task_struct to setrlimit Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 3/8] core: add task_struct to update_rlimit_cpu Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 4/8] sys_setrlimit: make sure ->rlim_max never grows Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 5/8] core: split sys_setrlimit Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 6/8] core: allow setrlimit to non-current tasks Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 7/8] core: optimize setrlimit for current task Jiri Slaby
2009-09-03 20:08 ` [PATCH v2 8/8] FS: proc, make limits writable Jiri Slaby
2009-09-04 14:26 ` Oleg Nesterov
2009-10-08 20:55 ` Jiri Slaby
2009-10-12 15:13 ` Oleg Nesterov
2009-09-03 17:20 ` [PATCH 0/1] sys_setrlimit: make sure ->rlim_max never grows Oleg Nesterov
2009-09-03 17:21 ` [PATCH 1/1] " Oleg Nesterov
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.