* [PATCH -mm 3/3] proc: make task_sig() lockless
@ 2010-03-22 18:41 Oleg Nesterov
2010-03-23 8:30 ` David Howells
2010-03-23 8:37 ` David Howells
0 siblings, 2 replies; 11+ messages in thread
From: Oleg Nesterov @ 2010-03-22 18:41 UTC (permalink / raw)
To: Andrew Morton
Cc: Alexey Dobriyan, David Howells, Eric W. Biederman,
Roland McGrath, linux-kernel
Now that task->signal can't go away and collect_sigign_sigcatch()
is rcu-safe, task_sig() doesn't need ->siglock.
Remove lock_task_sighand() and unnecessary sigemptyset's, move
collect_sigign_sigcatch() under rcu_read_lock().
Of course, this means we read pending/blocked/etc nonatomically,
but I hope this is OK for fs/proc.
Probably we can change do_task_stat() to avod ->siglock too, except
we can't get tty_nr lockless.
Also, remove the "is this correct?" comment. I think it is safe
to dereference __task_cred(p)->user under rcu lock. In any case,
->siglock can't help to protect cred->user.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
fs/proc/array.c | 26 ++++++++++----------------
1 file changed, 10 insertions(+), 16 deletions(-)
--- 34-rc1/fs/proc/array.c~PROC_3_TASK_SIG_DONT_USE_SIGLOCK 2010-03-22 17:39:42.000000000 +0100
+++ 34-rc1/fs/proc/array.c 2010-03-22 18:36:13.000000000 +0100
@@ -257,30 +257,24 @@ static void collect_sigign_sigcatch(stru
static inline void task_sig(struct seq_file *m, struct task_struct *p)
{
- unsigned long flags;
sigset_t pending, shpending, blocked, ignored, caught;
int num_threads = 0;
unsigned long qsize = 0;
unsigned long qlim = 0;
- sigemptyset(&pending);
- sigemptyset(&shpending);
- sigemptyset(&blocked);
sigemptyset(&ignored);
sigemptyset(&caught);
- if (lock_task_sighand(p, &flags)) {
- pending = p->pending.signal;
- shpending = p->signal->shared_pending.signal;
- blocked = p->blocked;
- collect_sigign_sigcatch(p, &ignored, &caught);
- num_threads = get_nr_threads(p);
- rcu_read_lock(); /* FIXME: is this correct? */
- qsize = atomic_read(&__task_cred(p)->user->sigpending);
- rcu_read_unlock();
- qlim = task_rlimit(p, RLIMIT_SIGPENDING);
- unlock_task_sighand(p, &flags);
- }
+ blocked = p->blocked;
+ pending = p->pending.signal;
+ shpending = p->signal->shared_pending.signal;
+ qlim = task_rlimit(p, RLIMIT_SIGPENDING);
+ num_threads = get_nr_threads(p);
+
+ rcu_read_lock();
+ collect_sigign_sigcatch(p, &ignored, &caught);
+ qsize = atomic_read(&__task_cred(p)->user->sigpending);
+ rcu_read_unlock();
seq_printf(m, "Threads:\t%d\n", num_threads);
seq_printf(m, "SigQ:\t%lu/%lu\n", qsize, qlim);
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
2010-03-22 18:41 [PATCH -mm 3/3] proc: make task_sig() lockless Oleg Nesterov
@ 2010-03-23 8:30 ` David Howells
2010-03-23 8:37 ` David Howells
1 sibling, 0 replies; 11+ messages in thread
From: David Howells @ 2010-03-23 8:30 UTC (permalink / raw)
To: Oleg Nesterov
Cc: dhowells, Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
Roland McGrath, linux-kernel
Oleg Nesterov <oleg@redhat.com> wrote:
> Also, remove the "is this correct?" comment. I think it is safe
> to dereference __task_cred(p)->user under rcu lock.
It is.
David
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
2010-03-22 18:41 [PATCH -mm 3/3] proc: make task_sig() lockless Oleg Nesterov
2010-03-23 8:30 ` David Howells
@ 2010-03-23 8:37 ` David Howells
2010-03-23 10:57 ` Oleg Nesterov
2010-03-24 8:37 ` David Howells
1 sibling, 2 replies; 11+ messages in thread
From: David Howells @ 2010-03-23 8:37 UTC (permalink / raw)
To: Oleg Nesterov
Cc: dhowells, Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
Roland McGrath, linux-kernel
Oleg Nesterov <oleg@redhat.com> wrote:
> task_sig() doesn't need ->siglock.
Except that the data returned might then be inconsistent because you don't
hold a lock as you read the various bits of it.
David
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
2010-03-23 8:37 ` David Howells
@ 2010-03-23 10:57 ` Oleg Nesterov
2010-04-09 19:59 ` Roland McGrath
2010-04-10 8:16 ` David Howells
2010-03-24 8:37 ` David Howells
1 sibling, 2 replies; 11+ messages in thread
From: Oleg Nesterov @ 2010-03-23 10:57 UTC (permalink / raw)
To: David Howells
Cc: Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
Roland McGrath, linux-kernel
On 03/23, David Howells wrote:
>
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > task_sig() doesn't need ->siglock.
>
> Except that the data returned might then be inconsistent because you don't
> hold a lock as you read the various bits of it.
Yes. From the changelog:
Of course, this means we read pending/blocked/etc nonatomically,
but I hope this is OK for fs/proc.
But I don't think the returned data could be "really" inconsistent
from the /bin/ps pov. Yes, it is possible that, say, some signal is
seen as both pending and ignored without ->siglock. Or we can report
user->sigpending != 0 while pending/shpending are empty.
But this looks harmless to me. We never guaranteed /proc/pid/status
can't report the "intermediate" state, and I don't think we can
confuse the user-space.
Do you agree? Or do you think this can make problems ?
Oleg.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
2010-03-23 8:37 ` David Howells
2010-03-23 10:57 ` Oleg Nesterov
@ 2010-03-24 8:37 ` David Howells
2010-03-24 15:00 ` Oleg Nesterov
1 sibling, 1 reply; 11+ messages in thread
From: David Howells @ 2010-03-24 8:37 UTC (permalink / raw)
To: Oleg Nesterov
Cc: dhowells, Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
Roland McGrath, linux-kernel
Oleg Nesterov <oleg@redhat.com> wrote:
> > Except that the data returned might then be inconsistent because you don't
> > hold a lock as you read the various bits of it.
>
> Yes. From the changelog:
>
> Of course, this means we read pending/blocked/etc nonatomically,
> but I hope this is OK for fs/proc.
Ah, yes. I read that as you meant how procfs accessed the actual data
structures, not how the user accessed procfs. It might be worth clarifying
that.
> But I don't think the returned data could be "really" inconsistent
> from the /bin/ps pov. Yes, it is possible that, say, some signal is
> seen as both pending and ignored without ->siglock. Or we can report
> user->sigpending != 0 while pending/shpending are empty.
>
> But this looks harmless to me. We never guaranteed /proc/pid/status
> can't report the "intermediate" state, and I don't think we can
> confuse the user-space.
>
> Do you agree? Or do you think this can make problems ?
I don't know of anything this will affect adversely. In fact, I'm not sure
there was a guarantee that it would be atomic anyway.
So as far as I'm concerned, you can add:
Acked-by: David Howells <dhowells@redhat.com>
> > Probably we can change do_task_stat() to avod ->siglock too, except
> > we can't get tty_nr lockless.
Btw, avoid has an 'i' in it... :-)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
2010-03-24 8:37 ` David Howells
@ 2010-03-24 15:00 ` Oleg Nesterov
0 siblings, 0 replies; 11+ messages in thread
From: Oleg Nesterov @ 2010-03-24 15:00 UTC (permalink / raw)
To: David Howells
Cc: Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
Roland McGrath, linux-kernel
On 03/24, David Howells wrote:
>
> Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > Yes. From the changelog:
> >
> > Of course, this means we read pending/blocked/etc nonatomically,
> > but I hope this is OK for fs/proc.
>
> Ah, yes. I read that as you meant how procfs accessed the actual data
> structures, not how the user accessed procfs. It might be worth clarifying
> that.
OK, agreed.
> Acked-by: David Howells <dhowells@redhat.com>
Thanks,
> > > Probably we can change do_task_stat() to avod ->siglock too, except
>
> Btw, avoid has an 'i' in it... :-)
Another reason to update the changelog ;)
Andrew, please find the updated changelog for proc-make-task_sig-lockless.patch
If this is not convenient, please ignore or tell me what is the "right" way
to fix the changelog when the patch is already in -mm.
------------------------------------------------------------------------------
Now that task->signal can't go away and collect_sigign_sigcatch() is
rcu-safe, task_sig() doesn't need ->siglock.
Remove lock_task_sighand() and unnecessary sigemptyset's, move
collect_sigign_sigcatch() under rcu_read_lock().
Of course, this means we read pending/blocked/etc nonatomically and we
can report this info in some intermediate state. Say, a signal can be
reported as both pending and ignored, or we can report ->sigpending != 0
while pending/shpending are empty, etc. Hopefully this is OK for proc,
we never promised this info should be atomic.
Probably we can change do_task_stat() to avoid ->siglock too, except we
can't get tty_nr lockless.
Also, remove the "is this correct?" comment. I think it is safe to
dereference __task_cred(p)->user under rcu lock. In any case, ->siglock
can't help to protect cred->user.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
2010-03-23 10:57 ` Oleg Nesterov
@ 2010-04-09 19:59 ` Roland McGrath
2010-04-12 19:50 ` Oleg Nesterov
2010-04-10 8:16 ` David Howells
1 sibling, 1 reply; 11+ messages in thread
From: Roland McGrath @ 2010-04-09 19:59 UTC (permalink / raw)
To: Oleg Nesterov
Cc: David Howells, Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
linux-kernel
> Yes. From the changelog:
>
> Of course, this means we read pending/blocked/etc nonatomically,
> but I hope this is OK for fs/proc.
>
> But I don't think the returned data could be "really" inconsistent
> from the /bin/ps pov. Yes, it is possible that, say, some signal is
> seen as both pending and ignored without ->siglock. Or we can report
> user->sigpending != 0 while pending/shpending are empty.
>
> But this looks harmless to me. We never guaranteed /proc/pid/status
> can't report the "intermediate" state, and I don't think we can
> confuse the user-space.
>
> Do you agree? Or do you think this can make problems ?
I'm not so sure. Operations like sigprocmask and sigaction really have
always been entirely atomic from the userland perspective before. Now it
becomes possible to read from /proc e.g. a blocked set that never existed
as such (one word updated by sigprocmask but not yet the next word).
Thanks,
Roland
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
2010-03-23 10:57 ` Oleg Nesterov
2010-04-09 19:59 ` Roland McGrath
@ 2010-04-10 8:16 ` David Howells
1 sibling, 0 replies; 11+ messages in thread
From: David Howells @ 2010-04-10 8:16 UTC (permalink / raw)
To: Roland McGrath
Cc: dhowells, Oleg Nesterov, Andrew Morton, Alexey Dobriyan,
Eric W. Biederman, linux-kernel
Roland McGrath <roland@redhat.com> wrote:
> I'm not so sure. Operations like sigprocmask and sigaction really have
> always been entirely atomic from the userland perspective before. Now it
> becomes possible to read from /proc e.g. a blocked set that never existed
> as such (one word updated by sigprocmask but not yet the next word).
If you have a small userspace buffer, that was previously possible too.
David
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
2010-04-09 19:59 ` Roland McGrath
@ 2010-04-12 19:50 ` Oleg Nesterov
2010-04-13 6:30 ` Roland McGrath
0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2010-04-12 19:50 UTC (permalink / raw)
To: Roland McGrath
Cc: David Howells, Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
linux-kernel
On 04/09, Roland McGrath wrote:
>
> > Yes. From the changelog:
> >
> > Of course, this means we read pending/blocked/etc nonatomically,
> > but I hope this is OK for fs/proc.
> >
> > But I don't think the returned data could be "really" inconsistent
> > from the /bin/ps pov. Yes, it is possible that, say, some signal is
> > seen as both pending and ignored without ->siglock. Or we can report
> > user->sigpending != 0 while pending/shpending are empty.
> >
> > But this looks harmless to me. We never guaranteed /proc/pid/status
> > can't report the "intermediate" state, and I don't think we can
> > confuse the user-space.
> >
> > Do you agree? Or do you think this can make problems ?
>
> I'm not so sure. Operations like sigprocmask and sigaction really have
> always been entirely atomic from the userland perspective before. Now it
> becomes possible to read from /proc e.g. a blocked set that never existed
> as such (one word updated by sigprocmask but not yet the next word).
Yes, /proc/pid/status can report the intermediate state, I even sent
the updated changelog to document this.
But if you are not sure this is OK, I am worried. Do you think we should
drop this patch? If yes, I won't argue.
Oleg.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
2010-04-12 19:50 ` Oleg Nesterov
@ 2010-04-13 6:30 ` Roland McGrath
2010-04-13 20:00 ` Oleg Nesterov
0 siblings, 1 reply; 11+ messages in thread
From: Roland McGrath @ 2010-04-13 6:30 UTC (permalink / raw)
To: Oleg Nesterov
Cc: David Howells, Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
linux-kernel
> Yes, /proc/pid/status can report the intermediate state, I even sent
> the updated changelog to document this.
>
> But if you are not sure this is OK, I am worried. Do you think we should
> drop this patch? If yes, I won't argue.
I'm not dead-set against it, but I am hesitant. My inclination is not to
remove any previous userland atomicity guarantees with regard to observable
signal state in any form. At least, don't do that in part of a whole
cleanup flurry where it is intermixed with lots of changes that really are
pure cleanup with absolutely no userland-observable change. If it really
helps to fragment what was atomic before, then we can consider it. But
let's not be in a hurry.
David mentioned that users who do multiple reads due to using tiny buffers
already don't get atomic sampling. That is certainly true but I don't
think it's relevant. It is completely reliable that you can easily
allocate a buffer big enough to get all the Sig* fields on the first read,
and any user program that might care about the coherence of the data,
by definition, is already doing that.
Thanks,
Roland
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
2010-04-13 6:30 ` Roland McGrath
@ 2010-04-13 20:00 ` Oleg Nesterov
0 siblings, 0 replies; 11+ messages in thread
From: Oleg Nesterov @ 2010-04-13 20:00 UTC (permalink / raw)
To: Roland McGrath
Cc: David Howells, Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
linux-kernel
OK. Andrew, please drop
proc-make-collect_sigign_sigcatch-rcu-safe.patch
proc-make-task_sig-lockless.patch
On 04/12, Roland McGrath wrote:
>
> > Yes, /proc/pid/status can report the intermediate state, I even sent
> > the updated changelog to document this.
> >
> > But if you are not sure this is OK, I am worried. Do you think we should
> > drop this patch? If yes, I won't argue.
>
> I'm not dead-set against it, but I am hesitant. My inclination is not to
> remove any previous userland atomicity guarantees with regard to observable
> signal state in any form.
OK. Not that I really understand why do we need atomicity, but OK.
I was going to remove ->siglock from /fs/proc/ completely (except
do_io_accounting), but given that nobody replied to do_task_stat patches
this will not happen soon.
> At least, don't do that in part of a whole
> cleanup flurry where it is intermixed with lots of changes that really are
> pure cleanup with absolutely no userland-observable change.
OK. Anyway, these changes are simple, we can reconsider them later.
Oleg.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2010-04-13 20:03 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-22 18:41 [PATCH -mm 3/3] proc: make task_sig() lockless Oleg Nesterov
2010-03-23 8:30 ` David Howells
2010-03-23 8:37 ` David Howells
2010-03-23 10:57 ` Oleg Nesterov
2010-04-09 19:59 ` Roland McGrath
2010-04-12 19:50 ` Oleg Nesterov
2010-04-13 6:30 ` Roland McGrath
2010-04-13 20:00 ` Oleg Nesterov
2010-04-10 8:16 ` David Howells
2010-03-24 8:37 ` David Howells
2010-03-24 15:00 ` Oleg Nesterov
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.