* [PATCH -mm 3/3] proc: make task_sig() lockless
@ 2010-03-22 18:41 Oleg Nesterov
  2010-03-23  8:30 ` David Howells
  2010-03-23  8:37 ` David Howells
  0 siblings, 2 replies; 11+ messages in thread
From: Oleg Nesterov @ 2010-03-22 18:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexey Dobriyan, David Howells, Eric W. Biederman,
	Roland McGrath, linux-kernel

Now that task->signal can't go away and collect_sigign_sigcatch()
is rcu-safe, task_sig() doesn't need ->siglock.

Remove lock_task_sighand() and unnecessary sigemptyset's, move
collect_sigign_sigcatch() under rcu_read_lock().

Of course, this means we read pending/blocked/etc nonatomically,
but I hope this is OK for fs/proc.

Probably we can change do_task_stat() to avod ->siglock too, except
we can't get tty_nr lockless.

Also, remove the "is this correct?" comment. I think it is safe
to dereference __task_cred(p)->user under rcu lock. In any case,
->siglock can't help to protect cred->user.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 fs/proc/array.c |   26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)

--- 34-rc1/fs/proc/array.c~PROC_3_TASK_SIG_DONT_USE_SIGLOCK	2010-03-22 17:39:42.000000000 +0100
+++ 34-rc1/fs/proc/array.c	2010-03-22 18:36:13.000000000 +0100
@@ -257,30 +257,24 @@ static void collect_sigign_sigcatch(stru
 
 static inline void task_sig(struct seq_file *m, struct task_struct *p)
 {
-	unsigned long flags;
 	sigset_t pending, shpending, blocked, ignored, caught;
 	int num_threads = 0;
 	unsigned long qsize = 0;
 	unsigned long qlim = 0;
 
-	sigemptyset(&pending);
-	sigemptyset(&shpending);
-	sigemptyset(&blocked);
 	sigemptyset(&ignored);
 	sigemptyset(&caught);
 
-	if (lock_task_sighand(p, &flags)) {
-		pending = p->pending.signal;
-		shpending = p->signal->shared_pending.signal;
-		blocked = p->blocked;
-		collect_sigign_sigcatch(p, &ignored, &caught);
-		num_threads = get_nr_threads(p);
-		rcu_read_lock();  /* FIXME: is this correct? */
-		qsize = atomic_read(&__task_cred(p)->user->sigpending);
-		rcu_read_unlock();
-		qlim = task_rlimit(p, RLIMIT_SIGPENDING);
-		unlock_task_sighand(p, &flags);
-	}
+	blocked = p->blocked;
+	pending = p->pending.signal;
+	shpending = p->signal->shared_pending.signal;
+	qlim = task_rlimit(p, RLIMIT_SIGPENDING);
+	num_threads = get_nr_threads(p);
+
+	rcu_read_lock();
+	collect_sigign_sigcatch(p, &ignored, &caught);
+	qsize = atomic_read(&__task_cred(p)->user->sigpending);
+	rcu_read_unlock();
 
 	seq_printf(m, "Threads:\t%d\n", num_threads);
 	seq_printf(m, "SigQ:\t%lu/%lu\n", qsize, qlim);



* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
  2010-03-22 18:41 [PATCH -mm 3/3] proc: make task_sig() lockless Oleg Nesterov
@ 2010-03-23  8:30 ` David Howells
  2010-03-23  8:37 ` David Howells
  1 sibling, 0 replies; 11+ messages in thread
From: David Howells @ 2010-03-23  8:30 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dhowells, Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
	Roland McGrath, linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> Also, remove the "is this correct?" comment. I think it is safe
> to dereference __task_cred(p)->user under rcu lock.

It is.

David


* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
  2010-03-22 18:41 [PATCH -mm 3/3] proc: make task_sig() lockless Oleg Nesterov
  2010-03-23  8:30 ` David Howells
@ 2010-03-23  8:37 ` David Howells
  2010-03-23 10:57   ` Oleg Nesterov
  2010-03-24  8:37   ` David Howells
  1 sibling, 2 replies; 11+ messages in thread
From: David Howells @ 2010-03-23  8:37 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dhowells, Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
	Roland McGrath, linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> task_sig() doesn't need ->siglock.

Except that the data returned might then be inconsistent because you don't
hold a lock as you read the various bits of it.

David


* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
  2010-03-23  8:37 ` David Howells
@ 2010-03-23 10:57   ` Oleg Nesterov
  2010-04-09 19:59     ` Roland McGrath
  2010-04-10  8:16     ` David Howells
  2010-03-24  8:37   ` David Howells
  1 sibling, 2 replies; 11+ messages in thread
From: Oleg Nesterov @ 2010-03-23 10:57 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
	Roland McGrath, linux-kernel

On 03/23, David Howells wrote:
>
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > task_sig() doesn't need ->siglock.
>
> Except that the data returned might then be inconsistent because you don't
> hold a lock as you read the various bits of it.

Yes. From the changelog:

	Of course, this means we read pending/blocked/etc nonatomically,
	but I hope this is OK for fs/proc.

But I don't think the returned data could be "really" inconsistent
from the /bin/ps pov. Yes, it is possible that, say, some signal is
seen as both pending and ignored without ->siglock. Or we can report
user->sigpending != 0 while pending/shpending are empty.

But this looks harmless to me. We never guaranteed /proc/pid/status
can't report the "intermediate" state, and I don't think we can
confuse the user-space.

Do you agree? Or do you think this can make problems ?

Oleg.



* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
  2010-03-23  8:37 ` David Howells
  2010-03-23 10:57   ` Oleg Nesterov
@ 2010-03-24  8:37   ` David Howells
  2010-03-24 15:00     ` Oleg Nesterov
  1 sibling, 1 reply; 11+ messages in thread
From: David Howells @ 2010-03-24  8:37 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dhowells, Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
	Roland McGrath, linux-kernel

Oleg Nesterov <oleg@redhat.com> wrote:

> > Except that the data returned might then be inconsistent because you don't
> > hold a lock as you read the various bits of it.
> 
> Yes. From the changelog:
> 
> 	Of course, this means we read pending/blocked/etc nonatomically,
> 	but I hope this is OK for fs/proc.

Ah, yes.  I read that as you meant how procfs accessed the actual data
structures, not how the user accessed procfs.  It might be worth clarifying
that.

> But I don't think the returned data could be "really" inconsistent
> from the /bin/ps pov. Yes, it is possible that, say, some signal is
> seen as both pending and ignored without ->siglock. Or we can report
> user->sigpending != 0 while pending/shpending are empty.
> 
> But this looks harmless to me. We never guaranteed /proc/pid/status
> can't report the "intermediate" state, and I don't think we can
> confuse the user-space.
> 
> Do you agree? Or do you think this can make problems ?

I don't know of anything this will affect adversely.  In fact, I'm not sure
there was a guarantee that it would be atomic anyway.

So as far as I'm concerned, you can add:

Acked-by: David Howells <dhowells@redhat.com>

> > Probably we can change do_task_stat() to avod ->siglock too, except
> > we can't get tty_nr lockless.

Btw, avoid has an 'i' in it... :-)


* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
  2010-03-24  8:37   ` David Howells
@ 2010-03-24 15:00     ` Oleg Nesterov
  0 siblings, 0 replies; 11+ messages in thread
From: Oleg Nesterov @ 2010-03-24 15:00 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
	Roland McGrath, linux-kernel

On 03/24, David Howells wrote:
>
> Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > Yes. From the changelog:
> >
> > 	Of course, this means we read pending/blocked/etc nonatomically,
> > 	but I hope this is OK for fs/proc.
>
> Ah, yes.  I read that as you meant how procfs accessed the actual data
> structures, not how the user accessed procfs.  It might be worth clarifying
> that.

OK, agreed.

> Acked-by: David Howells <dhowells@redhat.com>

Thanks,

> > > Probably we can change do_task_stat() to avod ->siglock too, except
>
> Btw, avoid has an 'i' in it... :-)

Another reason to update the changelog ;)


Andrew, please find the updated changelog for proc-make-task_sig-lockless.patch
If this is not convenient, please ignore or tell me what is the "right" way
to fix the changelog when the patch is already in -mm.

------------------------------------------------------------------------------
Now that task->signal can't go away and collect_sigign_sigcatch() is
rcu-safe, task_sig() doesn't need ->siglock.

Remove lock_task_sighand() and unnecessary sigemptyset's, move
collect_sigign_sigcatch() under rcu_read_lock().

Of course, this means we read pending/blocked/etc nonatomically and we
can report this info in some intermediate state. Say, a signal can be
reported as both pending and ignored, or we can report ->sigpending != 0
while pending/shpending are empty, etc. Hopefully this is OK for proc,
we never promised this info should be atomic.

Probably we can change do_task_stat() to avoid ->siglock too, except we
can't get tty_nr lockless.

Also, remove the "is this correct?" comment.  I think it is safe to
dereference __task_cred(p)->user under rcu lock.  In any case, ->siglock
can't help to protect cred->user.



* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
  2010-03-23 10:57   ` Oleg Nesterov
@ 2010-04-09 19:59     ` Roland McGrath
  2010-04-12 19:50       ` Oleg Nesterov
  2010-04-10  8:16     ` David Howells
  1 sibling, 1 reply; 11+ messages in thread
From: Roland McGrath @ 2010-04-09 19:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: David Howells, Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
	linux-kernel

> Yes. From the changelog:
> 
> 	Of course, this means we read pending/blocked/etc nonatomically,
> 	but I hope this is OK for fs/proc.
> 
> But I don't think the returned data could be "really" inconsistent
> from the /bin/ps pov. Yes, it is possible that, say, some signal is
> seen as both pending and ignored without ->siglock. Or we can report
> user->sigpending != 0 while pending/shpending are empty.
> 
> But this looks harmless to me. We never guaranteed /proc/pid/status
> can't report the "intermediate" state, and I don't think we can
> confuse the user-space.
> 
> Do you agree? Or do you think this can make problems ?

I'm not so sure.  Operations like sigprocmask and sigaction really have
always been entirely atomic from the userland perspective before.  Now it
becomes possible to read from /proc e.g. a blocked set that never existed
as such (one word updated by sigprocmask but not yet the next word).


Thanks,
Roland


* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
  2010-03-23 10:57   ` Oleg Nesterov
  2010-04-09 19:59     ` Roland McGrath
@ 2010-04-10  8:16     ` David Howells
  1 sibling, 0 replies; 11+ messages in thread
From: David Howells @ 2010-04-10  8:16 UTC (permalink / raw)
  To: Roland McGrath
  Cc: dhowells, Oleg Nesterov, Andrew Morton, Alexey Dobriyan,
	Eric W. Biederman, linux-kernel

Roland McGrath <roland@redhat.com> wrote:

> I'm not so sure.  Operations like sigprocmask and sigaction really have
> always been entirely atomic from the userland perspective before.  Now it
> becomes possible to read from /proc e.g. a blocked set that never existed
> as such (one word updated by sigprocmask but not yet the next word).

If you have a small userspace buffer, that was previously possible too.

David


* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
  2010-04-09 19:59     ` Roland McGrath
@ 2010-04-12 19:50       ` Oleg Nesterov
  2010-04-13  6:30         ` Roland McGrath
  0 siblings, 1 reply; 11+ messages in thread
From: Oleg Nesterov @ 2010-04-12 19:50 UTC (permalink / raw)
  To: Roland McGrath
  Cc: David Howells, Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
	linux-kernel

On 04/09, Roland McGrath wrote:
>
> > Yes. From the changelog:
> >
> > 	Of course, this means we read pending/blocked/etc nonatomically,
> > 	but I hope this is OK for fs/proc.
> >
> > But I don't think the returned data could be "really" inconsistent
> > from the /bin/ps pov. Yes, it is possible that, say, some signal is
> > seen as both pending and ignored without ->siglock. Or we can report
> > user->sigpending != 0 while pending/shpending are empty.
> >
> > But this looks harmless to me. We never guaranteed /proc/pid/status
> > can't report the "intermediate" state, and I don't think we can
> > confuse the user-space.
> >
> > Do you agree? Or do you think this can make problems ?
>
> I'm not so sure.  Operations like sigprocmask and sigaction really have
> always been entirely atomic from the userland perspective before.  Now it
> becomes possible to read from /proc e.g. a blocked set that never existed
> as such (one word updated by sigprocmask but not yet the next word).

Yes, /proc/pid/status can report the intermediate state, I even sent
the updated changelog to document this.

But if you are not sure this is OK, I am worried. Do you think we should
drop this patch? If yes, I won't argue.

Oleg.



* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
  2010-04-12 19:50       ` Oleg Nesterov
@ 2010-04-13  6:30         ` Roland McGrath
  2010-04-13 20:00           ` Oleg Nesterov
  0 siblings, 1 reply; 11+ messages in thread
From: Roland McGrath @ 2010-04-13  6:30 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: David Howells, Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
	linux-kernel

> Yes, /proc/pid/status can report the intermediate state, I even sent
> the updated changelog to document this.
> 
> But if you are not sure this is OK, I am worried. Do you think we should
> drop this patch? If yes, I won't argue.

I'm not dead-set against it, but I am hesitant.  My inclination is not to
remove any previous userland atomicity guarantees with regard to observable
signal state in any form.  At least, don't do that in part of a whole
cleanup flurry where it is intermixed with lots of changes that really are
pure cleanup with absolutely no userland-observable change.  If it really
helps to fragment what was atomic before, then we can consider it.  But
let's not be in a hurry.

David mentioned that users who do multiple reads due to using tiny buffers
already don't get atomic sampling.  That is certainly true but I don't
think it's relevant.  It is completely reliable that you can easily
allocate a buffer big enough to get all the Sig* fields on the first read,
and any user program that might care about the coherence of the data,
by definition, is already doing that.


Thanks,
Roland


* Re: [PATCH -mm 3/3] proc: make task_sig() lockless
  2010-04-13  6:30         ` Roland McGrath
@ 2010-04-13 20:00           ` Oleg Nesterov
  0 siblings, 0 replies; 11+ messages in thread
From: Oleg Nesterov @ 2010-04-13 20:00 UTC (permalink / raw)
  To: Roland McGrath
  Cc: David Howells, Andrew Morton, Alexey Dobriyan, Eric W. Biederman,
	linux-kernel

OK. Andrew, please drop

	proc-make-collect_sigign_sigcatch-rcu-safe.patch
	proc-make-task_sig-lockless.patch

On 04/12, Roland McGrath wrote:
>
> > Yes, /proc/pid/status can report the intermediate state, I even sent
> > the updated changelog to document this.
> >
> > But if you are not sure this is OK, I am worried. Do you think we should
> > drop this patch? If yes, I won't argue.
>
> I'm not dead-set against it, but I am hesitant.  My inclination is not to
> remove any previous userland atomicity guarantees with regard to observable
> signal state in any form.

OK. Not that I really understand why we need atomicity, but OK.

I was going to remove ->siglock from /fs/proc/ completely (except
do_io_accounting), but given that nobody replied to do_task_stat patches
this will not happen soon.

> At least, don't do that in part of a whole
> cleanup flurry where it is intermixed with lots of changes that really are
> pure cleanup with absolutely no userland-observable change.

OK. Anyway, these changes are simple, we can reconsider them later.

Oleg.


