linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: heavy handed exit() in latest BK
       [not found] <200302091130.h19BU2107744@magilla.sf.frob.com>
@ 2003-02-09 11:40 ` Ingo Molnar
  2003-02-09 11:56   ` Roland McGrath
  0 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2003-02-09 11:40 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Linus Torvalds, Anton Blanchard, Andrew Morton, Arjan van de Ven,
	linux-kernel


Roland,

there are two bugs in the signal code still:

 - a read_lock(&tasklist_lock) is missing around the group_send_sig_info()
   in send_sig_info().

 - session-IDs and group-IDs are set outside the tasklist lock. This
   causes breakage in the USB code. The correct fix is to do this:

void __set_special_pids(pid_t session, pid_t pgrp)
{
        struct task_struct *curr = current;

        if (curr->session != session) {
                detach_pid(curr, PIDTYPE_SID);
                curr->session = session;
                attach_pid(curr, PIDTYPE_SID, session);
        }
        if (curr->pgrp != pgrp) {
                detach_pid(curr, PIDTYPE_PGID);
                curr->pgrp = pgrp;
                attach_pid(curr, PIDTYPE_PGID, pgrp);
        }
}

void set_special_pids(pid_t session, pid_t pgrp)
{
        write_lock_irq(&tasklist_lock);
        __set_special_pids(session, pgrp);
        write_unlock_irq(&tasklist_lock);
}

and then update all places that do:

-	current->session = 1;
-	current->pgrp = 1;

to:

+	set_special_pids(1, 1);

otherwise we change the PIDs without properly rehashing them.

	Ingo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09 11:40 ` heavy handed exit() in latest BK Ingo Molnar
@ 2003-02-09 11:56   ` Roland McGrath
  2003-02-09 12:09     ` Ingo Molnar
                       ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Roland McGrath @ 2003-02-09 11:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Anton Blanchard, Andrew Morton, Arjan van de Ven,
	linux-kernel

>  - a read_lock(&tasklist_lock) is missing around the group_send_sig_info()
>    in send_sig_info().

Indeed.  I still intend to clean up those entry points and haven't gotten
to it, so I hadn't bothered with this yet either (though I think I sent it
to you for the backport).  It certainly does bite in practice, e.g. SIGPIPE.

There is a similar failure to take the lock before using zap_other_threads.
I thought I sent this patch before, but it's not in 2.5 yet.

--- /home/roland/redhat/linux-2.5.59-1.1007/fs/exec.c.~1~	Fri Feb  7 20:04:27 2003
+++ /home/roland/redhat/linux-2.5.59-1.1007/fs/exec.c	Sun Feb  9 03:43:36 2003
@@ -601,9 +601,12 @@ static inline int de_thread(struct task_
 
 	if (thread_group_empty(current))
 		goto no_thread_group;
+
 	/*
-	 * Kill all other threads in the thread group:
+	 * Kill all other threads in the thread group.
+	 * We must hold tasklist_lock to call zap_other_threads.
 	 */
+	read_lock(&tasklist_lock);
 	spin_lock_irq(lock);
 	if (oldsig->group_exit) {
 		/*
@@ -611,6 +614,7 @@ static inline int de_thread(struct task_
 		 * return so that the signal is processed.
 		 */
 		spin_unlock_irq(lock);
+		read_unlock(&tasklist_lock);
 		kmem_cache_free(sighand_cachep, newsighand);
 		if (newsig)
 			kmem_cache_free(signal_cachep, newsig);
@@ -629,12 +633,15 @@ static inline int de_thread(struct task_
 		oldsig->group_exit_task = current;
 		current->state = TASK_UNINTERRUPTIBLE;
 		spin_unlock_irq(lock);
+		read_unlock(&tasklist_lock);
 		schedule();
+		read_lock(&tasklist_lock);
 		spin_lock_irq(lock);
 		if (oldsig->group_exit_task)
 			BUG();
 	}
 	spin_unlock_irq(lock);
+	read_unlock(&tasklist_lock);
 
 	/*
 	 * At this point all other threads have exited, all we have to


>  - session-IDs and group-IDs are set outside the tasklist lock. This
>    causes breakage in the USB code. The correct fix is to do this:

This is outside the part of the code that I've touched lately.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09 11:56   ` Roland McGrath
@ 2003-02-09 12:09     ` Ingo Molnar
  2003-02-09 12:18     ` Ingo Molnar
  2003-02-10  1:07     ` Linus Torvalds
  2 siblings, 0 replies; 27+ messages in thread
From: Ingo Molnar @ 2003-02-09 12:09 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Linus Torvalds, Anton Blanchard, Andrew Morton, Arjan van de Ven,
	linux-kernel


On Sun, 9 Feb 2003, Roland McGrath wrote:

> >  - a read_lock(&tasklist_lock) is missing around the group_send_sig_info()
> >    in send_sig_info().
> 
> Indeed.  I still intend to clean up those entry points and haven't
> gotten to it, so I hadn't bothered with this yet either (though I think
> I sent it to you for the backport).  It certainly does bite in practice,
> e.g. SIGPIPE.

it does bite - here's the correctness fix meanwhile, until the interface 
is cleaned up.

	Ingo

--- linux/kernel/signal.c.orig	
+++ linux/kernel/signal.c	
@@ -1083,17 +1083,19 @@
 int
 send_sig_info(int sig, struct siginfo *info, struct task_struct *p)
 {
+	int ret;
+
 	/* XXX should nix these interfaces and update the kernel */
-	if (T(sig, SIG_KERNEL_BROADCAST_MASK))
-		/* XXX do callers really always hold the tasklist_lock?? */
-		return group_send_sig_info(sig, info, p);
-	else {
-		int error;
+	if (T(sig, SIG_KERNEL_BROADCAST_MASK)) {
+		read_lock(&tasklist_lock);
+		ret = group_send_sig_info(sig, info, p);
+		read_unlock(&tasklist_lock);
+	} else {
 		spin_lock_irq(&p->sighand->siglock);
-		error = specific_send_sig_info(sig, info, p);
+		ret = specific_send_sig_info(sig, info, p);
 		spin_unlock_irq(&p->sighand->siglock);
-		return error;
 	}
+	return ret;
 }
 
 int


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09 11:56   ` Roland McGrath
  2003-02-09 12:09     ` Ingo Molnar
@ 2003-02-09 12:18     ` Ingo Molnar
  2003-02-09 12:23       ` Ingo Molnar
  2003-02-10  1:07     ` Linus Torvalds
  2 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2003-02-09 12:18 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Linus Torvalds, Anton Blanchard, Andrew Morton, Arjan van de Ven,
	linux-kernel


On Sun, 9 Feb 2003, Roland McGrath wrote:

> >  - session-IDs and group-IDs are set outside the tasklist lock. This
> >    causes breakage in the USB code. The correct fix is to do this:
> 
> This is outside the part of the code that I've touched lately.

yes, i introduced the bug with the new pidhash. Here's the fix, against
BK-curr.

	Ingo

--- linux/include/linux/sched.h.orig	
+++ linux/include/linux/sched.h	
@@ -503,6 +503,8 @@
 extern struct   mm_struct init_mm;
 
 extern struct task_struct *find_task_by_pid(int pid);
+extern void set_special_pids(pid_t session, pid_t pgrp);
+extern void __set_special_pids(pid_t session, pid_t pgrp);
 
 /* per-UID process charging. */
 extern struct user_struct * alloc_uid(uid_t);
--- linux/fs/jffs/intrep.c.orig	
+++ linux/fs/jffs/intrep.c	
@@ -3344,8 +3344,7 @@
 	lock_kernel();
 	exit_mm(c->gc_task);
 
-	current->session = 1;
-	current->pgrp = 1;
+	set_special_pids(1, 1);
 	init_completion(&c->gc_thread_comp); /* barrier */ 
 	spin_lock_irq(&current->sighand->siglock);
 	siginitsetinv (&current->blocked, sigmask(SIGHUP) | sigmask(SIGKILL) | sigmask(SIGSTOP) | sigmask(SIGCONT));
--- linux/kernel/exit.c.orig	
+++ linux/kernel/exit.c	
@@ -254,6 +254,29 @@
 	write_unlock_irq(&tasklist_lock);
 }
 
+void __set_special_pids(pid_t session, pid_t pgrp)
+{
+	struct task_struct *curr = current;
+
+	if (curr->session != session) {
+		detach_pid(curr, PIDTYPE_SID);
+		curr->session = session;
+		attach_pid(curr, PIDTYPE_SID, session);
+	}
+	if (curr->pgrp != pgrp) {
+		detach_pid(curr, PIDTYPE_PGID);
+		curr->pgrp = pgrp;
+		attach_pid(curr, PIDTYPE_PGID, pgrp);
+	}
+}
+
+void set_special_pids(pid_t session, pid_t pgrp)
+{
+	write_lock_irq(&tasklist_lock);
+	__set_special_pids(session, pgrp);
+	write_unlock_irq(&tasklist_lock);
+}
+
 /*
  *	Put all the gunge required to become a kernel thread without
  *	attached user resources in one place where it belongs.
@@ -271,8 +294,7 @@
 	 */
 	exit_mm(current);
 
-	current->session = 1;
-	current->pgrp = 1;
+	set_special_pids(1, 1);
 	current->tty = NULL;
 
 	/* Become as one with the init task */
--- linux/kernel/kmod.c.orig	
+++ linux/kernel/kmod.c	
@@ -100,8 +100,7 @@
 	int i;
 	struct task_struct *curtask = current;
 
-	curtask->session = 1;
-	curtask->pgrp = 1;
+	set_special_pids(1, 1);
 
 	use_init_fs_context();
 
--- linux/kernel/sys.c.orig	
+++ linux/kernel/sys.c	
@@ -1021,16 +1021,7 @@
 		goto out;
 
 	current->leader = 1;
-	if (current->session != current->pid) {
-		detach_pid(current, PIDTYPE_SID);
-		current->session = current->pid;
-		attach_pid(current, PIDTYPE_SID, current->pid);
-	}
-	if (current->pgrp != current->pid) {
-		detach_pid(current, PIDTYPE_PGID);
-		current->pgrp = current->pid;
-		attach_pid(current, PIDTYPE_PGID, current->pid);
-	}
+	__set_special_pids(current->pid, current->pid);
 	current->tty = NULL;
 	current->tty_old_pgrp = 0;
 	err = current->pgrp;


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09 12:23       ` Ingo Molnar
@ 2003-02-09 12:22         ` Arjan van de Ven
  0 siblings, 0 replies; 27+ messages in thread
From: Arjan van de Ven @ 2003-02-09 12:22 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Roland McGrath, Linus Torvalds, Anton Blanchard, Andrew Morton,
	Arjan van de Ven, linux-kernel

On Sun, Feb 09, 2003 at 01:23:14PM +0100, Ingo Molnar wrote:
> 
> Arjan pointed out that this one is needed as well:

but in second thought... modules don't really need to set the
pids at all, since reparent_to_init() is designed for doing so...

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09 12:18     ` Ingo Molnar
@ 2003-02-09 12:23       ` Ingo Molnar
  2003-02-09 12:22         ` Arjan van de Ven
  0 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2003-02-09 12:23 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Linus Torvalds, Anton Blanchard, Andrew Morton, Arjan van de Ven,
	linux-kernel


Arjan pointed out that this one is needed as well:

--- linux/kernel/ksyms.c.orig
+++ linux/kernel/ksyms.c
@@ -604,6 +604,7 @@
 EXPORT_SYMBOL(tasklist_lock);
 EXPORT_SYMBOL(find_task_by_pid);
 EXPORT_SYMBOL(next_thread);
+EXPORT_SYMBOL_GPL(set_special_pids);
 #if defined(CONFIG_SMP) && defined(__GENERIC_PER_CPU)
 EXPORT_SYMBOL(__per_cpu_offset);
 #endif


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09 11:56   ` Roland McGrath
  2003-02-09 12:09     ` Ingo Molnar
  2003-02-09 12:18     ` Ingo Molnar
@ 2003-02-10  1:07     ` Linus Torvalds
  2003-02-10  1:27       ` Roland McGrath
  2 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2003-02-10  1:07 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Ingo Molnar, Anton Blanchard, Andrew Morton, Arjan van de Ven,
	linux-kernel


On Sun, 9 Feb 2003, Roland McGrath wrote:
> 
> There is a similar failure to take the lock before using zap_other_threads.
> I thought I sent this patch before, but it's not in 2.5 yet.

Hmm.. From looking at this patch, it seems as if you believe that 
spinlocks must next correctly. They don't have to.

The ABBA kind of deadlock only means that you have to _take_ the spinlocks 
in the right order, you can release them in any order you like (as long as 
you release all of them).

So if the only requirement is that zap_other_threads() is called with the
tasklist lock held, I would suggest just dropping the tasklist lock after
the zap_other threads thing, despite the fact that you still want to hold 
on to the siglock for a while longer. That simplifies the patch, and means 
that you don't have to worry about dropping and re-taking the tasklist 
lock in the loop that follows.

		Linus


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-10  1:07     ` Linus Torvalds
@ 2003-02-10  1:27       ` Roland McGrath
  0 siblings, 0 replies; 27+ messages in thread
From: Roland McGrath @ 2003-02-10  1:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Anton Blanchard, Andrew Morton, Arjan van de Ven,
	linux-kernel

--- /home/roland/redhat/linux-2.5.59-1.1007/fs/exec.c.~1~	Fri Feb  7 20:04:27 2003
+++ /home/roland/redhat/linux-2.5.59-1.1007/fs/exec.c	Sun Feb  9 17:25:31 2003
@@ -601,9 +601,12 @@ static inline int de_thread(struct task_
 
 	if (thread_group_empty(current))
 		goto no_thread_group;
+
 	/*
-	 * Kill all other threads in the thread group:
+	 * Kill all other threads in the thread group.
+	 * We must hold tasklist_lock to call zap_other_threads.
 	 */
+	read_lock(&tasklist_lock);
 	spin_lock_irq(lock);
 	if (oldsig->group_exit) {
 		/*
@@ -611,6 +614,7 @@ static inline int de_thread(struct task_
 		 * return so that the signal is processed.
 		 */
 		spin_unlock_irq(lock);
+		read_unlock(&tasklist_lock);
 		kmem_cache_free(sighand_cachep, newsighand);
 		if (newsig)
 			kmem_cache_free(signal_cachep, newsig);
@@ -618,6 +622,7 @@ static inline int de_thread(struct task_
 	}
 	oldsig->group_exit = 1;
 	zap_other_threads(current);
+	read_unlock(&tasklist_lock);
 
 	/*
 	 * Account for the thread group leader hanging around:

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-10  8:53   ` Ingo Molnar
@ 2003-02-10 15:22     ` Linus Torvalds
  0 siblings, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2003-02-10 15:22 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Anton Blanchard, linux-kernel, Roland McGrath, Andrew Morton, arjanv


On Mon, 10 Feb 2003, Ingo Molnar wrote:
> 
> > Interesting. Especially as the last thing exit_notify() does (just a few
> > lines above the schedule()) is to do
> > 
> >         tsk->state = TASK_ZOMBIE;
> > 
> > and that schedule() _really_ really shouldn't return. Regardless of any
> > signal handler changes.
> 
> the proper way to avoid such scenarios (besides removing tasks from all
> waitqueues) is to remove the thread from all the PID-hashes prior setting
> it to TASK_ZOMBIE.

Not a good idea.

We still want to find zombie processes, since they show up in "ps"  
listings etc. And I don't think sending a signal to a zombie process
should return ESRCH, since it's there.

Btw, I fixed it by making all wake-up events give a mask of which states 
can be woken up. That's really what SIGCONT wanted anyway (only wake up 
stopped tasks), _and_ it's what default_wake_function() really wanted.

		Linus


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  2:00 ` Linus Torvalds
  2003-02-09  2:17   ` Roland McGrath
  2003-02-09  2:33   ` Linus Torvalds
@ 2003-02-10  8:53   ` Ingo Molnar
  2003-02-10 15:22     ` Linus Torvalds
  2 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2003-02-10  8:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Anton Blanchard, linux-kernel, Roland McGrath, Andrew Morton, arjanv


On Sat, 8 Feb 2003, Linus Torvalds wrote:

> Interesting. Especially as the last thing exit_notify() does (just a few
> lines above the schedule()) is to do
> 
>         tsk->state = TASK_ZOMBIE;
> 
> and that schedule() _really_ really shouldn't return. Regardless of any
> signal handler changes.

the proper way to avoid such scenarios (besides removing tasks from all
waitqueues) is to remove the thread from all the PID-hashes prior setting
it to TASK_ZOMBIE. Since any signal-sending needs a task pointer, and the
task pointer can only be found by looking it up in the hash, the proper
way to exclude signal related wakeups (without introducing special
handling into the wakeup code), is to remove the thread from the hash.

	Ingo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  4:51           ` Linus Torvalds
  2003-02-09  4:57             ` Linus Torvalds
  2003-02-09  5:00             ` Roland McGrath
@ 2003-02-09  9:28             ` Russell King
  2 siblings, 0 replies; 27+ messages in thread
From: Russell King @ 2003-02-09  9:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Roland McGrath, Anton Blanchard, linux-kernel, Ingo Molnar,
	Andrew Morton, arjanv

On Sat, Feb 08, 2003 at 08:51:05PM -0800, Linus Torvalds wrote:
> On Sat, 8 Feb 2003, Roland McGrath wrote:
> >
> > Here is the patch vs 2.5.59-1.1007 that I am using now.  gdb seems happy.
> > I have not run a lot of other tests yet.
> 
> Looks like kernel threads still go crazy at shutdown. I saw the migration 
> threads apparently hogging the CPU.

I hope you're aware that alt-sysrq-t has been broken for some time?

include/linux/sched.h:

#define TASK_RUNNING            0
#define TASK_INTERRUPTIBLE      1
#define TASK_UNINTERRUPTIBLE    2
#define TASK_STOPPED            4
#define TASK_ZOMBIE             8
#define TASK_DEAD               16

kernel/sched.c:

        static const char * stat_nam[] = { "R", "S", "D", "Z", "T", "W" };

fs/proc/array.c:

static const char *task_state_array[] = {
        "R (running)",          /*  0 */
        "S (sleeping)",         /*  1 */
        "D (disk sleep)",       /*  2 */
        "T (stopped)",          /*  8 */
        "Z (zombie)",           /*  4 */
        "X (dead)"              /* 16 */
};

So, for one more time, here's another mailing of the same patch to fix
this brokenness.  In addition, we fix the wrong comment in fs/proc/array.c

--- orig/kernel/sched.c	Sun Feb  9 09:16:31 2003
+++ linux/kernel/sched.c	Sun Feb  9 09:23:44 2003
@@ -2037,7 +2037,7 @@
 	unsigned long free = 0;
 	task_t *relative;
 	int state;
-	static const char * stat_nam[] = { "R", "S", "D", "Z", "T", "W" };
+	static const char * stat_nam[] = { "R", "S", "D", "T", "Z", "W" };
 
 	printk("%-13.13s ", p->comm);
 	state = p->state ? __ffs(p->state) + 1 : 0;
--- orig/fs/proc/array.c	Sun Feb  9 09:17:36 2003
+++ linux/fs/proc/array.c	Sun Feb  9 09:26:00 2003
@@ -126,8 +126,8 @@
 	"R (running)",		/*  0 */
 	"S (sleeping)",		/*  1 */
 	"D (disk sleep)",	/*  2 */
-	"T (stopped)",		/*  8 */
-	"Z (zombie)",		/*  4 */
+	"T (stopped)",		/*  4 */
+	"Z (zombie)",		/*  8 */
 	"X (dead)"		/* 16 */
 };
 


-- 
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  4:51           ` Linus Torvalds
  2003-02-09  4:57             ` Linus Torvalds
@ 2003-02-09  5:00             ` Roland McGrath
  2003-02-09  9:28             ` Russell King
  2 siblings, 0 replies; 27+ messages in thread
From: Roland McGrath @ 2003-02-09  5:00 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Anton Blanchard, linux-kernel, Ingo Molnar, Andrew Morton, arjanv

> Looks like kernel threads still go crazy at shutdown. I saw the migration 
> threads apparently hogging the CPU.

Hmm, my (2-CPU) machine reboots quickly.  I'd have to be checking closely
somehow to see if there is some short period of weird hoggery (I'm not sure
how, since my fingers aren't quick enough).  What exactly did you observe,
and how?


Thanks,
Roland

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  4:51           ` Linus Torvalds
@ 2003-02-09  4:57             ` Linus Torvalds
  2003-02-09  5:00             ` Roland McGrath
  2003-02-09  9:28             ` Russell King
  2 siblings, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2003-02-09  4:57 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Anton Blanchard, linux-kernel, Ingo Molnar, Andrew Morton, arjanv


On Sat, 8 Feb 2003, Linus Torvalds wrote:
> 
> Looks like kernel threads still go crazy at shutdown. I saw the migration 
> threads apparently hogging the CPU.

Never mind, I didn't merge your patch correctly, I still had my old one 
there that allowed setting signal-pending even for a blocked signal (and 
thus would confuse all the kernel threads that didn't expect to ever have 
somebody claim they had pending signals).

I think your patch is ok.

		Linus


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  3:48         ` Roland McGrath
@ 2003-02-09  4:51           ` Linus Torvalds
  2003-02-09  4:57             ` Linus Torvalds
                               ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Linus Torvalds @ 2003-02-09  4:51 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Anton Blanchard, linux-kernel, Ingo Molnar, Andrew Morton, arjanv


On Sat, 8 Feb 2003, Roland McGrath wrote:
>
> Here is the patch vs 2.5.59-1.1007 that I am using now.  gdb seems happy.
> I have not run a lot of other tests yet.

Looks like kernel threads still go crazy at shutdown. I saw the migration 
threads apparently hogging the CPU.

		Linus


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  3:33       ` Roland McGrath
  2003-02-09  3:37         ` Linus Torvalds
@ 2003-02-09  3:48         ` Roland McGrath
  2003-02-09  4:51           ` Linus Torvalds
  1 sibling, 1 reply; 27+ messages in thread
From: Roland McGrath @ 2003-02-09  3:48 UTC (permalink / raw)
  To: Linus Torvalds, Anton Blanchard, linux-kernel, Ingo Molnar,
	Andrew Morton, arjanv

Here is the patch vs 2.5.59-1.1007 that I am using now.  gdb seems happy.
I have not run a lot of other tests yet.


--- /home/roland/redhat/linux-2.5.59-1.1007/kernel/signal.c.~1~	Fri Feb  7 20:04:29 2003
+++ /home/roland/redhat/linux-2.5.59-1.1007/kernel/signal.c	Sat Feb  8 19:35:28 2003
@@ -120,9 +120,6 @@ int max_queued_signals = 1024;
 #define SIG_KERNEL_STOP_MASK (\
 	M(SIGSTOP)   |  M(SIGTSTP)   |  M(SIGTTIN)   |  M(SIGTTOU)   )
 
-#define SIG_KERNEL_CONT_MASK (\
-	M(SIGCONT)   |  M(SIGKILL)   )
-
 #define SIG_KERNEL_COREDUMP_MASK (\
         M(SIGQUIT)   |  M(SIGILL)    |  M(SIGTRAP)   |  M(SIGABRT)   | \
         M(SIGFPE)    |  M(SIGSEGV)   |  M(SIGBUS)    |  M(SIGSYS)    | \
@@ -139,8 +136,6 @@ int max_queued_signals = 1024;
 		(((sig) < SIGRTMIN)  && T(sig, SIG_KERNEL_IGNORE_MASK))
 #define sig_kernel_stop(sig) \
 		(((sig) < SIGRTMIN)  && T(sig, SIG_KERNEL_STOP_MASK))
-#define sig_kernel_cont(sig) \
-		(((sig) < SIGRTMIN)  && T(sig, SIG_KERNEL_CONT_MASK))
 
 #define sig_user_defined(t, signr) \
 	(((t)->sighand->action[(signr)-1].sa.sa_handler != SIG_DFL) &&	\
@@ -592,7 +587,7 @@ static void handle_stop_signal(int sig, 
 			t = next_thread(t);
 		} while (t != p);
 	}
-	else if (sig_kernel_cont(sig)) {
+	else if (sig == SIGCONT) {
 		/*
 		 * Remove all stop signals from all queues,
 		 * and wake all threads.
@@ -618,7 +613,8 @@ static void handle_stop_signal(int sig, 
 					p->group_leader,
 					p->group_leader->real_parent);
 		}
-		rm_from_queue(SIG_KERNEL_STOP_MASK, &p->signal->shared_pending);
+		rm_from_queue(SIG_KERNEL_STOP_MASK,
+			      &p->signal->shared_pending);
 		t = p;
 		do {
 			rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending);
@@ -637,17 +633,13 @@ static void handle_stop_signal(int sig, 
 			 * running the handler.  With the TIF_SIGPENDING
 			 * flag set, the thread will pause and acquire the
 			 * siglock that we hold now and until we've queued
-			 * the pending signal.  For SIGKILL, we likewise
-			 * don't want anybody doing anything but taking the
-			 * SIGKILL.  The only case in which a thread would
-			 * not already be in the signal dequeuing loop is
-			 * non-signal (e.g. syscall) ptrace tracing, so we
-			 * don't worry about an unnecessary trip through
-			 * the signal code and just keep this code path
-			 * simpler by unconditionally setting the flag.
+			 * the pending signal.
 			 */
-			set_tsk_thread_flag(t, TIF_SIGPENDING);
-			wake_up_process(t);
+			if (!(t->flags & PF_EXITING)) {
+				if (!sigismember(&t->blocked, SIGCONT))
+					set_tsk_thread_flag(t, TIF_SIGPENDING);
+				wake_up_process(t);
+			}
 			t = next_thread(t);
 		} while (t != p);
 	}
@@ -782,23 +774,25 @@ force_sig_specific(int sig, struct task_
 }
 
 /*
- * Test if P wants to take SIG.  After we've checked all threads with this,
- * it's equivalent to finding no threads not blocking SIG.  Any threads not
- * blocking SIG were ruled out because they are not running and already
- * have pending signals.  Such threads will dequeue from the shared queue
- * as soon as they're available, so putting the signal on the shared queue
- * will be equivalent to sending it to one such thread.
- */
-#define wants_signal(sig, p)	(!sigismember(&(p)->blocked, sig) \
-				 && (p)->state < TASK_STOPPED \
-				 && !((p)->flags & PF_EXITING) \
-				 && (task_curr(p) || !signal_pending(p)))
+ * Test if P wants to take SIG.  MASK gives P->state bits that rule it out.
+ * After we've checked all threads with this, it's equivalent to finding no
+ * threads not blocking SIG.  Any threads not blocking SIG were ruled out
+ * because they are not running and already have pending signals.  Such
+ * threads will dequeue from the shared queue as soon as they're available,
+ * so putting the signal on the shared queue will be equivalent to sending
+ * it to one such thread.
+ */
+#define wants_signal(sig, p, mask)				\
+	(!sigismember(&(p)->blocked, sig) &&			\
+	 !((p)->state & (mask)) && !((p)->flags & PF_EXITING) &&\
+	 (task_curr(p) || !signal_pending(p)))
 
 static inline int
 __group_send_sig_info(int sig, struct siginfo *info, struct task_struct *p)
 {
 	struct task_struct *t;
 	int ret;
+	unsigned int mask;
 
 #if CONFIG_SMP
 	if (!spin_is_locked(&p->sighand->siglock))
@@ -824,12 +818,20 @@ __group_send_sig_info(int sig, struct si
 		return ret;
 
 	/*
+	 * Signals are only accepted by non-stopped processes.
+	 * SIGKILL is the exception: it acts on stopped processes too.
+	 */
+	mask = TASK_ZOMBIE | TASK_DEAD;
+	if (sig != SIGKILL)
+		mask |= TASK_STOPPED;
+
+	/*
 	 * Now find a thread we can wake up to take the signal off the queue.
 	 *
 	 * If the main thread wants the signal, it gets first crack.
 	 * Probably the least surprising to the average bear.
 	 */
-	if (wants_signal(sig, p))
+	if (wants_signal(sig, p, mask))
 		t = p;
 	else if (thread_group_empty(p))
 		/*
@@ -847,7 +849,7 @@ __group_send_sig_info(int sig, struct si
 			t = p->signal->curr_target = p;
 		BUG_ON(t->tgid != p->tgid);
 
-		while (!wants_signal(sig, t)) {
+		while (!wants_signal(sig, t, mask)) {
 			t = next_thread(t);
 			if (t == p->signal->curr_target)
 				/*

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  3:37         ` Linus Torvalds
@ 2003-02-09  3:40           ` Linus Torvalds
  0 siblings, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2003-02-09  3:40 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Anton Blanchard, linux-kernel, Ingo Molnar, Andrew Morton, arjanv


On Sat, 8 Feb 2003, Linus Torvalds wrote:
> 
> Oops. Indeed. TASK_RUNNING isn't a bit at all, it's a lack of sleep.

So it might work this way around..

		Linus

---
===== kernel/signal.c 1.61 vs edited =====
--- 1.61/kernel/signal.c	Fri Feb  7 06:46:13 2003
+++ edited/kernel/signal.c	Sat Feb  8 19:39:48 2003
@@ -591,8 +591,7 @@
 			rm_from_queue(sigmask(SIGCONT), &t->pending);
 			t = next_thread(t);
 		} while (t != p);
-	}
-	else if (sig_kernel_cont(sig)) {
+	} else if (sig == SIGCONT) {
 		/*
 		 * Remove all stop signals from all queues,
 		 * and wake all threads.
@@ -637,17 +636,12 @@
 			 * running the handler.  With the TIF_SIGPENDING
 			 * flag set, the thread will pause and acquire the
 			 * siglock that we hold now and until we've queued
-			 * the pending signal.  For SIGKILL, we likewise
-			 * don't want anybody doing anything but taking the
-			 * SIGKILL.  The only case in which a thread would
-			 * not already be in the signal dequeuing loop is
-			 * non-signal (e.g. syscall) ptrace tracing, so we
-			 * don't worry about an unnecessary trip through
-			 * the signal code and just keep this code path
-			 * simpler by unconditionally setting the flag.
+			 * the pending signal. 
 			 */
-			set_tsk_thread_flag(t, TIF_SIGPENDING);
-			wake_up_process(t);
+			if (!(t->flags & PF_EXITING)) {
+				set_tsk_thread_flag(t, TIF_SIGPENDING);
+				wake_up_process(t);
+			}
 			t = next_thread(t);
 		} while (t != p);
 	}
@@ -789,15 +783,17 @@
  * as soon as they're available, so putting the signal on the shared queue
  * will be equivalent to sending it to one such thread.
  */
-#define wants_signal(sig, p)	(!sigismember(&(p)->blocked, sig) \
-				 && (p)->state < TASK_STOPPED \
-				 && !((p)->flags & PF_EXITING) \
-				 && (task_curr(p) || !signal_pending(p)))
+#define wants_signal(sig, p, mask) 			\
+	(!sigismember(&(p)->blocked, sig)		\
+	 && !((p)->state & mask)			\
+	 && !((p)->flags & PF_EXITING)			\
+	 && (task_curr(p) || !signal_pending(p)))
 
 static inline int
 __group_send_sig_info(int sig, struct siginfo *info, struct task_struct *p)
 {
 	struct task_struct *t;
+	unsigned int mask;
 	int ret;
 
 #if CONFIG_SMP
@@ -815,6 +811,14 @@
 		return 0;
 
 	/*
+	 * Don't bother zombies and stopped tasks (but
+	 * SIGKILL will punch through stopped state)
+	 */
+	mask = TASK_DEAD | TASK_ZOMBIE;
+	if (sig != SIGKILL)
+		mask |= TASK_STOPPED;
+
+	/*
 	 * Put this signal on the shared-pending queue, or fail with EAGAIN.
 	 * We always use the shared queue for process-wide signals,
 	 * to avoid several races.
@@ -829,7 +833,7 @@
 	 * If the main thread wants the signal, it gets first crack.
 	 * Probably the least surprising to the average bear.
 	 */
-	if (wants_signal(sig, p))
+	if (wants_signal(sig, p, mask))
 		t = p;
 	else if (thread_group_empty(p))
 		/*
@@ -847,7 +851,7 @@
 			t = p->signal->curr_target = p;
 		BUG_ON(t->tgid != p->tgid);
 
-		while (!wants_signal(sig, t)) {
+		while (!wants_signal(sig, t, mask)) {
 			t = next_thread(t);
 			if (t == p->signal->curr_target)
 				/*


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  3:33       ` Roland McGrath
@ 2003-02-09  3:37         ` Linus Torvalds
  2003-02-09  3:40           ` Linus Torvalds
  2003-02-09  3:48         ` Roland McGrath
  1 sibling, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2003-02-09  3:37 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Anton Blanchard, linux-kernel, Ingo Molnar, Andrew Morton, arjanv


On Sat, 8 Feb 2003, Roland McGrath wrote:
>
> Ah.  Deciding state should be treated as a bitmask is not so hot when
> TASK_RUNNING is 0.

Oops. Indeed. TASK_RUNNING isn't a bit at all, it's a lack of sleep.

Turning the logic the other way around:

	/* Ignore processes that are dead or stopped (except for SIGKILL) */
	mask = TASK_ZOMBIE | TASK_DEAD;
	if (sig != SIGKILL)
		mask != TASK_STOPPED;

and testing for !((p)->state & mask) should fix it. The mask ends up being 
the states that are _not_ good to send signals for ;)

		Linus


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  3:30     ` Roland McGrath
@ 2003-02-09  3:33       ` Roland McGrath
  2003-02-09  3:37         ` Linus Torvalds
  2003-02-09  3:48         ` Roland McGrath
  0 siblings, 2 replies; 27+ messages in thread
From: Roland McGrath @ 2003-02-09  3:33 UTC (permalink / raw)
  To: Linus Torvalds, Anton Blanchard, linux-kernel, Ingo Molnar,
	Andrew Morton, arjanv

Ah.  Deciding state should be treated as a bitmask is not so hot when
TASK_RUNNING is 0.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  2:33   ` Linus Torvalds
  2003-02-09  2:41     ` Roland McGrath
@ 2003-02-09  3:30     ` Roland McGrath
  2003-02-09  3:33       ` Roland McGrath
  1 sibling, 1 reply; 27+ messages in thread
From: Roland McGrath @ 2003-02-09  3:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Anton Blanchard, linux-kernel, Ingo Molnar, Andrew Morton, arjanv

There is definitely something after that patch.  gdb is broken in ways that
are being mysterious to me, and now zombies with ppid 1 are sticking around
forever.  With that patch, do:

  bash$ (sleep 2 & echo zombie $! ; exec sleep 10000000) & sleep 4; kill -9 $!
  [2] 1220
  zombie 1222
  bash$ ps l1222
  [2]+  Killed                  ( sleep 2 & echo zombie $!; exec sleep 10000000 )
  F   UID   PID  PPID PRI  NI   VSZ  RSS WCHAN  STAT TTY        TIME COMMAND
  0  5281  1222     1  15   0     0    0 wait4  Z    pts/1      0:00 [sleep <defun

Such zombies seem to stick around forever.  I must have managed to break
the normal delivery of SIGCHLD to init, but right now it's a mystery to me how.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  2:33   ` Linus Torvalds
@ 2003-02-09  2:41     ` Roland McGrath
  2003-02-09  3:30     ` Roland McGrath
  1 sibling, 0 replies; 27+ messages in thread
From: Roland McGrath @ 2003-02-09  2:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Anton Blanchard, linux-kernel, Ingo Molnar, Andrew Morton, arjanv

> Roland, does this look sane to you? 

Hey, that's my patch!  :-) Actually, mine has one line of difference, which
is not to set TIF_SIGPENDING when SIGCONT is blocked.  That difference is
visible to kernel threads that block SIGCONT and check signal_pending,
of which there are some though I don't recall understanding why.

I am still doing tests.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  2:31       ` Roland McGrath
@ 2003-02-09  2:34         ` Linus Torvalds
  0 siblings, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2003-02-09  2:34 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Anton Blanchard, linux-kernel, Ingo Molnar, Andrew Morton, arjanv


On Sat, 8 Feb 2003, Roland McGrath wrote:
> 
> I'm already in the midst of those and will test my various cases and gdb
> and so forth.  If you want to put something in before I've done testing,
> then please send me the patch you use.

You just got it. HOWEVER, I haven't actually committed it to my tree, and 
I might as well wait for your test results, so you can choose to just 
ignore it if you want to.

		Linus


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  2:00 ` Linus Torvalds
  2003-02-09  2:17   ` Roland McGrath
@ 2003-02-09  2:33   ` Linus Torvalds
  2003-02-09  2:41     ` Roland McGrath
  2003-02-09  3:30     ` Roland McGrath
  2003-02-10  8:53   ` Ingo Molnar
  2 siblings, 2 replies; 27+ messages in thread
From: Linus Torvalds @ 2003-02-09  2:33 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: linux-kernel, Ingo Molnar, Roland McGrath, Andrew Morton, arjanv


On Sat, 8 Feb 2003, Linus Torvalds wrote:
>
> And then the call trace would be very interesting indeed if the above bug 
> triggers.. (and unlike the do_exit() bug, the tri_to_wake_up() thing 
> should clearly pinpoint where the problem actually happens, knock wood).

Ok, let's assume it was the wake_up_process() in handle_stop_signal(), 
then hopefully this (totally untested) patch should fix it.

Roland, does this look sane to you? It makes the STOP wakeup be a 
SIGCONT-only thing, and makes the wakeup conditional on !PF_EXITING.

It also makes SIGKILL ignore the stopped status for delivery.

Does this make the gdb test-suite happy?

		Linus

---
===== kernel/signal.c 1.61 vs edited =====
--- 1.61/kernel/signal.c	Fri Feb  7 06:46:13 2003
+++ edited/kernel/signal.c	Sat Feb  8 18:29:11 2003
@@ -591,8 +591,7 @@
 			rm_from_queue(sigmask(SIGCONT), &t->pending);
 			t = next_thread(t);
 		} while (t != p);
-	}
-	else if (sig_kernel_cont(sig)) {
+	} else if (sig == SIGCONT) {
 		/*
 		 * Remove all stop signals from all queues,
 		 * and wake all threads.
@@ -637,17 +636,12 @@
 			 * running the handler.  With the TIF_SIGPENDING
 			 * flag set, the thread will pause and acquire the
 			 * siglock that we hold now and until we've queued
-			 * the pending signal.  For SIGKILL, we likewise
-			 * don't want anybody doing anything but taking the
-			 * SIGKILL.  The only case in which a thread would
-			 * not already be in the signal dequeuing loop is
-			 * non-signal (e.g. syscall) ptrace tracing, so we
-			 * don't worry about an unnecessary trip through
-			 * the signal code and just keep this code path
-			 * simpler by unconditionally setting the flag.
+			 * the pending signal. 
 			 */
-			set_tsk_thread_flag(t, TIF_SIGPENDING);
-			wake_up_process(t);
+			if (!(t->flags & PF_EXITING)) {
+				set_tsk_thread_flag(t, TIF_SIGPENDING);
+				wake_up_process(t);
+			}
 			t = next_thread(t);
 		} while (t != p);
 	}
@@ -789,15 +783,17 @@
  * as soon as they're available, so putting the signal on the shared queue
  * will be equivalent to sending it to one such thread.
  */
-#define wants_signal(sig, p)	(!sigismember(&(p)->blocked, sig) \
-				 && (p)->state < TASK_STOPPED \
-				 && !((p)->flags & PF_EXITING) \
-				 && (task_curr(p) || !signal_pending(p)))
+#define wants_signal(sig, p, mask) 			\
+	(!sigismember(&(p)->blocked, sig)		\
+	 && ((p)->state & mask)				\
+	 && !((p)->flags & PF_EXITING)			\
+	 && (task_curr(p) || !signal_pending(p)))
 
 static inline int
 __group_send_sig_info(int sig, struct siginfo *info, struct task_struct *p)
 {
 	struct task_struct *t;
+	unsigned int mask;
 	int ret;
 
 #if CONFIG_SMP
@@ -815,6 +811,14 @@
 		return 0;
 
 	/*
+	 * Normally we only post signals to non-stopped threads,
+	 * bit SIGKILL will puch through that too..
+	 */
+	mask = TASK_RUNNING | TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE;
+	if (sig == SIGKILL)
+		mask |= TASK_STOPPED;
+
+	/*
 	 * Put this signal on the shared-pending queue, or fail with EAGAIN.
 	 * We always use the shared queue for process-wide signals,
 	 * to avoid several races.
@@ -829,7 +833,7 @@
 	 * If the main thread wants the signal, it gets first crack.
 	 * Probably the least surprising to the average bear.
 	 */
-	if (wants_signal(sig, p))
+	if (wants_signal(sig, p, mask))
 		t = p;
 	else if (thread_group_empty(p))
 		/*
@@ -847,7 +851,7 @@
 			t = p->signal->curr_target = p;
 		BUG_ON(t->tgid != p->tgid);
 
-		while (!wants_signal(sig, t)) {
+		while (!wants_signal(sig, t, mask)) {
 			t = next_thread(t);
 			if (t == p->signal->curr_target)
 				/*


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  2:19     ` Linus Torvalds
@ 2003-02-09  2:31       ` Roland McGrath
  2003-02-09  2:34         ` Linus Torvalds
  0 siblings, 1 reply; 27+ messages in thread
From: Roland McGrath @ 2003-02-09  2:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Anton Blanchard, linux-kernel, Ingo Molnar, Andrew Morton, arjanv

> Anyway, I'll code up the SIGKILL changes that should make this a non-issue 

I'm already in the midst of those and will test my various cases and gdb
and so forth.  If you want to put something in before I've done testing,
then please send me the patch you use.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  2:17   ` Roland McGrath
@ 2003-02-09  2:19     ` Linus Torvalds
  2003-02-09  2:31       ` Roland McGrath
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2003-02-09  2:19 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Anton Blanchard, linux-kernel, Ingo Molnar, Andrew Morton, arjanv


On Sat, 8 Feb 2003, Roland McGrath wrote:
> 
> As I said above, I think this race is possible in other uses of
> wake_up_process.

I don't think you have any other users (other than signals)  that wake up
processes that aren't on some wait-queue.

And by the time we exit, we have better had removed outselves from all the
wait-queues, so I suspect signals are really the only thing that can wake
up a process after it died but before it's truly gone.

Anyway, I'll code up the SIGKILL changes that should make this a non-issue 
(along with the bad SIGKILL/kernel-thread interaction).

		Linus


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  2:00 ` Linus Torvalds
@ 2003-02-09  2:17   ` Roland McGrath
  2003-02-09  2:19     ` Linus Torvalds
  2003-02-09  2:33   ` Linus Torvalds
  2003-02-10  8:53   ` Ingo Molnar
  2 siblings, 1 reply; 27+ messages in thread
From: Roland McGrath @ 2003-02-09  2:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Anton Blanchard, linux-kernel, Ingo Molnar, Andrew Morton, arjanv

> By then we also have local interrupts disabled, and we've explicitly 
> disabled preemption, so I don't see how anything could ever wake us up any 
> more. 

We already know how this happens.  I think it's only possible with SMP.  
I described this very problem in the final paragraph of my penultimate
message on the signals changes:

    Incidentally, I've run across another bug introduced by the last rework of
    handle_stop_signal (or perhaps similar races have always been there, I'm
    not quite sure at the moment).  It can call wake_up_process on a zombie
    that's on its way to exit, triggering the BUG at the end of do_exit.  I
    think this race may be possible in all of the signal_wake_up calls for
    SIGKILL cases, and other uses of wake_up_process like PTRACE_KILL.
    Some such places check ->state, but they do not lock out exit races.
    Perhaps having wake_up_process itself be race-proof would be simplest.
    I don't have a good sense of how best to fix this one yet.

This will probably stop biting anyone in practice after the most recent new
plan for SIGKILL we've just been discussing.  For signals, the race will
only be possible for SIGCONT sent when a thread is on its way to die.  That
can be avoided by checking PF_EXITING in handle_stop_signal, because after
setting PF_EXITING any thread in do_exit will take the siglock and thus
can't have gotten far enough to go to TASK_ZOMBIE without being serialized
after the loop in handle_stop_signal.

As I said above, I think this race is possible in other uses of
wake_up_process.  PTRACE_KILL is one example, but there are others and I
would have to check carefully to be convinced that other factors rule out
this exit race for them.  I think that BUG_ON check should definitely go
into try_wake_up so that it hits should any of these other races ever
actually bite.  Unless I am missing something, it won't necessarily catch
all races unless you use xchg to set TASK_RUNNING and then check the old value.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: heavy handed exit() in latest BK
  2003-02-09  0:57 Anton Blanchard
@ 2003-02-09  2:00 ` Linus Torvalds
  2003-02-09  2:17   ` Roland McGrath
                     ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Linus Torvalds @ 2003-02-09  2:00 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: linux-kernel, Ingo Molnar, Roland McGrath, Andrew Morton, arjanv


On Sun, 9 Feb 2003, Anton Blanchard wrote:
> 
> From BK late last night (has everything except the most recent 2
> sound changesets)
> 
> kernel BUG at kernel/exit.c:710!
> 
> which is:
> 
> do_exit()
> ...
>         schedule();
>         BUG();
>         /* Avoid "noreturn function does return".  */
>         for (;;) ;
> }
> 
> Oops. In a police lineup Id finger the signal changes.

Interesting. Especially as the last thing exit_notify() does (just a few 
lines above the schedule()) is to do

        tsk->state = TASK_ZOMBIE;

and that schedule() _really_ really shouldn't return. Regardless of any 
signal handler changes.

By then we also have local interrupts disabled, and we've explicitly 
disabled preemption, so I don't see how anything could ever wake us up any 
more. 

But yes, I'd agree with your line-up, if for no other reason than the fact 
that I don't really see anything else even _remotely_ likely to muck 
around with process state.

I don't see this (obviously), but could you make "try_to_wake_up()" (in 
kernel/sched.c) do a

	BUG_ON(p->state & TASK_ZOMBIE);

to see if somebody tries to wake up a zombie task (maybe the signal 
handler stuff does that under some unlucky circumstances).

And then the call trace would be very interesting indeed if the above bug 
triggers.. (and unlike the do_exit() bug, the tri_to_wake_up() thing 
should clearly pinpoint where the problem actually happens, knock wood).

		Linus


^ permalink raw reply	[flat|nested] 27+ messages in thread

* heavy handed exit() in latest BK
@ 2003-02-09  0:57 Anton Blanchard
  2003-02-09  2:00 ` Linus Torvalds
  0 siblings, 1 reply; 27+ messages in thread
From: Anton Blanchard @ 2003-02-09  0:57 UTC (permalink / raw)
  To: torvalds; +Cc: linux-kernel


Hi,

>From BK late last night (has everything except the most recent 2
sound changesets)

kernel BUG at kernel/exit.c:710!

which is:

do_exit()
...
        schedule();
        BUG();
        /* Avoid "noreturn function does return".  */
        for (;;) ;
}

Oops. In a police lineup Id finger the signal changes.

Anton

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2003-02-10 15:16 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <200302091130.h19BU2107744@magilla.sf.frob.com>
2003-02-09 11:40 ` heavy handed exit() in latest BK Ingo Molnar
2003-02-09 11:56   ` Roland McGrath
2003-02-09 12:09     ` Ingo Molnar
2003-02-09 12:18     ` Ingo Molnar
2003-02-09 12:23       ` Ingo Molnar
2003-02-09 12:22         ` Arjan van de Ven
2003-02-10  1:07     ` Linus Torvalds
2003-02-10  1:27       ` Roland McGrath
2003-02-09  0:57 Anton Blanchard
2003-02-09  2:00 ` Linus Torvalds
2003-02-09  2:17   ` Roland McGrath
2003-02-09  2:19     ` Linus Torvalds
2003-02-09  2:31       ` Roland McGrath
2003-02-09  2:34         ` Linus Torvalds
2003-02-09  2:33   ` Linus Torvalds
2003-02-09  2:41     ` Roland McGrath
2003-02-09  3:30     ` Roland McGrath
2003-02-09  3:33       ` Roland McGrath
2003-02-09  3:37         ` Linus Torvalds
2003-02-09  3:40           ` Linus Torvalds
2003-02-09  3:48         ` Roland McGrath
2003-02-09  4:51           ` Linus Torvalds
2003-02-09  4:57             ` Linus Torvalds
2003-02-09  5:00             ` Roland McGrath
2003-02-09  9:28             ` Russell King
2003-02-10  8:53   ` Ingo Molnar
2003-02-10 15:22     ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).