linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* BUG can fill process table
@ 2001-08-03 12:18 André Cruz
  2001-08-03 19:44 ` Andrew Morton
  0 siblings, 1 reply; 6+ messages in thread
From: André Cruz @ 2001-08-03 12:18 UTC (permalink / raw)
  To: linux-kernel

When a process (say dhcpcd) changes the IFF_UP flag to TRUE on an
interface (say eth0) to bring it up, a new process is created named
after the interface (eth0 in this case) and it's PPID is dhcpcd. 

If dhcpcd later changes the IFF_UP flag to FALSE to bring the interface
down, the eth0 process dies but stays as a zombie. The problem is that
dhcpcd never receives a sigchld (suposedly eth0 is it's child) and even
if it executes a wait() no process is reaped and wait() returns -1. 

The worse part of this is that when dhcpcd wants to bring up the
interface again a NEW eth0 process is created and so this starts to fill
up the process table. 

I see two solutions for this: either the interface process start with a
PPID of 1 (I noticed that init has no problems dealing with them when
they die) or dhcpcd should receive a sigchld and be able to reap them. 

Btw why are these process even created? 2.2 didn't do it I think.

I don't know where the problem lies so if someone could tell me who to
contact about this that would be great. Or maybe if this is already
known?

Thanks


----------
André Cruz


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: BUG can fill process table
  2001-08-03 12:18 BUG can fill process table André Cruz
@ 2001-08-03 19:44 ` Andrew Morton
  2001-08-03 19:58   ` link status changes forwarded to userspace? Anders Eriksson
                     ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Andrew Morton @ 2001-08-03 19:44 UTC (permalink / raw)
  To: André Cruz; +Cc: linux-kernel

André Cruz wrote:
> 
> When a process (say dhcpcd) changes the IFF_UP flag to TRUE on an
> interface (say eth0) to bring it up, a new process is created named
> after the interface (eth0 in this case) and it's PPID is dhcpcd.
> 
> If dhcpcd later changes the IFF_UP flag to FALSE to bring the interface
> down, the eth0 process dies but stays as a zombie. The problem is that
> dhcpcd never receives a sigchld (suposedly eth0 is it's child) and even
> if it executes a wait() no process is reaped and wait() returns -1.
> 
> The worse part of this is that when dhcpcd wants to bring up the
> interface again a NEW eth0 process is created and so this starts to fill
> up the process table.
> 
> I see two solutions for this: either the interface process start with a
> PPID of 1 (I noticed that init has no problems dealing with them when
> they die) or dhcpcd should receive a sigchld and be able to reap them.
> 
> Btw why are these process even created? 2.2 didn't do it I think.
> 
> I don't know where the problem lies so if someone could tell me who to
> contact about this that would be great. Or maybe if this is already
> known?

Oh.  That one again.

Could you please try the below patch?

We really need to fix this - we shouldn't be giving random
userspace tasks surprise children just because some syscall decided
to create a kernel thread.

This patch _should_ fix it, as well as the problem reported in
April at http://uwsg.iu.edu/hypermail/linux/kernel/0104.1/0763.html
But the patch may be incomplete - I did it back in April and got
distracted partway through testing.

It reparents to init OK, but it's only a partial solution - the other
problem we have is that these kernel threads will inherit things
like UIDs, scheduling priority, scheduling policy, etc from the
parent.  I haven't seen any bug reports attributable to this, but....

I think we should have a standalone function which performs this
reparenting and also initialises siginficant values in *current to
useful values.


--- linux-2.4.4-pre3/kernel/sched.c	Sun Apr 15 15:34:25 2001
+++ linux-akpm/kernel/sched.c	Mon Apr 16 20:40:47 2001
@@ -32,6 +32,7 @@
 extern void timer_bh(void);
 extern void tqueue_bh(void);
 extern void immediate_bh(void);
+extern struct task_struct *child_reaper;
 
 /*
  * scheduler variables
@@ -1260,32 +1261,53 @@
 /*
  *	Put all the gunge required to become a kernel thread without
  *	attached user resources in one place where it belongs.
+ *
+ * 	Kernel 2.4.4-pre3, andrewm@uow.edu.au: reparent the caller
+ *	to init and set the exit signal to SIGCHLD so the thread
+ *	will be properly reaped if it exist.
  */
 
 void daemonize(void)
 {
 	struct fs_struct *fs;
-
+	struct task_struct *this_task = current;
 
 	/*
 	 * If we were started as result of loading a module, close all of the
 	 * user space pages.  We don't need them, and if we didn't close them
 	 * they would be locked into memory.
 	 */
-	exit_mm(current);
+	exit_mm(this_task);
 
-	current->session = 1;
-	current->pgrp = 1;
+	this_task->session = 1;
+	this_task->pgrp = 1;
 
 	/* Become as one with the init task */
 
-	exit_fs(current);	/* current->fs->count--; */
+	exit_fs(this_task);		/* this_task->fs->count--; */
 	fs = init_task.fs;
-	current->fs = fs;
+	this_task->fs = fs;
 	atomic_inc(&fs->count);
- 	exit_files(current);
-	current->files = init_task.files;
-	atomic_inc(&current->files->count);
+ 	exit_files(this_task);		/* this_task->files->count-- */
+	this_task->files = init_task.files;
+	atomic_inc(&this_task->files->count);
+
+	write_lock_irq(&tasklist_lock);
+
+	/* Reparent to init */
+	REMOVE_LINKS(this_task);
+	this_task->p_pptr = child_reaper;
+	this_task->p_opptr = child_reaper;
+	SET_LINKS(this_task);
+
+	/* Set the exit signal to SIGCHLD so we signal init on exit */
+	if (this_task->exit_signal != 0) {
+		printk(KERN_ERR "task `%s' exit_signal %d in daemonize()\n",
+			this_task->comm, this_task->exit_signal);
+	}
+	this_task->exit_signal = SIGCHLD;
+
+	write_unlock_irq(&tasklist_lock);
 }
 
 void __init init_idle(void)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* link status changes forwarded to userspace?
  2001-08-03 19:44 ` Andrew Morton
@ 2001-08-03 19:58   ` Anders Eriksson
  2001-08-03 20:06     ` Justin Guyett
  2001-08-03 20:09   ` BUG can fill process table André Cruz
  2001-08-03 22:57   ` André Cruz
  2 siblings, 1 reply; 6+ messages in thread
From: Anders Eriksson @ 2001-08-03 19:58 UTC (permalink / raw)
  To: linux-kernel


Hi,
Is it possible to get the link status change (e.g. eth cable inserted in card) 
forwarded to userspace? I'd be cool to auto trigger dhcp on connect and 
silently skip the interface if no cable during boot.

/Anders


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: link status changes forwarded to userspace?
  2001-08-03 19:58   ` link status changes forwarded to userspace? Anders Eriksson
@ 2001-08-03 20:06     ` Justin Guyett
  0 siblings, 0 replies; 6+ messages in thread
From: Justin Guyett @ 2001-08-03 20:06 UTC (permalink / raw)
  To: Anders Eriksson; +Cc: linux-kernel

On Fri, 3 Aug 2001, Anders Eriksson wrote:

> Is it possible to get the link status change (e.g. eth cable inserted in card)
> forwarded to userspace? I'd be cool to auto trigger dhcp on connect and
> silently skip the interface if no cable during boot.

you can get the link status from userspace, and that should be enough to
skip no-link interfaces during boot.  mii-tool does it (comes with
net-tools).

justin


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: BUG can fill process table
  2001-08-03 19:44 ` Andrew Morton
  2001-08-03 19:58   ` link status changes forwarded to userspace? Anders Eriksson
@ 2001-08-03 20:09   ` André Cruz
  2001-08-03 22:57   ` André Cruz
  2 siblings, 0 replies; 6+ messages in thread
From: André Cruz @ 2001-08-03 20:09 UTC (permalink / raw)
  To: 'Andrew Morton'; +Cc: linux-kernel

I will be leaving tomorrow for vacations and can't test this patch right
now. I'll be back in 10 days and if it hasn't been tested enough by then
I'll apply it.

Meanwhile if someone else could look into this issue I would appreciate
it. It's surprising no one else has been annoyed by this issue. :)

Btw here is a ps output of my router after dhcpcd has been running for a
few days:

sprawl# ./ps -fx
  PID TTY      STAT   TIME COMMAND
    7 ?        SW     0:00 [kupdated]
    6 ?        SW     0:00 [bdflush]
    5 ?        SW     0:00 [kreclaimd]
    4 ?        SW     0:00 [kswapd]
    3 ?        SWN    0:01 [ksoftirqd_CPU0]
    1 ?        S      0:01 init [3] 
    2 ?        SW     0:00 [keventd]
  289 ?        S      0:01 /sbin/syslogd -m 0
  292 ?        S      0:00 /sbin/klogd
  301 ?        SW     0:00 [eth1]
  317 ?        S      0:00 /usr/sbin/cron
  330 ?        S      0:41 /usr/sbin/sshd
11190 ?        S      0:00  \_ /usr/sbin/sshd
11192 ttyp0    S      0:00      \_ -sh
11228 ttyp0    R      0:00          \_ ./ps -fx
 9684 ?        S      0:03 dhcpcd eth0
 9728 ?        Z      0:00  \_ [eth0 <defunct>]
 9731 ?        Z      0:00  \_ [eth0 <defunct>]
 9735 ?        Z      0:00  \_ [eth0 <defunct>]
 9742 ?        Z      0:00  \_ [eth0 <defunct>]
 9752 ?        SW     0:00  \_ [eth0]
sprawl# 

Thanks for the help!


----------
André Cruz

-----Original Message-----
From: akpm@vasquez.zip.com.au [mailto:akpm@vasquez.zip.com.au] On Behalf
Of Andrew Morton
Sent: sexta-feira, 3 de Agosto de 2001 20:45
To: André Cruz
Cc: linux-kernel@vger.kernel.org
Subject: Re: BUG can fill process table


André Cruz wrote:
> 
> When a process (say dhcpcd) changes the IFF_UP flag to TRUE on an 
> interface (say eth0) to bring it up, a new process is created named 
> after the interface (eth0 in this case) and it's PPID is dhcpcd.
> 
> If dhcpcd later changes the IFF_UP flag to FALSE to bring the 
> interface down, the eth0 process dies but stays as a zombie. The 
> problem is that dhcpcd never receives a sigchld (suposedly eth0 is 
> it's child) and even if it executes a wait() no process is reaped and 
> wait() returns -1.
> 
> The worse part of this is that when dhcpcd wants to bring up the 
> interface again a NEW eth0 process is created and so this starts to 
> fill up the process table.
 
Oh.  That one again.

Could you please try the below patch?

We really need to fix this - we shouldn't be giving random userspace
tasks surprise children just because some syscall decided to create a
kernel thread.

This patch _should_ fix it, as well as the problem reported in April at
http://uwsg.iu.edu/hypermail/linux/kernel/0104.1/0763.html
But the patch may be incomplete - I did it back in April and got
distracted partway through testing.

It reparents to init OK, but it's only a partial solution - the other
problem we have is that these kernel threads will inherit things like
UIDs, scheduling priority, scheduling policy, etc from the parent.  I
haven't seen any bug reports attributable to this, but....

I think we should have a standalone function which performs this
reparenting and also initialises siginficant values in *current to
useful values.


--- linux-2.4.4-pre3/kernel/sched.c	Sun Apr 15 15:34:25 2001
+++ linux-akpm/kernel/sched.c	Mon Apr 16 20:40:47 2001
@@ -32,6 +32,7 @@
 extern void timer_bh(void);
 extern void tqueue_bh(void);
 extern void immediate_bh(void);
+extern struct task_struct *child_reaper;
 
 /*
  * scheduler variables
@@ -1260,32 +1261,53 @@
 /*
  *	Put all the gunge required to become a kernel thread without
  *	attached user resources in one place where it belongs.
+ *
+ * 	Kernel 2.4.4-pre3, andrewm@uow.edu.au: reparent the caller
+ *	to init and set the exit signal to SIGCHLD so the thread
+ *	will be properly reaped if it exist.
  */
 
 void daemonize(void)
 {
 	struct fs_struct *fs;
-
+	struct task_struct *this_task = current;
 
 	/*
 	 * If we were started as result of loading a module, close all
of the
 	 * user space pages.  We don't need them, and if we didn't close
them
 	 * they would be locked into memory.
 	 */
-	exit_mm(current);
+	exit_mm(this_task);
 
-	current->session = 1;
-	current->pgrp = 1;
+	this_task->session = 1;
+	this_task->pgrp = 1;
 
 	/* Become as one with the init task */
 
-	exit_fs(current);	/* current->fs->count--; */
+	exit_fs(this_task);		/* this_task->fs->count--; */
 	fs = init_task.fs;
-	current->fs = fs;
+	this_task->fs = fs;
 	atomic_inc(&fs->count);
- 	exit_files(current);
-	current->files = init_task.files;
-	atomic_inc(&current->files->count);
+ 	exit_files(this_task);		/* this_task->files->count-- */
+	this_task->files = init_task.files;
+	atomic_inc(&this_task->files->count);
+
+	write_lock_irq(&tasklist_lock);
+
+	/* Reparent to init */
+	REMOVE_LINKS(this_task);
+	this_task->p_pptr = child_reaper;
+	this_task->p_opptr = child_reaper;
+	SET_LINKS(this_task);
+
+	/* Set the exit signal to SIGCHLD so we signal init on exit */
+	if (this_task->exit_signal != 0) {
+		printk(KERN_ERR "task `%s' exit_signal %d in
daemonize()\n",
+			this_task->comm, this_task->exit_signal);
+	}
+	this_task->exit_signal = SIGCHLD;
+
+	write_unlock_irq(&tasklist_lock);
 }
 
 void __init init_idle(void)


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: BUG can fill process table
  2001-08-03 19:44 ` Andrew Morton
  2001-08-03 19:58   ` link status changes forwarded to userspace? Anders Eriksson
  2001-08-03 20:09   ` BUG can fill process table André Cruz
@ 2001-08-03 22:57   ` André Cruz
  2 siblings, 0 replies; 6+ messages in thread
From: André Cruz @ 2001-08-03 22:57 UTC (permalink / raw)
  To: 'Andrew Morton'; +Cc: linux-kernel

I still managed to patch my kernel and from what I could see it works.
Interface processes are created with a PPID of 1 and when someone brings
the interface down they are reaped by init. I don't know if there are
further tests you wanted me to make... The machine has been running for
about an hour and so far no side effects are seen...

Will this be incorporated in the next official kernel? What remains to
be made?

Thanks for your help.


----------
André Cruz

> -----Original Message-----
> From: akpm@vasquez.zip.com.au 
> [mailto:akpm@vasquez.zip.com.au] On Behalf Of Andrew Morton
> Sent: sexta-feira, 3 de Agosto de 2001 20:45
> To: André Cruz
> Cc: linux-kernel@vger.kernel.org
> Subject: Re: BUG can fill process table
> 
> 
> André Cruz wrote:
> > 
> > When a process (say dhcpcd) changes the IFF_UP flag to TRUE on an 
> > interface (say eth0) to bring it up, a new process is created named 
> > after the interface (eth0 in this case) and it's PPID is dhcpcd.
> > 
> > If dhcpcd later changes the IFF_UP flag to FALSE to bring the 
> > interface down, the eth0 process dies but stays as a zombie. The 
> > problem is that dhcpcd never receives a sigchld (suposedly eth0 is 
> > it's child) and even if it executes a wait() no process is 
> reaped and 
> > wait() returns -1.
> > 
> > The worse part of this is that when dhcpcd wants to bring up the 
> > interface again a NEW eth0 process is created and so this starts to 
> > fill up the process table.
> > 
> > I see two solutions for this: either the interface process 
> start with 
> > a PPID of 1 (I noticed that init has no problems dealing with them 
> > when they die) or dhcpcd should receive a sigchld and be 
> able to reap 
> > them.
> > 
> > Btw why are these process even created? 2.2 didn't do it I think.
> > 
> > I don't know where the problem lies so if someone could 
> tell me who to 
> > contact about this that would be great. Or maybe if this is already 
> > known?
> 
> Oh.  That one again.
> 
> Could you please try the below patch?
> 
> We really need to fix this - we shouldn't be giving random 
> userspace tasks surprise children just because some syscall 
> decided to create a kernel thread.
> 
> This patch _should_ fix it, as well as the problem reported 
> in April at http://uwsg.iu.edu/hypermail/linux/kernel/0104.1/0763.html
> But the patch may be incomplete - I did it back in April and 
> got distracted partway through testing.
> 
> It reparents to init OK, but it's only a partial solution - 
> the other problem we have is that these kernel threads will 
> inherit things like UIDs, scheduling priority, scheduling 
> policy, etc from the parent.  I haven't seen any bug reports 
> attributable to this, but....
> 
> I think we should have a standalone function which performs 
> this reparenting and also initialises siginficant values in 
> *current to useful values.
> 
> 
> --- linux-2.4.4-pre3/kernel/sched.c	Sun Apr 15 15:34:25 2001
> +++ linux-akpm/kernel/sched.c	Mon Apr 16 20:40:47 2001
> @@ -32,6 +32,7 @@
>  extern void timer_bh(void);
>  extern void tqueue_bh(void);
>  extern void immediate_bh(void);
> +extern struct task_struct *child_reaper;
>  
>  /*
>   * scheduler variables
> @@ -1260,32 +1261,53 @@
>  /*
>   *	Put all the gunge required to become a kernel thread without
>   *	attached user resources in one place where it belongs.
> + *
> + * 	Kernel 2.4.4-pre3, andrewm@uow.edu.au: reparent the caller
> + *	to init and set the exit signal to SIGCHLD so the thread
> + *	will be properly reaped if it exist.
>   */
>  
>  void daemonize(void)
>  {
>  	struct fs_struct *fs;
> -
> +	struct task_struct *this_task = current;
>  
>  	/*
>  	 * If we were started as result of loading a module, 
> close all of the
>  	 * user space pages.  We don't need them, and if we 
> didn't close them
>  	 * they would be locked into memory.
>  	 */
> -	exit_mm(current);
> +	exit_mm(this_task);
>  
> -	current->session = 1;
> -	current->pgrp = 1;
> +	this_task->session = 1;
> +	this_task->pgrp = 1;
>  
>  	/* Become as one with the init task */
>  
> -	exit_fs(current);	/* current->fs->count--; */
> +	exit_fs(this_task);		/* this_task->fs->count--; */
>  	fs = init_task.fs;
> -	current->fs = fs;
> +	this_task->fs = fs;
>  	atomic_inc(&fs->count);
> - 	exit_files(current);
> -	current->files = init_task.files;
> -	atomic_inc(&current->files->count);
> + 	exit_files(this_task);		/* this_task->files->count-- */
> +	this_task->files = init_task.files;
> +	atomic_inc(&this_task->files->count);
> +
> +	write_lock_irq(&tasklist_lock);
> +
> +	/* Reparent to init */
> +	REMOVE_LINKS(this_task);
> +	this_task->p_pptr = child_reaper;
> +	this_task->p_opptr = child_reaper;
> +	SET_LINKS(this_task);
> +
> +	/* Set the exit signal to SIGCHLD so we signal init on exit */
> +	if (this_task->exit_signal != 0) {
> +		printk(KERN_ERR "task `%s' exit_signal %d in 
> daemonize()\n",
> +			this_task->comm, this_task->exit_signal);
> +	}
> +	this_task->exit_signal = SIGCHLD;
> +
> +	write_unlock_irq(&tasklist_lock);
>  }
>  
>  void __init init_idle(void)
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2001-08-03 22:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-08-03 12:18 BUG can fill process table André Cruz
2001-08-03 19:44 ` Andrew Morton
2001-08-03 19:58   ` link status changes forwarded to userspace? Anders Eriksson
2001-08-03 20:06     ` Justin Guyett
2001-08-03 20:09   ` BUG can fill process table André Cruz
2001-08-03 22:57   ` André Cruz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).