linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* kjournald keeps reference to namespace
@ 2006-02-18  1:35 Herbert Poetzl
  2006-02-18  1:54 ` Andrew Morton
  2006-02-18 13:36 ` Björn Steinbrink
  0 siblings, 2 replies; 10+ messages in thread
From: Herbert Poetzl @ 2006-02-18  1:35 UTC (permalink / raw)
  To: Andrew Morton, Al Viro; +Cc: Linux Kernel ML


Hi Folks!

when creating a private namespace (CLONE_NS) and
then mounting an ext3 filesystem, a new kernel
thread (kjournald) is created, which keeps a
reference to the namespace, which after the the
process exits, remains and blocks access to the
block device, as it is still bd_claim-ed.

this leaves a private namespace behind and a
block device which cannot be opened exclusively.
unmount is not an option, as the namespace is
not longer reachable.

this behaviour seems to be there since ever,
well since namespaces and kjournald exists :)

the following 'cruel' hack 'solves' this issue

best,
Herbert


--- fs/jbd/journal.c.orig	2006-01-03 17:29:56 +0100
+++ fs/jbd/journal.c	2006-02-18 02:23:21 +0100
@@ -33,6 +33,7 @@
 #include <linux/mm.h>
 #include <linux/suspend.h>
 #include <linux/pagemap.h>
+#include <linux/namespace.h>
 #include <asm/uaccess.h>
 #include <asm/page.h>
 #include <linux/proc_fs.h>
@@ -116,6 +117,13 @@ static int kjournald(void *arg)
 	struct timer_list timer;
 
 	daemonize("kjournald");
+	{
+		struct namespace *ns = current->namespace;
+
+		current->namespace = NULL;
+		put_namespace(ns);
+	}
+
 
 	/* Set up an interval timer which can be used to trigger a
            commit wakeup after the commit interval expires */


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kjournald keeps reference to namespace
  2006-02-18  1:35 kjournald keeps reference to namespace Herbert Poetzl
@ 2006-02-18  1:54 ` Andrew Morton
  2006-02-18  3:30   ` Herbert Poetzl
  2006-02-18 13:36 ` Björn Steinbrink
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2006-02-18  1:54 UTC (permalink / raw)
  To: Herbert Poetzl; +Cc: viro, linux-kernel

Herbert Poetzl <herbert@13thfloor.at> wrote:
>
> 
> Hi Folks!
> 
> when creating a private namespace (CLONE_NS) and
> then mounting an ext3 filesystem, a new kernel
> thread (kjournald) is created, which keeps a
> reference to the namespace, which after the the
> process exits, remains and blocks access to the
> block device, as it is still bd_claim-ed.

There are numerous ways in which user processes can parent kernel threads.

bix:/usr/src/linux-2.6.16-rc4> grep -rl kernel_thread drivers net fs | wc
     64      64    1657

> this leaves a private namespace behind and a
> block device which cannot be opened exclusively.
> unmount is not an option, as the namespace is
> not longer reachable.
> 
> this behaviour seems to be there since ever,
> well since namespaces and kjournald exists :)
> 
> the following 'cruel' hack 'solves' this issue
> 
> best,
> Herbert
> 
> 
> --- fs/jbd/journal.c.orig	2006-01-03 17:29:56 +0100
> +++ fs/jbd/journal.c	2006-02-18 02:23:21 +0100
> @@ -33,6 +33,7 @@
>  #include <linux/mm.h>
>  #include <linux/suspend.h>
>  #include <linux/pagemap.h>
> +#include <linux/namespace.h>
>  #include <asm/uaccess.h>
>  #include <asm/page.h>
>  #include <linux/proc_fs.h>
> @@ -116,6 +117,13 @@ static int kjournald(void *arg)
>  	struct timer_list timer;
>  
>  	daemonize("kjournald");
> +	{
> +		struct namespace *ns = current->namespace;
> +
> +		current->namespace = NULL;
> +		put_namespace(ns);
> +	}
> +
>  

I think it'd be better to convert ext3 to use the kthread API which appears
to accidentally not have this problem, because such threads are parented by
keventd, which were parented by init.

That being said, perhaps we should do a put_namespace() in kernel_thread(),
too.

I'm kinda surprised that your patch didn't oops over a NULL ->namespace
when the kernel internally mounted the root filesystem.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kjournald keeps reference to namespace
  2006-02-18  1:54 ` Andrew Morton
@ 2006-02-18  3:30   ` Herbert Poetzl
  2006-02-24 21:28     ` Paul Collins
  0 siblings, 1 reply; 10+ messages in thread
From: Herbert Poetzl @ 2006-02-18  3:30 UTC (permalink / raw)
  To: Andrew Morton; +Cc: viro, linux-kernel

On Fri, Feb 17, 2006 at 05:54:28PM -0800, Andrew Morton wrote:
> Herbert Poetzl <herbert@13thfloor.at> wrote:
> >
> > 
> > Hi Folks!
> > 
> > when creating a private namespace (CLONE_NS) and
> > then mounting an ext3 filesystem, a new kernel
> > thread (kjournald) is created, which keeps a
> > reference to the namespace, which after the the
> > process exits, remains and blocks access to the
> > block device, as it is still bd_claim-ed.
> 
> There are numerous ways in which user processes can parent kernel threads.
> 
> bix:/usr/src/linux-2.6.16-rc4> grep -rl kernel_thread drivers net fs | wc
>      64      64    1657
> 
> > this leaves a private namespace behind and a
> > block device which cannot be opened exclusively.
> > unmount is not an option, as the namespace is
> > not longer reachable.
> > 
> > this behaviour seems to be there since ever,
> > well since namespaces and kjournald exists :)
> > 
> > the following 'cruel' hack 'solves' this issue
> > 
> > best,
> > Herbert
> > 
> > 
> > --- fs/jbd/journal.c.orig	2006-01-03 17:29:56 +0100
> > +++ fs/jbd/journal.c	2006-02-18 02:23:21 +0100
> > @@ -33,6 +33,7 @@
> >  #include <linux/mm.h>
> >  #include <linux/suspend.h>
> >  #include <linux/pagemap.h>
> > +#include <linux/namespace.h>
> >  #include <asm/uaccess.h>
> >  #include <asm/page.h>
> >  #include <linux/proc_fs.h>
> > @@ -116,6 +117,13 @@ static int kjournald(void *arg)
> >  	struct timer_list timer;
> >  
> >  	daemonize("kjournald");
> > +	{
> > +		struct namespace *ns = current->namespace;
> > +
> > +		current->namespace = NULL;
> > +		put_namespace(ns);
> > +	}
> > +
> >  
> 
> I think it'd be better to convert ext3 to use the kthread API which
> appears to accidentally not have this problem, because such threads
> are parented by keventd, which were parented by init.

sounds like a plan!

> That being said, perhaps we should do a put_namespace() in
> kernel_thread(), too.

hmm, keep the reference but put_namespace()?

> I'm kinda surprised that your patch didn't oops over a NULL
> ->namespace when the kernel internally mounted the root filesystem.

nope, booted just fine, but I was worried too, the
interesting detail is that the kjournald thread 
vanishes with this 'hack' when the namespace is
disposed, which doesn't happen without it ...

anyway, will look into it, of course, input is
welcome ...

best,
Herbert



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kjournald keeps reference to namespace
  2006-02-18  1:35 kjournald keeps reference to namespace Herbert Poetzl
  2006-02-18  1:54 ` Andrew Morton
@ 2006-02-18 13:36 ` Björn Steinbrink
  2006-02-18 16:32   ` Björn Steinbrink
  1 sibling, 1 reply; 10+ messages in thread
From: Björn Steinbrink @ 2006-02-18 13:36 UTC (permalink / raw)
  To: herbert; +Cc: akpm, viro, linux-kernel

On 2006.02.18 02:35:47 +0100, Herbert Poetzl wrote:
> 
> Hi Folks!
> 
> when creating a private namespace (CLONE_NS) and
> then mounting an ext3 filesystem, a new kernel
> thread (kjournald) is created, which keeps a
> reference to the namespace, which after the the
> process exits, remains and blocks access to the
> block device, as it is still bd_claim-ed.
> 
> this leaves a private namespace behind and a
> block device which cannot be opened exclusively.
> unmount is not an option, as the namespace is
> not longer reachable.
> 
> this behaviour seems to be there since ever,
> well since namespaces and kjournald exists :)
> 
> the following 'cruel' hack 'solves' this issue

In daemonize() a new thread gets cleaned up and 'merged' with init_task.
The current fs_struct is handled there, but not the current namespace.
The following patch adds the namespace part.

Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
---


diff -NurpP --minimal linux-2.6.16-rc4/kernel/exit.c linux-2.6.16-rc4-ns/kernel/exit.c
--- linux-2.6.16-rc4/kernel/exit.c	2006-02-18 13:59:59.000000000 +0100
+++ linux-2.6.16-rc4-ns/kernel/exit.c	2006-02-18 14:04:26.000000000 +0100
@@ -360,6 +360,8 @@ void daemonize(const char *name, ...)
 	fs = init_task.fs;
 	current->fs = fs;
 	atomic_inc(&fs->count);
+	exit_namespace(current);
+	current->namespace = init_task.namespace;
  	exit_files(current);
 	current->files = init_task.files;
 	atomic_inc(&current->files->count);

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kjournald keeps reference to namespace
  2006-02-18 13:36 ` Björn Steinbrink
@ 2006-02-18 16:32   ` Björn Steinbrink
  2006-02-18 17:12     ` Björn Steinbrink
  0 siblings, 1 reply; 10+ messages in thread
From: Björn Steinbrink @ 2006-02-18 16:32 UTC (permalink / raw)
  To: herbert, akpm, viro, linux-kernel

> In daemonize() a new thread gets cleaned up and 'merged' with init_task.
> The current fs_struct is handled there, but not the current namespace.
> The following patch adds the namespace part.
> 
> Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
> ---

Oops, forgot the increment the namespace usage count...
---


diff -NurpP --minimal linux-2.6.16-rc4/kernel/exit.c linux-2.6.16-rc4-ns/kernel/exit.c
--- linux-2.6.16-rc4/kernel/exit.c	2006-02-18 13:59:59.000000000 +0100
+++ linux-2.6.16-rc4-ns/kernel/exit.c	2006-02-18 17:27:48.000000000 +0100
@@ -360,6 +360,9 @@ void daemonize(const char *name, ...)
 	fs = init_task.fs;
 	current->fs = fs;
 	atomic_inc(&fs->count);
+	exit_namespace(current);
+	current->namespace = init_task.namespace;
+	atomic_inc(&current->namespace->count);
  	exit_files(current);
 	current->files = init_task.files;
 	atomic_inc(&current->files->count);

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kjournald keeps reference to namespace
  2006-02-18 16:32   ` Björn Steinbrink
@ 2006-02-18 17:12     ` Björn Steinbrink
  2006-02-19  2:32       ` Eric W. Biederman
  0 siblings, 1 reply; 10+ messages in thread
From: Björn Steinbrink @ 2006-02-18 17:12 UTC (permalink / raw)
  To: herbert, akpm, viro, linux-kernel; +Cc: ebiederm, torvalds

On 2006.02.18 17:32:27 +0100, Björn Steinbrink wrote:
> > In daemonize() a new thread gets cleaned up and 'merged' with init_task.
> > The current fs_struct is handled there, but not the current namespace.
> > The following patch adds the namespace part.
> > 
> > Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
> > ---
> 
> Oops, forgot the increment the namespace usage count...

Ok, this time with the get_namespace wrapper, thanks to Eric Biederman
for pointing that out to me.
---


diff -NurpP --minimal linux-2.6.16-rc4/kernel/exit.c linux-2.6.16-rc4-ns/kernel/exit.c
--- linux-2.6.16-rc4/kernel/exit.c	2006-02-18 13:59:59.000000000 +0100
+++ linux-2.6.16-rc4-ns/kernel/exit.c	2006-02-18 17:57:21.000000000 +0100
@@ -360,6 +360,9 @@ void daemonize(const char *name, ...)
 	fs = init_task.fs;
 	current->fs = fs;
 	atomic_inc(&fs->count);
+	exit_namespace(current);
+	current->namespace = init_task.namespace;
+	get_namespace(current->namespace);
  	exit_files(current);
 	current->files = init_task.files;
 	atomic_inc(&current->files->count);

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kjournald keeps reference to namespace
  2006-02-18 17:12     ` Björn Steinbrink
@ 2006-02-19  2:32       ` Eric W. Biederman
  0 siblings, 0 replies; 10+ messages in thread
From: Eric W. Biederman @ 2006-02-19  2:32 UTC (permalink / raw)
  To: Björn Steinbrink; +Cc: herbert, akpm, viro, linux-kernel, torvalds

Björn Steinbrink <B.Steinbrink@gmx.de> writes:

> On 2006.02.18 17:32:27 +0100, Björn Steinbrink wrote:
>> > In daemonize() a new thread gets cleaned up and 'merged' with init_task.
>> > The current fs_struct is handled there, but not the current namespace.
>> > The following patch adds the namespace part.
>> > 
>> > Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
>> > ---
>> 
>> Oops, forgot the increment the namespace usage count...
>
> Ok, this time with the get_namespace wrapper, thanks to Eric Biederman
> for pointing that out to me.

Acked-by: Eric Biederman <ebiederm@xmission.com>

Note we can't ever count on using our parents namespace because
we already have called exit_fs(), which is the only way to the
namespace from a process.

> ---
>
>
> diff -NurpP --minimal linux-2.6.16-rc4/kernel/exit.c
> linux-2.6.16-rc4-ns/kernel/exit.c
> --- linux-2.6.16-rc4/kernel/exit.c	2006-02-18 13:59:59.000000000 +0100
> +++ linux-2.6.16-rc4-ns/kernel/exit.c	2006-02-18 17:57:21.000000000 +0100
> @@ -360,6 +360,9 @@ void daemonize(const char *name, ...)
>  	fs = init_task.fs;
>  	current->fs = fs;
>  	atomic_inc(&fs->count);
> +	exit_namespace(current);
> +	current->namespace = init_task.namespace;
> +	get_namespace(current->namespace);
>   	exit_files(current);
>  	current->files = init_task.files;
>  	atomic_inc(&current->files->count);

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kjournald keeps reference to namespace
  2006-02-18  3:30   ` Herbert Poetzl
@ 2006-02-24 21:28     ` Paul Collins
  2006-02-24 21:36       ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Paul Collins @ 2006-02-24 21:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

Herbert Poetzl <herbert@13thfloor.at> writes:

> On Fri, Feb 17, 2006 at 05:54:28PM -0800, Andrew Morton wrote:
>> I think it'd be better to convert ext3 to use the kthread API which
>> appears to accidentally not have this problem, because such threads
>> are parented by keventd, which were parented by init.
>
> sounds like a plan!

Here's my attempt at such a conversion.  Since jbd doesn't seem to
want to collect an exit status, I didn't bother with kthread_stop().

I got overexcited and also embedded the journal device in the process
name, but that's probably useless churn.  Looks nice in pstree though:

     |-kthread-+-kblockd/0
     |         |-khubd
     |         |-2*[pdflush]
     |         |-aio/0
     |         |-v9fs/0
     |         |-cqueue/0
     |         |-kfand
     |         |-kcryptd/0
     |         |-kjournald/3:3
     |         |-kjournald/3:8
     |         |-kjournald/3:4
     |         |-kjournald/3:5
     |         `-kjournald/254:1


Signed-off-by: Paul Collins <paul@ondioline.org>

diff --git a/fs/jbd/journal.c b/fs/jbd/journal.c
index e4b516a..e33d993 100644
--- a/fs/jbd/journal.c
+++ b/fs/jbd/journal.c
@@ -33,6 +33,8 @@
 #include <linux/mm.h>
 #include <linux/suspend.h>
 #include <linux/pagemap.h>
+#include <linux/kthread.h>
+#include <linux/kdev_t.h>
 #include <asm/uaccess.h>
 #include <asm/page.h>
 #include <linux/proc_fs.h>
@@ -115,8 +117,6 @@ static int kjournald(void *arg)
 	transaction_t *transaction;
 	struct timer_list timer;
 
-	daemonize("kjournald");
-
 	/* Set up an interval timer which can be used to trigger a
            commit wakeup after the commit interval expires */
 	init_timer(&timer);
@@ -207,12 +207,14 @@ end_loop:
 	journal->j_task = NULL;
 	wake_up(&journal->j_wait_done_commit);
 	jbd_debug(1, "Journal thread exiting.\n");
-	return 0;
+	do_exit(0);
 }
 
 static void journal_start_thread(journal_t *journal)
 {
-	kernel_thread(kjournald, journal, CLONE_VM|CLONE_FS|CLONE_FILES);
+	dev_t jdev = journal->j_dev->bd_dev;
+	kthread_run(kjournald, journal, "kjournald/%d:%d",
+		    MAJOR(jdev), MINOR(jdev), NULL);
 	wait_event(journal->j_wait_done_commit, journal->j_task != 0);
 }
 


-- 
Dag vijandelijk luchtschip de huismeester is dood

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: kjournald keeps reference to namespace
  2006-02-24 21:28     ` Paul Collins
@ 2006-02-24 21:36       ` Andrew Morton
  2006-02-24 22:01         ` Paul Collins
  0 siblings, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2006-02-24 21:36 UTC (permalink / raw)
  To: Paul Collins; +Cc: linux-kernel

Paul Collins <paul@briny.ondioline.org> wrote:
>
>  Herbert Poetzl <herbert@13thfloor.at> writes:
> 
>  > On Fri, Feb 17, 2006 at 05:54:28PM -0800, Andrew Morton wrote:
>  >> I think it'd be better to convert ext3 to use the kthread API which
>  >> appears to accidentally not have this problem, because such threads
>  >> are parented by keventd, which were parented by init.
>  >
>  > sounds like a plan!
> 
>  Here's my attempt at such a conversion.  Since jbd doesn't seem to
>  want to collect an exit status, I didn't bother with kthread_stop().

Ah.  I already did something similar. 
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.16-rc4/2.6.16-rc4-mm2/broken-out/jbd-convert-kjournald-to-kthread-api.patch

>  I got overexcited and also embedded the journal device in the process
>  name, but that's probably useless churn.  Looks nice in pstree though:
> 
>       |-kthread-+-kblockd/0
>       |         |-khubd
>       |         |-2*[pdflush]
>       |         |-aio/0
>       |         |-v9fs/0
>       |         |-cqueue/0
>       |         |-kfand
>       |         |-kcryptd/0
>       |         |-kjournald/3:3
>       |         |-kjournald/3:8
>       |         |-kjournald/3:4
>       |         |-kjournald/3:5
>       |         `-kjournald/254:1

We only have 15 chars for that string - the final one you have there is on
the raggedy edge.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kjournald keeps reference to namespace
  2006-02-24 21:36       ` Andrew Morton
@ 2006-02-24 22:01         ` Paul Collins
  0 siblings, 0 replies; 10+ messages in thread
From: Paul Collins @ 2006-02-24 22:01 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

Andrew Morton <akpm@osdl.org> writes:

> Paul Collins <paul@briny.ondioline.org> wrote:
>>
>>  Herbert Poetzl <herbert@13thfloor.at> writes:
>> 
>>  > On Fri, Feb 17, 2006 at 05:54:28PM -0800, Andrew Morton wrote:
>>  >> I think it'd be better to convert ext3 to use the kthread API which
>>  >> appears to accidentally not have this problem, because such threads
>>  >> are parented by keventd, which were parented by init.
>>  >
>>  > sounds like a plan!
>> 
>>  Here's my attempt at such a conversion.  Since jbd doesn't seem to
>>  want to collect an exit status, I didn't bother with kthread_stop().
>
> Ah.  I already did something similar. 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.16-rc4/2.6.16-rc4-mm2/broken-out/jbd-convert-kjournald-to-kthread-api.patch

Sweet, I managed to get it right at least.

>>  I got overexcited and also embedded the journal device in the process
>>  name, but that's probably useless churn.  Looks nice in pstree though:
>> 
>>       |         `-kjournald/254:1
>
> We only have 15 chars for that string - the final one you have there is on
> the raggedy edge.

Oh well, it's not that useful anyway.  I can count the number of times
I've needed to kill a particular kjournald on the fingers of one head.

-- 
Dag vijandelijk luchtschip de huismeester is dood

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2006-02-24 22:02 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-02-18  1:35 kjournald keeps reference to namespace Herbert Poetzl
2006-02-18  1:54 ` Andrew Morton
2006-02-18  3:30   ` Herbert Poetzl
2006-02-24 21:28     ` Paul Collins
2006-02-24 21:36       ` Andrew Morton
2006-02-24 22:01         ` Paul Collins
2006-02-18 13:36 ` Björn Steinbrink
2006-02-18 16:32   ` Björn Steinbrink
2006-02-18 17:12     ` Björn Steinbrink
2006-02-19  2:32       ` Eric W. Biederman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).