All of lore.kernel.org
 help / color / mirror / Atom feed
* initcall dependency problem (ns vs. threads)
@ 2011-08-01 18:01 ` Vasiliy Kulikov
  0 siblings, 0 replies; 53+ messages in thread
From: Vasiliy Kulikov @ 2011-08-01 18:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, kernel-hardening, Paul E. McKenney, Manuel Lauss,
	linux-kernel, Richard Weinberger, torvalds, Marc Zyngier

Hi,

There were reported problems with recent shm changes, by Manuel
Lauss (on MIPS), Richard Weinberger (on UML), and Marc Zyngier (on ARM).

https://lkml.org/lkml/2011/8/1/149
https://lkml.org/lkml/2011/8/1/162
https://lkml.org/lkml/2011/8/1/210

The problem became visible on this patch:

    commit 5774ed014f02120db9a6945a1ecebeb97c2acccb
    Author: Vasiliy Kulikov <segoon@openwall.com>
    Date:   Fri Jul 29 03:55:31 2011 +0400

        shm: handle separate PID namespaces case

It started to use &shm_ids(ns).rw_mutex, which is not initialized yet.
Init IPC namespace is initialized as initcall() and some threads are
created as early_initcall().

I threat it is a dependency bug in the core kernel - kernel threads
should be able to use any namespace information, but currently there is
a race between namespace initialization code (which is initcall) and
kernel threads (which are early_initcall).

I don't feel enough experienced in init code dependencies, so I report
it to you.

    static int __init kernel_init(void * unused)
    {
        ...
        do_pre_smp_initcalls(); << threads start here
        ...
        do_basic_setup();


    static void __init do_basic_setup(void)
    {
        cpuset_init_smp();
        usermodehelper_init();
        init_tmpfs();
        driver_init();
        init_irq_proc();
        do_ctors();
        do_initcalls(); << namespace init here
    }

Thanks,

-- 
Vasiliy

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] initcall dependency problem (ns vs. threads)
@ 2011-08-01 18:01 ` Vasiliy Kulikov
  0 siblings, 0 replies; 53+ messages in thread
From: Vasiliy Kulikov @ 2011-08-01 18:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, kernel-hardening, Paul E. McKenney, Manuel Lauss,
	linux-kernel, Richard Weinberger, torvalds, Marc Zyngier

Hi,

There were reported problems with recent shm changes, by Manuel
Lauss (on MIPS), Richard Weinberger (on UML), and Marc Zyngier (on ARM).

https://lkml.org/lkml/2011/8/1/149
https://lkml.org/lkml/2011/8/1/162
https://lkml.org/lkml/2011/8/1/210

The problem became visible on this patch:

    commit 5774ed014f02120db9a6945a1ecebeb97c2acccb
    Author: Vasiliy Kulikov <segoon@openwall.com>
    Date:   Fri Jul 29 03:55:31 2011 +0400

        shm: handle separate PID namespaces case

It started to use &shm_ids(ns).rw_mutex, which is not initialized yet.
Init IPC namespace is initialized as initcall() and some threads are
created as early_initcall().

I threat it is a dependency bug in the core kernel - kernel threads
should be able to use any namespace information, but currently there is
a race between namespace initialization code (which is initcall) and
kernel threads (which are early_initcall).

I don't feel enough experienced in init code dependencies, so I report
it to you.

    static int __init kernel_init(void * unused)
    {
        ...
        do_pre_smp_initcalls(); << threads start here
        ...
        do_basic_setup();


    static void __init do_basic_setup(void)
    {
        cpuset_init_smp();
        usermodehelper_init();
        init_tmpfs();
        driver_init();
        init_irq_proc();
        do_ctors();
        do_initcalls(); << namespace init here
    }

Thanks,

-- 
Vasiliy

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: initcall dependency problem (ns vs. threads)
  2011-08-01 18:01 ` [kernel-hardening] " Vasiliy Kulikov
@ 2011-08-01 18:20   ` Andrew Morton
  -1 siblings, 0 replies; 53+ messages in thread
From: Andrew Morton @ 2011-08-01 18:20 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Ingo Molnar, kernel-hardening, Paul E. McKenney, Manuel Lauss,
	linux-kernel, Richard Weinberger, torvalds, Marc Zyngier

On Mon, 1 Aug 2011 22:01:51 +0400 Vasiliy Kulikov <segoon@openwall.com> wrote:

> Hi,
> 
> There were reported problems with recent shm changes, by Manuel
> Lauss (on MIPS), Richard Weinberger (on UML), and Marc Zyngier (on ARM).
> 
> https://lkml.org/lkml/2011/8/1/149
> https://lkml.org/lkml/2011/8/1/162
> https://lkml.org/lkml/2011/8/1/210
> 
> The problem became visible on this patch:
> 
>     commit 5774ed014f02120db9a6945a1ecebeb97c2acccb
>     Author: Vasiliy Kulikov <segoon@openwall.com>
>     Date:   Fri Jul 29 03:55:31 2011 +0400
> 
>         shm: handle separate PID namespaces case
> 
> It started to use &shm_ids(ns).rw_mutex, which is not initialized yet.
> Init IPC namespace is initialized as initcall() and some threads are
> created as early_initcall().
> 
> I threat it is a dependency bug in the core kernel - kernel threads
> should be able to use any namespace information, but currently there is
> a race between namespace initialization code (which is initcall) and
> kernel threads (which are early_initcall).
> 
> I don't feel enough experienced in init code dependencies, so I report
> it to you.
> 
>     static int __init kernel_init(void * unused)
>     {
>         ...
>         do_pre_smp_initcalls(); << threads start here
>         ...
>         do_basic_setup();
> 
> 
>     static void __init do_basic_setup(void)
>     {
>         cpuset_init_smp();
>         usermodehelper_init();
>         init_tmpfs();
>         driver_init();
>         init_irq_proc();
>         do_ctors();
>         do_initcalls(); << namespace init here
>     }

There's not really enough detail here for me to suggest a fix without
actually doing some work.  Which ipc initialization function is being
called to late?  Which thread is using which data structures before
which initialization function has been run?

Are we talking about init_ipc_ns.ids[] here?  If so, did you try
initializing the three rwsems at compile-time?

That's rather a nasty hack though.  It'd be better to run the mystery
init function before starting the threads.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: initcall dependency problem (ns vs. threads)
@ 2011-08-01 18:20   ` Andrew Morton
  0 siblings, 0 replies; 53+ messages in thread
From: Andrew Morton @ 2011-08-01 18:20 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Ingo Molnar, kernel-hardening, Paul E. McKenney, Manuel Lauss,
	linux-kernel, Richard Weinberger, torvalds, Marc Zyngier

On Mon, 1 Aug 2011 22:01:51 +0400 Vasiliy Kulikov <segoon@openwall.com> wrote:

> Hi,
> 
> There were reported problems with recent shm changes, by Manuel
> Lauss (on MIPS), Richard Weinberger (on UML), and Marc Zyngier (on ARM).
> 
> https://lkml.org/lkml/2011/8/1/149
> https://lkml.org/lkml/2011/8/1/162
> https://lkml.org/lkml/2011/8/1/210
> 
> The problem became visible on this patch:
> 
>     commit 5774ed014f02120db9a6945a1ecebeb97c2acccb
>     Author: Vasiliy Kulikov <segoon@openwall.com>
>     Date:   Fri Jul 29 03:55:31 2011 +0400
> 
>         shm: handle separate PID namespaces case
> 
> It started to use &shm_ids(ns).rw_mutex, which is not initialized yet.
> Init IPC namespace is initialized as initcall() and some threads are
> created as early_initcall().
> 
> I threat it is a dependency bug in the core kernel - kernel threads
> should be able to use any namespace information, but currently there is
> a race between namespace initialization code (which is initcall) and
> kernel threads (which are early_initcall).
> 
> I don't feel enough experienced in init code dependencies, so I report
> it to you.
> 
>     static int __init kernel_init(void * unused)
>     {
>         ...
>         do_pre_smp_initcalls(); << threads start here
>         ...
>         do_basic_setup();
> 
> 
>     static void __init do_basic_setup(void)
>     {
>         cpuset_init_smp();
>         usermodehelper_init();
>         init_tmpfs();
>         driver_init();
>         init_irq_proc();
>         do_ctors();
>         do_initcalls(); << namespace init here
>     }

There's not really enough detail here for me to suggest a fix without
actually doing some work.  Which ipc initialization function is being
called to late?  Which thread is using which data structures before
which initialization function has been run?

Are we talking about init_ipc_ns.ids[] here?  If so, did you try
initializing the three rwsems at compile-time?

That's rather a nasty hack though.  It'd be better to run the mystery
init function before starting the threads.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [kernel-hardening] Re: initcall dependency problem (ns vs. threads)
  2011-08-01 18:20   ` [kernel-hardening] " Andrew Morton
  (?)
@ 2011-08-01 18:34   ` Vasiliy Kulikov
  -1 siblings, 0 replies; 53+ messages in thread
From: Vasiliy Kulikov @ 2011-08-01 18:34 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Ingo Molnar, Paul E. McKenney, Manuel Lauss, linux-kernel,
	Richard Weinberger, torvalds, Marc Zyngier

On Mon, Aug 01, 2011 at 11:20 -0700, Andrew Morton wrote:
> There's not really enough detail here for me to suggest a fix without
> actually doing some work.  Which ipc initialization function is being
> called to late?

The call sequence is:

static int __init ipc_init(void)
{
    ...
	shm_init();
    ...
}
__initcall(ipc_init);

void __init shm_init (void)
{
	shm_init_ns(&init_ipc_ns);
    ...

void shm_init_ns(struct ipc_namespace *ns)
{
    ...
	ipc_init_ids(&shm_ids(ns));

void ipc_init_ids(struct ipc_ids *ids)
{
	init_rwsem(&ids->rw_mutex);
    ...


The code triggering the oops (called from do_exit()):

void exit_shm(struct task_struct *task)
{
    ...
	down_write(&shm_ids(ns).rw_mutex);

>  Which thread is using which data structures before
> which initialization function has been run?

Actually, it doesn't matter.  If ANY thread exits before init_rwsem()
then exit_shm() would use uninitialized shm_ids(ns).rw_mutex.


> Are we talking about init_ipc_ns.ids[] here?  If so, did you try
> initializing the three rwsems at compile-time?

No, good idea.  I'll do it.

IMO moving specific initializer is bad by design.  There should be a
guarantee what resources are accessible on what boot stage.  I suppose
it should be: all thread related information (including ns data) is
accessible for the moment of threads' code execution.


Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [kernel-hardening] Re: initcall dependency problem (ns vs. threads)
  2011-08-01 18:20   ` [kernel-hardening] " Andrew Morton
  (?)
  (?)
@ 2011-08-01 19:03   ` Vasiliy Kulikov
  2011-08-01 19:07     ` Andrew Morton
  2011-08-02  0:01     ` Linus Torvalds
  -1 siblings, 2 replies; 53+ messages in thread
From: Vasiliy Kulikov @ 2011-08-01 19:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ingo Molnar, kernel-hardening, Paul E. McKenney, Manuel Lauss,
	linux-kernel, Richard Weinberger, torvalds, Marc Zyngier

> Are we talking about init_ipc_ns.ids[] here?  If so, did you try
> initializing the three rwsems at compile-time?
> 
> That's rather a nasty hack though.  It'd be better to run the mystery
> init function before starting the threads.

Looks like it solves the race.  However, I think it should be solved on
another level.  Other bugs might be hidden with this race.


Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [kernel-hardening] Re: initcall dependency problem (ns vs. threads)
  2011-08-01 19:03   ` Vasiliy Kulikov
@ 2011-08-01 19:07     ` Andrew Morton
  2011-08-01 19:22       ` Vasiliy Kulikov
  2011-08-02  0:01     ` Linus Torvalds
  1 sibling, 1 reply; 53+ messages in thread
From: Andrew Morton @ 2011-08-01 19:07 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Ingo Molnar, kernel-hardening, Paul E. McKenney, Manuel Lauss,
	linux-kernel, Richard Weinberger, torvalds, Marc Zyngier

On Mon, 1 Aug 2011 23:03:41 +0400
Vasiliy Kulikov <segoon@openwall.com> wrote:

> > Are we talking about init_ipc_ns.ids[] here?  If so, did you try
> > initializing the three rwsems at compile-time?
> > 
> > That's rather a nasty hack though.  It'd be better to run the mystery
> > init function before starting the threads.
> 
> Looks like it solves the race.

What patch are you talking about here?

>  However, I think it should be solved on
> another level.

What level?

>  Other bugs might be hidden with this race.

What bugs?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [kernel-hardening] Re: initcall dependency problem (ns vs. threads)
  2011-08-01 19:07     ` Andrew Morton
@ 2011-08-01 19:22       ` Vasiliy Kulikov
  0 siblings, 0 replies; 53+ messages in thread
From: Vasiliy Kulikov @ 2011-08-01 19:22 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Ingo Molnar, Paul E. McKenney, Manuel Lauss, linux-kernel,
	Richard Weinberger, torvalds, Marc Zyngier

On Mon, Aug 01, 2011 at 12:07 -0700, Andrew Morton wrote:
> On Mon, 1 Aug 2011 23:03:41 +0400
> Vasiliy Kulikov <segoon@openwall.com> wrote:
> 
> > > Are we talking about init_ipc_ns.ids[] here?  If so, did you try
> > > initializing the three rwsems at compile-time?
> > > 
> > > That's rather a nasty hack though.  It'd be better to run the mystery
> > > init function before starting the threads.
> > 
> > Looks like it solves the race.
> 
> What patch are you talking about here?

Sorry for short sentences :)  I tried the patch you've suggested -
initialize rw_mutex in the init_ipc_ns declaration.  Surely, it solves a
specific race.  As no kernel threads actually use shm, other fields are
not needed to be initialized before do_initcall().

However, it is a bit ugly as it divides namespace initialization code
into init_ipc_ns initialization and other namespaces.  It's better to
use the same code for all namespaces (as it currently is).


> >  However, I think it should be solved on
> > another level.
> 
> What level?

I mean it is a bug of _implicit_ assume that kthreads don't use ns
related information.  So, AFAICS, it can be fixed 2 ways:

1) Move creations of kernel threads somewhere after namespaces
initializations in the init chain.

2) Deferring threads creation until all ns initialization is done.


> >  Other bugs might be hidden with this race.
> 
> What bugs?

I don't speak about specific bugs (I know the only one, which is this shm
related bug), but I suppose some threads might use some ns related
information as well.  At least I don't see whether it is somehow
explicitly denied currently.

Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [kernel-hardening] Re: initcall dependency problem (ns vs. threads)
  2011-08-01 19:03   ` Vasiliy Kulikov
  2011-08-01 19:07     ` Andrew Morton
@ 2011-08-02  0:01     ` Linus Torvalds
  2011-08-02 12:45         ` [kernel-hardening] " Vasiliy Kulikov
  1 sibling, 1 reply; 53+ messages in thread
From: Linus Torvalds @ 2011-08-02  0:01 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney,
	Manuel Lauss, linux-kernel, Richard Weinberger, Marc Zyngier

On Mon, Aug 1, 2011 at 9:03 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>> Are we talking about init_ipc_ns.ids[] here?  If so, did you try
>> initializing the three rwsems at compile-time?
>>
>> That's rather a nasty hack though.  It'd be better to run the mystery
>> init function before starting the threads.
>
> Looks like it solves the race.  However, I think it should be solved on
> another level.  Other bugs might be hidden with this race.

Statically initializing code data structures in order to avoid these
kinds of problems is the correct fix.

Can you send me the patch?

              Linus

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-02  0:01     ` Linus Torvalds
@ 2011-08-02 12:45         ` Vasiliy Kulikov
  0 siblings, 0 replies; 53+ messages in thread
From: Vasiliy Kulikov @ 2011-08-02 12:45 UTC (permalink / raw)
  To: Linus Torvalds, Manuel Lauss, linux-kernel, Richard Weinberger,
	Marc Zyngier
  Cc: Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney

Hi,

Manuel, Richard, Marc - can you test this patch, please?


From: Vasiliy Kulikov <segoon@openwall.com>
Subject: [PATCH] shm: fix a race between shm_exit() and shm_init()

On thread exit shm_exit() is called, it uses shm_ids(ns).rw_mutex.
It is initialized in shm_init(), but it is not called yet at the moment
of kernel threads exit.  Some kernel threads are created in
do_pre_smp_initcalls(), and shm_init() is called in do_initcalls().

Static initialization of shm_ids(init_ipc_ns).rw_mutex fixes the race.

It fixes a kernel oops:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
...
[<c0320090>] (__down_write_nested+0x88/0xe0) from [<c015da08>] (exit_shm+0x28/0x48)
[<c015da08>] (exit_shm+0x28/0x48) from [<c002e550>] (do_exit+0x59c/0x750)
[<c002e550>] (do_exit+0x59c/0x750) from [<c003eaac>] (____call_usermodehelper+0x13c/0x154)
[<c003eaac>] (____call_usermodehelper+0x13c/0x154) from [<c000f630>] (kernel_thread_exit+0x0/0x8)
Code: 1afffffa e597c00c e58d0000 e587d00c (e58cd000)

Reported-by: Manuel Lauss <manuel.lauss@googlemail.com>
Reported-by: Richard Weinberger <richard@nod.at>
Reported-by: Marc Zyngier <maz@misterjones.org>
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
---
 ipc/msgutil.c |    6 ++++++
 ipc/shm.c     |   11 ++++++++++-
 ipc/util.c    |   30 +++++++++++++++++-------------
 ipc/util.h    |    1 +
 4 files changed, 34 insertions(+), 14 deletions(-)

diff --git a/ipc/msgutil.c b/ipc/msgutil.c
index 8b5ce5d..6da67b6 100644
--- a/ipc/msgutil.c
+++ b/ipc/msgutil.c
@@ -20,6 +20,9 @@
 
 DEFINE_SPINLOCK(mq_lock);
 
+#define INIT_IPC_SHM_IDS(name) \
+	{ .rw_mutex = __RWSEM_INITIALIZER(name.rw_mutex), }
+
 /*
  * The next 2 defines are here bc this is the only file
  * compiled when either CONFIG_SYSVIPC and CONFIG_POSIX_MQUEUE
@@ -27,6 +30,9 @@ DEFINE_SPINLOCK(mq_lock);
  */
 struct ipc_namespace init_ipc_ns = {
 	.count		= ATOMIC_INIT(1),
+	.ids	= {
+		[IPC_SHM_IDS] = INIT_IPC_SHM_IDS(init_ipc_ns.ids[IPC_SHM_IDS]),
+	},
 #ifdef CONFIG_POSIX_MQUEUE
 	.mq_queues_max   = DFLT_QUEUESMAX,
 	.mq_msg_max      = DFLT_MSGMAX,
diff --git a/ipc/shm.c b/ipc/shm.c
index 9fb044f..7d084a0 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -76,7 +76,16 @@ void shm_init_ns(struct ipc_namespace *ns)
 	ns->shm_ctlmni = SHMMNI;
 	ns->shm_rmid_forced = 0;
 	ns->shm_tot = 0;
-	ipc_init_ids(&shm_ids(ns));
+
+	/*
+	 * For init_ipc_ns shm_ids().rw_mutex is statically initialized
+	 * as kernel threads should be able to use it in do_exit() before
+	 * shm_init(), which is called on do_initcall()
+	 */
+	if (ns == &init_ipc_ns)
+		__ipc_init_ids(&shm_ids(ns));
+	else
+		ipc_init_ids(&shm_ids(ns));
 }
 
 /*
diff --git a/ipc/util.c b/ipc/util.c
index 75261a3..673dde5 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -108,31 +108,35 @@ static int __init ipc_init(void)
 }
 __initcall(ipc_init);
 
-/**
- *	ipc_init_ids		-	initialise IPC identifiers
- *	@ids: Identifier set
- *
- *	Set up the sequence range to use for the ipc identifier range (limited
- *	below IPCMNI) then initialise the ids idr.
- */
- 
-void ipc_init_ids(struct ipc_ids *ids)
+void __ipc_init_ids(struct ipc_ids *ids)
 {
-	init_rwsem(&ids->rw_mutex);
-
 	ids->in_use = 0;
 	ids->seq = 0;
 	{
 		int seq_limit = INT_MAX/SEQ_MULTIPLIER;
 		if (seq_limit > USHRT_MAX)
 			ids->seq_max = USHRT_MAX;
-		 else
-		 	ids->seq_max = seq_limit;
+		else
+			ids->seq_max = seq_limit;
 	}
 
 	idr_init(&ids->ipcs_idr);
 }
 
+/**
+ *	ipc_init_ids		-	initialise IPC identifiers
+ *	@ids: Identifier set
+ *
+ *	Set up the sequence range to use for the ipc identifier range (limited
+ *	below IPCMNI) then initialise the ids idr.
+ */
+
+void ipc_init_ids(struct ipc_ids *ids)
+{
+	init_rwsem(&ids->rw_mutex);
+	__ipc_init_ids(ids);
+}
+
 #ifdef CONFIG_PROC_FS
 static const struct file_operations sysvipc_proc_fops;
 /**
diff --git a/ipc/util.h b/ipc/util.h
index 6f5c20b..8bbcd9c 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -80,6 +80,7 @@ struct seq_file;
 struct ipc_ids;
 
 void ipc_init_ids(struct ipc_ids *);
+void __ipc_init_ids(struct ipc_ids *ids);
 #ifdef CONFIG_PROC_FS
 void __init ipc_init_proc_interface(const char *path, const char *header,
 		int ids, int (*show)(struct seq_file *, void *));
-- 
1.7.0.4


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [kernel-hardening] [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-02 12:45         ` Vasiliy Kulikov
  0 siblings, 0 replies; 53+ messages in thread
From: Vasiliy Kulikov @ 2011-08-02 12:45 UTC (permalink / raw)
  To: Linus Torvalds, Manuel Lauss, linux-kernel, Richard Weinberger,
	Marc Zyngier
  Cc: Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney

Hi,

Manuel, Richard, Marc - can you test this patch, please?


From: Vasiliy Kulikov <segoon@openwall.com>
Subject: [PATCH] shm: fix a race between shm_exit() and shm_init()

On thread exit shm_exit() is called, it uses shm_ids(ns).rw_mutex.
It is initialized in shm_init(), but it is not called yet at the moment
of kernel threads exit.  Some kernel threads are created in
do_pre_smp_initcalls(), and shm_init() is called in do_initcalls().

Static initialization of shm_ids(init_ipc_ns).rw_mutex fixes the race.

It fixes a kernel oops:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
...
[<c0320090>] (__down_write_nested+0x88/0xe0) from [<c015da08>] (exit_shm+0x28/0x48)
[<c015da08>] (exit_shm+0x28/0x48) from [<c002e550>] (do_exit+0x59c/0x750)
[<c002e550>] (do_exit+0x59c/0x750) from [<c003eaac>] (____call_usermodehelper+0x13c/0x154)
[<c003eaac>] (____call_usermodehelper+0x13c/0x154) from [<c000f630>] (kernel_thread_exit+0x0/0x8)
Code: 1afffffa e597c00c e58d0000 e587d00c (e58cd000)

Reported-by: Manuel Lauss <manuel.lauss@googlemail.com>
Reported-by: Richard Weinberger <richard@nod.at>
Reported-by: Marc Zyngier <maz@misterjones.org>
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
---
 ipc/msgutil.c |    6 ++++++
 ipc/shm.c     |   11 ++++++++++-
 ipc/util.c    |   30 +++++++++++++++++-------------
 ipc/util.h    |    1 +
 4 files changed, 34 insertions(+), 14 deletions(-)

diff --git a/ipc/msgutil.c b/ipc/msgutil.c
index 8b5ce5d..6da67b6 100644
--- a/ipc/msgutil.c
+++ b/ipc/msgutil.c
@@ -20,6 +20,9 @@
 
 DEFINE_SPINLOCK(mq_lock);
 
+#define INIT_IPC_SHM_IDS(name) \
+	{ .rw_mutex = __RWSEM_INITIALIZER(name.rw_mutex), }
+
 /*
  * The next 2 defines are here bc this is the only file
  * compiled when either CONFIG_SYSVIPC and CONFIG_POSIX_MQUEUE
@@ -27,6 +30,9 @@ DEFINE_SPINLOCK(mq_lock);
  */
 struct ipc_namespace init_ipc_ns = {
 	.count		= ATOMIC_INIT(1),
+	.ids	= {
+		[IPC_SHM_IDS] = INIT_IPC_SHM_IDS(init_ipc_ns.ids[IPC_SHM_IDS]),
+	},
 #ifdef CONFIG_POSIX_MQUEUE
 	.mq_queues_max   = DFLT_QUEUESMAX,
 	.mq_msg_max      = DFLT_MSGMAX,
diff --git a/ipc/shm.c b/ipc/shm.c
index 9fb044f..7d084a0 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -76,7 +76,16 @@ void shm_init_ns(struct ipc_namespace *ns)
 	ns->shm_ctlmni = SHMMNI;
 	ns->shm_rmid_forced = 0;
 	ns->shm_tot = 0;
-	ipc_init_ids(&shm_ids(ns));
+
+	/*
+	 * For init_ipc_ns shm_ids().rw_mutex is statically initialized
+	 * as kernel threads should be able to use it in do_exit() before
+	 * shm_init(), which is called on do_initcall()
+	 */
+	if (ns == &init_ipc_ns)
+		__ipc_init_ids(&shm_ids(ns));
+	else
+		ipc_init_ids(&shm_ids(ns));
 }
 
 /*
diff --git a/ipc/util.c b/ipc/util.c
index 75261a3..673dde5 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -108,31 +108,35 @@ static int __init ipc_init(void)
 }
 __initcall(ipc_init);
 
-/**
- *	ipc_init_ids		-	initialise IPC identifiers
- *	@ids: Identifier set
- *
- *	Set up the sequence range to use for the ipc identifier range (limited
- *	below IPCMNI) then initialise the ids idr.
- */
- 
-void ipc_init_ids(struct ipc_ids *ids)
+void __ipc_init_ids(struct ipc_ids *ids)
 {
-	init_rwsem(&ids->rw_mutex);
-
 	ids->in_use = 0;
 	ids->seq = 0;
 	{
 		int seq_limit = INT_MAX/SEQ_MULTIPLIER;
 		if (seq_limit > USHRT_MAX)
 			ids->seq_max = USHRT_MAX;
-		 else
-		 	ids->seq_max = seq_limit;
+		else
+			ids->seq_max = seq_limit;
 	}
 
 	idr_init(&ids->ipcs_idr);
 }
 
+/**
+ *	ipc_init_ids		-	initialise IPC identifiers
+ *	@ids: Identifier set
+ *
+ *	Set up the sequence range to use for the ipc identifier range (limited
+ *	below IPCMNI) then initialise the ids idr.
+ */
+
+void ipc_init_ids(struct ipc_ids *ids)
+{
+	init_rwsem(&ids->rw_mutex);
+	__ipc_init_ids(ids);
+}
+
 #ifdef CONFIG_PROC_FS
 static const struct file_operations sysvipc_proc_fops;
 /**
diff --git a/ipc/util.h b/ipc/util.h
index 6f5c20b..8bbcd9c 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -80,6 +80,7 @@ struct seq_file;
 struct ipc_ids;
 
 void ipc_init_ids(struct ipc_ids *);
+void __ipc_init_ids(struct ipc_ids *ids);
 #ifdef CONFIG_PROC_FS
 void __init ipc_init_proc_interface(const char *path, const char *header,
 		int ids, int (*show)(struct seq_file *, void *));
-- 
1.7.0.4

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-02 12:45         ` [kernel-hardening] " Vasiliy Kulikov
@ 2011-08-02 12:51           ` Manuel Lauss
  -1 siblings, 0 replies; 53+ messages in thread
From: Manuel Lauss @ 2011-08-02 12:51 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, linux-kernel, Richard Weinberger, Marc Zyngier,
	Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney

On Tue, Aug 2, 2011 at 2:45 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> Manuel, Richard, Marc - can you test this patch, please?

Works for me.

If it matters:
Tested-by: Manuel Lauss <manuel.lauss@googlemail.com>

Thanks!
        Manuel Lauss

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-02 12:51           ` Manuel Lauss
  0 siblings, 0 replies; 53+ messages in thread
From: Manuel Lauss @ 2011-08-02 12:51 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, linux-kernel, Richard Weinberger, Marc Zyngier,
	Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney

On Tue, Aug 2, 2011 at 2:45 PM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> Manuel, Richard, Marc - can you test this patch, please?

Works for me.

If it matters:
Tested-by: Manuel Lauss <manuel.lauss@googlemail.com>

Thanks!
        Manuel Lauss

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-02 12:45         ` [kernel-hardening] " Vasiliy Kulikov
@ 2011-08-02 13:23           ` Richard Weinberger
  -1 siblings, 0 replies; 53+ messages in thread
From: Richard Weinberger @ 2011-08-02 13:23 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, Manuel Lauss, linux-kernel, Marc Zyngier,
	Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney

On Dienstag 02 August 2011 14:45:30 Vasiliy Kulikov wrote:
> Hi,
> 
> Manuel, Richard, Marc - can you test this patch, please?
> 

Works fine. :-)

Tested-by: Richard Weinberger <richard@nod.at>

Thanks,
//richard

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-02 13:23           ` Richard Weinberger
  0 siblings, 0 replies; 53+ messages in thread
From: Richard Weinberger @ 2011-08-02 13:23 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, Manuel Lauss, linux-kernel, Marc Zyngier,
	Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney

On Dienstag 02 August 2011 14:45:30 Vasiliy Kulikov wrote:
> Hi,
> 
> Manuel, Richard, Marc - can you test this patch, please?
> 

Works fine. :-)

Tested-by: Richard Weinberger <richard@nod.at>

Thanks,
//richard

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-02 12:45         ` [kernel-hardening] " Vasiliy Kulikov
@ 2011-08-02 13:29           ` Marc Zyngier
  -1 siblings, 0 replies; 53+ messages in thread
From: Marc Zyngier @ 2011-08-02 13:29 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, Manuel Lauss, linux-kernel, Richard Weinberger,
	Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney

On Tue, 2 Aug 2011 16:45:30 +0400
Vasiliy Kulikov <segoon@openwall.com> wrote:

> Manuel, Richard, Marc - can you test this patch, please?

Works fine here (tested on an EB11MP).

FWIW: Tested-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
I'm the slime oozin' out from your TV set...

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-02 13:29           ` Marc Zyngier
  0 siblings, 0 replies; 53+ messages in thread
From: Marc Zyngier @ 2011-08-02 13:29 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, Manuel Lauss, linux-kernel, Richard Weinberger,
	Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney

On Tue, 2 Aug 2011 16:45:30 +0400
Vasiliy Kulikov <segoon@openwall.com> wrote:

> Manuel, Richard, Marc - can you test this patch, please?

Works fine here (tested on an EB11MP).

FWIW: Tested-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
I'm the slime oozin' out from your TV set...

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-02 12:45         ` [kernel-hardening] " Vasiliy Kulikov
@ 2011-08-02 20:33           ` Andrew Morton
  -1 siblings, 0 replies; 53+ messages in thread
From: Andrew Morton @ 2011-08-02 20:33 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, Manuel Lauss, linux-kernel, Richard Weinberger,
	Marc Zyngier, Ingo Molnar, kernel-hardening, Paul E. McKenney

On Tue, 2 Aug 2011 16:45:30 +0400
Vasiliy Kulikov <segoon@openwall.com> wrote:

> On thread exit shm_exit() is called, it uses shm_ids(ns).rw_mutex.

You meant shm_exit_ns().

> It is initialized in shm_init(), but it is not called yet at the moment
> of kernel threads exit.  Some kernel threads are created in
> do_pre_smp_initcalls(), and shm_init() is called in do_initcalls().
> 
> Static initialization of shm_ids(init_ipc_ns).rw_mutex fixes the race.
> 
> It fixes a kernel oops:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 00000000
> ...
> [<c0320090>] (__down_write_nested+0x88/0xe0) from [<c015da08>] (exit_shm+0x28/0x48)
> [<c015da08>] (exit_shm+0x28/0x48) from [<c002e550>] (do_exit+0x59c/0x750)
> [<c002e550>] (do_exit+0x59c/0x750) from [<c003eaac>] (____call_usermodehelper+0x13c/0x154)
> [<c003eaac>] (____call_usermodehelper+0x13c/0x154) from [<c000f630>] (kernel_thread_exit+0x0/0x8)
> Code: 1afffffa e597c00c e58d0000 e587d00c (e58cd000)
> 
> ...
>
> --- a/ipc/msgutil.c
> +++ b/ipc/msgutil.c
> @@ -20,6 +20,9 @@
>  
>  DEFINE_SPINLOCK(mq_lock);
>  
> +#define INIT_IPC_SHM_IDS(name) \
> +	{ .rw_mutex = __RWSEM_INITIALIZER(name.rw_mutex), }
> +
>  /*
>   * The next 2 defines are here bc this is the only file
>   * compiled when either CONFIG_SYSVIPC and CONFIG_POSIX_MQUEUE
> @@ -27,6 +30,9 @@ DEFINE_SPINLOCK(mq_lock);
>   */
>  struct ipc_namespace init_ipc_ns = {
>  	.count		= ATOMIC_INIT(1),
> +	.ids	= {
> +		[IPC_SHM_IDS] = INIT_IPC_SHM_IDS(init_ipc_ns.ids[IPC_SHM_IDS]),
> +	},

That's what I meant by "nasty".  We initialise one field because we
happen to use that one at the wrong time, and leave everything else
uninitialised. eww.

But in this case it's not as bad as it might be -
shm_exit_ns()->free_ipcs() is a no-op because ids[2].inuse is zero, so
we kinda _did_ initialise that.  otoh we left ids[0].rw_mutex and
ids[1].rw_mutex uninitialised, so it's still nasty ;)

We could perhaps have fixed the bug by testing ids->inuse before taking
the mutex, which would also have been a speedup for that function. 
That would need some thought.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-02 20:33           ` Andrew Morton
  0 siblings, 0 replies; 53+ messages in thread
From: Andrew Morton @ 2011-08-02 20:33 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, Manuel Lauss, linux-kernel, Richard Weinberger,
	Marc Zyngier, Ingo Molnar, kernel-hardening, Paul E. McKenney

On Tue, 2 Aug 2011 16:45:30 +0400
Vasiliy Kulikov <segoon@openwall.com> wrote:

> On thread exit shm_exit() is called, it uses shm_ids(ns).rw_mutex.

You meant shm_exit_ns().

> It is initialized in shm_init(), but it is not called yet at the moment
> of kernel threads exit.  Some kernel threads are created in
> do_pre_smp_initcalls(), and shm_init() is called in do_initcalls().
> 
> Static initialization of shm_ids(init_ipc_ns).rw_mutex fixes the race.
> 
> It fixes a kernel oops:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 00000000
> ...
> [<c0320090>] (__down_write_nested+0x88/0xe0) from [<c015da08>] (exit_shm+0x28/0x48)
> [<c015da08>] (exit_shm+0x28/0x48) from [<c002e550>] (do_exit+0x59c/0x750)
> [<c002e550>] (do_exit+0x59c/0x750) from [<c003eaac>] (____call_usermodehelper+0x13c/0x154)
> [<c003eaac>] (____call_usermodehelper+0x13c/0x154) from [<c000f630>] (kernel_thread_exit+0x0/0x8)
> Code: 1afffffa e597c00c e58d0000 e587d00c (e58cd000)
> 
> ...
>
> --- a/ipc/msgutil.c
> +++ b/ipc/msgutil.c
> @@ -20,6 +20,9 @@
>  
>  DEFINE_SPINLOCK(mq_lock);
>  
> +#define INIT_IPC_SHM_IDS(name) \
> +	{ .rw_mutex = __RWSEM_INITIALIZER(name.rw_mutex), }
> +
>  /*
>   * The next 2 defines are here bc this is the only file
>   * compiled when either CONFIG_SYSVIPC and CONFIG_POSIX_MQUEUE
> @@ -27,6 +30,9 @@ DEFINE_SPINLOCK(mq_lock);
>   */
>  struct ipc_namespace init_ipc_ns = {
>  	.count		= ATOMIC_INIT(1),
> +	.ids	= {
> +		[IPC_SHM_IDS] = INIT_IPC_SHM_IDS(init_ipc_ns.ids[IPC_SHM_IDS]),
> +	},

That's what I meant by "nasty".  We initialise one field because we
happen to use that one at the wrong time, and leave everything else
uninitialised. eww.

But in this case it's not as bad as it might be -
shm_exit_ns()->free_ipcs() is a no-op because ids[2].inuse is zero, so
we kinda _did_ initialise that.  otoh we left ids[0].rw_mutex and
ids[1].rw_mutex uninitialised, so it's still nasty ;)

We could perhaps have fixed the bug by testing ids->inuse before taking
the mutex, which would also have been a speedup for that function. 
That would need some thought.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-02 12:45         ` [kernel-hardening] " Vasiliy Kulikov
@ 2011-08-02 20:55           ` Andrew Morton
  -1 siblings, 0 replies; 53+ messages in thread
From: Andrew Morton @ 2011-08-02 20:55 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, Manuel Lauss, linux-kernel, Richard Weinberger,
	Marc Zyngier, Ingo Molnar, kernel-hardening, Paul E. McKenney

On Tue, 2 Aug 2011 16:45:30 +0400
Vasiliy Kulikov <segoon@openwall.com> wrote:

> On thread exit shm_exit() is called, it uses shm_ids(ns).rw_mutex.
> It is initialized in shm_init(), but it is not called yet at the moment
> of kernel threads exit.  Some kernel threads are created in
> do_pre_smp_initcalls(), and shm_init() is called in do_initcalls().
> 
> Static initialization of shm_ids(init_ipc_ns).rw_mutex fixes the race.
> 
> It fixes a kernel oops:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 00000000
> ...
> [<c0320090>] (__down_write_nested+0x88/0xe0) from [<c015da08>] (exit_shm+0x28/0x48)
> [<c015da08>] (exit_shm+0x28/0x48) from [<c002e550>] (do_exit+0x59c/0x750)
> [<c002e550>] (do_exit+0x59c/0x750) from [<c003eaac>] (____call_usermodehelper+0x13c/0x154)
> [<c003eaac>] (____call_usermodehelper+0x13c/0x154) from [<c000f630>] (kernel_thread_exit+0x0/0x8)

erm, wait.  There's no reason I can think of why a kernel thread needs
to call shm_exit() at all?

Is that a regular kernel thread exiting, or is it a
call_usermodehelper() worker thread?  It *looks* like
____call_usermodehelper()'s kernel_execve() failed, so
____call_usermodehelper() directly called do_exit().

Something's still screwed up here - we shouldn't be trying to run
usermode helper applications before shm_init() has been run - usermode
helpers can use ipc!

Can someone who can reproduce this please work out if and why we're
calling call_usermodehelper() under do_pre_smp_initcalls()?  Something
like this...

--- a/init/main.c~a
+++ a/init/main.c
@@ -722,12 +722,16 @@ static void __init do_basic_setup(void)
 	do_initcalls();
 }
 
+int in_do_pre_smp_initcalls;
+
 static void __init do_pre_smp_initcalls(void)
 {
 	initcall_t *fn;
 
+	in_do_pre_smp_initcalls = 1;
 	for (fn = __initcall_start; fn < __early_initcall_end; fn++)
 		do_one_initcall(*fn);
+	in_do_pre_smp_initcalls = 0;
 }
 
 static void run_init_process(const char *init_filename)
--- a/kernel/kmod.c~a
+++ a/kernel/kmod.c
@@ -412,12 +412,17 @@ EXPORT_SYMBOL(call_usermodehelper_setfns
  * asynchronously if wait is not set, and runs as a child of keventd.
  * (ie. it runs with full root capabilities).
  */
+
+extern int in_do_pre_smp_initcalls;
+
 int call_usermodehelper_exec(struct subprocess_info *sub_info,
 			     enum umh_wait wait)
 {
 	DECLARE_COMPLETION_ONSTACK(done);
 	int retval = 0;
 
+	if (in_do_pre_smp_initcalls)
+		dump_stack();
 	helper_lock();
 	if (sub_info->path[0] == '\0')
 		goto out;
_


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-02 20:55           ` Andrew Morton
  0 siblings, 0 replies; 53+ messages in thread
From: Andrew Morton @ 2011-08-02 20:55 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Linus Torvalds, Manuel Lauss, linux-kernel, Richard Weinberger,
	Marc Zyngier, Ingo Molnar, kernel-hardening, Paul E. McKenney

On Tue, 2 Aug 2011 16:45:30 +0400
Vasiliy Kulikov <segoon@openwall.com> wrote:

> On thread exit shm_exit() is called, it uses shm_ids(ns).rw_mutex.
> It is initialized in shm_init(), but it is not called yet at the moment
> of kernel threads exit.  Some kernel threads are created in
> do_pre_smp_initcalls(), and shm_init() is called in do_initcalls().
> 
> Static initialization of shm_ids(init_ipc_ns).rw_mutex fixes the race.
> 
> It fixes a kernel oops:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 00000000
> ...
> [<c0320090>] (__down_write_nested+0x88/0xe0) from [<c015da08>] (exit_shm+0x28/0x48)
> [<c015da08>] (exit_shm+0x28/0x48) from [<c002e550>] (do_exit+0x59c/0x750)
> [<c002e550>] (do_exit+0x59c/0x750) from [<c003eaac>] (____call_usermodehelper+0x13c/0x154)
> [<c003eaac>] (____call_usermodehelper+0x13c/0x154) from [<c000f630>] (kernel_thread_exit+0x0/0x8)

erm, wait.  There's no reason I can think of why a kernel thread needs
to call shm_exit() at all?

Is that a regular kernel thread exiting, or is it a
call_usermodehelper() worker thread?  It *looks* like
____call_usermodehelper()'s kernel_execve() failed, so
____call_usermodehelper() directly called do_exit().

Something's still screwed up here - we shouldn't be trying to run
usermode helper applications before shm_init() has been run - usermode
helpers can use ipc!

Can someone who can reproduce this please work out if and why we're
calling call_usermodehelper() under do_pre_smp_initcalls()?  Something
like this...

--- a/init/main.c~a
+++ a/init/main.c
@@ -722,12 +722,16 @@ static void __init do_basic_setup(void)
 	do_initcalls();
 }
 
+int in_do_pre_smp_initcalls;
+
 static void __init do_pre_smp_initcalls(void)
 {
 	initcall_t *fn;
 
+	in_do_pre_smp_initcalls = 1;
 	for (fn = __initcall_start; fn < __early_initcall_end; fn++)
 		do_one_initcall(*fn);
+	in_do_pre_smp_initcalls = 0;
 }
 
 static void run_init_process(const char *init_filename)
--- a/kernel/kmod.c~a
+++ a/kernel/kmod.c
@@ -412,12 +412,17 @@ EXPORT_SYMBOL(call_usermodehelper_setfns
  * asynchronously if wait is not set, and runs as a child of keventd.
  * (ie. it runs with full root capabilities).
  */
+
+extern int in_do_pre_smp_initcalls;
+
 int call_usermodehelper_exec(struct subprocess_info *sub_info,
 			     enum umh_wait wait)
 {
 	DECLARE_COMPLETION_ONSTACK(done);
 	int retval = 0;
 
+	if (in_do_pre_smp_initcalls)
+		dump_stack();
 	helper_lock();
 	if (sub_info->path[0] == '\0')
 		goto out;
_

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-02 20:55           ` [kernel-hardening] " Andrew Morton
@ 2011-08-03  5:30             ` Manuel Lauss
  -1 siblings, 0 replies; 53+ messages in thread
From: Manuel Lauss @ 2011-08-03  5:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vasiliy Kulikov, Linus Torvalds, linux-kernel,
	Richard Weinberger, Marc Zyngier, Ingo Molnar, kernel-hardening,
	Paul E. McKenney

On Tue, Aug 2, 2011 at 10:55 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Tue, 2 Aug 2011 16:45:30 +0400
> Vasiliy Kulikov <segoon@openwall.com> wrote:
>
>> On thread exit shm_exit() is called, it uses shm_ids(ns).rw_mutex.
>> It is initialized in shm_init(), but it is not called yet at the moment
>> of kernel threads exit.  Some kernel threads are created in
>> do_pre_smp_initcalls(), and shm_init() is called in do_initcalls().
>>
>> Static initialization of shm_ids(init_ipc_ns).rw_mutex fixes the race.
>>
>> It fixes a kernel oops:
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 00000000
>> ...
>> [<c0320090>] (__down_write_nested+0x88/0xe0) from [<c015da08>] (exit_shm+0x28/0x48)
>> [<c015da08>] (exit_shm+0x28/0x48) from [<c002e550>] (do_exit+0x59c/0x750)
>> [<c002e550>] (do_exit+0x59c/0x750) from [<c003eaac>] (____call_usermodehelper+0x13c/0x154)
>> [<c003eaac>] (____call_usermodehelper+0x13c/0x154) from [<c000f630>] (kernel_thread_exit+0x0/0x8)
>
> erm, wait.  There's no reason I can think of why a kernel thread needs
> to call shm_exit() at all?
>
> Is that a regular kernel thread exiting, or is it a
> call_usermodehelper() worker thread?  It *looks* like
> ____call_usermodehelper()'s kernel_execve() failed, so
> ____call_usermodehelper() directly called do_exit().
>
> Something's still screwed up here - we shouldn't be trying to run
> usermode helper applications before shm_init() has been run - usermode
> helpers can use ipc!
>
> Can someone who can reproduce this please work out if and why we're
> calling call_usermodehelper() under do_pre_smp_initcalls()?  Something
> like this...

I applied your test patch, but it didn't print anything new (it's a
single-cpu system).


Linux version 3.0.0-db1200-07143-g9c8749b (mano@flagship) (gcc version
4.5.2 (Gentoo 4.5.2 p1.1) ) #14 Wed Aug 3 07:23:37 CEST 2011
CPU revision is: 04030202 (Au1250)
(PRId 04030202) @ 696.00 MHz
Alchemy/AMD/RMI DB1200 Board, CPLD Rev 2  Board-ID 12  Daughtercard ID 15
Determined physical RAM map:
 memory: 10000000 @ 00000000 (usable)
Zone PFN ranges:
  Normal   0x00000000 -> 0x00010000
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
    0: 0x00000000 -> 0x00010000
On node 0 totalpages: 65536
free_area_init_node: node 0, pgdat 806cc1a0, node_mem_map 81000000
  Normal zone: 512 pages used for memmap
  Normal zone: 0 pages reserved
  Normal zone: 65024 pages, LIFO batch:15
pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
pcpu-alloc: [0] 0
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 65024
Kernel command line:  root=/dev/hda1 rootfstype=ext2 console=tty
console=ttyS0,115200 video=au1200fb:panel:bs
PID hash table entries: 1024 (order: 0, 4096 bytes)
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Primary instruction cache 16kB, VIPT, 4-way, linesize 32 bytes.
Primary data cache 16kB, 4-way, VIPT, no aliases, linesize 32 bytes
Memory: 252456k/262144k available (4839k kernel code, 9688k reserved,
1098k data, 200k init, 0k highmem)
NR_IRQS:128
Alchemy clocksource installed
Console: colour dummy device 80x25
console [tty0] enabled
Calibrating delay loop (skipped) preset value.. 696.00 BogoMIPS (lpj=3480000)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 512
CPU 0 Unable to handle kernel paging request at virtual address
00000000, epc == 805b86ec, ra == 802a6f1c
Oops[#1]:
Cpu 0
$ 0   : 00000000 10003c00 00000000 10003c01
$ 4   : 806b6584 8fc45e60 806b6588 8fc3f520
$ 8   : 00000000 00000000 0016e35f 00000000
$12   : 00000080 00000010 00000010 8fc3001c
$16   : 8fc3f520 00000000 00000000 00000000
$20   : 00000000 00000000 8fc45eb4 00000000
$24   : 00000000 8011ef30
$28   : 8fc44000 8fc45e50 00000001 802a6f1c
Hi    : 00000000
Lo    : 00000000
epc   : 805b86ec __down_write_nested+0x68/0xf0
    Not tainted
ra    : 802a6f1c exit_shm+0x24/0x54
Status: 10003c02    KERNEL EXL
Cause : 0080800c
BadVA : 00000000
PrId  : 04030202 (Au1250)
Process kworker/u:0 (pid: 9, threadinfo=8fc44000, task=8fc3f520, tls=00000000)
Stack : 00000000 00000000 00000000 00000000 806b6588 00000000 8fc3f520 00000002
        8fc2c000 806b6584 00000000 802a6f1c 00000000 00000000 00000000 00000000
        806b6530 00000000 8fc3f520 801280b8 04000200 00040000 00000000 00008000
        00000000 00000000 40000000 00000000 00000000 00000000 8fc28ca0 8fc153a0
        00000000 00000000 00000000 00000000 00000000 00000000 00000000 80138b68
        ...
Call Trace:
[<805b86ec>] __down_write_nested+0x68/0xf0
[<802a6f1c>] exit_shm+0x24/0x54
[<801280b8>] do_exit+0x50c/0x664
[<80138b68>] ____call_usermodehelper+0xfc/0x118
[<8010573c>] kernel_thread_helper+0x10/0x18


Code: ac850008  afa60010  afa20014 <ac450000> 40016000  30630001
3421001f  3821001f  00611825
Disabling lock debugging due to kernel taint
Fixing recursive fault but reboot is needed!
NET: Registered protocol family 16
bio: create slab <bio-0> at 0
SCSI subsystem initialized
libata version 3.00 loaded.
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
Advanced Linux Sound Architecture Driver Version 1.0.24.
Bluetooth: Core ver 2.16
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP socket layer initialized
Bluetooth: SCO socket layer initialized
cfg80211: Calling CRDA to update world regulatory domain
Switching to clocksource alchemy-counter1
Switched to NOHz mode on CPU #0
NET: Registered protocol family 2
IP route cache hash table entries: 2048 (order: 1, 8192 bytes)
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
TCP reno registered
UDP hash table entries: 256 (order: 0, 4096 bytes)
UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
DB1200 device configuration:
 S6.8 OFF: PSC0 mode I2C
   OTG port VBUS supply available!
 S6.7 OFF: PSC1 mode AC97
audit: initializing netlink socket (disabled)
type=2000 audit(0.120:1): initialized
squashfs: version 4.0 (2009/01/31) Phillip Lougher
Registering the id_resolver key type
nfs4filelayout_init: NFSv4 File Layout Driver Registering...
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
JFFS2 version 2.2. (NAND) (SUMMARY)  ┬® 2001-2006 Red Hat, Inc.
msgmni has been set to 493
NET: Registered protocol family 38
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
io scheduler noop registered (default)
au1200fb: LCD controller driver for AU1200 processors
au1200fb: Panel 5 Samsung_1024x768_TFT
au1200fb: Win 2 0-FS gfx, 1-video, 2-ovly gfx, 3-ovly gfx
Panel(Samsung_1024x768_TFT), 1024x768
[...]
Manuel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-03  5:30             ` Manuel Lauss
  0 siblings, 0 replies; 53+ messages in thread
From: Manuel Lauss @ 2011-08-03  5:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vasiliy Kulikov, Linus Torvalds, linux-kernel,
	Richard Weinberger, Marc Zyngier, Ingo Molnar, kernel-hardening,
	Paul E. McKenney

On Tue, Aug 2, 2011 at 10:55 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Tue, 2 Aug 2011 16:45:30 +0400
> Vasiliy Kulikov <segoon@openwall.com> wrote:
>
>> On thread exit shm_exit() is called, it uses shm_ids(ns).rw_mutex.
>> It is initialized in shm_init(), but it is not called yet at the moment
>> of kernel threads exit.  Some kernel threads are created in
>> do_pre_smp_initcalls(), and shm_init() is called in do_initcalls().
>>
>> Static initialization of shm_ids(init_ipc_ns).rw_mutex fixes the race.
>>
>> It fixes a kernel oops:
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 00000000
>> ...
>> [<c0320090>] (__down_write_nested+0x88/0xe0) from [<c015da08>] (exit_shm+0x28/0x48)
>> [<c015da08>] (exit_shm+0x28/0x48) from [<c002e550>] (do_exit+0x59c/0x750)
>> [<c002e550>] (do_exit+0x59c/0x750) from [<c003eaac>] (____call_usermodehelper+0x13c/0x154)
>> [<c003eaac>] (____call_usermodehelper+0x13c/0x154) from [<c000f630>] (kernel_thread_exit+0x0/0x8)
>
> erm, wait.  There's no reason I can think of why a kernel thread needs
> to call shm_exit() at all?
>
> Is that a regular kernel thread exiting, or is it a
> call_usermodehelper() worker thread?  It *looks* like
> ____call_usermodehelper()'s kernel_execve() failed, so
> ____call_usermodehelper() directly called do_exit().
>
> Something's still screwed up here - we shouldn't be trying to run
> usermode helper applications before shm_init() has been run - usermode
> helpers can use ipc!
>
> Can someone who can reproduce this please work out if and why we're
> calling call_usermodehelper() under do_pre_smp_initcalls()?  Something
> like this...

I applied your test patch, but it didn't print anything new (it's a
single-cpu system).


Linux version 3.0.0-db1200-07143-g9c8749b (mano@flagship) (gcc version
4.5.2 (Gentoo 4.5.2 p1.1) ) #14 Wed Aug 3 07:23:37 CEST 2011
CPU revision is: 04030202 (Au1250)
(PRId 04030202) @ 696.00 MHz
Alchemy/AMD/RMI DB1200 Board, CPLD Rev 2  Board-ID 12  Daughtercard ID 15
Determined physical RAM map:
 memory: 10000000 @ 00000000 (usable)
Zone PFN ranges:
  Normal   0x00000000 -> 0x00010000
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
    0: 0x00000000 -> 0x00010000
On node 0 totalpages: 65536
free_area_init_node: node 0, pgdat 806cc1a0, node_mem_map 81000000
  Normal zone: 512 pages used for memmap
  Normal zone: 0 pages reserved
  Normal zone: 65024 pages, LIFO batch:15
pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
pcpu-alloc: [0] 0
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 65024
Kernel command line:  root=/dev/hda1 rootfstype=ext2 console=tty
console=ttyS0,115200 video=au1200fb:panel:bs
PID hash table entries: 1024 (order: 0, 4096 bytes)
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Primary instruction cache 16kB, VIPT, 4-way, linesize 32 bytes.
Primary data cache 16kB, 4-way, VIPT, no aliases, linesize 32 bytes
Memory: 252456k/262144k available (4839k kernel code, 9688k reserved,
1098k data, 200k init, 0k highmem)
NR_IRQS:128
Alchemy clocksource installed
Console: colour dummy device 80x25
console [tty0] enabled
Calibrating delay loop (skipped) preset value.. 696.00 BogoMIPS (lpj=3480000)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 512
CPU 0 Unable to handle kernel paging request at virtual address
00000000, epc == 805b86ec, ra == 802a6f1c
Oops[#1]:
Cpu 0
$ 0   : 00000000 10003c00 00000000 10003c01
$ 4   : 806b6584 8fc45e60 806b6588 8fc3f520
$ 8   : 00000000 00000000 0016e35f 00000000
$12   : 00000080 00000010 00000010 8fc3001c
$16   : 8fc3f520 00000000 00000000 00000000
$20   : 00000000 00000000 8fc45eb4 00000000
$24   : 00000000 8011ef30
$28   : 8fc44000 8fc45e50 00000001 802a6f1c
Hi    : 00000000
Lo    : 00000000
epc   : 805b86ec __down_write_nested+0x68/0xf0
    Not tainted
ra    : 802a6f1c exit_shm+0x24/0x54
Status: 10003c02    KERNEL EXL
Cause : 0080800c
BadVA : 00000000
PrId  : 04030202 (Au1250)
Process kworker/u:0 (pid: 9, threadinfo=8fc44000, task=8fc3f520, tls=00000000)
Stack : 00000000 00000000 00000000 00000000 806b6588 00000000 8fc3f520 00000002
        8fc2c000 806b6584 00000000 802a6f1c 00000000 00000000 00000000 00000000
        806b6530 00000000 8fc3f520 801280b8 04000200 00040000 00000000 00008000
        00000000 00000000 40000000 00000000 00000000 00000000 8fc28ca0 8fc153a0
        00000000 00000000 00000000 00000000 00000000 00000000 00000000 80138b68
        ...
Call Trace:
[<805b86ec>] __down_write_nested+0x68/0xf0
[<802a6f1c>] exit_shm+0x24/0x54
[<801280b8>] do_exit+0x50c/0x664
[<80138b68>] ____call_usermodehelper+0xfc/0x118
[<8010573c>] kernel_thread_helper+0x10/0x18


Code: ac850008  afa60010  afa20014 <ac450000> 40016000  30630001
3421001f  3821001f  00611825
Disabling lock debugging due to kernel taint
Fixing recursive fault but reboot is needed!
NET: Registered protocol family 16
bio: create slab <bio-0> at 0
SCSI subsystem initialized
libata version 3.00 loaded.
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
Advanced Linux Sound Architecture Driver Version 1.0.24.
Bluetooth: Core ver 2.16
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP socket layer initialized
Bluetooth: SCO socket layer initialized
cfg80211: Calling CRDA to update world regulatory domain
Switching to clocksource alchemy-counter1
Switched to NOHz mode on CPU #0
NET: Registered protocol family 2
IP route cache hash table entries: 2048 (order: 1, 8192 bytes)
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
TCP reno registered
UDP hash table entries: 256 (order: 0, 4096 bytes)
UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
DB1200 device configuration:
 S6.8 OFF: PSC0 mode I2C
   OTG port VBUS supply available!
 S6.7 OFF: PSC1 mode AC97
audit: initializing netlink socket (disabled)
type=2000 audit(0.120:1): initialized
squashfs: version 4.0 (2009/01/31) Phillip Lougher
Registering the id_resolver key type
nfs4filelayout_init: NFSv4 File Layout Driver Registering...
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
JFFS2 version 2.2. (NAND) (SUMMARY)  ┬® 2001-2006 Red Hat, Inc.
msgmni has been set to 493
NET: Registered protocol family 38
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253)
io scheduler noop registered (default)
au1200fb: LCD controller driver for AU1200 processors
au1200fb: Panel 5 Samsung_1024x768_TFT
au1200fb: Win 2 0-FS gfx, 1-video, 2-ovly gfx, 3-ovly gfx
Panel(Samsung_1024x768_TFT), 1024x768
[...]
Manuel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-02 12:45         ` [kernel-hardening] " Vasiliy Kulikov
@ 2011-08-03  7:43           ` Linus Torvalds
  -1 siblings, 0 replies; 53+ messages in thread
From: Linus Torvalds @ 2011-08-03  7:43 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Manuel Lauss, linux-kernel, Richard Weinberger, Marc Zyngier,
	Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney

[-- Attachment #1: Type: text/plain, Size: 1222 bytes --]

On Tue, Aug 2, 2011 at 2:45 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>
> From: Vasiliy Kulikov <segoon@openwall.com>
> Subject: [PATCH] shm: fix a race between shm_exit() and shm_init()

This patch is disgusting.

Doing things like this:

> +       /*
> +        * For init_ipc_ns shm_ids().rw_mutex is statically initialized
> +        * as kernel threads should be able to use it in do_exit() before
> +        * shm_init(), which is called on do_initcall()
> +        */
> +       if (ns == &init_ipc_ns)
> +               __ipc_init_ids(&shm_ids(ns));
> +       else
> +               ipc_init_ids(&shm_ids(ns));

should have told you that there is something totally wrong with your patch.

I'd prefer to really do the initialization in the allocator (at which
point it would be very natural to do the initialization statically for
a static allocation, and you wouldn't have the above kind of nasty
conditional stuff), but that whole namespace initialization and setup
just looks pretty nasty.

Looking at some of the other cases like net_ns_init(), maybe the
proper fix is to just make 'ipc_ns_init()' be a pure_initcall().

Does the attached patch work?

              Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 654 bytes --]

 ipc/shm.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/ipc/shm.c b/ipc/shm.c
index 9fb044f3b345..3791fd865bbd 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -105,9 +105,16 @@ void shm_exit_ns(struct ipc_namespace *ns)
 }
 #endif
 
-void __init shm_init (void)
+static int __init ipc_ns_init(void)
 {
 	shm_init_ns(&init_ipc_ns);
+	return 0;
+}
+
+pure_initcall(ipc_ns_init);
+
+void __init shm_init (void)
+{
 	ipc_init_proc_interface("sysvipc/shm",
 #if BITS_PER_LONG <= 32
 				"       key      shmid perms       size  cpid  lpid nattch   uid   gid  cuid  cgid      atime      dtime      ctime        rss       swap\n",

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-03  7:43           ` Linus Torvalds
  0 siblings, 0 replies; 53+ messages in thread
From: Linus Torvalds @ 2011-08-03  7:43 UTC (permalink / raw)
  To: Vasiliy Kulikov
  Cc: Manuel Lauss, linux-kernel, Richard Weinberger, Marc Zyngier,
	Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney

[-- Attachment #1: Type: text/plain, Size: 1222 bytes --]

On Tue, Aug 2, 2011 at 2:45 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>
> From: Vasiliy Kulikov <segoon@openwall.com>
> Subject: [PATCH] shm: fix a race between shm_exit() and shm_init()

This patch is disgusting.

Doing things like this:

> +       /*
> +        * For init_ipc_ns shm_ids().rw_mutex is statically initialized
> +        * as kernel threads should be able to use it in do_exit() before
> +        * shm_init(), which is called on do_initcall()
> +        */
> +       if (ns == &init_ipc_ns)
> +               __ipc_init_ids(&shm_ids(ns));
> +       else
> +               ipc_init_ids(&shm_ids(ns));

should have told you that there is something totally wrong with your patch.

I'd prefer to really do the initialization in the allocator (at which
point it would be very natural to do the initialization statically for
a static allocation, and you wouldn't have the above kind of nasty
conditional stuff), but that whole namespace initialization and setup
just looks pretty nasty.

Looking at some of the other cases like net_ns_init(), maybe the
proper fix is to just make 'ipc_ns_init()' be a pure_initcall().

Does the attached patch work?

              Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 654 bytes --]

 ipc/shm.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/ipc/shm.c b/ipc/shm.c
index 9fb044f3b345..3791fd865bbd 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -105,9 +105,16 @@ void shm_exit_ns(struct ipc_namespace *ns)
 }
 #endif
 
-void __init shm_init (void)
+static int __init ipc_ns_init(void)
 {
 	shm_init_ns(&init_ipc_ns);
+	return 0;
+}
+
+pure_initcall(ipc_ns_init);
+
+void __init shm_init (void)
+{
 	ipc_init_proc_interface("sysvipc/shm",
 #if BITS_PER_LONG <= 32
 				"       key      shmid perms       size  cpid  lpid nattch   uid   gid  cuid  cgid      atime      dtime      ctime        rss       swap\n",

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-03  7:43           ` [kernel-hardening] " Linus Torvalds
@ 2011-08-03  7:50             ` Manuel Lauss
  -1 siblings, 0 replies; 53+ messages in thread
From: Manuel Lauss @ 2011-08-03  7:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vasiliy Kulikov, linux-kernel, Richard Weinberger, Marc Zyngier,
	Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney

On Wed, Aug 3, 2011 at 9:43 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Aug 2, 2011 at 2:45 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>>
>> From: Vasiliy Kulikov <segoon@openwall.com>
>> Subject: [PATCH] shm: fix a race between shm_exit() and shm_init()
>
> This patch is disgusting.
>
> Doing things like this:
>
>> +       /*
>> +        * For init_ipc_ns shm_ids().rw_mutex is statically initialized
>> +        * as kernel threads should be able to use it in do_exit() before
>> +        * shm_init(), which is called on do_initcall()
>> +        */
>> +       if (ns == &init_ipc_ns)
>> +               __ipc_init_ids(&shm_ids(ns));
>> +       else
>> +               ipc_init_ids(&shm_ids(ns));
>
> should have told you that there is something totally wrong with your patch.
>
> I'd prefer to really do the initialization in the allocator (at which
> point it would be very natural to do the initialization statically for
> a static allocation, and you wouldn't have the above kind of nasty
> conditional stuff), but that whole namespace initialization and setup
> just looks pretty nasty.
>
> Looking at some of the other cases like net_ns_init(), maybe the
> proper fix is to just make 'ipc_ns_init()' be a pure_initcall().
>
> Does the attached patch work?

No, same oops still.

Manuel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-03  7:50             ` Manuel Lauss
  0 siblings, 0 replies; 53+ messages in thread
From: Manuel Lauss @ 2011-08-03  7:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vasiliy Kulikov, linux-kernel, Richard Weinberger, Marc Zyngier,
	Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney

On Wed, Aug 3, 2011 at 9:43 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Aug 2, 2011 at 2:45 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>>
>> From: Vasiliy Kulikov <segoon@openwall.com>
>> Subject: [PATCH] shm: fix a race between shm_exit() and shm_init()
>
> This patch is disgusting.
>
> Doing things like this:
>
>> +       /*
>> +        * For init_ipc_ns shm_ids().rw_mutex is statically initialized
>> +        * as kernel threads should be able to use it in do_exit() before
>> +        * shm_init(), which is called on do_initcall()
>> +        */
>> +       if (ns == &init_ipc_ns)
>> +               __ipc_init_ids(&shm_ids(ns));
>> +       else
>> +               ipc_init_ids(&shm_ids(ns));
>
> should have told you that there is something totally wrong with your patch.
>
> I'd prefer to really do the initialization in the allocator (at which
> point it would be very natural to do the initialization statically for
> a static allocation, and you wouldn't have the above kind of nasty
> conditional stuff), but that whole namespace initialization and setup
> just looks pretty nasty.
>
> Looking at some of the other cases like net_ns_init(), maybe the
> proper fix is to just make 'ipc_ns_init()' be a pure_initcall().
>
> Does the attached patch work?

No, same oops still.

Manuel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-03  7:50             ` [kernel-hardening] " Manuel Lauss
@ 2011-08-03  8:00               ` Manuel Lauss
  -1 siblings, 0 replies; 53+ messages in thread
From: Manuel Lauss @ 2011-08-03  8:00 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vasiliy Kulikov, linux-kernel, Richard Weinberger, Marc Zyngier,
	Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney

On Wed, Aug 3, 2011 at 9:50 AM, Manuel Lauss
<manuel.lauss@googlemail.com> wrote:
> On Wed, Aug 3, 2011 at 9:43 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> On Tue, Aug 2, 2011 at 2:45 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>>>
>>> From: Vasiliy Kulikov <segoon@openwall.com>
>>> Subject: [PATCH] shm: fix a race between shm_exit() and shm_init()
>>
>> This patch is disgusting.
>>
>> Doing things like this:
>>
>>> +       /*
>>> +        * For init_ipc_ns shm_ids().rw_mutex is statically initialized
>>> +        * as kernel threads should be able to use it in do_exit() before
>>> +        * shm_init(), which is called on do_initcall()
>>> +        */
>>> +       if (ns == &init_ipc_ns)
>>> +               __ipc_init_ids(&shm_ids(ns));
>>> +       else
>>> +               ipc_init_ids(&shm_ids(ns));
>>
>> should have told you that there is something totally wrong with your patch.
>>
>> I'd prefer to really do the initialization in the allocator (at which
>> point it would be very natural to do the initialization statically for
>> a static allocation, and you wouldn't have the above kind of nasty
>> conditional stuff), but that whole namespace initialization and setup
>> just looks pretty nasty.
>>
>> Looking at some of the other cases like net_ns_init(), maybe the
>> proper fix is to just make 'ipc_ns_init()' be a pure_initcall().
>>
>> Does the attached patch work?
>
> No, same oops still.

Your new initcall comes right AFTER the oops:

Mount-cache hash table entries: 512
CPU 0 Unable to handle kernel paging request at virtual address
00000000, epc == 8030b2bc, ra == 80206ba0
Oops[#1]:
Cpu 0
$ 0   : 00000000 10003c00 00000000 10003c01
$ 4   : 80379644 83c41e88 80379648 83c37520
$ 8   : 00000000 01312d00 0016e35f 83c3f01c
$12   : 83c09f08 83c09f10 00000001 00000780
$16   : 83c37520 00000000 00000000 00000000
$20   : 00000000 00000000 00000000 00000000
$24   : 00000000 8011fbec
$28   : 83c40000 83c41e78 00000000 80206ba0
Hi    : 00000000
Lo    : 01312d00
epc   : 8030b2bc __down_write_nested+0x68/0xf0
    Not tainted
ra    : 80206ba0 exit_shm+0x24/0x54
Status: 10003c02    KERNEL EXL
Cause : 0080800c
BadVA : 00000000
PrId  : 01030200 (Au1500)
Process kworker/u:0 (pid: 9, threadinfo=83c40000, task=83c37520, tls=00000000)
Stack : ffffffff ffffffff 00000000 ffffffff 80379648 00000000 83c37520 00000002
        ffffffff 80379644 00000000 80206ba0 00000000 00000000 00000000 00000000
        803795f0 00000000 83c37520 80128ebc 83c40000 83c41ef0 ffffffff 8013a8a8
        00000001 ffffffff 83c2ac80 83c2ac80 83c16300 8013a8b4 ffffffff ffffffff
        ffffffff ffffffff ffffffff ffffffff ffffffff 00000000 00000000 80105b1c
        ...
Call Trace:
[<8030b2bc>] __down_write_nested+0x68/0xf0
[<80206ba0>] exit_shm+0x24/0x54
[<80128ebc>] do_exit+0x1a4/0x254
[<8013a8b4>] ____call_usermodehelper+0xfc/0x118
[<80105b1c>] kernel_thread_helper+0x10/0x18


Code: ac850008  afa60010  afa20014 <ac450000> 40016000  30630001
3421001f  3821001f  00611825
Disabling lock debugging due to kernel taint
Fixing recursive fault but reboot is needed!
calling  ipc_ns_init+0x0/0x4c @ 1
initcall ipc_ns_init+0x0/0x4c returned 0 after 0 usecs
calling  init_mmap_min_addr+0x0/0x18 @ 1
initcall init_mmap_min_addr+0x0/0x18 returned 0 after 0 usecs
calling  init_atomic64_lock+0x0/0x8 @ 1

Manuel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-03  8:00               ` Manuel Lauss
  0 siblings, 0 replies; 53+ messages in thread
From: Manuel Lauss @ 2011-08-03  8:00 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vasiliy Kulikov, linux-kernel, Richard Weinberger, Marc Zyngier,
	Andrew Morton, Ingo Molnar, kernel-hardening, Paul E. McKenney

On Wed, Aug 3, 2011 at 9:50 AM, Manuel Lauss
<manuel.lauss@googlemail.com> wrote:
> On Wed, Aug 3, 2011 at 9:43 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> On Tue, Aug 2, 2011 at 2:45 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
>>>
>>> From: Vasiliy Kulikov <segoon@openwall.com>
>>> Subject: [PATCH] shm: fix a race between shm_exit() and shm_init()
>>
>> This patch is disgusting.
>>
>> Doing things like this:
>>
>>> +       /*
>>> +        * For init_ipc_ns shm_ids().rw_mutex is statically initialized
>>> +        * as kernel threads should be able to use it in do_exit() before
>>> +        * shm_init(), which is called on do_initcall()
>>> +        */
>>> +       if (ns == &init_ipc_ns)
>>> +               __ipc_init_ids(&shm_ids(ns));
>>> +       else
>>> +               ipc_init_ids(&shm_ids(ns));
>>
>> should have told you that there is something totally wrong with your patch.
>>
>> I'd prefer to really do the initialization in the allocator (at which
>> point it would be very natural to do the initialization statically for
>> a static allocation, and you wouldn't have the above kind of nasty
>> conditional stuff), but that whole namespace initialization and setup
>> just looks pretty nasty.
>>
>> Looking at some of the other cases like net_ns_init(), maybe the
>> proper fix is to just make 'ipc_ns_init()' be a pure_initcall().
>>
>> Does the attached patch work?
>
> No, same oops still.

Your new initcall comes right AFTER the oops:

Mount-cache hash table entries: 512
CPU 0 Unable to handle kernel paging request at virtual address
00000000, epc == 8030b2bc, ra == 80206ba0
Oops[#1]:
Cpu 0
$ 0   : 00000000 10003c00 00000000 10003c01
$ 4   : 80379644 83c41e88 80379648 83c37520
$ 8   : 00000000 01312d00 0016e35f 83c3f01c
$12   : 83c09f08 83c09f10 00000001 00000780
$16   : 83c37520 00000000 00000000 00000000
$20   : 00000000 00000000 00000000 00000000
$24   : 00000000 8011fbec
$28   : 83c40000 83c41e78 00000000 80206ba0
Hi    : 00000000
Lo    : 01312d00
epc   : 8030b2bc __down_write_nested+0x68/0xf0
    Not tainted
ra    : 80206ba0 exit_shm+0x24/0x54
Status: 10003c02    KERNEL EXL
Cause : 0080800c
BadVA : 00000000
PrId  : 01030200 (Au1500)
Process kworker/u:0 (pid: 9, threadinfo=83c40000, task=83c37520, tls=00000000)
Stack : ffffffff ffffffff 00000000 ffffffff 80379648 00000000 83c37520 00000002
        ffffffff 80379644 00000000 80206ba0 00000000 00000000 00000000 00000000
        803795f0 00000000 83c37520 80128ebc 83c40000 83c41ef0 ffffffff 8013a8a8
        00000001 ffffffff 83c2ac80 83c2ac80 83c16300 8013a8b4 ffffffff ffffffff
        ffffffff ffffffff ffffffff ffffffff ffffffff 00000000 00000000 80105b1c
        ...
Call Trace:
[<8030b2bc>] __down_write_nested+0x68/0xf0
[<80206ba0>] exit_shm+0x24/0x54
[<80128ebc>] do_exit+0x1a4/0x254
[<8013a8b4>] ____call_usermodehelper+0xfc/0x118
[<80105b1c>] kernel_thread_helper+0x10/0x18


Code: ac850008  afa60010  afa20014 <ac450000> 40016000  30630001
3421001f  3821001f  00611825
Disabling lock debugging due to kernel taint
Fixing recursive fault but reboot is needed!
calling  ipc_ns_init+0x0/0x4c @ 1
initcall ipc_ns_init+0x0/0x4c returned 0 after 0 usecs
calling  init_mmap_min_addr+0x0/0x18 @ 1
initcall init_mmap_min_addr+0x0/0x18 returned 0 after 0 usecs
calling  init_atomic64_lock+0x0/0x8 @ 1

Manuel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-02 20:55           ` [kernel-hardening] " Andrew Morton
@ 2011-08-03  8:05             ` Marc Zyngier
  -1 siblings, 0 replies; 53+ messages in thread
From: Marc Zyngier @ 2011-08-03  8:05 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vasiliy Kulikov, Linus Torvalds, Manuel Lauss, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney


On Tue, 2 Aug 2011 13:55:12 -0700, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Tue, 2 Aug 2011 16:45:30 +0400
> Vasiliy Kulikov <segoon@openwall.com> wrote:
> 
>> On thread exit shm_exit() is called, it uses shm_ids(ns).rw_mutex.
>> It is initialized in shm_init(), but it is not called yet at the moment
>> of kernel threads exit.  Some kernel threads are created in
>> do_pre_smp_initcalls(), and shm_init() is called in do_initcalls().
>> 
>> Static initialization of shm_ids(init_ipc_ns).rw_mutex fixes the race.
>> 
>> It fixes a kernel oops:
>> 
>> Unable to handle kernel NULL pointer dereference at virtual address
>> 00000000
>> ...
>> [<c0320090>] (__down_write_nested+0x88/0xe0) from [<c015da08>]
>> (exit_shm+0x28/0x48)
>> [<c015da08>] (exit_shm+0x28/0x48) from [<c002e550>]
(do_exit+0x59c/0x750)
>> [<c002e550>] (do_exit+0x59c/0x750) from [<c003eaac>]
>> (____call_usermodehelper+0x13c/0x154)
>> [<c003eaac>] (____call_usermodehelper+0x13c/0x154) from [<c000f630>]
>> (kernel_thread_exit+0x0/0x8)
> 
> erm, wait.  There's no reason I can think of why a kernel thread needs
> to call shm_exit() at all?
> 
> Is that a regular kernel thread exiting, or is it a
> call_usermodehelper() worker thread?  It *looks* like
> ____call_usermodehelper()'s kernel_execve() failed, so
> ____call_usermodehelper() directly called do_exit().
>
> Something's still screwed up here - we shouldn't be trying to run
> usermode helper applications before shm_init() has been run - usermode
> helpers can use ipc!
> 
> Can someone who can reproduce this please work out if and why we're
> calling call_usermodehelper() under do_pre_smp_initcalls()?  Something
> like this...

[snip]

Got nothing. So I moved the reset of in_do_pre_smp_initcall to the
beginning of shm_init(), and here's what I found: (again sorry for the
bloody line wrapping):

[<c0014740>] (unwind_backtrace+0x0/0xf4) from [<c003ed58>]
(call_usermodehelper_exec+0x108/0x118)
[<c003ed58>] (call_usermodehelper_exec+0x108/0x118) from [<c017608c>]
(kobject_uevent_env+0x40c/0x450)
[<c017608c>] (kobject_uevent_env+0x40c/0x450) from [<c01754c8>]
(kset_register+0x3c/0x44)
[<c01754c8>] (kset_register+0x3c/0x44) from [<c01c2eb0>]
(bus_register+0xac/0x28c)
[<c01c2eb0>] (bus_register+0xac/0x28c) from [<c0422278>]
(platform_bus_init+0x20/0x40)
[<c0422278>] (platform_bus_init+0x20/0x40) from [<c04222f0>]
(driver_init+0x18/0x24)
[<c04222f0>] (driver_init+0x18/0x24) from [<c040e270>]
(kernel_init+0x84/0x13c)
[<c040e270>] (kernel_init+0x84/0x13c) from [<c000f630>]
(kernel_thread_exit+0x0/0x8)
[<c0014740>] (unwind_backtrace+0x0/0xf4) from [<c003ed58>]
(call_usermodehelper_exec+0x108/0x118)
[<c003ed58>] (call_usermodehelper_exec+0x108/0x118) from [<c017608c>]
(kobject_uevent_env+0x40c/0x450)
[<c017608c>] (kobject_uevent_env+0x40c/0x450) from [<c01754c8>]
(kset_register+0x3c/0x44)
[<c01754c8>] (kset_register+0x3c/0x44) from [<c01c1eec>]
(sysdev_class_register+0x60/0x8c)
[<c01c1eec>] (sysdev_class_register+0x60/0x8c) from [<c040e270>]
(kernel_init+0x84/0x13c)
[<c040e270>] (kernel_init+0x84/0x13c) from [<c000f630>]
(kernel_thread_exit+0x0/0x8)
NET: Registered protocol family 16
[<c0014740>] (unwind_backtrace+0x0/0xf4) from [<c003ed58>]
(call_usermodehelper_exec+0x108/0x118)
[<c003ed58>] (call_usermodehelper_exec+0x108/0x118) from [<c017608c>]
(kobject_uevent_env+0x40c/0x450)
[<c017608c>] (kobject_uevent_env+0x40c/0x450) from [<c01754c8>]
(kset_register+0x3c/0x44)
[<c01754c8>] (kset_register+0x3c/0x44) from [<c01c44fc>]
(__class_register+0xb8/0x200)
[<c01c44fc>] (__class_register+0xb8/0x200) from [<c01c4698>]
(__class_create+0x54/0x7c)
[<c01c4698>] (__class_create+0x54/0x7c) from [<c04178d8>]
(bdi_class_init+0x18/0x58)
[<c04178d8>] (bdi_class_init+0x18/0x58) from [<c0008650>]
(do_one_initcall+0x34/0x178)
[<c0008650>] (do_one_initcall+0x34/0x178) from [<c040e280>]
(kernel_init+0x94/0x13c)
[<c040e280>] (kernel_init+0x94/0x13c) from [<c000f630>]
(kernel_thread_exit+0x0/0x8)
[...]

A flurry of these. So pre-smp initcalls are not the "guilty" ones. It
chokes a bit further down the drain. ipc_init is called way after the
drivers are registered:

c0436b78 T __initcall_start
c0436b78 T __setup_end
c0436b7c t __initcall_spawn_ksoftirqdearly
c0436b80 t __initcall_init_workqueuesearly
c0436b84 t __initcall_cpu_stop_initearly
c0436b88 t __initcall_rcu_scheduler_really_startedearly
c0436b8c T __early_initcall_end
c0436b8c t __initcall_init_mmap_min_addr0
c0436b90 t __initcall_net_ns_init0
c0436b94 t __initcall_ptrace_break_init1
c0436b98 t __initcall_consistent_init1
c0436b9c t __initcall_v6_userpage_init1
c0436ba0 t __initcall_alloc_frozen_cpus1
c0436ba4 t __initcall_sysctl_init1
c0436ba8 t __initcall_ksysfs_init1
c0436bac t __initcall_init_jiffies_clocksource1
c0436bb0 t __initcall_pm_init1
c0436bb4 t __initcall_init_zero_pfn1
c0436bb8 t __initcall_fsnotify_init1
c0436bbc t __initcall_filelock_init1
c0436bc0 t __initcall_init_script_binfmt1
c0436bc4 t __initcall_init_elf_binfmt1
c0436bc8 t __initcall_debugfs_init1
c0436bcc t __initcall_random32_init1
c0436bd0 t __initcall_sock_init1
c0436bd4 t __initcall_netlink_proto_init1
c0436bd8 t __initcall_bdi_class_init2
c0436bdc t __initcall_kobject_uevent_init2
c0436be0 t __initcall_amba_init2
c0436be4 t __initcall_tty_class_init2
c0436be8 t __initcall_vtconsole_class_init2
c0436bec t __initcall_wakeup_sources_debugfs_init2
c0436bf0 t __initcall_customize_machine3
c0436bf4 t __initcall_arch_hw_breakpoint_init3
c0436bf8 t __initcall_exceptions_init3
c0436bfc t __initcall_realview_i2c_init3
c0436c00 t __initcall_pl011_init3
c0436c04 t __initcall_topology_init4
c0436c08 t __initcall_param_sysfs_init4
c0436c0c t __initcall_pm_sysrq_init4
c0436c10 t __initcall_default_bdi_init4
c0436c14 t __initcall_init_bio4
c0436c18 t __initcall_fsnotify_notification_init4
c0436c1c t __initcall_blk_settings_init4
c0436c20 t __initcall_blk_ioc_init4
c0436c24 t __initcall_blk_softirq_init4
c0436c28 t __initcall_blk_iopoll_setup4
c0436c2c t __initcall_genhd_device_init4
c0436c30 t __initcall_fbmem_init4
c0436c34 t __initcall_misc_init4
c0436c38 t __initcall_init_scsi4
c0436c3c t __initcall_ata_init4
c0436c40 t __initcall_phy_init4
c0436c44 t __initcall_usb_init4
c0436c48 t __initcall_serio_init4
c0436c4c t __initcall_input_init4
c0436c50 t __initcall_rtc_init4
c0436c54 t __initcall_init_soundcore4
c0436c58 t __initcall_alsa_sound_init4
c0436c5c t __initcall_ac97_bus_init4
c0436c60 t __initcall_proto_init4
c0436c64 t __initcall_net_dev_init4
c0436c68 t __initcall_neigh_init4
c0436c6c t __initcall_genl_init4
c0436c70 t __initcall_sysctl_init4
c0436c74 t __initcall_proc_cpu_init5
c0436c78 t __initcall_dma_debug_do_init5
c0436c7c t __initcall_alignment_init5
c0436c80 t __initcall_clocksource_done_booting5
c0436c84 t __initcall_init_pipe_fs5
c0436c88 t __initcall_eventpoll_init5
c0436c8c t __initcall_anon_inode_init5
c0436c90 t __initcall_blk_scsi_ioctl_init5
c0436c94 t __initcall_chr_dev_init5
c0436c98 t __initcall_firmware_class_init5
c0436c9c t __initcall_sysctl_core_init5
c0436ca0 t __initcall_inet_init5
c0436ca4 t __initcall_af_unix_init5
c0436ca8 t __initcall_init_sunrpc5
c0436cac t __initcall_populate_rootfsrootfs
c0436cb0 t __initcall_timer_init_syscore_ops6
c0436cb4 t __initcall_register_pmu_driver6
c0436cb8 t __initcall_proc_execdomains_init6
c0436cbc t __initcall_ioresources_init6
c0436cc0 t __initcall_uid_cache_init6
c0436cc4 t __initcall_init_posix_timers6
c0436cc8 t __initcall_init_posix_cpu_timers6
c0436ccc t __initcall_timekeeping_init_ops6
c0436cd0 t __initcall_init_clocksource_sysfs6
c0436cd4 t __initcall_init_timer_list_procfs6
c0436cd8 t __initcall_alarmtimer_init6
c0436cdc t __initcall_futex_init6
c0436ce0 t __initcall_proc_modules_init6
c0436ce4 t __initcall_kallsyms_init6
c0436ce8 t __initcall_ikconfig_init6
c0436cec t __initcall_hung_task_init6
c0436cf0 t __initcall_utsname_sysctl_init6
c0436cf4 t __initcall_perf_event_sysfs_init6
c0436cf8 t __initcall_init_per_zone_wmark_min6
c0436cfc t __initcall_kswapd_init6
c0436d00 t __initcall_setup_vmstat6
c0436d04 t __initcall_mm_sysfs_init6
c0436d08 t __initcall_proc_vmalloc_init6
c0436d0c t __initcall_memblock_init_debugfs6
c0436d10 t __initcall_init_emergency_pool6
c0436d14 t __initcall_procswaps_init6
c0436d18 t __initcall_slab_proc_init6
c0436d1c t __initcall_slab_sysfs_init6
c0436d20 t __initcall_fcntl_init6
c0436d24 t __initcall_proc_filesystems_init6
c0436d28 t __initcall_fsnotify_mark_init6
c0436d2c t __initcall_dnotify_init6
c0436d30 t __initcall_inotify_user_setup6
c0436d34 t __initcall_aio_setup6
c0436d38 t __initcall_proc_locks_init6
c0436d3c t __initcall_proc_cmdline_init6
c0436d40 t __initcall_proc_consoles_init6
c0436d44 t __initcall_proc_cpuinfo_init6
c0436d48 t __initcall_proc_devices_init6
c0436d4c t __initcall_proc_interrupts_init6
c0436d50 t __initcall_proc_loadavg_init6
c0436d54 t __initcall_proc_meminfo_init6
c0436d58 t __initcall_proc_stat_init6
c0436d5c t __initcall_proc_uptime_init6
c0436d60 t __initcall_proc_version_init6
c0436d64 t __initcall_proc_softirqs_init6
c0436d68 t __initcall_proc_kmsg_init6
c0436d6c t __initcall_proc_page_init6
c0436d70 t __initcall_init_devpts_fs6
c0436d74 t __initcall_init_ext3_fs6
c0436d78 t __initcall_init_ext2_fs6
c0436d7c t __initcall_journal_init6
c0436d80 t __initcall_init_cramfs_fs6
c0436d84 t __initcall_init_ramfs_fs6
c0436d88 t __initcall_init_fat_fs6
c0436d8c t __initcall_init_vfat_fs6
c0436d90 t __initcall_init_nfs_fs6
c0436d94 t __initcall_init_nlm6
c0436d98 t __initcall_init_nls_cp4376
c0436d9c t __initcall_init_nls_iso8859_16
c0436da0 t __initcall_ipc_init6

Reordering the initcalls seems the easiest solution, but it is still very
fragile...

        M.
-- 
Fast, cheap, reliable. Pick two.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-03  8:05             ` Marc Zyngier
  0 siblings, 0 replies; 53+ messages in thread
From: Marc Zyngier @ 2011-08-03  8:05 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vasiliy Kulikov, Linus Torvalds, Manuel Lauss, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney


On Tue, 2 Aug 2011 13:55:12 -0700, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Tue, 2 Aug 2011 16:45:30 +0400
> Vasiliy Kulikov <segoon@openwall.com> wrote:
> 
>> On thread exit shm_exit() is called, it uses shm_ids(ns).rw_mutex.
>> It is initialized in shm_init(), but it is not called yet at the moment
>> of kernel threads exit.  Some kernel threads are created in
>> do_pre_smp_initcalls(), and shm_init() is called in do_initcalls().
>> 
>> Static initialization of shm_ids(init_ipc_ns).rw_mutex fixes the race.
>> 
>> It fixes a kernel oops:
>> 
>> Unable to handle kernel NULL pointer dereference at virtual address
>> 00000000
>> ...
>> [<c0320090>] (__down_write_nested+0x88/0xe0) from [<c015da08>]
>> (exit_shm+0x28/0x48)
>> [<c015da08>] (exit_shm+0x28/0x48) from [<c002e550>]
(do_exit+0x59c/0x750)
>> [<c002e550>] (do_exit+0x59c/0x750) from [<c003eaac>]
>> (____call_usermodehelper+0x13c/0x154)
>> [<c003eaac>] (____call_usermodehelper+0x13c/0x154) from [<c000f630>]
>> (kernel_thread_exit+0x0/0x8)
> 
> erm, wait.  There's no reason I can think of why a kernel thread needs
> to call shm_exit() at all?
> 
> Is that a regular kernel thread exiting, or is it a
> call_usermodehelper() worker thread?  It *looks* like
> ____call_usermodehelper()'s kernel_execve() failed, so
> ____call_usermodehelper() directly called do_exit().
>
> Something's still screwed up here - we shouldn't be trying to run
> usermode helper applications before shm_init() has been run - usermode
> helpers can use ipc!
> 
> Can someone who can reproduce this please work out if and why we're
> calling call_usermodehelper() under do_pre_smp_initcalls()?  Something
> like this...

[snip]

Got nothing. So I moved the reset of in_do_pre_smp_initcall to the
beginning of shm_init(), and here's what I found: (again sorry for the
bloody line wrapping):

[<c0014740>] (unwind_backtrace+0x0/0xf4) from [<c003ed58>]
(call_usermodehelper_exec+0x108/0x118)
[<c003ed58>] (call_usermodehelper_exec+0x108/0x118) from [<c017608c>]
(kobject_uevent_env+0x40c/0x450)
[<c017608c>] (kobject_uevent_env+0x40c/0x450) from [<c01754c8>]
(kset_register+0x3c/0x44)
[<c01754c8>] (kset_register+0x3c/0x44) from [<c01c2eb0>]
(bus_register+0xac/0x28c)
[<c01c2eb0>] (bus_register+0xac/0x28c) from [<c0422278>]
(platform_bus_init+0x20/0x40)
[<c0422278>] (platform_bus_init+0x20/0x40) from [<c04222f0>]
(driver_init+0x18/0x24)
[<c04222f0>] (driver_init+0x18/0x24) from [<c040e270>]
(kernel_init+0x84/0x13c)
[<c040e270>] (kernel_init+0x84/0x13c) from [<c000f630>]
(kernel_thread_exit+0x0/0x8)
[<c0014740>] (unwind_backtrace+0x0/0xf4) from [<c003ed58>]
(call_usermodehelper_exec+0x108/0x118)
[<c003ed58>] (call_usermodehelper_exec+0x108/0x118) from [<c017608c>]
(kobject_uevent_env+0x40c/0x450)
[<c017608c>] (kobject_uevent_env+0x40c/0x450) from [<c01754c8>]
(kset_register+0x3c/0x44)
[<c01754c8>] (kset_register+0x3c/0x44) from [<c01c1eec>]
(sysdev_class_register+0x60/0x8c)
[<c01c1eec>] (sysdev_class_register+0x60/0x8c) from [<c040e270>]
(kernel_init+0x84/0x13c)
[<c040e270>] (kernel_init+0x84/0x13c) from [<c000f630>]
(kernel_thread_exit+0x0/0x8)
NET: Registered protocol family 16
[<c0014740>] (unwind_backtrace+0x0/0xf4) from [<c003ed58>]
(call_usermodehelper_exec+0x108/0x118)
[<c003ed58>] (call_usermodehelper_exec+0x108/0x118) from [<c017608c>]
(kobject_uevent_env+0x40c/0x450)
[<c017608c>] (kobject_uevent_env+0x40c/0x450) from [<c01754c8>]
(kset_register+0x3c/0x44)
[<c01754c8>] (kset_register+0x3c/0x44) from [<c01c44fc>]
(__class_register+0xb8/0x200)
[<c01c44fc>] (__class_register+0xb8/0x200) from [<c01c4698>]
(__class_create+0x54/0x7c)
[<c01c4698>] (__class_create+0x54/0x7c) from [<c04178d8>]
(bdi_class_init+0x18/0x58)
[<c04178d8>] (bdi_class_init+0x18/0x58) from [<c0008650>]
(do_one_initcall+0x34/0x178)
[<c0008650>] (do_one_initcall+0x34/0x178) from [<c040e280>]
(kernel_init+0x94/0x13c)
[<c040e280>] (kernel_init+0x94/0x13c) from [<c000f630>]
(kernel_thread_exit+0x0/0x8)
[...]

A flurry of these. So pre-smp initcalls are not the "guilty" ones. It
chokes a bit further down the drain. ipc_init is called way after the
drivers are registered:

c0436b78 T __initcall_start
c0436b78 T __setup_end
c0436b7c t __initcall_spawn_ksoftirqdearly
c0436b80 t __initcall_init_workqueuesearly
c0436b84 t __initcall_cpu_stop_initearly
c0436b88 t __initcall_rcu_scheduler_really_startedearly
c0436b8c T __early_initcall_end
c0436b8c t __initcall_init_mmap_min_addr0
c0436b90 t __initcall_net_ns_init0
c0436b94 t __initcall_ptrace_break_init1
c0436b98 t __initcall_consistent_init1
c0436b9c t __initcall_v6_userpage_init1
c0436ba0 t __initcall_alloc_frozen_cpus1
c0436ba4 t __initcall_sysctl_init1
c0436ba8 t __initcall_ksysfs_init1
c0436bac t __initcall_init_jiffies_clocksource1
c0436bb0 t __initcall_pm_init1
c0436bb4 t __initcall_init_zero_pfn1
c0436bb8 t __initcall_fsnotify_init1
c0436bbc t __initcall_filelock_init1
c0436bc0 t __initcall_init_script_binfmt1
c0436bc4 t __initcall_init_elf_binfmt1
c0436bc8 t __initcall_debugfs_init1
c0436bcc t __initcall_random32_init1
c0436bd0 t __initcall_sock_init1
c0436bd4 t __initcall_netlink_proto_init1
c0436bd8 t __initcall_bdi_class_init2
c0436bdc t __initcall_kobject_uevent_init2
c0436be0 t __initcall_amba_init2
c0436be4 t __initcall_tty_class_init2
c0436be8 t __initcall_vtconsole_class_init2
c0436bec t __initcall_wakeup_sources_debugfs_init2
c0436bf0 t __initcall_customize_machine3
c0436bf4 t __initcall_arch_hw_breakpoint_init3
c0436bf8 t __initcall_exceptions_init3
c0436bfc t __initcall_realview_i2c_init3
c0436c00 t __initcall_pl011_init3
c0436c04 t __initcall_topology_init4
c0436c08 t __initcall_param_sysfs_init4
c0436c0c t __initcall_pm_sysrq_init4
c0436c10 t __initcall_default_bdi_init4
c0436c14 t __initcall_init_bio4
c0436c18 t __initcall_fsnotify_notification_init4
c0436c1c t __initcall_blk_settings_init4
c0436c20 t __initcall_blk_ioc_init4
c0436c24 t __initcall_blk_softirq_init4
c0436c28 t __initcall_blk_iopoll_setup4
c0436c2c t __initcall_genhd_device_init4
c0436c30 t __initcall_fbmem_init4
c0436c34 t __initcall_misc_init4
c0436c38 t __initcall_init_scsi4
c0436c3c t __initcall_ata_init4
c0436c40 t __initcall_phy_init4
c0436c44 t __initcall_usb_init4
c0436c48 t __initcall_serio_init4
c0436c4c t __initcall_input_init4
c0436c50 t __initcall_rtc_init4
c0436c54 t __initcall_init_soundcore4
c0436c58 t __initcall_alsa_sound_init4
c0436c5c t __initcall_ac97_bus_init4
c0436c60 t __initcall_proto_init4
c0436c64 t __initcall_net_dev_init4
c0436c68 t __initcall_neigh_init4
c0436c6c t __initcall_genl_init4
c0436c70 t __initcall_sysctl_init4
c0436c74 t __initcall_proc_cpu_init5
c0436c78 t __initcall_dma_debug_do_init5
c0436c7c t __initcall_alignment_init5
c0436c80 t __initcall_clocksource_done_booting5
c0436c84 t __initcall_init_pipe_fs5
c0436c88 t __initcall_eventpoll_init5
c0436c8c t __initcall_anon_inode_init5
c0436c90 t __initcall_blk_scsi_ioctl_init5
c0436c94 t __initcall_chr_dev_init5
c0436c98 t __initcall_firmware_class_init5
c0436c9c t __initcall_sysctl_core_init5
c0436ca0 t __initcall_inet_init5
c0436ca4 t __initcall_af_unix_init5
c0436ca8 t __initcall_init_sunrpc5
c0436cac t __initcall_populate_rootfsrootfs
c0436cb0 t __initcall_timer_init_syscore_ops6
c0436cb4 t __initcall_register_pmu_driver6
c0436cb8 t __initcall_proc_execdomains_init6
c0436cbc t __initcall_ioresources_init6
c0436cc0 t __initcall_uid_cache_init6
c0436cc4 t __initcall_init_posix_timers6
c0436cc8 t __initcall_init_posix_cpu_timers6
c0436ccc t __initcall_timekeeping_init_ops6
c0436cd0 t __initcall_init_clocksource_sysfs6
c0436cd4 t __initcall_init_timer_list_procfs6
c0436cd8 t __initcall_alarmtimer_init6
c0436cdc t __initcall_futex_init6
c0436ce0 t __initcall_proc_modules_init6
c0436ce4 t __initcall_kallsyms_init6
c0436ce8 t __initcall_ikconfig_init6
c0436cec t __initcall_hung_task_init6
c0436cf0 t __initcall_utsname_sysctl_init6
c0436cf4 t __initcall_perf_event_sysfs_init6
c0436cf8 t __initcall_init_per_zone_wmark_min6
c0436cfc t __initcall_kswapd_init6
c0436d00 t __initcall_setup_vmstat6
c0436d04 t __initcall_mm_sysfs_init6
c0436d08 t __initcall_proc_vmalloc_init6
c0436d0c t __initcall_memblock_init_debugfs6
c0436d10 t __initcall_init_emergency_pool6
c0436d14 t __initcall_procswaps_init6
c0436d18 t __initcall_slab_proc_init6
c0436d1c t __initcall_slab_sysfs_init6
c0436d20 t __initcall_fcntl_init6
c0436d24 t __initcall_proc_filesystems_init6
c0436d28 t __initcall_fsnotify_mark_init6
c0436d2c t __initcall_dnotify_init6
c0436d30 t __initcall_inotify_user_setup6
c0436d34 t __initcall_aio_setup6
c0436d38 t __initcall_proc_locks_init6
c0436d3c t __initcall_proc_cmdline_init6
c0436d40 t __initcall_proc_consoles_init6
c0436d44 t __initcall_proc_cpuinfo_init6
c0436d48 t __initcall_proc_devices_init6
c0436d4c t __initcall_proc_interrupts_init6
c0436d50 t __initcall_proc_loadavg_init6
c0436d54 t __initcall_proc_meminfo_init6
c0436d58 t __initcall_proc_stat_init6
c0436d5c t __initcall_proc_uptime_init6
c0436d60 t __initcall_proc_version_init6
c0436d64 t __initcall_proc_softirqs_init6
c0436d68 t __initcall_proc_kmsg_init6
c0436d6c t __initcall_proc_page_init6
c0436d70 t __initcall_init_devpts_fs6
c0436d74 t __initcall_init_ext3_fs6
c0436d78 t __initcall_init_ext2_fs6
c0436d7c t __initcall_journal_init6
c0436d80 t __initcall_init_cramfs_fs6
c0436d84 t __initcall_init_ramfs_fs6
c0436d88 t __initcall_init_fat_fs6
c0436d8c t __initcall_init_vfat_fs6
c0436d90 t __initcall_init_nfs_fs6
c0436d94 t __initcall_init_nlm6
c0436d98 t __initcall_init_nls_cp4376
c0436d9c t __initcall_init_nls_iso8859_16
c0436da0 t __initcall_ipc_init6

Reordering the initcalls seems the easiest solution, but it is still very
fragile...

        M.
-- 
Fast, cheap, reliable. Pick two.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-03  8:05             ` [kernel-hardening] " Marc Zyngier
@ 2011-08-03  8:19               ` Linus Torvalds
  -1 siblings, 0 replies; 53+ messages in thread
From: Linus Torvalds @ 2011-08-03  8:19 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Andrew Morton, Vasiliy Kulikov, Manuel Lauss, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney

On Tue, Aug 2, 2011 at 10:05 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>
> Reordering the initcalls seems the easiest solution, but it is still very
> fragile...

So that's what I tried to do, by making it a "pure_initcall()". Even
that didn't seem to be enough according to Manuel.

Can you try my patch (that makes just that ipc ns init be a
pure_initcall(), together with your hack on top of Andrew's? What is
it that happens so early that even pure_initcall() hasn't been done
yet?

             Linus

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-03  8:19               ` Linus Torvalds
  0 siblings, 0 replies; 53+ messages in thread
From: Linus Torvalds @ 2011-08-03  8:19 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Andrew Morton, Vasiliy Kulikov, Manuel Lauss, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney

On Tue, Aug 2, 2011 at 10:05 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>
> Reordering the initcalls seems the easiest solution, but it is still very
> fragile...

So that's what I tried to do, by making it a "pure_initcall()". Even
that didn't seem to be enough according to Manuel.

Can you try my patch (that makes just that ipc ns init be a
pure_initcall(), together with your hack on top of Andrew's? What is
it that happens so early that even pure_initcall() hasn't been done
yet?

             Linus

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-03  8:19               ` [kernel-hardening] " Linus Torvalds
@ 2011-08-03 10:04                 ` Manuel Lauss
  -1 siblings, 0 replies; 53+ messages in thread
From: Manuel Lauss @ 2011-08-03 10:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Marc Zyngier, Andrew Morton, Vasiliy Kulikov, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney

On Wed, Aug 3, 2011 at 10:19 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Aug 2, 2011 at 10:05 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>
>> Reordering the initcalls seems the easiest solution, but it is still very
>> fragile...
>
> So that's what I tried to do, by making it a "pure_initcall()". Even
> that didn't seem to be enough according to Manuel.
>
> Can you try my patch (that makes just that ipc ns init be a
> pure_initcall(), together with your hack on top of Andrew's? What is
> it that happens so early that even pure_initcall() hasn't been done
> yet?

I stuck a few printk's in the init path, the first "schedule()" in
init/main.c::rest_init()
starts it.  I don't know enough kernel (yet) to trace it further, sorry.

Manuel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-03 10:04                 ` Manuel Lauss
  0 siblings, 0 replies; 53+ messages in thread
From: Manuel Lauss @ 2011-08-03 10:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Marc Zyngier, Andrew Morton, Vasiliy Kulikov, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney

On Wed, Aug 3, 2011 at 10:19 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Aug 2, 2011 at 10:05 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>
>> Reordering the initcalls seems the easiest solution, but it is still very
>> fragile...
>
> So that's what I tried to do, by making it a "pure_initcall()". Even
> that didn't seem to be enough according to Manuel.
>
> Can you try my patch (that makes just that ipc ns init be a
> pure_initcall(), together with your hack on top of Andrew's? What is
> it that happens so early that even pure_initcall() hasn't been done
> yet?

I stuck a few printk's in the init path, the first "schedule()" in
init/main.c::rest_init()
starts it.  I don't know enough kernel (yet) to trace it further, sorry.

Manuel

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-03  8:19               ` [kernel-hardening] " Linus Torvalds
@ 2011-08-03 10:30                 ` Marc Zyngier
  -1 siblings, 0 replies; 53+ messages in thread
From: Marc Zyngier @ 2011-08-03 10:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Vasiliy Kulikov, Manuel Lauss, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney

On Wed, 3 Aug 2011 09:19:24 +0100
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Aug 2, 2011 at 10:05 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >
> > Reordering the initcalls seems the easiest solution, but it is still very
> > fragile...
> 
> So that's what I tried to do, by making it a "pure_initcall()". Even
> that didn't seem to be enough according to Manuel.
> 
> Can you try my patch (that makes just that ipc ns init be a
> pure_initcall(), together with your hack on top of Andrew's? What is
> it that happens so early that even pure_initcall() hasn't been done
> yet?

Here you go:

SMP: Total of 4 processors activated (334.95 BogoMIPS).
[<c0014754>] (unwind_backtrace+0x0/0xf4) from [<c003ed78>] (call_usermodehelper_exec+0x108/0x118)
[<c003ed78>] (call_usermodehelper_exec+0x108/0x118) from [<c01760ac>] (kobject_uevent_env+0x40c/0x450)
[<c01760ac>] (kobject_uevent_env+0x40c/0x450) from [<c01754e8>] (kset_register+0x3c/0x44)
[<c01754e8>] (kset_register+0x3c/0x44) from [<c01c2ed0>] (bus_register+0xac/0x28c)
[<c01c2ed0>] (bus_register+0xac/0x28c) from [<c042227c>] (platform_bus_init+0x20/0x40)
[<c042227c>] (platform_bus_init+0x20/0x40) from [<c04222f4>] (driver_init+0x18/0x24)
[<c04222f4>] (driver_init+0x18/0x24) from [<c040e270>] (kernel_init+0x84/0x13c)
[<c040e270>] (kernel_init+0x84/0x13c) from [<c000f630>] (kernel_thread_exit+0x0/0x8)
[<c0014754>] (unwind_backtrace+0x0/0xf4) from [<c003ed78>] (call_usermodehelper_exec+0x108/0x118)
[<c003ed78>] (call_usermodehelper_exec+0x108/0x118) from [<c01760ac>] (kobject_uevent_env+0x40c/0x450)
[<c01760ac>] (kobject_uevent_env+0x40c/0x450) from [<c01754e8>] (kset_register+0x3c/0x44)
[<c01754e8>] (kset_register+0x3c/0x44) from [<c01c1f0c>] (sysdev_class_register+0x60/0x8c)
[<c01c1f0c>] (sysdev_class_register+0x60/0x8c) from [<c040e270>] (kernel_init+0x84/0x13c)
[<c040e270>] (kernel_init+0x84/0x13c) from [<c000f630>] (kernel_thread_exit+0x0/0x8)
NET: Registered protocol family 16
L2x0 series cache controller enabled

driver_init() is called from do_basic_setup(), before any initcall...

	M.
-- 
I'm the slime oozin' out from your TV set...


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-03 10:30                 ` Marc Zyngier
  0 siblings, 0 replies; 53+ messages in thread
From: Marc Zyngier @ 2011-08-03 10:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Vasiliy Kulikov, Manuel Lauss, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney

On Wed, 3 Aug 2011 09:19:24 +0100
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Aug 2, 2011 at 10:05 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> >
> > Reordering the initcalls seems the easiest solution, but it is still very
> > fragile...
> 
> So that's what I tried to do, by making it a "pure_initcall()". Even
> that didn't seem to be enough according to Manuel.
> 
> Can you try my patch (that makes just that ipc ns init be a
> pure_initcall(), together with your hack on top of Andrew's? What is
> it that happens so early that even pure_initcall() hasn't been done
> yet?

Here you go:

SMP: Total of 4 processors activated (334.95 BogoMIPS).
[<c0014754>] (unwind_backtrace+0x0/0xf4) from [<c003ed78>] (call_usermodehelper_exec+0x108/0x118)
[<c003ed78>] (call_usermodehelper_exec+0x108/0x118) from [<c01760ac>] (kobject_uevent_env+0x40c/0x450)
[<c01760ac>] (kobject_uevent_env+0x40c/0x450) from [<c01754e8>] (kset_register+0x3c/0x44)
[<c01754e8>] (kset_register+0x3c/0x44) from [<c01c2ed0>] (bus_register+0xac/0x28c)
[<c01c2ed0>] (bus_register+0xac/0x28c) from [<c042227c>] (platform_bus_init+0x20/0x40)
[<c042227c>] (platform_bus_init+0x20/0x40) from [<c04222f4>] (driver_init+0x18/0x24)
[<c04222f4>] (driver_init+0x18/0x24) from [<c040e270>] (kernel_init+0x84/0x13c)
[<c040e270>] (kernel_init+0x84/0x13c) from [<c000f630>] (kernel_thread_exit+0x0/0x8)
[<c0014754>] (unwind_backtrace+0x0/0xf4) from [<c003ed78>] (call_usermodehelper_exec+0x108/0x118)
[<c003ed78>] (call_usermodehelper_exec+0x108/0x118) from [<c01760ac>] (kobject_uevent_env+0x40c/0x450)
[<c01760ac>] (kobject_uevent_env+0x40c/0x450) from [<c01754e8>] (kset_register+0x3c/0x44)
[<c01754e8>] (kset_register+0x3c/0x44) from [<c01c1f0c>] (sysdev_class_register+0x60/0x8c)
[<c01c1f0c>] (sysdev_class_register+0x60/0x8c) from [<c040e270>] (kernel_init+0x84/0x13c)
[<c040e270>] (kernel_init+0x84/0x13c) from [<c000f630>] (kernel_thread_exit+0x0/0x8)
NET: Registered protocol family 16
L2x0 series cache controller enabled

driver_init() is called from do_basic_setup(), before any initcall...

	M.
-- 
I'm the slime oozin' out from your TV set...

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-03 10:30                 ` [kernel-hardening] " Marc Zyngier
  (?)
@ 2011-08-03 13:13                 ` Thadeu Lima de Souza Cascardo
  2011-08-03 13:33                   ` Kay Sievers
  -1 siblings, 1 reply; 53+ messages in thread
From: Thadeu Lima de Souza Cascardo @ 2011-08-03 13:13 UTC (permalink / raw)
  To: linux-kernel
  Cc: marc.zyngier, manuel.lauss, torvalds, akpm, segoon, richard,
	gregkh, kay.sievers

[-- Attachment #1: Type: text/plain, Size: 416 bytes --]

As Marc Zyngier has pointed out, the culprit is driver_init, which
calls devices_init. I'd say other calls from driver_init also call into
kset and kobject, which will dispatch uevent helper.

I tried to reproduce the problem using UML and only was successful after
setting CONFIG_UEVENT_HELPER_PATH. I guess we could obsolete this
feature and plan its removal soon. Is anybody still using this?

Regards,
Cascardo.

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-03 13:13                 ` Thadeu Lima de Souza Cascardo
@ 2011-08-03 13:33                   ` Kay Sievers
  2011-08-03 13:45                     ` Richard Weinberger
  0 siblings, 1 reply; 53+ messages in thread
From: Kay Sievers @ 2011-08-03 13:33 UTC (permalink / raw)
  To: Thadeu Lima de Souza Cascardo
  Cc: linux-kernel, marc.zyngier, manuel.lauss, torvalds, akpm, segoon,
	richard, gregkh

On Wed, Aug 3, 2011 at 15:13, Thadeu Lima de Souza Cascardo
<cascardo@minaslivre.org> wrote:
> As Marc Zyngier has pointed out, the culprit is driver_init, which
> calls devices_init. I'd say other calls from driver_init also call into
> kset and kobject, which will dispatch uevent helper.
>
> I tried to reproduce the problem using UML and only was successful after
> setting CONFIG_UEVENT_HELPER_PATH. I guess we could obsolete this
> feature and plan its removal soon. Is anybody still using this?

Forking binaries from the kernel for frequent events is something
fundamentally wrong to do. It is not even rate-limited or has any
upper bounds. It is known to create out-of-memory situations on
machines with many devices and tiny RAM. Configs that virtualizations
with many guests use pretty often. They can not even bootup with
/sbin/hotplug enabled then.

It would be nice to remove that broken thing, no common system or
distro uses it since quite some years now. But I wouldn't be surprised
if some people still use it for whatever hack they need locally.

Kay

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-03 13:33                   ` Kay Sievers
@ 2011-08-03 13:45                     ` Richard Weinberger
  0 siblings, 0 replies; 53+ messages in thread
From: Richard Weinberger @ 2011-08-03 13:45 UTC (permalink / raw)
  To: Kay Sievers
  Cc: Thadeu Lima de Souza Cascardo, linux-kernel, marc.zyngier,
	manuel.lauss, torvalds, akpm, segoon, gregkh

On Mittwoch 03 August 2011 15:33:27 Kay Sievers wrote:
> On Wed, Aug 3, 2011 at 15:13, Thadeu Lima de Souza Cascardo
> 
> <cascardo@minaslivre.org> wrote:
> > As Marc Zyngier has pointed out, the culprit is driver_init, which
> > calls devices_init. I'd say other calls from driver_init also call into
> > kset and kobject, which will dispatch uevent helper.
> > 
> > I tried to reproduce the problem using UML and only was successful after
> > setting CONFIG_UEVENT_HELPER_PATH. I guess we could obsolete this
> > feature and plan its removal soon. Is anybody still using this?
> 
> Forking binaries from the kernel for frequent events is something
> fundamentally wrong to do. It is not even rate-limited or has any
> upper bounds. It is known to create out-of-memory situations on
> machines with many devices and tiny RAM. Configs that virtualizations
> with many guests use pretty often. They can not even bootup with
> /sbin/hotplug enabled then.
> 
> It would be nice to remove that broken thing, no common system or
> distro uses it since quite some years now. But I wouldn't be surprised
> if some people still use it for whatever hack they need locally.
> 

True.

I'll remove CONFIG_UEVENT_HELPER_PATH from UML's defconfig.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-03  7:43           ` [kernel-hardening] " Linus Torvalds
@ 2011-08-03 19:33             ` Andrew Morton
  -1 siblings, 0 replies; 53+ messages in thread
From: Andrew Morton @ 2011-08-03 19:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vasiliy Kulikov, Manuel Lauss, linux-kernel, Richard Weinberger,
	Marc Zyngier, Ingo Molnar, kernel-hardening, Paul E. McKenney,
	Kay Sievers

On Tue, 2 Aug 2011 21:43:23 -1000
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Aug 2, 2011 at 2:45 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> >
> > From: Vasiliy Kulikov <segoon@openwall.com>
> > Subject: [PATCH] shm: fix a race between shm_exit() and shm_init()
> 
> This patch is disgusting.
> 

I think it's buggy too.

> 
> > +       /*
> > +        * For init_ipc_ns shm_ids().rw_mutex is statically initialized
> > +        * as kernel threads should be able to use it in do_exit() before
> > +        * shm_init(), which is called on do_initcall()
> > +        */
> > +       if (ns == &init_ipc_ns)
> > +                ipc_init_ids(&shm_ids(ns));
> > +       else
> > +               ipc_init_ids(&shm_ids(ns));

afacit init_ipc_ns.ids[0].rw_mutex and init_ipc_ns.ids[1].rw_mutex
never get initialised with this patch?


Still.  It seems that the real bug is that driver_init() is trying to
invoke userspace helpers before the kernel is ready to run userspace.

We could hack around it with something like

--- a/init/main.c~a
+++ a/init/main.c
@@ -108,6 +108,9 @@ bool early_boot_irqs_disabled __read_mos
 enum system_states system_state __read_mostly;
 EXPORT_SYMBOL(system_state);
 
+/* This gets set when the kernel is ready to execute usermode code */
+bool usermode_available = false;
+
 /*
  * Boot command-line arguments
  */
@@ -804,6 +807,8 @@ static int __init kernel_init(void * unu
 
 	do_basic_setup();
 
+	usermode_available = true;
+
 	/* Open the /dev/console on the rootfs, this should never fail */
 	if (sys_open((const char __user *) "/dev/console", O_RDWR, 0) < 0)
 		printk(KERN_WARNING "Warning: unable to open an initial console.\n");
diff -puN include/linux/init.h~a include/linux/init.h
--- a/include/linux/init.h~a
+++ a/include/linux/init.h
@@ -149,6 +149,7 @@ extern int do_one_initcall(initcall_t fn
 extern char __initdata boot_command_line[];
 extern char *saved_command_line;
 extern unsigned int reset_devices;
+extern bool usermode_available;
 
 /* used by init/main.c */
 void setup_arch(char **);
diff -puN kernel/kmod.c~a kernel/kmod.c
--- a/kernel/kmod.c~a
+++ a/kernel/kmod.c
@@ -418,6 +418,9 @@ int call_usermodehelper_exec(struct subp
 	DECLARE_COMPLETION_ONSTACK(done);
 	int retval = 0;
 
+	if (usermode_available == false)
+		return -EINVAL;
+
 	helper_lock();
 	if (sub_info->path[0] == '\0')
 		goto out;
_


(which I think deserves a big fat WARN_ON() when it triggers)

But it would be better to fix driver_init() for real.

Save us, Kay!

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-03 19:33             ` Andrew Morton
  0 siblings, 0 replies; 53+ messages in thread
From: Andrew Morton @ 2011-08-03 19:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Vasiliy Kulikov, Manuel Lauss, linux-kernel, Richard Weinberger,
	Marc Zyngier, Ingo Molnar, kernel-hardening, Paul E. McKenney,
	Kay Sievers

On Tue, 2 Aug 2011 21:43:23 -1000
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Aug 2, 2011 at 2:45 AM, Vasiliy Kulikov <segoon@openwall.com> wrote:
> >
> > From: Vasiliy Kulikov <segoon@openwall.com>
> > Subject: [PATCH] shm: fix a race between shm_exit() and shm_init()
> 
> This patch is disgusting.
> 

I think it's buggy too.

> 
> > +       /*
> > +        * For init_ipc_ns shm_ids().rw_mutex is statically initialized
> > +        * as kernel threads should be able to use it in do_exit() before
> > +        * shm_init(), which is called on do_initcall()
> > +        */
> > +       if (ns == &init_ipc_ns)
> > +                ipc_init_ids(&shm_ids(ns));
> > +       else
> > +               ipc_init_ids(&shm_ids(ns));

afacit init_ipc_ns.ids[0].rw_mutex and init_ipc_ns.ids[1].rw_mutex
never get initialised with this patch?


Still.  It seems that the real bug is that driver_init() is trying to
invoke userspace helpers before the kernel is ready to run userspace.

We could hack around it with something like

--- a/init/main.c~a
+++ a/init/main.c
@@ -108,6 +108,9 @@ bool early_boot_irqs_disabled __read_mos
 enum system_states system_state __read_mostly;
 EXPORT_SYMBOL(system_state);
 
+/* This gets set when the kernel is ready to execute usermode code */
+bool usermode_available = false;
+
 /*
  * Boot command-line arguments
  */
@@ -804,6 +807,8 @@ static int __init kernel_init(void * unu
 
 	do_basic_setup();
 
+	usermode_available = true;
+
 	/* Open the /dev/console on the rootfs, this should never fail */
 	if (sys_open((const char __user *) "/dev/console", O_RDWR, 0) < 0)
 		printk(KERN_WARNING "Warning: unable to open an initial console.\n");
diff -puN include/linux/init.h~a include/linux/init.h
--- a/include/linux/init.h~a
+++ a/include/linux/init.h
@@ -149,6 +149,7 @@ extern int do_one_initcall(initcall_t fn
 extern char __initdata boot_command_line[];
 extern char *saved_command_line;
 extern unsigned int reset_devices;
+extern bool usermode_available;
 
 /* used by init/main.c */
 void setup_arch(char **);
diff -puN kernel/kmod.c~a kernel/kmod.c
--- a/kernel/kmod.c~a
+++ a/kernel/kmod.c
@@ -418,6 +418,9 @@ int call_usermodehelper_exec(struct subp
 	DECLARE_COMPLETION_ONSTACK(done);
 	int retval = 0;
 
+	if (usermode_available == false)
+		return -EINVAL;
+
 	helper_lock();
 	if (sub_info->path[0] == '\0')
 		goto out;
_


(which I think deserves a big fat WARN_ON() when it triggers)

But it would be better to fix driver_init() for real.

Save us, Kay!

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-03 19:33             ` [kernel-hardening] " Andrew Morton
  (?)
@ 2011-08-03 19:52             ` Vasiliy Kulikov
  -1 siblings, 0 replies; 53+ messages in thread
From: Vasiliy Kulikov @ 2011-08-03 19:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linus Torvalds, Manuel Lauss, linux-kernel, Richard Weinberger,
	Marc Zyngier, Ingo Molnar, kernel-hardening, Paul E. McKenney,
	Kay Sievers

On Wed, Aug 03, 2011 at 12:33 -0700, Andrew Morton wrote:
> > > +       /*
> > > +        * For init_ipc_ns shm_ids().rw_mutex is statically initialized
> > > +        * as kernel threads should be able to use it in do_exit() before
> > > +        * shm_init(), which is called on do_initcall()
> > > +        */
> > > +       if (ns == &init_ipc_ns)
> > > +                ipc_init_ids(&shm_ids(ns));
> > > +       else
> > > +               ipc_init_ids(&shm_ids(ns));
> 
> afacit init_ipc_ns.ids[0].rw_mutex and init_ipc_ns.ids[1].rw_mutex
> never get initialised with this patch?

No, these .rw_mutex are initialized in runtime, as before.


This patch should fix the specific oops (not a dependency issue):

https://lkml.org/lkml/2011/8/3/256


> Still.  It seems that the real bug is that driver_init() is trying to
> invoke userspace helpers before the kernel is ready to run userspace.

What if declare a completion, trigger it after all ns init code is
finished, and wait on the completion inside of
call_usermodehelper_exec()?

Thanks,

-- 
Vasiliy Kulikov
http://www.openwall.com - bringing security into open computing environments

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-03 10:30                 ` [kernel-hardening] " Marc Zyngier
@ 2011-08-04  0:35                   ` Linus Torvalds
  -1 siblings, 0 replies; 53+ messages in thread
From: Linus Torvalds @ 2011-08-04  0:35 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Andrew Morton, Vasiliy Kulikov, Manuel Lauss, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney

[-- Attachment #1: Type: text/plain, Size: 683 bytes --]

On Wed, Aug 3, 2011 at 12:30 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>
> driver_init() is called from do_basic_setup(), before any initcall...

Ok, thanks everybody. I think the proper fix is the attached patch.
It's crazy to have usermode helpers enabled early during boot when
init etc haven't even been set up.

So does this finally fix the problem (I realize that there are other
patches floating around that *also* fix it, so please test this
without those other patches).

I'm planning on also applying the patch that optimizes the case of no
shm attaches, which would also hide the oops, but I think this
attached patch is the RightThing(tm).

                   Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 1110 bytes --]

 init/main.c   |    5 ++++-
 kernel/kmod.c |    2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/init/main.c b/init/main.c
index 1952d37e4ecb..9c51ee7adf3d 100644
--- a/init/main.c
+++ b/init/main.c
@@ -369,9 +369,12 @@ static noinline void __init_refok rest_init(void)
 	init_idle_bootup_task(current);
 	preempt_enable_no_resched();
 	schedule();
-	preempt_disable();
+
+	/* At this point, we can enable user mode helper functionality */
+	usermodehelper_enable();
 
 	/* Call into cpu_idle with preempt disabled */
+	preempt_disable();
 	cpu_idle();
 }
 
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 47613dfb7b28..ddc7644c1305 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -274,7 +274,7 @@ static void __call_usermodehelper(struct work_struct *work)
  * (used for preventing user land processes from being created after the user
  * land has been frozen during a system-wide hibernation or suspend operation).
  */
-static int usermodehelper_disabled;
+static int usermodehelper_disabled = 1;
 
 /* Number of helpers running */
 static atomic_t running_helpers = ATOMIC_INIT(0);

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-04  0:35                   ` Linus Torvalds
  0 siblings, 0 replies; 53+ messages in thread
From: Linus Torvalds @ 2011-08-04  0:35 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Andrew Morton, Vasiliy Kulikov, Manuel Lauss, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney

[-- Attachment #1: Type: text/plain, Size: 683 bytes --]

On Wed, Aug 3, 2011 at 12:30 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>
> driver_init() is called from do_basic_setup(), before any initcall...

Ok, thanks everybody. I think the proper fix is the attached patch.
It's crazy to have usermode helpers enabled early during boot when
init etc haven't even been set up.

So does this finally fix the problem (I realize that there are other
patches floating around that *also* fix it, so please test this
without those other patches).

I'm planning on also applying the patch that optimizes the case of no
shm attaches, which would also hide the oops, but I think this
attached patch is the RightThing(tm).

                   Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 1110 bytes --]

 init/main.c   |    5 ++++-
 kernel/kmod.c |    2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/init/main.c b/init/main.c
index 1952d37e4ecb..9c51ee7adf3d 100644
--- a/init/main.c
+++ b/init/main.c
@@ -369,9 +369,12 @@ static noinline void __init_refok rest_init(void)
 	init_idle_bootup_task(current);
 	preempt_enable_no_resched();
 	schedule();
-	preempt_disable();
+
+	/* At this point, we can enable user mode helper functionality */
+	usermodehelper_enable();
 
 	/* Call into cpu_idle with preempt disabled */
+	preempt_disable();
 	cpu_idle();
 }
 
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 47613dfb7b28..ddc7644c1305 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -274,7 +274,7 @@ static void __call_usermodehelper(struct work_struct *work)
  * (used for preventing user land processes from being created after the user
  * land has been frozen during a system-wide hibernation or suspend operation).
  */
-static int usermodehelper_disabled;
+static int usermodehelper_disabled = 1;
 
 /* Number of helpers running */
 static atomic_t running_helpers = ATOMIC_INIT(0);

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-04  0:35                   ` [kernel-hardening] " Linus Torvalds
@ 2011-08-04  0:50                     ` Andrew Morton
  -1 siblings, 0 replies; 53+ messages in thread
From: Andrew Morton @ 2011-08-04  0:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Marc Zyngier, Vasiliy Kulikov, Manuel Lauss, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney, Kay Sievers, Greg KH

On Wed, 3 Aug 2011 14:35:09 -1000 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> --- a/init/main.c
> +++ b/init/main.c
> @@ -369,9 +369,12 @@ static noinline void __init_refok rest_init(void)
>  	init_idle_bootup_task(current);
>  	preempt_enable_no_resched();
>  	schedule();
> -	preempt_disable();
> +
> +	/* At this point, we can enable user mode helper functionality */
> +	usermodehelper_enable();
>  
>  	/* Call into cpu_idle with preempt disabled */
> +	preempt_disable();
>  	cpu_idle();
>  }

Well, it's still a workaround.  We'll still have driver_init() trying
to run userspace helpers at an inappropriate time, and failing to do
so.  Either something will break or it should not be attempting
attempting to do this at all.

Perhaps Kay and Greg can suggest how we can fix all this up?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-04  0:50                     ` Andrew Morton
  0 siblings, 0 replies; 53+ messages in thread
From: Andrew Morton @ 2011-08-04  0:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Marc Zyngier, Vasiliy Kulikov, Manuel Lauss, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney, Kay Sievers, Greg KH

On Wed, 3 Aug 2011 14:35:09 -1000 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> --- a/init/main.c
> +++ b/init/main.c
> @@ -369,9 +369,12 @@ static noinline void __init_refok rest_init(void)
>  	init_idle_bootup_task(current);
>  	preempt_enable_no_resched();
>  	schedule();
> -	preempt_disable();
> +
> +	/* At this point, we can enable user mode helper functionality */
> +	usermodehelper_enable();
>  
>  	/* Call into cpu_idle with preempt disabled */
> +	preempt_disable();
>  	cpu_idle();
>  }

Well, it's still a workaround.  We'll still have driver_init() trying
to run userspace helpers at an inappropriate time, and failing to do
so.  Either something will break or it should not be attempting
attempting to do this at all.

Perhaps Kay and Greg can suggest how we can fix all this up?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-04  0:50                     ` [kernel-hardening] " Andrew Morton
@ 2011-08-04  1:01                       ` Linus Torvalds
  -1 siblings, 0 replies; 53+ messages in thread
From: Linus Torvalds @ 2011-08-04  1:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Marc Zyngier, Vasiliy Kulikov, Manuel Lauss, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney, Kay Sievers, Greg KH

On Wed, Aug 3, 2011 at 2:50 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
>
> Well, it's still a workaround.  We'll still have driver_init() trying
> to run userspace helpers at an inappropriate time, and failing to do
> so.  Either something will break or it should not be attempting
> attempting to do this at all.

Well, it has never worked before either, so I wouldn't call it a
"workaround", more of a "insane subsystems do user-mode helpers for
crazy small things, and at crazy times. Stop them from wasting our
time and effort at those crazy times when we know they would fail
anyway".

> Perhaps Kay and Greg can suggest how we can fix all this up?

I think just not calling user-mode when user-mode isn't ready is a real fix.

The fact that the device models send an absolutely *insane* amount of
events for everything, and don't track "this is the bootup device
scan" on their own is kind of sad, but it's how they roll. This fixes
it at a core level, so that the device layer doesn't have to track the
"am I booting, or is this a dynamic event?" at all.

               Linus

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-04  1:01                       ` Linus Torvalds
  0 siblings, 0 replies; 53+ messages in thread
From: Linus Torvalds @ 2011-08-04  1:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Marc Zyngier, Vasiliy Kulikov, Manuel Lauss, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney, Kay Sievers, Greg KH

On Wed, Aug 3, 2011 at 2:50 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
>
> Well, it's still a workaround.  We'll still have driver_init() trying
> to run userspace helpers at an inappropriate time, and failing to do
> so.  Either something will break or it should not be attempting
> attempting to do this at all.

Well, it has never worked before either, so I wouldn't call it a
"workaround", more of a "insane subsystems do user-mode helpers for
crazy small things, and at crazy times. Stop them from wasting our
time and effort at those crazy times when we know they would fail
anyway".

> Perhaps Kay and Greg can suggest how we can fix all this up?

I think just not calling user-mode when user-mode isn't ready is a real fix.

The fact that the device models send an absolutely *insane* amount of
events for everything, and don't track "this is the bootup device
scan" on their own is kind of sad, but it's how they roll. This fixes
it at a core level, so that the device layer doesn't have to track the
"am I booting, or is this a dynamic event?" at all.

               Linus

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-04  1:01                       ` [kernel-hardening] " Linus Torvalds
@ 2011-08-04  1:15                         ` Kay Sievers
  -1 siblings, 0 replies; 53+ messages in thread
From: Kay Sievers @ 2011-08-04  1:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Marc Zyngier, Vasiliy Kulikov, Manuel Lauss,
	linux-kernel, Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney, Greg KH

On Thu, Aug 4, 2011 at 03:01, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Aug 3, 2011 at 2:50 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
>>
>> Well, it's still a workaround.  We'll still have driver_init() trying
>> to run userspace helpers at an inappropriate time, and failing to do
>> so.  Either something will break or it should not be attempting
>> attempting to do this at all.
>
> Well, it has never worked before either, so I wouldn't call it a
> "workaround", more of a "insane subsystems do user-mode helpers for
> crazy small things, and at crazy times. Stop them from wasting our
> time and effort at those crazy times when we know they would fail
> anyway".
>
>> Perhaps Kay and Greg can suggest how we can fix all this up?
>
> I think just not calling user-mode when user-mode isn't ready is a real fix.

Yes, we should default to off, and probably only enable it just right
before we start /sbin/init from the mounted root or in initramfs.
There should be no useful use of /sbin/hotplug before we mounted root,
or unpacked initramfs.

> The fact that the device models send an absolutely *insane* amount of
> events for everything, and don't track "this is the bootup device
> scan" on their own is kind of sad, but it's how they roll. This fixes
> it at a core level, so that the device layer doesn't have to track the
> "am I booting, or is this a dynamic event?" at all.

It is insane, and the entire uevent hookup with /sbin/hotplug should
just be removed. We send from 1000 to many 1000s of events at bootup,
and forking a binary here makes not much sense these days.

All distros default to CONFIG_UEVENT_HELPER_PATH="" because nothing
else really works anymore, some boxes don't even boot up.
/sbin/hotplug worked fine when we all had one USB mouse connected and
needed to load like 5 drivers for the system, but not today.

It's a broken concept today, at least in the way we plug it into the
driver core makes no sense today, it is just a heritage from the old
days.

Kay

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-04  1:15                         ` Kay Sievers
  0 siblings, 0 replies; 53+ messages in thread
From: Kay Sievers @ 2011-08-04  1:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Marc Zyngier, Vasiliy Kulikov, Manuel Lauss,
	linux-kernel, Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney, Greg KH

On Thu, Aug 4, 2011 at 03:01, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Aug 3, 2011 at 2:50 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
>>
>> Well, it's still a workaround.  We'll still have driver_init() trying
>> to run userspace helpers at an inappropriate time, and failing to do
>> so.  Either something will break or it should not be attempting
>> attempting to do this at all.
>
> Well, it has never worked before either, so I wouldn't call it a
> "workaround", more of a "insane subsystems do user-mode helpers for
> crazy small things, and at crazy times. Stop them from wasting our
> time and effort at those crazy times when we know they would fail
> anyway".
>
>> Perhaps Kay and Greg can suggest how we can fix all this up?
>
> I think just not calling user-mode when user-mode isn't ready is a real fix.

Yes, we should default to off, and probably only enable it just right
before we start /sbin/init from the mounted root or in initramfs.
There should be no useful use of /sbin/hotplug before we mounted root,
or unpacked initramfs.

> The fact that the device models send an absolutely *insane* amount of
> events for everything, and don't track "this is the bootup device
> scan" on their own is kind of sad, but it's how they roll. This fixes
> it at a core level, so that the device layer doesn't have to track the
> "am I booting, or is this a dynamic event?" at all.

It is insane, and the entire uevent hookup with /sbin/hotplug should
just be removed. We send from 1000 to many 1000s of events at bootup,
and forking a binary here makes not much sense these days.

All distros default to CONFIG_UEVENT_HELPER_PATH="" because nothing
else really works anymore, some boxes don't even boot up.
/sbin/hotplug worked fine when we all had one USB mouse connected and
needed to load like 5 drivers for the system, but not today.

It's a broken concept today, at least in the way we plug it into the
driver core makes no sense today, it is just a heritage from the old
days.

Kay

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
  2011-08-04  0:35                   ` [kernel-hardening] " Linus Torvalds
@ 2011-08-04  8:26                     ` Marc Zyngier
  -1 siblings, 0 replies; 53+ messages in thread
From: Marc Zyngier @ 2011-08-04  8:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Vasiliy Kulikov, Manuel Lauss, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney

On 04/08/11 01:35, Linus Torvalds wrote:
> On Wed, Aug 3, 2011 at 12:30 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>
>> driver_init() is called from do_basic_setup(), before any initcall...
> 
> Ok, thanks everybody. I think the proper fix is the attached patch.
> It's crazy to have usermode helpers enabled early during boot when
> init etc haven't even been set up.
> 
> So does this finally fix the problem (I realize that there are other
> patches floating around that *also* fix it, so please test this
> without those other patches).

This patch alone doesn't cure the problem... However, if combined with
the pure_initcall patch, it boots correctly.

I'll try to gather some traces later today.

	M.
-- 
Jazz is not dead. It just smells funny...


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [kernel-hardening] Re: [PATCH] shm: fix a race between shm_exit() and shm_init()
@ 2011-08-04  8:26                     ` Marc Zyngier
  0 siblings, 0 replies; 53+ messages in thread
From: Marc Zyngier @ 2011-08-04  8:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, Vasiliy Kulikov, Manuel Lauss, linux-kernel,
	Richard Weinberger, Ingo Molnar, kernel-hardening,
	Paul E. McKenney

On 04/08/11 01:35, Linus Torvalds wrote:
> On Wed, Aug 3, 2011 at 12:30 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>
>> driver_init() is called from do_basic_setup(), before any initcall...
> 
> Ok, thanks everybody. I think the proper fix is the attached patch.
> It's crazy to have usermode helpers enabled early during boot when
> init etc haven't even been set up.
> 
> So does this finally fix the problem (I realize that there are other
> patches floating around that *also* fix it, so please test this
> without those other patches).

This patch alone doesn't cure the problem... However, if combined with
the pure_initcall patch, it boots correctly.

I'll try to gather some traces later today.

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2011-08-04  8:26 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-01 18:01 initcall dependency problem (ns vs. threads) Vasiliy Kulikov
2011-08-01 18:01 ` [kernel-hardening] " Vasiliy Kulikov
2011-08-01 18:20 ` Andrew Morton
2011-08-01 18:20   ` [kernel-hardening] " Andrew Morton
2011-08-01 18:34   ` Vasiliy Kulikov
2011-08-01 19:03   ` Vasiliy Kulikov
2011-08-01 19:07     ` Andrew Morton
2011-08-01 19:22       ` Vasiliy Kulikov
2011-08-02  0:01     ` Linus Torvalds
2011-08-02 12:45       ` [PATCH] shm: fix a race between shm_exit() and shm_init() Vasiliy Kulikov
2011-08-02 12:45         ` [kernel-hardening] " Vasiliy Kulikov
2011-08-02 12:51         ` Manuel Lauss
2011-08-02 12:51           ` [kernel-hardening] " Manuel Lauss
2011-08-02 13:23         ` Richard Weinberger
2011-08-02 13:23           ` [kernel-hardening] " Richard Weinberger
2011-08-02 13:29         ` Marc Zyngier
2011-08-02 13:29           ` [kernel-hardening] " Marc Zyngier
2011-08-02 20:33         ` Andrew Morton
2011-08-02 20:33           ` [kernel-hardening] " Andrew Morton
2011-08-02 20:55         ` Andrew Morton
2011-08-02 20:55           ` [kernel-hardening] " Andrew Morton
2011-08-03  5:30           ` Manuel Lauss
2011-08-03  5:30             ` [kernel-hardening] " Manuel Lauss
2011-08-03  8:05           ` Marc Zyngier
2011-08-03  8:05             ` [kernel-hardening] " Marc Zyngier
2011-08-03  8:19             ` Linus Torvalds
2011-08-03  8:19               ` [kernel-hardening] " Linus Torvalds
2011-08-03 10:04               ` Manuel Lauss
2011-08-03 10:04                 ` [kernel-hardening] " Manuel Lauss
2011-08-03 10:30               ` Marc Zyngier
2011-08-03 10:30                 ` [kernel-hardening] " Marc Zyngier
2011-08-03 13:13                 ` Thadeu Lima de Souza Cascardo
2011-08-03 13:33                   ` Kay Sievers
2011-08-03 13:45                     ` Richard Weinberger
2011-08-04  0:35                 ` Linus Torvalds
2011-08-04  0:35                   ` [kernel-hardening] " Linus Torvalds
2011-08-04  0:50                   ` Andrew Morton
2011-08-04  0:50                     ` [kernel-hardening] " Andrew Morton
2011-08-04  1:01                     ` Linus Torvalds
2011-08-04  1:01                       ` [kernel-hardening] " Linus Torvalds
2011-08-04  1:15                       ` Kay Sievers
2011-08-04  1:15                         ` [kernel-hardening] " Kay Sievers
2011-08-04  8:26                   ` Marc Zyngier
2011-08-04  8:26                     ` [kernel-hardening] " Marc Zyngier
2011-08-03  7:43         ` Linus Torvalds
2011-08-03  7:43           ` [kernel-hardening] " Linus Torvalds
2011-08-03  7:50           ` Manuel Lauss
2011-08-03  7:50             ` [kernel-hardening] " Manuel Lauss
2011-08-03  8:00             ` Manuel Lauss
2011-08-03  8:00               ` [kernel-hardening] " Manuel Lauss
2011-08-03 19:33           ` Andrew Morton
2011-08-03 19:33             ` [kernel-hardening] " Andrew Morton
2011-08-03 19:52             ` Vasiliy Kulikov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.