All of lore.kernel.org
 help / color / mirror / Atom feed
* [BUG] __copy_to_user_inatomic broken on non Pentium machines
@ 2007-03-25  8:07 Thomas Gleixner
  2007-03-25 18:14 ` Linus Torvalds
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Gleixner @ 2007-03-25  8:07 UTC (permalink / raw)
  To: LKML
  Cc: Stable Kernel Team, Linus Torvalds, Ingo Molnar, Andrew Morton,
	Adrian Bunk

Environment: Pre Pentium systems, (boot_cpu_data.wp_works_ok == 0)
Last known working kernel: 2.6.18 (did not try 2.6.19 yet)

Enabling CONFIG_PREEMPT on latest mainline as well as 2.6.20 trigger

[   14.150000] BUG: sleeping function called from invalid context at /home/tglx/work/kernel/vanilla/linux-2.6.20/kernel/rwsem.c:20
[   14.160000] in_atomic():1, irqs_disabled():0
[   14.160000] no locks held by init/1.
[   14.170000]  [<c0103346>] show_trace_log_lvl+0x1a/0x2f
[   14.180000]  [<c0103441>] show_trace+0x12/0x14
[   14.190000]  [<c0103cf5>] dump_stack+0x16/0x18
[   14.190000]  [<c010aa62>] __might_sleep+0xc7/0xcd
[   14.200000]  [<c01213a1>] down_read+0x18/0x47
[   14.210000]  [<c01a01e4>] __copy_to_user_ll+0x5e/0x1b6
[   14.220000]  [<c012cf85>] file_read_actor+0x10b/0x149
[   14.230000]  [<c012d7b2>] do_generic_mapping_read+0x187/0x433
[   14.240000]  [<c012f64b>] generic_file_aio_read+0x191/0x1ca
[   14.240000]  [<c0141657>] do_sync_read+0xc2/0xff
[   14.250000]  [<c0141eb6>] vfs_read+0x90/0x145
[   14.260000]  [<c014227e>] sys_read+0x3f/0x63
[   14.270000]  [<c0102fb0>] syscall_call+0x7/0xb
[   14.270000]  =======================

and 

[   22.660000] BUG: scheduling while atomic: e2fsck/0x10000001/272
[   22.670000] 1 lock held by e2fsck/272:
[   22.680000]  #0:  (&mm->mmap_sem){----}, at: [<c01a01e4>] __copy_to_user_ll+0x5e/0x1b6
[   22.690000]  [<c0103346>] show_trace_log_lvl+0x1a/0x2f
[   22.700000]  [<c0103441>] show_trace+0x12/0x14
[   22.710000]  [<c0103cf5>] dump_stack+0x16/0x18
[   22.720000]  [<c024a189>] __sched_text_start+0x71/0x57f
[   22.720000]  [<c010b49f>] __cond_resched+0x21/0x3b
[   22.730000]  [<c024aca7>] cond_resched+0x26/0x31
[   22.740000]  [<c0137ae5>] get_user_pages+0x1e1/0x23c
[   22.750000]  [<c01a021e>] __copy_to_user_ll+0x98/0x1b6
[   22.760000]  [<c012cf85>] file_read_actor+0x10b/0x149
[   22.770000]  [<c012d7b2>] do_generic_mapping_read+0x187/0x433
[   22.780000]  [<c012f64b>] generic_file_aio_read+0x191/0x1ca
[   22.790000]  [<c0141657>] do_sync_read+0xc2/0xff
[   22.790000]  [<c0141eb6>] vfs_read+0x90/0x145
[   22.800000]  [<c014227e>] sys_read+0x3f/0x63
[   22.810000]  [<c0102fb0>] syscall_call+0x7/0xb
[   22.820000]  =======================

which is not surprising. 

int file_read_actor(read_descriptor_t *desc, struct page *page,
                        unsigned long offset, unsigned long size)
{
....
        /*
         * Faults on the destination of a read are common, so do it before
         * taking the kmap.
         */
        if (!fault_in_pages_writeable(desc->arg.buf, size)) {
                kaddr = kmap_atomic(page, KM_USER0);
---->           left = __copy_to_user_inatomic(desc->arg.buf,
                                                kaddr + offset, size);

	is called with preempt_count == 1, due to the kmap_atomic() above.

Now __copy_to_user_ll() takes the (boot_cpu_data.wp_works_ok == 0) path,
which in turn calls 

down_read(current->mm->mmap_sem) - which might sleep

and

get_user_pages() - which has a cond_resched() inside.

Not sure how to fix that.

	tglx



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] __copy_to_user_inatomic broken on non Pentium machines
  2007-03-25  8:07 [BUG] __copy_to_user_inatomic broken on non Pentium machines Thomas Gleixner
@ 2007-03-25 18:14 ` Linus Torvalds
  2007-03-25 18:30   ` Ingo Molnar
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Linus Torvalds @ 2007-03-25 18:14 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Stable Kernel Team, Ingo Molnar, Andrew Morton, Adrian Bunk



On Sun, 25 Mar 2007, Thomas Gleixner wrote:
>
> Environment: Pre Pentium systems, (boot_cpu_data.wp_works_ok == 0)

This shouldn't be "pre-pentium", afaik. WP-works-ok on i486 too. I think 
only the original i386 had this bug ("feature").

But I agree, it does seem to be broken on such machines (I assume you 
don't actually have one, but just tested by forcing it by hand ;)

> Now __copy_to_user_ll() takes the (boot_cpu_data.wp_works_ok == 0) path,
> which in turn calls 
> 
> down_read(current->mm->mmap_sem) - which might sleep
> 
> and
> 
> get_user_pages() - which has a cond_resched() inside.
> 
> Not sure how to fix that.

I agree. Nasty. But the thing is, it's actually much worse. We use 
"__put_user()" earlier to try to fault it in writably, and that one is 
totally broken on a CPU where wp_works_ok isn't set.

The whole notion that we should do this at access time is broken.

We should go back to doing it at "access_ok()", or we should just state 
that we don't support original-i386 CPU's any more. As it is, we don't do 
it right *anyway*, since we only do the tests properly in 
__copy_to_user(), and totally miss them in __put_user() and friends.

So it's buggy on i386 however you try to fix it. The only way to fix it 
properly is to move the i386 fixup early, into "access_ok()", the way it 
used to be. 

		Linus

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] __copy_to_user_inatomic broken on non Pentium machines
  2007-03-25 18:14 ` Linus Torvalds
@ 2007-03-25 18:30   ` Ingo Molnar
  2007-03-25 19:41   ` Thomas Gleixner
  2007-04-02 12:25   ` [PATCH] i386: fix file_read_actor() and pipe_read() for original i386 systems Thomas Gleixner
  2 siblings, 0 replies; 7+ messages in thread
From: Ingo Molnar @ 2007-03-25 18:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Thomas Gleixner, LKML, Stable Kernel Team, Andrew Morton, Adrian Bunk


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Sun, 25 Mar 2007, Thomas Gleixner wrote:
> >
> > Environment: Pre Pentium systems, (boot_cpu_data.wp_works_ok == 0)
> 
> This shouldn't be "pre-pentium", afaik. WP-works-ok on i486 too. I 
> think only the original i386 had this bug ("feature").
> 
> But I agree, it does seem to be broken on such machines (I assume you 
> don't actually have one, but just tested by forcing it by hand ;)

actually, AFAIK this is a genuine i386 box Thomas has (an embedded 
board). Our hardware legacies and the resulting dependencies _really_ 
stick around for quite long time :-/

	Ingo

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] __copy_to_user_inatomic broken on non Pentium machines
  2007-03-25 18:14 ` Linus Torvalds
  2007-03-25 18:30   ` Ingo Molnar
@ 2007-03-25 19:41   ` Thomas Gleixner
  2007-04-02 12:25   ` [PATCH] i386: fix file_read_actor() and pipe_read() for original i386 systems Thomas Gleixner
  2 siblings, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2007-03-25 19:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: LKML, Stable Kernel Team, Ingo Molnar, Andrew Morton, Adrian Bunk

On Sun, 2007-03-25 at 11:14 -0700, Linus Torvalds wrote:
> > Environment: Pre Pentium systems, (boot_cpu_data.wp_works_ok == 0)
> 
> This shouldn't be "pre-pentium", afaik. WP-works-ok on i486 too. I think 
> only the original i386 had this bug ("feature").
>
> But I agree, it does seem to be broken on such machines (I assume you 
> don't actually have one, but just tested by forcing it by hand ;)

Yes, it's a genuine i386 embedded system and AFAIK the same feature is
available on 486 clones. i386 and Co are still in used in the embedded
space.

	tglx





^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] i386: fix file_read_actor() and pipe_read()  for original i386 systems
  2007-03-25 18:14 ` Linus Torvalds
  2007-03-25 18:30   ` Ingo Molnar
  2007-03-25 19:41   ` Thomas Gleixner
@ 2007-04-02 12:25   ` Thomas Gleixner
  2007-04-02 12:34     ` Ingo Molnar
  2007-04-13 23:06     ` Adrian Bunk
  2 siblings, 2 replies; 7+ messages in thread
From: Thomas Gleixner @ 2007-04-02 12:25 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: LKML, Stable Kernel Team, Ingo Molnar, Andrew Morton,
	Adrian Bunk, Manfred Spraul

The __copy_to_user_inatomic() calls in file_read_actor() and pipe_read()
are broken on original i386 machines, where WP-works-ok == false, as
__copy_to_user_inatomic() on such systems calls functions which might
sleep and/or contain cond_resched() calls inside of a kmap_atomic()
region.

The original check for WP-works-ok was in access_ok(), but got moved
during the 2.5 series to fix a race vs. swap.

Return the number of bytes to copy in the case where we are in an atomic
region, so the non atomic code pathes in file_read_actor() and
pipe_read() are taken.

This could be optimized to avoid the kmap_atomicby moving the check for
WP-works-ok into fault_in_pages_writeable(), but this is more intrusive
and can be done later.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

diff --git a/arch/i386/lib/usercopy.c b/arch/i386/lib/usercopy.c
index d22cfc9..1950277 100644
--- a/arch/i386/lib/usercopy.c
+++ b/arch/i386/lib/usercopy.c
@@ -10,6 +10,7 @@
 #include <linux/blkdev.h>
 #include <linux/module.h>
 #include <linux/backing-dev.h>
+#include <linux/interrupt.h>
 #include <asm/uaccess.h>
 #include <asm/mmx.h>
 
@@ -719,6 +720,14 @@ unsigned long __copy_to_user_ll(void __user *to, const void *from,
 #ifndef CONFIG_X86_WP_WORKS_OK
 	if (unlikely(boot_cpu_data.wp_works_ok == 0) &&
 			((unsigned long )to) < TASK_SIZE) {
+		/*
+		 * When we are in an atomic section (see
+		 * mm/filemap.c:file_read_actor), return the full
+		 * length to take the slow path.
+		 */
+		if (in_atomic())
+			return n;
+
 		/* 
 		 * CPU does not honor the WP bit when writing
 		 * from supervisory mode, and due to preemption or SMP,






^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] i386: fix file_read_actor() and pipe_read()  for original i386 systems
  2007-04-02 12:25   ` [PATCH] i386: fix file_read_actor() and pipe_read() for original i386 systems Thomas Gleixner
@ 2007-04-02 12:34     ` Ingo Molnar
  2007-04-13 23:06     ` Adrian Bunk
  1 sibling, 0 replies; 7+ messages in thread
From: Ingo Molnar @ 2007-04-02 12:34 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, LKML, Stable Kernel Team, Andrew Morton,
	Adrian Bunk, Manfred Spraul


* Thomas Gleixner <tglx@linutronix.de> wrote:

> The __copy_to_user_inatomic() calls in file_read_actor() and 
> pipe_read() are broken on original i386 machines, where WP-works-ok == 
> false, as __copy_to_user_inatomic() on such systems calls functions 
> which might sleep and/or contain cond_resched() calls inside of a 
> kmap_atomic() region.
> 
> The original check for WP-works-ok was in access_ok(), but got moved 
> during the 2.5 series to fix a race vs. swap.
> 
> Return the number of bytes to copy in the case where we are in an 
> atomic region, so the non atomic code pathes in file_read_actor() and 
> pipe_read() are taken.

neat trick! (v2.6.21 must-have me thinks.)

> This could be optimized to avoid the kmap_atomicby moving the check 
                                                  ^---- add space
> for WP-works-ok into fault_in_pages_writeable(), but this is more 
> intrusive and can be done later.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Acked-by: Ingo Molnar <mingo@elte.hu>

	Ingo

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] i386: fix file_read_actor() and pipe_read()  for original i386 systems
  2007-04-02 12:25   ` [PATCH] i386: fix file_read_actor() and pipe_read() for original i386 systems Thomas Gleixner
  2007-04-02 12:34     ` Ingo Molnar
@ 2007-04-13 23:06     ` Adrian Bunk
  1 sibling, 0 replies; 7+ messages in thread
From: Adrian Bunk @ 2007-04-13 23:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Linus Torvalds, LKML, Stable Kernel Team, Ingo Molnar,
	Andrew Morton, Manfred Spraul

Thanks, applied to 2.6.16.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2007-04-13 23:06 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-03-25  8:07 [BUG] __copy_to_user_inatomic broken on non Pentium machines Thomas Gleixner
2007-03-25 18:14 ` Linus Torvalds
2007-03-25 18:30   ` Ingo Molnar
2007-03-25 19:41   ` Thomas Gleixner
2007-04-02 12:25   ` [PATCH] i386: fix file_read_actor() and pipe_read() for original i386 systems Thomas Gleixner
2007-04-02 12:34     ` Ingo Molnar
2007-04-13 23:06     ` Adrian Bunk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.