linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* PROBLEM: Bug in __pollwait() can cause select() and poll() to hang in  2.4.22-pre2 -- second try
@ 2003-06-27 18:19 Ray Bryant
  2003-06-30  4:34 ` Rusty Russell
  0 siblings, 1 reply; 8+ messages in thread
From: Ray Bryant @ 2003-06-27 18:19 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton, Manfred Spraul, Andi Kleen, trivial, alan

[1.] One line summary of the problem:

      In low memory situations, a process that issues a call to select()
or poll() can sleep forever in the kernel.

[2.] Full description of the problem/report:

      select() and poll() call a common routine: __pollwait().  On the
first call to __pollwait(), it calls __get_free_page(GFP_KERNEL) to
allocate a table to hold wait queues.  In the natural course of things,
this calls into __alloc_pages().  In low memory situations, the process
can then end up in the rebalance code at the bottom of __alloc_pages()
where there is a call to yield().  If the process makes this call, this
is a bad thing [tm], since the process state at that point is
TASK_INTERRUPTIBLE.  There is no wait queue yet for the process (that is
done later in __pollwait()) and no schedule timeout event has yet been
created (that is done later in select()) so the process will never
return from the call to yield().

[3.] Keywords (i.e., modules, networking, kernel):

      Kernel

[4.] Kernel version (from /proc/version):

      This bug appears to be present in every 2.4 kernel from (at least)
2.4.13 thru 2.4.22-pre2.  It is not present in 2.5.70, since a different
method of waiting for memory to free up is used there (in
__alloc_pages()).

[5.] Output of Oops.. message (if applicable) with symbolic information
      resolved (see Documentation/oops-tracing.txt)

      N/A.

[6.] A small shell script or example program which triggers the
      problem (if possible)

      We ecountered this whilst running batch queue tests that are too
complex to include here.

[7.] Environment

[7.1.] Software (add the output of the ver_linux script here)

[7.2.] Processor information (from /proc/cpuinfo):

       We encountered this on ia64, however, this is in machine
independent code and we believe the bug is present on all 2.4.21
platforms.

[7.3.] Module information (from /proc/modules):

[7.4.] Loaded driver and hardware information (/proc/ioports,
/proc/iomem)

[7.5.] PCI information ('lspci -vvv' as root)

[7.6.] SCSI information (from /proc/scsi/scsi)

[7.7.] Other information that might be relevant to the problem
        (please look in /proc and include all information that you
        think to be relevant):

[X.] Other notes, patches, fixes, workarounds:

      The simplest fix (as suggested by Manfred Spraul) is to set
current=>state to TASK_RUNNING just before the call to yield() in
__alloc_pages().  I have tested this sufficiently that I believe
this does not change the user level semantics of select() (my
concern was that if state got set to TASK_RUNNING that the syscall
could return before any fd's are ready or the select() timeout has
expired, but this does not appear to be the case).

Here is a trivial patch against 2.4.22-pre2:

--- linux-2.4.22-pre2.orig/mm/page_alloc.c      Thu Nov 28 17:53:15 2002
+++ linux-2.4.22-pre2/mm/page_alloc.c   Fri Jun 27 13:47:49 2003
@@ -418,6 +418,7 @@
                 return NULL;

         /* Yield for kswapd, and try again */
+        set_current_state(TASK_RUNNING);
         yield();
         goto rebalance;
  }

-- 
Best Regards,
Ray
-----------------------------------------------
                   Ray Bryant
512-453-9679 (work)         512-507-7807 (cell)
Jun 23-Jul 18 I will be at: 970-513-4743
raybry@sgi.com             raybry@austin.rr.com
The box said: "Requires Windows 98 or better",
            so I installed Linux.
-----------------------------------------------


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: PROBLEM: Bug in __pollwait() can cause select() and poll() to hang in 2.4.22-pre2 -- second try
  2003-06-27 18:19 PROBLEM: Bug in __pollwait() can cause select() and poll() to hang in 2.4.22-pre2 -- second try Ray Bryant
@ 2003-06-30  4:34 ` Rusty Russell
  2003-06-30 16:24   ` Manfred Spraul
  0 siblings, 1 reply; 8+ messages in thread
From: Rusty Russell @ 2003-06-30  4:34 UTC (permalink / raw)
  To: Ray Bryant; +Cc: linux-kernel, Andrew Morton, Manfred Spraul, Andi Kleen, alan

In message <3EFC8AA8.7000501@sgi.com> you write:
>       The simplest fix (as suggested by Manfred Spraul) is to set
> current=>state to TASK_RUNNING just before the call to yield() in
> __alloc_pages().  I have tested this sufficiently that I believe
> this does not change the user level semantics of select() (my
> concern was that if state got set to TASK_RUNNING that the syscall
> could return before any fd's are ready or the select() timeout has
> expired, but this does not appear to be the case).

Horrible problem.  Solution presented is icky, and at the *very* least
needs a comment about its relationship to poll.

More logical would be to have the set_task_state() before
__get_free_page() inside __pollwait, but that will cause every poll to
spin once, killing performance.  Allocating the first page up front
(inside do_pollfd and do_select) would help that, but slow down the
case where normally no alloc is needed, which might be common.  Having
a small first table inside the poll_table itself would work, but the
POLL_TABLE_FULL() macro then gets more complicated.

2.5 has exactly the same issue: perhaps 2.4 should take this patch,
and 2.5 should try something better (I'd suggest trying the embedded
minitable approach).

Anyway, my point is that it's not suitable for the Trivial Patch
Monkey 8)

> Here is a trivial patch against 2.4.22-pre2:
> 
> --- linux-2.4.22-pre2.orig/mm/page_alloc.c      Thu Nov 28 17:53:15 2002
> +++ linux-2.4.22-pre2/mm/page_alloc.c   Fri Jun 27 13:47:49 2003
> @@ -418,6 +418,7 @@
>                  return NULL;
> 
>          /* Yield for kswapd, and try again */
> +        set_current_state(TASK_RUNNING);
>          yield();
>          goto rebalance;
>   }

Cheers,
Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: PROBLEM: Bug in __pollwait() can cause select() and poll() to hang in 2.4.22-pre2 -- second try
  2003-06-30  4:34 ` Rusty Russell
@ 2003-06-30 16:24   ` Manfred Spraul
  2003-07-01  1:17     ` Rusty Russell
  0 siblings, 1 reply; 8+ messages in thread
From: Manfred Spraul @ 2003-06-30 16:24 UTC (permalink / raw)
  To: Rusty Russell; +Cc: Ray Bryant, linux-kernel, Andrew Morton, Andi Kleen, alan

Rusty Russell wrote:

>2.5 has exactly the same issue: perhaps 2.4 should take this patch,
>and 2.5 should try something better (I'd suggest trying the embedded
>minitable approach).
>
>  
>
I tried it, but Linus didn't like the idea of on-stack minitables. The 
patches are still at
http://www.colorfullife.com/~manfred/Linux-Kernel/poll/

--
    Manfred


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: PROBLEM: Bug in __pollwait() can cause select() and poll() to hang in 2.4.22-pre2 -- second try
  2003-06-30 16:24   ` Manfred Spraul
@ 2003-07-01  1:17     ` Rusty Russell
  2003-07-01  4:17       ` Linus Torvalds
  0 siblings, 1 reply; 8+ messages in thread
From: Rusty Russell @ 2003-07-01  1:17 UTC (permalink / raw)
  To: Manfred Spraul
  Cc: Ray Bryant, linux-kernel, Andrew Morton, Andi Kleen, alan, torvalds

In message <3F006421.4090408@colorfullife.com> you write:
> Rusty Russell wrote:
> 
> >2.5 has exactly the same issue: perhaps 2.4 should take this patch,
> >and 2.5 should try something better (I'd suggest trying the embedded
> >minitable approach).
>
> I tried it, but Linus didn't like the idea of on-stack minitables. The 
> patches are still at
> http://www.colorfullife.com/~manfred/Linux-Kernel/poll/

I wonder if he'd change his mind when presented with an apparently
random set_task_state() in __alloc_pages...

Linus?  See thread below: poll_wait is called with task state !=
TASK_RUNNING, but can do a yield on low memory, causing eternal hangs.

Thanks,
Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.


In message <3EFC8AA8.7000501@sgi.com> you write:
>       The simplest fix (as suggested by Manfred Spraul) is to set
> current=>state to TASK_RUNNING just before the call to yield() in
> __alloc_pages().  I have tested this sufficiently that I believe
> this does not change the user level semantics of select() (my
> concern was that if state got set to TASK_RUNNING that the syscall
> could return before any fd's are ready or the select() timeout has
> expired, but this does not appear to be the case).

Horrible problem.  Solution presented is icky, and at the *very* least
needs a comment about its relationship to poll.

More logical would be to have the set_task_state() before
__get_free_page() inside __pollwait, but that will cause every poll to
spin once, killing performance.  Allocating the first page up front
(inside do_pollfd and do_select) would help that, but slow down the
case where normally no alloc is needed, which might be common.  Having
a small first table inside the poll_table itself would work, but the
POLL_TABLE_FULL() macro then gets more complicated.

2.5 has exactly the same issue: perhaps 2.4 should take this patch,
and 2.5 should try something better (I'd suggest trying the embedded
minitable approach).

Anyway, my point is that it's not suitable for the Trivial Patch
Monkey 8)

> Here is a trivial patch against 2.4.22-pre2:
> 
> --- linux-2.4.22-pre2.orig/mm/page_alloc.c      Thu Nov 28 17:53:15 2002
> +++ linux-2.4.22-pre2/mm/page_alloc.c   Fri Jun 27 13:47:49 2003
> @@ -418,6 +418,7 @@
>                  return NULL;
> 
>          /* Yield for kswapd, and try again */
> +        set_current_state(TASK_RUNNING);
>          yield();
>          goto rebalance;
>   }

Cheers,
Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: PROBLEM: Bug in __pollwait() can cause select() and poll() to hang in 2.4.22-pre2 -- second try
  2003-07-01  1:17     ` Rusty Russell
@ 2003-07-01  4:17       ` Linus Torvalds
  2003-07-01  5:08         ` Rusty Russell
  0 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2003-07-01  4:17 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Manfred Spraul, Ray Bryant, Kernel Mailing List, Andrew Morton,
	Andi Kleen, Alan Cox


On Tue, 1 Jul 2003, Rusty Russell wrote:
> 
> Linus?  See thread below: poll_wait is called with task state !=
> TASK_RUNNING, but can do a yield on low memory, causing eternal hangs.

Hint: 2.5.x does not have this problem, because the yield() in 2.5.x isn't
buggy.

So the proper fix is to just fix yield() on 2.4.x.

		Linus



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: PROBLEM: Bug in __pollwait() can cause select() and poll() to hang in 2.4.22-pre2 -- second try
  2003-07-01  4:17       ` Linus Torvalds
@ 2003-07-01  5:08         ` Rusty Russell
  2003-07-02 18:06           ` Ray Bryant
  0 siblings, 1 reply; 8+ messages in thread
From: Rusty Russell @ 2003-07-01  5:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Manfred Spraul, Ray Bryant, Kernel Mailing List, Andrew Morton,
	Andi Kleen, Alan Cox

In message <Pine.LNX.4.44.0306302114170.2186-100000@home.osdl.org> you write:
> 
> On Tue, 1 Jul 2003, Rusty Russell wrote:
> > 
> > Linus?  See thread below: poll_wait is called with task state !=
> > TASK_RUNNING, but can do a yield on low memory, causing eternal hangs.
> 
> Hint: 2.5.x does not have this problem, because the yield() in 2.5.x isn't
> buggy.
> 
> So the proper fix is to just fix yield() on 2.4.x.

Thanks Linus.

Um, Ray?  2.4's yield also does:

	void yield(void)
	{
		set_current_state(TASK_RUNNING);
		sys_sched_yield();
		schedule();
	}

So how did the below patch make any difference?

Now thoroughly confused,
Rusty.

--- linux-2.4.22-pre2.orig/mm/page_alloc.c      Thu Nov 28 17:53:15 2002
+++ linux-2.4.22-pre2/mm/page_alloc.c   Fri Jun 27 13:47:49 2003
@@ -418,6 +418,7 @@
                 return NULL;

         /* Yield for kswapd, and try again */
+        set_current_state(TASK_RUNNING);
         yield();
         goto rebalance;
  }

--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: PROBLEM: Bug in __pollwait() can cause select() and poll() to hang in 2.4.22-pre2 -- second try
  2003-07-01  5:08         ` Rusty Russell
@ 2003-07-02 18:06           ` Ray Bryant
  2003-07-03  0:56             ` Rusty Russell
  0 siblings, 1 reply; 8+ messages in thread
From: Ray Bryant @ 2003-07-02 18:06 UTC (permalink / raw)
  To: Rusty Russell
  Cc: Linus Torvalds, Manfred Spraul, Kernel Mailing List,
	Andrew Morton, Andi Kleen, Alan Cox

Rusty Russell wrote:

> Um, Ray?  2.4's yield also does:
> 
> 	void yield(void)
> 	{
> 		set_current_state(TASK_RUNNING);
> 		sys_sched_yield();
> 		schedule();
> 	}
> 
> So how did the below patch make any difference?
> 
> Now thoroughly confused,
> Rusty.
> 
> --- linux-2.4.22-pre2.orig/mm/page_alloc.c      Thu Nov 28 17:53:15 2002
> +++ linux-2.4.22-pre2/mm/page_alloc.c   Fri Jun 27 13:47:49 2003
> @@ -418,6 +418,7 @@
>                  return NULL;
> 
>          /* Yield for kswapd, and try again */
> +        set_current_state(TASK_RUNNING);
>          yield();
>          goto rebalance;
>   }
> 
> --
>   Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
> 

Duh.  My fault.  I didn't see this in 2.4.22-pre2.  Some checking shows 
that it is also in 2.4.20.  How this didn't get into our SGI 2.4.20 tree 
is beyond me (this where we originally found this problem).  So there is 
no problem in 2.4.22-pre2.

Rusty -- thanks for your perseverence on this.

-- 
Best Regards,
Ray
-----------------------------------------------
                   Ray Bryant
512-453-9679 (work)         512-507-7807 (cell)
Jun 23-Jul 18 I will be at: 970-513-4743
raybry@sgi.com             raybry@austin.rr.com
The box said: "Requires Windows 98 or better",
            so I installed Linux.
-----------------------------------------------


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: PROBLEM: Bug in __pollwait() can cause select() and poll() to hang in 2.4.22-pre2 -- second try
  2003-07-02 18:06           ` Ray Bryant
@ 2003-07-03  0:56             ` Rusty Russell
  0 siblings, 0 replies; 8+ messages in thread
From: Rusty Russell @ 2003-07-03  0:56 UTC (permalink / raw)
  To: Ray Bryant
  Cc: Linus Torvalds, Manfred Spraul, Kernel Mailing List,
	Andrew Morton, Andi Kleen, Alan Cox

In message <3F031F0F.4020600@sgi.com> you write:
> Duh.  My fault.  I didn't see this in 2.4.22-pre2.  Some checking shows 
> that it is also in 2.4.20.  How this didn't get into our SGI 2.4.20 tree 
> is beyond me (this where we originally found this problem).  So there is 
> no problem in 2.4.22-pre2.
> 
> Rusty -- thanks for your perseverence on this.

Hey, glad I could help.

As for perseverance, it's an occupational hazard.  Linus has said
before that he doesn't want patches which don't get aggressively
pushed by someone, and indeed, they get lost.  Hence the Trivial Patch
Monkey...

Cheers,
Rusty.
--
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2003-07-03  1:08 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-06-27 18:19 PROBLEM: Bug in __pollwait() can cause select() and poll() to hang in 2.4.22-pre2 -- second try Ray Bryant
2003-06-30  4:34 ` Rusty Russell
2003-06-30 16:24   ` Manfred Spraul
2003-07-01  1:17     ` Rusty Russell
2003-07-01  4:17       ` Linus Torvalds
2003-07-01  5:08         ` Rusty Russell
2003-07-02 18:06           ` Ray Bryant
2003-07-03  0:56             ` Rusty Russell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).