* 2.6.24-rc7 to 2.6.24-rc8 possible regression @ 2008-01-21 16:06 Denys Fedoryshchenko 2008-01-21 22:45 ` Stefan Richter 0 siblings, 1 reply; 9+ messages in thread From: Denys Fedoryshchenko @ 2008-01-21 16:06 UTC (permalink / raw) To: linux-kernel After running screen found in dmesg. It was not happening before. [625138.248257] [625138.248260] ============================================= [625138.248542] [ INFO: possible recursive locking detected ] [625138.248686] 2.6.24-rc8-devel #2 [625138.248821] --------------------------------------------- [625138.248963] screen/18164 is trying to acquire lock: [625138.249101] (&q->lock){++..}, at: [<c01175ff>] __wake_up+0x15/0x42 [625138.249454] [625138.249456] but task is already holding lock: [625138.249724] (&q->lock){++..}, at: [<c01175ff>] __wake_up+0x15/0x42 [625138.250073] [625138.250075] other info that might help us debug this: [625138.250343] 2 locks held by screen/18164: [625138.250477] #0: (&tty->atomic_read_lock){--..}, at: [<c0231a6c>] read_chan+0x18f/0x50b [625138.250960] #1: (&q->lock){++..}, at: [<c01175ff>] __wake_up+0x15/0x42 [625138.251356] [625138.251357] stack backtrace: [625138.251623] Pid: 18164, comm: screen Not tainted 2.6.24-rc8-devel #2 [625138.251764] [<c0105e84>] show_trace_log_lvl+0x1a/0x2f [625138.251959] [<c010682c>] show_trace+0x12/0x14 [625138.252150] [<c0107123>] dump_stack+0x6c/0x72 [625138.252338] [<c01384c0>] __lock_acquire+0x172/0xb8c [625138.252533] [<c01392a7>] lock_acquire+0x5f/0x78 [625138.252725] [<c032afa7>] _spin_lock_irqsave+0x34/0x44 [625138.252920] [<c01175ff>] __wake_up+0x15/0x42 [625138.253108] [<c0186a16>] ep_poll_safewake+0x8e/0xbf [625138.253300] [<c01876df>] ep_poll_callback+0x9f/0xac [625138.253491] [<c0115d36>] __wake_up_common+0x32/0x5c [625138.253688] [<c011761b>] __wake_up+0x31/0x42 [625138.253878] [<c022c803>] tty_wakeup+0x4f/0x54 [625138.254070] [<c0232c7c>] pty_unthrottle+0x15/0x21 [625138.254258] [<c022ffe3>] check_unthrottle+0x2e/0x30 [625138.254445] [<c0231cf4>] read_chan+0x417/0x50b [625138.254633] [<c022ec67>] tty_read+0x66/0xac [625138.254819] [<c01628dc>] vfs_read+0x8e/0x117 [625138.255004] [<c0162d10>] sys_read+0x3d/0x61 [625138.255190] [<c0104dee>] sysenter_past_esp+0x5f/0xa5 [625138.255376] ======================= -- Denys Fedoryshchenko Technical Manager Virtual ISP S.A.L. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.24-rc7 to 2.6.24-rc8 possible regression 2008-01-21 16:06 2.6.24-rc7 to 2.6.24-rc8 possible regression Denys Fedoryshchenko @ 2008-01-21 22:45 ` Stefan Richter 2008-01-22 11:10 ` Denys Fedoryshchenko 0 siblings, 1 reply; 9+ messages in thread From: Stefan Richter @ 2008-01-21 22:45 UTC (permalink / raw) To: Denys Fedoryshchenko; +Cc: linux-kernel, peterz Denys Fedoryshchenko wrote: > After running screen found in dmesg. It was not happening before. > > [625138.248257] > [625138.248260] ============================================= > [625138.248542] [ INFO: possible recursive locking detected ] > [625138.248686] 2.6.24-rc8-devel #2 > [625138.248821] --------------------------------------------- > [625138.248963] screen/18164 is trying to acquire lock: > [625138.249101] (&q->lock){++..}, at: [<c01175ff>] __wake_up+0x15/0x42 > [625138.249454] > [625138.249456] but task is already holding lock: > [625138.249724] (&q->lock){++..}, at: [<c01175ff>] __wake_up+0x15/0x42 > [625138.250073] > [625138.250075] other info that might help us debug this: > [625138.250343] 2 locks held by screen/18164: > [625138.250477] #0: (&tty->atomic_read_lock){--..}, at: [<c0231a6c>] > read_chan+0x18f/0x50b > [625138.250960] #1: (&q->lock){++..}, at: [<c01175ff>] __wake_up+0x15/0x42 > [625138.251356] > [625138.251357] stack backtrace: > [625138.251623] Pid: 18164, comm: screen Not tainted 2.6.24-rc8-devel #2 > [625138.251764] [<c0105e84>] show_trace_log_lvl+0x1a/0x2f > [625138.251959] [<c010682c>] show_trace+0x12/0x14 > [625138.252150] [<c0107123>] dump_stack+0x6c/0x72 > [625138.252338] [<c01384c0>] __lock_acquire+0x172/0xb8c > [625138.252533] [<c01392a7>] lock_acquire+0x5f/0x78 > [625138.252725] [<c032afa7>] _spin_lock_irqsave+0x34/0x44 > [625138.252920] [<c01175ff>] __wake_up+0x15/0x42 > [625138.253108] [<c0186a16>] ep_poll_safewake+0x8e/0xbf > [625138.253300] [<c01876df>] ep_poll_callback+0x9f/0xac > [625138.253491] [<c0115d36>] __wake_up_common+0x32/0x5c > [625138.253688] [<c011761b>] __wake_up+0x31/0x42 > [625138.253878] [<c022c803>] tty_wakeup+0x4f/0x54 > [625138.254070] [<c0232c7c>] pty_unthrottle+0x15/0x21 > [625138.254258] [<c022ffe3>] check_unthrottle+0x2e/0x30 > [625138.254445] [<c0231cf4>] read_chan+0x417/0x50b > [625138.254633] [<c022ec67>] tty_read+0x66/0xac > [625138.254819] [<c01628dc>] vfs_read+0x8e/0x117 > [625138.255004] [<c0162d10>] sys_read+0x3d/0x61 > [625138.255190] [<c0104dee>] sysenter_past_esp+0x5f/0xa5 > [625138.255376] ======================= Do you have Peter's lockdep annotation patch for epoll applied? http://lkml.org/lkml/2008/1/13/84 -- Stefan Richter -=====-==--- ---= =-=-= http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.24-rc7 to 2.6.24-rc8 possible regression 2008-01-21 22:45 ` Stefan Richter @ 2008-01-22 11:10 ` Denys Fedoryshchenko 2008-01-22 16:23 ` Stefan Richter 0 siblings, 1 reply; 9+ messages in thread From: Denys Fedoryshchenko @ 2008-01-22 11:10 UTC (permalink / raw) To: Stefan Richter; +Cc: linux-kernel, peterz No, i am using vanilla kernel. It is one of production machines, and as i know screen is not using epoll. I will try to apply on all my production machines this patch. Sorry if it is related. On Mon, 21 Jan 2008 23:45:40 +0100, Stefan Richter wrote > Denys Fedoryshchenko wrote: > > After running screen found in dmesg. It was not happening before. > > > > [625138.248257] > > [625138.248260] ============================================= > > [625138.248542] [ INFO: possible recursive locking detected ] > > [625138.248686] 2.6.24-rc8-devel #2 > > [625138.248821] --------------------------------------------- > > [625138.248963] screen/18164 is trying to acquire lock: > > [625138.249101] (&q->lock){++..}, at: [<c01175ff>] __wake_up+0x15/0x42 > > [625138.249454] > > [625138.249456] but task is already holding lock: > > [625138.249724] (&q->lock){++..}, at: [<c01175ff>] __wake_up+0x15/0x42 > > [625138.250073] > > [625138.250075] other info that might help us debug this: > > [625138.250343] 2 locks held by screen/18164: > > [625138.250477] #0: (&tty->atomic_read_lock){--..}, at: [<c0231a6c>] > > read_chan+0x18f/0x50b > > [625138.250960] #1: (&q->lock){++..}, at: [<c01175ff>] __wake_up+0x15/ 0x42 > > [625138.251356] > > [625138.251357] stack backtrace: > > [625138.251623] Pid: 18164, comm: screen Not tainted 2.6.24-rc8-devel #2 > > [625138.251764] [<c0105e84>] show_trace_log_lvl+0x1a/0x2f > > [625138.251959] [<c010682c>] show_trace+0x12/0x14 > > [625138.252150] [<c0107123>] dump_stack+0x6c/0x72 > > [625138.252338] [<c01384c0>] __lock_acquire+0x172/0xb8c > > [625138.252533] [<c01392a7>] lock_acquire+0x5f/0x78 > > [625138.252725] [<c032afa7>] _spin_lock_irqsave+0x34/0x44 > > [625138.252920] [<c01175ff>] __wake_up+0x15/0x42 > > [625138.253108] [<c0186a16>] ep_poll_safewake+0x8e/0xbf > > [625138.253300] [<c01876df>] ep_poll_callback+0x9f/0xac > > [625138.253491] [<c0115d36>] __wake_up_common+0x32/0x5c > > [625138.253688] [<c011761b>] __wake_up+0x31/0x42 > > [625138.253878] [<c022c803>] tty_wakeup+0x4f/0x54 > > [625138.254070] [<c0232c7c>] pty_unthrottle+0x15/0x21 > > [625138.254258] [<c022ffe3>] check_unthrottle+0x2e/0x30 > > [625138.254445] [<c0231cf4>] read_chan+0x417/0x50b > > [625138.254633] [<c022ec67>] tty_read+0x66/0xac > > [625138.254819] [<c01628dc>] vfs_read+0x8e/0x117 > > [625138.255004] [<c0162d10>] sys_read+0x3d/0x61 > > [625138.255190] [<c0104dee>] sysenter_past_esp+0x5f/0xa5 > > [625138.255376] ======================= > > Do you have Peter's lockdep annotation patch for epoll applied? > http://lkml.org/lkml/2008/1/13/84 > -- > Stefan Richter > -=====-==--- ---= =-=-= > http://arcgraph.de/sr/ -- Denys Fedoryshchenko Technical Manager Virtual ISP S.A.L. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.24-rc7 to 2.6.24-rc8 possible regression 2008-01-22 11:10 ` Denys Fedoryshchenko @ 2008-01-22 16:23 ` Stefan Richter 2008-01-22 16:28 ` Peter Zijlstra 0 siblings, 1 reply; 9+ messages in thread From: Stefan Richter @ 2008-01-22 16:23 UTC (permalink / raw) To: Denys Fedoryshchenko; +Cc: linux-kernel, peterz Denys Fedoryshchenko wrote: > No, i am using vanilla kernel. It is one of production machines, and as i > know screen is not using epoll. OK, but the trace shows that it is the epoll recursion again. > I will try to apply on all my production machines this patch. Sorry if it is > related. Well, let's hope that the lockdep annotation or whatever other fix gets into mainline sooner than later. Which reminds me to test my setup again which appeared to be able to reproduce the __wake_up recursion on my command... -- Stefan Richter -=====-==--- ---= =-==- http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.24-rc7 to 2.6.24-rc8 possible regression 2008-01-22 16:23 ` Stefan Richter @ 2008-01-22 16:28 ` Peter Zijlstra 2008-01-22 22:51 ` Stefan Richter 0 siblings, 1 reply; 9+ messages in thread From: Peter Zijlstra @ 2008-01-22 16:28 UTC (permalink / raw) To: Stefan Richter; +Cc: Denys Fedoryshchenko, linux-kernel On Tue, 2008-01-22 at 17:23 +0100, Stefan Richter wrote: > Denys Fedoryshchenko wrote: > > No, i am using vanilla kernel. It is one of production machines, and as i > > know screen is not using epoll. > > OK, but the trace shows that it is the epoll recursion again. > > > I will try to apply on all my production machines this patch. Sorry if it is > > related. > > Well, let's hope that the lockdep annotation or whatever other fix gets > into mainline sooner than later. Which reminds me to test my setup > again which appeared to be able to reproduce the __wake_up recursion on > my command... Would be appreciated, I have been waiting on testing feedback because I'm not fully certain here. Curious though that this gets reported frequently the last few weeks, afaics this problem is way old. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.24-rc7 to 2.6.24-rc8 possible regression 2008-01-22 16:28 ` Peter Zijlstra @ 2008-01-22 22:51 ` Stefan Richter 2008-01-22 22:54 ` [PATCH] lockdep: annotate epoll Stefan Richter 2008-01-22 22:54 ` 2.6.24-rc7 to 2.6.24-rc8 possible regression Davide Libenzi 0 siblings, 2 replies; 9+ messages in thread From: Stefan Richter @ 2008-01-22 22:51 UTC (permalink / raw) To: Peter Zijlstra Cc: Denys Fedoryshchenko, linux-kernel, Andrew Morton, Davide Libenzi (adding Cc: Davide and akpm) On 22 Jan, Peter Zijlstra wrote: > > On Tue, 2008-01-22 at 17:23 +0100, Stefan Richter wrote: >> Denys Fedoryshchenko wrote: >> > No, i am using vanilla kernel. It is one of production machines, and as i >> > know screen is not using epoll. >> >> OK, but the trace shows that it is the epoll recursion again. >> >> > I will try to apply on all my production machines this patch. Sorry if it is >> > related. >> >> Well, let's hope that the lockdep annotation or whatever other fix gets >> into mainline sooner than later. Which reminds me to test my setup >> again which appeared to be able to reproduce the __wake_up recursion on >> my command... > > Would be appreciated, I have been waiting on testing feedback because > I'm not fully certain here. > > Curious though that this gets reported frequently the last few weeks, > afaics this problem is way old. Here is a report against Fedora's 2.6.23-0.222.rc9.git4.fc8, filed in October: https://bugzilla.redhat.com/show_bug.cgi?id=323411 I did several tests with a pristine 2.6.24-rc8 now. The lockdep warning below is 100% reliably triggered by grabbing video from a DV camcorder with dvgrab v3 via firewire-core's character device file ABI. Also, this warning is 100% reliably suppressed by your annotation patch. I will reply to this message with this patch for kind consideration by those concerned. :-) ============================================= [ INFO: possible recursive locking detected ] 2.6.24-rc8 #1 --------------------------------------------- swapper/0 is trying to acquire lock: (&q->lock){++..}, at: [<ffffffff8022aef7>] __wake_up+0x22/0x4e but task is already holding lock: (&q->lock){++..}, at: [<ffffffff8022aef7>] __wake_up+0x22/0x4e other info that might help us debug this: 1 lock held by swapper/0: #0: (&q->lock){++..}, at: [<ffffffff8022aef7>] __wake_up+0x22/0x4e stack backtrace: Pid: 0, comm: swapper Not tainted 2.6.24-rc8 #1 Call Trace: <IRQ> [<ffffffff8024ec38>] __lock_acquire+0x1c5/0xc68 [<ffffffff8024faa1>] lock_acquire+0x51/0x6c [<ffffffff8022aef7>] __wake_up+0x22/0x4e [<ffffffff8041499b>] _spin_lock_irqsave+0x37/0x47 [<ffffffff8022aef7>] __wake_up+0x22/0x4e [<ffffffff802a926e>] ep_poll_safewake+0x86/0xc5 [<ffffffff802aa15a>] ep_poll_callback+0xc1/0xcb [<ffffffff80228d4d>] __wake_up_common+0x41/0x74 [<ffffffff8022af0d>] __wake_up+0x38/0x4e [<ffffffff8806888c>] :firewire_core:fw_core_handle_request+0x1b0/0x1eb [<ffffffff8807301f>] :firewire_ohci:handle_ar_packet+0xfb/0x10f [<ffffffff88073f01>] :firewire_ohci:ar_context_tasklet+0xd6/0xe7 [<ffffffff80236328>] tasklet_action+0x5d/0xb0 [<ffffffff80236229>] __do_softirq+0x5d/0xe0 [<ffffffff8021dc2f>] ack_apic_level+0x10/0xdc [<ffffffff8020c9fc>] call_softirq+0x1c/0x28 [<ffffffff8020e839>] do_softirq+0x31/0x94 [<ffffffff8023616c>] irq_exit+0x45/0x51 [<ffffffff8020e954>] do_IRQ+0xb8/0xd8 [<ffffffff8020bcc6>] ret_from_intr+0x0/0xf <EOI> [<ffffffff8021c2da>] lapic_next_event+0x0/0xa [<ffffffff8022cde9>] sched_clock_idle_wakeup_event+0x46/0x63 [<ffffffff8805a746>] :processor:acpi_idle_enter_simple+0x179/0x1df [<ffffffff8805a73c>] :processor:acpi_idle_enter_simple+0x16f/0x1df [<ffffffff803b4608>] cpuidle_idle_call+0x79/0xad [<ffffffff803b458f>] cpuidle_idle_call+0x0/0xad [<ffffffff8020a5e4>] cpu_idle+0x95/0xcc [<ffffffff80571a8a>] start_kernel+0x2d3/0x2df [<ffffffff80571115>] _sinittext+0x115/0x11c -- Stefan Richter -=====-==--- ---= =-==- http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] lockdep: annotate epoll 2008-01-22 22:51 ` Stefan Richter @ 2008-01-22 22:54 ` Stefan Richter 2008-01-22 22:59 ` Davide Libenzi 2008-01-22 22:54 ` 2.6.24-rc7 to 2.6.24-rc8 possible regression Davide Libenzi 1 sibling, 1 reply; 9+ messages in thread From: Stefan Richter @ 2008-01-22 22:54 UTC (permalink / raw) To: Peter Zijlstra Cc: Denys Fedoryshchenko, linux-kernel, Andrew Morton, Davide Libenzi On 22 Jan, Stefan Richter wrote: > On 22 Jan, Peter Zijlstra wrote: >> Curious though that this gets reported frequently the last few weeks, >> afaics this problem is way old. > > Here is a report against Fedora's 2.6.23-0.222.rc9.git4.fc8, filed in > October: https://bugzilla.redhat.com/show_bug.cgi?id=323411 Upstream bug: http://bugzilla.kernel.org/show_bug.cgi?id=9786 Date: Sun, 13 Jan 2008 19:44:26 +0100 From: Peter Zijlstra <a.p.zijlstra@chello.nl> On Sat, 2008-01-05 at 13:35 -0800, Davide Libenzi wrote: > I remember I talked with Arjan about this time ago. Basically, since 1) > you can drop an epoll fd inside another epoll fd 2) callback-based wakeups > are used, you can see a wake_up() from inside another wake_up(), but they > will never refer to the same lock instance. > Think about: > > dfd = socket(...); > efd1 = epoll_create(); > efd2 = epoll_create(); > epoll_ctl(efd1, EPOLL_CTL_ADD, dfd, ...); > epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...); > > When a packet arrives to the device underneath "dfd", the net code will > issue a wake_up() on its poll wake list. Epoll (efd1) has installed a > callback wakeup entry on that queue, and the wake_up() performed by the > "dfd" net code will end up in ep_poll_callback(). At this point epoll > (efd1) notices that it may have some event ready, so it needs to wake up > the waiters on its poll wait list (efd2). So it calls ep_poll_safewake() > that ends up in another wake_up(), after having checked about the > recursion constraints. That are, no more than EP_MAX_POLLWAKE_NESTS, to > avoid stack blasting. Never hit the same queue, to avoid loops like: > > epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...); > epoll_ctl(efd3, EPOLL_CTL_ADD, efd2, ...); > epoll_ctl(efd4, EPOLL_CTL_ADD, efd3, ...); > epoll_ctl(efd1, EPOLL_CTL_ADD, efd4, ...); > > The code "if (tncur->wq == wq || ..." prevents re-entering the same > queue/lock. Since the epoll code is very careful to not nest same instance locks allow the recursion. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Stefan Richter <stefanr@s5r6.in-berlin.de> --- fs/eventpoll.c | 2 +- include/linux/wait.h | 16 ++++++++++++++++ 2 files changed, 17 insertions(+), 1 deletion(-) Index: linux/fs/eventpoll.c =================================================================== --- linux.orig/fs/eventpoll.c +++ linux/fs/eventpoll.c @@ -353,7 +353,7 @@ static void ep_poll_safewake(struct poll spin_unlock_irqrestore(&psw->lock, flags); /* Do really wake up now */ - wake_up(wq); + wake_up_nested(wq, 1 + wake_nests); /* Remove the current task from the list */ spin_lock_irqsave(&psw->lock, flags); Index: linux/include/linux/wait.h =================================================================== --- linux.orig/include/linux/wait.h +++ linux/include/linux/wait.h @@ -161,6 +161,22 @@ wait_queue_head_t *FASTCALL(bit_waitqueu #define wake_up_locked(x) __wake_up_locked((x), TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE) #define wake_up_interruptible_sync(x) __wake_up_sync((x),TASK_INTERRUPTIBLE, 1) +#ifdef CONFIG_DEBUG_LOCK_ALLOC +/* + * macro to avoid include hell + */ +#define wake_up_nested(x, s) \ +do { \ + unsigned long flags; \ + \ + spin_lock_irqsave_nested(&(x)->lock, flags, (s)); \ + wake_up_locked(x); \ + spin_unlock_irqrestore(&(x)->lock, flags); \ +} while (0) +#else +#define wake_up_nested(x, s) wake_up(x) +#endif + #define __wait_event(wq, condition) \ do { \ DEFINE_WAIT(__wait); \ -- Stefan Richter -=====-==--- ---= =-==- http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] lockdep: annotate epoll 2008-01-22 22:54 ` [PATCH] lockdep: annotate epoll Stefan Richter @ 2008-01-22 22:59 ` Davide Libenzi 0 siblings, 0 replies; 9+ messages in thread From: Davide Libenzi @ 2008-01-22 22:59 UTC (permalink / raw) To: Stefan Richter Cc: Peter Zijlstra, Denys Fedoryshchenko, Linux Kernel Mailing List, Andrew Morton On Tue, 22 Jan 2008, Stefan Richter wrote: > On 22 Jan, Stefan Richter wrote: > > On 22 Jan, Peter Zijlstra wrote: > >> Curious though that this gets reported frequently the last few weeks, > >> afaics this problem is way old. > > > > Here is a report against Fedora's 2.6.23-0.222.rc9.git4.fc8, filed in > > October: https://bugzilla.redhat.com/show_bug.cgi?id=323411 > > Upstream bug: http://bugzilla.kernel.org/show_bug.cgi?id=9786 > > Date: Sun, 13 Jan 2008 19:44:26 +0100 > From: Peter Zijlstra <a.p.zijlstra@chello.nl> > > On Sat, 2008-01-05 at 13:35 -0800, Davide Libenzi wrote: > > > I remember I talked with Arjan about this time ago. Basically, since 1) > > you can drop an epoll fd inside another epoll fd 2) callback-based wakeups > > are used, you can see a wake_up() from inside another wake_up(), but they > > will never refer to the same lock instance. > > Think about: > > > > dfd = socket(...); > > efd1 = epoll_create(); > > efd2 = epoll_create(); > > epoll_ctl(efd1, EPOLL_CTL_ADD, dfd, ...); > > epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...); > > > > When a packet arrives to the device underneath "dfd", the net code will > > issue a wake_up() on its poll wake list. Epoll (efd1) has installed a > > callback wakeup entry on that queue, and the wake_up() performed by the > > "dfd" net code will end up in ep_poll_callback(). At this point epoll > > (efd1) notices that it may have some event ready, so it needs to wake up > > the waiters on its poll wait list (efd2). So it calls ep_poll_safewake() > > that ends up in another wake_up(), after having checked about the > > recursion constraints. That are, no more than EP_MAX_POLLWAKE_NESTS, to > > avoid stack blasting. Never hit the same queue, to avoid loops like: > > > > epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...); > > epoll_ctl(efd3, EPOLL_CTL_ADD, efd2, ...); > > epoll_ctl(efd4, EPOLL_CTL_ADD, efd3, ...); > > epoll_ctl(efd1, EPOLL_CTL_ADD, efd4, ...); > > > > The code "if (tncur->wq == wq || ..." prevents re-entering the same > > queue/lock. > > Since the epoll code is very careful to not nest same instance locks > allow the recursion. > > Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> > Tested-by: Stefan Richter <stefanr@s5r6.in-berlin.de> > --- > fs/eventpoll.c | 2 +- > include/linux/wait.h | 16 ++++++++++++++++ > 2 files changed, 17 insertions(+), 1 deletion(-) > > Index: linux/fs/eventpoll.c > =================================================================== > --- linux.orig/fs/eventpoll.c > +++ linux/fs/eventpoll.c > @@ -353,7 +353,7 @@ static void ep_poll_safewake(struct poll > spin_unlock_irqrestore(&psw->lock, flags); > > /* Do really wake up now */ > - wake_up(wq); > + wake_up_nested(wq, 1 + wake_nests); > > /* Remove the current task from the list */ > spin_lock_irqsave(&psw->lock, flags); > Index: linux/include/linux/wait.h > =================================================================== > --- linux.orig/include/linux/wait.h > +++ linux/include/linux/wait.h > @@ -161,6 +161,22 @@ wait_queue_head_t *FASTCALL(bit_waitqueu > #define wake_up_locked(x) __wake_up_locked((x), TASK_UNINTERRUPTIBLE | TASK_INTERRUPTIBLE) > #define wake_up_interruptible_sync(x) __wake_up_sync((x),TASK_INTERRUPTIBLE, 1) > > +#ifdef CONFIG_DEBUG_LOCK_ALLOC > +/* > + * macro to avoid include hell > + */ > +#define wake_up_nested(x, s) \ > +do { \ > + unsigned long flags; \ > + \ > + spin_lock_irqsave_nested(&(x)->lock, flags, (s)); \ > + wake_up_locked(x); \ > + spin_unlock_irqrestore(&(x)->lock, flags); \ > +} while (0) > +#else > +#define wake_up_nested(x, s) wake_up(x) > +#endif > + > #define __wait_event(wq, condition) \ > do { \ > DEFINE_WAIT(__wait); \ > Looks fine to me. Acked-by: Davide Libenzi <davidel@xmailserver.org> - Davide ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: 2.6.24-rc7 to 2.6.24-rc8 possible regression 2008-01-22 22:51 ` Stefan Richter 2008-01-22 22:54 ` [PATCH] lockdep: annotate epoll Stefan Richter @ 2008-01-22 22:54 ` Davide Libenzi 1 sibling, 0 replies; 9+ messages in thread From: Davide Libenzi @ 2008-01-22 22:54 UTC (permalink / raw) To: Stefan Richter Cc: Peter Zijlstra, Denys Fedoryshchenko, Linux Kernel Mailing List, Andrew Morton On Tue, 22 Jan 2008, Stefan Richter wrote: > (adding Cc: Davide and akpm) > > On 22 Jan, Peter Zijlstra wrote: > > > > On Tue, 2008-01-22 at 17:23 +0100, Stefan Richter wrote: > >> Denys Fedoryshchenko wrote: > >> > No, i am using vanilla kernel. It is one of production machines, and as i > >> > know screen is not using epoll. > >> > >> OK, but the trace shows that it is the epoll recursion again. > >> > >> > I will try to apply on all my production machines this patch. Sorry if it is > >> > related. > >> > >> Well, let's hope that the lockdep annotation or whatever other fix gets > >> into mainline sooner than later. Which reminds me to test my setup > >> again which appeared to be able to reproduce the __wake_up recursion on > >> my command... > > > > Would be appreciated, I have been waiting on testing feedback because > > I'm not fully certain here. > > > > Curious though that this gets reported frequently the last few weeks, > > afaics this problem is way old. > > Here is a report against Fedora's 2.6.23-0.222.rc9.git4.fc8, filed in > October: https://bugzilla.redhat.com/show_bug.cgi?id=323411 > > I did several tests with a pristine 2.6.24-rc8 now. The lockdep warning > below is 100% reliably triggered by grabbing video from a DV camcorder > with dvgrab v3 via firewire-core's character device file ABI. > > Also, this warning is 100% reliably suppressed by your annotation patch. > I will reply to this message with this patch for kind consideration by > those concerned. :-) Patch? Where is it? - Davide ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2008-01-22 22:59 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2008-01-21 16:06 2.6.24-rc7 to 2.6.24-rc8 possible regression Denys Fedoryshchenko 2008-01-21 22:45 ` Stefan Richter 2008-01-22 11:10 ` Denys Fedoryshchenko 2008-01-22 16:23 ` Stefan Richter 2008-01-22 16:28 ` Peter Zijlstra 2008-01-22 22:51 ` Stefan Richter 2008-01-22 22:54 ` [PATCH] lockdep: annotate epoll Stefan Richter 2008-01-22 22:59 ` Davide Libenzi 2008-01-22 22:54 ` 2.6.24-rc7 to 2.6.24-rc8 possible regression Davide Libenzi
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).