* [patch] allow netpoll_poll to be called recursively
@ 2004-08-16 22:27 Jeff Moyer
2004-08-26 2:02 ` Matt Mackall
0 siblings, 1 reply; 2+ messages in thread
From: Jeff Moyer @ 2004-08-16 22:27 UTC (permalink / raw)
To: mpm; +Cc: linux-kernel
Hi, Matt,
This should fix the recursive netpoll_poll deadlock that can happen with
the newly introduced netpoll_poll_lock.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
--- linux-2.6.7/net/core/netpoll.c.getcpu 2004-08-16 15:50:47.275980584 -0400
+++ linux-2.6.7/net/core/netpoll.c 2004-08-16 15:46:53.328546000 -0400
@@ -72,7 +72,9 @@ void netpoll_poll(struct netpoll *np)
* timeouts. Thus, we set our budget to a more reasonable value.
*/
int budget = 16;
+ static int poll_owner = -1;
unsigned long flags;
+ int netpoll_rx_flag = NETPOLL_RX_DROP;
if(!np->dev || !netif_running(np->dev) || !np->dev->poll_controller)
return;
@@ -81,17 +83,27 @@ void netpoll_poll(struct netpoll *np)
np->dev->poll_controller(np->dev);
/* If scheduling is stopped, tickle NAPI bits */
- spin_lock_irqsave(&netpoll_poll_lock, flags);
+ local_irq_save(flags);
+ if (!spin_trylock(&netpoll_poll_lock)) {
+ /* allow recursive calls on this cpu */
+ if (smp_processor_id() != poll_owner)
+ spin_lock(&netpoll_poll_lock);
+ else
+ netpoll_rx_flag = 0;
+ }
+ poll_owner = smp_processor_id();
+
if (np->dev->poll &&
test_bit(__LINK_STATE_RX_SCHED, &np->dev->state)) {
- np->dev->netpoll_rx |= NETPOLL_RX_DROP;
+ np->dev->netpoll_rx |= netpoll_rx_flag;
atomic_inc(&trapped);
np->dev->poll(np->dev, &budget);
atomic_dec(&trapped);
- np->dev->netpoll_rx &= ~NETPOLL_RX_DROP;
+ np->dev->netpoll_rx &= ~netpoll_rx_flag;
}
+ poll_owner = -1;
spin_unlock_irqrestore(&netpoll_poll_lock, flags);
zap_completion_queue();
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: [patch] allow netpoll_poll to be called recursively
2004-08-16 22:27 [patch] allow netpoll_poll to be called recursively Jeff Moyer
@ 2004-08-26 2:02 ` Matt Mackall
0 siblings, 0 replies; 2+ messages in thread
From: Matt Mackall @ 2004-08-26 2:02 UTC (permalink / raw)
To: Jeff Moyer; +Cc: linux-kernel
On Mon, Aug 16, 2004 at 06:27:44PM -0400, Jeff Moyer wrote:
> Hi, Matt,
>
> This should fix the recursive netpoll_poll deadlock that can happen with
> the newly introduced netpoll_poll_lock.
I rewrote this to get my head around it:
Index: linux/net/core/netpoll.c
===================================================================
--- linux.orig/net/core/netpoll.c 2004-08-25 12:29:10.783502466 -0500
+++ linux/net/core/netpoll.c 2004-08-25 20:55:05.468729251 -0500
@@ -72,6 +72,7 @@
* timeouts. Thus, we set our budget to a more reasonable value.
*/
int budget = 16;
+ static int poll_owner = -1;
unsigned long flags;
if(!np->dev || !netif_running(np->dev) || !np->dev->poll_controller)
@@ -81,18 +82,33 @@
np->dev->poll_controller(np->dev);
/* If scheduling is stopped, tickle NAPI bits */
- spin_lock_irqsave(&netpoll_poll_lock, flags);
if (np->dev->poll &&
test_bit(__LINK_STATE_RX_SCHED, &np->dev->state)) {
- np->dev->netpoll_rx |= NETPOLL_RX_DROP;
- atomic_inc(&trapped);
- np->dev->poll(np->dev, &budget);
+ /* attempt to grab the lock to manipulate the _rx bits */
+ local_irq_save(flags);
- atomic_dec(&trapped);
- np->dev->netpoll_rx &= ~NETPOLL_RX_DROP;
+ if (smp_processor_id() != poll_owner) {
+ /* we're not the lock owner, wait for the lock */
+ spin_lock(&netpoll_poll_lock);
+ poll_owner = smp_processor_id();
+
+ np->dev->netpoll_rx |= NETPOLL_RX_DROP;
+ atomic_inc(&trapped);
+
+ np->dev->poll(np->dev, &budget);
+
+ atomic_dec(&trapped);
+ np->dev->netpoll_rx &= ~NETPOLL_RX_DROP;
+ poll_owner = -1;
+ spin_unlock_irqrestore(&netpoll_poll_lock, flags);
+ }
+ else {
+ /* we're already holding the lock */
+ np->dev->poll(np->dev, &budget);
+ local_irq_restore(flags);
+ }
}
- spin_unlock_irqrestore(&netpoll_poll_lock, flags);
zap_completion_queue();
}
I think the above matches your intent, let me know if I missed
something. Note your original version would do an unlock in the
recursive invocation even if it didn't do a lock.
But I still don't like this. dev->poll() is liable to attempt to
recursively take its own driver lock again internally and we still
deadlock. Have we already seen recursion here? If we do, I think we
need to fix that in drivers. Meanwhile we should just bail here and
maybe set a "something bad happened" flag.
--
Mathematics is the supreme nostalgia of our time.
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2004-08-26 2:02 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-08-16 22:27 [patch] allow netpoll_poll to be called recursively Jeff Moyer
2004-08-26 2:02 ` Matt Mackall
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).