linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jarek Poplawski <jarkao2@o2.pl>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Folkert van Heusden <folkert@vanheusden.com>,
	linux-kernel@vger.kernel.org,
	Jason Wessel <jason.wessel@windriver.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	stable@kernel.org, netdev@vger.kernel.org
Subject: [PATCH] Re: [2.6.21.1] soft lockup when removing netconsole module
Date: Wed, 13 Jun 2007 11:25:37 +0200	[thread overview]
Message-ID: <20070613092537.GA2432@ff.dom.local> (raw)
In-Reply-To: <20070612110233.GA3281@ff.dom.local>

On Tue, Jun 12, 2007 at 01:02:33PM +0200, Jarek Poplawski wrote:
...
> Of course such a problem should preferably be fixed by somebody who
> knows the code (alas I don't know netconsole), to be sure all needed
> cancels are still done after this change. I hope Jason's patch is
> right but I'm a little surprised I can't see netdev in cc (I'll try
> to fix this).

So, I've had a look into netpoll and, unfortunately, I don't
think this patch is right... 

> > From: Jason Wessel <jason.wessel@windriver.com>
> > 
> > Do not call cancel_rearming_delayed_work() if there is no
> > pending work.
> > 
> > Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > ---
> > 
> >  net/core/netpoll.c |    6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff -puN net/core/netpoll.c~a net/core/netpoll.c
> > --- a/net/core/netpoll.c~a
> > +++ a/net/core/netpoll.c
> > @@ -784,8 +784,10 @@ void netpoll_cleanup(struct netpoll *np)
> >  			if (atomic_dec_and_test(&npinfo->refcnt)) {
> >  				skb_queue_purge(&npinfo->arp_tx);
> >  				skb_queue_purge(&npinfo->txq);
> > -				cancel_rearming_delayed_work(&npinfo->tx_work);
> > -				flush_scheduled_work();
> > +				if (delayed_work_pending(&npinfo->tx_work)) {
> > +					cancel_rearming_delayed_work(&npinfo->tx_work);
> > +					flush_scheduled_work();
> > +				}
> >  
> >  				kfree(npinfo);
> >  			}
> > _

There are such possibilities:

1. After positive delayed_work_pending(&npinfo->tx_work) test
some work is queued, but there is no guarantee that when running
it'll rearm again, so cancel_rearming_delayed_work can loop again;

2. After negative delayed_work_pending(&npinfo->tx_work) test
a work is just running, eg. waiting on netif_tx_lock, while
kfree(npinfo) is done here (oops?!).

I've found an additional problem here with or without this patch:
after deleting a timer in cancel_rearming_delayed_work() there could
stay a last skb queued in npinfo->txq, and after kfree(npinfo)
we have small memory leak. If I'm right here similar fix is needed
in the current netpoll code: additional npinfo->txq purging only
or maybe the whole cancel_rearming_ changed like this.

I've tried to eliminate these problems in attached below patch
proposal. I'm not sure it's all right: as I've written earlier I
don't know netconsole enough, but it's probably a little better
than above solution.

I've some doubts yet (I didn't have time to check this all):

1. I hope this other schedule_delayed_work() from netpoll_send_skb()
is not possible when netpoll_cleanup() runs - if I'm wrong additional
check of npinfo->refcnt should be done there;
2. I also hope npinfo->refcnt before scheduling should be enough here
- if not - another possibility is adding some locking eg.:
netif_tx_lock before cancel for synchronization.

Of course it would be very nice if somebody could test or verify
this patch more.

Regards,
Jarek P.


Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>

---

diff -Nurp 2.6.21-/net/core/netpoll.c 2.6.21/net/core/netpoll.c
--- 2.6.21-/net/core/netpoll.c	2007-04-26 15:08:32.000000000 +0200
+++ 2.6.21/net/core/netpoll.c	2007-06-12 21:05:23.000000000 +0200
@@ -73,7 +73,8 @@ static void queue_process(struct work_st
 			netif_tx_unlock(dev);
 			local_irq_restore(flags);
 
-			schedule_delayed_work(&npinfo->tx_work, HZ/10);
+			if (atomic_read(&npinfo->refcnt))
+				schedule_delayed_work(&npinfo->tx_work, HZ/10);
 			return;
 		}
 		netif_tx_unlock(dev);
@@ -780,9 +781,15 @@ void netpoll_cleanup(struct netpoll *np)
 			if (atomic_dec_and_test(&npinfo->refcnt)) {
 				skb_queue_purge(&npinfo->arp_tx);
 				skb_queue_purge(&npinfo->txq);
-				cancel_rearming_delayed_work(&npinfo->tx_work);
+				cancel_delayed_work(&npinfo->tx_work);
 				flush_scheduled_work();
 
+				/* clean after last, unfinished work */
+				if (!skb_queue_empty(&npinfo->txq)) {
+					struct sk_buff *skb;
+					skb = __skb_dequeue(&npinfo->txq);
+					kfree_skb(skb);
+				}
 				kfree(npinfo);
 			}
 		}

  reply	other threads:[~2007-06-13  9:18 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-26 15:40 [2.6.21.1] soft lockup when removing netconsole module Folkert van Heusden
2007-05-26 15:53 ` Parag Warudkar
2007-05-26 16:12   ` Thomas Gleixner
2007-05-26 16:17     ` Folkert van Heusden
2007-05-26 16:35       ` Thomas Gleixner
2007-05-26 16:49         ` Folkert van Heusden
2007-05-26 17:06           ` Thomas Gleixner
2007-05-26 17:12             ` Folkert van Heusden
2007-05-27 20:32         ` Matt Mackall
2007-05-29  7:56 ` Andrew Morton
2007-05-30 13:28   ` [PATCH] " Jason Wessel
2007-05-30 20:38     ` Folkert van Heusden
2007-06-12 11:02   ` Jarek Poplawski
2007-06-13  9:25     ` Jarek Poplawski [this message]
2007-06-26 23:07       ` [PATCH] " Andrew Morton
2007-06-27  0:46         ` Wessel, Jason
2007-06-27  1:00           ` Andrew Morton
2007-06-27  7:24             ` Jarek Poplawski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070613092537.GA2432@ff.dom.local \
    --to=jarkao2@o2.pl \
    --cc=akpm@linux-foundation.org \
    --cc=folkert@vanheusden.com \
    --cc=jason.wessel@windriver.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=stable@kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).