From: Andreas Mohr <andim2@users.sourceforge.net>
To: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: andi@lisas.de, akpm@linux-foundation.org,
	e1000-devel@lists.sourceforge.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Make e100 suspend handler support PCI cards lacking PM capability
Date: Fri, 19 Jun 2009 10:00:50 +0200
Message-ID: <20090619080050.GA20131@rhlx01.hs-esslingen.de>
In-Reply-To: <200906141909.46354.rjw@sisk.pl>

Hi,

On Sun, Jun 14, 2009 at 07:09:45PM +0200, Rafael J. Wysocki wrote:
> On Sunday 14 June 2009, Andreas Mohr wrote:
> > - why do we call netif_device_detach() _after_ doing hardware shutdown
> >   of the network controller? I'd guess this can cause huge issues?
> >   Someone told me he had rtnl lock issues upon S2D with e100
> >   (very similar to my rtnl issues during aborted .suspend),
> >   and that might possibly be the reason?
> 
> I think you're right, but I'm not a network driver expert.
> 
> Perhaps you can change the ordering and see if that fixes the rtnl issue
> (since you're able to reproduce it without my patch, that should be easy to
> verify).

Well, I just moved netif_device_detach() above the netif_running() check,
but this didn't fix my network issues in the case of a rejecting .suspend
handler: after resume, unloading e100 hangs, and I get tons of
rtnl timeouts and a locked rtnl mutex.
The likely reason: on e100 unload, a backtrace showed that I
was hanging in e100_down -> msleep (somewhere at the very beginning of e100_down),
which is most definitely the inlined napi_disable() call there:

static inline void napi_disable(struct napi_struct *n)
{
        set_bit(NAPI_STATE_DISABLE, &n->state);
        while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
                msleep(1);
        clear_bit(NAPI_STATE_DISABLE, &n->state);
}

IOW, the .suspend seems to keep the NAPI layer active, yet due to the .suspend
failure no .resume is called. Thus the card is in an _inoperable_ state and
NAPI cannot be processed any further, so napi_disable() on driver unload
locks up.
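
To make the suspected lockup concrete, here is a minimal userspace sketch
of the loop above. It is not kernel code: every model_* name is made up for
this illustration, the retry count stands in for the unbounded msleep(1)
loop, and it only shows what happens whenever the SCHED bit stays set,
whatever the actual cause in e100 turns out to be.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for NAPI_STATE_SCHED / NAPI_STATE_DISABLE. */
enum { MODEL_STATE_SCHED = 1u << 0, MODEL_STATE_DISABLE = 1u << 1 };

struct model_napi {
    atomic_uint state;
};

static void model_napi_init(struct model_napi *n)
{
    atomic_init(&n->state, 0u);
}

/* Models test_and_set_bit(): sets the bit, returns whether it was already set. */
static bool model_test_and_set(struct model_napi *n, unsigned bit)
{
    return (atomic_fetch_or(&n->state, bit) & bit) != 0;
}

static void model_clear(struct model_napi *n, unsigned bit)
{
    atomic_fetch_and(&n->state, ~bit);
}

/* Models a poll being scheduled: succeeds only if SCHED was free. */
static bool model_napi_schedule(struct model_napi *n)
{
    return !model_test_and_set(n, MODEL_STATE_SCHED);
}

/* Models poll completion, which only happens while the device still works. */
static void model_napi_complete(struct model_napi *n)
{
    model_clear(n, MODEL_STATE_SCHED);
}

/*
 * Bounded version of the napi_disable() loop: returns true if SCHED could be
 * claimed within max_tries. The real loop has no bound and would spin forever.
 */
static bool model_napi_disable(struct model_napi *n, int max_tries)
{
    bool acquired = false;

    assert(max_tries > 0);
    atomic_fetch_or(&n->state, MODEL_STATE_DISABLE);
    for (int i = 0; i < max_tries && !acquired; i++)
        acquired = !model_test_and_set(n, MODEL_STATE_SCHED);
    model_clear(n, MODEL_STATE_DISABLE);
    return acquired;
}
```

With a scheduled poll that never completes, model_napi_disable() gives up
after its retries (the real one would hang); once model_napi_complete()
releases SCHED, the same call succeeds immediately.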


BTW, in include/linux/netdevice.h, shouldn't napi_disable() make use of
napi_synchronize() instead of copy & paste?
(simply move napi_synchronize() above napi_disable() and use it there)
Oh wait, there's the CONFIG_SMP complication:
napi_synchronize() is implemented for SMP only, whereas napi_disable()
checks the same thing _always_.
(Or is it a BUG that napi_disable() does the same check for non-SMP,
too?)
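
For comparison, a small single-threaded userspace sketch of the two loop
styles. It assumes, from my reading of the header, that napi_synchronize()
only tests the SCHED bit while napi_disable() tests and sets it; the sk_*
names are hypothetical, and plain non-atomic operations are used for
brevity.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_SCHED 0x1u   /* hypothetical stand-in for the SCHED bit */

/*
 * Wait-only step in the style of napi_synchronize(): it inspects SCHED
 * and never modifies the state. Bounded by max_tries for the sketch.
 */
static bool sk_wait_idle(const unsigned *state, int max_tries)
{
    assert(max_tries > 0);
    for (int i = 0; i < max_tries; i++)
        if (!(*state & SK_SCHED))
            return true;
    return false;
}

/*
 * Claim step in the style of the napi_disable() loop body: it sets SCHED
 * and reports whether the bit was free beforehand.
 */
static bool sk_claim(unsigned *state)
{
    bool was_set = (*state & SK_SCHED) != 0;

    *state |= SK_SCHED;
    return !was_set;
}
```

If that reading is right, the two are not quite the same check: the claim
step leaves SCHED set so that no later poll can be scheduled, which would
matter on non-SMP as well, so napi_disable() could not simply call
napi_synchronize().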

Thanks,

Andreas Mohr