From: Auke Kok <auke-jan.h.kok@intel.com>
To: Kenzo Iwami <k-iwami@cj.jp.nec.com>
Cc: "Brandeburg, Jesse" <jesse.brandeburg@intel.com>,
	Shaw Vrana <shaw@vranix.com>,
	netdev@vger.kernel.org, "Ronciak, John" <john.ronciak@intel.com>
Subject: Re: watchdog timeout panic in e1000 driver
Date: Mon, 04 Dec 2006 16:46:06 -0800
Message-ID: <4574C14E.2060108@intel.com>
In-Reply-To: <4573E6FD.3030905@cj.jp.nec.com>

Kenzo Iwami wrote:
> Hi,
> 
>>> Doesn't this just mean that we need a spinlock or some other kind of
>>> semaphore around acquiring, using, and releasing this resource?  We keep
>>> going around and around about this but I'm pretty sure spinlocks are
>>> meant to be able to solve exactly this issue.
>>>
>>> The problem is going to get considerably more nasty if we need to hold a
>>> spinlock with interrupts disabled for a significant amount of time, at
>>> which point a semaphore of some kind with a spinlock around it would
>>> seem to be more useful.
>> Even if spin_lock() was used to protect this resource, it is still possible
>> for an interrupt to kick in and call e1000_watchdog. In this case,
>> e1000_get_software_semaphore() will be called from within the interrupt
>> handler and the problem will still occur.
>>
>> In order to solve this problem, interrupt should be disabled (for example,
>> spin_lock_irqsave).
>> The interrupt handler can't run while the process is holding this resource,
>> and this problem doesn't occur.
>>
>>> I'll work with Auke to see if we can come up with another try.
>> Do you have any updates about your test code?
> 
> Does the fix I previously proposed have problems?
> If it does, I'd like to help investigate another fix to solve
> this problem.

Several conflicting and intertwined issues make it less than intuitive to
decide which fix is the better one.

Most of all, we discussed that adding a spinlock is not going to fix the underlying 
problem of contention, as the code that would need to be spinlocked can sleep. Not a 
good thing.

Adding state tracking code in the form of atomics might solve the issue too, but then we 
need to do this in quite a few locations. And it comes down to the fact that we really 
want all users of the semaphore to halt in case it is in use.
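
To make the state-tracking idea concrete, here is a minimal userspace sketch using C11
atomics. It is not driver code: struct fake_adapter, swfw_try_acquire() and
swfw_release() are hypothetical names made up for illustration. Note that a caller
that loses the race simply gets "busy" back instead of halting, which is exactly the
limitation described above, and every call site would need this check.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-adapter state; not the real e1000 struct. */
struct fake_adapter {
	atomic_flag swfw_busy;	/* set while some path owns the resource */
};

static void fake_adapter_init(struct fake_adapter *a)
{
	atomic_flag_clear(&a->swfw_busy);
}

/* Returns true if the caller now owns the resource, false if it is in use. */
static bool swfw_try_acquire(struct fake_adapter *a)
{
	/* test_and_set returns the previous value: false means we got it. */
	return !atomic_flag_test_and_set(&a->swfw_busy);
}

static void swfw_release(struct fake_adapter *a)
{
	atomic_flag_clear(&a->swfw_busy);
}
```

A path that gets false back from swfw_try_acquire() would have to skip its work or
retry later; it cannot simply wait the way a sleeping lock would let it.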

Reducing the time the swfw semaphore is held is a useful exercise, but it requires an
amazing amount of changes to all of the phy code to make sure we're not holding it too
long, and even then I doubt that we will reduce the maximum lock time to acceptable levels.

The watchdog, then, appears to needlessly lock the semaphore every two seconds. This is
because even though the link is up and we're already set up, we go through the trouble of
doing all the PHY reads, which are protected by the semaphores.

I'm currently testing a watchdog version which completely bypasses these checks in case
the MAC didn't detect a link change and we are already completely set up. In that case,
all we need to do is update stats and reschedule the timer.
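
Roughly, the control flow I have in mind looks like the sketch below. This is a
userspace illustration under assumptions, not the patch under test: struct fake_nic,
slow_link_check() and fake_watchdog() are hypothetical names, and the counters only
stand in for the PHY reads, stats update and timer re-arm.

```c
#include <stdbool.h>

/* Hypothetical model of the state the watchdog looks at. */
struct fake_nic {
	bool link_changed;	/* MAC reported a link status change */
	bool link_configured;	/* speed/duplex already set up */
	unsigned int phy_reads;		/* semaphore-protected PHY accesses */
	unsigned int stats_updates;
	unsigned int timer_rearms;
};

/* Slow path: stands in for the PHY reads done under the semaphore. */
static void slow_link_check(struct fake_nic *n)
{
	n->phy_reads++;
	n->link_configured = true;
	n->link_changed = false;
}

static void fake_watchdog(struct fake_nic *n)
{
	/* Only touch the PHY when the MAC saw a change or setup is pending. */
	if (n->link_changed || !n->link_configured)
		slow_link_check(n);

	/* Always done: update stats and reschedule the timer. */
	n->stats_updates++;
	n->timer_rearms++;
}
```

With the link up and unchanged, repeated watchdog runs never enter the slow path, so
the semaphore is not taken at all in the steady state.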

I'll keep you posted on progress.

Cheers,

Auke
