From: Auke Kok <auke-jan.h.kok@intel.com>
To: Kenzo Iwami <k-iwami@cj.jp.nec.com>
Cc: netdev@vger.kernel.org,
	Jesse Brandeburg <jesse.brandeburg@intel.com>,
	"Ronciak, John" <john.ronciak@intel.com>
Subject: Re: watchdog timeout panic in e1000 driver
Date: Fri, 20 Oct 2006 08:51:28 -0700
Message-ID: <4538F080.5020003@intel.com>
In-Reply-To: <4538BFF2.2040207@cj.jp.nec.com>

Kenzo Iwami wrote:
> Hi,
> 
> Thank you for your comment.
> 
>>> A watchdog timeout panic occurred in e1000 driver (7.2.9-NAPI).
>> where's the panic message ?
> 
> attached the panic message (e1000_panic).
> 
> [...]
>>> This problem only occurs on a server using ethernet controller inside
>>> 631xESB/632xESB, and NMI watchdog enabled.
>> why only this system? have you seen/tried it on other machines?
> 
> This problem is caused by e1000_get_software_semaphore() being called from
> within the interrupt handler, while the interrupted code is still holding
> this semaphore.  e1000_get_software_semaphore() is called from
> e1000_get_hw_eeprom_semaphore() only when hw->mac_type is e1000_80003es2lan.
> This condition is true only for MACs inside 631xESB/632xESB.
> 
> When this problem happens e1000_get_software_semaphore() will wait for
> 16 seconds (inside the interrupt handler) before it fails, thus causing
> the watchdog timeout.
> 
> I haven't actually tried it on other machines, but theoretically, it will
> only happen on MAC inside 631xESB/632xESB chip set.
> 
> [...]
>> Reverting this code would not be a fix, but only a workaround that leaves the problem
>> still in the code, and as such not progress in the right direction.
>>
>> I find this report extremely edgy, but I'll look into the fact that the driver attempts 
>> to sleep for 16384 + 1 msec, which seems overly long :)
>>
>> As a side note, most other e1000 NICs use hardcoded word_size numbers, but esb2 systems
>> read it from a register/eeprom. Can you send me the output of `ethtool -e ethX`?
>> Off-list is OK, it might be large.
> 
> attached is the output of "ethtool -e ethX" (eeprom_eth0).

thanks.

This panic report falls in the category "how hard can I break my system as root".
Explicitly abusing the system by performing restricted calls depletes resources and
harasses the sw lock (in this case). The reason that the driver attempts to wait that
long is that on ESB2 systems the SPI interface to the EEPROM can be slow, so certain
commands take a long time to complete.
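
To make the timing concrete, here is a rough user-space sketch of the scenario
described above. It is not the e1000 code: every name in it is made up for
illustration, the "semaphore" is a plain flag rather than a hardware register,
and the 1 ms delay is only counted, not actually waited. It assumes the wait
loop polls once per millisecond for word_size + 1 iterations, which matches the
16384 + 1 msec figure quoted earlier.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the software semaphore bit (not a real register). */
bool sem_held = false;

/* Simulated elapsed milliseconds, so the sketch finishes instantly. */
unsigned long waited_ms = 0;

/* Stand-in for a 1 ms busy-wait; it only counts, it does not delay. */
void fake_delay_1ms(void)
{
    waited_ms++;
}

/*
 * Poll for the semaphore up to (word_size + 1) times, 1 ms apart.
 * Returns 0 if the semaphore was taken, -1 if the wait timed out.
 */
int get_sw_semaphore(unsigned int word_size)
{
    unsigned int timeout = word_size + 1;

    while (timeout) {
        if (!sem_held) {
            sem_held = true;
            return 0;
        }
        fake_delay_1ms();
        timeout--;
    }
    return -1;
}

void put_sw_semaphore(void)
{
    sem_held = false;
}

/*
 * The interrupted code takes the semaphore, then an "interrupt handler"
 * tries to take it again before the holder can run and release it.
 * The second attempt can never succeed, so it burns the whole timeout.
 * Returns the simulated time spent waiting inside the handler, in ms.
 */
unsigned long simulate_interrupted_holder(unsigned int word_size)
{
    sem_held = false;
    waited_ms = 0;

    get_sw_semaphore(word_size);      /* interrupted code: succeeds at once */
    (void)get_sw_semaphore(word_size); /* handler: spins until timeout */
    put_sw_semaphore();                /* holder resumes only afterwards */

    return waited_ms;
}
```

With word_size = 16384, simulate_interrupted_holder() reports 16385 simulated
milliseconds spent inside the handler, i.e. a bit over 16 seconds, which is
long enough for an NMI watchdog to fire. A smaller word_size shortens the
worst case proportionally in this model.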

We're looking into making this theoretical lock time shorter in the meantime. Thanks
for reporting this.

Cheers,

Auke
