* 2.6.39.2: skb_over_panic: kernel BUG at net/core/skbuff.c:127!
@ 2011-06-27 21:48 Justin Piszcz
  2011-06-27 22:13 ` [E1000-devel] " Ronciak, John
  0 siblings, 1 reply; 9+ messages in thread
From: Justin Piszcz @ 2011-06-27 21:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: e1000-devel

Hi,

Looks like I am not the first one:
https://bugzilla.redhat.com/show_bug.cgi?id=429868

Any thoughts on this one?
http://home.comcast.net/~jpiszcz/20110627/IMG_2703.JPG

Rough transcription:

skb_over_panic: text:ffffffff813d711c len:44117 put:44117 head:ffff880415d40000 data:ffff880415d40040 tail:0xac95 end:0x640 dev:eth2
-- [ cut here ] ---
kernel BUG at net/core/skbuff.c:127!
invalid opcode: 0000 [#1] SMP

Was playing a video over samba (eth0) and then the kernel panic'd on eth2?

I use all INTEL controllers:

00:19.0 Ethernet controller: Intel Corporation 82578DC Gigabit Network Connection (rev 05)
01:00.0 Ethernet controller: Intel Corporation 82598EB 10-Gigabit AT2 Server Adapter (rev 01)
03:00.0 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
03:00.2 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
03:00.3 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)

Justin.
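
Note on reading the transcribed line: the following is a minimal sketch, not taken from the kernel source. It assumes that on this 64-bit kernel the `tail` and `end` values are byte offsets from `head`, and that `skb_put()` calls `skb_over_panic()` when `tail` ends up beyond `end`. The helper name `parse_skb_over_panic` is made up for illustration. Under those assumptions the numbers are self-consistent: 64 bytes of headroom plus a 44117-byte put gives a tail of 0xac95, far past the 0x640 (1600-byte) end of the buffer.

```python
import re

# The skb_over_panic line exactly as transcribed in the report.
LINE = (
    "skb_over_panic: text:ffffffff813d711c len:44117 put:44117 "
    "head:ffff880415d40000 data:ffff880415d40040 tail:0xac95 end:0x640 dev:eth2"
)


def parse_skb_over_panic(line):
    """Hypothetical helper: split the key:value pairs of the line into a dict."""
    return dict(re.findall(r"(\w+):(\S+)", line))


fields = parse_skb_over_panic(LINE)

# Headroom is the distance between the start of the buffer and the data pointer.
headroom = int(fields["data"], 16) - int(fields["head"], 16)
put = int(fields["put"])
tail = int(fields["tail"], 16)   # assumed to be an offset from head
end = int(fields["end"], 16)     # assumed to be an offset from head

# len == put, so the buffer was empty before this put: tail = headroom + put.
expected_tail = headroom + put
overflow = tail - end

print(f"headroom={headroom} put={put} tail={tail} end={end} overflow={overflow}")
```

Read this way, the driver tried to append roughly 43 KiB to a buffer sized for a single standard Ethernet frame, which suggests a corrupted length value rather than a slightly oversized packet.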



* RE: [E1000-devel] 2.6.39.2: skb_over_panic: kernel BUG at net/core/skbuff.c:127!
  2011-06-27 21:48 2.6.39.2: skb_over_panic: kernel BUG at net/core/skbuff.c:127! Justin Piszcz
@ 2011-06-27 22:13 ` Ronciak, John
  2011-06-27 22:20   ` Justin Piszcz
  0 siblings, 1 reply; 9+ messages in thread
From: Ronciak, John @ 2011-06-27 22:13 UTC (permalink / raw)
  To: Justin Piszcz, linux-kernel; +Cc: e1000-devel

Sorry to hear of your problem.  Please do not try to relate something that happened on RHEL4 to something that is happening in the 2.6.39.2 kernel.

Is the problem reproducible?  If so, does this happen every time? Which interface is associated with which HW?  The info included below doesn't say.  What type of traffic was happening on eth2?

Cheers,
John


> -----Original Message-----
> From: Justin Piszcz [mailto:jpiszcz@lucidpixels.com]
> Sent: Monday, June 27, 2011 2:49 PM
> To: linux-kernel@vger.kernel.org
> Cc: e1000-devel@lists.sourceforge.net
> Subject: [E1000-devel] 2.6.39.2: skb_over_panic: kernel BUG at
> net/core/skbuff.c:127!
> 
> Hi,
> 
> Looks like I am not the first one:
> https://bugzilla.redhat.com/show_bug.cgi?id=429868
> 
> Any thoughts on this one?
> http://home.comcast.net/~jpiszcz/20110627/IMG_2703.JPG
> 
> Rough transcription:
> 
> skb_over_panic: text:ffffffff813d711c len:44117 put:44117
> head:ffff880415d40000 data:ffff880415d40040 tail:0xac95 end:0x640
> dev:eth2
> -- [ cut here ] ---
> kernel BUG at net/core/skbuff.c:127!
> invalid opcode: 0000 [#1] SMP
> 
> Was playing a video over samba (eth0) and then the kernel panic'd on
> eth2?
> 
> I use all INTEL controllers:
> 
> 00:19.0 Ethernet controller: Intel Corporation 82578DC Gigabit Network Connection (rev 05)
> 01:00.0 Ethernet controller: Intel Corporation 82598EB 10-Gigabit AT2 Server Adapter (rev 01)
> 03:00.0 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
> 03:00.1 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
> 03:00.2 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
> 03:00.3 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
> 
> Justin.
> 
> 


* RE: [E1000-devel] 2.6.39.2: skb_over_panic: kernel BUG at net/core/skbuff.c:127!
  2011-06-27 22:13 ` [E1000-devel] " Ronciak, John
@ 2011-06-27 22:20   ` Justin Piszcz
  2011-06-27 22:28     ` Ronciak, John
  0 siblings, 1 reply; 9+ messages in thread
From: Justin Piszcz @ 2011-06-27 22:20 UTC (permalink / raw)
  To: Ronciak, John; +Cc: linux-kernel, e1000-devel



On Mon, 27 Jun 2011, Ronciak, John wrote:

> Sorry to hear of your problem.  Please do not try to relate something that happened on REHL4 to something that is happening in the 2.6.39 .2 kernel.
>
> Is the problem reproducible?  If so, does this happen every time? What interface had what HW associated with it?  The info included below doesn't say.  What in type of traffic was happening on eth2?
>
> Cheers,
> John
>

Hello John,

Not much actually, just regular cable modem network traffic (was not even
utilizing it heavily), vnstat below:

  eth2                                                                     18:11
   ^                                                              r
   |                                                  r           r
   |                                                  r           r
   |                                                  r           r
   |  r                                               r  r        r
   |  r                                               r  r        r  r
   |  r                                               r  r        r  r
   |  r                          r                    r  r  r  r  r  r
   |  r                          r                    r  r  r  r  r  r  r
   |  r                          r                    r  r  r  r  r  r  r
  -+--------------------------------------------------------------------------->
   |  19 20 21 22 23 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18

  h  rx (KiB)   tx (KiB)      h  rx (KiB)   tx (KiB)      h  rx (KiB)   tx (KiB) 
19      89730       8838    03      11249        620    11     126496       9383
20      10972        596    04      43551       1729    12      86326       6513
21      11489        611    05      12159        699    13      47805       3863
22      12073        630    06      12080        712    14      52322       4577
23      12153        654    07      11894        710    15     135715       8618
00      11536        594    08       7392        396    16      75680       6181
01      11159        560    09          0          0    17      33817       3213
02      11564        575    10        759         51    18       4315        190

The most recent crash was at 16:33.

The system crashed earlier too but I had X running so I couldn't get a 
screenshot, there is no serial port on this machine and netconsole does not
work with this type of failure.

It is unfortunately not reproducible. Are there any debugging options that
you would recommend I enable to expose this bug? I'm up for
anything at this point.

All of this may have started when I added a 4-port Intel NIC:
http://www.intel.com/Assets/PDF/prodbrief/323205.pdf

NIC = Intel Ethernet I340 Server Adapter

But this is just a guess..

Justin.
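
To put the vnstat table in perspective, here is a small sketch that converts the hourly receive counters into average rates. The figures are copied from the table in this message. The tuple layout and the helper name `avg_rate_kib_s` are illustrative assumptions. Hour 18 is partial (the snapshot was taken at 18:11), and hour 16 includes the crash.

```python
# (hour, rx KiB, tx KiB) for eth2, copied from the vnstat table in this message.
HOURLY = [
    ("19", 89730, 8838), ("20", 10972, 596), ("21", 11489, 611),
    ("22", 12073, 630), ("23", 12153, 654), ("00", 11536, 594),
    ("01", 11159, 560), ("02", 11564, 575), ("03", 11249, 620),
    ("04", 43551, 1729), ("05", 12159, 699), ("06", 12080, 712),
    ("07", 11894, 710), ("08", 7392, 396), ("09", 0, 0),
    ("10", 759, 51), ("11", 126496, 9383), ("12", 86326, 6513),
    ("13", 47805, 3863), ("14", 52322, 4577), ("15", 135715, 8618),
    ("16", 75680, 6181), ("17", 33817, 3213), ("18", 4315, 190),
]


def avg_rate_kib_s(kib, seconds=3600):
    """Average rate in KiB/s for a counter accumulated over `seconds`."""
    return kib / seconds


# Busiest hour by received volume, and the hour in which the crash happened.
busiest = max(HOURLY, key=lambda row: row[1])
crash_hour = next(row for row in HOURLY if row[0] == "16")

print(f"busiest hour {busiest[0]}: {avg_rate_kib_s(busiest[1]):.1f} KiB/s rx")
print(f"crash hour {crash_hour[0]}: {avg_rate_kib_s(crash_hour[1]):.1f} KiB/s rx")
```

Even the busiest hour averages only a few tens of KiB/s, which supports the statement that the link was lightly loaded. Hourly averages cannot rule out short bursts, though.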






* RE: [E1000-devel] 2.6.39.2: skb_over_panic: kernel BUG at net/core/skbuff.c:127!
  2011-06-27 22:20   ` Justin Piszcz
@ 2011-06-27 22:28     ` Ronciak, John
  2011-06-27 22:34       ` Justin Piszcz
  0 siblings, 1 reply; 9+ messages in thread
From: Ronciak, John @ 2011-06-27 22:28 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-kernel, e1000-devel

> Hello John,
> 
> Not much actually, just regular cable modem network traffic (was not
> even utilizing it heavily), vnstat below:
> 
>   eth2                                                                     18:11
>    ^                                                              r
>    |                                                  r           r
>    |                                                  r           r
>    |                                                  r           r
>    |  r                                               r  r        r
>    |  r                                               r  r        r  r
>    |  r                                               r  r        r  r
>    |  r                          r                    r  r  r  r  r  r
>    |  r                          r                    r  r  r  r  r  r  r
>    |  r                          r                    r  r  r  r  r  r  r
>   -+--------------------------------------------------------------------------->
>    |  19 20 21 22 23 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18
> 
>   h  rx (KiB)   tx (KiB)      h  rx (KiB)   tx (KiB)      h  rx (KiB)   tx (KiB)
> 19      89730       8838    03      11249        620    11     126496       9383
> 20      10972        596    04      43551       1729    12      86326       6513
> 21      11489        611    05      12159        699    13      47805       3863
> 22      12073        630    06      12080        712    14      52322       4577
> 23      12153        654    07      11894        710    15     135715       8618
> 00      11536        594    08       7392        396    16      75680       6181
> 01      11159        560    09          0          0    17      33817       3213
> 02      11564        575    10        759         51    18       4315        190
> 
> The most recent crash was at 16:33.
> 
> The system crashed earlier too but I had X running so I couldn't get a
> screenshot, there is no serial port on this machine and netconsole does
> not work with this type of failure.
> 
> It is unfortunately not reproducible, is there any debugging options
> that you would recommend I can enable that will expose this bug? I'm up
> for anything at this point.
> 
> All of this may have started when I added a 4-port Intel NIC:
> http://www.intel.com/Assets/PDF/prodbrief/323205.pdf
> 
> NIC = Intel Ethernet I340 Server Adatper
> 
> But this is just a guess..
> 
> Justin.
You still didn't tell us what eth interface is on which HW.  We need to know that.  Do an 'ethtool -i eth2' and the same for each of the other interfaces in the system.  The igb driver that is used on the NIC described above is in high use without people reporting this error.

Cheers,
John




* RE: [E1000-devel] 2.6.39.2: skb_over_panic: kernel BUG at net/core/skbuff.c:127!
  2011-06-27 22:28     ` Ronciak, John
@ 2011-06-27 22:34       ` Justin Piszcz
  2011-06-28  0:08         ` Justin Piszcz
  0 siblings, 1 reply; 9+ messages in thread
From: Justin Piszcz @ 2011-06-27 22:34 UTC (permalink / raw)
  To: Ronciak, John; +Cc: linux-kernel, e1000-devel



On Mon, 27 Jun 2011, Ronciak, John wrote:

>> Hello John,
>> Justin.
> You still didn't tell us what eth interface is on which HW.  We need to know that.  Do an 'ethtool -i eth2' and the same for each of the other interfaces in the system.  The igb driver that is used on the NIC described above is in high use without people reporting this error.
>
> Cheers,
> John

Sorry,

eth0 = e1000e (on-board (on an Intel DP55KG))
eth{1,2,3,4} = igb (the 4-port NIC I mentioned)
eth5 = ixgbe (the 10GbE AT2 server board (copper))

--

[    2.301687] e1000e 0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width x1) hidden:mac
[    2.301878] e1000e 0000:00:19.0: eth0: Intel(R) PRO/1000 Network Connection
[    2.302162] e1000e 0000:00:19.0: eth0: MAC: 9, PHY: 9, PBA No: FFFFFF-0FF

[    2.350085] igb 0000:03:00.0: eth1: (PCIe:2.5Gb/s:Width x4) hidden:mac
[    2.350351] igb 0000:03:00.0: eth1: PBA No: E84075-002
[    2.400381] igb 0000:03:00.1: eth2: (PCIe:2.5Gb/s:Width x4) hidden:mac
[    2.400486] igb 0000:03:00.1: eth2: PBA No: E84075-002
[    2.448103] igb 0000:03:00.2: eth3: (PCIe:2.5Gb/s:Width x4) hidden:mac
[    2.448391] igb 0000:03:00.2: eth3: PBA No: E84075-002
[    2.496095] igb 0000:03:00.3: eth4: (PCIe:2.5Gb/s:Width x4) hidden:mac
[    2.496377] igb 0000:03:00.3: eth4: PBA No: E84075-002

[   38.916123] ixgbe 0000:01:00.0: eth5: NIC Link is Up 10 Gbps, Flow Control: RX/TX

--

ethtool output as requested:

ethtool -i eth0
driver: e1000e
version: 1.3.10-k2
firmware-version: 0.12-5
bus-info: 0000:00:19.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes

ethtool -i eth1
driver: igb
version: 3.0.6-k2
firmware-version: 3.19-0
bus-info: 0000:03:00.3
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes

ethtool -i eth2
driver: igb
version: 3.0.6-k2
firmware-version: 3.19-0
bus-info: 0000:03:00.2
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes

ethtool -i eth3
driver: igb
version: 3.0.6-k2
firmware-version: 3.19-0
bus-info: 0000:03:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes

ethtool -i eth4
driver: igb
version: 3.0.6-k2
firmware-version: 3.19-0
bus-info: 0000:03:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes

ethtool -i eth5
driver: ixgbe
version: 3.2.9-k2
firmware-version: 2.9-0
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes

Justin.
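
Since the question was which interface sits on which hardware, here is a small sketch that cross-checks the two listings in this message. It parses the boot-time dmesg lines and compares their PCI addresses with the bus-info values from the ethtool output. The regular expression and the helper name `dmesg_bus_info` are assumptions made for illustration, and the data is copied from above, not queried from a live system.

```python
import re

# Boot-time dmesg lines copied from this message (one per interface).
DMESG = """\
[    2.301687] e1000e 0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width x1) hidden:mac
[    2.350085] igb 0000:03:00.0: eth1: (PCIe:2.5Gb/s:Width x4) hidden:mac
[    2.400381] igb 0000:03:00.1: eth2: (PCIe:2.5Gb/s:Width x4) hidden:mac
[    2.448103] igb 0000:03:00.2: eth3: (PCIe:2.5Gb/s:Width x4) hidden:mac
[    2.496095] igb 0000:03:00.3: eth4: (PCIe:2.5Gb/s:Width x4) hidden:mac
[   38.916123] ixgbe 0000:01:00.0: eth5: NIC Link is Up 10 Gbps, Flow Control: RX/TX
"""

# bus-info values copied from the ethtool output in this message.
ETHTOOL_BUS_INFO = {
    "eth0": "0000:00:19.0",
    "eth1": "0000:03:00.3",
    "eth2": "0000:03:00.2",
    "eth3": "0000:03:00.1",
    "eth4": "0000:03:00.0",
    "eth5": "0000:01:00.0",
}

# Matches "] <driver> <pci-address>: <ethN>:" in a dmesg line.
LINE_RE = re.compile(r"\]\s+(\w+) ([0-9a-f:.]+): (eth\d+):")


def dmesg_bus_info(text):
    """Hypothetical helper: map interface name to PCI address from dmesg text."""
    return {m.group(3): m.group(2) for m in LINE_RE.finditer(text)}


boot = dmesg_bus_info(DMESG)
mismatched = sorted(name for name in boot if boot[name] != ETHTOOL_BUS_INFO.get(name))
print("interfaces whose PCI address differs between dmesg and ethtool:", mismatched)
```

With the values as posted, the four igb ports appear in opposite order in the two listings. One possible explanation, not confirmed anywhere in this thread, is that the interfaces were renamed after the driver loaded. Either way, the ethtool bus-info is the value to trust for the running system.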





* RE: [E1000-devel] 2.6.39.2: skb_over_panic: kernel BUG at net/core/skbuff.c:127!
  2011-06-27 22:34       ` Justin Piszcz
@ 2011-06-28  0:08         ` Justin Piszcz
  2011-06-28 15:50           ` Alexander Duyck
                             ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Justin Piszcz @ 2011-06-28  0:08 UTC (permalink / raw)
  To: Ronciak, John; +Cc: linux-kernel, e1000-devel


On Mon, 27 Jun 2011, Justin Piszcz wrote:

>
>
> On Mon, 27 Jun 2011, Ronciak, John wrote:
>

Hi,

Here's another crash: (see the dmesg, it's right when powering the disks up)
http://home.comcast.net/~jpiszcz/20110627/IMG_2704.JPG

In this case, I was mkfs.xfs -f (some disks attached to a sata dock) over
an Sil 3132 card, I disconnected the card and re-ran it w/ the on-board
SATA controller and the problem no longer occurred (crashed repeatedly
every time with the NIC error), strange.

In any case, will let you know if there are any further crashes after
removing that PCI-e card.

Justin.



* Re: [E1000-devel] 2.6.39.2: skb_over_panic: kernel BUG at net/core/skbuff.c:127!
  2011-06-28  0:08         ` Justin Piszcz
@ 2011-06-28 15:50           ` Alexander Duyck
  2011-06-30 23:03           ` Justin Piszcz
       [not found]           ` <alpine.DEB.2.02.1107012149480.8312@p34.internal.lan>
  2 siblings, 0 replies; 9+ messages in thread
From: Alexander Duyck @ 2011-06-28 15:50 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: Ronciak, John, e1000-devel, linux-kernel

On 06/27/2011 05:08 PM, Justin Piszcz wrote:
> Hi,
>
> Here's another crash: (see the dmesg, its right when powering the disks up)
> http://home.comcast.net/~jpiszcz/20110627/IMG_2704.JPG
>
> In this case, I was mkfs.xfs -f (some disks attached to a sata dock) over
> an Sil 3132 card, I disconnected the card and re-ran it w/ the on-board
> SATA controller and the problem no longer occurred (crashed repeatedly
> everytime with the NIC error), strange.
>
> In any case, will let you know if there are any further crashes after
> removing that PCI-e card.
>
> Justin.
Justin,

One other thing you might try is downloading and installing our latest 
igb driver from e1000.sf.net.  It looks like you are currently using the 
in-kernel driver and it is possible that there may be differences 
between the two that could resolve the issue you are experiencing.

If you are able to reproduce the issue with the Sourceforge driver then 
that will provide valuable information.  Once we have reproduced the 
issue with the Sourceforge driver, we would be able to provide you a 
debug driver so that we can narrow down this issue further.

Thanks,

Alex


* RE: [E1000-devel] 2.6.39.2: skb_over_panic: kernel BUG at net/core/skbuff.c:127!
  2011-06-28  0:08         ` Justin Piszcz
  2011-06-28 15:50           ` Alexander Duyck
@ 2011-06-30 23:03           ` Justin Piszcz
       [not found]           ` <alpine.DEB.2.02.1107012149480.8312@p34.internal.lan>
  2 siblings, 0 replies; 9+ messages in thread
From: Justin Piszcz @ 2011-06-30 23:03 UTC (permalink / raw)
  To: Ronciak, John; +Cc: linux-kernel, e1000-devel



On Mon, 27 Jun 2011, Justin Piszcz wrote:

>
> On Mon, 27 Jun 2011, Justin Piszcz wrote:
>
>> 
>> 
>> On Mon, 27 Jun 2011, Ronciak, John wrote:
>> 
>
> Hi,
>
> Here's another crash: (see the dmesg, its right when powering the disks up)
> http://home.comcast.net/~jpiszcz/20110627/IMG_2704.JPG
>
> In this case, I was mkfs.xfs -f (some disks attached to a sata dock) over
> an Sil 3132 card, I disconnected the card and re-ran it w/ the on-board
> SATA controller and the problem no longer occurred (crashed repeatedly
> everytime with the NIC error), strange.
>
> In any case, will let you know if there are any further crashes after
> removing that PCI-e card.
>
> Justin.
>
>

Hi,

Per:
http://www.mail-archive.com/e1000-devel@lists.sourceforge.net/msg04232.html

I am using the drivers on e1000.sf.net for:
e1000e
igb
ixgbe

version:        1.3.17-NAPI
srcversion:     BA556C5C800B0D67E5F8B84
version:        3.0.22
srcversion:     45B8078075068728A5A5573
version:        3.3.9-NAPI
srcversion:     0734B0E06E21B50A92ADDFF

No crashes when I run mkfs.xfs (w/the eSATA card back in).

Will monitor throughout to see if it recurs.
When will the current -stable versions go into mainline?

Also, is there a kernel option to 'pause' or take a screenshot of a kernel 
console crash/dump/stack trace (besides kdump) and not reboot the machine 
when it crashes?

I do not have any option to reboot on panic, but sometimes it still does 
that.

Thanks!

Justin.
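
On the reboot-on-panic question, a hedged sketch: the kernel exposes the panic timeout as the `kernel.panic` sysctl (`/proc/sys/kernel/panic`) and the oops policy as `kernel.panic_on_oops`. The snippet below only reads and interprets those two values. The helper names are made up for illustration, and the interpretation follows the sysctl documentation of current kernels, so it may not match 2.6.39 exactly.

```python
from pathlib import Path


def read_int(path):
    """Return the integer stored in a sysctl file, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def describe_panic_timeout(value):
    """Interpret kernel.panic: 0 = stay halted, >0 = delay in s, <0 = reboot now."""
    if value is None:
        return "unknown"
    if value == 0:
        return "stay halted after a panic (no automatic reboot)"
    if value > 0:
        return f"reboot {value} s after a panic"
    return "reboot immediately after a panic"


def describe_panic_on_oops(value):
    """Interpret kernel.panic_on_oops: 0 = try to continue, 1 = panic on oops."""
    if value is None:
        return "unknown"
    return "panic on oops" if value else "try to continue after an oops"


if __name__ == "__main__":
    print("kernel.panic:", describe_panic_timeout(read_int("/proc/sys/kernel/panic")))
    print("kernel.panic_on_oops:",
          describe_panic_on_oops(read_int("/proc/sys/kernel/panic_on_oops")))
```

If `kernel.panic` already reads 0 and the machine still restarts, the reset is probably not coming from the kernel's panic path. A hardware watchdog or a power problem would be worth ruling out.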


* RE: [E1000-devel] 2.6.39.2: skb_over_panic: kernel BUG at net/core/skbuff.c:127!
       [not found]           ` <alpine.DEB.2.02.1107012149480.8312@p34.internal.lan>
@ 2011-07-02  1:53             ` Justin Piszcz
  0 siblings, 0 replies; 9+ messages in thread
From: Justin Piszcz @ 2011-07-02  1:53 UTC (permalink / raw)
  To: Ronciak, John; +Cc: linux-kernel, e1000-devel



On Fri, 1 Jul 2011, Justin Piszcz wrote:

>
>
> On Mon, 27 Jun 2011, Justin Piszcz wrote:
>
>> 
>> On Mon, 27 Jun 2011, Justin Piszcz wrote:
>> 
>>> 
>>> 
>>> On Mon, 27 Jun 2011, Ronciak, John wrote:
>>> 
>> 
>> Hi,
>> 
>> Here's another crash: (see the dmesg, its right when powering the disks up)
>> http://home.comcast.net/~jpiszcz/20110627/IMG_2704.JPG
>> 
>> In this case, I was mkfs.xfs -f (some disks attached to a sata dock) over
>> an Sil 3132 card, I disconnected the card and re-ran it w/ the on-board
>> SATA controller and the problem no longer occurred (crashed repeatedly
>> everytime with the NIC error), strange.
>> 
>> In any case, will let you know if there are any further crashes after
>> removing that PCI-e card.
>> 
>> Justin.
>> 
>> 
>


Hello,

Please ignore all my bug reports concerning kernel crashes for the past 3-4
weeks, my PSU from 2008 failed, I replaced it and my system is back up and
running, I'm sure it will be fine now.

I'll also update the bugzilla entry, for 2.6.38->2.6.39.

Thanks,

Justin.


