* Re: linux 2.6.9: r8169: eth0: PCI error (status: 0x8404). Device disabled.
From: Eamonn Hamilton @ 2004-10-21  9:25 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev

Hi Francois,

Many apologies for the top posting, I'm in an Outlook land and it's
given me bad habits :( ( mea culpa, mea culpa, mea maxima culpa, je
m'apologie :)

As you surmised, the card seems to work OK under moderate load, but
under heavy transmit load the error occurs. It hasn't been seen under
Anyway, I now have 2.6.9 patched with the patch you sent with NAPI
enabled, 2.6.7 and 2.6.8 installed. I'm also going to install another
kernel with the code you mentioned commented out as follows, yes ?

if (unlikely(status & SYSErr)) {
   /* rtl8169_pcierr_interrupt(dev); */
   break;
}
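
For reference, a minimal userspace sketch of what that `status & SYSErr` test looks at, using the 0x8404 value from the subject line. The bit values (SYSErr as 0x8000, TxOK as 0x0004) are assumptions recalled from the r8169 interrupt-status definitions and should be checked against the driver source. The helper names are hypothetical, and `unlikely()` is replaced by a GCC/Clang stand-in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, recalled from the r8169 interrupt-status bits;
 * verify against the driver source for your kernel version. */
#define SYSErr 0x8000u /* PCI/system error reported by the chip */
#define TxOK   0x0004u /* a transmit completed */

/* Userspace stand-in for the kernel's branch-prediction hint
 * (relies on the GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: true when the system-error bit is set. */
static inline bool sys_err_pending(uint16_t status)
{
    return unlikely(status & SYSErr) != 0;
}

/* Hypothetical helper: true when a transmit completion is flagged. */
static inline bool tx_done(uint16_t status)
{
    return (status & TxOK) != 0;
}
```

Under those assumptions, both helpers return true for 0x8404: the system-error bit is set together with a transmit completion.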

You also mentioned being able to enable TX checksum and segmentation
offload - how do I enable this, is it through ethtool or something more
esoteric?

Thanks again for the help, hopefully I'll be able to get this lot tested
over the next couple of days and let you know how it went.

Cheers,
Eamonn


* Re: linux 2.6.9: r8169: eth0: PCI error (status: 0x8404). Device disabled.
From: Francois Romieu @ 2004-10-21 12:02 UTC (permalink / raw)
  To: Eamonn Hamilton; +Cc: netdev

Eamonn Hamilton <EAMONN.HAMILTON@saic.com> :
[...]
> As you surmised, the card seems to work OK under moderate load, but
> under heavy transmit load the error occurs. It hasn't been seen under
> Anyway, I now have 2.6.9 patched with the patch you sent with NAPI
> enabled, 2.6.7 and 2.6.8 installed. I'm also going to install another
> kernel with the code you mentioned commented out as follows, yes ?
> 
> if (unlikely(status & SYSErr)) {
>    /* rtl8169_pcierr_interrupt(dev); */
>    break;
> }

Yep.
This is a gross hack but it is the first time that an unexpected
PCI error is reported on x86 (they are expected when I do fancy
testing :o) ). So I wonder if it really harms.

> You also mentioned being able to enable TX checksum and segmentation
> offload - how do I enable this, is it through ethtool or something more
> esoteric?

ethtool -K ethX tx on sg on tso on

You'll need the patch for TSO that M. Xu posted a few days ago. It is in
the vanilla tree and will appear in the upcoming -bk snapshot.

> Thanks again for the help, hopefully I'll be able to get this lot tested
> over the next couple of days and let you know how it went.

Apparently there is a Master Abort, which is indistinguishable from the
amd64 DAC error. The current code is only enough for my home system to
recover from a PCI error issued by the 8169. It will have to be tweaked.

--
Ueimor


* Re: linux 2.6.9: r8169: eth0: PCI error (status: 0x8404). Device disabled.
From: Eamonn Hamilton @ 2004-10-22  9:31 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev

Hi Francois,

Testing 2.6.9 with just the untouched patch applied again fails :

PCI error (cmd = 0x0017, status = 0x22b0)
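
A minimal sketch for reading those two values, assuming `cmd` and `status` are the standard PCI_COMMAND and PCI_STATUS configuration registers. The bit positions are those of the PCI configuration header; the macro and helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* PCI_COMMAND bits (standard PCI configuration header). */
#define CMD_IO         0x0001u /* respond to I/O space accesses */
#define CMD_MEMORY     0x0002u /* respond to memory space accesses */
#define CMD_MASTER     0x0004u /* bus mastering enabled */
#define CMD_INVALIDATE 0x0010u /* memory write and invalidate */

/* PCI_STATUS error bits (standard PCI configuration header). */
#define STS_PARITY           0x0100u /* master data parity error */
#define STS_SIG_TARGET_ABORT 0x0800u /* signaled target abort */
#define STS_REC_TARGET_ABORT 0x1000u /* received target abort */
#define STS_REC_MASTER_ABORT 0x2000u /* received master abort */
#define STS_SIG_SYSTEM_ERROR 0x4000u /* signaled system error */
#define STS_DETECTED_PARITY  0x8000u /* detected parity error */

#define STS_ERROR_MASK (STS_PARITY | STS_SIG_TARGET_ABORT | \
                        STS_REC_TARGET_ABORT | STS_REC_MASTER_ABORT | \
                        STS_SIG_SYSTEM_ERROR | STS_DETECTED_PARITY)

/* Keep only the error bits of a PCI_STATUS value. */
static inline uint16_t pci_status_errors(uint16_t status)
{
    return (uint16_t)(status & STS_ERROR_MASK);
}

/* True when the device, acting as bus master, had its transaction
 * terminated by a master abort. */
static inline bool pci_master_abort(uint16_t status)
{
    return (status & STS_REC_MASTER_ABORT) != 0;
}

/* True when every bit in `bits` is set in the PCI_COMMAND value. */
static inline bool pci_cmd_enabled(uint16_t cmd, uint16_t bits)
{
    return (cmd & bits) == bits;
}
```

Read this way, 0x22b0 has only the received-master-abort bit set among the error bits, and 0x0017 shows I/O, memory, bus mastering and memory-write-invalidate enabled.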

Testing the kernel with the error reset disabled just causes networking
to stop, with no reported error after copying data from the box for a
bit ( < 1GB )

Tests were tried under 2.6.7 and 2.6.8, but the guy doing the testing
reported odd problems under these kernels ( they were the pre-packaged
debian ones, I'll run up a manual set today sometime ).

Hope this helps,
Eamonn


* Re: linux 2.6.9: r8169: eth0: PCI error (status: 0x8404). Device disabled.
From: Francois Romieu @ 2004-10-22 10:52 UTC (permalink / raw)
  To: Eamonn Hamilton; +Cc: netdev

Eamonn Hamilton <EAMONN.HAMILTON@saic.com> :
[...]
> Testing the kernel with the error reset disabled just causes networking
> to stop, with no reported error after copying data from the box for a
> bit ( < 1GB )

At this point, can you get network traffic again if the device goes
through close/open or the module through rmmod/insmod ?

> Hope this helps,

Yes.

--
Ueimor


* Re: linux 2.6.9: r8169: eth0: PCI error (status: 0x8404). Device disabled.
From: Eamonn Hamilton @ 2004-11-08 14:03 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Eamonn Hamilton, netdev

Hi Francois.

Sorry for the delay in getting back to you - real life has a nasty habit
of intruding :(

Anyway, the device was brought back to functioning order by removing the
module and re-inserting it. I have to confess, I forgot to try simply
downing/upping it, I'll ask my friend to try that should it happen
again. Bizarrely, the frequency of it happening seems to be down for
whatever reason - it took ~12GB traffic to cause it to happen this
afternoon.

Anyway, hope this helps.

Cheers,
Eamonn


-- 
Eamonn Hamilton

Senior Systems Engineer
SAIC Ltd
Tel : 01224 333833


* Re: linux 2.6.9: r8169: eth0: PCI error (status: 0x8404). Device disabled.
From: Francois Romieu @ 2004-11-09 23:43 UTC (permalink / raw)
  To: Eamonn Hamilton; +Cc: netdev

Eamonn Hamilton <EAMONN.HAMILTON@saic.com> :
[...]
> Anyway, the device was brought back to functioning order by removing the
> module and re-inserting it. I have to confess, I forgot to try simply
> downing/upping it, I'll ask my friend to try that should it happen
> again. Bizarrely, the frequency of it happening seems to be down for
> whatever reason - it took ~12GB traffic to cause it to happen this
> afternoon.
> 
> Anyway, hope this helps.

Yes. It suggests that it should not be too hard to add a hack for recovery.

I am a bit surprised by this isolated report though.

--
Ueimor


* Re: linux 2.6.9: r8169: eth0: PCI error (status: 0x8404). Device disabled.
From: Eamonn Hamilton @ 2004-11-10  9:28 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Eamonn Hamilton, netdev

I'm getting rather distrustful of this motherboard - it's an A7N8X-X,
with the NForce2 chipset. I'm sure I've seen some reports regarding DMA
oddities with it, but PCI errors?

Anyway, thanks for your help, it's much appreciated :)

Cheers,
Eamonn


* Re: linux 2.6.9: r8169: eth0: PCI error (status: 0x8404). Device disabled.
From: Eamonn Hamilton @ 2004-11-10 10:04 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev

On Wed, 2004-11-10 at 00:43 +0100, Francois Romieu wrote:
> Eamonn Hamilton <EAMONN.HAMILTON@saic.com> :
> [...]
> > Anyway, the device was brought back to functioning order by removing the
> > module and re-inserting it. I have to confess, I forgot to try simply
> > downing/upping it, I'll ask my friend to try that should it happen
> > again. Bizarrely, the frequency of it happening seems to be down for
> > whatever reason - it took ~12GB traffic to cause it to happen this
> > afternoon.
> > 
> > Anyway, hope this helps.
> 
> Yes. It suggests that it should not be too hard to add a hack for recovery.
> 
> I am a bit surprised by this isolated report though.

Arghh.

The guy whose computer it is has just told me that he was getting weird
errors until he rebooted it after reinserting the module, which makes me
very suspicious of DMA/PCI issues on this board. 

Ho-hum, more testing required.

Cheers,
Eamonn




