linux-kernel.vger.kernel.org archive mirror
* Re: [Fwd: 2.6.10-rc3: tulip-driver: tulip_stop_rxtx() failed]
       [not found] <41BAAC04.6090706@pobox.com>
@ 2004-12-12 21:48 ` Grant Grundler
  2004-12-13  2:19   ` frahm
       [not found]   ` <200412130313.iBD3DAF4004365@albireo.free.fr>
  0 siblings, 2 replies; 8+ messages in thread
From: Grant Grundler @ 2004-12-12 21:48 UTC (permalink / raw)
  To: linux-kernel, frahm; +Cc: Grant Grundler, John Linville, Jeff Garzik


Klaus Frahm wrote:
> I would like to give my feedback on a recent modification of the tulip
> driver in 2.6.10-rc3 by:
> 
> John W. Linville:
>   o tulip: make tulip_stop_rxtx() wait for DMA to fully stop

Klaus,
thanks for the feedback. I'm one of the original authors of the patch.

That patch was added to prevent the tulip driver from deallocating DMA buffers
while tulip was still doing DMA to consistent memory or RX buffers.
This showed up as an MCA (crash) on faster ia64 (1.6GHz) HP ZX1 platforms.


> I have a sitecom network card which happens to work with the tulip
> driver. I have observed in the kernel version 2.6.10-rc3 when I
> deconfigure the device with "ifconfig eth1 down" (or also with 
> "dhcpcd -k eth1"), I get the following message in dmesg and
> /var/log/messages:
> 0000:00:0e.0: tulip_stop_rxtx() failed
> The message did not appear until 2.6.10-rc2 and I assume it is due to
> the modification of "John W. Linville" mentioned above. 

Correct - it is.

> This message does not seem to create any problem for me and I do not
> require any assistance. I can configure and deconfigure the device as
> usual and the network card appears to work properly. However, I thought
> this information might be useful for debugging purposes for the
> developer.

I need just one or two more bits of info.
Apologies for not including that in the original patch. (see below)


> Since this is related to DMA this might also be a hardware bug since I
> use a 5 year old motherboard and since 2.4.21 I can no longer use DMA
> for my old cdrom. DMA for the harddisk works properly.

The message does not indicate a new problem on your platform.
And it's unlikely you will run into the same problem the patch
was intended to fix.


>  Furthermore I
> have two network cards, the first one (eth0) is:
> 
> 3c59x: Donald Becker and others. www.scyld.com/network/vortex.html
> 0000:00:0d.0: 3Com PCI 3c905C Tornado at 0xa800. Vers LK1.1.19
> 
> and the second one is the sitecom card which appears in dmesg as:
> 
> Linux Tulip driver version 1.1.13 (May 11, 2002)
> PCI: Found IRQ 10 for device 0000:00:0e.0
> PCI: Sharing IRQ 10 with 0000:00:0a.0
> tulip0:  MII transceiver #1 config 1000 status 786d advertising 05e1.
> eth1: ADMtek Comet rev 17 at 0001a400, 00:0C:F6:03:DA:D3, IRQ 10.
> 
> When I deactivate my internet connection (I am using a modem which
> provides a dhcp-server) by "dhcpcd -k eth1" I obtain in
> /var/log/messages:
> Dec 11 01:14:44 albireo dhcpcd[1867]: terminating on signal 1 
> Dec 11 01:14:44 albireo kernel: 0000:00:0e.0: tulip_stop_rxtx() failed

I'm kicking myself now since I didn't include the CSR5 or CSR6 value from
the last time it was read. Can you manually apply this small change
and try it for me?

-			printk(KERN_DEBUG "%s: tulip_stop_rxtx() failed\n",
-					tp->pdev->slot_name);
+			printk(KERN_DEBUG "%s: tulip_stop_rxtx() failed"
+					" (CSR5 0x%x CSR6 0x%x)\n",
+					tp->pdev->slot_name,
+					ioread32(ioaddr + CSR5),
+					ioread32(ioaddr + CSR6));

Basically, I need the CSR5/CSR6 contents after the loop is exited.
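
A minimal userspace sketch of the extended message, assuming the two
register values are already in hand. In the driver they come from
ioread32(ioaddr + CSR5) and ioread32(ioaddr + CSR6) and the text goes through
printk(); format_stop_failure() is a hypothetical helper used only for
illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build the diagnostic line with snprintf(). This is a stand-in for the
 * printk() call in the patch, not driver code.
 * Returns the number of characters written (excluding the terminator).
 */
static int format_stop_failure(char *buf, size_t len, const char *slot,
                               uint32_t csr5, uint32_t csr6)
{
    assert(buf != NULL && slot != NULL);
    /* %x expects unsigned int, so cast the fixed-width values explicitly. */
    return snprintf(buf, len,
                    "%s: tulip_stop_rxtx() failed (CSR5 0x%x CSR6 0x%x)\n",
                    slot, (unsigned int)csr5, (unsigned int)csr6);
}
```

Reading both registers at the point of failure, rather than reusing an
earlier read, is what makes the message reflect the state after the loop
has given up.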

I expect one of three things to fix this:
o The comet card needs more time than we've allocated.
  Could you also try larger values for "i" in the loop?
  e.g. 2000/10 or 4000/10

o The loop is too "tight" and poking the card every 10us is interfering
  with DMA.  The solution is to change the udelay(10) to 50 or 100
  (and the corresponding "i" value initialization).

o Chip defect. When DMA is stopped, CSR5 Transmit State and Receive
  State machines are expected to be zero. It's possible this chip
  just never sets those states. I suppose we could check CSR6 bits
  to confirm the ST and SR bits are clear before printing the message.
  The CSR6 value above will tell me if that's feasible.
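
The first two hypotheses are about the polling loop's time budget and its
cadence. A minimal userspace sketch, assuming a countdown loop of the shape
the diff context suggests (udelay(10) followed by "if (!i)"). The names
struct fake_nic, fake_read_csr5(), poll_until_stopped() and CSR5_STATE_MASK
are hypothetical, and the delay is only counted, not slept.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mask covering CSR5 TS (bits 22:20) and RS (bits 19:17). */
#define CSR5_STATE_MASK 0x007e0000u

/* Fake device: reports "busy" for the first busy_reads register reads. */
struct fake_nic {
    unsigned busy_reads;
    unsigned reads;
};

static uint32_t fake_read_csr5(struct fake_nic *nic)
{
    nic->reads++;
    return nic->reads <= nic->busy_reads ? CSR5_STATE_MASK : 0;
}

/*
 * Countdown poll: stop when the state bits clear or the counter reaches 0.
 * Returns the remaining count, so 0 means "timed out".
 */
static unsigned poll_until_stopped(struct fake_nic *nic, unsigned budget_us,
                                   unsigned step_us, unsigned *waited_us)
{
    assert(step_us > 0);
    unsigned i = budget_us / step_us;
    assert(i > 0);               /* a zero counter would wrap on --i */
    *waited_us = 0;
    while (--i && (fake_read_csr5(nic) & CSR5_STATE_MASK))
        *waited_us += step_us;   /* stands in for udelay(step_us) */
    return i;
}
```

With a device that never leaves the busy state, the function returns 0 after
(i - 1) delays, which is the condition under which the driver prints the
failure message. Raising budget_us tests the first hypothesis; raising
step_us tests the second.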


> Dec 11 01:14:44 albireo dhcpcd.exe: interface eth1 has been brought down
> 
> when I reactivate the connection afterwards with "dhcp eth1" I get:
> Dec 11 01:16:09 albireo dhcpcd.exe: interface eth1 has been configured with new IP=xx.xx.xx.xx
> Dec 11 01:16:12 albireo kernel: 0000:00:0e.0: tulip_stop_rxtx() failed
> Dec 11 01:16:12 albireo kernel: eth1: Setting full-duplex based on MII#1 link partner capability of 4061.
> 
> Afterwards the connection works properly. 
> 
> For information the values of /proc/interrupts:
>             CPU0       
>   0:    5010413          XT-PIC  timer
>   1:      10212          XT-PIC  i8042
>   2:          0          XT-PIC  cascade
>   5:          0          XT-PIC  uhci_hcd, eth0
>  10:       5357          XT-PIC  eth1, aic7xxx
>  12:     140379          XT-PIC  i8042
>  14:       7021          XT-PIC  ide0
>  15:       1650          XT-PIC  ide1
> NMI:          0 
> ERR:          0
> 
> and of lspci:
> 00:00.0 Host bridge: Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge (rev 03)
> 00:01.0 PCI bridge: Intel Corporation 440BX/ZX - 82443BX/ZX AGP bridge (rev 03)
> 00:04.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 02)
> 00:04.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01)
> 00:04.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01)
> 00:04.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 02)
> 00:0a.0 SCSI storage controller: Adaptec AHA-7850 (rev 03)
> 00:0d.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 78)
> 00:0e.0 Ethernet controller: Bridgecom, Inc: Unknown device 0985 (rev 11)
> 01:00.0 VGA compatible controller: ATI Technologies Inc 3D Rage Pro AGP 1X/2X (rev 5c)
> 
> I have no idea if my observation is really important or indicates a bug.

Yes - either a HW bug or it indicates we need to adjust the loop.

> I thought it might be useful to give my feedback before 2.6.10 comes out
> in case there is a bug.

yes - thanks.

> I can provide further information on my configuration if necessary.
> Please make in this case a cc to frahm_at_irsamc_dot_ups-tlse_dot_fr
> since I am not subscribed to the mailing list. 

The above output would be great. thanks!

grant

> 
> 
> Greetings, Klaus Frahm.

* Re: [Fwd: 2.6.10-rc3: tulip-driver: tulip_stop_rxtx() failed]
  2004-12-12 21:48 ` [Fwd: 2.6.10-rc3: tulip-driver: tulip_stop_rxtx() failed] Grant Grundler
@ 2004-12-13  2:19   ` frahm
  2004-12-13  3:36     ` Grant Grundler
       [not found]   ` <200412130313.iBD3DAF4004365@albireo.free.fr>
  1 sibling, 1 reply; 8+ messages in thread
From: frahm @ 2004-12-13  2:19 UTC (permalink / raw)
  To: grundler; +Cc: linux-kernel

Dear Grant, 

I am happy to help and to see my feedback was indeed useful. First I
would like to add to my first message, that I have observed a kind of
freezing for several seconds (5-10 seconds) of Firefox with one
particular web-page which heavily uses a flashplayer- and java-plugin.
This does not appear in exactly the same form up to 2.6.10-rc2 but the
web-page under consideration is always kind of "heavy" and difficult to
load even with a high-speed connection. 


> I'm kicking myself now since I didn't include the CSR5 or CSR6 value from
> the last time it was read. Can you manually apply this small change
> and try it for me?
> 
> -			printk(KERN_DEBUG "%s: tulip_stop_rxtx() failed\n",
> -					tp->pdev->slot_name);
> +			printk(KERN_DEBUG "%s: tulip_stop_rxtx() failed"
> +					" (CSR5 0x%x CSR6 0x%x)\n",
> +					tp->pdev->slot_name,
> +					ioread32(ioaddr + CSR5),
> +					ioread32(ioaddr + CSR6));
> 
> Basically, I need the CSR5/CSR6 contents after the loop is exited.
> 

I have applied this modification to the file "drivers/net/tulip/tulip.h"
and recompiled the modules. 

I have activated and deactivated the device eth1, which is attached to the
tulip driver, three times (with "dhcpcd eth1" and "dhcpcd -k eth1"). The
result in /var/log/messages is:

Dec 13 02:42:57 albireo kernel: Linux Tulip driver version 1.1.13 (May 11, 2002)
Dec 13 02:42:57 albireo kernel: PCI: Found IRQ 10 for device 0000:00:0e.0
Dec 13 02:42:57 albireo kernel: PCI: Sharing IRQ 10 with 0000:00:0a.0
Dec 13 02:42:57 albireo kernel: tulip0:  MII transceiver #1 config 1000 status 786d advertising 05e1.
Dec 13 02:42:57 albireo kernel: eth1: ADMtek Comet rev 17 at 0001a400, 00:0C:F6:03:DA:D3, IRQ 10.
Dec 13 02:42:57 albireo dhcpcd.exe: interface eth1 has been configured with new IP=xx.xx.xx.xx
Dec 13 02:43:00 albireo kernel: 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
Dec 13 02:43:00 albireo kernel: eth1: Setting full-duplex based on MII#1 link partner capability of 4061.
Dec 13 02:43:08 albireo dhcpcd[1340]: terminating on signal 1 
Dec 13 02:43:08 albireo kernel: 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)
Dec 13 02:43:08 albireo dhcpcd.exe: interface eth1 has been brought down
Dec 13 02:43:18 albireo dhcpcd.exe: interface eth1 has been configured with new IP=xx.xx.xx.xx
Dec 13 02:43:21 albireo kernel: 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
Dec 13 02:43:21 albireo kernel: eth1: Setting full-duplex based on MII#1 link partner capability of 4061.
Dec 13 02:43:45 albireo dhcpcd[1357]: terminating on signal 1 
Dec 13 02:43:45 albireo kernel: 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)
Dec 13 02:43:45 albireo dhcpcd.exe: interface eth1 has been brought down
Dec 13 02:44:01 albireo dhcpcd.exe: interface eth1 has been configured with new IP=xx.xx.xx.xx
Dec 13 02:44:03 albireo kernel: 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
Dec 13 02:44:03 albireo kernel: eth1: Setting full-duplex based on MII#1 link partner capability of 4061.
Dec 13 02:44:12 albireo dhcpcd[1374]: terminating on signal 1 
Dec 13 02:44:12 albireo kernel: 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)
Dec 13 02:44:12 albireo dhcpcd.exe: interface eth1 has been brought down

The message with "tulip_stop_rxtx() failed" appears each time I activate
and deactivate the device. The values of CSR5 and CSR6 are reproducible
but different for activation and deactivation:

  activation: (CSR5 0xfc664010 CSR6 0xff972113)
deactivation: (CSR5 0xfc06c012 CSR6 0xff970111)


I have also included the output of /proc/pci for the network card in case
it may contain some useful information:

  Bus  0, device  14, function  0:
    Ethernet controller: Linksys NC100 Network Everywhere Fast Ethernet 10/100 (rev 17).
      IRQ 10.
      Master Capable.  Latency=32.  Min Gnt=255.Max Lat=255.
      I/O at 0xa400 [0xa4ff].
      Non-prefetchable 32 bit memory at 0xe0000000 [0xe00003ff].


Greetings, Klaus.


* Re: [Fwd: 2.6.10-rc3: tulip-driver: tulip_stop_rxtx() failed]
  2004-12-13  2:19   ` frahm
@ 2004-12-13  3:36     ` Grant Grundler
  0 siblings, 0 replies; 8+ messages in thread
From: Grant Grundler @ 2004-12-13  3:36 UTC (permalink / raw)
  To: frahm; +Cc: grundler, linux-kernel, jgarzik

On Mon, Dec 13, 2004 at 03:19:20AM +0100, frahm@irsamc.ups-tlse.fr wrote:
> Dear Grant, 
> 
> I am happy to help and to see my feedback was indeed useful.

yes, indeed.

>  First I
> would like to add to my first message, that I have observed a kind of
> freezing for several seconds (5-10 seconds) of Firefox with one
> particular web-page which heavily uses a flashplayer- and java-plugin.
> This does not appear in exactly the same form up to 2.6.10-rc2 but the
> web-page under consideration is always kind of "heavy" and difficult to
> load even with a high-speed connection. 

Sorry - while this sounds like a pre-emption/scheduling problem,
I'm not much help in dealing with it. And I'm no friend of flash player
though I understand why corporate types like it (form of DRM).
But I don't like flash as it tends to take over the user session.

...
> Dec 13 02:42:57 albireo kernel: Linux Tulip driver version 1.1.13 (May 11, 2002)

Oh..that reminds me to ask for the "date" to either be updated or removed.

...
> The values of CSR5 and CSR6 are reproducible
> but different for activation and deactivation:
> 
>   activation: (CSR5 0xfc664010 CSR6 0xff972113)
> deactivation: (CSR5 0xfc06c012 CSR6 0xff970111)

Yes - I agree with your summary - Thanks!

The bits in CSR5 that I care about are TS (22:20) and RS (19:17).
For activation, TS is 6 and for deactivation it's 0. That's correct.
For activation and deactivation, RS is 3, and that's wrong.

TS
0 == Stopped--RESET command or STOP COMMAND issued, or transmit jabber expired
6 == Suspended--Transmit FIFO underflow, or an unavailable transmit descriptor

RS
3 == Running, waiting for RX packet

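And in CSR6, I only care about ST (bit 13) and SR (bit 1).
For activation ST is 1 and for deactivation it's 0.
SR is also 1 for activation and 0 for deactivation (0xff970111 has bit 1
clear), so the stop request was written, yet "CSR5.RS" still reads 3.

ST = Start/Stop transmission
SR = Start/Stop Receive

To summarize, CSR5 and CSR6 agree for TX, but for RX CSR5 still reports a
running state machine after SR has been cleared.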
It looks like this chip does not implement shutting down the RX engine
*or* it just lies about the state of the machine.
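
The field extraction above can be checked mechanically. A minimal sketch,
assuming zero-based bit numbering and the field positions named in this
message (TS 22:20, RS 19:17, ST bit 13, SR bit 1). The helper names bits(),
csr5_ts(), csr5_rs(), csr6_st() and csr6_sr() are hypothetical and not taken
from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits hi..lo (inclusive, zero-based) from a 32-bit register value. */
static uint32_t bits(uint32_t reg, unsigned hi, unsigned lo)
{
    assert(hi < 32 && lo <= hi);
    uint32_t width = hi - lo + 1;
    uint32_t mask = (width == 32) ? 0xffffffffu : ((1u << width) - 1u);
    return (reg >> lo) & mask;
}

/* CSR5 state fields. */
static uint32_t csr5_ts(uint32_t csr5) { return bits(csr5, 22, 20); }
static uint32_t csr5_rs(uint32_t csr5) { return bits(csr5, 19, 17); }

/* CSR6 start/stop control bits. */
static uint32_t csr6_st(uint32_t csr6) { return bits(csr6, 13, 13); }
static uint32_t csr6_sr(uint32_t csr6) { return bits(csr6, 1, 1); }
```

Applied to the logged values, this gives TS=6, RS=3, ST=1, SR=1 for
activation (0xfc664010 / 0xff972113) and TS=0, RS=3, ST=0, SR=0 for
deactivation (0xfc06c012 / 0xff970111). Note that bit 1 of 0xff970111 is
clear.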

My advice is do NOT ifdown the NIC once you bring it up, as I'm inclined
to believe the former. If the RX machine really isn't stopped, it will
continue to DMA and corrupt memory. To find out, one would need either a
PCI bus analyzer or a hack to the code that monitors the RX descriptors and
associated buffers *after* tulip_stop_rxtx() has been called to see if they
get modified.
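
The "monitor the buffers" idea amounts to a snapshot-and-compare check. A
minimal userspace sketch, assuming the region of interest can be copied and
re-read as ordinary memory; take_snapshot() and first_change() are
hypothetical helpers, and a real in-driver check would also have to deal
with DMA mapping and cache coherency, which this ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the current contents of a region so it can be compared later. */
static void take_snapshot(unsigned char *snapshot,
                          const unsigned char *live, size_t len)
{
    assert(snapshot != NULL && live != NULL);
    memcpy(snapshot, live, len);
}

/*
 * Compare a region against its earlier snapshot.
 * Returns the offset of the first differing byte, or -1 if unchanged.
 * In a driver, "live" would be the RX descriptor ring or an RX buffer.
 */
static long first_change(const unsigned char *snapshot,
                         const unsigned char *live, size_t len)
{
    assert(snapshot != NULL && live != NULL);
    for (size_t off = 0; off < len; off++)
        if (snapshot[off] != live[off])
            return (long)off;
    return -1;
}
```

Taking the snapshot right after the stop call and comparing some time later
would show whether anything is still writing to the region; an unchanged
region does not prove the engine is stopped, only that nothing arrived in
that window.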

> I have also joined the output of /proc/pci for the network card in case
> it may contain some useful information:
> 
>   Bus  0, device  14, function  0:
>     Ethernet controller: Linksys NC100 Network Everywhere Fast Ethernet 10/100 (rev 17).
>       IRQ 10.
>       Master Capable.  Latency=32.  Min Gnt=255.Max Lat=255.
>       I/O at 0xa400 [0xa4ff].
>       Non-prefetchable 32 bit memory at 0xe0000000 [0xe00003ff].

Nothing unusual here. Just good to document the defect.

thanks!
grant


* Re: [Fwd: 2.6.10-rc3: tulip-driver: tulip_stop_rxtx() failed]
       [not found]   ` <200412130313.iBD3DAF4004365@albireo.free.fr>
@ 2004-12-13  3:59     ` Grant Grundler
  2004-12-13 11:52       ` FRAHM Klaus
  2004-12-15  0:57       ` John W. Linville
  0 siblings, 2 replies; 8+ messages in thread
From: Grant Grundler @ 2004-12-13  3:59 UTC (permalink / raw)
  To: frahm; +Cc: grundler, linux-kernel, jgarzik

On Mon, Dec 13, 2004 at 04:13:10AM +0100, frahm@irsamc.ups-tlse.fr wrote:
> I am sorry, I forgot the modification for "i" in the loop and the udelay:

np...I really appreciate you taking the time to run these.

...
> Here is the output of dmesg (I carefully removed the old tulip module and 
> inserted its new version after each recompilation.)
> 
> --- i=2000/10, udelay(10)
> 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
> 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)

> --- i=4000/10, udelay(10)
> 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
> 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)

> --- i=1300/50, udelay(50)
> 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
> 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)

> --- i=4000/50, udelay(50)
> 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
> 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)

> --- i=1300/100, udelay(100)
> 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
> 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)

> --- i=4000/100, udelay(100)
> 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
> 0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)

> There is no modification in the values of CSR5 and CSR6.

yeah. :^(
Rules out those two theories pretty much.
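
For reference, the nominal time budgets of the six runs can be worked out
from the "i=budget/step, udelay(step)" labels. A minimal sketch; struct
poll_cfg, initial_count() and nominal_budget_us() are hypothetical names,
and the pre-decrement remark assumes a countdown loop of the form
while (--i && ...).

```c
#include <assert.h>

/* One test run: "i = budget_us / step_us" with udelay(step_us) per pass. */
struct poll_cfg {
    unsigned budget_us;
    unsigned step_us;
};

/* Initial loop counter, using the same integer division as the labels. */
static unsigned initial_count(struct poll_cfg c)
{
    assert(c.step_us > 0);
    return c.budget_us / c.step_us;
}

/*
 * Nominal time budget in microseconds: counter times delay. A pre-decrement
 * countdown would sleep one step less than this.
 */
static unsigned nominal_budget_us(struct poll_cfg c)
{
    return initial_count(c) * c.step_us;
}
```

The runs span nominal budgets from 1300 us to 4000 us at steps of 10, 50 and
100 us, and every one of them logged the same CSR5/CSR6 pair.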

> I suppose this implies a Chip defect which is quite plausible
> since I have a cheap Sitecom card which is perhaps not 100%
> compatible with the tulip-driver ? 

Definitely not compatible.

And as noted in the previous email, I don't advise running ifdown on the NIC
unless you can verify it will not corrupt the memory that was
previously used for RX descriptors and RX buffers.

OTOH, 100BT cards are *so* cheap, it should be possible to replace it
if it's not built-in on the motherboard.

Sorry for the bad news and thanks for doing the extra tests.

But still, I'm hoping for two code changes as a result:
1) include CSR5 and CSR6 in the printk output
2) the date of the tulip driver revision needs to be updated (or dropped).

thanks,
grant


* Re: [Fwd: 2.6.10-rc3: tulip-driver: tulip_stop_rxtx() failed]
  2004-12-13  3:59     ` Grant Grundler
@ 2004-12-13 11:52       ` FRAHM Klaus
  2004-12-15  0:57       ` John W. Linville
  1 sibling, 0 replies; 8+ messages in thread
From: FRAHM Klaus @ 2004-12-13 11:52 UTC (permalink / raw)
  To: grundler; +Cc: linux-kernel

> 
> Definitely not compatible.
> 
> And as noted in the previous email, I don't advise ifdown the NIC
> unless you can verify it will not corrupt the memory that was
> previously used for RX descriptors and RX buffers.
> 
> OTOH, 100BT cards are *so* cheap, it should be possible to replace
> if it's not built-in on the motherboard.
> 
> Sorry for the bad news and thanks for doing the extra tests.
> 
> But still, I'm hoping for two code changes as a result:
> 1) include CSR5 and CSR6 in the printk output
> 2) the date of the tulip driver revision needs to be updated (or dropped).
> 
> thanks,
> grant

Hi Grant,

thanks for the analysis. It's good to know not to deactivate the NIC. I
never knowingly encountered any real problem, but there may have been
other problems which I did not attribute to the NIC. Normally I do
not deactivate the interface, but up to a few weeks ago the DHCP daemon
could do this in order to change the IP number. Now this is over and I
have had a fixed IP number since then (because I am directly connected to
the DSLAM of my ISP; this is called "degroupage" in France, as opposed to
IP/ADSL where one is only indirectly connected via France Telecom).
I am actually using a "Freebox" as a modem, which has recently, let's
say, aroused some interest on the linux-kernel mailing list because it
runs Linux inside:

http://marc.theaimsgroup.com/?l=linux-kernel&m=109936781417837&w=2

As for the NIC, I mentioned in my first message that I have two NICs,
the other being a 3Com card which is most likely better. I will simply
switch the roles of my first and second interfaces by exchanging
the aliases for eth0/1 in modprobe.conf. Then I will use the Sitecom card
with the tulip driver only for the internal connection, which is never
shut down and not used so often.


Greetings, Klaus.




* Re: [Fwd: 2.6.10-rc3: tulip-driver: tulip_stop_rxtx() failed]
  2004-12-13  3:59     ` Grant Grundler
  2004-12-13 11:52       ` FRAHM Klaus
@ 2004-12-15  0:57       ` John W. Linville
  2004-12-15 17:21         ` Grant Grundler
  1 sibling, 1 reply; 8+ messages in thread
From: John W. Linville @ 2004-12-15  0:57 UTC (permalink / raw)
  To: Grant Grundler; +Cc: frahm, linux-kernel, jgarzik

On Sun, Dec 12, 2004 at 08:59:36PM -0700, Grant Grundler wrote:

> But still, I'm hoping for two code changes as a result:
> 1) include CSR5 and CSR6 in the printk output
> 2) the date of the tulip driver revision needs to be updated (or dropped).

Patches?

If you don't want to post them publicly yourself, send them to me
and I'll be happy to test/package/post them...

John
-- 
John W. Linville
linville@tuxdriver.com


* Re: [Fwd: 2.6.10-rc3: tulip-driver: tulip_stop_rxtx() failed]
  2004-12-15  0:57       ` John W. Linville
@ 2004-12-15 17:21         ` Grant Grundler
  0 siblings, 0 replies; 8+ messages in thread
From: Grant Grundler @ 2004-12-15 17:21 UTC (permalink / raw)
  To: Grant Grundler, frahm, linux-kernel, jgarzik

On Tue, Dec 14, 2004 at 07:57:22PM -0500, John W. Linville wrote:
> On Sun, Dec 12, 2004 at 08:59:36PM -0700, Grant Grundler wrote:
> 
> > But still, I'm hoping for two code changes as a result:
> > 1) include CSR5 and CSR6 in the printk output
> > 2) the date of the tulip driver revision needs to be updated (or dropped).
> 
> Patches?

Sorry...appended below.

> If you don't want to post them publicly yourself, send them to me
> and I'll be happy to test/package/post them...

Thanks!
But I don't mind posting them...I just spaced out and sent the first bit
to jgarzik directly instead of "reply all".

Commit Log:
	add CSR5 and CSR6 output to debug tulip_stop_rxtx failures.
	Update version release date

Signed-off-by:
	Grant Grundler <grundler@parisc-linux.org>

thanks,
grant

Index: drivers/net/tulip/tulip.h
===================================================================
RCS file: /var/cvs/linux-2.6/drivers/net/tulip/tulip.h,v
retrieving revision 1.11
diff -u -p -r1.11 tulip.h
--- drivers/net/tulip/tulip.h	4 Dec 2004 07:02:42 -0000	1.11
+++ drivers/net/tulip/tulip.h	12 Dec 2004 21:51:43 -0000
@@ -474,8 +474,11 @@ static inline void tulip_stop_rxtx(struc
 			udelay(10);
 
 		if (!i)
-			printk(KERN_DEBUG "%s: tulip_stop_rxtx() failed\n",
-					tp->pdev->slot_name);
+			printk(KERN_DEBUG "%s: tulip_stop_rxtx() failed"
+					" (CSR5 0x%x CSR6 0x%x)\n",
+					tp->pdev->slot_name,
+					ioread32(ioaddr + CSR5),
+					ioread32(ioaddr + CSR6));
 	}
 }
 

Index: drivers/net/tulip/tulip_core.c
===================================================================
RCS file: /var/cvs/linux-2.6/drivers/net/tulip/tulip_core.c,v
retrieving revision 1.24
diff -u -p -r1.24 tulip_core.c
--- drivers/net/tulip/tulip_core.c	4 Dec 2004 07:02:42 -0000	1.24
+++ drivers/net/tulip/tulip_core.c	15 Dec 2004 17:18:39 -0000
@@ -22,7 +22,7 @@
 #else
 #define DRV_VERSION	"1.1.13"
 #endif
-#define DRV_RELDATE	"May 11, 2002"
+#define DRV_RELDATE	"December 15, 2004"
 
 
 #include <linux/module.h>


* Re: [Fwd: 2.6.10-rc3: tulip-driver: tulip_stop_rxtx() failed]
@ 2004-12-13  3:14 frahm
  0 siblings, 0 replies; 8+ messages in thread
From: frahm @ 2004-12-13  3:14 UTC (permalink / raw)
  To: linux-kernel

I am sorry, I forgot the modification for "i" in the loop and the udelay:

> I expect one of three things to fix this:
> o The comet card needs more time than we've allocated.
>   Could you also try larger values for "i" in the loop?
>   e.g. 2000/10 or 4000/10
> 
> o The loop is too "tight" and poking the card every 10us is interfering
>   with DMA.  The solution is to change the udelay(10) to 50 or 100
>   (and the corresponding "i" value initialization).

Here is the output of dmesg (I carefully removed the old tulip module and 
inserted its new version after each recompilation.)

--- i=2000/10, udelay(10)
Linux Tulip driver version 1.1.13 (May 11, 2002)
PCI: Found IRQ 10 for device 0000:00:0e.0
PCI: Sharing IRQ 10 with 0000:00:0a.0
tulip0:  MII transceiver #1 config 1000 status 786d advertising 05e1.
eth1: ADMtek Comet rev 17 at 0001a400, 00:0C:F6:03:DA:D3, IRQ 10.
0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
eth1: Setting full-duplex based on MII#1 link partner capability of 4061.
0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)
--- i=4000/10, udelay(10)
Linux Tulip driver version 1.1.13 (May 11, 2002)
PCI: Found IRQ 10 for device 0000:00:0e.0
PCI: Sharing IRQ 10 with 0000:00:0a.0
tulip0:  MII transceiver #1 config 1000 status 786d advertising 05e1.
eth1: ADMtek Comet rev 17 at 0001a400, 00:0C:F6:03:DA:D3, IRQ 10.
0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
eth1: Setting full-duplex based on MII#1 link partner capability of 4061.
0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)
--- i=1300/50, udelay(50)
Linux Tulip driver version 1.1.13 (May 11, 2002)
PCI: Found IRQ 10 for device 0000:00:0e.0
PCI: Sharing IRQ 10 with 0000:00:0a.0
tulip0:  MII transceiver #1 config 1000 status 786d advertising 05e1.
eth1: ADMtek Comet rev 17 at 0001a400, 00:0C:F6:03:DA:D3, IRQ 10.
0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
eth1: Setting full-duplex based on MII#1 link partner capability of 4061.
0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)
--- i=4000/50, udelay(50)
Linux Tulip driver version 1.1.13 (May 11, 2002)
PCI: Found IRQ 10 for device 0000:00:0e.0
PCI: Sharing IRQ 10 with 0000:00:0a.0
tulip0:  MII transceiver #1 config 1000 status 786d advertising 05e1.
eth1: ADMtek Comet rev 17 at 0001a400, 00:0C:F6:03:DA:D3, IRQ 10.
0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
eth1: Setting full-duplex based on MII#1 link partner capability of 4061.
0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)
--- i=1300/100, udelay(100)
Linux Tulip driver version 1.1.13 (May 11, 2002)
PCI: Found IRQ 10 for device 0000:00:0e.0
PCI: Sharing IRQ 10 with 0000:00:0a.0
tulip0:  MII transceiver #1 config 1000 status 786d advertising 05e1.
eth1: ADMtek Comet rev 17 at 0001a400, 00:0C:F6:03:DA:D3, IRQ 10.
0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
eth1: Setting full-duplex based on MII#1 link partner capability of 4061.
0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)
--- i=4000/100, udelay(100)
Linux Tulip driver version 1.1.13 (May 11, 2002)
PCI: Found IRQ 10 for device 0000:00:0e.0
PCI: Sharing IRQ 10 with 0000:00:0a.0
tulip0:  MII transceiver #1 config 1000 status 786d advertising 05e1.
eth1: ADMtek Comet rev 17 at 0001a400, 00:0C:F6:03:DA:D3, IRQ 10.
0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc664010 CSR6 0xff972113)
eth1: Setting full-duplex based on MII#1 link partner capability of 4061.
0000:00:0e.0: tulip_stop_rxtx() failed (CSR5 0xfc06c012 CSR6 0xff970111)

There is no modification in the values of CSR5 and CSR6. I suppose this 
implies a Chip defect which is quite plausible since I have a cheap Sitecom 
card which is perhaps not 100% compatible with the tulip-driver ? 

> o Chip defect. When DMA is stopped, CSR5 Transmit State and Receive
>   State machines are expected to be zero. It's possible this chip
>   just never sets those states. I suppose we could check CSR6 bits
>   to confirm the ST and SR bits are clear before printing the message.
>   The CSR6 value above will tell me if that's feasible.

Greetings, Klaus.


