* ISSUE: DFE530-TX REV-A3-1 times out on transmit
From: David Schmitt @ 2001-08-24 14:24 UTC
To: linux-kernel

[1.] One line summary of the problem:

DFE530-TX REV-A3-1 times out on transmit

[2.] Full description of the problem/report:

After receiving ~50 MB of network traffic the DFE530-TX (REV-A3-1)
starts emitting

Aug 24 11:13:57 cheesy kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 24 11:13:57 cheesy kernel: eth0: Transmit timed out, status 0000, PHY status 782d, resetting...
Aug 24 11:13:57 cheesy kernel: eth0: reset finished after 5 microseconds.

After some more traffic the resets stop working and the card cannot
transmit or receive anymore.

Aug 24 11:15:07 cheesy kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 24 11:15:07 cheesy kernel: eth0: Transmit timed out, status 0000, PHY status 782d, resetting...
Aug 24 11:15:07 cheesy kernel: eth0: reset did not complete in 10 ms.
Aug 24 11:15:07 cheesy kernel: eth0: reset finished after 10005 microseconds.
Aug 24 11:15:07 cheesy kernel: eth0: Transmit frame #1 queued in slot 0.
Aug 24 11:15:07 cheesy kernel: eth0: Transmit frame #2 queued in slot 1.
Aug 24 11:15:07 cheesy kernel: eth0: Transmit frame #3 queued in slot 2.
Aug 24 11:15:07 cheesy kernel: eth0: Transmit frame #4 queued in slot 3.
Aug 24 11:15:07 cheesy kernel: eth0: Transmit frame #5 queued in slot 4.
Aug 24 11:15:07 cheesy kernel: eth0: Transmit frame #6 queued in slot 5.
Aug 24 11:15:07 cheesy kernel: eth0: Transmit frame #7 queued in slot 6.
Aug 24 11:15:07 cheesy kernel: eth0: Transmit frame #8 queued in slot 7.
Aug 24 11:15:07 cheesy kernel: eth0: Transmit frame #9 queued in slot 8.
Aug 24 11:15:07 cheesy kernel: eth0: Transmit frame #10 queued in slot 9.
Aug 24 11:15:09 cheesy kernel: eth0: VIA Rhine monitor tick, status 0000.
Aug 24 11:15:11 cheesy kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 24 11:15:11 cheesy kernel: eth0: Transmit timed out, status 0000, PHY status 782d, resetting...
Aug 24 11:15:11 cheesy kernel: eth0: reset did not complete in 10 ms.
Aug 24 11:15:11 cheesy kernel: eth0: reset finished after 10005 microseconds.

Reloading the module doesn't help either. Only a reboot
reenables network connectivity.

[3.] Keywords (i.e., modules, networking, kernel):

d-link, dfe530-tx rev-a3-1, networking, transmit
NETDEV WATCHDOG: transmit timed out
nic network card pci

[4.] Kernel version (from /proc/version):

Linux version 2.4.9-cheesy-1 (david@cheesy) (gcc version 2.95.4 20010319 (Debian prerelease)) #1 Wed Aug 22 17:21:16 CEST 2001

[6.] A small shell script or example program which triggers the
     problem (if possible)

Downloading large amounts of data (>50MB) will eventually trigger
the problem. Transmitting data at less than full speed will
not trigger it (or at least I haven't waited long enough?)

[7.] Environment
[7.1.] Software (add the output of the ver_linux script here)

Linux cheesy 2.4.9-cheesy-1 #1 Wed Aug 22 17:21:16 CEST 2001 i686 unknown

Kernel modules         2.4.6
Gnu C                  2.95.4
Binutils               2.11.90.0.27
Linux C Library        2.2.3
Dynamic Linker (ld.so) 2.2.3
Procps                 2.0.7
Mount                  2.11h
Net-tools              1.60
Kbd                    0.2.3
Sh-utils               2.0.11

[7.2.] Processor information (from /proc/cpuinfo):

cheesy:~# cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 4
model name      : AMD Athlon(tm) Processor
stepping        : 2
cpu MHz         : 1199.699
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow
bogomips        : 2392.06
cheesy:~#

[7.3.] Module information (from /proc/modules):

cheesy:~# cat /proc/modules
serial                 42816   0 (autoclean)
via-rhine              10704   1
unix                   15008   4 (autoclean)
ide-disk                6912   4 (autoclean)
ide-probe-mod           8592   0 (autoclean)
ide-mod                68432   4 (autoclean) [ide-disk ide-probe-mod]
ext2                   33424   2 (autoclean)
cheesy:~#

[7.4.] SCSI information (from /proc/scsi/scsi):

No SCSI.

[7.5.] Other information that might be relevant to the problem
       (please look in /proc and include all information that you
       think to be relevant):

cheesy:/proc# cat /proc/iomem
00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000f0000-000fffff : System ROM
00100000-1ffebfff : System RAM
  00100000-001b6e77 : Kernel code
  001b6e78-001f3a7f : Kernel data
1ffec000-1ffeefff : ACPI Tables
1ffef000-1fffefff : reserved
1ffff000-1fffffff : ACPI Non-volatile Storage
dd000000-dd0000ff : VIA Technologies, Inc. Ethernet Controller
  dd000000-dd0000ff : via-rhine
dd800000-dfdfffff : PCI Bus #01
  dd800000-dd800fff : ATI Technologies Inc Rage XL AGP
  de000000-deffffff : ATI Technologies Inc Rage XL AGP
dff00000-dfffffff : PCI Bus #01
e0000000-e7ffffff : VIA Technologies, Inc. VT8363/8365 [KT133/KM133]
ffff0000-ffffffff : reserved

cheesy:/proc# cat /proc/ioports
0000-001f : dma1
0020-003f : pic1
0040-005f : timer
0060-006f : keyboard
0080-008f : dma page reg
00a0-00bf : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
02f8-02ff : serial(set)
0376-0376 : ide1
03c0-03df : vga+
03f6-03f6 : ide0
03f8-03ff : serial(set)
0cf8-0cff : PCI conf1
9400-94ff : VIA Technologies, Inc. Ethernet Controller
  9400-94ff : via-rhine
a000-a003 : VIA Technologies, Inc. AC97 Audio Controller
a400-a403 : VIA Technologies, Inc. AC97 Audio Controller
a800-a8ff : VIA Technologies, Inc. AC97 Audio Controller
b000-b01f : VIA Technologies, Inc. UHCI USB (#2)
b400-b41f : VIA Technologies, Inc. UHCI USB
b800-b80f : VIA Technologies, Inc. Bus Master IDE
  b800-b807 : ide0
  b808-b80f : ide1
d000-dfff : PCI Bus #01
  d800-d8ff : ATI Technologies Inc Rage XL AGP
e200-e27f : VIA Technologies, Inc. VT82C686 [Apollo Super ACPI]
e800-e80f : VIA Technologies, Inc. VT82C686 [Apollo Super ACPI]

[X.] Other notes, patches, fixes, workarounds

Further information from lspci, via-diag and ifconfig output as well
as the complete kernel syslog from boot to network-lock can be
found on http://www.heureka.co.at/~david/dfe530tx/

Short lspci output:

00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
00:04.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
00:04.2 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16)
00:04.3 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16)
00:04.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
00:04.5 Multimedia audio controller: VIA Technologies, Inc. AC97 Audio Controller (rev 50)
00:0b.0 Ethernet controller: VIA Technologies, Inc. Ethernet Controller (rev 43)
01:00.0 VGA compatible controller: ATI Technologies Inc Rage XL AGP (rev 27)

Thank you for your time and work!

Regards, David Schmitt
-- 
Report, Hardware and Bandwidth provided by Heureka GesmbH, Austria.
* Re: ISSUE: DFE530-TX REV-A3-1 times out on transmit
From: Urban Widmark @ 2001-08-25 17:05 UTC
To: David Schmitt; +Cc: linux-kernel

On Fri, 24 Aug 2001, David Schmitt wrote:

> Aug 24 11:15:07 cheesy kernel: NETDEV WATCHDOG: eth0: transmit timed out
> Aug 24 11:15:07 cheesy kernel: eth0: Transmit timed out, status 0000, PHY status 782d, resetting...
> Aug 24 11:15:07 cheesy kernel: eth0: reset did not complete in 10 ms.
> Aug 24 11:15:07 cheesy kernel: eth0: reset finished after 10005 microseconds.
> Aug 24 11:15:07 cheesy kernel: eth0: Transmit frame #1 queued in slot 0.
[snip]
> Aug 24 11:15:07 cheesy kernel: eth0: Transmit frame #10 queued in slot 9.
> Aug 24 11:15:09 cheesy kernel: eth0: VIA Rhine monitor tick, status 0000.
> Aug 24 11:15:11 cheesy kernel: NETDEV WATCHDOG: eth0: transmit timed out
> Aug 24 11:15:11 cheesy kernel: eth0: Transmit timed out, status 0000, PHY status 782d, resetting...
> Aug 24 11:15:11 cheesy kernel: eth0: reset did not complete in 10 ms.
> Aug 24 11:15:11 cheesy kernel: eth0: reset finished after 10005 microseconds.
>
> Reloading the module doesn't help either. Only a reboot
> reenables network connectivity.

There is a patch in the 2.4.8-acX kernels that fixes a problem with
resetting the card when it is first used. I can't say that I know that
it fixes anything you are seeing, but it could be worth trying.

Did this start with recent versions, or have you never run older kernels
on this hw?

Reloading the module is to the hardware about the same as the watchdog
reset.

Rebooting obviously triggers something else too ... perhaps the BIOS
talks some sense to the card.

> [6.] A small shell script or example program which triggers the
>      problem (if possible)
>
> Downloading large amounts of data (>50MB) will eventually trigger
> the problem. Transmitting data at less than full speed will
> not trigger it (or at least I haven't waited long enough?)

What do you use to download? From a server on the LAN or something
remote? And how do you slow down the speed of your transmission? How
fast is it when it is fast, and how much do you slow it down?

My other machine does not have anything useful installed, but it did
have chargen and discard open.

nc other.machine chargen > /dev/null
iptraf says about 64Mbps

nc other.machine discard < /dev/zero
iptraf says about 44Mbps

Sending about 1.5G in both directions, without problems. I used to have
a netperf setup and that would (more or less) fill the 100Mbps.

> [X.] Other notes, patches, fixes, workarounds
>
> Further information from lspci, via-diag and ifconfig output as well
> as the complete kernel syslog from boot to network-lock can be
> found on http://www.heureka.co.at/~david/dfe530tx/

The syslog gives a few hints that something is wrong ...

eth0: Transmit error, Tx status 00008100.
    8 - transmit error
    1 - transmit aborted after excessive collisions

but at the same time the 00 part means that the "collision retry count"
is 0 and that it hasn't set a flag that it "experienced collisions in
this transmit event".

I think there were 3 of these, and from all but the last it recovers by
itself. Perhaps the collisions (or whatever it is that the card sees as
collisions) continued for a longer period.

It ends up in "eth0: transmit timed out" and the driver tries to reset
the card. That does not appear to work at all.

It's a nice report, I wish I had something more useful to reply with.

The driver source has links to some datasheets. They might be useful in
improving the reset code.

(Hmm, the tx_timeout code does: reset -> initialise ring -> wait for hw
but initialise ring talks to the hw, perhaps it should wait for hw first
...)

/Urban
* Re: ISSUE: DFE530-TX REV-A3-1 times out on transmit
From: David Schmitt @ 2001-08-27 8:27 UTC
To: linux-kernel

On Sat, Aug 25, 2001 at 07:05:26PM +0200, Urban Widmark wrote:
> On Fri, 24 Aug 2001, David Schmitt wrote:
>
> > Aug 24 11:15:07 cheesy kernel: NETDEV WATCHDOG: eth0: transmit timed out
> > Aug 24 11:15:07 cheesy kernel: eth0: Transmit timed out, status 0000, PHY status 782d, resetting...
> > Aug 24 11:15:07 cheesy kernel: eth0: reset did not complete in 10 ms.
> > Aug 24 11:15:07 cheesy kernel: eth0: reset finished after 10005 microseconds.
> > Aug 24 11:15:07 cheesy kernel: eth0: Transmit frame #1 queued in slot 0.
> [snip]
> > Aug 24 11:15:07 cheesy kernel: eth0: Transmit frame #10 queued in slot 9.
> > Aug 24 11:15:09 cheesy kernel: eth0: VIA Rhine monitor tick, status 0000.
> > Aug 24 11:15:11 cheesy kernel: NETDEV WATCHDOG: eth0: transmit timed out
> > Aug 24 11:15:11 cheesy kernel: eth0: Transmit timed out, status 0000, PHY status 782d, resetting...
> > Aug 24 11:15:11 cheesy kernel: eth0: reset did not complete in 10 ms.
> > Aug 24 11:15:11 cheesy kernel: eth0: reset finished after 10005 microseconds.
> >
> > Reloading the module doesn't help either. Only a reboot
> > reenables network connectivity.
>
> There is a patch in the 2.4.8-acX kernels that fixes a problem with
> resetting the card when it is first used. I can't say that I know that it
> fixes anything you are seeing, but it could be worth trying.

Ok, I will try that too and report back.

> Did this start with recent versions, or have you never run older kernels
> on this hw?

I tried it now with 2.2.19 and killed it too. See below for details.

> Reloading the module is to the hardware about the same as the watchdog
> reset.

Good news: Under 2.2.19, reloading the module indeed reset the card,
so that it worked again. I will upload debug output from 2.2.19 too
(http://www.heureka.co.at/~david/dfe530tx/)

> Rebooting obviously triggers something else too ... perhaps the BIOS talks
> some sense to the card.

As mentioned above, it seems like the 2.2.19 version does the Right
Thing (but doesn't recover automatically).

> > [6.] A small shell script or example program which triggers the
> >      problem (if possible)
> >
> > Downloading large amounts of data (>50MB) will eventually trigger
> > the problem. Transmitting data at less than full speed will
> > not trigger it (or at least I haven't waited long enough?)
>
> What do you use to download? From a server on the LAN or something remote?
> And how do you slow down the speed of your transmission? How fast is it
> when it is fast, and how much do you slow it down?

Ok, I could reproduce it somewhat more systematically:

# ssh other.machine cat /dev/zero

generates about 2Mbit incoming traffic. This doesn't trigger the
problem, but doing one or two parallel ping -f other.machine locks the
NIC for good.

> > [X.] Other notes, patches, fixes, workarounds
> >
> > Further information from lspci, via-diag and ifconfig output as well
> > as the complete kernel syslog from boot to network-lock can be
> > found on http://www.heureka.co.at/~david/dfe530tx/
>
> The syslog gives a few hints that something is wrong ...
>
> eth0: Transmit error, Tx status 00008100.
>     8 - transmit error
>     1 - transmit aborted after excessive collisions
>
> but at the same time the 00 part means that the "collision retry count" is
> 0 and that it hasn't set a flag that it "experienced collisions in this
> transmit event".

The network where the DFE530TX (and the other.machine) are attached
contains some 20-30 Windows PCs and some Novell servers which all seem
quite broadcast-happy. The network itself is (mostly) unswitched and
10Mbit half-duplex, so I guess this really is connected to the
collisions.

> I think there were 3 of these, and from all but the last it recovers by
> itself. Perhaps the collisions (or whatever it is that the card sees as
> collisions) continued for a longer period.

> It ends up in "eth0: transmit timed out" and the driver tries to reset the
> card. That does not appear to work at all.

Under 2.2.19 the card doesn't recover automatically (lacking the
watchdog) but manually reloading the module works.

> It's a nice report, I wish I had something more useful to reply with.

Well, you pointed me in the right direction :-)

> The driver source has links to some datasheets. They might be useful in
> improving the reset code.

> (Hmm, the tx_timeout code does: reset -> initialise ring -> wait for hw
> but initialise ring talks to the hw, perhaps it should wait for hw first
> ...)

I'm not really into hacking C, but I will try to provide as much info
as possible.

Regards, David
-- 
Sponsored by heureKA, Austria (http://www.heureka.co.at)
* Re: ISSUE: DFE530-TX REV-A3-1 times out on transmit
From: David Schmitt @ 2001-08-27 9:13 UTC
To: linux-kernel

Hi!

Sorry for replying to my own message.

On Mon, Aug 27, 2001 at 10:27:40AM +0200, David Schmitt wrote:
> On Sat, Aug 25, 2001 at 07:05:26PM +0200, Urban Widmark wrote:
> > On Fri, 24 Aug 2001, David Schmitt wrote:
> > > Reloading the module doesn't help either. Only a reboot
> > > reenables network connectivity.
> >
> > There is a patch in the 2.4.8-acX kernels that fixes a problem with
> > resetting the card when it is first used. I can't say that I know that it
> > fixes anything you are seeing, but it could be worth trying.
>
> Ok, I will try that too and report back.

Nope. Using the patched via-rhine.c from 2.4.8-ac12 didn't help.

Regards, David Schmitt
-- 
Sponsored by heureKA, Austria (http://www.heureka.co.at)
* Re: ISSUE: DFE530-TX REV-A3-1 times out on transmit
From: Urban Widmark @ 2001-08-27 19:02 UTC
To: David Schmitt; +Cc: linux-kernel

On Mon, 27 Aug 2001, David Schmitt wrote:

> > Reloading the module is to the hardware about the same as the watchdog
> > reset.
>
> Good news: Under 2.2.19, reloading the module indeed reset the card,
> so that it worked again.

Interesting ...

> > Rebooting obviously triggers something else too ... perhaps the BIOS talks
> > some sense to the card.
>
> As mentioned above, it seems like the 2.2.19 version does the Right
> Thing (but doesn't recover automatically).

The 2.2.19 version doesn't do anything on timeout (except print a
message that it is resetting, which it isn't :).

The driver has had a few changes during the 2.4 series:

2.4.3      was patched to actually reset things on tx_timeout, but that
           also changed the startup sequence.
2.4.6      got changes to reload certain things from eeprom when the
           driver is loaded (fix a problem with booting from win98 that
           does a power down).
2.4.7      changes to the transmit code to use "singlecopy" for
           unaligned buffers.
2.4.8-acX  fixed a bug in the startup code from 2.4.3

Testing the 2.4.2 and 2.4.3 drivers could give something (it should work
to simply copy the drivers/net/via-rhine.c from the different versions
into a 2.4.9 source tree).

> but doing one or two parallel ping -f other.machine locks the NIC for
> good.

Good (that you have a reliable way to trigger this). For about how long
do you need to run this?

> The network where the DFE530TX (and the other.machine) are attached
> contains some 20-30 Windows PCs and some Novell servers which all seem
> quite broadcast-happy. The network itself is (mostly) unswitched and
> 10Mbit half-duplex, so I guess this really is connected to the
> collisions.

Depending on the sort of access you have, you could test unplugging
everyone else and repeat the 'ping -f' test.

I don't have the hardware to test now, but when time permits ...

/Urban
* Re: ISSUE: DFE530-TX REV-A3-1 times out on transmit
From: David Schmitt @ 2001-08-28 13:45 UTC
To: linux-kernel

Re!

After fooling around a bit more I can now distinguish three different
states (with 2.4.9):

1) Working (very easy):

Aug 28 13:45:01 cheesy kernel: In via_rhine_rx(), entry XX status 00668f00.

2) Recoverable troubles (nasty but bearable):

Aug 28 13:45:04 cheesy kernel: In via_rhine_rx(), entry XX status 005e9700.
Aug 28 13:45:05 cheesy kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 28 13:45:05 cheesy kernel: eth0: reset finished after 5 microseconds.
Aug 28 13:45:05 cheesy kernel: In via_rhine_rx(), entry XX status 00668f00.

Aug 28 13:45:32 cheesy kernel: In via_rhine_rx(), entry XX status 015a9700.
Aug 28 13:45:33 cheesy kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 28 13:45:33 cheesy kernel: eth0: reset finished after 105 microseconds.
Aug 28 13:45:33 cheesy kernel: In via_rhine_rx(), entry XX status 00668f00.

Note 1: '005e9700' doesn't always cause a timeout.
Note 2: the delay in the second paragraph.

and 3) Unrecoverable (really bad):

Aug 28 13:45:51 cheesy kernel: In via_rhine_rx(), entry XX status 00668f00.
Aug 28 13:45:52 cheesy kernel: In via_rhine_rx(), entry XX status 00668f00.
Aug 28 13:45:52 cheesy kernel: In via_rhine_rx(), entry XX status 00729700.
Aug 28 13:45:54 cheesy kernel: In via_rhine_rx(), entry XX status 00669700.
Aug 28 13:45:54 cheesy kernel: In via_rhine_rx(), entry XX status 00ee9700.
Aug 28 13:45:54 cheesy kernel: In via_rhine_rx(), entry XX status 00f79700.
Aug 28 13:45:54 cheesy kernel: In via_rhine_rx(), entry XX status 00669700.
Aug 28 13:45:55 cheesy kernel: In via_rhine_rx(), entry XX status 005e9700.
Aug 28 13:45:55 cheesy kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 28 13:45:55 cheesy kernel: eth0: reset finished after 10005 microseconds.
Aug 28 13:45:59 cheesy kernel: NETDEV WATCHDOG: eth0: transmit timed out
Aug 28 13:45:59 cheesy kernel: eth0: reset finished after 10005 microseconds.

Then I noticed the following: upon unload the driver emits some kind of
exit status:

Correct shutdown:
Aug 28 14:53:34 cheesy kernel: eth0: Shutting down ethercard, status was 085a.

After first resets:
Aug 28 14:54:11 cheesy kernel: eth0: Shutting down ethercard, status was 081a.

After total NIC lockup:
Aug 28 14:56:24 cheesy kernel: eth0: Shutting down ethercard, status was 883a.

Wondering if via-diag shows some differences I got this:

root@cheesy # via-diag -mm -aa -ee
via-diag.c:v2.06 5/22/2001 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a VIA VT3065 Rhine-II adapter at 0x9400.
 Station address 00:05:5d:09:90:1f.
 Tx disabled, Rx enabled, half-duplex (0x800c).
  Receive mode is 0x6c: Normal unicast and hashed multicast.
  Transmit mode is 0x22: Transmitter set to INTERNAL LOOPBACK!.
[..]

This seems to be the problem. Reloading the module does not help
anymore. I'd guess forcing the transmit mode in the reset to something
sane would help??

Any ideas what can be done for further debugging this problem?

On Mon, Aug 27, 2001 at 09:02:29PM +0200, Urban Widmark wrote:
[kernels]

I tried it with these kernels:

2.2.19
    Resets correctly with rm-/insmod

2.2.19 with Dennis' patch
    Resets often, but no lock up.

2.4.9 with via-rhine.c from 2.4.2
    Resets correctly with rm-/insmod

2.4.9 with via-rhine.c from 2.4.3
    First resets correctly. After third or fourth time locks up, with
    transmit mode 0x21 (which via-diag says is the same as 0x20)

2.4.9
    Resets a few times correctly, then see above.

> > but doing one or two parallel ping -f other.machine locks the NIC for
> > good.
>
> Good (that you have a reliable way to trigger this). For about how long do
> you need to run this?

10-20 seconds.

> > The network where the DFE530TX (and the other.machine) are attached
> > contains some 20-30 Windows PCs and some Novell servers which all seem
> > quite broadcast-happy. The network itself is (mostly) unswitched and
> > 10Mbit half-duplex, so I guess this really is connected to the
> > collisions.
>
> Depending on the sort of access you have, you could test unplugging
> everyone else and repeat the 'ping -f' test.

I will try to take the card home; there I have some more possibilities
to test.

Regards, David
-- 
Signatures are like women. One rarely finds a sensible one
        -- seen in at.linux
* Re: ISSUE: DFE530-TX REV-A3-1 times out on transmit
From: Urban Widmark @ 2001-08-28 19:46 UTC
To: David Schmitt; +Cc: linux-kernel, Dennis Bjorklund

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1901 bytes --]

On Tue, 28 Aug 2001, David Schmitt wrote:

> Note 1: '005e9700' doesn't always cause a timeout.

That status is from the Rx, not Tx. I think they are all ok.

005e9700 - length=0x5e, RxOK, accept broadcast, single buffer, end buffer
00668f00 - length=0x66, RxOk, chain buffer, single buffer, end buffer

> Correct shutdown:
> Aug 28 14:53:34 cheesy kernel: eth0: Shutting down ethercard, status was 085a.
>
> After first resets:
> Aug 28 14:54:11 cheesy kernel: eth0: Shutting down ethercard, status was 081a.
>
> After total NIC lockup:
> Aug 28 14:56:24 cheesy kernel: eth0: Shutting down ethercard, status was 883a.

8000 means that the chip is still doing a software reset. I think the
difference between 085a/081a is simply that you caught it in different
states.

>  Tx disabled, Rx enabled, half-duplex (0x800c).
>   Receive mode is 0x6c: Normal unicast and hashed multicast.
>   Transmit mode is 0x22: Transmitter set to INTERNAL LOOPBACK!.

Is this after unloading the module? The via_rhine_close sets the
transmitter to loopback mode (comment says to avoid hardware races ...).
The reset code does not.

That should not be a problem, when loading the module (and on reset) it
changes this to normal mode.

> 2.2.19 with Dennis' patch
>
>     Resets often, but no lock up.

That is interesting. This code should be almost identical to 2.4.x (or
not, Dennis?). The way the timeout code is run may be different of
course, but the driver part is the same.

I'm ignoring that for now (if you don't mind) and have made a patch with
some possible improvements. Someone found a modified driver on some
dlink server that contains (claimed) workarounds for various chip
peculiarities (bugs).

I also added a "force software reset" that is described in the
datasheet. Not sure what the difference is, but it can't hurt trying
that if the normal reset fails.

Perhaps this helps, probably not.

/Urban

[-- Attachment #2: Type: TEXT/PLAIN, Size: 2699 bytes --]

--- linux-2.4.9-orig/drivers/net/via-rhine.c	Sun Aug 19 12:08:22 2001
+++ linux-2.4.9-00/drivers/net/via-rhine.c	Tue Aug 28 21:37:18 2001
@@ -497,6 +497,14 @@
 	if (debug > 1)
 		printk(KERN_INFO "%s: reset finished after %d microseconds.\n",
 			   name, 5*i);
+
+	if (chip_id == VT6102 && readw(ioaddr + ChipCmd) & CmdReset) {
+		/* Try to force software reset (we are dead anyway ...) */
+		writeb(0x40, ioaddr + 0x81);
+		for (i=0; i<2000 && (readw(ioaddr + ChipCmd) & CmdReset); i++)
+			udelay(5);
+	}
+
 }
 
 static int __devinit via_rhine_init_one (struct pci_dev *pdev,
@@ -1078,8 +1086,50 @@
 
 	spin_lock(&np->lock);
 
+	/* Disable interrupts by clearing the interrupt mask. */
+	writew(0x0000, ioaddr + IntrEnable);
+
+	/* shutdown code from the driver supposedly modified by D-Link. */
+	if (np->drv_flags & HasWOL) {
+		int ww;
+
+		/* FIXME: 0x01 isn't loopback according to the docs, it is reserved! */
+		/* Nic Loop Back On */
+		writeb(readb(ioaddr + TxConfig) | 0x01, ioaddr + TxConfig);
+
+		/* Tx Off */
+		writeb(readb(ioaddr + ChipCmd) ^ 0x10, ioaddr + ChipCmd);
+		for (ww = 0; ww < W_MAX_TIMEOUT; ww++) {
+			if ((readb(ioaddr + ChipCmd) & 0x10) == 0)
+				break;
+		}
+
+		/* Rx Off */
+		writeb(readb(ioaddr + ChipCmd) ^ 0x08, ioaddr + ChipCmd);
+		for (ww = 0; ww < W_MAX_TIMEOUT; ww++) {
+			if ((readb(ioaddr + ChipCmd) & 0x08) == 0)
+				break;
+		}
+
+		if (ww == W_MAX_TIMEOUT) {
+			/* Turn on fifo test */
+			writew(readw(ioaddr + GFIFOTest) | 0x0001, ioaddr + GFIFOTest);
+			/* Turn on fifo reject */
+			writew(readw(ioaddr + GFIFOTest) | 0x0800, ioaddr + GFIFOTest);
+			/* Turn off fifo test */
+			writew(readw(ioaddr + GFIFOTest) & 0xFFFE, ioaddr + GFIFOTest);
+		}
+
+		/* Nic Loop Back Off */
+		writeb(readb(ioaddr + TxConfig) & 0xFE, ioaddr + TxConfig);
+	}
+
+	/* Stop the chip's Tx and Rx processes. */
+	writew(CmdStop, ioaddr + ChipCmd);
+
 	/* Reset the chip. */
 	writew(CmdReset, ioaddr + ChipCmd);
+	wait_for_reset(dev, dev->name);
 
 	/* clear all descriptors */
 	free_tbufs(dev);
@@ -1088,7 +1138,6 @@
 	alloc_rbufs(dev);
 
 	/* Reinitialize the hardware. */
-	wait_for_reset(dev, dev->name);
 	init_registers(dev);
 
 	spin_unlock(&np->lock);
@@ -1554,7 +1603,8 @@
 			   dev->name, readw(ioaddr + ChipCmd));
 
 	/* Switch to loopback mode to avoid hardware races. */
-	writeb(np->tx_thresh | 0x02, ioaddr + TxConfig);
+	/* FIXME: docs say 0x01 is reserved! Becker version set this and not 0x02 */
+	writeb(np->tx_thresh | 0x01, ioaddr + TxConfig);
 
 	/* Disable interrupts by clearing the interrupt mask. */
 	writew(0x0000, ioaddr + IntrEnable);
* Re: ISSUE: DFE530-TX REV-A3-1 times out on transmit
From: Dennis Bjorklund @ 2001-08-29 6:42 UTC
To: Urban Widmark; +Cc: David Schmitt, linux-kernel

On Tue, 28 Aug 2001, Urban Widmark wrote:

> > 2.2.19 with Dennis' patch
> >
> >     Resets often, but no lock up.
>
> That is interesting. This code should be almost identical to 2.4.x (or
> not, Dennis?). The way the timeout code is run may be different of course,
> but the driver part is the same.

Well. I took (part of) the init code from 2.2.19 and used that for both
init and reset so there might be differences from 2.4.x. But the only
difference I remember that was really different, were some extra
spinlocks.

-- 
/Dennis
* Re: ISSUE: DFE530-TX REV-A3-1 times out on transmit
From: David Schmitt @ 2001-08-29 12:48 UTC
Cc: linux-kernel

On Tue, Aug 28, 2001 at 09:46:18PM +0200, Urban Widmark wrote:
> I'm ignoring that for now (if you don't mind) and have made a patch with
> some possible improvements. Someone found a modified driver on some dlink
> server that contains (claimed) workarounds for various chip peculiarities
> (bugs).
>
> I also added a "force software reset" that is described in the datasheet.
> Not sure what the difference is, but it can't hurt trying that if the
> normal reset fails.
>
> Perhaps this helps, probably not.

Under 'normal loads' (i.e. one tcp d/l at max, little other traffic) the
situation didn't get better, it hangs as often as with the original
via-rhine, at least it feels so. No hard figures here. But even
writing this mail (via ssh) here parallel to a download over the lan
(from the same server) triggers resets.

Under heavy loads (i.e. with multiple flood pings) it resets often but I
couldn't push it over the edge anymore. I have it running now for
several minutes under multiple pingfloods and it always recovered
(from quite a number of resets).

At least it recovers now.

Thank you for your time and work!

Regards, David Schmitt
-- 
Signatures are like women. One rarely finds a sensible one
        -- seen in at.linux
* Re: ISSUE: DFE530-TX REV-A3-1 times out on transmit
From: Urban Widmark @ 2001-08-29 18:45 UTC
To: David Schmitt; +Cc: linux-kernel

On Wed, 29 Aug 2001, David Schmitt wrote:

> Under 'normal loads' (i.e. one tcp d/l at max, little other traffic) the
> situation didn't get better, it hangs as often as with the original
> via-rhine, at least it feels so. No hard figures here. But even
> writing this mail (via ssh) here parallel to a download over the lan
> (from the same server) triggers resets.

That is still pretty awful ... but it doesn't stop working?
(you say hangs, but then resets)

> Under heavy loads (i.e. with multiple flood pings) it resets often but I
> couldn't push it over the edge anymore. I have it running now for
> several minutes under multiple pingfloods and it always recovered
> (from quite a number of resets).

Ok, that means the "D-Link magic" does improve reset.

It may be interesting to find out which parts help. I simply added
things that looked good ... Lacking information on what the bit-flipping
is supposed to do, one way to try and do that is to remove code and see
how much can be removed without breaking anything.
(Sounds like a children's game, except for programmers ...)

I'll still try generating collisions and see what happens. If I can't
reproduce this perhaps you would test a different patch to see which
change made a difference?

/Urban
* Re: ISSUE: DFE530-TX REV-A3-1 times out on transmit
  2001-08-29 18:45 ` Urban Widmark
@ 2001-08-31 12:18 ` David Schmitt
  0 siblings, 0 replies; 12+ messages in thread
From: David Schmitt @ 2001-08-31 12:18 UTC
To: linux-kernel

On Wed, Aug 29, 2001 at 08:45:34PM +0200, Urban Widmark wrote:
> On Wed, 29 Aug 2001, David Schmitt wrote:
> > under 'normal loads' (ie one tcp d/l at max, few other traffic) the
> > situation didn't get better, it hangs as often as with the original
> > via-rhine, at least it feels so. No hard figures here. But even
> > writing this mail (via ssh) here parallel to a download over the lan
> > (from the same server) triggers resets.
>
> That is still pretty awful ... but it doesn't stop working?
> (you say hangs, but then resets)

sorry, sloppy language. hang == reset (in this case)

> > under heavy loads (ie with multiple flood pings) it resets often but I
> > couldn't push it over the edge anymore. I have it running now for
> > several minutes under multiple pingfloods and it always recovered
> > (from quite an amount of resets).
>
> Ok, that means the "D-Link magic" does improve reset.

Yes. Until your patch, 2.4.9 reset three or four times successfully and
then the resets stopped working. With your patch it resets as often but
doesn't fail resetting anymore.

> It may be interesting to find out which parts help. I simply added
> things that looked good ... Lacking information on what the bit-flipping
> is supposed to do, one way to try and do that is to remove code and see
> how much can be removed without breaking anything.
> (Sounds like a children's game, except for programmers ...)

Hehe, bruteforcing it :-))

> I'll still try generating collisions and see what happens. If I can't
> reproduce this perhaps you would test a different patch to see which
> change made a difference?

Sure, the machine is fast enough to handle another kernel recompile or
two :^))

Regards, David
-- 
Signatures are like women. One rarely finds a sensible one.
  -- seen in at.linux
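
The two reset messages discussed in this thread ("reset finished after 5 microseconds" and "reset did not complete in 10 ms" followed by "reset finished after 10005 microseconds") are consistent with a bounded polling loop like the wait_for_reset() helper in the patch attached to this thread. The following is only a user-space sketch of that loop shape, not driver code: fake_chip, fake_read_cmd() and poll_reset() are hypothetical names, the register read is simulated, and the CMD_RESET value is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CMD_RESET 0x8000u	/* assumed bit value, for illustration only */

/* Simulated chip: the reset bit stays set for `polls_until_ready` reads. */
struct fake_chip {
	unsigned polls_until_ready;
};

/* Stand-in for reading the command register of the (simulated) chip. */
static uint16_t fake_read_cmd(struct fake_chip *chip)
{
	if (chip->polls_until_ready > 0) {
		chip->polls_until_ready--;
		return CMD_RESET;
	}
	return 0;
}

/*
 * Poll until the reset bit clears or the poll budget is exceeded.
 * Returns the simulated elapsed time in microseconds and sets
 * *timed_out when the budget ran out before the bit cleared.
 */
static unsigned poll_reset(struct fake_chip *chip, unsigned step_us,
			   unsigned max_polls, int *timed_out)
{
	unsigned i = 0;

	assert(step_us > 0);
	*timed_out = 0;
	do {
		/* a driver would delay for step_us microseconds here */
		i++;
		if (i > max_polls) {
			*timed_out = 1;
			break;
		}
	} while (fake_read_cmd(chip) & CMD_RESET);

	return step_us * i;
}
```

With a 5 microsecond step and a budget of 2000 polls, a chip that is ready on the first read yields 5, and a chip that never clears the bit yields 10005 with the timeout flag set. Under these assumptions, the "10005 microseconds" line in the logs is simply the loop counter after giving up, not a measurement of a reset that completed.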
* Re: ISSUE: DFE530-TX REV-A3-1 times out on transmit
  2001-08-27  8:27 ` David Schmitt
  2001-08-27  9:13 ` David Schmitt
  2001-08-27 19:02 ` Urban Widmark
@ 2001-08-28  7:26 ` Dennis Bjorklund
  2 siblings, 0 replies; 12+ messages in thread
From: Dennis Bjorklund @ 2001-08-28 7:26 UTC
To: David Schmitt; +Cc: linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 522 bytes --]

On Mon, 27 Aug 2001, David Schmitt wrote:

> As mentioned above, it seems like the 2.2.19 version does the Right
> Thing (but doesn't recover automatically).

I backported the recovery stuff from 2.4.x to 2.2.20-pre9 and it works
nicely for me. I've been running it for a couple of weeks without
problems, and where it locked up before it now resets (a clear
improvement for me).

I sent it to Alan in the hope that it could make it into 2.2.20, but I
got no reply. I don't know if I should continue sending it to him or what.

-- 
/Dennis

[-- Attachment #2: Type: TEXT/PLAIN, Size: 12736 bytes --]

--- linux/drivers/net/via-rhine.c	Sat Aug 18 14:10:07 2001
+++ linux-2.2.19/drivers/net/via-rhine.c	Sat Aug 18 14:23:38 2001
@@ -25,11 +25,15 @@
 
 	LK1.0.0:
 	- Urban Widmark: merges from Beckers 1.08b version and 2.4.0 (VT6102)
+
+	LK1.0.1
+	- Dennis Björklund: backport tx_timeout from 2.4.x to reset on timeout
+	  instead of stop working...
 */
 
 /* These identify the driver base version and may not be removed. */
 static const char version1[] =
-"via-rhine.c:v1.08b-LK1.0.0 12/14/2000 Written by Donald Becker\n";
+"via-rhine.c:v1.08b-LK1.0.1 12/14/2000 Written by Donald Becker\n";
 static const char version2[] =
 " http://www.scyld.com/network/via-rhine.html\n";
 
@@ -95,9 +99,11 @@
 #include <linux/etherdevice.h>
 #include <linux/skbuff.h>
 #include <linux/init.h>
+#include <linux/delay.h>
 #include <asm/processor.h>		/* Processor type for cache alignment. */
 #include <asm/bitops.h>
 #include <asm/io.h>
+#include <asm/irq.h>
 
 /* Condensed bus+endian portability operations. */
 #define virt_to_le32desc(addr) cpu_to_le32(virt_to_bus(addr))
@@ -256,6 +262,13 @@
 				 struct device *dev, long ioaddr, int irq,
 				 int chp_idx, int fnd_cnt);
 
+enum via_rhine_chips {
+	VT86C100A = 0,
+	VT6102,
+	VT3043,
+};
+
+/* directly indexed by enum via_rhine_chips, above */
 static struct pci_id_info pci_tbl[] __initdata = {
 	{ "VIA VT86C100A Rhine-II", 0x1106, 0x6100, 0xffff,
 	  RHINE_IOTYPE, 128, via_probe1},
@@ -379,7 +392,6 @@
 static void check_duplex(struct device *dev);
 static void netdev_timer(unsigned long data);
 static void tx_timeout(struct device *dev);
-static void init_ring(struct device *dev);
 static int start_tx(struct sk_buff *skb, struct device *dev);
 static void intr_handler(int irq, void *dev_instance, struct pt_regs *regs);
 static int netdev_rx(struct device *dev);
@@ -395,6 +407,31 @@
 /* A list of our installed devices, for removing the driver module. */
 static struct device *root_net_dev = NULL;
 
+static void wait_for_reset(struct device *dev, char *name)
+{
+	struct netdev_private *np = dev->priv;
+	long ioaddr = dev->base_addr;
+	int chip_id = np->chip_id;
+	int i;
+
+	/* 3043 may need long delay after reset (dlink) */
+	if (chip_id == VT3043 || chip_id == VT86C100A)
+		udelay(100);
+
+	i = 0;
+	do {
+		udelay(5);
+		i++;
+		if(i > 2000) {
+			printk(KERN_ERR "%s: reset did not complete in 10 ms.\n", name);
+			break;
+		}
+	} while(readw(ioaddr + ChipCmd) & CmdReset);
+	if (debug > 1)
+		printk(KERN_INFO "%s: reset finished after %d microseconds.\n",
+			   name, 5*i);
+}
+
 /* Ideally we would detect all network cards in slot order. That would
    be best done a central PCI probe dispatch, which wouldn't work
    well when dynamically adding drivers. So instead we detect just the
@@ -594,6 +631,141 @@
 	return dev;
 }
 
+static void alloc_rbufs(struct device *dev)
+{
+	struct netdev_private *np = (struct netdev_private *)dev->priv;
+	int i;
+
+	np->cur_rx = 0;
+	np->dirty_rx = 0;
+
+	np->rx_buf_sz = (dev->mtu <= 1500 ? PKT_BUF_SZ : dev->mtu + 32);
+	np->rx_head_desc = &np->rx_ring[0];
+
+	for (i = 0; i < RX_RING_SIZE; i++) {
+		np->rx_ring[i].rx_status = 0;
+		np->rx_ring[i].desc_length = cpu_to_le32(np->rx_buf_sz);
+		np->rx_ring[i].next_desc = virt_to_le32desc(&np->rx_ring[i+1]);
+		np->rx_skbuff[i] = 0;
+	}
+	/* Mark the last entry as wrapping the ring. */
+	np->rx_ring[i-1].next_desc = virt_to_le32desc(&np->rx_ring[0]);
+
+	/* Fill in the Rx buffers. Handle allocation failure gracefully. */
+	for (i = 0; i < RX_RING_SIZE; i++) {
+		struct sk_buff *skb = dev_alloc_skb(np->rx_buf_sz);
+		np->rx_skbuff[i] = skb;
+		if (skb == NULL)
+			break;
+		skb->dev = dev;			/* Mark as being used by this device. */
+		np->rx_ring[i].addr = virt_to_le32desc(skb->tail);
+		np->rx_ring[i].rx_status = cpu_to_le32(DescOwn);
+	}
+	np->dirty_rx = (unsigned int)(i - RX_RING_SIZE);
+}
+
+static void free_rbufs(struct device* dev)
+{
+	struct netdev_private *np = (struct netdev_private *)dev->priv;
+	int i;
+
+	/* Free all the skbuffs in the Rx queue. */
+	for (i = 0; i < RX_RING_SIZE; i++) {
+		np->rx_ring[i].rx_status = 0;
+		np->rx_ring[i].addr = 0xBADF00D0; /* An invalid address. */
+		if (np->rx_skbuff[i]) {
+#if LINUX_VERSION_CODE < 0x20100
+			np->rx_skbuff[i]->free = 1;
+#endif
+			dev_kfree_skb(np->rx_skbuff[i]);
+		}
+		np->rx_skbuff[i] = 0;
+	}
+}
+
+static void alloc_tbufs(struct device* dev)
+{
+	struct netdev_private *np = (struct netdev_private *)dev->priv;
+	int i;
+
+	np->tx_full = 0;
+	np->cur_tx = 0;
+	np->dirty_tx = 0;
+
+	for (i = 0; i < TX_RING_SIZE; i++) {
+		np->tx_skbuff[i] = 0;
+		np->tx_ring[i].tx_status = 0;
+		np->tx_ring[i].desc_length = cpu_to_le32(0x00e08000);
+		np->tx_ring[i].next_desc = virt_to_le32desc(&np->tx_ring[i+1]);
+		np->tx_buf[i] = kmalloc(PKT_BUF_SZ, GFP_KERNEL);
+	}
+	np->tx_ring[i-1].next_desc = virt_to_le32desc(&np->tx_ring[0]);
+}
+
+static void free_tbufs(struct device *dev)
+{
+	struct netdev_private *np = (struct netdev_private *)dev->priv;
+	int i;
+
+	for (i = 0; i < TX_RING_SIZE; i++) {
+		if (np->tx_skbuff[i])
+			dev_kfree_skb(np->tx_skbuff[i]);
+		np->tx_skbuff[i] = 0;
+		if (np->tx_buf[i]) {
+			kfree(np->tx_buf[i]);
+			np->tx_buf[i] = 0;
+		}
+	}
+}
+
+static void init_registers(struct device *dev)
+{
+	struct netdev_private *np = (struct netdev_private *)dev->priv;
+	long ioaddr = dev->base_addr;
+	int i;
+
+	for (i = 0; i < 6; i++)
+		writeb(dev->dev_addr[i], ioaddr + StationAddr + i);
+
+	/* Initialize other registers. */
+	writew(0x0006, ioaddr + PCIBusConfig);	/* Tune configuration??? */
+	/* Configure the FIFO thresholds. */
+	writeb(0x20, ioaddr + TxConfig);	/* Initial threshold 32 bytes */
+	np->tx_thresh = 0x20;
+	np->rx_thresh = 0x60;			/* Written in set_rx_mode(). */
+
+	if (dev->if_port == 0)
+		dev->if_port = np->default_port;
+
+	dev->tbusy = 0;
+	dev->interrupt = 0;
+
+	writel(virt_to_bus(np->rx_ring), ioaddr + RxRingPtr);
+	writel(virt_to_bus(np->tx_ring), ioaddr + TxRingPtr);
+
+	set_rx_mode(dev);
+
+	dev->start = 1;
+
+	/* Enable interrupts by setting the interrupt mask. */
+	writew(IntrRxDone | IntrRxErr | IntrRxEmpty| IntrRxOverflow| IntrRxDropped|
+		   IntrTxDone | IntrTxAbort | IntrTxUnderrun |
+		   IntrPCIErr | IntrStatsMax | IntrLinkChange | IntrMIIChange,
+		   ioaddr + IntrEnable);
+
+	np->chip_cmd = CmdStart|CmdTxOn|CmdRxOn|CmdNoTxPoll;
+	if (np->duplex_lock)
+		np->chip_cmd |= CmdFDuplex;
+	writew(np->chip_cmd, ioaddr + ChipCmd);
+
+	check_duplex(dev);
+	/* The LED outputs of various MII xcvrs should be configured. */
+	/* For NS or Mison phys, turn on bit 1 in register 0x17 */
+	/* For ESI phys, turn on bit 7 in register 0x17. */
+	mdio_write(dev, np->phys[0], 0x17, mdio_read(dev, np->phys[0], 0x17) |
+			   (np->drv_flags & HasESIPhy) ? 0x0080 : 0x0001);
+}
+
 \f
 /* Read and write over the MII Management Data I/O (MDIO) interface. */
 
@@ -650,7 +822,6 @@
 {
 	struct netdev_private *np = (struct netdev_private *)dev->priv;
 	long ioaddr = dev->base_addr;
-	int i;
 
 	/* Reset the chip. */
 	writew(CmdReset, ioaddr + ChipCmd);
@@ -666,48 +837,10 @@
 		printk(KERN_DEBUG "%s: netdev_open() irq %d.\n",
 			   dev->name, dev->irq);
 
-	init_ring(dev);
-
-	writel(virt_to_bus(np->rx_ring), ioaddr + RxRingPtr);
-	writel(virt_to_bus(np->tx_ring), ioaddr + TxRingPtr);
-
-	for (i = 0; i < 6; i++)
-		writeb(dev->dev_addr[i], ioaddr + StationAddr + i);
-
-	/* Initialize other registers. */
-	writew(0x0006, ioaddr + PCIBusConfig);	/* Tune configuration??? */
-	/* Configure the FIFO thresholds. */
-	writeb(0x20, ioaddr + TxConfig);	/* Initial threshold 32 bytes */
-	np->tx_thresh = 0x20;
-	np->rx_thresh = 0x60;			/* Written in set_rx_mode(). */
-
-	if (dev->if_port == 0)
-		dev->if_port = np->default_port;
-
-	dev->tbusy = 0;
-	dev->interrupt = 0;
-
-	set_rx_mode(dev);
-
-	dev->start = 1;
+	alloc_rbufs(dev);
+	alloc_tbufs(dev);
 
-	/* Enable interrupts by setting the interrupt mask. */
-	writew(IntrRxDone | IntrRxErr | IntrRxEmpty| IntrRxOverflow| IntrRxDropped|
-		   IntrTxDone | IntrTxAbort | IntrTxUnderrun |
-		   IntrPCIErr | IntrStatsMax | IntrLinkChange | IntrMIIChange,
-		   ioaddr + IntrEnable);
-
-	np->chip_cmd = CmdStart|CmdTxOn|CmdRxOn|CmdNoTxPoll;
-	if (np->duplex_lock)
-		np->chip_cmd |= CmdFDuplex;
-	writew(np->chip_cmd, ioaddr + ChipCmd);
-
-	check_duplex(dev);
-	/* The LED outputs of various MII xcvrs should be configured. */
-	/* For NS or Mison phys, turn on bit 1 in register 0x17 */
-	/* For ESI phys, turn on bit 7 in register 0x17. */
-	mdio_write(dev, np->phys[0], 0x17, mdio_read(dev, np->phys[0], 0x17) |
-			   (np->drv_flags & HasESIPhy) ? 0x0080 : 0x0001);
+	init_registers(dev);
 
 	if (debug > 2)
 		printk(KERN_DEBUG "%s: Done netdev_open(), status %4.4x "
@@ -775,6 +908,7 @@
 static void tx_timeout(struct device *dev)
 {
 	struct netdev_private *np = (struct netdev_private *)dev->priv;
+	struct pci_dev *pdev = pci_find_slot(np->pci_bus, np->pci_devfn);
 	long ioaddr = dev->base_addr;
 
 	printk(KERN_WARNING "%s: Transmit timed out, status %4.4x, PHY status "
@@ -782,60 +916,32 @@
 		   dev->name, readw(ioaddr + IntrStatus),
 		   mdio_read(dev, np->phys[0], 1));
 
-	/* Perhaps we should reinitialize the hardware here. */
 	dev->if_port = 0;
-	/* Stop and restart the chip's Tx processes . */
 
-	/* Trigger an immediate transmit demand. */
-
-	dev->trans_start = jiffies;
-	np->stats.tx_errors++;
-	return;
-}
+	/* protect against concurrent rx interrupts */
+	disable_irq(pdev->irq);
+	/* Reset the chip. */
+	writew(CmdReset, ioaddr + ChipCmd);
 
-/* Initialize the Rx and Tx rings, along with various 'dev' bits. */
-static void init_ring(struct device *dev)
-{
-	struct netdev_private *np = (struct netdev_private *)dev->priv;
-	int i;
-
-	np->tx_full = 0;
-	np->cur_rx = np->cur_tx = 0;
-	np->dirty_rx = np->dirty_tx = 0;
+	/* clear all descriptors */
+	free_tbufs(dev);
+	free_rbufs(dev);
+	alloc_tbufs(dev);
+	alloc_rbufs(dev);
+
+	/* Reinitialize the hardware. */
+	wait_for_reset(dev, dev->name);
+	init_registers(dev);
 
-	np->rx_buf_sz = (dev->mtu <= 1500 ? PKT_BUF_SZ : dev->mtu + 32);
-	np->rx_head_desc = &np->rx_ring[0];
+	enable_irq(pdev->irq);
 
-	for (i = 0; i < RX_RING_SIZE; i++) {
-		np->rx_ring[i].rx_status = 0;
-		np->rx_ring[i].desc_length = cpu_to_le32(np->rx_buf_sz);
-		np->rx_ring[i].next_desc = virt_to_le32desc(&np->rx_ring[i+1]);
-		np->rx_skbuff[i] = 0;
-	}
-	/* Mark the last entry as wrapping the ring. */
-	np->rx_ring[i-1].next_desc = virt_to_le32desc(&np->rx_ring[0]);
+	dev->trans_start = jiffies;
+	np->stats.tx_errors++;
 
-	/* Fill in the Rx buffers. Handle allocation failure gracefully. */
-	for (i = 0; i < RX_RING_SIZE; i++) {
-		struct sk_buff *skb = dev_alloc_skb(np->rx_buf_sz);
-		np->rx_skbuff[i] = skb;
-		if (skb == NULL)
-			break;
-		skb->dev = dev;			/* Mark as being used by this device. */
-		np->rx_ring[i].addr = virt_to_le32desc(skb->tail);
-		np->rx_ring[i].rx_status = cpu_to_le32(DescOwn);
-	}
-	np->dirty_rx = (unsigned int)(i - RX_RING_SIZE);
-
-	for (i = 0; i < TX_RING_SIZE; i++) {
-		np->tx_skbuff[i] = 0;
-		np->tx_ring[i].tx_status = 0;
-		np->tx_ring[i].desc_length = cpu_to_le32(0x00e08000);
-		np->tx_ring[i].next_desc = virt_to_le32desc(&np->tx_ring[i+1]);
-		np->tx_buf[i] = kmalloc(PKT_BUF_SZ, GFP_KERNEL);
-	}
-	np->tx_ring[i-1].next_desc = virt_to_le32desc(&np->tx_ring[0]);
+	/* wake queue */
+	clear_bit(0, (void*)&dev->tbusy);
+	mark_bh(NET_BH);
 	return;
 }
 
@@ -1233,7 +1339,6 @@
 {
 	long ioaddr = dev->base_addr;
 	struct netdev_private *np = (struct netdev_private *)dev->priv;
-	int i;
 
 	dev->start = 0;
 	dev->tbusy = 1;
@@ -1255,27 +1360,8 @@
 
 	free_irq(dev->irq, dev);
 
-	/* Free all the skbuffs in the Rx queue. */
-	for (i = 0; i < RX_RING_SIZE; i++) {
-		np->rx_ring[i].rx_status = 0;
-		np->rx_ring[i].addr = 0xBADF00D0; /* An invalid address. */
-		if (np->rx_skbuff[i]) {
-#if LINUX_VERSION_CODE < 0x20100
-			np->rx_skbuff[i]->free = 1;
-#endif
-			dev_kfree_skb(np->rx_skbuff[i]);
-		}
-		np->rx_skbuff[i] = 0;
-	}
-	for (i = 0; i < TX_RING_SIZE; i++) {
-		if (np->tx_skbuff[i])
-			dev_kfree_skb(np->tx_skbuff[i]);
-		np->tx_skbuff[i] = 0;
-		if (np->tx_buf[i]) {
-			kfree(np->tx_buf[i]);
-			np->tx_buf[i] = 0;
-		}
-	}
+	free_rbufs(dev);
+	free_tbufs(dev);
 
 	MOD_DEC_USE_COUNT;
 
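
One detail in the patch above may be worth a second look: the mdio_write() call that sets the LED bits combines `|` and `?:` without inner parentheses. In C, `|` binds tighter than `?:`, so the whole OR expression becomes the condition of the conditional. The sketch below only illustrates that grouping in user space. The function names and the HAS_ESI_PHY value are hypothetical, and whether the difference matters on real hardware is not something this thread establishes.

```c
#include <assert.h>

#define HAS_ESI_PHY 0x02u	/* hypothetical flag value, for illustration */

/*
 * How a C compiler groups `reg | (flags & F) ? 0x0080 : 0x0001`:
 * the OR is evaluated first and used as the condition.
 */
static unsigned led_value_as_parsed(unsigned reg, unsigned flags)
{
	return (reg | (flags & HAS_ESI_PHY)) ? 0x0080 : 0x0001;
}

/*
 * What the surrounding comments suggest was meant: keep the bits read
 * back from the register and turn on one extra bit.
 */
static unsigned led_value_parenthesized(unsigned reg, unsigned flags)
{
	return reg | ((flags & HAS_ESI_PHY) ? 0x0080 : 0x0001);
}

/* Small self-check comparing both groupings for a sample register value. */
static void led_value_demo(void)
{
	unsigned reg = 0x1234;

	/* As parsed, the register contents are dropped from the result. */
	assert(led_value_as_parsed(reg, 0) == 0x0080);
	assert(led_value_as_parsed(reg, HAS_ESI_PHY) == 0x0080);

	/* With explicit parentheses, the read-back bits survive. */
	assert(led_value_parenthesized(reg, 0) == 0x1235);
	assert(led_value_parenthesized(reg, HAS_ESI_PHY) == 0x12B4);
}
```

The same expression appears in the lines the patch removes from netdev_open(), so it predates this backport; the patch only moves it into init_registers().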
end of thread, newest message: 2001-08-31 12:18 UTC

Thread overview: 12+ messages
2001-08-24 14:24 ISSUE: DFE530-TX REV-A3-1 times out on transmit David Schmitt
2001-08-25 17:05 ` Urban Widmark
2001-08-27  8:27 ` David Schmitt
2001-08-27  9:13 ` David Schmitt
2001-08-27 19:02 ` Urban Widmark
2001-08-28 13:45 ` David Schmitt
2001-08-28 19:46 ` Urban Widmark
2001-08-29  6:42 ` Dennis Bjorklund
2001-08-29 12:48 ` David Schmitt
2001-08-29 18:45 ` Urban Widmark
2001-08-31 12:18 ` David Schmitt
2001-08-28  7:26 ` Dennis Bjorklund