* PROBLEM: system freezes on starting a sky2 + ne2k-pci bridge
@ 2007-10-02 14:12 Mirko Parthey
  2007-10-04 18:27 ` PROBLEM: system freezes on starting ne2k-pci + bridge Mirko Parthey
  0 siblings, 1 reply; 3+ messages in thread
From: Mirko Parthey @ 2007-10-02 14:12 UTC (permalink / raw)
  To: netdev

On a machine running Debian testing, I get complete lockups
(Num lock LED not responding anymore)
when I try to use the network in the following configuration:

eth0: Marvell 8056 Gigabit LAN Controller, sky2 driver;
      the driver reports: Yukon EC Ultra (0xb4) rev 2
eth1: Compex ReadyLink 2000 (BNC+TP), ne2k-pci driver
br0: bridge comprising eth0 and eth1

Kernel versions tried (all of them show this problem):
- linux-image-2.6.18-5-amd64 (Debian etch)
- linux-image-2.6.22-2-amd64 (Debian testing)
- plain kernel.org 2.6.23-rc8-git4 (with allmodconfig and ATKBD=y)

The 2.6.18 kernel sometimes prints
  Losing some ticks ... checking if CPU frequency changed.
  Your time source seems to be instable or some driver is hogging
  interrupts.
  rip __do_softirq + 0x53/0xd5
before freezing.

The lockups can be reproduced fairly reliably as follows:
- power cycle the machine (shutdown with ACPI power off suffices,
  no need to use the switch on the power supply)
- switch to runlevel 1 (optional)
- eth0: has a gigabit link partner
- eth1: network cable disconnected
    (the driver will complain: Tx timed out, cable problem?)
- run the Debian-specific command
  $ ifup br0
  to start the bridge and its member interfaces
  br0 is configured in /etc/network/interfaces as follows:
    iface br0 inet static
      address 192.168.1.17
      netmask 255.255.255.0
      broadcast 192.168.1.255
      gateway 192.168.1.99
      bridge_ports eth0 eth1
      bridge_stp off
      bridge_fd 1

The lockup will occur while br0 is being brought up,
or soon afterwards.  If the system doesn't freeze within a minute,
I can transfer hundreds of MB of data without problems. For further
attempts at reproducing the problem, I then need to power cycle.

The Debian command "ifup br0" essentially does the following:
  brctl addbr br0
  brctl addif br0 eth0
  ifconfig eth0 0.0.0.0 up
  # sleep 2
  brctl addif br0 eth1
  ifconfig eth1 0.0.0.0 up
  # sleep 2
  brctl setfd br0 1
  brctl stp br0 off
  ifconfig br0 0.0.0.0 up
  # sleep 2
  ifconfig br0 192.168.1.17 netmask 255.255.255.0 broadcast 192.168.1.255 up
  # sleep 2
  route add default gw 192.168.1.99 br0

After adding the sleep commands, I could reproduce the problem
with the above shell script (without using Debian-specific commands),
although less reliably than with "ifup br0".

Finally some environment information. Please let me know if you need further
information, or if I should do further experiments.

# lspci
00:00.0 Host bridge: Intel Corporation P965/G965 Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation P965/G965 PCI Express Root Port (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #2 (rev 02)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02)
00:1c.3 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 4 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 5 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HB/HR (ICH8/R) LPC Interface Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801H (ICH8 Family) 4 port SATA IDE Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02)
00:1f.5 IDE interface: Intel Corporation 82801H (ICH8 Family) 2 port SATA IDE Controller (rev 02)
01:00.0 VGA compatible controller: ATI Technologies Inc RV370 5B60 [Radeon X300 (PCIE)]
01:00.1 Display controller: ATI Technologies Inc RV370 [Radeon X300SE]
03:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
03:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
04:00.0 Ethernet controller: Marvell Technology Group Ltd. Unknown device 4364 (rev 12)
05:00.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 02)
05:00.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 02)
05:01.0 Ethernet controller: Compex ReadyLink 2000 (rev 0a)

# lspci -s 4:0 -vvv
04:00.0 Ethernet controller: Marvell Technology Group Ltd. Unknown device 4364 (rev 12)
        Subsystem: Giga-byte Technology Unknown device e000
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 58
        Region 0: Memory at ef000000 (64-bit, non-prefetchable) [size=16K]
        Region 2: I/O ports at 8000 [size=256]
        [virtual] Expansion ROM at 80000000 [disabled] [size=128K]
        Capabilities: [48] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] Vital Product Data
        Capabilities: [5c] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
                Address: 00000000fee00000  Data: 403a
        Capabilities: [e0] Express Legacy Endpoint IRQ 0
                Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag-
                Device: Latency L0s unlimited, L1 unlimited
                Device: AtnBtn- AtnInd- PwrInd-
                Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
                Device: RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
                Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s L1, Port 0
                Link: Latency L0s <256ns, L1 unlimited
                Link: ASPM Disabled RCB 128 bytes CommClk- ExtSynch-
                Link: Speed 2.5Gb/s, Width x1
        Capabilities: [100] Advanced Error Reporting

# lspci -s 5:1 -vvv
05:01.0 Ethernet controller: Compex ReadyLink 2000 (rev 0a)
        Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Interrupt: pin A routed to IRQ 177
        Region 0: I/O ports at 9000 [size=32]
        [virtual] Expansion ROM at f1108000 [disabled] [size=32K]

Regards,
Mirko

* Re: PROBLEM: system freezes on starting ne2k-pci + bridge
  2007-10-02 14:12 PROBLEM: system freezes on starting a sky2 + ne2k-pci bridge Mirko Parthey
@ 2007-10-04 18:27 ` Mirko Parthey
  2007-10-04 21:27   ` Stephen Hemminger
  0 siblings, 1 reply; 3+ messages in thread
From: Mirko Parthey @ 2007-10-04 18:27 UTC (permalink / raw)
  To: netdev

On Tue, Oct 02, 2007 at 04:12:17PM +0200, I wrote:
> On a machine running Debian testing, I get complete lockups
> (Num lock LED not responding anymore)
> 
> Kernel versions tried (all of them show this problem):
> - linux-image-2.6.18-5-amd64 (Debian etch)
> - linux-image-2.6.22-2-amd64 (Debian testing)
> - plain kernel.org 2.6.23-rc8-git4 (with allmodconfig and ATKBD=y)
> 
> The 2.6.18 kernel sometimes prints
>   Losing some ticks ... checking if CPU frequency changed.
>   Your time source seems to be instable or some driver is hogging
>   interrupts.
>   rip __do_softirq + 0x53/0xd5
> before freezing.

I was able to narrow this down a bit - the problem can be reproduced with 
the ne2k-pci driver alone, sky2 is not needed.
Powering off isn't necessary, either.

Hardware preparation:
- eth0: Compex ReadyLink 2000 (BNC+TP), ne2k-pci driver,
  network cable disconnected

How to reproduce the problem:

brctl addbr br0
brctl addif br0 eth0
ifconfig eth0 0 up
ifconfig br0 192.168.1.17 up
sync
find / >/dev/null &
ping -b 192.168.1.255

This will lock up my system, usually within a few seconds.
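
The reproduction steps above could be wrapped in a small script. This is only a sketch: the function name `repro_cmds`, its parameters, and the `REPRO_RUN` variable are hypothetical, and by default the script merely prints the commands, since actually running them is expected to freeze an affected machine.

```shell
#!/bin/sh
# Sketch only: prints the reproduction commands from this report.
# repro_cmds and its arguments are illustrative names, not part of any tool.
repro_cmds() {
    ifname=${1:-eth0}          # bridge member (the ne2k-pci NIC in this report)
    bridge=${2:-br0}
    addr=${3:-192.168.1.17}
    bcast=${4:-192.168.1.255}
    cat <<EOF
brctl addbr $bridge
brctl addif $bridge $ifname
ifconfig $ifname 0 up
ifconfig $bridge $addr up
sync
find / >/dev/null &
ping -b $bcast
EOF
}

# Execute only when explicitly requested (REPRO_RUN=1) and run as root.
if [ "${REPRO_RUN:-0}" = "1" ]; then
    repro_cmds "$@" | sh -x
fi
```

Calling `repro_cmds eth0 br0` prints the seven commands for review; setting `REPRO_RUN=1` pipes them to `sh -x` so the last command executed is visible on the console before a lockup.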

Some additional information:
- I could not reproduce the problem when using eth0 directly,
  without a bridge.
- Booting with "maxcpus=1" does not help, the problem remains.
  My system doesn't boot with "nosmp", otherwise I would have
  tried this too.

- Mainboard: Gigabyte GA-965P-S3

- /proc/cpuinfo:
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 CPU          6400  @ 2.13GHz
stepping	: 6
cpu MHz		: 2133.394
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips	: 4269.87
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 CPU          6400  @ 2.13GHz
stepping	: 6
cpu MHz		: 2133.394
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips	: 4267.07
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

- /proc/interrupts:
           CPU0       CPU1       
  0:      46738          0   IO-APIC-edge      timer
  1:        690          0   IO-APIC-edge      i8042
  7:          0          0   IO-APIC-edge      parport0
  8:          0          0   IO-APIC-edge      rtc
  9:          0          0   IO-APIC-fasteoi   acpi
 12:          4          0   IO-APIC-edge      i8042
 16:      25552          0   IO-APIC-fasteoi   uhci_hcd:usb1, ide0, radeon@pci:0000:01:00.0
 18:          2          0   IO-APIC-fasteoi   uhci_hcd:usb5, ehci_hcd:usb6
 19:      15146          0   IO-APIC-fasteoi   uhci_hcd:usb4, libata, libata, ahci, eth0
 20:          3          0   IO-APIC-fasteoi   bttv0, Bt87x audio
 21:          0          0   IO-APIC-fasteoi   uhci_hcd:usb2
 22:        192          0   IO-APIC-fasteoi   HDA Intel
 23:       5009          0   IO-APIC-fasteoi   uhci_hcd:usb3, ehci_hcd:usb7
NMI:          0          0 
LOC:      46261      46240 
ERR:          0

- /proc/ioports:
0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-006f : keyboard
0070-0077 : rtc
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0378-037a : parport0
03c0-03df : vga+
03f8-03ff : serial
0400-047f : 0000:00:1f.0
  0400-0403 : ACPI PM1a_EVT_BLK
  0404-0405 : ACPI PM1a_CNT_BLK
  0408-040b : ACPI PM_TMR
  0410-0415 : ACPI CPU throttle
  0428-042f : ACPI GPE0_BLK
  0460-047f : iTCO_wdt
0480-04bf : 0000:00:1f.0
0500-051f : 0000:00:1f.3
  0500-051f : i801_smbus
0778-077a : parport0
0cf8-0cff : PCI conf1
4000-4fff : PCI Bus #02
5000-5fff : PCI Bus #01
  5000-50ff : 0000:01:00.0
6000-7fff : PCI Bus #03
  6000-6007 : 0000:03:00.1
    6000-6007 : ide0
  6400-6403 : 0000:03:00.1
    6402-6402 : ide0
  6800-6807 : 0000:03:00.1
  6c00-6c03 : 0000:03:00.1
  7000-700f : 0000:03:00.1
    7000-7007 : ide0
    7008-700f : ide1
8000-8fff : PCI Bus #04
  8000-80ff : 0000:04:00.0
9000-9fff : PCI Bus #05
  9000-901f : 0000:05:01.0
    9000-901f : ne2k-pci
  9400-947f : 0000:05:02.0
a000-a01f : 0000:00:1a.1
  a000-a01f : uhci_hcd
a400-a41f : 0000:00:1d.0
  a400-a41f : uhci_hcd
a800-a81f : 0000:00:1d.1
  a800-a81f : uhci_hcd
ac00-ac1f : 0000:00:1d.2
  ac00-ac1f : uhci_hcd
b000-b01f : 0000:00:1a.0
  b000-b01f : uhci_hcd
b400-b407 : 0000:00:1f.2
  b400-b407 : libata
b800-b803 : 0000:00:1f.2
  b800-b803 : libata
bc00-bc07 : 0000:00:1f.2
  bc00-bc07 : libata
c000-c003 : 0000:00:1f.2
  c000-c003 : libata
c400-c40f : 0000:00:1f.2
  c400-c40f : libata
c800-c80f : 0000:00:1f.2
d000-d007 : 0000:00:1f.5
  d000-d007 : libata
d400-d403 : 0000:00:1f.5
  d400-d403 : libata
d800-d807 : 0000:00:1f.5
  d800-d807 : libata
dc00-dc03 : 0000:00:1f.5
  dc00-dc03 : libata
e000-e00f : 0000:00:1f.5
  e000-e00f : libata
e400-e40f : 0000:00:1f.5

- /proc/iomem:
00000000-0009f7ff : System RAM
  00000000-00000000 : Crash kernel
0009f800-0009ffff : reserved
000d2800-000d3fff : pnp 00:0b
000f0000-000fffff : reserved
00100000-7fedffff : System RAM
  00200000-003f6749 : Kernel code
  003f674a-004e2b7f : Kernel data
7fee0000-7fee2fff : ACPI Non-volatile Storage
7fee3000-7feeffff : ACPI Tables
7fef0000-7fefffff : reserved
80000000-800fffff : PCI Bus #04
  80000000-8001ffff : 0000:04:00.0
e0000000-e7ffffff : PCI Bus #01
  e0000000-e7ffffff : 0000:01:00.0
e8000000-ebffffff : reserved
ec000000-edffffff : PCI Bus #01
  ec000000-ec01ffff : 0000:01:00.0
  ed000000-ed00ffff : 0000:01:00.0
  ed010000-ed01ffff : 0000:01:00.1
ee000000-efffffff : PCI Bus #04
  ef000000-ef003fff : 0000:04:00.0
f0000000-f1ffffff : PCI Bus #05
  f1000000-f100007f : 0000:05:02.0
f2000000-f20fffff : PCI Bus #03
  f2000000-f2001fff : 0000:03:00.0
    f2000000-f2001fff : ahci
f2100000-f21fffff : PCI Bus #05
  f2100000-f2100fff : 0000:05:00.0
    f2100000-f2100fff : bttv0
  f2101000-f2101fff : 0000:05:00.1
    f2101000-f2101fff : Bt87x audio
  f2108000-f210ffff : 0000:05:01.0
  f2120000-f213ffff : 0000:05:02.0
f2200000-f2203fff : 0000:00:1b.0
  f2200000-f2203fff : ICH HD audio
f2204000-f22043ff : 0000:00:1a.7
  f2204000-f22043ff : ehci_hcd
f2205000-f22053ff : 0000:00:1d.7
  f2205000-f22053ff : ehci_hcd
f2206000-f22060ff : 0000:00:1f.3
fec00000-fec00fff : IOAPIC 0
fee00000-fee00fff : Local APIC

- # lspci -s 5:1 -vvv
05:01.0 Ethernet controller: Compex ReadyLink 2000 (rev 0a)
        Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Interrupt: pin A routed to IRQ 19
        Region 0: I/O ports at 9000 [size=32]
        [virtual] Expansion ROM at f2108000 [disabled] [size=32K]

- ver_linux output:
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux guitar2 2.6.22-2-amd64 #1 SMP Thu Aug 30 23:43:59 UTC 2007 x86_64 GNU/Linux

Gnu C                  4.2.1
Gnu make               3.81
binutils               Binutils
util-linux             2.12r
mount                  2.12r
module-init-tools      3.3-pre11
e2fsprogs              1.40.2
Linux C Library        6.1
Dynamic linker (ldd)   2.6.1
Procps                 3.2.7
Net-tools              1.60
Console-tools          0.2.3
Sh-utils               5.97
udev                   114
wireless-tools         29
Modules Loaded         radeon drm nfsd exportfs ppdev lp button ac battery cpufreq_powersave cpufreq_ondemand cpufreq_userspace cpufreq_conservative cpufreq_stats freq_table nfs lockd nfs_acl sunrpc ipv6 bridge ext2 nls_iso8859_1 nls_cp437 vfat fat loop snd_bt87x bt878 snd_hda_intel snd_pcm_oss snd_pcm snd_mixer_oss tuner snd_seq_dummy tvaudio msp3400 snd_seq_oss bttv snd_seq_midi video_buf firmware_class snd_rawmidi ir_common snd_seq_midi_event compat_ioctl32 i2c_algo_bit i2c_i801 btcx_risc iTCO_wdt snd_seq tveeprom parport_pc parport videodev i2c_core serio_raw psmouse snd_timer snd_seq_device v4l2_common v4l1_compat pcspkr snd soundcore snd_page_alloc intel_agp tsdev evdev ext3 jbd mbcache dm_mirror dm_snapshot dm_mod ide_cd cdrom sd_mod usbhid hid generic usb_storage jmicron ide_core ahci ne2k_pci 8390 ata_piix ata_generic libata scsi_mod ehci_hcd uhci_hcd thermal processor fan

Regards,
Mirko

* Re: PROBLEM: system freezes on starting ne2k-pci + bridge
  2007-10-04 18:27 ` PROBLEM: system freezes on starting ne2k-pci + bridge Mirko Parthey
@ 2007-10-04 21:27   ` Stephen Hemminger
  0 siblings, 0 replies; 3+ messages in thread
From: Stephen Hemminger @ 2007-10-04 21:27 UTC (permalink / raw)
  To: Mirko Parthey; +Cc: netdev

On Thu, 4 Oct 2007 20:27:38 +0200
mirko.parthey@informatik.tu-chemnitz.de (Mirko Parthey) wrote:

> On Tue, Oct 02, 2007 at 04:12:17PM +0200, I wrote:
> > On a machine running Debian testing, I get complete lockups
> > (Num lock LED not responding anymore)
> > 
> > Kernel versions tried (all of them show this problem):
> > - linux-image-2.6.18-5-amd64 (Debian etch)
> > - linux-image-2.6.22-2-amd64 (Debian testing)
> > - plain kernel.org 2.6.23-rc8-git4 (with allmodconfig and ATKBD=y)
> > 
> > The 2.6.18 kernel sometimes prints
> >   Losing some ticks ... checking if CPU frequency changed.
> >   Your time source seems to be instable or some driver is hogging
> >   interrupts.
> >   rip __do_softirq + 0x53/0xd5
> > before freezing.
> 
> I was able to narrow this down a bit - the problem can be reproduced with 
> the ne2k-pci driver alone, sky2 is not needed.
> Powering off isn't necessary, either.
> 
> Hardware preparation:
> - eth0: Compex ReadyLink 2000 (BNC+TP), ne2k-pci driver,
>   network cable disconnected
> 
> How to reproduce the problem:
> 
> brctl addbr br0
> brctl addif br0 eth0
> ifconfig eth0 0 up
> ifconfig br0 192.168.1.17 up
> sync
> find / >/dev/null &
> ping -b 192.168.1.255
> 
> This will lock up my system, usually within a few seconds.
> 
> Some additional information:
> - I could not reproduce the problem when using eth0 directly,
>   without a bridge.
> - Booting with "maxcpus=1" does not help, the problem remains.
>   My system doesn't boot with "nosmp", otherwise I would have
>   tried this too.
>

Yes, it's a bug, but ne2k is an old, crufty device driver that is not really
suited to bridging. It lacks:
   * proper speed reporting via ethtool (it has hardly any ethtool support)
   * carrier up/down status reporting
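
One way to see what the kernel believes about an interface's carrier is to read its sysfs `carrier` attribute. The sketch below assumes sysfs is mounted at /sys; the function name `carrier_state` and its optional base-directory argument are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch: report the carrier state the kernel exposes for an interface.
# carrier_state and the optional base-directory argument are illustrative.
carrier_state() {
    ifname=$1
    base=${2:-/sys/class/net}
    f="$base/$ifname/carrier"
    # Reading the attribute fails if the interface does not exist or is
    # administratively down; both cases are reported as "unknown".
    if v=$(cat "$f" 2>/dev/null); then
        case $v in
            1) echo up ;;
            0) echo down ;;
            *) echo unknown ;;
        esac
    else
        echo unknown
    fi
}
```

For example, `carrier_state eth0` prints `up`, `down`, or `unknown`. With a driver that never signals carrier changes, the value would be expected to stay `up` even with the cable unplugged. Running `ethtool eth0` shows whatever speed and link information the driver provides.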




-- 
Stephen Hemminger <shemminger@linux-foundation.org>
