linux-kernel.vger.kernel.org archive mirror
* Lockup with 2.6.9-ac15 related to netconsole
@ 2004-12-16 16:20 Mark Broadbent
  2004-12-16 21:10 ` Matt Mackall
  0 siblings, 1 reply; 26+ messages in thread
From: Mark Broadbent @ 2004-12-16 16:20 UTC (permalink / raw)
  To: mpm; +Cc: linux-kernel

Hi,

I'm having a problem using ethereal/tcpdump in conjunction with
netconsole (built as a module).  If netconsole is loaded and I try to
launch tcpdump on the same interface that netconsole is transmitting on, I
get a hard lock-up.  The following commands reproduce it consistently:
# tcpdump -i eth0
eth0: Promiscuous Mode Entered
<... normal output ...>
^C
# modprobe netconsole
# tcpdump -i eth0
eth0: Promiscuous Mode Entered
<4>NMI Watchdog detected LOCKUP

I attempted to dump registers to the screen using SysRq but all I get is:

<6>SysRq:

There are no kernel messages (after the ones attached) received by the
machine collecting the netconsole output.
Attached are dmesg, lspci and lsmod output.

/proc/version says:
Linux version 2.6.9-ac15-mb7 (broadben@mbpc) (gcc version 3.3.5 (Debian
1:3.3.5-3)) #12 SMP Wed Dec 15 09:42:13 GMT 2004
netconsole options:
options netconsole netconsole=@172.20.0.117/eth0,514@172.20.0.127/00:02:B3:03:4E:EA
Command line:
root=/dev/hdb1 nmi_watchdog=1


Any ideas (apart from not using tcpdump and netconsole on the same interface)?

Thanks
Mark

-- 
Mark Broadbent <markb@wetlettuce.com>
Web: http://www.wetlettuce.com

dmesg output:

Linux version 2.6.9-ac15-mb7 (broadben@mbpc) (gcc version 3.3.5 (Debian 1:3.3.5-3)) #12 SMP Wed Dec 15 09:42:13 GMT 2004
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000001fff0000 (usable)
 BIOS-e820: 000000001fff0000 - 000000001fff8000 (ACPI data)
 BIOS-e820: 000000001fff8000 - 0000000020000000 (ACPI NVS)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
511MB LOWMEM available.
found SMP MP-table at 000fc0f0
On node 0 totalpages: 131056
  DMA zone: 4096 pages, LIFO batch:1
  Normal zone: 126960 pages, LIFO batch:16
  HighMem zone: 0 pages, LIFO batch:1
DMI 2.3 present.
ACPI: RSDP (v000 AMI                                   ) @ 0x000fa3b0
ACPI: RSDT (v001 AMIINT INTEL865 0x00000010 MSFT 0x00000097) @ 0x1fff0000
ACPI: FADT (v001 AMIINT INTEL865 0x00000011 MSFT 0x00000097) @ 0x1fff0030
ACPI: MADT (v001 AMIINT INTEL865 0x00000009 MSFT 0x00000097) @ 0x1fff00c0
ACPI: DSDT (v001  INTEL    I865G 0x00001000 MSFT 0x0100000d) @ 0x00000000
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:2 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:2 APIC version 20
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 20 low level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Built 1 zonelists
Kernel command line: root=/dev/hdb1 nmi_watchdog=1
Initializing CPU#0
PID hash table entries: 2048 (order: 11, 32768 bytes)
Detected 3001.290 MHz processor.
Using pmtmr for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 515976k/524224k available (1526k kernel code, 7648k reserved, 766k data, 156k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 5947.39 BogoMIPS (lpj=2973696)
Security Scaffold v1.0.0 initialized
Capability LSM initialized
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000
CPU: After vendor identify, caps:  bfebfbff 00000000 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
CPU: After all inits, caps:        bfebfbff 00000000 00000000 00000080
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
CPU0: Thermal monitoring enabled
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
CPU0: Intel(R) Pentium(R) 4 CPU 3.00GHz stepping 09
per-CPU timeslice cutoff: 1462.52 usecs.
task migration cache decay timeout: 2 msecs.
Booting processor 1/1 eip 2000
Initializing CPU#1
Calibrating delay loop... 5996.54 BogoMIPS (lpj=2998272)
CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000
CPU: After vendor identify, caps:  bfebfbff 00000000 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Physical Processor ID: 0
CPU: After all inits, caps:        bfebfbff 00000000 00000000 00000080
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
CPU1: Thermal monitoring enabled
CPU1: Intel(R) Pentium(R) 4 CPU 3.00GHz stepping 09
Total of 2 processors activated (11943.93 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 pin1=2 pin2=-1
testing NMI watchdog ... OK.
checking TSC synchronization across 2 CPUs: passed.
Brought up 2 CPUs
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfdb81, last bus=2
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20040816
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.ICHB._PRT]
ACPI: Power Resource [URP1] (off)
ACPI: Power Resource [URP2] (off)
ACPI: Power Resource [FDDP] (off)
ACPI: Power Resource [LPTP] (off)
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 *10 11 12 14 15)
Linux Plug and Play Support v0.97 (c) Adam Belay
PnPBIOS: Disabled by ACPI
PCI: Using ACPI for IRQ routing
ACPI: PCI interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) -> IRQ 16
ACPI: PCI interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 19
ACPI: PCI interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 18
ACPI: PCI interrupt 0000:00:1d.3[A] -> GSI 16 (level, low) -> IRQ 16
ACPI: PCI interrupt 0000:00:1d.7[D] -> GSI 23 (level, low) -> IRQ 23
ACPI: PCI interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 18
ACPI: PCI interrupt 0000:00:1f.3[B] -> GSI 17 (level, low) -> IRQ 17
ACPI: PCI interrupt 0000:00:1f.5[B] -> GSI 17 (level, low) -> IRQ 17
ACPI: PCI interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16
ACPI: PCI interrupt 0000:02:03.0[A] -> GSI 19 (level, low) -> IRQ 19
ACPI: PCI interrupt 0000:02:06.0[A] -> GSI 20 (level, low) -> IRQ 20
Machine check exception polling timer started.
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
Initializing Cryptographic API
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
hw_random: RNG not detected
Hangcheck: starting hangcheck timer 0.5.0 (tick is 180 seconds, margin is 60 seconds).
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH5: IDE controller at PCI slot 0000:00:1f.1
PCI: Enabling device 0000:00:1f.1 (0005 -> 0007)
ACPI: PCI interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 18
ICH5: chipset revision 2
ICH5: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:DMA
    ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:DMA, hdd:pio
Probing IDE interface ide0...
hda: Maxtor 92049U6, ATA DISK drive
hdb: MAXTOR 6L040J2, ATA DISK drive
Using anticipatory io scheduler
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: GENERIC CRD-BP1600P, ATAPI CD/DVD-ROM drive
hdd: IOMEGA ZIP 100 ATAPI, ATAPI FLOPPY drive
ide1 at 0x170-0x177,0x376 on irq 15
Probing IDE interface ide2...
ide2: Wait for ready failed before probe !
Probing IDE interface ide3...
ide3: Wait for ready failed before probe !
Probing IDE interface ide4...
ide4: Wait for ready failed before probe !
Probing IDE interface ide5...
ide5: Wait for ready failed before probe !
hda: max request size: 128KiB
hda: 40026672 sectors (20493 MB) w/2048KiB Cache, CHS=39709/16/63, UDMA(66)
hda: cache flushes not supported
 hda: hda1
hdb: max request size: 128KiB
hdb: 78177792 sectors (40027 MB) w/1818KiB Cache, CHS=65535/16/63, UDMA(100)
hdb: cache flushes supported
 hdb: hdb1 hdb2 hdb3 hdb4
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard on isa0060/serio0
input: ImPS/2 Generic Wheel Mouse on isa0060/serio1
NET: Registered protocol family 2
IP: routing cache hash table of 4096 buckets, 32Kbytes
TCP: Hash tables configured (established 32768 bind 32768)
NET: Registered protocol family 1
NET: Registered protocol family 15
ReiserFS: hdb1: found reiserfs format "3.6" with standard journal
ReiserFS: hdb1: using ordered data mode
ReiserFS: hdb1: journal params: device hdb1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hdb1: checking transaction log (hdb1)
ReiserFS: hdb1: replayed 11 transactions in 0 seconds
ReiserFS: hdb1: Using r5 hash to sort names
VFS: Mounted root (reiserfs filesystem) readonly.
Freeing unused kernel memory: 156k freed
hdc: ATAPI 40X CD-ROM CD-R/RW drive, 4096kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
Adding 2008116k swap on /dev/hdb3.  Priority:-1 extents:1
ReiserFS: hdb1: Removing [6201 106807 0x0 SD]..done
ReiserFS: hdb1: Removing [6201 106340 0x0 SD]..done
ReiserFS: hdb1: Removing [6201 106169 0x0 SD]..done
ReiserFS: hdb1: Removing [6201 103322 0x0 SD]..done
ReiserFS: hdb1: There were 4 uncompleted unlinks/truncates. Completed
Linux Tulip driver version 1.1.13-NAPI (May 11, 2002)
ACPI: PCI interrupt 0000:02:03.0[A] -> GSI 19 (level, low) -> IRQ 19
tulip0:  MII transceiver #1 config 1000 status 786d advertising 05e1.
tulip0:  MII transceiver #2 config 1000 status 786d advertising 05e1.
tulip0:  MII transceiver #3 config 1000 status 786d advertising 05e1.
tulip0:  MII transceiver #4 config 1000 status 786d advertising 05e1.
eth0: ADMtek Comet rev 17 at 0xcc00, 00:04:E2:39:36:B4, IRQ 19.
ReiserFS: hdb2: found reiserfs format "3.6" with standard journal
ReiserFS: hdb2: using ordered data mode
ReiserFS: hdb2: journal params: device hdb2, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hdb2: checking transaction log (hdb2)
ReiserFS: hdb2: Using r5 hash to sort names
kjournald starting.  Commit interval 5 seconds
EXT3 FS on hdb4, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Linux agpgart interface v0.100 (c) Dave Jones
agpgart: Detected an Intel 865 Chipset.
agpgart: Maximum main memory to use for agp memory: 439M
agpgart: AGP aperture is 128M @ 0xf0000000
usbcore: registered new driver usbfs
usbcore: registered new driver hub
USB Universal Host Controller Interface driver v2.2
ACPI: PCI interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) -> IRQ 16
uhci_hcd 0000:00:1d.0: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #1
PCI: Setting latency timer of device 0000:00:1d.0 to 64
uhci_hcd 0000:00:1d.0: irq 16, io base 0000e000
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 1
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 19
uhci_hcd 0000:00:1d.1: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #2
PCI: Setting latency timer of device 0000:00:1d.1 to 64
uhci_hcd 0000:00:1d.1: irq 19, io base 0000e400
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 18
uhci_hcd 0000:00:1d.2: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #3
PCI: Setting latency timer of device 0000:00:1d.2 to 64
uhci_hcd 0000:00:1d.2: irq 18, io base 0000e800
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 3
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:1d.3[A] -> GSI 16 (level, low) -> IRQ 16
uhci_hcd 0000:00:1d.3: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #4
PCI: Setting latency timer of device 0000:00:1d.3 to 64
uhci_hcd 0000:00:1d.3: irq 16, io base 0000ec00
uhci_hcd 0000:00:1d.3: new USB bus registered, assigned bus number 4
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:1d.7[D] -> GSI 23 (level, low) -> IRQ 23
ehci_hcd 0000:00:1d.7: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller
PCI: Setting latency timer of device 0000:00:1d.7 to 64
ehci_hcd 0000:00:1d.7: irq 23, pci mem e09d6c00
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 5
PCI: cache line size of 128 is not supported by device 0000:00:1d.7
ehci_hcd 0000:00:1d.7: USB 2.0 enabled, EHCI 1.00, driver 2004-May-10
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 8 ports detected
ACPI: PCI interrupt 0000:00:1f.5[B] -> GSI 17 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1f.5 to 64
intel8x0_measure_ac97_clock: measured 49569 usecs
intel8x0: clocking to 48000
r8169 Gigabit Ethernet driver 1.2 loaded
ACPI: PCI interrupt 0000:02:06.0[A] -> GSI 20 (level, low) -> IRQ 20
r8169: NAPI enabled
eth1: Identified chip type is 'RTL8169s/8110s'.
eth1: RTL8169 at 0xe0ac8b00, 00:0c:76:91:7d:6b, IRQ 20
r8169: eth1: link up
eth0: Setting full-duplex based on MII#1 link partner capability of 41e1.
nfs warning: mount version older than kernel
nfs warning: mount version older than kernel
nfs warning: mount version older than kernel
nfs warning: mount version older than kernel
nfs warning: mount version older than kernel
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
Real Time Clock Driver v1.12
NET: Registered protocol family 10
Disabled Privacy Extensions on device c03042e0(lo)
IPv6 over IPv4 tunneling driver
IA-32 Microcode Update Driver: v1.14 <tigran@veritas.com>
microcode: CPU1 updated from revision 0x14 to 0x2e, date = 08112004
microcode: CPU0 updated from revision 0x14 to 0x2e, date = 08112004
IA-32 Microcode Update Driver v1.14 unregistered
ACPI: Processor [CPU1] (supports C1, 8 throttling states)
ACPI: Processor [CPU2] (supports C1, 8 throttling states)
ACPI: Power Button (FF) [PWRF]
parport0: PC-style at 0x378 (0x778) [PCSPP,TRISTATE,EPP]
parport0: irq 7 detected
lp0: using parport0 (polling).
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
eth0: no IPv6 routers present
eth1: no IPv6 routers present
nfs warning: mount version older than kernel


lsmod output:

Module                  Size  Used by
nfsd                  216992  8
exportfs                6272  1 nfsd
parport_pc             34496  1
lp                     10376  0
parport                39496  2 parport_pc,lp
binfmt_misc            12040  1
autofs4                18948  1
thermal                13320  0
fan                     4228  0
button                  6800  0
processor              13728  1 thermal
md5                     4224  1
ipv6                  242816  14
rtc                    12744  0
8250                   20992  0
serial_core            22528  1 8250
nfs                   210404  6
lockd                  66120  3 nfsd,nfs
sunrpc                147108  11 nfsd,nfs,lockd
r8169                  20104  0
snd_intel8x0           33356  1
snd_ac97_codec         65232  1 snd_intel8x0
snd_pcm_oss            50344  0
snd_mixer_oss          19072  1 snd_pcm_oss
snd_pcm                92292  2 snd_intel8x0,snd_pcm_oss
snd_timer              24836  1 snd_pcm
snd_page_alloc          9992  2 snd_intel8x0,snd_pcm
gameport                4864  1 snd_intel8x0
snd_mpu401_uart         7936  1 snd_intel8x0
snd_rawmidi            24484  1 snd_mpu401_uart
snd_seq_device          8072  1 snd_rawmidi
snd                    54884  11 snd_intel8x0,snd_ac97_codec,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer,snd_mpu401_uart,snd_rawmidi,snd_seq_device
soundcore               9952  1 snd
ehci_hcd               27908  0
uhci_hcd               31376  0
usbcore               111844  4 ehci_hcd,uhci_hcd
intel_mch_agp          10256  0
intel_agp              21024  1
agpgart                32940  2 intel_mch_agp,intel_agp
evdev                   9088  0
ext3                  116712  1
jbd                    63128  1 ext3
mbcache                 8836  1 ext3
tulip                  46752  0
crc32                   4608  2 r8169,tulip
ide_cd                 40608  0
cdrom                  38172  1 ide_cd

lspci output:

0000:00:00.0 Host bridge: Intel Corp. 82865G/PE/P DRAM Controller/Host-Hub Interface (rev 02)
	Subsystem: Micro-Star International Co., Ltd.: Unknown device 7280
	Flags: bus master, fast devsel, latency 0
	Memory at f0000000 (32-bit, prefetchable) [size=128M]
	Capabilities: [e4] #09 [2106]
	Capabilities: [a0] AGP version 3.0

0000:00:01.0 PCI bridge: Intel Corp. 82865G/PE/P PCI to AGP Controller (rev 02) (prog-if 00 [Normal decode])
	Flags: bus master, 66MHz, fast devsel, latency 32
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
	Memory behind bridge: fc900000-fe9fffff
	Prefetchable memory behind bridge: dff00000-efefffff

0000:00:1d.0 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #1 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Micro-Star International Co., Ltd. 865PE Neo2 (MS-6728)
	Flags: bus master, medium devsel, latency 0, IRQ 16
	I/O ports at e000 [size=32]

0000:00:1d.1 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #2 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Micro-Star International Co., Ltd. 865PE Neo2 (MS-6728)
	Flags: bus master, medium devsel, latency 0, IRQ 19
	I/O ports at e400 [size=32]

0000:00:1d.2 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #3 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Micro-Star International Co., Ltd. 865PE Neo2 (MS-6728)
	Flags: bus master, medium devsel, latency 0, IRQ 18
	I/O ports at e800 [size=32]

0000:00:1d.3 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #4 (rev 02) (prog-if 00 [UHCI])
	Subsystem: Micro-Star International Co., Ltd. 865PE Neo2 (MS-6728)
	Flags: bus master, medium devsel, latency 0, IRQ 16
	I/O ports at ec00 [size=32]

0000:00:1d.7 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller (rev 02) (prog-if 20 [EHCI])
	Subsystem: Micro-Star International Co., Ltd. 865PE Neo2 (MS-6728)
	Flags: bus master, medium devsel, latency 0, IRQ 23
	Memory at febffc00 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [50] Power Management version 2
	Capabilities: [58] #0a [20a0]

0000:00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev c2) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=02, subordinate=02, sec-latency=32
	I/O behind bridge: 0000c000-0000cfff
	Memory behind bridge: fea00000-feafffff

0000:00:1f.0 ISA bridge: Intel Corp. 82801EB/ER (ICH5/ICH5R) LPC Bridge (rev 02)
	Flags: bus master, medium devsel, latency 0

0000:00:1f.1 IDE interface: Intel Corp. 82801EB/ER (ICH5/ICH5R) Ultra ATA 100 Storage Controller (rev 02) (prog-if 8a [Master SecP PriP])
	Subsystem: Micro-Star International Co., Ltd. 865PE Neo2 (MS-6728)
	Flags: bus master, medium devsel, latency 0, IRQ 18
	I/O ports at <unassigned>
	I/O ports at <unassigned>
	I/O ports at <unassigned>
	I/O ports at <unassigned>
	I/O ports at fc00 [size=16]
	Memory at 20000000 (32-bit, non-prefetchable) [size=1K]

0000:00:1f.3 SMBus: Intel Corp. 82801EB/ER (ICH5/ICH5R) SMBus Controller (rev 02)
	Subsystem: Micro-Star International Co., Ltd. 865PE Neo2 (MS-6728)
	Flags: medium devsel, IRQ 17
	I/O ports at 0c00 [size=32]

0000:00:1f.5 Multimedia audio controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) AC'97 Audio Controller (rev 02)
	Subsystem: Micro-Star International Co., Ltd.: Unknown device 0080
	Flags: bus master, medium devsel, latency 0, IRQ 17
	I/O ports at dc00 [size=256]
	I/O ports at d800 [size=64]
	Memory at febffa00 (32-bit, non-prefetchable) [size=512]
	Memory at febff900 (32-bit, non-prefetchable) [size=256]
	Capabilities: [50] Power Management version 2

0000:01:00.0 VGA compatible controller: nVidia Corporation NV11 [GeForce2 MX/MX 400] (rev a1) (prog-if 00 [VGA])
	Subsystem: Guillemot Corporation: Unknown device 7100
	Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 16
	Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
	Memory at e0000000 (32-bit, prefetchable) [size=128M]
	Expansion ROM at fe9f0000 [disabled] [size=64K]
	Capabilities: [60] Power Management version 2
	Capabilities: [44] AGP version 2.0

0000:02:03.0 Ethernet controller: Accton Technology Corporation EN-1216 Ethernet Adapter (rev 11)
	Subsystem: Standard Microsystems Corp [SMC]: Unknown device 1255
	Flags: bus master, medium devsel, latency 32, IRQ 19
	I/O ports at cc00 [size=256]
	Memory at feaffc00 (32-bit, non-prefetchable) [size=1K]
	Expansion ROM at feac0000 [disabled] [size=128K]
	Capabilities: [c0] Power Management version 2

0000:02:06.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
	Subsystem: Micro-Star International Co., Ltd.: Unknown device 728c
	Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 20
	I/O ports at c800 [size=256]
	Memory at feaffb00 (32-bit, non-prefetchable) [size=256]
	Expansion ROM at feaa0000 [disabled] [size=128K]
	Capabilities: [dc] Power Management version 2


dmesg netconsole output:

Linux Tulip driver version 1.1.13-NAPI (May 11, 2002)
ACPI: PCI interrupt 0000:02:03.0[A] -> GSI 19 (level, low) -> IRQ 19
tulip0:  MII transceiver #1 config 1000 status 786d advertising 05e1.
tulip0:  MII transceiver #2 config 1000 status 786d advertising 05e1.
tulip0:  MII transceiver #3 config 1000 status 786d advertising 05e1.
tulip0:  MII transceiver #4 config 1000 status 786d advertising 05e1.
eth0: ADMtek Comet rev 17 at 0xcc00, 00:04:E2:39:36:B4, IRQ 19.
netconsole: local port 6665
netconsole: local IP 172.20.0.117
netconsole: interface eth0
netconsole: remote port 514
netconsole: remote IP 172.20.0.127
netconsole: remote ethernet address 00:02:b3:03:4e:ea
netconsole: device eth0 not up yet, forcing it
netconsole: carrier detect appears flaky, waiting 10 seconds
eth0: Setting full-duplex based on MII#1 link partner capability of 41e1.
netconsole: network logging started
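
Note on the option string: the "netconsole:" lines above line up with the
fields of the option given in the report, which appears to follow the layout
[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac]. The sketch below is a
user-space illustration of that mapping only, not the kernel's parser. The
struct and function names (nc_opts, nc_parse, parse_half, copy_span) are
hypothetical, and the default remote port of 6666 is an assumption; the local
default of 6665 is taken from the log above.

```c
#include <stdlib.h>
#include <string.h>

/* Fields of one netconsole target (illustrative struct, not a kernel type). */
struct nc_opts {
    int  local_port;
    char local_ip[16];
    char dev[16];
    int  remote_port;
    char remote_ip[16];
    char remote_mac[18];
};

/* Copy the bytes in [start, end) into dst and NUL-terminate; -1 if too long. */
static int copy_span(char *dst, size_t cap, const char *start, const char *end)
{
    size_t n = (size_t)(end - start);

    if (n >= cap)
        return -1;
    memcpy(dst, start, n);
    dst[n] = '\0';
    return 0;
}

/* Parse one "[port]@ip/tail" half that spans [s, end). */
static int parse_half(const char *s, const char *end, int *port,
                      char *ip, size_t ip_cap, char *tail, size_t tail_cap)
{
    const char *at = memchr(s, '@', (size_t)(end - s));
    const char *slash;

    if (at == NULL)
        return -1;
    if (at > s)                 /* port given: atoi stops at the '@' */
        *port = atoi(s);
    slash = memchr(at + 1, '/', (size_t)(end - (at + 1)));
    if (slash == NULL)
        return -1;
    if (copy_span(ip, ip_cap, at + 1, slash) != 0)
        return -1;
    return copy_span(tail, tail_cap, slash + 1, end);
}

/* Returns 0 on success, -1 on malformed input. */
int nc_parse(const char *s, struct nc_opts *o)
{
    const char *comma = strchr(s, ',');

    if (comma == NULL)
        return -1;
    o->local_port = 6665;       /* default; matches "local port 6665" in the log */
    o->remote_port = 6666;      /* assumed default when the port is omitted */
    if (parse_half(s, comma, &o->local_port, o->local_ip, sizeof o->local_ip,
                   o->dev, sizeof o->dev) != 0)
        return -1;
    return parse_half(comma + 1, s + strlen(s), &o->remote_port,
                      o->remote_ip, sizeof o->remote_ip,
                      o->remote_mac, sizeof o->remote_mac);
}
```

Calling nc_parse() on the string from the report should yield local port 6665
(nothing precedes the first '@'), interface eth0 and remote port 514, which is
what the log reports.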





* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-16 16:20 Lockup with 2.6.9-ac15 related to netconsole Mark Broadbent
@ 2004-12-16 21:10 ` Matt Mackall
  2004-12-17  9:10   ` Mark Broadbent
  0 siblings, 1 reply; 26+ messages in thread
From: Matt Mackall @ 2004-12-16 21:10 UTC (permalink / raw)
  To: Mark Broadbent; +Cc: linux-kernel

On Thu, Dec 16, 2004 at 04:20:02PM -0000, Mark Broadbent wrote:
> Hi,
> 
> I'm having problem using ethereal/tcpdump in conjunction with the
> netconsole (built as a module).  If the netconsole is loaded and I try to
> launch tcpdump on the same interface as the netconsole is transmitting I
> get a hard lock-up.  The following commands can consistently do this:
> # tcpdump -i eth0
> eth0: Promiscuous Mode Entered
> <... normal output ...>
> ^C
> # modprobe netconsole
> # tcpdump -i eth0
> eth0: Promiscuous Mode Entered
> <4>NMI Watchdog detected LOCKUP

Joy. Can you try it on your other interface to see if it's
driver-specific?

-- 
Mathematics is the supreme nostalgia of our time.


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-16 21:10 ` Matt Mackall
@ 2004-12-17  9:10   ` Mark Broadbent
  2004-12-17 21:57     ` Matt Mackall
  0 siblings, 1 reply; 26+ messages in thread
From: Mark Broadbent @ 2004-12-17  9:10 UTC (permalink / raw)
  To: mpm; +Cc: linux-kernel


Matt Mackall said:
> On Thu, Dec 16, 2004 at 04:20:02PM -0000, Mark Broadbent wrote:
>> Hi,
>>
>> I'm having problem using ethereal/tcpdump in conjunction with the
>> netconsole (built as a module).  If the netconsole is loaded and I try
>> to launch tcpdump on the same interface as the netconsole is
>> transmitting I get a hard lock-up.  The following commands can
>> consistently do this:
>> # tcpdump -i eth0
>> eth0: Promiscuous Mode Entered
>> <... normal output ...>
>> ^C
>> # modprobe netconsole
>> # tcpdump -i eth0
>> eth0: Promiscuous Mode Entered
>> <4>NMI Watchdog detected LOCKUP
>
> Joy. Can you try it on your other interface to see if it's
> driver-specific?

I tried eth1, which uses the r8169 driver, but that driver doesn't support
polling.  I also tried 2.6.10-rc3-bk10, but r8169 still doesn't support
polling there.  eth0 (the tulip driver) still locks up with 2.6.10-rc3-bk10
as well.
Thanks
Mark

-- 
Mark Broadbent <markb@wetlettuce.com>
Web: http://www.wetlettuce.com





* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-17  9:10   ` Mark Broadbent
@ 2004-12-17 21:57     ` Matt Mackall
  2004-12-17 23:35       ` Francois Romieu
  0 siblings, 1 reply; 26+ messages in thread
From: Matt Mackall @ 2004-12-17 21:57 UTC (permalink / raw)
  To: Mark Broadbent; +Cc: linux-kernel

On Fri, Dec 17, 2004 at 09:10:14AM -0000, Mark Broadbent wrote:
> 
> Matt Mackall said:
> > On Thu, Dec 16, 2004 at 04:20:02PM -0000, Mark Broadbent wrote:
> >> Hi,
> >>
> >> I'm having problem using ethereal/tcpdump in conjunction with the
> >> netconsole (built as a module).  If the netconsole is loaded and I try
> >> to launch tcpdump on the same interface as the netconsole is
> >> transmitting I get a hard lock-up.  The following commands can
> >> consistently do this:
> >> # tcpdump -i eth0
> >> eth0: Promiscuous Mode Entered
> >> <... normal output ...>
> >> ^C
> >> # modprobe netconsole
> >> # tcpdump -i eth0
> >> eth0: Promiscuous Mode Entered
> >> <4>NMI Watchdog detected LOCKUP
> >
> > Joy. Can you try it on your other interface to see if it's
> > driver-specific?
> 
> Tried using eth1 which is using the r8169 but it doesn't support polling. 
> I also tried with 2.6.10-rc3-bk10 but it still doesn't support polling. 
> Also it still locks up using eth0 (the tulip driver) with 2.6.10-rc3-bk10.

Please try the attached untested, uncompiled patch to add polling to
r8169:

Index: l/drivers/net/r8169.c
===================================================================
--- l.orig/drivers/net/r8169.c	2004-11-04 10:53:04.779520000 -0800
+++ l/drivers/net/r8169.c	2004-12-17 13:30:35.367771000 -0800
@@ -1120,6 +1120,9 @@
 	dev->weight = R8169_NAPI_WEIGHT;
 	printk(KERN_INFO PFX "NAPI enabled\n");
 #endif
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	dev->poll_controller = rtl8169_netpoll;
+#endif
 	tp->intr_mask = 0xffff;
 	tp->pci_dev = pdev;
 	tp->mmio_addr = ioaddr;
@@ -1839,6 +1842,15 @@
 }
 #endif
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static void rtl8169_netpoll(struct net_device *dev)
+{
+	disable_irq(dev->irq);
+	rtl8169_interrupt(dev->irq, netdev, NULL);
+	enable_irq(dev->irq);
+}
+#endif
+
 static int
 rtl8169_close(struct net_device *dev)
 {


--
Mathematics is the supreme nostalgia of our time.


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-17 21:57     ` Matt Mackall
@ 2004-12-17 23:35       ` Francois Romieu
  2004-12-18 13:25         ` Mark Broadbent
  2004-12-20  9:42         ` Mark Broadbent
  0 siblings, 2 replies; 26+ messages in thread
From: Francois Romieu @ 2004-12-17 23:35 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Mark Broadbent, linux-kernel

Matt Mackall <mpm@selenic.com> :
[...]
> Please try the attached untested, uncompiled patch to add polling to
> r8169:
[...]
> @@ -1839,6 +1842,15 @@
>  }
>  #endif
>  
> +#ifdef CONFIG_NET_POLL_CONTROLLER
> +static void rtl8169_netpoll(struct net_device *dev)
> +{
> +	disable_irq(dev->irq);
> +	rtl8169_interrupt(dev->irq, netdev, NULL);
                                    ^^^^^^ -> should be "dev"

The r8169 driver in -mm offers netpoll. A patch which syncs the r8169
driver from 2.6.10-rc3 with current -mm is available at:
http://www.fr.zoreil.com/people/francois/misc/20041218-2.6.10-rc3-r8169.c-test.patch

Please report success/failure. Cc: netdev@oss.sgi.com is welcome.

--
Ueimor


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-17 23:35       ` Francois Romieu
@ 2004-12-18 13:25         ` Mark Broadbent
  2004-12-20  9:42         ` Mark Broadbent
  1 sibling, 0 replies; 26+ messages in thread
From: Mark Broadbent @ 2004-12-18 13:25 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Matt Mackall, linux-kernel

On Sat, 2004-12-18 at 00:35 +0100, Francois Romieu wrote:
> Matt Mackall <mpm@selenic.com> :
> [...]
> > Please try the attached untested, uncompiled patch to add polling to
> > r8169:
> [...]
> > @@ -1839,6 +1842,15 @@
> >  }
> >  #endif
> >  
> > +#ifdef CONFIG_NET_POLL_CONTROLLER
> > +static void rtl8169_netpoll(struct net_device *dev)
> > +{
> > +	disable_irq(dev->irq);
> > +	rtl8169_interrupt(dev->irq, netdev, NULL);
>                                     ^^^^^^ -> should be "dev"
> 
> The r8169 driver in -mm offers netpoll. A patch which syncs the r8169
> driver from 2.6.10-rc3 with current -mm is available at:
> http://www.fr.zoreil.com/people/francois/misc/20041218-2.6.10-rc3-r8169.c-test.patch
> 
> Please report success/failure. Cc: netdev@oss.sgi.com is welcome.

Will try -mm when I next have access to the hardware (on Monday) and
will report back.

Thanks
Mark

> --
> Ueimor
> 
-- 
Mark Broadbent <markb@wetlettuce.com>


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-17 23:35       ` Francois Romieu
  2004-12-18 13:25         ` Mark Broadbent
@ 2004-12-20  9:42         ` Mark Broadbent
  2004-12-20 21:14           ` Matt Mackall
  1 sibling, 1 reply; 26+ messages in thread
From: Mark Broadbent @ 2004-12-20  9:42 UTC (permalink / raw)
  To: romieu; +Cc: mpm, linux-kernel, netdev


Francois Romieu said:
> Matt Mackall <mpm@selenic.com> :
> [...]
>> Please try the attached untested, uncompiled patch to add polling to
>> r8169:
> [...]
>> @@ -1839,6 +1842,15 @@
>>  }
>>  #endif
>>
>> +#ifdef CONFIG_NET_POLL_CONTROLLER
>> +static void rtl8169_netpoll(struct net_device *dev)
>> +{
>> +	disable_irq(dev->irq);
>> +	rtl8169_interrupt(dev->irq, netdev, NULL);
>                                    ^^^^^^ -> should be "dev"
>
> The r8169 driver in -mm offers netpoll. A patch which syncs the r8169
> driver from 2.6.10-rc3 with current -mm is available at:
> http://www.fr.zoreil.com/people/francois/misc/20041218-2.6.10-rc3-r8169.c-test.patch
>
> Please report success/failure. Cc: netdev@oss.sgi.com is welcome.

Exactly the same happens, I still get a 'NMI Watchdog detected LOCKUP'
with the r8169 device using the above patch on top of 2.6.10-rc3-bk10.
Thanks
Mark

-- 
Mark Broadbent <markb@wetlettuce.com>
Web: http://www.wetlettuce.com





* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-20  9:42         ` Mark Broadbent
@ 2004-12-20 21:14           ` Matt Mackall
  2004-12-21  0:22             ` Francois Romieu
  0 siblings, 1 reply; 26+ messages in thread
From: Matt Mackall @ 2004-12-20 21:14 UTC (permalink / raw)
  To: Mark Broadbent; +Cc: romieu, linux-kernel, netdev

On Mon, Dec 20, 2004 at 09:42:08AM -0000, Mark Broadbent wrote:
> 
> Exactly the same happens, I still get a 'NMI Watchdog detected LOCKUP'
> with the r8169 device using the above patch on top of 2.6.10-rc3-bk10.

Ok, that suggests a problem localized to netpoll itself. Do you have
spinlock debugging turned on by any chance? 

-- 
Mathematics is the supreme nostalgia of our time.


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-20 21:14           ` Matt Mackall
@ 2004-12-21  0:22             ` Francois Romieu
  2004-12-21  0:55               ` Matt Mackall
  0 siblings, 1 reply; 26+ messages in thread
From: Francois Romieu @ 2004-12-21  0:22 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Mark Broadbent, linux-kernel, netdev

Matt Mackall <mpm@selenic.com> :
> On Mon, Dec 20, 2004 at 09:42:08AM -0000, Mark Broadbent wrote:
> > 
> > Exactly the same happens, I still get a 'NMI Watchdog detected LOCKUP'
> > with the r8169 device using the above patch on top of 2.6.10-rc3-bk10.
> 
> Ok, that suggests a problem localized to netpoll itself. Do you have
> spinlock debugging turned on by any chance? 

Any chance of:
1 dev_queue_xmit
2 dev->xmit_lock taken
3 interruption
4 printk
5 netconsole write
6 dev->xmit_lock again
7 lockup

?

This is probably the silly question of the day.
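
The seven steps above can be modeled in userspace. What follows is only a
sketch under stated assumptions: struct model_lock and the model_* functions
are hypothetical stand-ins for dev->xmit_lock and the netpoll send path, not
kernel code. It shows why a second spin_lock() on the same CPU could never
succeed, and how a trylock plus an owner check can tell that case apart from
ordinary contention with another CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of dev->xmit_lock: a flag plus the owning CPU. */
struct model_lock {
	bool locked;
	int owner;	/* CPU holding the lock, -1 when free */
};

enum model_result { MODEL_SENT, MODEL_SELF_DEADLOCK, MODEL_BUSY_OTHER_CPU };

static bool model_trylock(struct model_lock *l, int cpu)
{
	if (l->locked)
		return false;
	l->locked = true;
	l->owner = cpu;
	return true;
}

static void model_unlock(struct model_lock *l, int cpu)
{
	assert(l->locked && l->owner == cpu);
	l->locked = false;
	l->owner = -1;
}

/* Steps 5-6: the netconsole write path wants the lock on `cpu`. */
enum model_result model_netconsole_write(struct model_lock *l, int cpu)
{
	if (model_trylock(l, cpu)) {
		/* the packet would be handed to the driver here */
		model_unlock(l, cpu);
		return MODEL_SENT;
	}
	/* A plain spin_lock() here would spin forever if we are the owner. */
	return l->owner == cpu ? MODEL_SELF_DEADLOCK : MODEL_BUSY_OTHER_CPU;
}

/* Steps 1-4: the transmit path takes the lock, then a printk fires on the same CPU. */
enum model_result model_xmit_then_printk(struct model_lock *l, int cpu)
{
	enum model_result r;

	if (!model_trylock(l, cpu))
		return MODEL_BUSY_OTHER_CPU;
	r = model_netconsole_write(l, cpu);	/* nested call while the lock is held */
	model_unlock(l, cpu);
	return r;
}
```

Calling model_xmit_then_printk() on a free lock reports the self-deadlock
case instead of hanging, which is the situation step 7 describes.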

--
Ueimor


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-21  0:22             ` Francois Romieu
@ 2004-12-21  0:55               ` Matt Mackall
  2004-12-21 10:23                 ` Mark Broadbent
  0 siblings, 1 reply; 26+ messages in thread
From: Matt Mackall @ 2004-12-21  0:55 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Mark Broadbent, linux-kernel, netdev

On Tue, Dec 21, 2004 at 01:22:18AM +0100, Francois Romieu wrote:
> Matt Mackall <mpm@selenic.com> :
> > On Mon, Dec 20, 2004 at 09:42:08AM -0000, Mark Broadbent wrote:
> > > 
> > > Exactly the same happens, I still get a 'NMI Watchdog detected LOCKUP'
> > > with the r8169 device using the above patch on top of 2.6.10-rc3-bk10.
> > 
> > Ok, that suggests a problem localized to netpoll itself. Do you have
> > spinlock debugging turned on by any chance? 
> 
> Any chance of:
> 1 dev_queue_xmit
> 2 dev->xmit_lock taken
> 3 interruption
> 4 printk
> 5 netconsole write
> 6 dev->xmit_lock again
> 7 lockup
> 
> ?
> 
> This is probably the silly question of the day.

Maybe, but the answer isn't obvious to me at the moment as I haven't
been thinking about such stuff enough lately. Silly response of the
day:

Mark, can you try this (again completely untested, but at least
compiles) patch? I'm afraid I don't have a proper test rig to
reproduce this at the moment. This will attempt to grab the lock, and
if it fails, will check for recursion. Then it will try to print a
message on the local console, temporarily disabling netconsole to
allow the printk to get through..

Index: l/net/core/netpoll.c
===================================================================
--- l.orig/net/core/netpoll.c	2004-11-04 10:53:23.388610000 -0800
+++ l/net/core/netpoll.c	2004-12-20 16:45:40.212709000 -0800
@@ -31,6 +31,8 @@
 #define MAX_SKBS 32
 #define MAX_UDP_CHUNK 1460
 
+static int netpoll_kill;
+
 static spinlock_t skb_list_lock = SPIN_LOCK_UNLOCKED;
 static int nr_skbs;
 static struct sk_buff *skbs;
@@ -183,13 +185,24 @@
 	int status;
 
 repeat:
-	if(!np || !np->dev || !netif_running(np->dev)) {
+	if(!np || !np->dev || !netif_running(np->dev) || netpoll_kill) {
 		__kfree_skb(skb);
 		return;
 	}
 
-	spin_lock(&np->dev->xmit_lock);
-	np->dev->xmit_lock_owner = smp_processor_id();
+	if(spin_trylock(&np->dev->xmit_lock))
+		np->dev->xmit_lock_owner = smp_processor_id();
+	else {
+		if(np->dev->xmit_lock_owner == smp_processor_id()) {
+			netpoll_kill = 1;
+			__kfree_skb(skb);
+			printk("Tried to recursively get dev->xmit_lock");
+			netpoll_kill = 0;
+			return;
+		}
+		spin_lock(&np->dev->xmit_lock);
+
+	}
 
 	/*
 	 * network drivers do not expect to be called if the queue is


-- 
Mathematics is the supreme nostalgia of our time.


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-21  0:55               ` Matt Mackall
@ 2004-12-21 10:23                 ` Mark Broadbent
  2004-12-21 12:37                   ` Francois Romieu
  0 siblings, 1 reply; 26+ messages in thread
From: Mark Broadbent @ 2004-12-21 10:23 UTC (permalink / raw)
  To: mpm; +Cc: romieu, linux-kernel, netdev


Matt Mackall said:
> On Tue, Dec 21, 2004 at 01:22:18AM +0100, Francois Romieu wrote:
>> Matt Mackall <mpm@selenic.com> :
>> > On Mon, Dec 20, 2004 at 09:42:08AM -0000, Mark Broadbent wrote:
>> > >
>> > > Exactly the same happens, I still get a 'NMI Watchdog detected
>> > > LOCKUP' with the r8169 device using the above patch on top of
>> > > 2.6.10-rc3-bk10.
>> >
>> > Ok, that suggests a problem localized to netpoll itself. Do you have
>> > spinlock debugging turned on by any chance?
>>
>> Any chance of:
>> 1 dev_queue_xmit
>> 2 dev->xmit_lock taken
>> 3 interruption
>> 4 printk
>> 5 netconsole write
>> 6 dev->xmit_lock again
>> 7 lockup
>>
>> ?
>>
>> This is probably the silly question of the day.
>
> Maybe, but the answer isn't obvious to me at the moment as I haven't
> been thinking about such stuff enough lately. Silly response of the
> day:
>
> Mark, can you try this (again completely untested, but at least
> compiles) patch? I'm afraid I don't have a proper test rig to
> reproduce this at the moment. This will attempt to grab the lock, and
> if it fails, will check for recursion. Then it will try to print a
> message on the local console, temporarily disabling netconsole to
> allow the printk to get through..

OK, patch applied and spinlock debugging enabled.  Testing with eth1
(r8169) doesn't yield any additional messages, just the standard
'NMI Watchdog detected lockup'.
Thanks
Mark

-- 
Mark Broadbent <markb@wetlettuce.com>
Web: http://www.wetlettuce.com





* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-21 10:23                 ` Mark Broadbent
@ 2004-12-21 12:37                   ` Francois Romieu
  2004-12-21 13:29                     ` Mark Broadbent
  0 siblings, 1 reply; 26+ messages in thread
From: Francois Romieu @ 2004-12-21 12:37 UTC (permalink / raw)
  To: Mark Broadbent; +Cc: mpm, romieu, linux-kernel, netdev

Mark Broadbent <markb@wetlettuce.com> :
[...]
> OK, patch applied and spinlock debugging enabled.  Testing with eth1
> (r8169) doesn't yield any additional messages, just the standard
> 'NMI Watchdog detected lockup'.

Does the modified version below trigger _exactly_ the same hang ?

--- net/core/netpoll.c	2004-12-21 13:09:51.000000000 +0100
+++ net/core/netpoll.c	2004-12-21 13:27:01.000000000 +0100
@@ -31,6 +31,8 @@
 #define MAX_SKBS 32
 #define MAX_UDP_CHUNK 1460
 
+static int netpoll_kill;
+
 static spinlock_t skb_list_lock = SPIN_LOCK_UNLOCKED;
 static int nr_skbs;
 static struct sk_buff *skbs;
@@ -184,11 +186,21 @@ void netpoll_send_skb(struct netpoll *np
 
 repeat:
 	if(!np || !np->dev || !netif_running(np->dev)) {
+too_bad:
 		__kfree_skb(skb);
 		return;
 	}
 
-	spin_lock(&np->dev->xmit_lock);
+	if (!spin_trylock(&np->dev->xmit_lock)) {
+		netpoll_kill = 1;
+		goto too_bad;
+	}
+
+	if (netpoll_kill) {
+		if (net_ratelimit())
+			printk(KERN_ERR "netconsole raced");
+		netpoll_kill = 0;
+	}
 	np->dev->xmit_lock_owner = smp_processor_id();
 
 	/*


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-21 12:37                   ` Francois Romieu
@ 2004-12-21 13:29                     ` Mark Broadbent
  2004-12-21 20:48                       ` Francois Romieu
  0 siblings, 1 reply; 26+ messages in thread
From: Mark Broadbent @ 2004-12-21 13:29 UTC (permalink / raw)
  To: romieu; +Cc: mpm, linux-kernel, netdev


Francois Romieu said:
> Mark Broadbent <markb@wetlettuce.com> :
> [...]
>> OK, patch applied and spinlock debugging enabled.  Testing with eth1
>> (r8169) doesn't yield any additional messages, just the standard
>> 'NMI Watchdog detected lockup'.
>
> Does the modified version below trigger _exactly_ the same hang ?

Using the patch supplied I get no hang, just the message 'netconsole
raced' output to the console and the packet capture proceeds as normal.
Thanks
Mark

-- 
Mark Broadbent <markb@wetlettuce.com>
Web: http://www.wetlettuce.com





* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-21 13:29                     ` Mark Broadbent
@ 2004-12-21 20:48                       ` Francois Romieu
  2004-12-21 21:27                         ` Matt Mackall
  0 siblings, 1 reply; 26+ messages in thread
From: Francois Romieu @ 2004-12-21 20:48 UTC (permalink / raw)
  To: Mark Broadbent; +Cc: mpm, linux-kernel, netdev

Mark Broadbent <markb@wetlettuce.com> :
[...]
> Using the patch supplied I get no hang, just the message 'netconsole
> raced' output to the console and the packet capture proceeds as normal.
> Thanks

The patch is more a bandaid for debugging than a real fix. The netconsole
will drop some messages until its locking is fixed.
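
The drop-and-report behaviour of that debugging patch can be sketched in
userspace as follows. This is a simplified model, not the kernel code:
struct model_np and model_send() are hypothetical names, the raced flag
plays the role of netpoll_kill, and the net_ratelimit() check is left out.

```c
#include <assert.h>
#include <stdbool.h>

struct model_np {
	bool xmit_locked;	/* stands in for dev->xmit_lock being held */
	bool raced;		/* plays the role of netpoll_kill in the patch */
	unsigned int sent;
	unsigned int dropped;
	unsigned int reports;	/* times "netconsole raced" would be printed */
};

/* Returns true if the message went out, false if it was dropped. */
bool model_send(struct model_np *np)
{
	assert(np);
	if (np->xmit_locked) {
		/* trylock failed: remember it and drop instead of spinning */
		np->raced = true;
		np->dropped++;
		return false;
	}
	if (np->raced) {
		/* first successful send after a race: report it once */
		np->reports++;
		np->raced = false;
	}
	np->sent++;
	return true;
}
```

Every message that arrives while the lock is held is lost, and only one
report is produced per burst of drops, which matches the single
'netconsole raced' line seen in the test above.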

If you can issue one more test, I'd like to know if some messages appear
on the VGA console around the time at which tcpdump is started (the test
assumes that netconsole is not used/insmoded at all). Please check that
the console log_level is set high enough as it will be really disappointing
if nothing appears :o)

--
Ueimor


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-21 20:48                       ` Francois Romieu
@ 2004-12-21 21:27                         ` Matt Mackall
  2004-12-21 22:58                           ` Francois Romieu
  0 siblings, 1 reply; 26+ messages in thread
From: Matt Mackall @ 2004-12-21 21:27 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Mark Broadbent, linux-kernel, netdev

On Tue, Dec 21, 2004 at 09:48:53PM +0100, Francois Romieu wrote:
> Mark Broadbent <markb@wetlettuce.com> :
> [...]
> > Using the patch supplied I get no hang, just the message 'netconsole
> > raced' output to the console and the packet capture proceeds as normal.
> > Thanks
> 
> The patch is more a bandaid for debugging than a real fix. The netconsole
> will drop some messages until its locking is fixed

Unfortunately there's no good way to fix its locking in this
circumstance (or the harder case of driver-private locks). I think
I'll just have to come up with some scheme for queueing messages that
arrive when the queue lock is held.

> If you can issue one more test, I'd like to know if some messages appear
> on the VGA console around the time at which tcpdump is started (the test
> assumes that netconsole is not used/insmoded at all). Please check that
> the console log_level is set high enough as it will be really disappointing
> if nothing appears :o)

I think it's the promiscuous mode message itself that's the problem
but I've not had time to reproduce it.

-- 
Mathematics is the supreme nostalgia of our time.


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-21 21:27                         ` Matt Mackall
@ 2004-12-21 22:58                           ` Francois Romieu
  2004-12-22  9:34                             ` Patrick McHardy
  0 siblings, 1 reply; 26+ messages in thread
From: Francois Romieu @ 2004-12-21 22:58 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Mark Broadbent, linux-kernel, netdev

Matt Mackall <mpm@selenic.com> :
[...]
> I think it's the promiscuous mode message itself that's the problem

Yes. dev_mc_add takes dev->xmit_lock and the game is over.

Marc, if the patch below happens to work, it should not drop messages
like the previous one (it is an ugly short-term suggestion).

--- net/core/netpoll.c	2004-12-21 13:09:51.000000000 +0100
+++ net/core/netpoll.c	2004-12-21 23:35:25.000000000 +0100
@@ -22,6 +22,7 @@
 #include <net/tcp.h>
 #include <net/udp.h>
 #include <asm/unaligned.h>
+#include <net/pkt_sched.h>
 
 /*
  * We maintain a small pool of fully-sized skbs, to make sure the
@@ -184,11 +187,19 @@ void netpoll_send_skb(struct netpoll *np
 
 repeat:
 	if(!np || !np->dev || !netif_running(np->dev)) {
 		__kfree_skb(skb);
 		return;
 	}
 
-	spin_lock(&np->dev->xmit_lock);
+	while (!spin_trylock(&np->dev->xmit_lock)) {
+		if (np->dev->xmit_lock_owner == smp_processor_id()) {
+			struct Qdisc *q = dev->qdisc;
+
+			q->ops->enqueue(skb, q);
+			return;
+		}
+	}
+
 	np->dev->xmit_lock_owner = smp_processor_id();
 
 	/*


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-21 22:58                           ` Francois Romieu
@ 2004-12-22  9:34                             ` Patrick McHardy
  2004-12-22 10:54                               ` Patrick McHardy
  0 siblings, 1 reply; 26+ messages in thread
From: Patrick McHardy @ 2004-12-22  9:34 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Matt Mackall, Mark Broadbent, linux-kernel, netdev

Francois Romieu wrote:
> Marc, if the patch below happens to work, it should not drop messages
> like the previous one (it is an ugly short-term suggestion).
> 

> -	spin_lock(&np->dev->xmit_lock);
> +	while (!spin_trylock(&np->dev->xmit_lock)) {
> +		if (np->dev->xmit_lock_owner == smp_processor_id()) {
> +			struct Qdisc *q = dev->qdisc;
> +
> +			q->ops->enqueue(skb, q);

Shouldn't this be requeue ?

Regards
Patrick


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-22  9:34                             ` Patrick McHardy
@ 2004-12-22 10:54                               ` Patrick McHardy
  2004-12-22 12:39                                 ` Francois Romieu
  2004-12-22 14:37                                 ` Mark Broadbent
  0 siblings, 2 replies; 26+ messages in thread
From: Patrick McHardy @ 2004-12-22 10:54 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Matt Mackall, Mark Broadbent, linux-kernel, netdev

Patrick McHardy wrote:
> Francois Romieu wrote:
> 
>> Marc, if the patch below happens to work, it should not drop messages
>> like the previous one (it is an ugly short-term suggestion).
>>
> 
>> -    spin_lock(&np->dev->xmit_lock);
>> +    while (!spin_trylock(&np->dev->xmit_lock)) {
>> +        if (np->dev->xmit_lock_owner == smp_processor_id()) {
>> +            struct Qdisc *q = dev->qdisc;
>> +
>> +            q->ops->enqueue(skb, q);
> 
> 
> Shouldn't this be requeue ?

Since the code doesn't dequeue itself, enqueue seems fine to keep
at least the queued messages ordered. But you need to grab
dev->queue_lock, otherwise you risk corrupting qdisc internal data.
You should probably also deal with the noqueue-qdisc, which doesn't
have an enqueue function. So it should look something like this:

while (!spin_trylock(&np->dev->xmit_lock)) {
	if (np->dev->xmit_lock_owner == smp_processor_id()) {
		struct Qdisc *q;

		rcu_read_lock();
		q = rcu_dereference(dev->qdisc);
		if (q->enqueue) {
			spin_lock(&dev->queue_lock);
			q->ops->enqueue(skb, q);
			spin_unlock(&dev->queue_lock);
			netif_schedule(np->dev);
		} else
			kfree_skb(skb);
		rcu_read_unlock();
		return;
	}
}

Regards
Patrick


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-22 10:54                               ` Patrick McHardy
@ 2004-12-22 12:39                                 ` Francois Romieu
  2004-12-22 13:33                                   ` jamal
  2004-12-22 14:57                                   ` Patrick McHardy
  2004-12-22 14:37                                 ` Mark Broadbent
  1 sibling, 2 replies; 26+ messages in thread
From: Francois Romieu @ 2004-12-22 12:39 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Matt Mackall, Mark Broadbent, linux-kernel, netdev

Patrick McHardy <kaber@trash.net> :
[...]
> at least the queued messages ordered. But you need to grab
> dev->queue_lock, otherwise you risk corrupting qdisc internal data.
> You should probably also deal with the noqueue-qdisc, which doesn't
> have an enqueue function. So it should look something like this:

If I am not mistaken, a failure on spin_trylock + the test on
xmit_lock_owner imply that it is safe to directly handle the
queue. It means that qdisc_run() has been interrupted on the
current cpu and the other paths seem fine as well. Counter-example
is welcome (no joke).

Of course the patch is completely ugly and violates any layering
principle one could think of. It was not submitted for inclusion :o)

> while (!spin_trylock(&np->dev->xmit_lock)) {
> 	if (np->dev->xmit_lock_owner == smp_processor_id()) {
> 		struct Qdisc *q;
> 
> 		rcu_read_lock();
> 		q = rcu_dereference(dev->qdisc);
> 		if (q->enqueue) {
> 			spin_lock(&dev->queue_lock);

I'd expect it to deadlock if dev_queue_xmit -> qdisc_run is interrupted
on the current cpu and a printk is issued as dev->queue_lock will have
been taken elsewhere.

--
Ueimor


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-22 12:39                                 ` Francois Romieu
@ 2004-12-22 13:33                                   ` jamal
  2004-12-22 14:57                                   ` Patrick McHardy
  1 sibling, 0 replies; 26+ messages in thread
From: jamal @ 2004-12-22 13:33 UTC (permalink / raw)
  To: Francois Romieu
  Cc: Patrick McHardy, Matt Mackall, Mark Broadbent, linux-kernel, netdev

On Wed, 2004-12-22 at 07:39, Francois Romieu wrote:

> If I am not mistaken, a failure on spin_trylock + the test on
> xmit_lock_owner imply that it is safe to directly handle the
> queue. It means that qdisc_run() has been interrupted on the
> current cpu and the other paths seem fine as well. Counter-example
> is welcome (no joke).

Think more than 2 processors ;-> 

cheers,
jamal



* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-22 10:54                               ` Patrick McHardy
  2004-12-22 12:39                                 ` Francois Romieu
@ 2004-12-22 14:37                                 ` Mark Broadbent
  1 sibling, 0 replies; 26+ messages in thread
From: Mark Broadbent @ 2004-12-22 14:37 UTC (permalink / raw)
  To: kaber; +Cc: romieu, mpm, linux-kernel, netdev


Patrick McHardy said:
> Patrick McHardy wrote:
>> Francois Romieu wrote:
>>
>>> Marc, if the patch below happens to work, it should not drop messages
>>> like the previous one (it is an ugly short-term suggestion).
>>>
>>
>>> -    spin_lock(&np->dev->xmit_lock);
>>> +    while (!spin_trylock(&np->dev->xmit_lock)) {
>>> +        if (np->dev->xmit_lock_owner == smp_processor_id()) {
>>> +            struct Qdisc *q = dev->qdisc;
>>> +
>>> +            q->ops->enqueue(skb, q);
>>
>>
>> Shouldn't this be requeue ?
>
> Since the code doesn't dequeue itself, enqueue seems fine to keep
> at least the queued messages ordered. But you need to grab
> dev->queue_lock, otherwise you risk corrupting qdisc internal data. You
> should probably also deal with the noqueue-qdisc, which doesn't have an
> enqueue function. So it should look something like this:
>
> while (!spin_trylock(&np->dev->xmit_lock)) {
> 	if (np->dev->xmit_lock_owner == smp_processor_id()) {
> 		struct Qdisc *q;
>
> 		rcu_read_lock();
> 		q = rcu_dereference(dev->qdisc);
> 		if (q->enqueue) {
> 			spin_lock(&dev->queue_lock);
> 			q->ops->enqueue(skb, q);
> 			spin_unlock(&dev->queue_lock);
> 			netif_schedule(np->dev);
> 		} else
> 			kfree_skb(skb);
> 		rcu_read_unlock();
> 		return;
> 	}
> }

I've tried both patches (modified slightly to get them to compile) but
they both produce hard NMI detected lockups (as before).
Thanks
Mark

Patches after modification to allow compilation:

Francois' patch (against 2.6.10-rc3-bk10):

diff -X dontdiff -urN linux-2.6.9-rc3-bk10.orig/net/core/netpoll.c linux-2.6.9-rc3-bk10/net/core/netpoll.c
--- linux-2.6.9-rc3-bk10.orig/net/core/netpoll.c	2004-12-22 12:09:32.000000000 +0000
+++ linux-2.6.9-rc3-bk10/net/core/netpoll.c	2004-12-22 14:13:54.000000000 +0000
@@ -22,6 +22,7 @@
 #include <net/tcp.h>
 #include <net/udp.h>
 #include <asm/unaligned.h>
+#include <net/pkt_sched.h>

 /*
  * We maintain a small pool of fully-sized skbs, to make sure the
@@ -188,7 +189,15 @@
 		return;
 	}

-	spin_lock(&np->dev->xmit_lock);
+	while (!spin_trylock(&np->dev->xmit_lock)) {
+		if (np->dev->xmit_lock_owner == smp_processor_id()) {
+			struct Qdisc *q = np->dev->qdisc;
+
+			q->ops->enqueue(skb, q);
+			return;
+		}
+	}
+
 	np->dev->xmit_lock_owner = smp_processor_id();

 	/*

Patrick's patch (against 2.6.10-rc3-bk10):

diff -X dontdiff -urN linux-2.6.9-rc3-bk10.orig/net/core/netpoll.c linux-2.6.9-rc3-bk10/net/core/netpoll.c
--- linux-2.6.9-rc3-bk10.orig/net/core/netpoll.c	2004-12-22 12:09:32.000000000 +0000
+++ linux-2.6.9-rc3-bk10/net/core/netpoll.c	2004-12-22 11:08:06.000000000 +0000
@@ -22,6 +22,7 @@
 #include <net/tcp.h>
 #include <net/udp.h>
 #include <asm/unaligned.h>
+#include <net/pkt_sched.h>

 /*
  * We maintain a small pool of fully-sized skbs, to make sure the
@@ -188,7 +189,24 @@
 		return;
 	}

-	spin_lock(&np->dev->xmit_lock);
+	while (!spin_trylock(&np->dev->xmit_lock)) {
+		if (np->dev->xmit_lock_owner == smp_processor_id()) {
+			struct Qdisc *q;
+
+			rcu_read_lock();
+			q = rcu_dereference(np->dev->qdisc);
+			if (q->enqueue) {
+				spin_lock(&np->dev->queue_lock);
+				q->ops->enqueue(skb, q);
+				spin_unlock(&np->dev->queue_lock);
+				netif_schedule(np->dev);
+			} else
+				__kfree_skb(skb);
+			rcu_read_unlock();
+			return;
+		}
+	}
+
 	np->dev->xmit_lock_owner = smp_processor_id();

 	/*


-- 
Mark Broadbent <markb@wetlettuce.com>
Web: http://www.wetlettuce.com





* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-22 12:39                                 ` Francois Romieu
  2004-12-22 13:33                                   ` jamal
@ 2004-12-22 14:57                                   ` Patrick McHardy
  2004-12-22 17:18                                     ` Matt Mackall
  1 sibling, 1 reply; 26+ messages in thread
From: Patrick McHardy @ 2004-12-22 14:57 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Matt Mackall, Mark Broadbent, linux-kernel, netdev

Francois Romieu wrote:
> Patrick McHardy <kaber@trash.net> :
> [...]
> 
>>at least the queued messages ordered. But you need to grab
>>dev->queue_lock, otherwise you risk corrupting qdisc internal data.
>>You should probably also deal with the noqueue-qdisc, which doesn't
>>have an enqueue function. So it should look something like this:
> 
> 
> If I am not mistaken, a failure on spin_trylock + the test on
> xmit_lock_owner imply that it is safe to directly handle the
> queue. It means that qdisc_run() has been interrupted on the
> current cpu and the other paths seem fine as well. Counter-example
> is welcome (no joke).

enqueue is only protected by dev->queue_lock, and dev->queue_lock
is dropped as soon as dev->xmit_lock is grabbed, so any other CPU
might call enqueue at the same time.

Example:

CPU1					CPU2

dev_queue_xmit				dev_queue_xmit
  lock(dev->queue_lock)			 lock(dev->queue_lock)
q->enqueue
qdisc_run
qdisc_restart
  trylock(dev->xmit_lock), ok
  unlock(dev->queue_lock)
...
printk("something")
...
netpoll_send_skb
  trylock(dev->xmit_lock), fails
q->enqueue				q->enqueue
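
The last line of the example, where both CPUs run q->enqueue at once, can
be made concrete with a deterministic userspace model. This is only a
sketch: struct model_qdisc is a hypothetical array-backed queue (a real
qdisc is more involved), and the two-step split of enqueue is an assumption
made to show the interleaving without real threads.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_QLEN 8

struct model_qdisc {
	int slots[MODEL_QLEN];
	size_t tail;
};

/* An unlocked enqueue is really two steps; another CPU can run in between. */
size_t model_enqueue_load(const struct model_qdisc *q)
{
	return q->tail;			/* step 1: read the tail */
}

void model_enqueue_store(struct model_qdisc *q, size_t seen_tail, int skb)
{
	assert(seen_tail < MODEL_QLEN);
	q->slots[seen_tail] = skb;	/* step 2: write the slot ... */
	q->tail = seen_tail + 1;	/* ... and publish the new tail */
}

/* Serialized: each CPU completes both steps in turn, as with queue_lock held. */
size_t model_locked_pair(struct model_qdisc *q, int skb_cpu1, int skb_cpu2)
{
	model_enqueue_store(q, model_enqueue_load(q), skb_cpu1);
	model_enqueue_store(q, model_enqueue_load(q), skb_cpu2);
	return q->tail;
}

/* Interleaved: both CPUs read the same tail before either one stores. */
size_t model_racy_pair(struct model_qdisc *q, int skb_cpu1, int skb_cpu2)
{
	size_t t1 = model_enqueue_load(q);
	size_t t2 = model_enqueue_load(q);

	model_enqueue_store(q, t1, skb_cpu1);
	model_enqueue_store(q, t2, skb_cpu2);	/* overwrites CPU1's packet */
	return q->tail;
}
```

In the serialized case the queue ends up holding both packets; in the
interleaved case it holds one, and CPU1's packet is silently lost.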

> Of course the patch is completely ugly and violates any layering
> principle one could think of. It was not submitted for inclusion :o)

Sure, but I think we should have a short-term workaround until
a better solution has been invented. Maybe dropping the packets
would be best for now, it only affects printks issued in paths
starting at qdisc_restart (-> hard_start_xmit -> ...). Queueing
the packets might also cause reordering since not all packets
are queued.

>>while (!spin_trylock(&np->dev->xmit_lock)) {
>>	if (np->dev->xmit_lock_owner == smp_processor_id()) {
>>		struct Qdisc *q;
>>
>>		rcu_read_lock();
>>		q = rcu_dereference(dev->qdisc);
>>		if (q->enqueue) {
>>			spin_lock(&dev->queue_lock);
> 
> 
> I'd expect it to deadlock if dev_queue_xmit -> qdisc_run is interrupted
> on the current cpu and a printk is issued as dev->queue_lock will have
> been taken elsewhere.

Hmm this is complicated, let me think some more about it.

Regards
Patrick


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-22 14:57                                   ` Patrick McHardy
@ 2004-12-22 17:18                                     ` Matt Mackall
  2004-12-25 11:26                                       ` Wish you all a Merry Christmas Pranav
  2004-12-28 13:45                                       ` Lockup with 2.6.9-ac15 related to netconsole jamal
  0 siblings, 2 replies; 26+ messages in thread
From: Matt Mackall @ 2004-12-22 17:18 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Francois Romieu, Mark Broadbent, linux-kernel, netdev

On Wed, Dec 22, 2004 at 03:57:57PM +0100, Patrick McHardy wrote:
> >Of course the patch is completely ugly and violates any layering
> >principle one could think of. It was not submitted for inclusion :o)
> 
> Sure, but I think we should have a short-term workaround until
> a better solution has been invented. Maybe dropping the packets
> would be best for now, it only affects printks issued in paths
> starting at qdisc_restart (-> hard_start_xmit -> ...). Queueing
> the packets might also cause reordering since not all packets
> are queued.

When I mentioned queueing, I was thinking of a netpoll-private queue
that would be hooked to a softirq or some such so that it would be
pushed out as soon as possible. Dropping may be the better approach as
queueing throws away netpoll's immediacy and ordering properties. And
getting netpoll _more_ tangled in the net stack mechanics is
definitely a step in the wrong direction.
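
The trade-off between the two policies can be sketched in userspace. All
names here are hypothetical (struct model_netpoll, model_send_or_drop,
model_send_or_queue, model_flush); xmit_busy stands in for a failed trylock
on dev->xmit_lock and model_flush() for the deferred softirq-style push.
Nothing below is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_QUEUE_LEN 8	/* hypothetical bound on the private queue */
#define MODEL_WIRE_LEN 16

struct model_netpoll {
	bool xmit_busy;			/* a trylock on dev->xmit_lock would fail */
	int queue[MODEL_QUEUE_LEN];	/* private backlog of message ids */
	size_t queued;
	int wire[MODEL_WIRE_LEN];	/* message ids in the order they were sent */
	size_t sent;
	size_t dropped;
};

static void model_put_on_wire(struct model_netpoll *np, int msg)
{
	assert(np->sent < MODEL_WIRE_LEN);
	np->wire[np->sent++] = msg;
}

/* Policy 1: drop whenever the lock cannot be taken. */
void model_send_or_drop(struct model_netpoll *np, int msg)
{
	if (np->xmit_busy) {
		np->dropped++;
		return;
	}
	model_put_on_wire(np, msg);
}

/* Policy 2: park the message on a private queue when the lock is busy. */
void model_send_or_queue(struct model_netpoll *np, int msg)
{
	if (!np->xmit_busy) {
		model_put_on_wire(np, msg);
		return;
	}
	if (np->queued < MODEL_QUEUE_LEN)
		np->queue[np->queued++] = msg;
	else
		np->dropped++;
}

/* Deferred flush: pushes the backlog out once the lock is free again. */
void model_flush(struct model_netpoll *np)
{
	size_t i;

	if (np->xmit_busy)
		return;
	for (i = 0; i < np->queued; i++)
		model_put_on_wire(np, np->queue[i]);
	np->queued = 0;
}
```

If message 1 arrives while the lock is busy and message 2 arrives just
after it is released but before the flush runs, the queueing policy sends
2 before 1, while the drop policy sends only 2. That is the loss of
ordering and immediacy weighed above against simply dropping.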

More generally, I'm tempted to add some warn_on style functionality so
that printks in such troublesome paths can be lifted out.

-- 
Mathematics is the supreme nostalgia of our time.


* Wish you all a Merry Christmas
  2004-12-22 17:18                                     ` Matt Mackall
@ 2004-12-25 11:26                                       ` Pranav
  2004-12-25 11:30                                         ` Jan Engelhardt
  2004-12-28 13:45                                       ` Lockup with 2.6.9-ac15 related to netconsole jamal
  1 sibling, 1 reply; 26+ messages in thread
From: Pranav @ 2004-12-25 11:26 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel

Hi everyone,
Wishing you all A Prosperous Merry ChristMas.
Hope Coming years brings Peace,Happiness,blessings of CHRISTmas to you all
,your family and this World.

With Regards,
Pranav.



* Re: Wish you all a Merry Christmas
  2004-12-25 11:26                                       ` Wish you all a Merry Christmas Pranav
@ 2004-12-25 11:30                                         ` Jan Engelhardt
  0 siblings, 0 replies; 26+ messages in thread
From: Jan Engelhardt @ 2004-12-25 11:30 UTC (permalink / raw)
  To: Pranav; +Cc: netdev, linux-kernel

>Wishing you all A Prosperous Merry ChristMas.
>Hope Coming years brings Peace,Happiness,blessings of CHRISTmas to you all
>,your family and this World.
>
>With Regards,
>Pranav.

I don't see how this is related to linux-kernel.




Jan Engelhardt
-- 
ENOSPC


* Re: Lockup with 2.6.9-ac15 related to netconsole
  2004-12-22 17:18                                     ` Matt Mackall
  2004-12-25 11:26                                       ` Wish you all a Merry Christmas Pranav
@ 2004-12-28 13:45                                       ` jamal
  1 sibling, 0 replies; 26+ messages in thread
From: jamal @ 2004-12-28 13:45 UTC (permalink / raw)
  To: Matt Mackall
  Cc: Patrick McHardy, Francois Romieu, Mark Broadbent, linux-kernel, netdev

On Wed, 2004-12-22 at 12:18, Matt Mackall wrote:
> On Wed, Dec 22, 2004 at 03:57:57PM +0100, Patrick McHardy wrote:
> > >Of course the patch is completely ugly and violates any layering
> > >principle one could think of. It was not submitted for inclusion :o)
> > 
> > Sure, but I think we should have a short-term workaround until
> > a better solution has been invented. Maybe dropping the packets
> > would be best for now, it only affects printks issued in paths
> > starting at qdisc_restart (-> hard_start_xmit -> ...). Queueing
> > the packets might also cause reordering since not all packets
> > are queued.
> 
> When I mentioned queueing, I was thinking of a netpoll-private queue
> that would be hooked to a softirq or some such so that it would be
> pushed out as soon as possible. Dropping may be the better approach 

I think so - just junk those packets. 

cheers,
jamal



