* large packet loss take2 2.6.31.x
From: Caleb Cushing @ 2009-10-31 14:21 UTC
  To: Linux Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 1089 bytes --]

Ever since Arch rolled out 2.6.31.x I've been having problems with
my network (again): I'm losing a large share of packets (testing
with mtr shows somewhere between 30% and 50% loss). At first I figured
it was the same problem I had in 2.6.30.x (and maybe it is?), but that
appeared to get fixed. When I started bisecting, the bug wasn't
apparent in 2.6.31.0, but I knew for sure it was in .5 (I couldn't
remember whether I had noticed it again in .3).

I'm attaching the bisection log and a 'good' dmesg output.

c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9 is the first bad commit

I'm not going to pretend to understand why this patch is breaking my
networking, but between bisection and testing it appears to be. I've
never bisected before, and I'm definitely not a kernel hacker (I can
barely read C).

I should also note that the Wireshark dump at
http://bugzilla.kernel.org/show_bug.cgi?id=13835 is related to this.
If it's not the same bug, then possibly a new one should be opened.

P.S. I'm not subscribed to the list, so please CC me.
-- 
Caleb Cushing
http://xenoterracide.blogspot.com

[-- Attachment #2: bisect.log --]
[-- Type: text/x-log, Size: 1347 bytes --]

# bad: [e2984cbfddd5c8fac88b24d7e5f28e1cfb6f3838] Linux 2.6.31.5
# good: [74fca6a42863ffacaf7ba6f1936a9f228950f657] Linux 2.6.31
git bisect start 'v2.6.31.5' 'v2.6.31' '--'
# good: [08f30ff9811e59c08d1cee043ce55b2e862efe58] tty: USB hangup is racy
git bisect good 08f30ff9811e59c08d1cee043ce55b2e862efe58
# bad: [ddf2acb72f3df470ce15eb23ee97cd3be23016f8] KVM: fix LAPIC timer period overflow
git bisect bad ddf2acb72f3df470ce15eb23ee97cd3be23016f8
# bad: [03429ffaea91375d8ed80a60fa13c9ffe694539a] Fix idle time field in /proc/uptime
git bisect bad 03429ffaea91375d8ed80a60fa13c9ffe694539a
# bad: [2f670d465897b491da0b82fd8b047b9ec75bf8c8] USB: xhci: Support full speed devices.
git bisect bad 2f670d465897b491da0b82fd8b047b9ec75bf8c8
# bad: [b8580fde3ba44a0f00bf186dc5e4935bbd51be29] usb-serial: fix termios initialization logic
git bisect bad b8580fde3ba44a0f00bf186dc5e4935bbd51be29
# bad: [c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9] usb-serial: change referencing of port and serial structures
git bisect bad c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9
# good: [fa31221e38207cece07a6d96854a0fcf47c75ae5] hwmon: (asus_atk0110) Add maintainer information
git bisect good fa31221e38207cece07a6d96854a0fcf47c75ae5
# good: [17fd426331d1e4611654985dd545a52d200dd9d1] tty: USB serial termios bits
git bisect good 17fd426331d1e4611654985dd545a52d200dd9d1
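
For readers who want to check the log above against the reported first bad commit, here is a minimal Python sketch (not part of the original report). It reads the `# good:` / `# bad:` comment lines that `git bisect log` writes and lists the verdicts in order. The function name `summarize_bisect_log` and the `SAMPLE` excerpt are illustrative assumptions; the excerpt copies only the last three verdicts from the attachment.

```python
import re

# Matches the comment lines written by `git bisect log`, e.g.
# "# bad: [<40-hex sha>] <commit subject>"
MARK = re.compile(r"^# (good|bad): \[([0-9a-f]{40})\] (.*)$")


def summarize_bisect_log(text):
    """Return (good, bad) lists of (sha, subject) tuples in log order."""
    good, bad = [], []
    for line in text.splitlines():
        m = MARK.match(line.strip())
        if m:
            verdict, sha, subject = m.groups()
            (good if verdict == "good" else bad).append((sha, subject))
    return good, bad


# Excerpt of the attached bisect.log (last three verdicts only).
SAMPLE = """\
# bad: [c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9] usb-serial: change referencing of port and serial structures
git bisect bad c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9
# good: [fa31221e38207cece07a6d96854a0fcf47c75ae5] hwmon: (asus_atk0110) Add maintainer information
git bisect good fa31221e38207cece07a6d96854a0fcf47c75ae5
# good: [17fd426331d1e4611654985dd545a52d200dd9d1] tty: USB serial termios bits
git bisect good 17fd426331d1e4611654985dd545a52d200dd9d1
"""

good, bad = summarize_bisect_log(SAMPLE)
# The most recently marked bad commit is the current "bad" boundary.
print("last bad :", bad[-1][0][:12], bad[-1][1])
print("last good:", good[-1][0][:12], good[-1][1])
```

In this excerpt the most recently marked bad commit is c9fb3ded, which matches the commit named in the message body. To re-run the same session in a kernel tree, `git bisect replay bisect.log` re-applies the recorded verdicts.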

[-- Attachment #3: dmesg.log --]
[-- Type: text/x-log, Size: 31964 bytes --]

Linux version 2.6.31.1-test-00092-g17fd426 (xenoterracide@slave4) (gcc version 4.4.2 (GCC) ) #15 SMP PREEMPT Sat Oct 31 09:54:39 EDT 2009
Command line: root=/dev/disk/by-uuid/e9367b57-da02-48bf-9cd7-d41ce53520a3 ro
KERNEL supported cpus:
  Intel GenuineIntel
  AMD AuthenticAMD
  Centaur CentaurHauls
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 0000000000099c00 (usable)
 BIOS-e820: 0000000000099c00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000cf590000 (usable)
 BIOS-e820: 00000000cf590000 - 00000000cf5e3000 (ACPI NVS)
 BIOS-e820: 00000000cf5e3000 - 00000000cf5f0000 (ACPI data)
 BIOS-e820: 00000000cf5f0000 - 00000000cf600000 (reserved)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 00000001b0000000 (usable)
DMI 2.5 present.
last_pfn = 0x1b0000 max_arch_pfn = 0x400000000
MTRR default type: uncachable
MTRR fixed ranges enabled:
  00000-9FFFF write-back
  A0000-BFFFF uncachable
  C0000-E7FFF write-protect
  E8000-EFFFF uncachable
  F0000-FFFFF write-through
MTRR variable ranges enabled:
  0 base 100000000 mask F80000000 write-back
  1 base 180000000 mask FE0000000 write-back
  2 base 1A0000000 mask FF0000000 write-back
  3 base 000000000 mask F80000000 write-back
  4 base 080000000 mask FC0000000 write-back
  5 base 0C0000000 mask FF0000000 write-back
  6 base 0CF700000 mask FFFF00000 uncachable
  7 base 0CF800000 mask FFF800000 uncachable
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
e820 update range: 00000000cf700000 - 0000000100000000 (usable) ==> (reserved)
last_pfn = 0xcf590 max_arch_pfn = 0x400000000
e820 update range: 0000000000001000 - 0000000000006000 (usable) ==> (reserved)
Scanning 1 areas for low memory corruption
modified physical RAM map:
 modified: 0000000000000000 - 0000000000001000 (usable)
 modified: 0000000000001000 - 0000000000006000 (reserved)
 modified: 0000000000006000 - 0000000000099c00 (usable)
 modified: 0000000000099c00 - 00000000000a0000 (reserved)
 modified: 00000000000f0000 - 0000000000100000 (reserved)
 modified: 0000000000100000 - 00000000cf590000 (usable)
 modified: 00000000cf590000 - 00000000cf5e3000 (ACPI NVS)
 modified: 00000000cf5e3000 - 00000000cf5f0000 (ACPI data)
 modified: 00000000cf5f0000 - 00000000cf600000 (reserved)
 modified: 00000000e0000000 - 00000000f0000000 (reserved)
 modified: 00000000fec00000 - 0000000100000000 (reserved)
 modified: 0000000100000000 - 00000001b0000000 (usable)
initial memory mapped : 0 - 20000000
init_memory_mapping: 0000000000000000-00000000cf590000
 0000000000 - 00cf400000 page 2M
 00cf400000 - 00cf590000 page 4k
kernel direct mapping tables up to cf590000 @ 8000-e000
init_memory_mapping: 0000000100000000-00000001b0000000
 0100000000 - 01b0000000 page 2M
kernel direct mapping tables up to 1b0000000 @ c000-14000
RAMDISK: 37ef1000 - 37fef9a7
ACPI: RSDP 00000000000f97a0 00024 (v02 DELL  )
ACPI: XSDT 00000000cf5e3080 0005C (v01 DELL    FX09    42302E31 AWRD 00000000)
ACPI: FACP 00000000cf5e7200 000F4 (v03 DELL    FX09    42302E31 AWRD 00000000)
ACPI: DSDT 00000000cf5e3200 03FFC (v01 DELL   AWRDACPI 00001000 MSFT 03000000)
ACPI: FACS 00000000cf590000 00040
ACPI: HPET 00000000cf5e73c0 00038 (v01 DELL    FX09    42302E31 AWRD 00000098)
ACPI: MCFG 00000000cf5e7400 0003C (v01 DELL    FX09    42302E31 AWRD 00000000)
ACPI: DMY1 00000000cf5e7440 00176 (v01 DELL    FX09    42302E31 AWRD 00000000)
ACPI: DMY2 00000000cf5e75c0 00080 (v01 DELL    FX09    42302E31 AWRD 00000000)
ACPI: APIC 00000000cf5e7300 00084 (v01 DELL    FX09    42302E31 AWRD 00000000)
ACPI: SSDT 00000000cf5e7f60 00380 (v01  PmRef    CpuPm 00003000 INTL 20041203)
ACPI: Local APIC address 0xfee00000
(8 early reservations) ==> bootmem [0000000000 - 01b0000000]
  #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
  #1 [0000006000 - 0000008000]       TRAMPOLINE ==> [0000006000 - 0000008000]
  #2 [0001000000 - 00016837d0]    TEXT DATA BSS ==> [0001000000 - 00016837d0]
  #3 [0037ef1000 - 0037fef9a7]          RAMDISK ==> [0037ef1000 - 0037fef9a7]
  #4 [0000099c00 - 0000100000]    BIOS reserved ==> [0000099c00 - 0000100000]
  #5 [0001684000 - 00016840f8]              BRK ==> [0001684000 - 00016840f8]
  #6 [0000008000 - 000000c000]          PGTABLE ==> [0000008000 - 000000c000]
  #7 [000000c000 - 000000f000]          PGTABLE ==> [000000c000 - 000000f000]
found SMP MP-table at [ffff8800000f3f00] f3f00
 [ffffea0000000000-ffffea0005ffffff] PMD -> [ffff880028600000-ffff88002dbfffff] on node 0
Zone PFN ranges:
  DMA      0x00000000 -> 0x00001000
  DMA32    0x00001000 -> 0x00100000
  Normal   0x00100000 -> 0x001b0000
Movable zone start PFN for each node
early_node_map[4] active PFN ranges
    0: 0x00000000 -> 0x00000001
    0: 0x00000006 -> 0x00000099
    0: 0x00000100 -> 0x000cf590
    0: 0x00100000 -> 0x001b0000
On node 0 totalpages: 1570084
  DMA zone: 56 pages used for memmap
  DMA zone: 112 pages reserved
  DMA zone: 3820 pages, LIFO batch:0
  DMA32 zone: 14280 pages used for memmap
  DMA32 zone: 830920 pages, LIFO batch:31
  Normal zone: 9856 pages used for memmap
  Normal zone: 711040 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x408
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x03] enabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
ACPI: IOAPIC (id[0x04] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 4, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Using ACPI (MADT) for SMP configuration information
ACPI: HPET id: 0x8086a201 base: 0xfed00000
SMP: Allowing 4 CPUs, 0 hotplug CPUs
nr_irqs_gsi: 24
PM: Registered nosave memory: 0000000000001000 - 0000000000006000
PM: Registered nosave memory: 0000000000099000 - 000000000009a000
PM: Registered nosave memory: 000000000009a000 - 00000000000a0000
PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000
PM: Registered nosave memory: 00000000000f0000 - 0000000000100000
PM: Registered nosave memory: 00000000cf590000 - 00000000cf5e3000
PM: Registered nosave memory: 00000000cf5e3000 - 00000000cf5f0000
PM: Registered nosave memory: 00000000cf5f0000 - 00000000cf600000
PM: Registered nosave memory: 00000000cf600000 - 00000000e0000000
PM: Registered nosave memory: 00000000e0000000 - 00000000f0000000
PM: Registered nosave memory: 00000000f0000000 - 00000000fec00000
PM: Registered nosave memory: 00000000fec00000 - 0000000100000000
Allocating PCI resources starting at cf600000 (gap: cf600000:10a00000)
NR_CPUS:16 nr_cpumask_bits:16 nr_cpu_ids:4 nr_node_ids:1
PERCPU: Embedded 28 pages at ffff880028034000, static data 82400 bytes
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 1545780
Kernel command line: root=/dev/disk/by-uuid/e9367b57-da02-48bf-9cd7-d41ce53520a3 ro
PID hash table entries: 4096 (order: 12, 32768 bytes)
Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
Initializing CPU#0
Checking aperture...
No AGP bridge found
Calgary: detecting Calgary via BIOS EBDA area
Calgary: Unable to locate Rio Grande table in EBDA - bailing!
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Placing 64MB software IO TLB between ffff880020000000 - ffff880024000000
software IO TLB at phys 0x20000000 - 0x24000000
Memory: 6104592k/7077888k available (3629k kernel code, 797552k absent, 174760k reserved, 1500k data, 484k init)
SLUB: Genslabs=13, HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
NR_IRQS:768
Fast TSC calibration using PIT
Detected 2393.837 MHz processor.
Console: colour VGA+ 80x25
console [tty0] enabled
hpet clockevent registered
HPET: 4 timers in total, 0 timers will be used for per-cpu timer
Calibrating delay loop (skipped), value calculated using timer frequency.. 4789.16 BogoMIPS (lpj=7979456)
Security Framework initialized
Mount-cache hash table entries: 256
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
mce: CPU supports 6 MCE banks
CPU0: Thermal monitoring enabled (TM2)
using mwait in idle threads.
Performance Counters: Core2 events, Intel PMU driver.
... version:                 2
... bit width:               40
... generic counters:        2
... value mask:              000000ffffffffff
... max period:              000000007fffffff
... fixed-purpose counters:  3
... counter mask:            0000000700000003
ACPI: Core revision 20090521
Setting APIC routing to flat
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz stepping 0b
Booting processor 1 APIC 0x1 ip 0x6000
Initializing CPU#1
Calibrating delay using timer specific routine.. 4789.48 BogoMIPS (lpj=7979981)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
mce: CPU supports 6 MCE banks
CPU1: Thermal monitoring enabled (TM2)
x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106
CPU1: Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz stepping 0b
checking TSC synchronization [CPU#0 -> CPU#1]: passed.
Booting processor 2 APIC 0x3 ip 0x6000
Initializing CPU#2
Calibrating delay using timer specific routine.. 4789.51 BogoMIPS (lpj=7980021)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 3
mce: CPU supports 6 MCE banks
CPU2: Thermal monitoring enabled (TM2)
x86 PAT enabled: cpu 2, old 0x7040600070406, new 0x7010600070106
CPU2: Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz stepping 0b
checking TSC synchronization [CPU#0 -> CPU#2]: passed.
Booting processor 3 APIC 0x2 ip 0x6000
Initializing CPU#3
Calibrating delay using timer specific routine.. 4789.56 BogoMIPS (lpj=7980102)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 2
mce: CPU supports 6 MCE banks
CPU3: Thermal monitoring enabled (TM2)
x86 PAT enabled: cpu 3, old 0x7040600070406, new 0x7010600070106
CPU3: Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz stepping 0b
checking TSC synchronization [CPU#0 -> CPU#3]: passed.
Brought up 4 CPUs
Total of 4 processors activated (19159.72 BogoMIPS).
CPU0 attaching sched-domain:
 domain 0: span 0-1 level MC
  groups: 0 1
  domain 1: span 0-3 level CPU
   groups: 0-1 2-3
CPU1 attaching sched-domain:
 domain 0: span 0-1 level MC
  groups: 1 0
  domain 1: span 0-3 level CPU
   groups: 0-1 2-3
CPU2 attaching sched-domain:
 domain 0: span 2-3 level MC
  groups: 2 3
  domain 1: span 0-3 level CPU
   groups: 2-3 0-1
CPU3 attaching sched-domain:
 domain 0: span 2-3 level MC
  groups: 3 2
  domain 1: span 0-3 level CPU
   groups: 2-3 0-1
Booting paravirtualized kernel on bare hardware
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
PCI: MCFG area at e0000000 reserved in E820
PCI: Using MMCONFIG at e0000000 - efffffff
PCI: Using configuration type 1 for base access
mtrr: your CPUs had inconsistent fixed MTRR settings
mtrr: probably your BIOS does not setup all CPUs.
mtrr: corrected configuration.
bio: create slab <bio-0> at 0
ACPI: EC: Look up EC in DSDT
ACPI: Interpreter enabled
ACPI: (supports S0 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: No dock devices found.
ACPI: PCI Root Bridge [PCI0] (0000:00)
pci 0000:00:01.0: PME# supported from D0 D3hot D3cold
pci 0000:00:01.0: PME# disabled
pci 0000:00:02.0: reg 10 32bit mmio: [0xfdf00000-0xfdf7ffff]
pci 0000:00:02.0: reg 14 io port: [0xff00-0xff07]
pci 0000:00:02.0: reg 18 32bit mmio: [0xd0000000-0xdfffffff]
pci 0000:00:02.0: reg 1c 32bit mmio: [0xfdc00000-0xfdcfffff]
pci 0000:00:19.0: reg 10 32bit mmio: [0xfdfc0000-0xfdfdffff]
pci 0000:00:19.0: reg 14 32bit mmio: [0xfdfff000-0xfdffffff]
pci 0000:00:19.0: reg 18 io port: [0xfe00-0xfe1f]
pci 0000:00:19.0: PME# supported from D0 D3hot D3cold
pci 0000:00:19.0: PME# disabled
pci 0000:00:1a.0: reg 20 io port: [0xfd00-0xfd1f]
pci 0000:00:1a.1: reg 20 io port: [0xfc00-0xfc1f]
pci 0000:00:1a.2: reg 20 io port: [0xfb00-0xfb1f]
pci 0000:00:1a.7: reg 10 32bit mmio: [0xfdffe000-0xfdffe3ff]
pci 0000:00:1a.7: PME# supported from D0 D3hot D3cold
pci 0000:00:1a.7: PME# disabled
pci 0000:00:1d.0: reg 20 io port: [0xfa00-0xfa1f]
pci 0000:00:1d.1: reg 20 io port: [0xf900-0xf91f]
pci 0000:00:1d.2: reg 20 io port: [0xf800-0xf81f]
pci 0000:00:1d.7: reg 10 32bit mmio: [0xfdffd000-0xfdffd3ff]
pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
pci 0000:00:1d.7: PME# disabled
pci 0000:00:1f.0: quirk: region 0400-047f claimed by ICH6 ACPI/GPIO/TCO
pci 0000:00:1f.0: quirk: region 0480-04bf claimed by ICH6 GPIO
pci 0000:00:1f.0: ICH7 LPC Generic IO decode 1 PIO at 0800 (mask 003f)
pci 0000:00:1f.0: ICH7 LPC Generic IO decode 2 PIO at 0290 (mask 003f)
pci 0000:00:1f.2: reg 10 io port: [0xf700-0xf707]
pci 0000:00:1f.2: reg 14 io port: [0xf600-0xf603]
pci 0000:00:1f.2: reg 18 io port: [0xf500-0xf507]
pci 0000:00:1f.2: reg 1c io port: [0xf400-0xf403]
pci 0000:00:1f.2: reg 20 io port: [0xf300-0xf31f]
pci 0000:00:1f.2: reg 24 32bit mmio: [0xfdffc000-0xfdffc7ff]
pci 0000:00:1f.2: PME# supported from D3hot
pci 0000:00:1f.2: PME# disabled
pci 0000:00:1f.3: reg 10 64bit mmio: [0xfdffb000-0xfdffb0ff]
pci 0000:00:1f.3: reg 20 io port: [0x500-0x51f]
pci 0000:00:01.0: bridge io port: [0xd000-0xdfff]
pci 0000:00:01.0: bridge 32bit mmio: [0xfda00000-0xfdafffff]
pci 0000:00:01.0: bridge 64bit mmio pref: [0xfdb00000-0xfdbfffff]
pci 0000:02:01.0: reg 10 io port: [0xcf00-0xcf3f]
pci 0000:02:01.0: supports D1 D2
pci 0000:02:01.1: reg 10 io port: [0xce00-0xce07]
pci 0000:02:01.1: supports D1 D2
pci 0000:02:01.2: reg 10 32bit mmio: [0xfdeff000-0xfdeff7ff]
pci 0000:02:01.2: reg 14 32bit mmio: [0xfdef8000-0xfdefbfff]
pci 0000:02:01.2: supports D1 D2
pci 0000:02:01.2: PME# supported from D0 D1 D2 D3hot
pci 0000:02:01.2: PME# disabled
pci 0000:00:1e.0: transparent bridge
pci 0000:00:1e.0: bridge io port: [0xc000-0xcfff]
pci 0000:00:1e.0: bridge 32bit mmio: [0xfde00000-0xfdefffff]
pci 0000:00:1e.0: bridge 64bit mmio pref: [0xfdd00000-0xfddfffff]
pci_bus 0000:00: on NUMA node 0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 *4 5 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 9 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 9 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNK0] (IRQs *3 4 5 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 7 *9 10 11 12 14 15)
PCI: Using ACPI for IRQ routing
NetLabel: Initializing
NetLabel:  domain hash size = 128
NetLabel:  protocols = UNLABELED CIPSOv4
NetLabel:  unlabeled traffic allowed by default
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 12 devices
ACPI: ACPI bus type pnp unregistered
system 00:01: ioport range 0x4d0-0x4d1 has been reserved
system 00:01: ioport range 0x800-0x87f has been reserved
system 00:01: ioport range 0x290-0x297 has been reserved
system 00:01: ioport range 0x880-0x88f has been reserved
system 00:08: ioport range 0x400-0x4bf could not be reserved
system 00:0a: iomem range 0xe0000000-0xefffffff has been reserved
system 00:0b: iomem range 0xf0000-0xfffff could not be reserved
system 00:0b: iomem range 0xcf600000-0xcf6fffff could not be reserved
system 00:0b: iomem range 0xfed00000-0xfed000ff has been reserved
system 00:0b: iomem range 0xcf590000-0xcf5fffff could not be reserved
system 00:0b: iomem range 0x0-0x9ffff could not be reserved
system 00:0b: iomem range 0x100000-0xcf58ffff could not be reserved
system 00:0b: iomem range 0xfec00000-0xfec00fff could not be reserved
system 00:0b: iomem range 0xfed14000-0xfed1dfff has been reserved
system 00:0b: iomem range 0xfed20000-0xfed9ffff has been reserved
system 00:0b: iomem range 0xfee00000-0xfee00fff has been reserved
system 00:0b: iomem range 0xffb00000-0xffb7ffff has been reserved
system 00:0b: iomem range 0xfff00000-0xffffffff has been reserved
system 00:0b: iomem range 0xe0000-0xeffff has been reserved
pci 0000:00:01.0: PCI bridge, secondary bus 0000:01
pci 0000:00:01.0:   IO window: 0xd000-0xdfff
pci 0000:00:01.0:   MEM window: 0xfda00000-0xfdafffff
pci 0000:00:01.0:   PREFETCH window: 0x000000fdb00000-0x000000fdbfffff
pci 0000:00:1e.0: PCI bridge, secondary bus 0000:02
pci 0000:00:1e.0:   IO window: 0xc000-0xcfff
pci 0000:00:1e.0:   MEM window: 0xfde00000-0xfdefffff
pci 0000:00:1e.0:   PREFETCH window: 0x000000fdd00000-0x000000fddfffff
pci 0000:00:01.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:01.0: setting latency timer to 64
pci 0000:00:1e.0: setting latency timer to 64
pci_bus 0000:00: resource 0 io:  [0x00-0xffff]
pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
pci_bus 0000:01: resource 0 io:  [0xd000-0xdfff]
pci_bus 0000:01: resource 1 mem: [0xfda00000-0xfdafffff]
pci_bus 0000:01: resource 2 pref mem [0xfdb00000-0xfdbfffff]
pci_bus 0000:02: resource 0 io:  [0xc000-0xcfff]
pci_bus 0000:02: resource 1 mem: [0xfde00000-0xfdefffff]
pci_bus 0000:02: resource 2 pref mem [0xfdd00000-0xfddfffff]
pci_bus 0000:02: resource 3 io:  [0x00-0xffff]
pci_bus 0000:02: resource 4 mem: [0x000000-0xffffffffffffffff]
NET: Registered protocol family 2
IP route cache hash table entries: 262144 (order: 9, 2097152 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
NET: Registered protocol family 1
Unpacking initramfs...
Freeing initrd memory: 1018k freed
Scanning for low memory corruption every 60 seconds
audit: initializing netlink socket (disabled)
type=2000 audit(1256998169.436:1): initialized
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
msgmni has been set to 11926
alg: No test for stdrng (krng)
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
pci 0000:00:02.0: Boot video device
pcieport-driver 0000:00:01.0: irq 24 for MSI/MSI-X
pcieport-driver 0000:00:01.0: setting latency timer to 64
Linux agpgart interface v0.103
Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
input: Macintosh mouse button emulation as /devices/virtual/input/input0
PNP: No PS/2 controller found. Probing ports directly.
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
cpuidle: using governor ladder
cpuidle: using governor menu
TCP cubic registered
NET: Registered protocol family 17
registered taskstats version 1
Initalizing network drop monitor service
Freeing unused kernel memory: 484k freed
agpgart-intel 0000:00:00.0: Intel G33 Chipset
agpgart-intel 0000:00:00.0: detected 7164K stolen memory
agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0xd0000000
[drm] Initialized drm 1.1.0 20060810
i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
i915 0000:00:02.0: setting latency timer to 64
mtrr: no more MTRRs available
[drm] MTRR allocation failed.  Graphics performance may suffer.
i915 0000:00:02.0: irq 25 for MSI/MSI-X
Switched to high resolution mode on CPU 1
Switched to high resolution mode on CPU 3
Switched to high resolution mode on CPU 2
Switched to high resolution mode on CPU 0
[drm] DAC-6: set mode 1920x1080 11
Console: switching to colour frame buffer device 240x67
[drm] fb0: inteldrmfb frame buffer device
[drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
FDC 0 is a post-1991 82077
SCSI subsystem initialized
libata version 3.00 loaded.
ahci 0000:00:1f.2: version 3.0
ahci 0000:00:1f.2: PCI INT A -> GSI 19 (level, low) -> IRQ 19
ahci 0000:00:1f.2: irq 26 for MSI/MSI-X
ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0x3f impl RAID mode
ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pmp pio slum part ems 
ahci 0000:00:1f.2: setting latency timer to 64
scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
scsi4 : ahci
scsi5 : ahci
ata1: SATA max UDMA/133 abar m2048@0xfdffc000 port 0xfdffc100 irq 26
ata2: SATA max UDMA/133 abar m2048@0xfdffc000 port 0xfdffc180 irq 26
ata3: SATA max UDMA/133 abar m2048@0xfdffc000 port 0xfdffc200 irq 26
ata4: SATA max UDMA/133 abar m2048@0xfdffc000 port 0xfdffc280 irq 26
ata5: SATA max UDMA/133 abar m2048@0xfdffc000 port 0xfdffc300 irq 26
ata6: SATA max UDMA/133 abar m2048@0xfdffc000 port 0xfdffc380 irq 26
ata3: SATA link down (SStatus 0 SControl 300)
ata2: SATA link down (SStatus 0 SControl 300)
ata4: SATA link down (SStatus 0 SControl 300)
ata6: SATA link down (SStatus 0 SControl 300)
ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata5.00: ATAPI: PLDS DVD+/-RW DH-16A6S, YD11, max UDMA/100
ata5.00: configured for UDMA/100
ata1.00: ATA-7: SAMSUNG HD322HJ, 1AC01113, max UDMA7
ata1.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access     ATA      SAMSUNG HD322HJ  1AC0 PQ: 0 ANSI: 5
scsi 4:0:0:0: CD-ROM            PLDS     DVD+-RW DH-16A6S YD11 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 625142448 512-byte logical blocks: (320 GB/298 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 >
sd 0:0:0:0: [sda] Attached SCSI disk
sr0: scsi3-mmc drive: 48x/12x writer dvd-ram cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 4:0:0:0: Attached scsi CD-ROM sr0
EXT4-fs (sda2): barriers enabled
kjournald2 starting: pid 557, dev sda2:8, commit interval 5 seconds
EXT4-fs (sda2): delayed allocation enabled
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
EXT4-fs (sda2): mounted filesystem with ordered data mode
rtc_cmos 00:04: RTC can wake from S4
rtc_cmos 00:04: rtc core: registered rtc_cmos as rtc0
rtc0: alarms up to one month, 242 bytes nvram, hpet irqs
udev: starting version 146
i801_smbus 0000:00:1f.3: PCI INT B -> GSI 18 (level, low) -> IRQ 18
input: PC Speaker as /devices/platform/pcspkr/input/input1
input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input2
ACPI: Power Button [PWRF]
input: Power Button as /devices/LNXSYSTM:00/device:00/PNP0C0C:00/input/input3
ACPI: Power Button [PWRB]
dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.2)
e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
e1000e: Copyright (c) 1999-2008 Intel Corporation.
e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
e1000e 0000:00:19.0: pci_enable_pcie_error_reporting failed 0xfffffffb
e1000e 0000:00:19.0: setting latency timer to 64
e1000e 0000:00:19.0: irq 27 for MSI/MSI-X
iTCO_vendor_support: vendor-support=0
sd 0:0:0:0: Attached scsi generic sg0 type 0
sr 4:0:0:0: Attached scsi generic sg1 type 5
ACPI: SSDT 00000000cf5e7680 0022A (v01  PmRef  Cpu0Ist 00003000 INTL 20041203)
processor LNXCPU:00: registered as cooling_device0
ACPI: SSDT 00000000cf5e7b40 00152 (v01  PmRef  Cpu1Ist 00003000 INTL 20041203)
processor LNXCPU:01: registered as cooling_device1
ACPI: SSDT 00000000cf5e7ca0 00152 (v01  PmRef  Cpu2Ist 00003000 INTL 20041203)
processor LNXCPU:02: registered as cooling_device2
ACPI: SSDT 00000000cf5e7e00 00152 (v01  PmRef  Cpu3Ist 00003000 INTL 20041203)
processor LNXCPU:03: registered as cooling_device3
iTCO_wdt: Intel TCO WatchDog Timer Driver v1.05
iTCO_wdt: Found a ICH9R TCO device (Version=2, TCOBASE=0x0460)
iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
gameport: EMU10K1 is pci0000:02:01.1/gameport0, io 0xce00, speed 1007kHz
fan PNP0C0B:00: registered as cooling_device4
ACPI: Fan [FAN] (on)
thermal LNXTHERM:01: registered as thermal_zone0
ACPI: Thermal Zone [THRM] (40 C)
0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:21:9b:06:4c:c9
0000:00:19.0: eth0: Intel(R) PRO/10/100 Network Connection
0000:00:19.0: eth0: MAC: 7, PHY: 7, PBA No: ffffff-0ff
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
ohci1394 0000:02:01.2: PCI INT B -> GSI 17 (level, low) -> IRQ 17
ohci1394 0000:02:01.2: setting latency timer to 64
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci_hcd 0000:00:1a.7: PCI INT C -> GSI 18 (level, low) -> IRQ 18
ehci_hcd 0000:00:1a.7: setting latency timer to 64
ehci_hcd 0000:00:1a.7: EHCI Host Controller
ehci_hcd 0000:00:1a.7: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:1a.7: debug port 1
ehci_hcd 0000:00:1a.7: cache line size of 32 is not supported
ehci_hcd 0000:00:1a.7: irq 18, io mem 0xfdffe000
ehci_hcd 0000:00:1a.7: USB 2.0 started, EHCI 1.00
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 6 ports detected
ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 23 (level, low) -> IRQ 23
ehci_hcd 0000:00:1d.7: setting latency timer to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 2
ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[17]  MMIO=[fdeff000-fdeff7ff]  Max Packet=[2048]  IR/IT contexts=[4/8]
ehci_hcd 0000:00:1d.7: debug port 1
ehci_hcd 0000:00:1d.7: cache line size of 32 is not supported
ehci_hcd 0000:00:1d.7: irq 23, io mem 0xfdffd000
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 6 ports detected
uhci_hcd: USB Universal Host Controller Interface driver
uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
uhci_hcd 0000:00:1a.0: setting latency timer to 64
uhci_hcd 0000:00:1a.0: UHCI Host Controller
uhci_hcd 0000:00:1a.0: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1a.0: irq 16, io base 0x0000fd00
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
uhci_hcd 0000:00:1a.1: PCI INT B -> GSI 21 (level, low) -> IRQ 21
uhci_hcd 0000:00:1a.1: setting latency timer to 64
uhci_hcd 0000:00:1a.1: UHCI Host Controller
uhci_hcd 0000:00:1a.1: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:1a.1: irq 21, io base 0x0000fc00
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
uhci_hcd 0000:00:1a.2: PCI INT D -> GSI 19 (level, low) -> IRQ 19
uhci_hcd 0000:00:1a.2: setting latency timer to 64
uhci_hcd 0000:00:1a.2: UHCI Host Controller
uhci_hcd 0000:00:1a.2: new USB bus registered, assigned bus number 5
uhci_hcd 0000:00:1a.2: irq 19, io base 0x0000fb00
usb usb5: configuration #1 chosen from 1 choice
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23
uhci_hcd 0000:00:1d.0: setting latency timer to 64
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 6
uhci_hcd 0000:00:1d.0: irq 23, io base 0x0000fa00
usb usb6: configuration #1 chosen from 1 choice
hub 6-0:1.0: USB hub found
hub 6-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19
uhci_hcd 0000:00:1d.1: setting latency timer to 64
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 7
uhci_hcd 0000:00:1d.1: irq 19, io base 0x0000f900
usb usb7: configuration #1 chosen from 1 choice
hub 7-0:1.0: USB hub found
hub 7-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
uhci_hcd 0000:00:1d.2: setting latency timer to 64
uhci_hcd 0000:00:1d.2: UHCI Host Controller
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 8
uhci_hcd 0000:00:1d.2: irq 18, io base 0x0000f800
usb usb8: configuration #1 chosen from 1 choice
hub 8-0:1.0: USB hub found
hub 8-0:1.0: 2 ports detected
EMU10K1_Audigy 0000:02:01.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
EMU10K1_Audigy 0000:02:01.0: setting latency timer to 64
Installing spdif_bug patch: SB Audigy 2 ZS [SB0350]
usb 5-1: new full speed USB device using uhci_hcd and address 2
usb 5-1: configuration #1 chosen from 1 choice
hub 5-1:1.0: USB hub found
hub 5-1:1.0: 4 ports detected
usb 5-1.1: new low speed USB device using uhci_hcd and address 3
usb 5-1.1: configuration #1 chosen from 1 choice
ieee1394: Host added: ID:BUS[0-00:1023]  GUID[00023c01510c1ac4]
usb 5-1.3: new low speed USB device using uhci_hcd and address 4
usbcore: registered new interface driver hiddev
input: Logitech Logitech Gaming Keyboard as /devices/pci0000:00/0000:00:1a.2/usb5/5-1/5-1.1/5-1.1:1.0/input/input4
generic-usb 0003:046D:C221.0001: input,hidraw0: USB HID v1.10 Keyboard [Logitech Logitech Gaming Keyboard] on usb-0000:00:1a.2-1.1/input0
input: Logitech Logitech Gaming Keyboard as /devices/pci0000:00/0000:00:1a.2/usb5/5-1/5-1.1/5-1.1:1.1/input/input5
generic-usb 0003:046D:C221.0002: input,hiddev0,hidraw1: USB HID v1.10 Device [Logitech Logitech Gaming Keyboard] on usb-0000:00:1a.2-1.1/input1
usbcore: registered new interface driver usbhid
usbhid: v2.6:USB HID core driver
usb 5-1.3: configuration #1 chosen from 1 choice
input: Logitech USB-PS/2 Optical Mouse as /devices/pci0000:00/0000:00:1a.2/usb5/5-1/5-1.3/5-1.3:1.0/input/input6
generic-usb 0003:046D:C01E.0003: input,hidraw2: USB HID v1.10 Mouse [Logitech USB-PS/2 Optical Mouse] on usb-0000:00:1a.2-1.3/input0
usb 5-1.4: new full speed USB device using uhci_hcd and address 5
usb 5-1.4: configuration #1 chosen from 1 choice
input: G15 Keyboard G15 Keyboard as /devices/pci0000:00/0000:00:1a.2/usb5/5-1/5-1.4/5-1.4:1.0/input/input7
generic-usb 0003:046D:C222.0004: input,hiddev1,hidraw3: USB HID v1.11 Keypad [G15 Keyboard G15 Keyboard] on usb-0000:00:1a.2-1.4/input0
EXT4-fs (sda2): internal journal on sda2:8
EXT4-fs (sda5): barriers enabled
kjournald2 starting: pid 1140, dev sda5:8, commit interval 5 seconds
EXT4-fs (sda5): internal journal on sda5:8
EXT4-fs (sda5): Ignoring delalloc option - requested data journaling mode
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
EXT4-fs (sda5): mounted filesystem with journalled data mode
EXT4-fs (sda6): barriers enabled
kjournald2 starting: pid 1143, dev sda6:8, commit interval 5 seconds
EXT4-fs (sda6): internal journal on sda6:8
EXT4-fs (sda6): delayed allocation enabled
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
EXT4-fs (sda6): mounted filesystem with ordered data mode
loop: module loaded
Adding 3212992k swap on /dev/sda3.  Priority:-1 extents:1 across:3212992k 
e1000e 0000:00:19.0: irq 27 for MSI/MSI-X
e1000e 0000:00:19.0: irq 27 for MSI/MSI-X
ip_tables: (C) 2000-2006 Netfilter Core Team
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Please use
nf_conntrack.acct=1 kernel parameter, acct=1 nf_conntrack module option or
sysctl net.netfilter.nf_conntrack_acct=1 to enable it.
e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
0000:00:19.0: eth0: 10/100 speed: disabling TSO
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
eth0: no IPv6 routers present

* Re: large packet loss take2 2.6.31.x
From: Frans Pop @ 2009-10-31 18:44 UTC
  To: Caleb Cushing; +Cc: linux-kernel, netdev

[-- Attachment #1: Type: text/plain, Size: 1158 bytes --]

Adding netdev in CC. Original message + attachments follow.

=========
Ever since Arch rolled out 2.6.31.x I've been having problems with
my network (again): I've been losing a large number of packets
(testing with mtr shows somewhere between 30% and 50%). At first I
figured it was the same problem I had in 2.6.30.x (and maybe it is?),
but that appeared to get fixed. When I started bisecting, the bug
wasn't apparent in 2.6.31.0, but I knew for sure it was in .5 (I
couldn't remember if I had noticed it again in .3).

I'm attaching the bisection log and a 'good' dmesg output.

c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9 is the first bad commit

I'm not going to pretend to understand why this patch is breaking my
networking but between bisection and testing it appears to be... I've
never bisected before and I'm definitely not a kernel hacker (I can
barely read C).

I should also note that the wireshark dump here
http://bugzilla.kernel.org/show_bug.cgi?id=13835 is related to this,
and if it's not the same bug then possibly a new one should be opened.

P.S. I'm not subscribed to the list, so please CC me.

Caleb Cushing
http://xenoterracide.blogspot.com


[-- Attachment #2: bisect.log --]
[-- Type: text/x-log, Size: 1347 bytes --]

# bad: [e2984cbfddd5c8fac88b24d7e5f28e1cfb6f3838] Linux 2.6.31.5
# good: [74fca6a42863ffacaf7ba6f1936a9f228950f657] Linux 2.6.31
git bisect start 'v2.6.31.5' 'v2.6.31' '--'
# good: [08f30ff9811e59c08d1cee043ce55b2e862efe58] tty: USB hangup is racy
git bisect good 08f30ff9811e59c08d1cee043ce55b2e862efe58
# bad: [ddf2acb72f3df470ce15eb23ee97cd3be23016f8] KVM: fix LAPIC timer period overflow
git bisect bad ddf2acb72f3df470ce15eb23ee97cd3be23016f8
# bad: [03429ffaea91375d8ed80a60fa13c9ffe694539a] Fix idle time field in /proc/uptime
git bisect bad 03429ffaea91375d8ed80a60fa13c9ffe694539a
# bad: [2f670d465897b491da0b82fd8b047b9ec75bf8c8] USB: xhci: Support full speed devices.
git bisect bad 2f670d465897b491da0b82fd8b047b9ec75bf8c8
# bad: [b8580fde3ba44a0f00bf186dc5e4935bbd51be29] usb-serial: fix termios initialization logic
git bisect bad b8580fde3ba44a0f00bf186dc5e4935bbd51be29
# bad: [c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9] usb-serial: change referencing of port and serial structures
git bisect bad c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9
# good: [fa31221e38207cece07a6d96854a0fcf47c75ae5] hwmon: (asus_atk0110) Add maintainer information
git bisect good fa31221e38207cece07a6d96854a0fcf47c75ae5
# good: [17fd426331d1e4611654985dd545a52d200dd9d1] tty: USB serial termios bits
git bisect good 17fd426331d1e4611654985dd545a52d200dd9d1

[-- Attachment #3: dmesg.log --]
[-- Type: text/x-log, Size: 31964 bytes --]

Linux version 2.6.31.1-test-00092-g17fd426 (xenoterracide@slave4) (gcc version 4.4.2 (GCC) ) #15 SMP PREEMPT Sat Oct 31 09:54:39 EDT 2009
Command line: root=/dev/disk/by-uuid/e9367b57-da02-48bf-9cd7-d41ce53520a3 ro
KERNEL supported cpus:
  Intel GenuineIntel
  AMD AuthenticAMD
  Centaur CentaurHauls
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 0000000000099c00 (usable)
 BIOS-e820: 0000000000099c00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000cf590000 (usable)
 BIOS-e820: 00000000cf590000 - 00000000cf5e3000 (ACPI NVS)
 BIOS-e820: 00000000cf5e3000 - 00000000cf5f0000 (ACPI data)
 BIOS-e820: 00000000cf5f0000 - 00000000cf600000 (reserved)
 BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
 BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 00000001b0000000 (usable)
DMI 2.5 present.
last_pfn = 0x1b0000 max_arch_pfn = 0x400000000
MTRR default type: uncachable
MTRR fixed ranges enabled:
  00000-9FFFF write-back
  A0000-BFFFF uncachable
  C0000-E7FFF write-protect
  E8000-EFFFF uncachable
  F0000-FFFFF write-through
MTRR variable ranges enabled:
  0 base 100000000 mask F80000000 write-back
  1 base 180000000 mask FE0000000 write-back
  2 base 1A0000000 mask FF0000000 write-back
  3 base 000000000 mask F80000000 write-back
  4 base 080000000 mask FC0000000 write-back
  5 base 0C0000000 mask FF0000000 write-back
  6 base 0CF700000 mask FFFF00000 uncachable
  7 base 0CF800000 mask FFF800000 uncachable
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
e820 update range: 00000000cf700000 - 0000000100000000 (usable) ==> (reserved)
last_pfn = 0xcf590 max_arch_pfn = 0x400000000
e820 update range: 0000000000001000 - 0000000000006000 (usable) ==> (reserved)
Scanning 1 areas for low memory corruption
modified physical RAM map:
 modified: 0000000000000000 - 0000000000001000 (usable)
 modified: 0000000000001000 - 0000000000006000 (reserved)
 modified: 0000000000006000 - 0000000000099c00 (usable)
 modified: 0000000000099c00 - 00000000000a0000 (reserved)
 modified: 00000000000f0000 - 0000000000100000 (reserved)
 modified: 0000000000100000 - 00000000cf590000 (usable)
 modified: 00000000cf590000 - 00000000cf5e3000 (ACPI NVS)
 modified: 00000000cf5e3000 - 00000000cf5f0000 (ACPI data)
 modified: 00000000cf5f0000 - 00000000cf600000 (reserved)
 modified: 00000000e0000000 - 00000000f0000000 (reserved)
 modified: 00000000fec00000 - 0000000100000000 (reserved)
 modified: 0000000100000000 - 00000001b0000000 (usable)
initial memory mapped : 0 - 20000000
init_memory_mapping: 0000000000000000-00000000cf590000
 0000000000 - 00cf400000 page 2M
 00cf400000 - 00cf590000 page 4k
kernel direct mapping tables up to cf590000 @ 8000-e000
init_memory_mapping: 0000000100000000-00000001b0000000
 0100000000 - 01b0000000 page 2M
kernel direct mapping tables up to 1b0000000 @ c000-14000
RAMDISK: 37ef1000 - 37fef9a7
ACPI: RSDP 00000000000f97a0 00024 (v02 DELL  )
ACPI: XSDT 00000000cf5e3080 0005C (v01 DELL    FX09    42302E31 AWRD 00000000)
ACPI: FACP 00000000cf5e7200 000F4 (v03 DELL    FX09    42302E31 AWRD 00000000)
ACPI: DSDT 00000000cf5e3200 03FFC (v01 DELL   AWRDACPI 00001000 MSFT 03000000)
ACPI: FACS 00000000cf590000 00040
ACPI: HPET 00000000cf5e73c0 00038 (v01 DELL    FX09    42302E31 AWRD 00000098)
ACPI: MCFG 00000000cf5e7400 0003C (v01 DELL    FX09    42302E31 AWRD 00000000)
ACPI: DMY1 00000000cf5e7440 00176 (v01 DELL    FX09    42302E31 AWRD 00000000)
ACPI: DMY2 00000000cf5e75c0 00080 (v01 DELL    FX09    42302E31 AWRD 00000000)
ACPI: APIC 00000000cf5e7300 00084 (v01 DELL    FX09    42302E31 AWRD 00000000)
ACPI: SSDT 00000000cf5e7f60 00380 (v01  PmRef    CpuPm 00003000 INTL 20041203)
ACPI: Local APIC address 0xfee00000
(8 early reservations) ==> bootmem [0000000000 - 01b0000000]
  #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
  #1 [0000006000 - 0000008000]       TRAMPOLINE ==> [0000006000 - 0000008000]
  #2 [0001000000 - 00016837d0]    TEXT DATA BSS ==> [0001000000 - 00016837d0]
  #3 [0037ef1000 - 0037fef9a7]          RAMDISK ==> [0037ef1000 - 0037fef9a7]
  #4 [0000099c00 - 0000100000]    BIOS reserved ==> [0000099c00 - 0000100000]
  #5 [0001684000 - 00016840f8]              BRK ==> [0001684000 - 00016840f8]
  #6 [0000008000 - 000000c000]          PGTABLE ==> [0000008000 - 000000c000]
  #7 [000000c000 - 000000f000]          PGTABLE ==> [000000c000 - 000000f000]
found SMP MP-table at [ffff8800000f3f00] f3f00
 [ffffea0000000000-ffffea0005ffffff] PMD -> [ffff880028600000-ffff88002dbfffff] on node 0
Zone PFN ranges:
  DMA      0x00000000 -> 0x00001000
  DMA32    0x00001000 -> 0x00100000
  Normal   0x00100000 -> 0x001b0000
Movable zone start PFN for each node
early_node_map[4] active PFN ranges
    0: 0x00000000 -> 0x00000001
    0: 0x00000006 -> 0x00000099
    0: 0x00000100 -> 0x000cf590
    0: 0x00100000 -> 0x001b0000
On node 0 totalpages: 1570084
  DMA zone: 56 pages used for memmap
  DMA zone: 112 pages reserved
  DMA zone: 3820 pages, LIFO batch:0
  DMA32 zone: 14280 pages used for memmap
  DMA32 zone: 830920 pages, LIFO batch:31
  Normal zone: 9856 pages used for memmap
  Normal zone: 711040 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x408
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x03] enabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
ACPI: IOAPIC (id[0x04] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 4, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Using ACPI (MADT) for SMP configuration information
ACPI: HPET id: 0x8086a201 base: 0xfed00000
SMP: Allowing 4 CPUs, 0 hotplug CPUs
nr_irqs_gsi: 24
PM: Registered nosave memory: 0000000000001000 - 0000000000006000
PM: Registered nosave memory: 0000000000099000 - 000000000009a000
PM: Registered nosave memory: 000000000009a000 - 00000000000a0000
PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000
PM: Registered nosave memory: 00000000000f0000 - 0000000000100000
PM: Registered nosave memory: 00000000cf590000 - 00000000cf5e3000
PM: Registered nosave memory: 00000000cf5e3000 - 00000000cf5f0000
PM: Registered nosave memory: 00000000cf5f0000 - 00000000cf600000
PM: Registered nosave memory: 00000000cf600000 - 00000000e0000000
PM: Registered nosave memory: 00000000e0000000 - 00000000f0000000
PM: Registered nosave memory: 00000000f0000000 - 00000000fec00000
PM: Registered nosave memory: 00000000fec00000 - 0000000100000000
Allocating PCI resources starting at cf600000 (gap: cf600000:10a00000)
NR_CPUS:16 nr_cpumask_bits:16 nr_cpu_ids:4 nr_node_ids:1
PERCPU: Embedded 28 pages at ffff880028034000, static data 82400 bytes
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 1545780
Kernel command line: root=/dev/disk/by-uuid/e9367b57-da02-48bf-9cd7-d41ce53520a3 ro
PID hash table entries: 4096 (order: 12, 32768 bytes)
Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
Initializing CPU#0
Checking aperture...
No AGP bridge found
Calgary: detecting Calgary via BIOS EBDA area
Calgary: Unable to locate Rio Grande table in EBDA - bailing!
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Placing 64MB software IO TLB between ffff880020000000 - ffff880024000000
software IO TLB at phys 0x20000000 - 0x24000000
Memory: 6104592k/7077888k available (3629k kernel code, 797552k absent, 174760k reserved, 1500k data, 484k init)
SLUB: Genslabs=13, HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
NR_IRQS:768
Fast TSC calibration using PIT
Detected 2393.837 MHz processor.
Console: colour VGA+ 80x25
console [tty0] enabled
hpet clockevent registered
HPET: 4 timers in total, 0 timers will be used for per-cpu timer
Calibrating delay loop (skipped), value calculated using timer frequency.. 4789.16 BogoMIPS (lpj=7979456)
Security Framework initialized
Mount-cache hash table entries: 256
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
mce: CPU supports 6 MCE banks
CPU0: Thermal monitoring enabled (TM2)
using mwait in idle threads.
Performance Counters: Core2 events, Intel PMU driver.
... version:                 2
... bit width:               40
... generic counters:        2
... value mask:              000000ffffffffff
... max period:              000000007fffffff
... fixed-purpose counters:  3
... counter mask:            0000000700000003
ACPI: Core revision 20090521
Setting APIC routing to flat
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz stepping 0b
Booting processor 1 APIC 0x1 ip 0x6000
Initializing CPU#1
Calibrating delay using timer specific routine.. 4789.48 BogoMIPS (lpj=7979981)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
mce: CPU supports 6 MCE banks
CPU1: Thermal monitoring enabled (TM2)
x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106
CPU1: Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz stepping 0b
checking TSC synchronization [CPU#0 -> CPU#1]: passed.
Booting processor 2 APIC 0x3 ip 0x6000
Initializing CPU#2
Calibrating delay using timer specific routine.. 4789.51 BogoMIPS (lpj=7980021)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 3
mce: CPU supports 6 MCE banks
CPU2: Thermal monitoring enabled (TM2)
x86 PAT enabled: cpu 2, old 0x7040600070406, new 0x7010600070106
CPU2: Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz stepping 0b
checking TSC synchronization [CPU#0 -> CPU#2]: passed.
Booting processor 3 APIC 0x2 ip 0x6000
Initializing CPU#3
Calibrating delay using timer specific routine.. 4789.56 BogoMIPS (lpj=7980102)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 2
mce: CPU supports 6 MCE banks
CPU3: Thermal monitoring enabled (TM2)
x86 PAT enabled: cpu 3, old 0x7040600070406, new 0x7010600070106
CPU3: Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz stepping 0b
checking TSC synchronization [CPU#0 -> CPU#3]: passed.
Brought up 4 CPUs
Total of 4 processors activated (19159.72 BogoMIPS).
CPU0 attaching sched-domain:
 domain 0: span 0-1 level MC
  groups: 0 1
  domain 1: span 0-3 level CPU
   groups: 0-1 2-3
CPU1 attaching sched-domain:
 domain 0: span 0-1 level MC
  groups: 1 0
  domain 1: span 0-3 level CPU
   groups: 0-1 2-3
CPU2 attaching sched-domain:
 domain 0: span 2-3 level MC
  groups: 2 3
  domain 1: span 0-3 level CPU
   groups: 2-3 0-1
CPU3 attaching sched-domain:
 domain 0: span 2-3 level MC
  groups: 3 2
  domain 1: span 0-3 level CPU
   groups: 2-3 0-1
Booting paravirtualized kernel on bare hardware
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
PCI: MCFG area at e0000000 reserved in E820
PCI: Using MMCONFIG at e0000000 - efffffff
PCI: Using configuration type 1 for base access
mtrr: your CPUs had inconsistent fixed MTRR settings
mtrr: probably your BIOS does not setup all CPUs.
mtrr: corrected configuration.
bio: create slab <bio-0> at 0
ACPI: EC: Look up EC in DSDT
ACPI: Interpreter enabled
ACPI: (supports S0 S3 S4 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: No dock devices found.
ACPI: PCI Root Bridge [PCI0] (0000:00)
pci 0000:00:01.0: PME# supported from D0 D3hot D3cold
pci 0000:00:01.0: PME# disabled
pci 0000:00:02.0: reg 10 32bit mmio: [0xfdf00000-0xfdf7ffff]
pci 0000:00:02.0: reg 14 io port: [0xff00-0xff07]
pci 0000:00:02.0: reg 18 32bit mmio: [0xd0000000-0xdfffffff]
pci 0000:00:02.0: reg 1c 32bit mmio: [0xfdc00000-0xfdcfffff]
pci 0000:00:19.0: reg 10 32bit mmio: [0xfdfc0000-0xfdfdffff]
pci 0000:00:19.0: reg 14 32bit mmio: [0xfdfff000-0xfdffffff]
pci 0000:00:19.0: reg 18 io port: [0xfe00-0xfe1f]
pci 0000:00:19.0: PME# supported from D0 D3hot D3cold
pci 0000:00:19.0: PME# disabled
pci 0000:00:1a.0: reg 20 io port: [0xfd00-0xfd1f]
pci 0000:00:1a.1: reg 20 io port: [0xfc00-0xfc1f]
pci 0000:00:1a.2: reg 20 io port: [0xfb00-0xfb1f]
pci 0000:00:1a.7: reg 10 32bit mmio: [0xfdffe000-0xfdffe3ff]
pci 0000:00:1a.7: PME# supported from D0 D3hot D3cold
pci 0000:00:1a.7: PME# disabled
pci 0000:00:1d.0: reg 20 io port: [0xfa00-0xfa1f]
pci 0000:00:1d.1: reg 20 io port: [0xf900-0xf91f]
pci 0000:00:1d.2: reg 20 io port: [0xf800-0xf81f]
pci 0000:00:1d.7: reg 10 32bit mmio: [0xfdffd000-0xfdffd3ff]
pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
pci 0000:00:1d.7: PME# disabled
pci 0000:00:1f.0: quirk: region 0400-047f claimed by ICH6 ACPI/GPIO/TCO
pci 0000:00:1f.0: quirk: region 0480-04bf claimed by ICH6 GPIO
pci 0000:00:1f.0: ICH7 LPC Generic IO decode 1 PIO at 0800 (mask 003f)
pci 0000:00:1f.0: ICH7 LPC Generic IO decode 2 PIO at 0290 (mask 003f)
pci 0000:00:1f.2: reg 10 io port: [0xf700-0xf707]
pci 0000:00:1f.2: reg 14 io port: [0xf600-0xf603]
pci 0000:00:1f.2: reg 18 io port: [0xf500-0xf507]
pci 0000:00:1f.2: reg 1c io port: [0xf400-0xf403]
pci 0000:00:1f.2: reg 20 io port: [0xf300-0xf31f]
pci 0000:00:1f.2: reg 24 32bit mmio: [0xfdffc000-0xfdffc7ff]
pci 0000:00:1f.2: PME# supported from D3hot
pci 0000:00:1f.2: PME# disabled
pci 0000:00:1f.3: reg 10 64bit mmio: [0xfdffb000-0xfdffb0ff]
pci 0000:00:1f.3: reg 20 io port: [0x500-0x51f]
pci 0000:00:01.0: bridge io port: [0xd000-0xdfff]
pci 0000:00:01.0: bridge 32bit mmio: [0xfda00000-0xfdafffff]
pci 0000:00:01.0: bridge 64bit mmio pref: [0xfdb00000-0xfdbfffff]
pci 0000:02:01.0: reg 10 io port: [0xcf00-0xcf3f]
pci 0000:02:01.0: supports D1 D2
pci 0000:02:01.1: reg 10 io port: [0xce00-0xce07]
pci 0000:02:01.1: supports D1 D2
pci 0000:02:01.2: reg 10 32bit mmio: [0xfdeff000-0xfdeff7ff]
pci 0000:02:01.2: reg 14 32bit mmio: [0xfdef8000-0xfdefbfff]
pci 0000:02:01.2: supports D1 D2
pci 0000:02:01.2: PME# supported from D0 D1 D2 D3hot
pci 0000:02:01.2: PME# disabled
pci 0000:00:1e.0: transparent bridge
pci 0000:00:1e.0: bridge io port: [0xc000-0xcfff]
pci 0000:00:1e.0: bridge 32bit mmio: [0xfde00000-0xfdefffff]
pci 0000:00:1e.0: bridge 64bit mmio pref: [0xfdd00000-0xfddfffff]
pci_bus 0000:00: on NUMA node 0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 *4 5 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 9 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 9 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNK0] (IRQs *3 4 5 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 7 *9 10 11 12 14 15)
PCI: Using ACPI for IRQ routing
NetLabel: Initializing
NetLabel:  domain hash size = 128
NetLabel:  protocols = UNLABELED CIPSOv4
NetLabel:  unlabeled traffic allowed by default
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 12 devices
ACPI: ACPI bus type pnp unregistered
system 00:01: ioport range 0x4d0-0x4d1 has been reserved
system 00:01: ioport range 0x800-0x87f has been reserved
system 00:01: ioport range 0x290-0x297 has been reserved
system 00:01: ioport range 0x880-0x88f has been reserved
system 00:08: ioport range 0x400-0x4bf could not be reserved
system 00:0a: iomem range 0xe0000000-0xefffffff has been reserved
system 00:0b: iomem range 0xf0000-0xfffff could not be reserved
system 00:0b: iomem range 0xcf600000-0xcf6fffff could not be reserved
system 00:0b: iomem range 0xfed00000-0xfed000ff has been reserved
system 00:0b: iomem range 0xcf590000-0xcf5fffff could not be reserved
system 00:0b: iomem range 0x0-0x9ffff could not be reserved
system 00:0b: iomem range 0x100000-0xcf58ffff could not be reserved
system 00:0b: iomem range 0xfec00000-0xfec00fff could not be reserved
system 00:0b: iomem range 0xfed14000-0xfed1dfff has been reserved
system 00:0b: iomem range 0xfed20000-0xfed9ffff has been reserved
system 00:0b: iomem range 0xfee00000-0xfee00fff has been reserved
system 00:0b: iomem range 0xffb00000-0xffb7ffff has been reserved
system 00:0b: iomem range 0xfff00000-0xffffffff has been reserved
system 00:0b: iomem range 0xe0000-0xeffff has been reserved
pci 0000:00:01.0: PCI bridge, secondary bus 0000:01
pci 0000:00:01.0:   IO window: 0xd000-0xdfff
pci 0000:00:01.0:   MEM window: 0xfda00000-0xfdafffff
pci 0000:00:01.0:   PREFETCH window: 0x000000fdb00000-0x000000fdbfffff
pci 0000:00:1e.0: PCI bridge, secondary bus 0000:02
pci 0000:00:1e.0:   IO window: 0xc000-0xcfff
pci 0000:00:1e.0:   MEM window: 0xfde00000-0xfdefffff
pci 0000:00:1e.0:   PREFETCH window: 0x000000fdd00000-0x000000fddfffff
pci 0000:00:01.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:01.0: setting latency timer to 64
pci 0000:00:1e.0: setting latency timer to 64
pci_bus 0000:00: resource 0 io:  [0x00-0xffff]
pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
pci_bus 0000:01: resource 0 io:  [0xd000-0xdfff]
pci_bus 0000:01: resource 1 mem: [0xfda00000-0xfdafffff]
pci_bus 0000:01: resource 2 pref mem [0xfdb00000-0xfdbfffff]
pci_bus 0000:02: resource 0 io:  [0xc000-0xcfff]
pci_bus 0000:02: resource 1 mem: [0xfde00000-0xfdefffff]
pci_bus 0000:02: resource 2 pref mem [0xfdd00000-0xfddfffff]
pci_bus 0000:02: resource 3 io:  [0x00-0xffff]
pci_bus 0000:02: resource 4 mem: [0x000000-0xffffffffffffffff]
NET: Registered protocol family 2
IP route cache hash table entries: 262144 (order: 9, 2097152 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
NET: Registered protocol family 1
Unpacking initramfs...
Freeing initrd memory: 1018k freed
Scanning for low memory corruption every 60 seconds
audit: initializing netlink socket (disabled)
type=2000 audit(1256998169.436:1): initialized
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
msgmni has been set to 11926
alg: No test for stdrng (krng)
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
pci 0000:00:02.0: Boot video device
pcieport-driver 0000:00:01.0: irq 24 for MSI/MSI-X
pcieport-driver 0000:00:01.0: setting latency timer to 64
Linux agpgart interface v0.103
Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
input: Macintosh mouse button emulation as /devices/virtual/input/input0
PNP: No PS/2 controller found. Probing ports directly.
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
cpuidle: using governor ladder
cpuidle: using governor menu
TCP cubic registered
NET: Registered protocol family 17
registered taskstats version 1
Initalizing network drop monitor service
Freeing unused kernel memory: 484k freed
agpgart-intel 0000:00:00.0: Intel G33 Chipset
agpgart-intel 0000:00:00.0: detected 7164K stolen memory
agpgart-intel 0000:00:00.0: AGP aperture is 256M @ 0xd0000000
[drm] Initialized drm 1.1.0 20060810
i915 0000:00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
i915 0000:00:02.0: setting latency timer to 64
mtrr: no more MTRRs available
[drm] MTRR allocation failed.  Graphics performance may suffer.
i915 0000:00:02.0: irq 25 for MSI/MSI-X
Switched to high resolution mode on CPU 1
Switched to high resolution mode on CPU 3
Switched to high resolution mode on CPU 2
Switched to high resolution mode on CPU 0
[drm] DAC-6: set mode 1920x1080 11
Console: switching to colour frame buffer device 240x67
[drm] fb0: inteldrmfb frame buffer device
[drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
FDC 0 is a post-1991 82077
SCSI subsystem initialized
libata version 3.00 loaded.
ahci 0000:00:1f.2: version 3.0
ahci 0000:00:1f.2: PCI INT A -> GSI 19 (level, low) -> IRQ 19
ahci 0000:00:1f.2: irq 26 for MSI/MSI-X
ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0x3f impl RAID mode
ahci 0000:00:1f.2: flags: 64bit ncq sntf led clo pmp pio slum part ems 
ahci 0000:00:1f.2: setting latency timer to 64
scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
scsi4 : ahci
scsi5 : ahci
ata1: SATA max UDMA/133 abar m2048@0xfdffc000 port 0xfdffc100 irq 26
ata2: SATA max UDMA/133 abar m2048@0xfdffc000 port 0xfdffc180 irq 26
ata3: SATA max UDMA/133 abar m2048@0xfdffc000 port 0xfdffc200 irq 26
ata4: SATA max UDMA/133 abar m2048@0xfdffc000 port 0xfdffc280 irq 26
ata5: SATA max UDMA/133 abar m2048@0xfdffc000 port 0xfdffc300 irq 26
ata6: SATA max UDMA/133 abar m2048@0xfdffc000 port 0xfdffc380 irq 26
ata3: SATA link down (SStatus 0 SControl 300)
ata2: SATA link down (SStatus 0 SControl 300)
ata4: SATA link down (SStatus 0 SControl 300)
ata6: SATA link down (SStatus 0 SControl 300)
ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata5.00: ATAPI: PLDS DVD+/-RW DH-16A6S, YD11, max UDMA/100
ata5.00: configured for UDMA/100
ata1.00: ATA-7: SAMSUNG HD322HJ, 1AC01113, max UDMA7
ata1.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access     ATA      SAMSUNG HD322HJ  1AC0 PQ: 0 ANSI: 5
scsi 4:0:0:0: CD-ROM            PLDS     DVD+-RW DH-16A6S YD11 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 625142448 512-byte logical blocks: (320 GB/298 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 >
sd 0:0:0:0: [sda] Attached SCSI disk
sr0: scsi3-mmc drive: 48x/12x writer dvd-ram cd/rw xa/form2 cdda tray
Uniform CD-ROM driver Revision: 3.20
sr 4:0:0:0: Attached scsi CD-ROM sr0
EXT4-fs (sda2): barriers enabled
kjournald2 starting: pid 557, dev sda2:8, commit interval 5 seconds
EXT4-fs (sda2): delayed allocation enabled
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
EXT4-fs (sda2): mounted filesystem with ordered data mode
rtc_cmos 00:04: RTC can wake from S4
rtc_cmos 00:04: rtc core: registered rtc_cmos as rtc0
rtc0: alarms up to one month, 242 bytes nvram, hpet irqs
udev: starting version 146
i801_smbus 0000:00:1f.3: PCI INT B -> GSI 18 (level, low) -> IRQ 18
input: PC Speaker as /devices/platform/pcspkr/input/input1
input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input2
ACPI: Power Button [PWRF]
input: Power Button as /devices/LNXSYSTM:00/device:00/PNP0C0C:00/input/input3
ACPI: Power Button [PWRB]
dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.2)
e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
e1000e: Copyright (c) 1999-2008 Intel Corporation.
e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
e1000e 0000:00:19.0: pci_enable_pcie_error_reporting failed 0xfffffffb
e1000e 0000:00:19.0: setting latency timer to 64
e1000e 0000:00:19.0: irq 27 for MSI/MSI-X
iTCO_vendor_support: vendor-support=0
sd 0:0:0:0: Attached scsi generic sg0 type 0
sr 4:0:0:0: Attached scsi generic sg1 type 5
ACPI: SSDT 00000000cf5e7680 0022A (v01  PmRef  Cpu0Ist 00003000 INTL 20041203)
processor LNXCPU:00: registered as cooling_device0
ACPI: SSDT 00000000cf5e7b40 00152 (v01  PmRef  Cpu1Ist 00003000 INTL 20041203)
processor LNXCPU:01: registered as cooling_device1
ACPI: SSDT 00000000cf5e7ca0 00152 (v01  PmRef  Cpu2Ist 00003000 INTL 20041203)
processor LNXCPU:02: registered as cooling_device2
ACPI: SSDT 00000000cf5e7e00 00152 (v01  PmRef  Cpu3Ist 00003000 INTL 20041203)
processor LNXCPU:03: registered as cooling_device3
iTCO_wdt: Intel TCO WatchDog Timer Driver v1.05
iTCO_wdt: Found a ICH9R TCO device (Version=2, TCOBASE=0x0460)
iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
gameport: EMU10K1 is pci0000:02:01.1/gameport0, io 0xce00, speed 1007kHz
fan PNP0C0B:00: registered as cooling_device4
ACPI: Fan [FAN] (on)
thermal LNXTHERM:01: registered as thermal_zone0
ACPI: Thermal Zone [THRM] (40 C)
0000:00:19.0: eth0: (PCI Express:2.5GB/s:Width x1) 00:21:9b:06:4c:c9
0000:00:19.0: eth0: Intel(R) PRO/10/100 Network Connection
0000:00:19.0: eth0: MAC: 7, PHY: 7, PBA No: ffffff-0ff
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
ohci1394 0000:02:01.2: PCI INT B -> GSI 17 (level, low) -> IRQ 17
ohci1394 0000:02:01.2: setting latency timer to 64
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci_hcd 0000:00:1a.7: PCI INT C -> GSI 18 (level, low) -> IRQ 18
ehci_hcd 0000:00:1a.7: setting latency timer to 64
ehci_hcd 0000:00:1a.7: EHCI Host Controller
ehci_hcd 0000:00:1a.7: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:1a.7: debug port 1
ehci_hcd 0000:00:1a.7: cache line size of 32 is not supported
ehci_hcd 0000:00:1a.7: irq 18, io mem 0xfdffe000
ehci_hcd 0000:00:1a.7: USB 2.0 started, EHCI 1.00
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 6 ports detected
ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 23 (level, low) -> IRQ 23
ehci_hcd 0000:00:1d.7: setting latency timer to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 2
ohci1394: fw-host0: OHCI-1394 1.1 (PCI): IRQ=[17]  MMIO=[fdeff000-fdeff7ff]  Max Packet=[2048]  IR/IT contexts=[4/8]
ehci_hcd 0000:00:1d.7: debug port 1
ehci_hcd 0000:00:1d.7: cache line size of 32 is not supported
ehci_hcd 0000:00:1d.7: irq 23, io mem 0xfdffd000
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 6 ports detected
uhci_hcd: USB Universal Host Controller Interface driver
uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
uhci_hcd 0000:00:1a.0: setting latency timer to 64
uhci_hcd 0000:00:1a.0: UHCI Host Controller
uhci_hcd 0000:00:1a.0: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1a.0: irq 16, io base 0x0000fd00
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
uhci_hcd 0000:00:1a.1: PCI INT B -> GSI 21 (level, low) -> IRQ 21
uhci_hcd 0000:00:1a.1: setting latency timer to 64
uhci_hcd 0000:00:1a.1: UHCI Host Controller
uhci_hcd 0000:00:1a.1: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:1a.1: irq 21, io base 0x0000fc00
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
uhci_hcd 0000:00:1a.2: PCI INT D -> GSI 19 (level, low) -> IRQ 19
uhci_hcd 0000:00:1a.2: setting latency timer to 64
uhci_hcd 0000:00:1a.2: UHCI Host Controller
uhci_hcd 0000:00:1a.2: new USB bus registered, assigned bus number 5
uhci_hcd 0000:00:1a.2: irq 19, io base 0x0000fb00
usb usb5: configuration #1 chosen from 1 choice
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23
uhci_hcd 0000:00:1d.0: setting latency timer to 64
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 6
uhci_hcd 0000:00:1d.0: irq 23, io base 0x0000fa00
usb usb6: configuration #1 chosen from 1 choice
hub 6-0:1.0: USB hub found
hub 6-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19
uhci_hcd 0000:00:1d.1: setting latency timer to 64
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 7
uhci_hcd 0000:00:1d.1: irq 19, io base 0x0000f900
usb usb7: configuration #1 chosen from 1 choice
hub 7-0:1.0: USB hub found
hub 7-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
uhci_hcd 0000:00:1d.2: setting latency timer to 64
uhci_hcd 0000:00:1d.2: UHCI Host Controller
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 8
uhci_hcd 0000:00:1d.2: irq 18, io base 0x0000f800
usb usb8: configuration #1 chosen from 1 choice
hub 8-0:1.0: USB hub found
hub 8-0:1.0: 2 ports detected
EMU10K1_Audigy 0000:02:01.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
EMU10K1_Audigy 0000:02:01.0: setting latency timer to 64
Installing spdif_bug patch: SB Audigy 2 ZS [SB0350]
usb 5-1: new full speed USB device using uhci_hcd and address 2
usb 5-1: configuration #1 chosen from 1 choice
hub 5-1:1.0: USB hub found
hub 5-1:1.0: 4 ports detected
usb 5-1.1: new low speed USB device using uhci_hcd and address 3
usb 5-1.1: configuration #1 chosen from 1 choice
ieee1394: Host added: ID:BUS[0-00:1023]  GUID[00023c01510c1ac4]
usb 5-1.3: new low speed USB device using uhci_hcd and address 4
usbcore: registered new interface driver hiddev
input: Logitech Logitech Gaming Keyboard as /devices/pci0000:00/0000:00:1a.2/usb5/5-1/5-1.1/5-1.1:1.0/input/input4
generic-usb 0003:046D:C221.0001: input,hidraw0: USB HID v1.10 Keyboard [Logitech Logitech Gaming Keyboard] on usb-0000:00:1a.2-1.1/input0
input: Logitech Logitech Gaming Keyboard as /devices/pci0000:00/0000:00:1a.2/usb5/5-1/5-1.1/5-1.1:1.1/input/input5
generic-usb 0003:046D:C221.0002: input,hiddev0,hidraw1: USB HID v1.10 Device [Logitech Logitech Gaming Keyboard] on usb-0000:00:1a.2-1.1/input1
usbcore: registered new interface driver usbhid
usbhid: v2.6:USB HID core driver
usb 5-1.3: configuration #1 chosen from 1 choice
input: Logitech USB-PS/2 Optical Mouse as /devices/pci0000:00/0000:00:1a.2/usb5/5-1/5-1.3/5-1.3:1.0/input/input6
generic-usb 0003:046D:C01E.0003: input,hidraw2: USB HID v1.10 Mouse [Logitech USB-PS/2 Optical Mouse] on usb-0000:00:1a.2-1.3/input0
usb 5-1.4: new full speed USB device using uhci_hcd and address 5
usb 5-1.4: configuration #1 chosen from 1 choice
input: G15 Keyboard G15 Keyboard as /devices/pci0000:00/0000:00:1a.2/usb5/5-1/5-1.4/5-1.4:1.0/input/input7
generic-usb 0003:046D:C222.0004: input,hiddev1,hidraw3: USB HID v1.11 Keypad [G15 Keyboard G15 Keyboard] on usb-0000:00:1a.2-1.4/input0
EXT4-fs (sda2): internal journal on sda2:8
EXT4-fs (sda5): barriers enabled
kjournald2 starting: pid 1140, dev sda5:8, commit interval 5 seconds
EXT4-fs (sda5): internal journal on sda5:8
EXT4-fs (sda5): Ignoring delalloc option - requested data journaling mode
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
EXT4-fs (sda5): mounted filesystem with journalled data mode
EXT4-fs (sda6): barriers enabled
kjournald2 starting: pid 1143, dev sda6:8, commit interval 5 seconds
EXT4-fs (sda6): internal journal on sda6:8
EXT4-fs (sda6): delayed allocation enabled
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
EXT4-fs (sda6): mounted filesystem with ordered data mode
loop: module loaded
Adding 3212992k swap on /dev/sda3.  Priority:-1 extents:1 across:3212992k 
e1000e 0000:00:19.0: irq 27 for MSI/MSI-X
e1000e 0000:00:19.0: irq 27 for MSI/MSI-X
ip_tables: (C) 2000-2006 Netfilter Core Team
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Please use
nf_conntrack.acct=1 kernel parameter, acct=1 nf_conntrack module option or
sysctl net.netfilter.nf_conntrack_acct=1 to enable it.
e1000e: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
0000:00:19.0: eth0: 10/100 speed: disabling TSO
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
eth0: no IPv6 routers present

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-10-31 18:44 ` Frans Pop
@ 2009-11-11 20:19   ` Caleb Cushing
  2009-11-11 21:47     ` Andi Kleen
  0 siblings, 1 reply; 55+ messages in thread
From: Caleb Cushing @ 2009-11-11 20:19 UTC (permalink / raw)
  To: Frans Pop; +Cc: linux-kernel, netdev

not to be impatient or ungrateful, but has anyone had time to look at
this? will I be able to upgrade to 31.6? will 32.0 work?

On Sat, Oct 31, 2009 at 1:44 PM, Frans Pop <elendil@planet.nl> wrote:
> Adding netdev in CC. Original message + attachments follow.
>
> =========
> so ever since arch rolled out 2.6.31.x I've been having problems with
> my network (again) where I've been losing a large amount of packets
> (just testing with mtr somewhere between 30/50%). first I figured it
> was the same problem as I had in 2.6.30.x (and maybe it is?) but that
> appeared to get fixed. when I started bisecting the bug wasn't
> apparent in 2.6.31.0 but I knew for sure it was in .5 (I couldn't
> remember if I had noticed it again in .3)
>
> I'm attaching the bisection log and a 'good' dmesg output.
>
> c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9 is the first bad commit
>
> I'm not going to pretend to understand why this patch is breaking my
> networking but between bisection and testing it appears to be... I've
> never bisected before and I'm definitely not a kernel hacker (I can
> barely read C).
>
> I should also note that the wireshark dump here
> http://bugzilla.kernel.org/show_bug.cgi?id=13835 is related to this.
> and if it's not the same bug then possibly a new one should be opened.
>
> P.S. I'm not subscribed to the list please CC me
>
> Caleb Cushing
> http://xenoterracide.blogspot.com
>
>



-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-11 20:19   ` Caleb Cushing
@ 2009-11-11 21:47     ` Andi Kleen
  2009-11-11 22:05       ` Frans Pop
  0 siblings, 1 reply; 55+ messages in thread
From: Andi Kleen @ 2009-11-11 21:47 UTC (permalink / raw)
  To: Caleb Cushing; +Cc: Frans Pop, linux-kernel, netdev

Caleb Cushing <xenoterracide@gmail.com> writes:
>>
>> I'm attaching the bisection log and a 'good' dmesg output.
>>
>> c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9 is the first bad commit

Just gives fatal: bad object c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9
here on a standard Linus linux-2.6 tree.

It might be also useful if you could describe what kind
of network devices you use and how you determine
the packet loss.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-11 21:47     ` Andi Kleen
@ 2009-11-11 22:05       ` Frans Pop
  2009-11-11 22:48         ` Caleb Cushing
  0 siblings, 1 reply; 55+ messages in thread
From: Frans Pop @ 2009-11-11 22:05 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Caleb Cushing, linux-kernel, netdev

On Wednesday 11 November 2009, Andi Kleen wrote:
> Caleb Cushing <xenoterracide@gmail.com> writes:
> >> I'm attaching the bisection log and a 'good' dmesg output.
> >>
> >> c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9 is the first bad commit
>
> Just gives fatal: bad object c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9
> here on a standard Linus linux-2.6 tree.

Looks to be a commit from a stable update:

commit c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9
Author: Alan Stern <stern@rowland.harvard.edu>
Date:   Tue Sep 1 11:38:34 2009 -0400

    usb-serial: change referencing of port and serial structures

    commit 41bd34ddd7aa46dbc03b5bb33896e0fa8100fe7b upstream.

Cheers,
FJP

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-11 22:05       ` Frans Pop
@ 2009-11-11 22:48         ` Caleb Cushing
  2009-11-12 11:38           ` Jarek Poplawski
  0 siblings, 1 reply; 55+ messages in thread
From: Caleb Cushing @ 2009-11-11 22:48 UTC (permalink / raw)
  To: Frans Pop; +Cc: Andi Kleen, linux-kernel, netdev

On Wed, Nov 11, 2009 at 5:05 PM, Frans Pop <elendil@planet.nl> wrote:
> On Wednesday 11 November 2009, Andi Kleen wrote:
>> Caleb Cushing <xenoterracide@gmail.com> writes:
>> >> I'm attaching the bisection log and a 'good' dmesg output.
>> >>
>> >> c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9 is the first bad commit
>>
>> Just gives fatal: bad object c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9
>> here on a standard Linus linux-2.6 tree.
>
> Looks to be a commit from a stable update:
>
> commit c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9
> Author: Alan Stern <stern@rowland.harvard.edu>
> Date:   Tue Sep 1 11:38:34 2009 -0400
>
>    usb-serial: change referencing of port and serial structures
>
>    commit 41bd34ddd7aa46dbc03b5bb33896e0fa8100fe7b upstream.
>
> Cheers,
> FJP
>

yeah it is. it's from greg kroah-hartman's tree.
-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-11 22:48         ` Caleb Cushing
@ 2009-11-12 11:38           ` Jarek Poplawski
  2009-11-12 13:46               ` Caleb Cushing
  0 siblings, 1 reply; 55+ messages in thread
From: Jarek Poplawski @ 2009-11-12 11:38 UTC (permalink / raw)
  To: Caleb Cushing; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

On 11-11-2009 23:48, Caleb Cushing wrote:
> On Wed, Nov 11, 2009 at 5:05 PM, Frans Pop <elendil@planet.nl> wrote:
>> On Wednesday 11 November 2009, Andi Kleen wrote:
>>> Caleb Cushing <xenoterracide@gmail.com> writes:
>>>>> I'm attaching the bisection log and a 'good' dmesg output.
>>>>>
>>>>> c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9 is the first bad commit
>>> Just gives fatal: bad object c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9
>>> here on a standard Linus linux-2.6 tree.
>> Looks to be a commit from a stable update:
>>
>> commit c9fb3ded7a8a6769f3bcb3ef3d9aed61d3e376a9
>> Author: Alan Stern <stern@rowland.harvard.edu>
>> Date:   Tue Sep 1 11:38:34 2009 -0400
>>
>>    usb-serial: change referencing of port and serial structures
>>
>>    commit 41bd34ddd7aa46dbc03b5bb33896e0fa8100fe7b upstream.
>>
>> Cheers,
>> FJP
>>
> 
> yeah it is. it's from greg kroah-hartman's tree.

Could you answer the previous question too:

On 11-11-2009 22:47, Andi Kleen wrote:
...
> It might be also useful if you could describe what kind
> of network devices you use and how you determine
> the packet loss.

Btw, you didn't send the stats you compared, and your wireshark dump
doesn't show anything wrong either.

Jarek P.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-12 11:38           ` Jarek Poplawski
@ 2009-11-12 13:46               ` Caleb Cushing
  0 siblings, 0 replies; 55+ messages in thread
From: Caleb Cushing @ 2009-11-12 13:46 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

[-- Attachment #1: Type: text/plain, Size: 1024 bytes --]

> On 11-11-2009 22:47, Andi Kleen wrote:
> ...
>> It might be also useful if you could describe what kind
>> of network devices you use and how you determine
>> the packet loss.
>
> Btw, you didn't send the stats you compared, and your wireshark dump
> doesn't show anything wrong either.
>
> Jarek P.
>

I didn't see that sorry. I wasn't sure if the dump would or not (I'm
not a networking expert, just know more than the average joe).

from dmesg (networking device)

e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2

from lspci

00:19.0 Ethernet controller: Intel Corporation 82562V-2 10/100 Network
Connection (rev 02)

the attached png's show mtr with bad being when I have the problem.
for those not familiar mtr sends an icmp packet to each hop in 1
second then loops. when I'm having this kind of packet loss (and
sometimes it's higher) all services including dhcp, dns, and http (web
browsing) get flaky, or don't work at all (I really can't browse the
web).
-- 
Caleb Cushing

http://xenoterracide.blogspot.com

[-- Attachment #2: mtr_good.png --]
[-- Type: image/png, Size: 179391 bytes --]

[-- Attachment #3: mtr_bad.png --]
[-- Type: image/png, Size: 137799 bytes --]
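
For readers unfamiliar with mtr, the per-hop loss figure it reports is simply the share of probes that got no reply. A minimal sketch of that arithmetic follows; the function name and the sample counts are illustrative assumptions, not taken from the attached screenshots.

```python
def loss_percent(sent: int, received: int) -> float:
    """Return the share of probes that got no reply, as a percentage."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    return 100.0 * (sent - received) / sent


# Hypothetical counts: 10 probes sent to one hop, 7 replies seen.
print(loss_percent(10, 7))
```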

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-12 13:46               ` Caleb Cushing
  (?)
@ 2009-11-12 19:04               ` Jarek Poplawski
  2009-11-12 21:47                 ` Jarek Poplawski
  2009-11-13 16:25                 ` Caleb Cushing
  -1 siblings, 2 replies; 55+ messages in thread
From: Jarek Poplawski @ 2009-11-12 19:04 UTC (permalink / raw)
  To: Caleb Cushing; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

Caleb Cushing wrote, On 11/12/2009 02:46 PM:

>> On 11-11-2009 22:47, Andi Kleen wrote:
>> ...
>>> It might be also useful if you could describe what kind
>>> of network devices you use and how you determine
>>> the packet loss.
>> Btw, you didn't send the stats you compared, and your wireshark dump
>> doesn't show anything wrong either.
>>
>> Jarek P.
>>
> 
> I didn't see that sorry. I wasn't sure if the dump would or not (I'm
> not a networking expert, just know more than the average joe).
> 
> from dmesg (networking device)
> 
> e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
> 
> from lspci
> 
> 00:19.0 Ethernet controller: Intel Corporation 82562V-2 10/100 Network
> Connection (rev 02)

So I assume it's your only network device on this box and according
to these reports it's 192.168.1.3 with 192.168.1.1 as the gateway,
and your only change is kernel on this 192.168.1.3 box, right?

> the attached png's show mtr with bad being when I have the problem.
> for those not familiar mtr sends an icmp packet to each hop in 1
> second then loops. when I'm having this kind of packet loss (and
> sometimes it's higher) all services including dhcp, dns, and http (web
> browsing) get flaky, or don't work at all (I really can't browse the
> web).

Since the loss is seen on the first hop already, it seems it should be
enough to query 192.168.1.1 only - did you try this? If so, does this
happen from the beginning of the test or after many loops? Could you
try to repeat this wireshark dump with more data than before (but just
to be sure there are a few unanswered pings). If possible it would be
nice to have wireshark or tcpdump data from 192.168.1.1 too, while
pinged from 192.168.1.3. Please, send it gzipped to bugzilla only plus
ifconfig eth0 before and after the test (and let us know here).

Btw, mtr has text reporting too (--report). Larger things send to
bugzilla only.

Thanks,
Jarek P.
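
The "ifconfig eth0 before and after the test" comparison requested above could be sketched roughly as follows. This assumes the older net-tools output format ("RX packets:... errors:...") and a single interface per snapshot; the helper names and the sample numbers are made up for illustration.

```python
import re

# Matches the counter lines of the older net-tools ifconfig format,
# e.g. "RX packets:3465 errors:0 dropped:0 overruns:0 frame:0".
COUNTER_LINE = re.compile(
    r"^\s*(RX|TX) packets:(\d+) errors:(\d+) dropped:(\d+) overruns:(\d+)",
    re.MULTILINE,
)


def parse_counters(ifconfig_text):
    """Extract RX/TX packet, error, drop and overrun counters for one interface."""
    counters = {}
    for direction, packets, errors, dropped, overruns in COUNTER_LINE.findall(ifconfig_text):
        prefix = direction.lower()
        counters[prefix + "_packets"] = int(packets)
        counters[prefix + "_errors"] = int(errors)
        counters[prefix + "_dropped"] = int(dropped)
        counters[prefix + "_overruns"] = int(overruns)
    return counters


def counter_delta(before_text, after_text):
    """Return how much each counter grew between the two snapshots."""
    before = parse_counters(before_text)
    after = parse_counters(after_text)
    return {name: after[name] - before[name] for name in after}


# Hypothetical snapshots taken before and after a short ping test.
BEFORE = """\
eth0      Link encap:Ethernet
          RX packets:1000 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2000 errors:0 dropped:0 overruns:0 carrier:0
"""
AFTER = """\
eth0      Link encap:Ethernet
          RX packets:1035 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2100 errors:0 dropped:0 overruns:0 carrier:0
"""

delta = counter_delta(BEFORE, AFTER)
for name in sorted(delta):
    print(name, delta[name])
```

If the error, dropped and overrun deltas stay at zero while far fewer packets come back than went out, the interface itself is not reporting the loss, which points the search elsewhere.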

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-12 19:04               ` Jarek Poplawski
@ 2009-11-12 21:47                 ` Jarek Poplawski
  2009-11-13 16:25                 ` Caleb Cushing
  1 sibling, 0 replies; 55+ messages in thread
From: Jarek Poplawski @ 2009-11-12 21:47 UTC (permalink / raw)
  To: Caleb Cushing; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

Jarek Poplawski wrote, On 11/12/2009 08:04 PM:

> Caleb Cushing wrote, On 11/12/2009 02:46 PM:
...
>>> Btw, you didn't send the stats you compared, and your wireshark dump
>>> doesn't show anything wrong either.

...

>> I didn't see that sorry. I wasn't sure if the dump would or not (I'm
>> not a networking expert, just know more than the average joe).


Hmm... I didn't see that either, sorry! After re-checking I can see
unanswered requests in this dump. Anyway, the main thing to test now is
the first hop to 192.168.1.1 (some info about it?), as I wrote before.

Jarek P.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-12 19:04               ` Jarek Poplawski
  2009-11-12 21:47                 ` Jarek Poplawski
@ 2009-11-13 16:25                 ` Caleb Cushing
  2009-11-13 17:21                   ` Caleb Cushing
  2009-11-13 21:16                   ` Jarek Poplawski
  1 sibling, 2 replies; 55+ messages in thread
From: Caleb Cushing @ 2009-11-13 16:25 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

> So I assume it's your only network device on this box and according
> to these reports it's 192.168.1.3 with 192.168.1.1 as the gateway,
> and your only change is kernel on this 192.168.1.3 box, right?

yes, and semi obviously that router is my box (Linksys WRT54GL,
OpenWrt Kamikaze 8.09.1).

> Since the loss is seen on the first hop already, it seems it should be
> enough to query 192.168.1.1 only - did you try this? If so, does this
> happen from the beginning of the test or after many loops? Could you
> try to repeat this wireshark dump with more data than before (but just
> to be sure there are a few unanswered pings). If possible it would be
> nice to have wireshark or tcpdump data from 192.168.1.1 too, while
> pinged from 192.168.1.3. Please, send it gzipped to bugzilla only plus
> ifconfig eth0 before and after the test (and let us know here).

same bug? or new bug? I can see what I can do to get a tcpdump from
the router.  yes I tried that, I can tell within the first 10 pings. I
should say I don't notice it on every kernel boot, it's ~80% of
reboots (but that's pulled from my behind). but I haven't noticed it
on gfa31221 at all. it's reproducible in 31.6 too (arch just added
that).

> Btw, mtr has text reporting too (--report). Larger things send to
> bugzilla only.

didn't know that, although I should have guessed (or rtfm), thanks.


-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-13 16:25                 ` Caleb Cushing
@ 2009-11-13 17:21                   ` Caleb Cushing
  2009-11-13 21:16                   ` Jarek Poplawski
  1 sibling, 0 replies; 55+ messages in thread
From: Caleb Cushing @ 2009-11-13 17:21 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

any specific switches I should run tcpdump with? or any other tests I
should be trying while capturing? (on either end).

-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-13 16:25                 ` Caleb Cushing
  2009-11-13 17:21                   ` Caleb Cushing
@ 2009-11-13 21:16                   ` Jarek Poplawski
  2009-11-18  9:59                     ` Caleb Cushing
  1 sibling, 1 reply; 55+ messages in thread
From: Jarek Poplawski @ 2009-11-13 21:16 UTC (permalink / raw)
  To: Caleb Cushing; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

On Fri, Nov 13, 2009 at 11:25:25AM -0500, Caleb Cushing wrote:
> > So I assume it's your only network device on this box and according
> > to these reports it's 192.168.1.3 with 192.168.1.1 as the gateway,
> > and your only change is kernel on this 192.168.1.3 box, right?
> 
> yes, and semi obviously that router is my box (Linksys WRT54GL,
> OpenWrt Kamikaze 8.09.1).
> 
> > Since the loss is seen on the first hop already, it seems it should be
> > enough to query 192.168.1.1 only - did you try this? If so, does this
> > happen from the beginning of the test or after many loops? Could you
> > try to repeat this wireshark dump with more data than before (but just
> > to be sure there are a few unanswered pings). If possible it would be
> > nice to have wireshark or tcpdump data from 192.168.1.1 too, while
> > pinged from 192.168.1.3. Please, send it gzipped to bugzilla only plus
> > ifconfig eth0 before and after the test (and let us know here).
> 
> same bug? or new bug? I can see what I can do to get a tcpdump from
> the router.  yes I tried that, I can tell within the first 10 pings. I
> should say I don't notice it on every kernel boot, it's ~80% of
> reboots (but that's pulled from my behind). but I haven't noticed it
> on gfa31221 at all. it's reproducible in 31.6 too (arch just added
> that).

Might be the same bugzilla report, I guess. We need to establish if
these pings reach 192.168.1.1, so a short test and tcpdump without any
special options just to get a few lost cases as seen on both sides.
(And ifconfigs before and after the test.)

Btw, could you check with lsmod if usbserial module is loaded before
this test? I'd like to verify this git bisection result. (If the
module is loaded or you have CONFIG_USB_SERIAL=y instead of m, try to
recompile the kernel with this option turned off, for this test.)

Thanks,
Jarek P.
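
The lsmod check asked for above can be sketched as a small text filter. The function name and the sample lsmod listing are assumptions for illustration. Note that a driver built into the kernel (CONFIG_USB_SERIAL=y) does not appear in lsmod at all, which is why the kernel config has to be checked separately.

```python
def module_loaded(lsmod_text: str, module: str) -> bool:
    """Return True if `module` appears in the first column of lsmod output."""
    for line in lsmod_text.splitlines():
        fields = line.split()
        if fields and fields[0] == module:
            return True
    return False


# Hypothetical lsmod output; sizes and use counts are invented.
SAMPLE_LSMOD = """\
Module                  Size  Used by
e1000e                120000  0
usbhid                 40000  0
"""

print(module_loaded(SAMPLE_LSMOD, "usbserial"))
print(module_loaded(SAMPLE_LSMOD, "e1000e"))
```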

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-13 21:16                   ` Jarek Poplawski
@ 2009-11-18  9:59                     ` Caleb Cushing
  2009-11-18 10:00                       ` Caleb Cushing
  2009-11-18 13:51                       ` Jarek Poplawski
  0 siblings, 2 replies; 55+ messages in thread
From: Caleb Cushing @ 2009-11-18  9:59 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

> Might be the same bugzilla report, I guess. We need to establish if
> these pings reach 192.168.1.1, so a short test and tcpdump without any
> special options just to get a few lost cases as seen on both sides.
> (And ifconfigs before and after the test.)
>
> Btw, could you check with lsmod if usbserial module is loaded before
> this test? I'd like to verify this git bisection result. (If the
> module is loaded or you have CONFIG_USB_SERIAL=y instead of m, try to
> recompile the kernel with this option turned off, for this test.)

sorry for taking so long to get back. busy problematic times.

the dumps and ifconfigs are a bit less 'clean' because the router
serves several other computers (none of which have this issue
(windows)) here's the ifconfig -a from the router.

usbserial is not loaded. actually from reading the patch submission I
suspected the official cause might be off... but I'm no kernel
programmer, all I know is where I could see the loss during tests. and I
haven't been able to reproduce over dozens of reboots with this
2.6.31.1-test-00091-gfa31221 kernel.

I totally forgot to do it during the dumps so I hope these are still useful

I haven't rebooted this in a few weeks (the router)

br-lan    Link encap:Ethernet  HWaddr 00:1D:7E:F8:21:66
          inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:60613991 errors:0 dropped:0 overruns:0 frame:0
          TX packets:67849334 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2172912561 (2.0 GiB)  TX bytes:3999263405 (3.7 GiB)

eth0      Link encap:Ethernet  HWaddr 00:1D:7E:F8:21:66
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:144116625 errors:0 dropped:0 overruns:0 frame:0
          TX packets:122639966 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1986512923 (1.8 GiB)  TX bytes:1548485891 (1.4 GiB)
          Interrupt:4

eth0.0    Link encap:Ethernet  HWaddr 00:1D:7E:F8:21:66
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:57318567 errors:0 dropped:0 overruns:0 frame:0
          TX packets:62317675 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:3466538358 (3.2 GiB)  TX bytes:2132301174 (1.9 GiB)

eth0.1    Link encap:Ethernet  HWaddr 00:1D:7E:F8:21:66
          inet addr:68.42.198.183  Bcast:255.255.255.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:86777655 errors:0 dropped:0 overruns:0 frame:0
          TX packets:60312064 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:205005516 (195.5 MiB)  TX bytes:3162930981 (2.9 GiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:168 errors:0 dropped:0 overruns:0 frame:0
          TX packets:168 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:19706 (19.2 KiB)  TX bytes:19706 (19.2 KiB)

wl0       Link encap:Ethernet  HWaddr 00:1D:7E:F8:21:68
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5114480 errors:0 dropped:0 overruns:0 frame:720205
          TX packets:7576790 errors:1902 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:762579947 (727.2 MiB)  TX bytes:3981402458 (3.7 GiB)
          Interrupt:2 Base address:0x5000

this is the ifconfig -a from my desktop while experiencing the issue

eth0      Link encap:Ethernet  HWaddr 00:21:9B:06:4C:C9
          inet addr:192.168.1.3  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::221:9bff:fe06:4cc9/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3465 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4951 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:1467320 (1.3 Mb)  TX bytes:631808 (617.0 Kb)
          Memory:fdfc0000-fdfe0000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:624 errors:0 dropped:0 overruns:0 frame:0
          TX packets:624 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:64397 (62.8 Kb)  TX bytes:64397 (62.8 Kb)


-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-18  9:59                     ` Caleb Cushing
@ 2009-11-18 10:00                       ` Caleb Cushing
  2009-11-18 13:51                       ` Jarek Poplawski
  1 sibling, 0 replies; 55+ messages in thread
From: Caleb Cushing @ 2009-11-18 10:00 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

p.s. dumps are on the old bug here...
http://bugzilla.kernel.org/show_bug.cgi?id=13835


-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-18  9:59                     ` Caleb Cushing
  2009-11-18 10:00                       ` Caleb Cushing
@ 2009-11-18 13:51                       ` Jarek Poplawski
  2009-11-18 18:21                         ` Caleb Cushing
  1 sibling, 1 reply; 55+ messages in thread
From: Jarek Poplawski @ 2009-11-18 13:51 UTC (permalink / raw)
  To: Caleb Cushing; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

On Wed, Nov 18, 2009 at 04:59:03AM -0500, Caleb Cushing wrote:
> > Might be the same bugzilla report, I guess. We need to establish if
> > these pings reach 192.168.1.1, so a short test and tcpdump without any
> > special options just to get a few lost cases as seen on both sides.
> > (And ifconfigs before and after the test.)
> >
> > Btw, could you check with lsmod if usbserial module is loaded before
> > this test? I'd like to verify this git bisection result. (If the
> > module is loaded or you have CONFIG_USB_SERIAL=y instead of m, try to
> > recompile the kernel with this option turned off, for this test.)
> 
> sorry for taking so long to get back. busy problematic times.
No problem, don't hurry.

> 
> the dumps and ifconfigs are a bit less 'clean' because the router
> serves several other computers (none of which have this issue
> (windows)) here's the ifconfig -a from the router.

Actually, I'm a little bit surprised. Maybe I missed something from
your previous messages, but I expected something more similar to the
first wireshark dump, which suggested to me there was only this mtr
traffic. Now there is a lot more (plus we know it's not all).

So, there is a basic question: can this mtr loss be seen while no
other traffic is present? After looking into these current dumps I
doubt it. There are e.g. 3 pings unanswered between 09:21:50 and
09:21:52 (21:31:34 to 21:31:38 router time), but a lot of tcp
packets to and from 192.168.1.3, so it looks like they were simply
dropped and we can guess the reason.
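
Comparing the two captures comes down to matching each echo request with its reply on both sides. A rough sketch, assuming the ICMP (id, sequence) pairs have already been extracted from each dump into sets; the function name, labels, and sample data are hypothetical.

```python
def classify_pings(desktop_requests, router_requests, router_replies, desktop_replies):
    """Label every echo request sent by the desktop.

    Each argument is a set of (icmp_id, icmp_seq) pairs seen in one capture.
    """
    result = {}
    for key in sorted(desktop_requests):
        if key in desktop_replies:
            result[key] = "answered"
        elif key not in router_requests:
            result[key] = "request never reached router"
        elif key not in router_replies:
            result[key] = "router saw request but sent no reply"
        else:
            result[key] = "reply sent but never reached desktop"
    return result


# Hypothetical capture contents for four pings with ICMP id 7.
desktop_requests = {(7, 1), (7, 2), (7, 3), (7, 4)}
router_requests = {(7, 1), (7, 3), (7, 4)}
router_replies = {(7, 1), (7, 4)}
desktop_replies = {(7, 1)}

for key, label in classify_pings(
    desktop_requests, router_requests, router_replies, desktop_replies
).items():
    print(key, label)
```

Because the two machines' clocks differ, matching on id and sequence number avoids having to line up timestamps between the dumps.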

> 
> usbserial is not loaded. actually from reading the patch submission I
> suspected the official cause might be off... but I'm no kernel
> programmer, all I know is where I could see the loss during tests. and I
> haven't been able to reproduce over dozens of reboots with this
> 2.6.31.1-test-00091-gfa31221 kernel.

Since this patch from the bisection is really limited to this one
module, I doubt we should follow this direction. IMHO it shows the
test wasn't reproducible enough. Probably the amount and/or kind of
other traffic really matters. If I'm wrong and missed something again,
let me know. Btw, could you check whether using ifconfig to change
the txqueuelen of the desktop's eth0 from 100 to 1000 changes
anything in this mtr test?

Jarek P.

> this is the ifconfig -a from my desktop while experiencing the issue
> 
> eth0      Link encap:Ethernet  HWaddr 00:21:9B:06:4C:C9
>           inet addr:192.168.1.3  Bcast:192.168.1.255  Mask:255.255.255.0
>           inet6 addr: fe80::221:9bff:fe06:4cc9/64 Scope:Link
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:3465 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:4951 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:100
>           RX bytes:1467320 (1.3 Mb)  TX bytes:631808 (617.0 Kb)
>           Memory:fdfc0000-fdfe0000
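
The counters in the quoted ifconfig output can also be pulled out by a
script, which makes it easier to compare runs before and after a change
such as "ifconfig eth0 txqueuelen 1000". The following is only a minimal
sketch: the function name parse_ifconfig_counters is made up for
illustration, and it assumes the classic net-tools output layout shown
in the quote (newer ifconfig versions print a different format).

```python
import re

# Lines copied from the quoted `ifconfig -a` output (desktop, eth0).
IFCONFIG_ETH0 = """\
eth0      Link encap:Ethernet  HWaddr 00:21:9B:06:4C:C9
          RX packets:3465 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4951 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
"""


def parse_ifconfig_counters(text):
    """Extract RX/TX counters and txqueuelen from old-style ifconfig text.

    Hypothetical helper: it only understands the net-tools layout above.
    """
    stats = {}
    for direction in ("RX", "TX"):
        match = re.search(
            rf"{direction} packets:(\d+) errors:(\d+) dropped:(\d+) overruns:(\d+)",
            text,
        )
        if match:
            packets, errors, dropped, overruns = map(int, match.groups())
            stats[direction] = {
                "packets": packets,
                "errors": errors,
                "dropped": dropped,
                "overruns": overruns,
            }
    match = re.search(r"txqueuelen:(\d+)", text)
    if match:
        stats["txqueuelen"] = int(match.group(1))
    return stats


if __name__ == "__main__":
    print(parse_ifconfig_counters(IFCONFIG_ETH0))
```

With the quoted values, both directions report zero errors and zero
drops, so whatever loss mtr sees is not being counted at this level on
the desktop.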

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-18 13:51                       ` Jarek Poplawski
@ 2009-11-18 18:21                         ` Caleb Cushing
  2009-11-18 20:10                           ` Jarek Poplawski
  0 siblings, 1 reply; 55+ messages in thread
From: Caleb Cushing @ 2009-11-18 18:21 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

> Actually, I'm a little bit surprised. Maybe I missed something from
> your previous messages, but I expected something more similar to the
> first wireshark dump, which suggested to me there was only this mtr
> traffic. Now there is a lot more (plus we know it's not all).

probably just me lazy at 5 am? did I do the dump on the router right
so it wasn't showing traffic that's just idling from other computers
(windows likes to make a lot of noise). I could do it by ip...

> So, there is a basic question: can this mtr loss be seen while no
> other traffic is present? After looking into these current dumps I
> doubt. There are e.g. 3 pings unanswered between 09:21:50 and
> 09:21:52 (21:31:34 to 21:31:38 router time), but a lot of tcp
> packets to and from 192.168.1.3, so looks like simply dropped and
> we can guess the reason.

yes. this was at a fairly low traffic time of day. 5am only 2 people
were up, and I was using the other computer during. I've had everyone
actively doing one or more of downloading/uploading/video/voip/gaming
stuff on this network with no noticeable packet loss. if really,
really needed I can probably restrict this network to 2 machines for
the duration of the test.

> Since this patch from the bisection is really limited to this one
> module I doubt we should follow this direction. IMHO it shows the
> test wasn't reproducible enough. Probably the amount and/or kind of
> other traffic really matter. If I'm wrong and missed something again
> let me know. Btw, could you try if changing with ifconfig the
> txqueuelen of desktop's eth0 from 100 to 1000 changes anything
> in this mtr test?

yeah testing it under my known working config first. I'll get back w/ you later.
-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-18 18:21                         ` Caleb Cushing
@ 2009-11-18 20:10                           ` Jarek Poplawski
  2009-11-18 22:38                             ` Jarek Poplawski
  0 siblings, 1 reply; 55+ messages in thread
From: Jarek Poplawski @ 2009-11-18 20:10 UTC (permalink / raw)
  To: Caleb Cushing; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

On Wed, Nov 18, 2009 at 01:21:19PM -0500, Caleb Cushing wrote:
> > So, there is a basic question: can this mtr loss be seen while no
> > other traffic is present? After looking into these current dumps I
> > doubt. There are e.g. 3 pings unanswered between 09:21:50 and
> > 09:21:52 (21:31:34 to 21:31:38 router time), but a lot of tcp
> > packets to and from 192.168.1.3, so looks like simply dropped and
> > we can guess the reason.
> 
> yes. this was at a fairly low traffic time of day. 5am only 2 people
> were up, and I was using the other computer during. I've had everyone
> actively doing one or more of downloading/uploading/video/voip/gaming
> stuff on this network with no noticeable packet loss. if really,
> really needed I can probably restrict this network to 2 machines for
> the duration of the test.

Alas, "fairly low traffic" can have fairly high surges, so it's not
easy to compare. Anyway, please check, if it's still available,
whether there were any messages from the NIC in syslog etc. during
this test (~09:21:50).

> 
> > Since this patch from the bisection is really limited to this one
> > module I doubt we should follow this direction. IMHO it shows the
> > test wasn't reproducible enough. Probably the amount and/or kind of
> > other traffic really matter. If I'm wrong and missed something again
> > let me know. Btw, could you try if changing with ifconfig the
> > txqueuelen of desktop's eth0 from 100 to 1000 changes anything
> > in this mtr test?
> 
> yeah testing it under my known working config first. I'll get back w/ you later.

Btw, since dropping at hardware (NIC) level seems more likely to me,
could you send the output of 'ethtool eth0' and 'ethtool -S eth0'
after such tests (both sides)?

Jarek P.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-18 20:10                           ` Jarek Poplawski
@ 2009-11-18 22:38                             ` Jarek Poplawski
  2009-11-22 19:35                               ` Caleb Cushing
  0 siblings, 1 reply; 55+ messages in thread
From: Jarek Poplawski @ 2009-11-18 22:38 UTC (permalink / raw)
  To: Caleb Cushing; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

On Wed, Nov 18, 2009 at 09:10:34PM +0100, Jarek Poplawski wrote:
> On Wed, Nov 18, 2009 at 01:21:19PM -0500, Caleb Cushing wrote:
> > yeah testing it under my known working config first. I'll get back w/ you later.
> 
> Btw, since dropping at hardware (NIC) level seems more likely to me,
> could you send 'ethtool eth0', and 'ethtool -S eth0' after such tests
> (both sides).

Hmm... and 'netstat -s' before and after the test (both sides).

Jarek P.
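
Comparing "netstat -s" output from before and after a test run comes
down to subtracting two sets of counters. The sketch below shows one way
to do that. It assumes the usual net-tools layout (a section header such
as "Tcp:" followed by indented "<number> <description>" lines); the
function names and the two sample snapshots are invented for
illustration and are not data from this thread.

```python
import re


def parse_netstat_s(text):
    """Map (section, description) -> counter for 'netstat -s' style text."""
    counters = {}
    section = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented line: a section header such as "Tcp:".
            section = line.strip().rstrip(":")
            continue
        match = re.match(r"\s+(\d+)\s+(.*\S)", line)
        if match and section:
            counters[(section, match.group(2).rstrip("."))] = int(match.group(1))
    return counters


def diff_counters(before, after):
    """Return only the counters that changed, as after minus before."""
    return {
        key: after[key] - before.get(key, 0)
        for key in after
        if after[key] != before.get(key, 0)
    }


# Made-up snapshots, only to show the shape of the comparison.
BEFORE = "Tcp:\n    10 segments send out\n    2 segments retransmited\n"
AFTER = "Tcp:\n    49 segments send out\n    19 segments retransmited\n"

if __name__ == "__main__":
    delta = diff_counters(parse_netstat_s(BEFORE), parse_netstat_s(AFTER))
    for (section, name), value in sorted(delta.items()):
        print(f"{section}: {name}: +{value}")
```

Lines that do not start with a number (some sections print
"Name: value" pairs instead) are skipped by this sketch.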

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-18 22:38                             ` Jarek Poplawski
@ 2009-11-22 19:35                               ` Caleb Cushing
  2009-11-22 20:50                                 ` Jarek Poplawski
  0 siblings, 1 reply; 55+ messages in thread
From: Caleb Cushing @ 2009-11-22 19:35 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

haven't had time to do a test yet. but would it be of any use for you
all for me to throw another nic (it'd be a different driver for sure)
in this box and test that on a problematic kernel? I have some but not
with me.

On Wed, Nov 18, 2009 at 5:38 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> On Wed, Nov 18, 2009 at 09:10:34PM +0100, Jarek Poplawski wrote:
>> On Wed, Nov 18, 2009 at 01:21:19PM -0500, Caleb Cushing wrote:
>> > yeah testing it under my known working config first. I'll get back w/ you later.
>>
>> Btw, since dropping at hardware (NIC) level seems more likely to me,
>> could you send 'ethtool eth0', and 'ethtool -S eth0' after such tests
>> (both sides).
>
> Hmm... and 'netstat -s' before and after the test (both sides).
>
> Jarek P.
>



-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-22 19:35                               ` Caleb Cushing
@ 2009-11-22 20:50                                 ` Jarek Poplawski
  2009-11-24  6:17                                   ` Caleb Cushing
  0 siblings, 1 reply; 55+ messages in thread
From: Jarek Poplawski @ 2009-11-22 20:50 UTC (permalink / raw)
  To: Caleb Cushing; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

On Sun, Nov 22, 2009 at 02:35:10PM -0500, Caleb Cushing wrote:
> haven't had time to do a test yet. but would it be of any use for you
> all for me to throw another nic (it'd be a different driver for sure)
> in this box and test that on a problematic kernel? I have some but not
> with me.

Of course it would be useful. Especially if you find new bugs. ;-)
I'm not sure it's the fastest way to diagnose this problem, but if
it's not a problem for you...

Btw, currently I don't think this dropping means there has to be
a bug. It could be the opposite, a feature: e.g. when a new kernel
can transmit faster, dropping can then happen in some other, slower
place.

Jarek P.

> 
> On Wed, Nov 18, 2009 at 5:38 PM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> > On Wed, Nov 18, 2009 at 09:10:34PM +0100, Jarek Poplawski wrote:
> >> On Wed, Nov 18, 2009 at 01:21:19PM -0500, Caleb Cushing wrote:
> >> > yeah testing it under my known working config first. I'll get back w/ you later.
> >>
> >> Btw, since dropping at hardware (NIC) level seems more likely to me,
> >> could you send 'ethtool eth0', and 'ethtool -S eth0' after such tests
> >> (both sides).
> >
> > Hmm... and 'netstat -s' before and after the test (both sides).
> >
> > Jarek P.
> >
> 
> 
> 
> -- 
> Caleb Cushing
> 
> http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-22 20:50                                 ` Jarek Poplawski
@ 2009-11-24  6:17                                   ` Caleb Cushing
  2009-11-24 11:19                                       ` Jarek Poplawski
  0 siblings, 1 reply; 55+ messages in thread
From: Caleb Cushing @ 2009-11-24  6:17 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: Frans Pop, Andi Kleen, linux-kernel, netdev

> Btw, currently I don't consider this dropping means there has to be
> a bug. It could be otherwise - a feature... e.g. when a new kernel
> can transmit faster (then dropping in some other, slower place can
> happen).

um... where would it be dropping that we wouldn't have a bug? I mean
sure faster is great... but if it makes my network not work right...

I've added all (I think) information you've asked for to the bug
http://bugzilla.kernel.org/show_bug.cgi?id=13835 except for ethtool
and netstat on the router side. ethtool complains about not having
driver or capability (maybe because it's a 2.4 kernel?) and the
version of netstat doesn't support -s. I disabled everything that I
can think of that would send/receive packets before doing the test
client side, except dhcp/dns; windows boxes were probably sending some
broadcasts too, but the traffic should be pretty low. I did remember
to set the txqueuelen; it didn't seem to make a difference.

only error in dmesg I see is

e1000e 0000:00:19.0: pci_enable_pcie_error_reporting failed 0xfffffffb

but it's in working versions too.
-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-24  6:17                                   ` Caleb Cushing
@ 2009-11-24 11:19                                       ` Jarek Poplawski
  0 siblings, 0 replies; 55+ messages in thread
From: Jarek Poplawski @ 2009-11-24 11:19 UTC (permalink / raw)
  To: Caleb Cushing
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

On Tue, Nov 24, 2009 at 01:17:09AM -0500, Caleb Cushing wrote:
> > Btw, currently I don't consider this dropping means there has to be
> > a bug. It could be otherwise - a feature... e.g. when a new kernel
> > can transmit faster (then dropping in some other, slower place can
> > happen).
> 
> um... where would it be dropping that we wouldn't have a bug? I mean
> sure faster is great... but if it makes my network not work right...

E.g. if it were dropped because of a queue overflow (but it doesn't
seem to be the case, at least at your box) or because of memory
problems while handling a lot of traffic.

> 
> I've added all (I think) information you've asked for to the bug
> http://bugzilla.kernel.org/show_bug.cgi?id=13835 except for ethtool
> and netstat on the router side. ethtool complains about not having
> driver or capability (maybe because it's a 2.4 kernel?) and the
> version of netstat doesn't support -s. I disabled everything that I
> can think of that would send/receive packets before doing the test
> client side, except dhcp/dns windows box's were probably sending some
> broadcasts too. but the traffic should be pretty low. I did remember
> to set the txqueuelen didn't seem to make a difference

Alas, it's not all the information I asked for. E.g. "netstat -s before
faulty kernel" and "netstat -s after faulty kernel" seem to be the same
file: netstat_after.slave4.log.gz. Anyway, since there are problems
with getting stats from the router, we still can't compare them or
check for the dropped stats. (Btw, could you also check
/proc/net/softnet_stat?)

So, it might be the kernel problem you reported, but there is not
enough data to prove it. Then my proposal is to try to repeat this
problem in more "testing friendly" conditions - preferably against
some other, more up-to-date linux box, if possible?

> only error in dmesg I see is
> 
> e1000e 0000:00:19.0: pci_enable_pcie_error_reporting failed 0xfffffffb

I added e1000e maintainers to CC to have a look at this warning.

Jarek P.
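
For reading /proc/net/softnet_stat: the file has one row per CPU, with
hexadecimal columns. The sketch below assumes the common layout where
the first column is packets processed, the second is packets dropped
because the backlog queue was full, and the third is time_squeeze;
older kernels (such as the router's 2.4) may differ in the later
columns. The helper name and the sample rows are made up for
illustration.

```python
def parse_softnet_stat(text):
    """Decode the first three hex columns of each softnet_stat row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        processed, dropped, time_squeeze = (int(f, 16) for f in fields[:3])
        rows.append(
            {
                "row": len(rows),  # row index, normally one row per CPU
                "processed": processed,
                "dropped": dropped,
                "time_squeeze": time_squeeze,
            }
        )
    return rows


# Invented sample rows; real data would come from /proc/net/softnet_stat.
SAMPLE = """\
0000a3f1 00000000 00000002 00000000 00000000 00000000 00000000 00000000 00000000
00001c20 0000000c 00000000 00000000 00000000 00000000 00000000 00000000 00000000
"""

if __name__ == "__main__":
    for row in parse_softnet_stat(SAMPLE):
        print(row)
```

A non-zero second column would mean the stack itself dropped packets
before they reached any protocol handler.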

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-24 11:19                                       ` Jarek Poplawski
  (?)
@ 2009-11-24 13:46                                       ` Jarek Poplawski
  -1 siblings, 0 replies; 55+ messages in thread
From: Jarek Poplawski @ 2009-11-24 13:46 UTC (permalink / raw)
  To: Caleb Cushing
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

On Tue, Nov 24, 2009 at 11:19:46AM +0000, Jarek Poplawski wrote:
...
> Alas it's not all information I asked. E.g. "netstat -s before faulty
> kernel" and "netstat -s after faulty kernel" seem to be the same file:
> netstat_after.slave4.log.gz.

On the other hand, there are a lot of tcp retransmits there:

Tcp:
    17 active connections openings
    0 passive connection openings
    14 failed connection attempts
    0 connection resets received
    0 connections established
    45 segments received
    49 segments send out
    19 segments retransmited
    0 bad segments received.
    19 resets sent

So it might still point at the driver. It would be interesting to see
more of this: could you repeat "netstat -s" and "ethtool -S eth0"
after rebooting with both kernels and doing a few minutes of similar
tcp activities (against the router or some other "good" site)? Btw,
please remind us of the exact kernel versions. If you can, try
2.6.32-rc8 instead of 2.6.31.

Jarek P.
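
To put a number on "a lot": the Tcp block above shows 19 segments
retransmitted out of 49 sent. A small sketch of that arithmetic
follows; the function name is made up, and the ratio is only a rough
indicator on such a small sample.

```python
import re

# Two lines copied from the Tcp block quoted in this message
# (netstat's own spelling of "retransmited" is kept).
TCP_BLOCK = """\
    49 segments send out
    19 segments retransmited
"""


def retransmit_ratio(text):
    """Return retransmitted / sent segments from 'netstat -s' Tcp lines."""
    sent = int(re.search(r"(\d+) segments send out", text).group(1))
    retrans = int(re.search(r"(\d+) segments retransmited", text).group(1))
    return retrans / sent if sent else 0.0


if __name__ == "__main__":
    print(f"{retransmit_ratio(TCP_BLOCK):.1%} of sent segments were retransmits")
```

That works out to roughly 39% of sent segments, which is in the same
range as the 30-50% loss reported with mtr at the start of the thread.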

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: [E1000-devel] large packet loss take2 2.6.31.x
  2009-11-24 11:19                                       ` Jarek Poplawski
@ 2009-11-24 15:57                                         ` Allan, Bruce W
  -1 siblings, 0 replies; 55+ messages in thread
From: Allan, Bruce W @ 2009-11-24 15:57 UTC (permalink / raw)
  To: Jarek Poplawski, Caleb Cushing
  Cc: e1000-devel, netdev, Frans Pop, Brandeburg, Jesse, linux-kernel,
	Andi Kleen, Kirsher, Jeffrey T



>-----Original Message-----
>From: Jarek Poplawski [mailto:jarkao2@gmail.com]
>Sent: Tuesday, November 24, 2009 3:20 AM
>To: Caleb Cushing
>Cc: e1000-devel@lists.sourceforge.net; netdev@vger.kernel.org; Frans Pop;
>Brandeburg, Jesse; linux-kernel@vger.kernel.org; Andi Kleen; Kirsher,
>Jeffrey T
>Subject: Re: [E1000-devel] large packet loss take2 2.6.31.x
>
>On Tue, Nov 24, 2009 at 01:17:09AM -0500, Caleb Cushing wrote:
>> > Btw, currently I don't consider this dropping means there has to be
>> > a bug. It could be otherwise - a feature... e.g. when a new kernel
>> > can transmit faster (then dropping in some other, slower place can
>> > happen).
>>
>> um... where would it be dropping that we wouldn't have a bug? I mean
>> sure faster is great... but if it makes my network not work right...
>
>E.g. if it were dropped because of a queue overflow (but it doesn't
>seem to be the case, at least at your box) or because of memory
>problems while handling a lot of traffic.
>
>>
>> I've added all (I think) information you've asked for to the bug
>> http://bugzilla.kernel.org/show_bug.cgi?id=13835 except for ethtool
>> and netstat on the router side. ethtool complains about not having
>> driver or capability (maybe because it's a 2.4 kernel?) and the
>> version of netstat doesn't support -s. I disabled everything that I
>> can think of that would send/receive packets before doing the test
>> client side, except dhcp/dns windows box's were probably sending some
>> broadcasts too. but the traffic should be pretty low. I did remember
>> to set the txqueuelen didn't seem to make a difference
>
>Alas it's not all information I asked. E.g. "netstat -s before faulty
>kernel" and "netstat -s after faulty kernel" seem to be the same file:
>netstat_after.slave4.log.gz. Anyway, since there are problems with
>getting stats from the router we still can't compare them, or check
>for the dropped stats. (Btw, could you check for /proc/net/softnet_stat
>yet?)
>
>So, it might be the kernel problem you reported, but there is not
>enough data to prove it. Then my proposal is to try to repeat this
>problem in more "testing friendly" conditions - preferably against
>some other, more up-to-date linux box, if possible?
>
>> only error in dmesg I see is
>>
>> e1000e 0000:00:19.0: pci_enable_pcie_error_reporting failed 0xfffffffb
>
>I added e1000e maintainers to CC to have a look at this warning.
>
>Jarek P.

The "pci_enable_pcie_error_reporting failed" message is a non-fatal warning that has recently been removed.
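
As a side note on the value in that message: 0xfffffffb read as a signed
32-bit integer is -5, and errno 5 on Linux is EIO. The sketch below only
shows that decoding; that the driver printed a negative errno this way
is an assumption, and the helper name is made up.

```python
import errno


def as_signed32(value):
    """Interpret an unsigned 32-bit value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


if __name__ == "__main__":
    code = as_signed32(0xFFFFFFFB)
    # errno.errorcode maps positive errno numbers to their symbolic names.
    print(code, errno.errorcode.get(-code, "unknown"))
```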


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [E1000-devel] large packet loss take2 2.6.31.x
  2009-11-24 15:57                                         ` Allan, Bruce W
@ 2009-11-24 18:27                                           ` Jarek Poplawski
  -1 siblings, 0 replies; 55+ messages in thread
From: Jarek Poplawski @ 2009-11-24 18:27 UTC (permalink / raw)
  To: Allan, Bruce W
  Cc: Caleb Cushing, e1000-devel, netdev, Frans Pop, Brandeburg, Jesse,
	linux-kernel, Andi Kleen, Kirsher, Jeffrey T

On Tue, Nov 24, 2009 at 07:57:41AM -0800, Allan, Bruce W wrote:
> The "pci_enable_pcie_error_reporting failed" message is a non-fatal warning that has recently been removed.
> 

Thanks for the explanation,
Jarek P.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-24 11:19                                       ` Jarek Poplawski
@ 2009-11-25 14:06                                         ` Caleb Cushing
  -1 siblings, 0 replies; 55+ messages in thread
From: Caleb Cushing @ 2009-11-25 14:06 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

> Alas it's not all information I asked. E.g. "netstat -s before faulty
> kernel" and "netstat -s after faulty kernel" seem to be the same file:
> netstat_after.slave4.log.gz.

sorry I guess I misunderstood what you wanted? (or maybe I just dorked
it when I created all the files)  I upped a netstat -s from a good
kernel shortly after reboot.

> Anyway, since there are problems with
> getting stats from the router we still can't compare them, or check
> for the dropped stats. (Btw, could you check for /proc/net/softnet_stat
> yet?)

router? good kernel? bad kernel?

> So, it might be the kernel problem you reported, but there is not
> enough data to prove it. Then my proposal is to try to repeat this
> problem in more "testing friendly" conditions - preferably against
> some other, more up-to-date linux box, if possible?

yeah 2.6.31.6 works on my laptop fine I'll just have to see about
getting a direct connection to it. probably do that after I bring the
other NIC back from home, on thanksgiving.


-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-25 14:06                                         ` Caleb Cushing
@ 2009-11-25 16:47                                           ` Caleb Cushing
  -1 siblings, 0 replies; 55+ messages in thread
From: Caleb Cushing @ 2009-11-25 16:47 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

> yeah 2.6.31.6 works on my laptop fine I'll just have to see about
> getting a direct connection to it. probably do that after I bring the
> other NIC back from home, on thanksgiving.

meh scratch that sorta.. screen mounting brackets/joint on laptop
broke... I'm thinking of scrapping it for a netbook. but replacing may
take a few weeks.

-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-25 14:06                                         ` Caleb Cushing
@ 2009-11-25 19:11                                           ` Jarek Poplawski
  -1 siblings, 0 replies; 55+ messages in thread
From: Jarek Poplawski @ 2009-11-25 19:11 UTC (permalink / raw)
  To: Caleb Cushing
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

On Wed, Nov 25, 2009 at 09:06:30AM -0500, Caleb Cushing wrote:
> > Anyway, since there are problems with
> > getting stats from the router we still can't compare them, or check
> > for the dropped stats. (Btw, could you check for /proc/net/softnet_stat
> > yet?)
> 
> router? good kernel? bad kernel?

router.

> 
> > So, it might be the kernel problem you reported, but there is not
> > enough data to prove it. Then my proposal is to try to repeat this
> > problem in more "testing friendly" conditions - preferably against
> > some other, more up-to-date linux box, if possible?
> 
> yeah 2.6.31.6 works on my laptop fine I'll just have to see about
> getting a direct connection to it. probably do that after I bring the
> other NIC back from home, on thanksgiving.

This other NIC is a really good idea, so let's wait and see.

Happy Thanksgiving!
Jarek P.
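
For reference, a minimal sketch of reading the softnet_stat counters
asked about above. It assumes the usual layout of
/proc/net/softnet_stat: one line per CPU, hexadecimal columns, with
the first three being packets processed, packets dropped and
time_squeeze. The sample text and the function name are made up for
illustration.

```python
from typing import Dict, List

# Made-up sample in the /proc/net/softnet_stat layout (hex columns).
SAMPLE = (
    "0000272d 00000000 00000001 00000000\n"
    "000034d9 0000000a 00000000 00000000\n"
)


def parse_softnet_stat(text: str) -> List[Dict[str, int]]:
    """Return the first three counters of every line as integers."""
    rows = []
    for row, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        rows.append({
            "row": row,  # line index; normally one line per CPU
            "processed": int(fields[0], 16),
            "dropped": int(fields[1], 16),
            "time_squeeze": int(fields[2], 16),
        })
    return rows


for entry in parse_softnet_stat(SAMPLE):
    print(entry)
```

On a live box the same function can be fed the contents of
/proc/net/softnet_stat; a "dropped" value that keeps growing between
two readings would be the thing to report.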

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-25 19:11                                           ` Jarek Poplawski
  (?)
@ 2009-11-27 18:07                                           ` Caleb Cushing
  2009-11-27 21:36                                               ` Jarek Poplawski
  -1 siblings, 1 reply; 55+ messages in thread
From: Caleb Cushing @ 2009-11-27 18:07 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

2.6.32-rc8 seems to be affected (a guess, because my net didn't come up
on reboot; further testing will likely verify). also, during reboots I
found out that the version I've been thinking is good is afflicted too. I
suppose maybe I should try bisecting again, starting from that point?
not sure it'll do us much good if that version was able to slip by for
so long. I really hate intermittent bugs.


-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-27 18:07                                           ` Caleb Cushing
@ 2009-11-27 21:36                                               ` Jarek Poplawski
  0 siblings, 0 replies; 55+ messages in thread
From: Jarek Poplawski @ 2009-11-27 21:36 UTC (permalink / raw)
  To: Caleb Cushing
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

On Fri, Nov 27, 2009 at 01:07:53PM -0500, Caleb Cushing wrote:
> 2.6.32-rc8 seemed to be affected (guess. because my net didn't come up
> on reboot. further testing will likely verify) also during reboots I
> found out that the version I've been thinking is good is afflicted. I
> supposed maybe I should try bisecting again? starting with that point.
> not sure it'll do us much good if that version was able to slip by for
> so long. I really hate intermittent bugs.

I doubt bisecting is a good idea with such an unpredictable bug. First, you
should make sure it's not a hardware problem, so go back to the kernel
you trust most, and give it a really long try with a few recompilations
after slightly changing the config. Btw, I wonder if you've tried e1000e
module parameters like IntMode=0 or 1.

Jarek P.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-27 21:36                                               ` Jarek Poplawski
@ 2009-11-27 22:35                                                 ` Caleb Cushing
  -1 siblings, 0 replies; 55+ messages in thread
From: Caleb Cushing @ 2009-11-27 22:35 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

> I doubt bisecting is a good idea with so unpredictable bug. First, you
> should make sure it's not a hardware problem, so go back to the kernel
> you trust most, and give it a really long try with a few recompilations
> after slightly changing the config. Btw, I wonder if you tried e1000e
> module parameters like IntMode=0 or 1.

no, how do I set those?


-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-27 22:35                                                 ` Caleb Cushing
  (?)
@ 2009-11-27 22:42                                                 ` Jarek Poplawski
  2009-12-04  1:49                                                     ` Caleb Cushing
  -1 siblings, 1 reply; 55+ messages in thread
From: Jarek Poplawski @ 2009-11-27 22:42 UTC (permalink / raw)
  To: Caleb Cushing
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

On Fri, Nov 27, 2009 at 05:35:34PM -0500, Caleb Cushing wrote:
> > I doubt bisecting is a good idea with so unpredictable bug. First, you
> > should make sure it's not a hardware problem, so go back to the kernel
> > you trust most, and give it a really long try with a few recompilations
> > after slightly changing the config. Btw, I wonder if you tried e1000e
> > module parameters like IntMode=0 or 1.
> 
> no, how do I set those?

modprobe -r e1000e
modprobe e1000e IntMode=0

Jarek P.
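
A small sketch of the same two commands wrapped in a helper, in case
several IntMode values need to be tried in a row. Assumptions: the
mapping 0 = legacy, 1 = MSI, 2 = MSI-X follows Intel's e1000e driver
documentation and should be checked against the driver version in
use; the helper names are hypothetical. Unloading the module drops
the link, so this is not something to run over an ssh session that
goes through that NIC.

```python
import subprocess
from typing import List

# Assumption: IntMode values as described in Intel's e1000e documentation.
INT_MODES = {0: "legacy", 1: "MSI", 2: "MSI-X"}


def reload_commands(int_mode: int, module: str = "e1000e") -> List[List[str]]:
    """Build the unload/reload command pair for the given IntMode."""
    if int_mode not in INT_MODES:
        raise ValueError(f"unknown IntMode {int_mode!r}")
    return [
        ["modprobe", "-r", module],
        ["modprobe", module, f"IntMode={int_mode}"],
    ]


def reload_module(int_mode: int, dry_run: bool = True) -> None:
    """Print the commands; run them only when dry_run is False (needs root)."""
    for cmd in reload_commands(int_mode):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


reload_module(0)  # dry run: only prints the two commands
```

Calling reload_module(0, dry_run=False) as root would actually reload
the driver.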

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-11-27 22:42                                                 ` Jarek Poplawski
@ 2009-12-04  1:49                                                     ` Caleb Cushing
  0 siblings, 0 replies; 55+ messages in thread
From: Caleb Cushing @ 2009-12-04  1:49 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

>
> modprobe -r e1000e
> modprobe e1000e IntMode=0
>
> Jarek P.
>
tested on the kernel that's behaving properly: no change. what do these
modes do?

I've installed a 10/100 linksys nic into my system. it appears to be
working fine on a bad kernel (2.6.32-final, tested and verified for
sure). I've only tested it once though. my laptop died, so a direct
connection to it won't work. can I test between these 2 nics?
(I suppose there's no real reason why not.) what should I proceed with
at this point?


-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-12-04  1:49                                                     ` Caleb Cushing
@ 2009-12-04  9:05                                                       ` Jarek Poplawski
  -1 siblings, 0 replies; 55+ messages in thread
From: Jarek Poplawski @ 2009-12-04  9:05 UTC (permalink / raw)
  To: Caleb Cushing
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

On Thu, Dec 03, 2009 at 08:49:17PM -0500, Caleb Cushing wrote:
> >
> > modprobe -r e1000e
> > modprobe e1000e IntMode=0
> >
> > Jarek P.
> >
> tested on kernel behaving properly no change. what do these modes do?

e1000e by default uses MSI-X interrupts if possible, which are the most
modern type. If there are problems, IntMode lets us try older types,
so I meant it rather for the misbehaving kernel.

> 
> I've installed a 10/100 linksys nic into my system. it appears to be
> working fine on a bad kernel (2.6.32-final tested and for sure
> verified). I've only tested it once though. my laptop died so direct
> connection between that won't work. can I test between these 2 nics?
> (suppose no real reason why not) but what should I proceed with at
> this point?

If you have it fixed easily with another nic you should first
reconsider if this debugging is worth your time. Of course it
could be very useful for the kernel (unless it's a hardware fault),
but on the other hand this is a popular nic, tested by many people.

Then, if you find time for such testing, I'd suggest trying mainly
2.6.32 - I'm not sure if you tried it with e1000e. So if, after longer
testing of both linksys and e1000e, you find only the latter has
problems, I think you should open a new report in bugzilla for e1000e
and submit things like: dmesg, .config, lspci -v from 2.6.32, and if
possible the same things from the last kernel which didn't have these
problems. Add some references to previous attempts in bugzilla and
this thread. (Btw, any reproducible tests should be fine.)

Jarek P.
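
One rough way to see which interrupt type a NIC ended up with is to
look at its line in /proc/interrupts. The sketch below parses text in
that layout and returns the IRQ label and the interrupt controller
column for a given interface. The sample lines and the function name
are invented for illustration, and the exact controller strings
differ between kernel versions, so treat the output as a hint only.

```python
from typing import List, Tuple

# Invented sample in the /proc/interrupts layout (two CPUs).
SAMPLE = """\
           CPU0       CPU1
  0:        126          0   IO-APIC-edge      timer
 16:      40312          0   IO-APIC-fasteoi   uhci_hcd:usb3, eth1
 27:     981234          0   PCI-MSI-edge      eth0
NMI:          0          0   Non-maskable interrupts
"""


def irqs_for_device(text: str, device: str) -> List[Tuple[str, str]]:
    """Return (irq label, controller column) for lines naming `device`."""
    lines = text.splitlines()
    if not lines:
        return []
    ncpu = len(lines[0].split())  # header has one CPUn entry per CPU
    found = []
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        tail = rest.split()[ncpu:]  # drop the per-CPU counters
        names = [tok.rstrip(",") for tok in tail]
        # Also match per-queue vector names such as "eth0-rx-0".
        if any(n == device or n.startswith(device + "-") for n in names):
            found.append((label.strip(), tail[0]))
    return found


print(irqs_for_device(SAMPLE, "eth0"))
print(irqs_for_device(SAMPLE, "eth1"))
```

A controller column starting with IO-APIC suggests a legacy
interrupt, while PCI-MSI suggests MSI or MSI-X.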

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-12-04  9:05                                                       ` Jarek Poplawski
  (?)
@ 2009-12-04 18:28                                                       ` Caleb Cushing
  2009-12-04 20:44                                                         ` Jarek Poplawski
  -1 siblings, 1 reply; 55+ messages in thread
From: Caleb Cushing @ 2009-12-04 18:28 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

On Fri, Dec 4, 2009 at 4:05 AM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> On Thu, Dec 03, 2009 at 08:49:17PM -0500, Caleb Cushing wrote:
>> >
>> > modprobe -r e1000e
>> > modprobe e1000e IntMode=0
>> >
>> > Jarek P.
>> >
>> tested on kernel behaving properly no change. what do these modes do?
>
> e1000e by default uses MSI-X interrupts if possible, which are most
> modern. If there are some problems IntMode lets us try older types,
> so I rather meant it for the misbehaving kernel.
>
>>
>> I've installed a 10/100 linksys nic into my system. it appears to be
>> working fine on a bad kernel (2.6.32-final tested and for sure
>> verified). I've only tested it once though. my laptop died so direct
>> connection between that won't work. can I test between these 2 nics?
>> (suppose no real reason why not) but what should I proceed with at
>> this point?
>
> If you have it fixed easily with another nic you should first
> reconsider if this debugging is worth of your time. Of course it's
> could be very useful for the kernel (unless it's a hardware fault),
> but on the other hand this is a popular nic, tested by many people.

trying to figure out if it's hardware. I wish I'd figured that out a
month ago, because dell would have been shipping me a new mobo at that
point... oh well... I'm starting to think it is, but given the age of
the computer (just over 1 year now) I'm not happy about that. my
nic is a 10/100 card; the e1000e is obviously gigabit, though I don't
think I've a need (or network) for gigabit atm
-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-12-04 18:28                                                       ` Caleb Cushing
@ 2009-12-04 20:44                                                         ` Jarek Poplawski
  2009-12-04 23:07                                                           ` Caleb Cushing
  0 siblings, 1 reply; 55+ messages in thread
From: Jarek Poplawski @ 2009-12-04 20:44 UTC (permalink / raw)
  To: Caleb Cushing
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

On Fri, Dec 04, 2009 at 01:28:26PM -0500, Caleb Cushing wrote:
> On Fri, Dec 4, 2009 at 4:05 AM, Jarek Poplawski <jarkao2@gmail.com> wrote:
> > If you have it fixed easily with another nic you should first
> > reconsider if this debugging is worth of your time. Of course it's
> > could be very useful for the kernel (unless it's a hardware fault),
> > but on the other hand this is a popular nic, tested by many people.
> 
> trying to figure out if it's hardware, I wish I'd figured that out a
> month ago because dell would have been shipping me a new mobo at that
> point... oh well... I'm starting to think it is but given the age of
> the computer... (just over 1 year now) I'm not happy about that. my
> nic is a 10/100 card the e1000e is obviously gigabit though I don't
> think I've a need (or network) for the gigabit atm

For now there is no proof it's hardware, so don't worry ;-) It might
be firmware, bios etc. And it might be the kernel too. I meant it's hard
to debug, considering your bisection results, but easy to avoid with
other hardware, so it's up to you.

Btw, maybe we(?!) should've done it earlier, but I just did some googling,
and it looks like these NICs aren't as innocent as I assumed. I'm just
looking here:
https://bugs.launchpad.net/ubuntu/+bug/382671
and there:
http://bugzilla.kernel.org/show_bug.cgi?id=11998
and maybe it's a bit different story, but the actors are mainly the same.

So, again, if you're willing to debug this, a new bugzilla report
seems reasonable to me, plus maybe a note on #11998 too.

Jarek P.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-12-04 20:44                                                         ` Jarek Poplawski
@ 2009-12-04 23:07                                                           ` Caleb Cushing
  2009-12-04 23:47                                                               ` Jarek Poplawski
  0 siblings, 1 reply; 55+ messages in thread
From: Caleb Cushing @ 2009-12-04 23:07 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

> So, again, if you're willing to debug this, the new bugzilla report
> seems reasonable to me, plus maybe some notice to this #11998 too.

I will later today. I'm thinking, since I now have 2 active nics...
would hooking one card directly to the other and then running the
tests be helpful? (I wish this mobo had a 3rd pci slot, because I'd
have put a 3rd card in so I could connect to the net at the same time.)


-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-12-04 23:07                                                           ` Caleb Cushing
@ 2009-12-04 23:47                                                               ` Jarek Poplawski
  0 siblings, 0 replies; 55+ messages in thread
From: Jarek Poplawski @ 2009-12-04 23:47 UTC (permalink / raw)
  To: Caleb Cushing
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

On Fri, Dec 04, 2009 at 06:07:42PM -0500, Caleb Cushing wrote:
> > So, again, if you're willing to debug this, the new bugzilla report
> > seems reasonable to me, plus maybe some notice to this #11998 too.
> 
> I will later today. I'm thinking since I now have 2 active nics...
> would hooking the 1 card directly to the other and then running the
> tests be helpful? (I wish this mobo had a 3rd pci slot because I'd
> have put a 3rd card in so I can connect to the net at the same time.

I guess you'd better hold off on new tests until you get some assistance
from the e1000e maintainers - it seems they might be interested in some
specific dumps and registers, as in the #11998 case.

Jarek P.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-12-04 23:47                                                               ` Jarek Poplawski
@ 2009-12-05  7:06                                                                 ` Caleb Cushing
  -1 siblings, 0 replies; 55+ messages in thread
From: Caleb Cushing @ 2009-12-05  7:06 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

> I guess, you should better wait with new tests for some assistance
> from e1000e maintainers - it seems they might be interested in some
> specific dumps and registers - like in this #11998 case.

I reported here.

http://bugzilla.kernel.org/show_bug.cgi?id=14737
-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-12-05  7:06                                                                 ` Caleb Cushing
  (?)
@ 2009-12-05  7:29                                                                 ` Caleb Cushing
  -1 siblings, 0 replies; 55+ messages in thread
From: Caleb Cushing @ 2009-12-05  7:29 UTC (permalink / raw)
  To: Jarek Poplawski
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

I sadly wonder if this is why Dell pulled their 530n product line for
ubuntu; the last time I checked, I didn't see whether it was replaced.
-- 
Caleb Cushing

http://xenoterracide.blogspot.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: large packet loss take2 2.6.31.x
  2009-12-05  7:06                                                                 ` Caleb Cushing
  (?)
  (?)
@ 2009-12-05 13:57                                                                 ` Jarek Poplawski
  -1 siblings, 0 replies; 55+ messages in thread
From: Jarek Poplawski @ 2009-12-05 13:57 UTC (permalink / raw)
  To: Caleb Cushing
  Cc: Frans Pop, Andi Kleen, linux-kernel, netdev, Jeff Kirsher,
	Jesse Brandeburg, e1000-devel

On Sat, Dec 05, 2009 at 02:06:09AM -0500, Caleb Cushing wrote:
> > I guess, you should better wait with new tests for some assistance
> > from e1000e maintainers - it seems they might be interested in some
> > specific dumps and registers - like in this #11998 case.
> 
> I reported here.
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=14737

Please remember to add at least the standard things like dmesg, .config,
'lspci -vvv', /proc/interrupts etc. (linux-2.6/REPORTING_BUGS), and
maybe 'netstat -s', for both the non-working and the working case/boot.

And add a summary (incl. the router type), so people don't have to
browse this whole thread.

Jarek P.
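
A sketch of gathering the items listed above into one directory per
boot, so the working and non-working cases can be attached side by
side. The file names, the label argument and the helper names are
only illustrative. The .config is left out on purpose because its
location depends on how the kernel was built, so it has to be copied
by hand.

```python
import subprocess
from pathlib import Path
from typing import Callable, Dict, List

# Output file name -> command, following the list in the message above.
REPORT_COMMANDS: Dict[str, List[str]] = {
    "dmesg.txt": ["dmesg"],
    "lspci.txt": ["lspci", "-vvv"],
    "interrupts.txt": ["cat", "/proc/interrupts"],
    "netstat.txt": ["netstat", "-s"],
}


def run_command(cmd: List[str]) -> str:
    """Run one command and return whatever it printed."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


def collect(out_dir: str, label: str,
            run: Callable[[List[str]], str] = run_command) -> List[Path]:
    """Write each command's output to out_dir/label/<file name>."""
    target = Path(out_dir) / label
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for filename, cmd in REPORT_COMMANDS.items():
        path = target / filename
        path.write_text(run(cmd))
        written.append(path)
    return written
```

For example, collect("report", "bad-boot") after a boot showing the
packet loss and collect("report", "good-boot") after a clean one
(both as root, so that dmesg and lspci give full output) would leave
two directories to compare and attach.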

^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2009-12-05 13:58 UTC | newest]

Thread overview: 55+ messages
2009-10-31 14:21 large packet loss take2 2.6.31.x Caleb Cushing
2009-10-31 18:44 ` Frans Pop
2009-11-11 20:19   ` Caleb Cushing
2009-11-11 21:47     ` Andi Kleen
2009-11-11 22:05       ` Frans Pop
2009-11-11 22:48         ` Caleb Cushing
2009-11-12 11:38           ` Jarek Poplawski
2009-11-12 13:46             ` Caleb Cushing
2009-11-12 13:46               ` Caleb Cushing
2009-11-12 19:04               ` Jarek Poplawski
2009-11-12 21:47                 ` Jarek Poplawski
2009-11-13 16:25                 ` Caleb Cushing
2009-11-13 17:21                   ` Caleb Cushing
2009-11-13 21:16                   ` Jarek Poplawski
2009-11-18  9:59                     ` Caleb Cushing
2009-11-18 10:00                       ` Caleb Cushing
2009-11-18 13:51                       ` Jarek Poplawski
2009-11-18 18:21                         ` Caleb Cushing
2009-11-18 20:10                           ` Jarek Poplawski
2009-11-18 22:38                             ` Jarek Poplawski
2009-11-22 19:35                               ` Caleb Cushing
2009-11-22 20:50                                 ` Jarek Poplawski
2009-11-24  6:17                                   ` Caleb Cushing
2009-11-24 11:19                                     ` Jarek Poplawski
2009-11-24 11:19                                       ` Jarek Poplawski
2009-11-24 13:46                                       ` Jarek Poplawski
2009-11-24 15:57                                       ` [E1000-devel] " Allan, Bruce W
2009-11-24 15:57                                         ` Allan, Bruce W
2009-11-24 18:27                                         ` Jarek Poplawski
2009-11-24 18:27                                           ` Jarek Poplawski
2009-11-25 14:06                                       ` Caleb Cushing
2009-11-25 14:06                                         ` Caleb Cushing
2009-11-25 16:47                                         ` Caleb Cushing
2009-11-25 16:47                                           ` Caleb Cushing
2009-11-25 19:11                                         ` Jarek Poplawski
2009-11-25 19:11                                           ` Jarek Poplawski
2009-11-27 18:07                                           ` Caleb Cushing
2009-11-27 21:36                                             ` Jarek Poplawski
2009-11-27 21:36                                               ` Jarek Poplawski
2009-11-27 22:35                                               ` Caleb Cushing
2009-11-27 22:35                                                 ` Caleb Cushing
2009-11-27 22:42                                                 ` Jarek Poplawski
2009-12-04  1:49                                                   ` Caleb Cushing
2009-12-04  1:49                                                     ` Caleb Cushing
2009-12-04  9:05                                                     ` Jarek Poplawski
2009-12-04  9:05                                                       ` Jarek Poplawski
2009-12-04 18:28                                                       ` Caleb Cushing
2009-12-04 20:44                                                         ` Jarek Poplawski
2009-12-04 23:07                                                           ` Caleb Cushing
2009-12-04 23:47                                                             ` Jarek Poplawski
2009-12-04 23:47                                                               ` Jarek Poplawski
2009-12-05  7:06                                                               ` Caleb Cushing
2009-12-05  7:06                                                                 ` Caleb Cushing
2009-12-05  7:29                                                                 ` Caleb Cushing
2009-12-05 13:57                                                                 ` Jarek Poplawski