linux-kernel.vger.kernel.org archive mirror
* Processes receive SIGSEGV if TCQ is enabled
@ 2003-10-30 15:01 Thomas Schlichter
  2003-10-30 17:48 ` Bartlomiej Zolnierkiewicz
  2003-10-30 21:33 ` Andrew Morton
  0 siblings, 2 replies; 11+ messages in thread
From: Thomas Schlichter @ 2003-10-30 15:01 UTC (permalink / raw)
  To: B.Zolnierkiewicz; +Cc: linux-kernel, linux-ide


[-- Attachment #1.1: Type: text/plain, Size: 3216 bytes --]

Hello,

today I tried to test TCQ with the linux-2.6.0-test9-mm1 kernel. The config.gz 
is attached. But after enabling TCQ with 'hdparm -Q1 /dev/hda' newly started 
processes die due to a received SIGSEGV. No bad kernel messages appear...

Disabling TCQ again doesn't help, only a reboot does...
When I let the kernel enable TCQ at boot time, it set the TCQ buffer depth to 
8 and even the init script was killed!

Here is some additional information:

schlicht@bigboss:~> uname -a
Linux bigboss 2.6.0-test9-mm1 #1 Thu Oct 30 14:45:35 CET 2003 i686 unknown 
unknown GNU/Linux
schlicht@bigboss:~> hdparm -i /dev/hda

/dev/hda:

 Model=IBM-DTLA-307030, FwRev=TX4OA50C, SerialNo=YK0YKT61943
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=40
 BuffType=DualPortCache, BuffSize=1916kB, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=60036480
 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
 AdvancedPM=yes: disabled (255) WriteCache=enabled
 Drive conforms to: ATA/ATAPI-5 T13 1321D revision 1:

 * signifies the current active mode

schlicht@bigboss:~> hdparm -i /dev/hdb
/dev/hdb: No such device or address
schlicht@bigboss:~> cat /proc/ide/via
----------VIA BusMastering IDE Configuration----------------
Driver Version:                     3.38
South Bridge:                       VIA vt8235
Revision:                           ISA 0x0 IDE 0x6
Highest DMA rate:                   UDMA133
BM-DMA base:                        0xec00
PCI clock:                          33.3MHz
Master Read  Cycle IRDY:            0ws
Master Write Cycle IRDY:            0ws
BM IDE Status Register Read Retry:  yes
Max DRDY Pulse Width:               No limit
-----------------------Primary IDE-------Secondary IDE------
Read DMA FIFO flush:          yes                 yes
End Sector FIFO flush:         no                  no
Prefetch Buffer:              yes                 yes
Post Write Buffer:            yes                 yes
Enabled:                      yes                 yes
Simplex only:                  no                  no
Cable Type:                   80w                 40w
-------------------drive0----drive1----drive2----drive3-----
Transfer Mode:       UDMA       PIO       DMA      UDMA
Address Setup:      120ns     120ns     120ns     120ns
Cmd Active:          90ns      90ns      90ns      90ns
Cmd Recovery:        30ns      30ns      30ns      30ns
Data Active:         90ns     330ns      90ns      90ns
Data Recovery:       30ns     270ns      30ns      30ns
Cycle Time:          22ns     600ns     120ns      60ns
Transfer Rate:   88.8MB/s   3.3MB/s  16.6MB/s  33.3MB/s
schlicht@bigboss:~> mount
/dev/hda5 on / type reiserfs (rw)
proc on /proc type proc (rw)
devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
sysfs on /sys type sysfs (rw)
tmpfs on /tmp type tmpfs (rw)


Regards
   Thomas

P.S.: My hdparm version v5.4 sets the queuing depth only to 1 even if I 
provide e.g. '-Q8'...

[-- Attachment #1.2: config.gz --]
[-- Type: application/x-gzip, Size: 13517 bytes --]

[-- Attachment #2: signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Processes receive SIGSEGV if TCQ is enabled
  2003-10-30 15:01 Processes receive SIGSEGV if TCQ is enabled Thomas Schlichter
@ 2003-10-30 17:48 ` Bartlomiej Zolnierkiewicz
  2003-10-31 13:00   ` Jens Axboe
  2003-10-30 21:33 ` Andrew Morton
  1 sibling, 1 reply; 11+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2003-10-30 17:48 UTC (permalink / raw)
  To: Thomas Schlichter; +Cc: linux-kernel, linux-ide


[ Jens - IDE TCQ maintainer 8) added to cc: ]

On Thursday 30 of October 2003 16:01, Thomas Schlichter wrote:
> Hello,

Hi,

Could you also send dmesg output and retry with vanilla -test9?

--bartlomiej

> today I tried to test TCQ with the linux-2.6.0-test9-mm1 kernel. The
> config.gz is attached. But after enabling TCQ with 'hdparm -Q1 /dev/hda'
> newly started processes die due to a received SIGSEGV. No bad kernel
> messages appear...
>
> Disabling TCQ again doesn't help, only a reboot does...
> When I let the kernel enable TCQ at boot time, it set the TCQ buffer depth
> to 8 and even the init script was killed!



* Re: Processes receive SIGSEGV if TCQ is enabled
  2003-10-30 15:01 Processes receive SIGSEGV if TCQ is enabled Thomas Schlichter
  2003-10-30 17:48 ` Bartlomiej Zolnierkiewicz
@ 2003-10-30 21:33 ` Andrew Morton
  2003-10-30 22:10   ` Thomas Schlichter
  1 sibling, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2003-10-30 21:33 UTC (permalink / raw)
  To: Thomas Schlichter; +Cc: B.Zolnierkiewicz, linux-kernel, linux-ide

Thomas Schlichter <schlicht@uni-mannheim.de> wrote:
>
>  today I tried to test TCQ with the linux-2.6.0-test9-mm1 kernel. The config.gz 
>  is attached. But after enabling TCQ with 'hdparm -Q1 /dev/hda' newly started 
>  processes die due to a received SIGSEGV. No bad kernel messages appear...

Probably we need to turn off TCQ in kernel config until the confidence
level is higher.

>  Disabling TCQ again doesn't help, only a reboot does...

That will be because you have incorrect program text in pagecache, left
over from when the driver was in TCQ mode.



* Re: Processes receive SIGSEGV if TCQ is enabled
  2003-10-30 21:33 ` Andrew Morton
@ 2003-10-30 22:10   ` Thomas Schlichter
  2003-10-30 22:44     ` Ivan Gyurdiev
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Schlichter @ 2003-10-30 22:10 UTC (permalink / raw)
  To: Andrew Morton; +Cc: B.Zolnierkiewicz, linux-kernel, linux-ide


[-- Attachment #1.1: Type: text/plain, Size: 1541 bytes --]

Hello again...

On Thursday 30 October 2003 22:33, Andrew Morton wrote:
> Thomas Schlichter <schlicht@uni-mannheim.de> wrote:
> >  today I tried to test TCQ with the linux-2.6.0-test9-mm1 kernel. The
> > config.gz is attached. But after enabling TCQ with 'hdparm -Q1 /dev/hda'
> > newly started processes die due to a received SIGSEGV. No bad kernel
> > messages appear...
>
> Probably we need to turn off TCQ in kernel config until the confidence
> level is higher.
>
> >  Disabling TCQ again doesn't help, only a reboot does...
>
> That will be because you have incorrect program text in pagecache, left
> over from when the driver was in TCQ mode.

Yes, that would explain it...

Well, I tested it with vanilla test9 and it has exactly the same problems with 
TCQ here as test9-mm1 has. This means:

- TCQ enabled at boot time (TCQ depth is set to 8) kills the init script
- TCQ set to 1 with hdparm kills nearly everything
- TCQ set to 8 kills just some programs
- TCQ set to 32 seems to work

Attached is the dmesg from vanilla test9 where I set the TCQ depth to 32. If I 
set it to 1 the messages are the same (as explained in my original mail, 
there is no warning or error in the logs) except 'hda: tagged command 
queueing enabled, command queue depth 32' will be 'hda: tagged command 
queueing enabled, command queue depth 1'...

Btw. hdparm correctly sets the TCQ buffer depth, but TCQ always has to be 
disabled before setting another TCQ depth, which confused me... ;-)

Regards
   Thomas

[-- Attachment #1.2: dmesg --]
[-- Type: text/plain, Size: 15707 bytes --]

0000000) @ 0x0fff3040
ACPI: MADT (v001 KT400  AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x0fff7000
ACPI: DSDT (v001 KT400  AWRDACPI 0x00001000 MSFT 0x0100000d) @ 0x00000000
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:7 APIC version 16
ACPI: LAPIC_NMI (acpi_id[0x00] polarity[0x1] trigger[0x1] lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] global_irq_base[0x0])
IOAPIC[0]: Assigned apic_id 2
IOAPIC[0]: apic_id 2, version 3, address 0xfec00000, IRQ 0-23
ACPI: INT_SRC_OVR (bus[0] irq[0x0] global_irq[0x2] polarity[0x0] trigger[0x0])
ACPI: INT_SRC_OVR (bus[0] irq[0x9] global_irq[0x9] polarity[0x0] trigger[0x0])
ACPI: INT_SRC_OVR (bus[0] irq[0xe] global_irq[0xe] polarity[0x1] trigger[0x1])
ACPI: INT_SRC_OVR (bus[0] irq[0xf] global_irq[0xf] polarity[0x1] trigger[0x1])
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Building zonelist for node : 0
Kernel command line: auto BOOT_IMAGE=linux ro root=305 BOOT_FILE=/boot/vmlinuz video=vesafb:mtrr,pmipal,ywrap
Initializing CPU#0
PID hash table entries: 1024 (order 10: 8192 bytes)
Detected 1307.337 MHz processor.
Console: colour dummy device 80x25
Memory: 255656k/262080k available (1880k kernel code, 5696k reserved, 809k data, 192k init, 0k highmem)
Calibrating delay loop... 2572.28 BogoMIPS
Security Scaffold v1.0.0 initialized
SELinux:  Initializing.
SELinux:  Starting in permissive mode
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU:     After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000
CPU:     After vendor identify, caps: 0383fbff c1c3fbff 00000000 00000000
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 64K (64 bytes/line)
CPU:     After all inits, caps: 0383fbff c1c3fbff 00000000 00000020
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: AMD Duron(tm) processor stepping 01
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
ENABLING IO-APIC IRQs
init IO_APIC IRQs
 IO-APIC (apicid-pin) 2-0, 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.
..TIMER: vector=0x31 pin1=2 pin2=-1
number of MP IRQ sources: 15.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................
IO APIC #2......
.... register #00: 02000000
.......    : physical APIC id: 02
.......    : Delivery Type: 0
.......    : LTS          : 0
.... register #01: 00178003
.......     : max redirection entries: 0017
.......     : PRQ implemented: 1
.......     : IO APIC version: 0003
.... IRQ redirection table:
 NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:   
 00 000 00  1    0    0   0   0    0    0    00
 01 001 01  0    0    0   0   0    1    1    39
 02 001 01  0    0    0   0   0    1    1    31
 03 001 01  0    0    0   0   0    1    1    41
 04 001 01  0    0    0   0   0    1    1    49
 05 001 01  0    0    0   0   0    1    1    51
 06 001 01  0    0    0   0   0    1    1    59
 07 001 01  0    0    0   0   0    1    1    61
 08 001 01  0    0    0   0   0    1    1    69
 09 001 01  0    0    0   0   0    1    1    71
 0a 001 01  0    0    0   0   0    1    1    79
 0b 001 01  0    0    0   0   0    1    1    81
 0c 001 01  0    0    0   0   0    1    1    89
 0d 001 01  0    0    0   0   0    1    1    91
 0e 001 01  0    0    0   0   0    1    1    99
 0f 001 01  0    0    0   0   0    1    1    A1
 10 000 00  1    0    0   0   0    0    0    00
 11 000 00  1    0    0   0   0    0    0    00
 12 000 00  1    0    0   0   0    0    0    00
 13 000 00  1    0    0   0   0    0    0    00
 14 000 00  1    0    0   0   0    0    0    00
 15 000 00  1    0    0   0   0    0    0    00
 16 000 00  1    0    0   0   0    0    0    00
 17 000 00  1    0    0   0   0    0    0    00
IRQ to pin mappings:
IRQ0 -> 0:2
IRQ1 -> 0:1
IRQ3 -> 0:3
IRQ4 -> 0:4
IRQ5 -> 0:5
IRQ6 -> 0:6
IRQ7 -> 0:7
IRQ8 -> 0:8
IRQ9 -> 0:9
IRQ10 -> 0:10
IRQ11 -> 0:11
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14
IRQ15 -> 0:15
.................................... done.
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 1306.0989 MHz.
..... host bus clock speed is 201.0075 MHz.
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfb3b0, last bus=1
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20031002
IOAPIC[0]: Set PCI routing entry (2-9 -> 0x71 -> IRQ 9 Mode:0 Active:0)
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 1 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 1 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 1 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 1 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [ALKA] (IRQs 20)
ACPI: PCI Interrupt Link [ALKB] (IRQs 21)
ACPI: PCI Interrupt Link [ALKC] (IRQs 22)
ACPI: PCI Interrupt Link [ALKD] (IRQs 23)
Linux Plug and Play Support v0.97 (c) Adam Belay
PnPBIOS: Scanning system for PnP BIOS support...
PnPBIOS: Found PnP BIOS installation structure at 0xc00fbe80
PnPBIOS: PnP BIOS version 1.0, entry 0xf0000:0xbeb0, dseg 0xf0000
PnPBIOS: 15 nodes reported by PnP BIOS; 15 recorded by driver
IOAPIC[0]: Set PCI routing entry (2-16 -> 0xa9 -> IRQ 16 Mode:1 Active:1)
00:00:08[A] -> 2-16 -> IRQ 16
IOAPIC[0]: Set PCI routing entry (2-17 -> 0xb1 -> IRQ 17 Mode:1 Active:1)
00:00:08[B] -> 2-17 -> IRQ 17
IOAPIC[0]: Set PCI routing entry (2-18 -> 0xb9 -> IRQ 18 Mode:1 Active:1)
00:00:08[C] -> 2-18 -> IRQ 18
IOAPIC[0]: Set PCI routing entry (2-19 -> 0xc1 -> IRQ 19 Mode:1 Active:1)
00:00:08[D] -> 2-19 -> IRQ 19
Pin 2-17 already programmed
Pin 2-18 already programmed
Pin 2-19 already programmed
Pin 2-16 already programmed
Pin 2-18 already programmed
Pin 2-19 already programmed
Pin 2-16 already programmed
Pin 2-17 already programmed
Pin 2-19 already programmed
Pin 2-16 already programmed
Pin 2-17 already programmed
Pin 2-18 already programmed
Pin 2-18 already programmed
Pin 2-19 already programmed
Pin 2-16 already programmed
Pin 2-17 already programmed
_CRS returns NULL! Using IRQ 21 for device (PCI Interrupt Link [ALKB]).
ACPI: PCI Interrupt Link [ALKB] enabled at IRQ 21
IOAPIC[0]: Set PCI routing entry (2-21 -> 0xc9 -> IRQ 21 Mode:1 Active:1)
00:00:10[A] -> 2-21 -> IRQ 21
Pin 2-21 already programmed
Pin 2-21 already programmed
Pin 2-21 already programmed
_CRS returns NULL! Using IRQ 20 for device (PCI Interrupt Link [ALKA]).
ACPI: PCI Interrupt Link [ALKA] enabled at IRQ 20
IOAPIC[0]: Set PCI routing entry (2-20 -> 0xd1 -> IRQ 20 Mode:1 Active:1)
00:00:11[A] -> 2-20 -> IRQ 20
Pin 2-21 already programmed
_CRS returns NULL! Using IRQ 22 for device (PCI Interrupt Link [ALKC]).
ACPI: PCI Interrupt Link [ALKC] enabled at IRQ 22
IOAPIC[0]: Set PCI routing entry (2-22 -> 0xd9 -> IRQ 22 Mode:1 Active:1)
00:00:11[C] -> 2-22 -> IRQ 22
_CRS returns NULL! Using IRQ 23 for device (PCI Interrupt Link [ALKD]).
ACPI: PCI Interrupt Link [ALKD] enabled at IRQ 23
IOAPIC[0]: Set PCI routing entry (2-23 -> 0xe1 -> IRQ 23 Mode:1 Active:1)
00:00:11[D] -> 2-23 -> IRQ 23
Pin 2-16 already programmed
Pin 2-17 already programmed
Pin 2-18 already programmed
Pin 2-19 already programmed
Pin 2-23 already programmed
Pin 2-23 already programmed
Pin 2-23 already programmed
Pin 2-23 already programmed
PCI: Using ACPI for IRQ routing
PCI: if you experience problems, try using option 'pci=noacpi' or even 'acpi=off'
divert: not allocating divert_blk for non-ethernet device lo
vesafb: framebuffer at 0xd8000000, mapped to 0xd080b000, size 16384k
vesafb: mode is 1024x768x16, linelength=2048, pages=0
vesafb: protected mode interface info at c000:c2c0
vesafb: pmi: set display start = c00cc305, set palette = c00cc38a
vesafb: pmi: ports = b4c3 b503 ba03 c003 c103 c403 c503 c603 c703 c803 c903 cc03 ce03 cf03 d003 d103 d203 d303 d403 d503 da03 ff03 
vesafb: scrolling: ywrap using protected mode interface, yres_virtual=8192
vesafb: directcolor: size=0:5:6:5, shift=0:11:5:0
fb0: VESA VGA frame buffer device
Machine check exception polling timer started.
apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16ac)
apm: overridden by ACPI.
Total HugeTLB memory allocated, 0
ikconfig 0.7 with /proc/config*
VFS: Disk quotas dquot_6.5.1
Initializing Cryptographic API
PCI: Via IRQ fixup for 0000:00:10.1, from 11 to 5
PCI: Via IRQ fixup for 0000:00:10.2, from 10 to 5
ACPI: Power Button (FF) [PWRF]
ACPI: Fan [FAN] (on)
ACPI: Processor [CPU0] (supports C1 C2)
ACPI: Thermal Zone [THRM] (41 C)
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Console: switching to colour frame buffer device 128x48
pty: 256 Unix98 ptys configured
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller at PCI slot 0000:00:11.1
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: VIA vt8235 (rev 00) IDE UDMA133 controller on pci0000:00:11.1
    ide0: BM-DMA at 0xec00-0xec07, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0xec08-0xec0f, BIOS settings: hdc:DMA, hdd:DMA
hda: IBM-DTLA-307030, ATA DISK drive
Using anticipatory io scheduler
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdc: PLEXTOR CD-R PX-W8432T, ATAPI CD/DVD-ROM drive
hdd: Pioneer DVD-ROM ATAPIModel DVD-105S 013, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 128KiB
hda: 60036480 sectors (30738 MB) w/1916KiB Cache, CHS=59560/16/63, UDMA(100)
 hda: hda1 hda2 hda3 < hda5 hda6 >
hdc: ATAPI 32X CD-ROM CD-R/RW drive, 4096kB Cache, DMA
Uniform CD-ROM driver Revision: 3.12
hdd: ATAPI 40X DVD-ROM drive, 512kB Cache, UDMA(33)
ide-floppy driver 0.99.newide
Console: switching to colour frame buffer device 128x48
mice: PS/2 mouse device common for all mice
input: PC Speaker
serio: i8042 AUX port at 0x60,0x64 irq 12
input: AT Translated Set 2 keyboard on isa0060/serio0
serio: i8042 KBD port at 0x60,0x64 irq 1
NET: Registered protocol family 2
IP: routing cache hash table of 2048 buckets, 16Kbytes
TCP: Hash tables configured (established 16384 bind 32768)
BIOS EDD facility v0.10 2003-Oct-11, 1 devices found
Please report your BIOS at http://domsch.com/linux/edd30/results.html
PM: Reading pmdisk image.
PM: Resume from disk failed.
ACPI: (supports S0 S1 S3 S4 S5)
found reiserfs format "3.5" with standard journal
Reiserfs journal params: device hda5, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
reiserfs: checking transaction log (hda5) for (hda5)
reiserfs: replayed 63 transactions in 2 seconds
Using r5 hash to sort names
reiserfs: using 3.5.x disk format
VFS: Mounted root (reiserfs filesystem) readonly.
Freeing unused kernel memory: 192k freed
NET: Registered protocol family 1
Adding 524656k swap on /dev/hda2.  Priority:-1 extents:1
Real Time Clock Driver v1.12
Serial: 8250/16550 driver $Revision: 1.90 $ 48 ports, IRQ sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
CAPI Subsystem Rev 1.21.6.8
ksysguardd: numerical sysctl 7 2 1 is obsolete.
8139too Fast Ethernet driver 0.9.26
divert: allocating divert_blk for eth0
eth0: RealTek RTL8139 at 0xd191d000, 00:e0:7d:b5:8b:2a, IRQ 17
eth0:  Identified 8139 chip type 'RTL-8139C'
eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
drivers/usb/core/usb.c: registered new driver usbfs
drivers/usb/core/usb.c: registered new driver hub
drivers/usb/host/uhci-hcd.c: USB Universal Host Controller Interface driver v2.1
uhci_hcd 0000:00:10.0: UHCI Host Controller
uhci_hcd 0000:00:10.0: irq 21, io base 0000e000
uhci_hcd 0000:00:10.0: new USB bus registered, assigned bus number 1
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
uhci_hcd 0000:00:10.1: UHCI Host Controller
uhci_hcd 0000:00:10.1: irq 21, io base 0000e400
uhci_hcd 0000:00:10.1: new USB bus registered, assigned bus number 2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
uhci_hcd 0000:00:10.2: UHCI Host Controller
uhci_hcd 0000:00:10.2: irq 21, io base 0000e800
uhci_hcd 0000:00:10.2: new USB bus registered, assigned bus number 3
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
ehci_hcd 0000:00:10.3: EHCI Host Controller
ehci_hcd 0000:00:10.3: irq 21, pci mem d191f000
ehci_hcd 0000:00:10.3: new USB bus registered, assigned bus number 4
ehci_hcd 0000:00:10.3: USB 2.0 enabled, EHCI 1.00, driver 2003-Jun-13
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 6 ports detected
hub 1-0:1.0: new USB device on port 1, assigned address 2
parport0: PC-style at 0x378 (0x778) [PCSPP,TRISTATE,EPP]
parport0: irq 7 detected
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
parport0: cpp_daisy: aa5500ff(38)
parport0: assign_addrs: aa5500ff(38)
lp0: using parport0 (polling).
lp0: console ready
NET: Registered protocol family 10
Disabled Privacy Extensions on device c0365ca0(lo)
IPv6 over IPv4 tunneling driver
divert: not allocating divert_blk for non-ethernet device sit0
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
mtrr: 0xd8000000,0x4000000 overlaps existing 0xd8000000,0x1000000
Linux video capture interface: v1.00
bttv: driver version 0.9.12 loaded
bttv: using 8 buffers with 2080k (520 pages) each for capture
bttv: Bt8xx card found (0).
bttv0: Bt848 (rev 18) at 0000:00:0c.0, irq: 18, latency: 32, mmio: 0xe2002000
bttv0: using: Hauppauge (bt848) [card=2,insmod option]
bttv0: Hauppauge/Voodoo msp34xx: reset line init [5]
bttv0: Hauppauge eeprom: model=60004, tuner=Philips FI1216 MK2 (5), radio=no
bttv0: using tuner=5
bttv0: i2c: checking for MSP34xx @ 0x80... not found
bttv0: i2c: checking for TDA9875 @ 0xb0... not found
bttv0: i2c: checking for TDA7432 @ 0x8a... not found
tvaudio: TV audio decoder + audio/video mux driver
tvaudio: known chips: tda9840,tda9873h,tda9874h/a,tda9850,tda9855,tea6300,tea6420,tda8425,pic16c54 (PV951),ta8874z
tuner: chip found @ 0xc2
tuner: type set to 5 (Philips PAL_BG (FI1216 and compatibles))
registering 0-0061
bttv0: registered device video0
bttv0: registered device vbi0
eth0: no IPv6 routers present
usb 1-1: USB disconnect, address 2
hub 1-0:1.0: new USB device on port 1, assigned address 3
drivers/usb/core/usb.c: registered new driver hiddev
drivers/usb/input/hid-ff.c: hid_ff_init could not find initializer
input: USB HID v1.00 Mouse [Logitech USB-PS/2 Mouse M-BA47] on usb-0000:00:10.0-1
drivers/usb/core/usb.c: registered new driver hid
drivers/usb/input/hid-core.c: v2.0:USB HID core driver
hda: tagged command queueing enabled, command queue depth 32
Creative EMU10K1 PCI Audio Driver, version 0.20a, 21:49:21 Oct 30 2003
emu10k1: EMU10K1 rev 4 model 0x20 found, IO at 0xd800-0xd81f, IRQ 19
ac97_codec: AC97  codec, id: TRA3 (TriTech TR28023)
ksysguardd: numerical sysctl 7 2 1 is obsolete.

[-- Attachment #2: signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


* Re: Processes receive SIGSEGV if TCQ is enabled
  2003-10-30 22:10   ` Thomas Schlichter
@ 2003-10-30 22:44     ` Ivan Gyurdiev
  2003-10-30 23:40       ` Nick Piggin
  0 siblings, 1 reply; 11+ messages in thread
From: Ivan Gyurdiev @ 2003-10-30 22:44 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Andrew Morton, B.Zolnierkiewicz, linux-kernel, linux-ide

Hi, perhaps this might also be relevant:

http://www.ussg.iu.edu/hypermail/linux/kernel/0307.2/1655.html

Have the issues described there finally been addressed?
I haven't been following development much lately, but I wanted to ask, 
since at the moment I have no confidence in TCQ. I know #1 has been 
fixed. Have there been any major TCQ changes that I need to retest for?

The only thing I see is this:

ChangeSet@1.1153.141.3, 2003-09-05 07:15:37-07:00, 
B.Zolnierkiewicz@elka.pw.edu.pl
   [PATCH] ide: fix ide_cs oops with TCQ

Thomas Schlichter wrote:
> Hello again...
> 
> On Thursday 30 October 2003 22:33, Andrew Morton wrote:
> 
>>Thomas Schlichter <schlicht@uni-mannheim.de> wrote:
>>
>>> today I tried to test TCQ with the linux-2.6.0-test9-mm1 kernel. The
>>>config.gz is attached. But after enabling TCQ with 'hdparm -Q1 /dev/hda'
>>>newly started processes die due to a received SIGSEGV. No bad kernel
>>>messages appear...




* Re: Processes receive SIGSEGV if TCQ is enabled
  2003-10-30 22:44     ` Ivan Gyurdiev
@ 2003-10-30 23:40       ` Nick Piggin
  2003-10-31  7:28         ` Thomas Schlichter
  0 siblings, 1 reply; 11+ messages in thread
From: Nick Piggin @ 2003-10-30 23:40 UTC (permalink / raw)
  To: Ivan Gyurdiev, Thomas Schlichter
  Cc: Andrew Morton, B.Zolnierkiewicz, linux-kernel, linux-ide

[-- Attachment #1: Type: text/plain, Size: 359 bytes --]

Hi,
If you're testing IDE TCQ, please try the following patch and use the
default io scheduler. It won't fix anything, but it poisons requests
so we can sometimes tell if they are being used in the wrong places.
I have seen warnings that lead me to believe this might be happening.
It's against 2.6.0-test9-mm1. Report any stack traces you see. Thanks.

Nick


[-- Attachment #2: as-req-debug.patch --]
[-- Type: text/plain, Size: 4329 bytes --]

 linux-2.6-npiggin/drivers/block/as-iosched.c |   69 ++++++++++++++++++++-------
 1 files changed, 53 insertions(+), 16 deletions(-)

diff -puN drivers/block/as-iosched.c~as-req-debug drivers/block/as-iosched.c
--- linux-2.6/drivers/block/as-iosched.c~as-req-debug	2003-10-31 09:42:27.000000000 +1100
+++ linux-2.6-npiggin/drivers/block/as-iosched.c	2003-10-31 10:33:00.000000000 +1100
@@ -136,6 +136,10 @@ enum arq_state {
 				   scheduler */
 	AS_RQ_DISPATCHED,	/* On the dispatch list. It belongs to the
 				   driver now */
+	AS_RQ_PRESCHED,		/* Debug poisoning for requests being used */
+	AS_RQ_REMOVED,
+	AS_RQ_MERGED,
+	AS_RQ_POSTSCHED,	/* when they shouldn't be */
 };
 
 struct as_rq {
@@ -897,8 +901,14 @@ static void as_completed_request(request
 
 	WARN_ON(!list_empty(&rq->queuelist));
 
-	if (unlikely(arq->state != AS_RQ_DISPATCHED))
-		return;
+	if (arq->state == AS_RQ_MERGED)
+		goto out;
+
+	if (arq->state != AS_RQ_REMOVED) {
+		printk("arq->state %d\n", arq->state);
+		WARN_ON(1);
+		goto out;
+	}
 
 	if (ad->changed_batch && ad->nr_dispatched == 1) {
 		WARN_ON(ad->batch_data_dir == arq->is_sync);
@@ -926,7 +936,7 @@ static void as_completed_request(request
 	}
 
 	if (!arq->io_context)
-		return;
+		goto out;
 
 	if (ad->io_context == arq->io_context) {
 		ad->antic_start = jiffies;
@@ -942,7 +952,7 @@ static void as_completed_request(request
 
 	aic = arq->io_context->aic;
 	if (!aic)
-		return;
+		goto out;
 
 	spin_lock(&aic->lock);
 	if (arq->is_sync == REQ_SYNC) {
@@ -952,6 +962,9 @@ static void as_completed_request(request
 	spin_unlock(&aic->lock);
 
 	put_io_context(arq->io_context);
+
+out:
+	arq->state = AS_RQ_POSTSCHED;
 }
 
 /*
@@ -1020,14 +1033,14 @@ static void as_remove_request(request_qu
 	struct as_rq *arq = RQ_DATA(rq);
 
 	if (unlikely(arq->state == AS_RQ_NEW))
-		return;
-
-	if (!arq) {
-		WARN_ON(1);
-		return;
-	}
+		goto out;
 
 	if (ON_RB(&arq->rb_node)) {
+		if (arq->state != AS_RQ_QUEUED) {
+			printk("arq->state %d\n", arq->state);
+			WARN_ON(1);
+			goto out;
+		}
 		/*
 		 * We'll lose the aliased request(s) here. I don't think this
 		 * will ever happen, but if it does, hopefully someone will
@@ -1035,8 +1048,16 @@ static void as_remove_request(request_qu
 		 */
 		WARN_ON(!list_empty(&rq->queuelist));
 		as_remove_queued_request(q, rq);
-	} else
+	} else {
+		if (arq->state != AS_RQ_DISPATCHED) {
+			printk("arq->state %d\n", arq->state);
+			WARN_ON(1);
+			goto out;
+		}
 		as_remove_dispatched_request(q, rq);
+	}
+out:
+	arq->state = AS_RQ_REMOVED;
 }
 
 /*
@@ -1276,8 +1297,6 @@ fifo_expired:
 			ad->new_batch = 1;
 
 		ad->changed_batch = 0;
-
-//		arq->request->flags |= REQ_SOFTBARRIER;
 	}
 
 	/*
@@ -1402,6 +1421,11 @@ static void as_requeue_request(request_q
 	struct as_rq *arq = RQ_DATA(rq);
 
 	if (arq) {
+		if (arq->state != AS_RQ_REMOVED) {
+			printk("arq->state %d\n", arq->state);
+			WARN_ON(1);
+		}
+
 		arq->state = AS_RQ_DISPATCHED;
 		if (arq->io_context && arq->io_context->aic)
 			atomic_inc(&arq->io_context->aic->nr_dispatched);
@@ -1421,12 +1445,18 @@ as_insert_request(request_queue_t *q, st
 	struct as_data *ad = q->elevator.elevator_data;
 	struct as_rq *arq = RQ_DATA(rq);
 
-#if 0
+ 	if (arq) {
+ 		if (arq->state != AS_RQ_PRESCHED) {
+ 			printk("arq->state: %d\n", arq->state);
+ 			WARN_ON(1);
+ 		}
+ 		arq->state = AS_RQ_NEW;
+ 	}
+
 	/* barriers must flush the reorder queue */
 	if (unlikely(rq->flags & (REQ_SOFTBARRIER | REQ_HARDBARRIER)
 			&& where == ELEVATOR_INSERT_SORT))
 		where = ELEVATOR_INSERT_BACK;
-#endif
 
 	switch (where) {
 		case ELEVATOR_INSERT_BACK:
@@ -1662,6 +1692,8 @@ as_merged_requests(request_queue_t *q, s
 	 */
 	as_remove_queued_request(q, next);
 	put_io_context(anext->io_context);
+
+	anext->state = AS_RQ_MERGED;
 }
 
 /*
@@ -1694,6 +1726,11 @@ static void as_put_request(request_queue
 		return;
 	}
 
+	if (arq->state != AS_RQ_POSTSCHED) {
+		printk("arq->state %d\n", arq->state);
+		WARN_ON(1);
+	}
+
 	mempool_free(arq, ad->arq_pool);
 	rq->elevator_private = NULL;
 }
@@ -1707,7 +1744,7 @@ static int as_set_request(request_queue_
 		memset(arq, 0, sizeof(*arq));
 		RB_CLEAR(&arq->rb_node);
 		arq->request = rq;
-		arq->state = AS_RQ_NEW;
+		arq->state = AS_RQ_PRESCHED;
 		arq->io_context = NULL;
 		INIT_LIST_HEAD(&arq->hash);
 		arq->on_hash = 0;

_
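
A note on the idea behind the debug patch above: each request carries a
state field, every scheduler hook checks that the request is in the state
it expects, and then stamps the next state. A minimal standalone sketch of
that pattern follows. It is not the kernel code; the names req_state,
struct req, req_transition and nr_warnings are hypothetical, and fprintf
stands in for the printk/WARN_ON(1) pair used in the patch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical lifecycle states, loosely modelled on the arq_state
 * values touched by the patch. Not the kernel's definitions.
 */
enum req_state {
	RQ_PRESCHED,	/* allocated, not yet handed to the scheduler */
	RQ_NEW,		/* inserted into the scheduler */
	RQ_QUEUED,	/* sitting in the scheduler's sort structures */
	RQ_DISPATCHED,	/* on the dispatch list, owned by the driver */
	RQ_REMOVED,	/* taken off the queue by the driver */
	RQ_POSTSCHED	/* completed; only freeing is legal now */
};

struct req {
	enum req_state state;
};

/* Number of illegal transitions seen so far (illustration only). */
int nr_warnings;

/*
 * Check that r is in the expected state, warn if it is not, and then
 * stamp the next state either way (the patch also sets the new state
 * after warning). Returns 1 for a legal transition, 0 for a violation.
 */
int req_transition(struct req *r, enum req_state expected,
		   enum req_state next)
{
	int ok;

	assert(r != NULL);

	ok = (r->state == expected);
	if (!ok) {
		fprintf(stderr, "req state %d, expected %d\n",
			(int)r->state, (int)expected);
		nr_warnings++;
	}
	r->state = next;
	return ok;
}
```

For example, calling req_transition(&r, RQ_PRESCHED, RQ_NEW) on a freshly
allocated request returns 1, while a following
req_transition(&r, RQ_DISPATCHED, RQ_REMOVED) on the same request (still in
RQ_NEW) returns 0 and bumps nr_warnings. That is the kind of out-of-order
use the patch tries to surface.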


* Re: Processes receive SIGSEGV if TCQ is enabled
  2003-10-30 23:40       ` Nick Piggin
@ 2003-10-31  7:28         ` Thomas Schlichter
  2003-10-31  7:37           ` Nick Piggin
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Schlichter @ 2003-10-31  7:28 UTC (permalink / raw)
  To: Nick Piggin, Ivan Gyurdiev
  Cc: Andrew Morton, B.Zolnierkiewicz, linux-kernel, linux-ide

[-- Attachment #1: Type: text/plain, Size: 1577 bytes --]

Hi,

On Friday 31 October 2003 00:40, Nick Piggin wrote:
> Hi,
> If you're testing IDE TCQ, please try the following patch and use the
> default io scheduler. It won't fix anything, but it poisons requests
> so we can sometimes tell if they are being used in the wrong places.
> I have seen warnings that lead me to believe this might be happening.
> It's against 2.6.0-test9-mm1. Report any stack traces you see. Thanks.

OK, I tested 2.6.0-test9-mm1 + your patch, but it does not seem to print any 
messages or stack traces, even though many processes are killed after setting 
the TCQ depth to 1.

Today, however, I got reiserfs corruption messages in the logs, but 
fsck.reiserfs could not find any corruption on the next boot, so I think this 
is not a real corruption but just reading the wrong data...

The messages are something like:

Oct 31 07:57:53 bigboss kernel: is_tree_node: node level 2120 does not match to the expected one 1
Oct 31 07:57:53 bigboss kernel: vs-5150: search_by_key: invalid format found in block 642544. Fsck?
Oct 31 07:57:53 bigboss kernel: vs-13070: reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [4767268 4767269 0x0 SD]
Oct 31 07:57:53 bigboss kernel: is_tree_node: node level 2120 does not match to the expected one 1
Oct 31 07:57:53 bigboss kernel: vs-5150: search_by_key: invalid format found in block 642544. Fsck?
Oct 31 07:57:53 bigboss kernel: vs-13070: reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [4767268 4767269 0x0 SD]

> Nick

  Thomas

* Re: Processes receive SIGSEGV if TCQ is enabled
  2003-10-31  7:28         ` Thomas Schlichter
@ 2003-10-31  7:37           ` Nick Piggin
  0 siblings, 0 replies; 11+ messages in thread
From: Nick Piggin @ 2003-10-31  7:37 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Ivan Gyurdiev, Andrew Morton, B.Zolnierkiewicz, linux-kernel, linux-ide



Thomas Schlichter wrote:

>Hi,
>
>On Friday 31 October 2003 00:40, Nick Piggin wrote:
>
>>Hi,
>>If you're testing IDE TCQ, please try the following patch and use the
>>default io scheduler. It won't fix anything, but it poisons requests
>>so we can sometimes tell if they are being used in the wrong places.
>>I have seen warnings that lead me to believe this might be happening.
>>It's against 2.6.0-test9-mm1. Report any stack traces you see. Thanks.
>>
>
>OK, I tested 2.6.0-test9-mm1 + your patch, but it does not seem to print any
>messages or stack traces, even though many processes are killed after setting
>the TCQ depth to 1.
>

OK, well that's good, it's not my problem then ;) Thanks.



* Re: Processes receive SIGSEGV if TCQ is enabled
  2003-10-30 17:48 ` Bartlomiej Zolnierkiewicz
@ 2003-10-31 13:00   ` Jens Axboe
  2003-10-31 13:11     ` Thomas Schlichter
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2003-10-31 13:00 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz; +Cc: Thomas Schlichter, linux-kernel, linux-ide

On Thu, Oct 30 2003, Bartlomiej Zolnierkiewicz wrote:
> 
> [ Jens - IDE TCQ maintainer 8) added to cc: ]

Not really (added, that is :)

> Could you also send dmesg output and retry with vanilla -test9?

It's probably via + tcq; that drive is known good with ide-tcq. At least
I've never seen problems with it.

-- 
Jens Axboe


* Re: Processes receive SIGSEGV if TCQ is enabled
  2003-10-31 13:00   ` Jens Axboe
@ 2003-10-31 13:11     ` Thomas Schlichter
  2003-10-31 14:44       ` Ivan Gyurdiev
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Schlichter @ 2003-10-31 13:11 UTC (permalink / raw)
  To: Jens Axboe, Ivan Gyurdiev, Bartlomiej Zolnierkiewicz
  Cc: linux-kernel, linux-ide

On Friday 31 October 2003 14:00, Jens Axboe wrote:
> It's probably via + tcq; that drive is known good with ide-tcq. At least
> I've never seen problems with it.

Maybe it is a problem with via + tcq... so, how do I debug it?

@Ivan: Did you also use a via chipset when reporting the problems mentioned in
http://www.ussg.iu.edu/hypermail/linux/kernel/0307.2/1655.html ? Because your 
point 4) indeed looks very similar to my problems...

  Thomas

* Re: Processes receive SIGSEGV if TCQ is enabled
  2003-10-31 13:11     ` Thomas Schlichter
@ 2003-10-31 14:44       ` Ivan Gyurdiev
  0 siblings, 0 replies; 11+ messages in thread
From: Ivan Gyurdiev @ 2003-10-31 14:44 UTC (permalink / raw)
  To: Thomas Schlichter
  Cc: Jens Axboe, Bartlomiej Zolnierkiewicz, linux-kernel, linux-ide

Thomas Schlichter wrote:
> On Friday 31 October 2003 14:00, Jens Axboe wrote:
> 
>>It's probably via + tcq; that drive is known good with ide-tcq. At least
>>I've never seen problems with it.
> 
> 
> Maybe it is a problem with via + tcq... so, how do I debug it?
> 
> @Ivan: Did you also use a via chipset when reporting the problems mentioned in
> http://www.ussg.iu.edu/hypermail/linux/kernel/0307.2/1655.html ? Because your 
> point 4) indeed looks very similar to my problems...
> 
>   Thomas


Yes - VIA chipset.

00:11.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT8233/A/C/VT8235 PIPC Bus Master IDE (rev 06)



Thread overview: 11+ messages
2003-10-30 15:01 Processes receive SIGSEGV if TCQ is enabled Thomas Schlichter
2003-10-30 17:48 ` Bartlomiej Zolnierkiewicz
2003-10-31 13:00   ` Jens Axboe
2003-10-31 13:11     ` Thomas Schlichter
2003-10-31 14:44       ` Ivan Gyurdiev
2003-10-30 21:33 ` Andrew Morton
2003-10-30 22:10   ` Thomas Schlichter
2003-10-30 22:44     ` Ivan Gyurdiev
2003-10-30 23:40       ` Nick Piggin
2003-10-31  7:28         ` Thomas Schlichter
2003-10-31  7:37           ` Nick Piggin
