* regression: CD burning (k3b) went broke
From: Mike Galbraith @ 2008-02-21 8:42 UTC
To: LKML; +Cc: Jens Axboe, Tejun Heo

Greetings,

K3b recently (9a4c854..5d9c4a7 pull) began terminally griping about buffer underrun upon every attempt to burn a CD. I can't fully bisect the problem because intervening kernels hang soft during boot.

Using git bisect visualize, and converting to postable text:

bisect/bad block: add request->raw_data_len (6b00769fe1502b4ad97bb327ef7ac971b208bfb5)
bisect block: update bio according to DMA alignment padding (40b01b9bbdf51ae543a04744283bf2d56c4a6afa)
libata: update ATAPI overflow draining bisect/good-e164094964e6e20fe7fce418e06a9dce952bb7a4

Serial console log of hung kernel 40b01b9bbdf51ae543a04744283bf2d56c4a6afa below

[ 0.000000] Linux version 2.6.25-rc2-smp (root@homer) (gcc version 4.2.1 (SUSE Linux)) #14 SMP PREEMPT Thu Feb 21 08:49:51 CET 2008 [ 0.000000] BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) [ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) [ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) [ 0.000000] BIOS-e820: 0000000000100000 - 000000003fff0000 (usable) [ 0.000000] BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS) [ 0.000000] BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data) [ 0.000000] BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved) [ 0.000000] 0MB HIGHMEM available. [ 0.000000] 1023MB LOWMEM available. [ 0.000000] Scan SMP from b0000000 for 1024 bytes. [ 0.000000] Scan SMP from b009fc00 for 1024 bytes. [ 0.000000] Scan SMP from b00f0000 for 65536 bytes.
[ 0.000000] found SMP MP-table at [b00f5320] 000f5320 [ 0.000000] Zone PFN ranges: [ 0.000000] DMA 0 -> 4096 [ 0.000000] Normal 4096 -> 262128 [ 0.000000] HighMem 262128 -> 262128 [ 0.000000] Movable zone start PFN for each node [ 0.000000] early_node_map[1] active PFN ranges [ 0.000000] 0: 0 -> 262128 [ 0.000000] DMI 2.3 present. [ 0.000000] ACPI: RSDP 000F6CC0, 0014 (r0 IntelR) [ 0.000000] ACPI: RSDT 3FFF3000, 002C (r1 IntelR AWRDACPI 42302E31 AWRD 0) [ 0.000000] ACPI: FACP 3FFF3040, 0074 (r1 IntelR AWRDACPI 42302E31 AWRD 0) [ 0.000000] ACPI: DSDT 3FFF30C0, 4139 (r1 INTELR AWRDACPI 1000 MSFT 100000E) [ 0.000000] ACPI: FACS 3FFF0000, 0040 [ 0.000000] ACPI: APIC 3FFF7200, 0068 (r1 IntelR AWRDACPI 42302E31 AWRD 0) [ 0.000000] ACPI: PM-Timer IO Port: 0x408 [ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) [ 0.000000] Processor #0 15:2 APIC version 20 [ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled) [ 0.000000] Processor #1 15:2 APIC version 20 [ 0.000000] WARNING: maxcpus limit of 1 reached. Processor ignored. [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1]) [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1]) [ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0]) [ 0.000000] IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23 [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) [ 0.000000] Enabling APIC mode: Flat. Using 1 I/O APICs [ 0.000000] Using ACPI (MADT) for SMP configuration information [ 0.000000] Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000) [ 0.000000] PM: Registered nosave memory: 000000000009f000 - 00000000000a0000 [ 0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000 [ 0.000000] PM: Registered nosave memory: 00000000000f0000 - 0000000000100000 [ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. 
Total pages: 260081 [ 0.000000] Kernel command line: root=/dev/sdb3 rootflags=data=writeback vga=0x314 resume=/dev/sdb2 console=ttyS0,115200n8 console=tty splash=silent PROFILE=default 1 maxcpus=1 [ 0.000000] Enabling fast FPU save and restore... done. [ 0.000000] Enabling unmasked SIMD FPU exception support... done. [ 0.000000] Initializing CPU#0 [ 0.000000] Preemptible RCU implementation. [ 0.000000] CPU 0 irqstacks, hard=b0427000 soft=b0425000 [ 0.000000] PID hash table entries: 4096 (order: 12, 16384 bytes) [ 0.000000] Detected 2992.603 MHz processor. [ 0.000999] Console: colour dummy device 80x25 [ 0.000999] console [tty0] enabled [ 0.000999] console [ttyS0] enabled [ 0.000999] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) [ 0.000999] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) [ 0.000999] Memory: 1028968k/1048512k available (1998k kernel code, 18904k reserved, 955k data, 236k init, 0k highmem) [ 0.000999] virtual kernel memory layout: [ 0.000999] fixmap : 0xfff9b000 - 0xfffff000 ( 400 kB) [ 0.000999] pkmap : 0xff800000 - 0xffc00000 (4096 kB) [ 0.000999] vmalloc : 0xf0800000 - 0xff7fe000 ( 239 MB) [ 0.000999] lowmem : 0xb0000000 - 0xefff0000 (1023 MB) [ 0.000999] .init : 0xb03e7000 - 0xb0422000 ( 236 kB) [ 0.000999] .data : 0xb02f3b26 - 0xb03e29a8 ( 955 kB) [ 0.000999] .text : 0xb0100000 - 0xb02f3b26 (1998 kB) [ 0.000999] Checking if this processor honours the WP bit even in supervisor mode...Ok. [ 0.060993] Calibrating delay using timer specific routine.. 5987.55 BogoMIPS (lpj=2993775) [ 0.063022] Security Framework initialized [ 0.064010] Mount-cache hash table entries: 512 [ 0.065129] CPU: Trace cache: 12K uops, L1 D cache: 8K [ 0.066992] CPU: L2 cache: 512K [ 0.067992] CPU: Physical Processor ID: 0 [ 0.068993] Intel machine check architecture supported. [ 0.069994] Intel machine check reporting enabled on CPU#0. 
[ 0.070991] CPU0: Intel P4/Xeon Extended MCE MSRs (12) available [ 0.071992] CPU0: Thermal monitoring enabled [ 0.072993] Compat vDSO mapped to ffffe000. [ 0.073996] Checking 'hlt' instruction... OK. [ 0.079743] SMP alternatives: switching to UP code [ 0.079998] Freeing SMP alternatives: 9k freed [ 0.080991] ACPI: Core revision 20070126 [ 0.091025] CPU0: Intel(R) Pentium(R) 4 CPU 3.00GHz stepping 09 [ 0.094019] Total of 1 processors activated (5987.55 BogoMIPS). [ 0.095110] ENABLING IO-APIC IRQs [ 0.096152] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1 [ 0.107983] Brought up 1 CPUs [ 0.108208] net_namespace: 552 bytes [ 0.110058] NET: Registered protocol family 16 [ 0.111195] ACPI: bus type pci registered [ 0.114122] PCI: PCI BIOS revision 2.10 entry at 0xfb980, last bus=2 [ 0.114984] PCI: Using configuration type 1 [ 0.115983] Setting up standard PCI resources [ 0.139635] ACPI: Interpreter enabled [ 0.139985] ACPI: (supports S0 S3 S4 S5) [ 0.142261] ACPI: Using IOAPIC for interrupt routing [ 0.148393] ACPI: PCI Root Bridge [PCI0] (0000:00) [ 0.149406] pci 0000:00:1f.0: quirk: region 0400-047f claimed by ICH4 ACPI/GPIO/TCO [ 0.149981] pci 0000:00:1f.0: quirk: region 0480-04bf claimed by ICH4 GPIO [ 0.151395] PCI: Transparent bridge - 0000:00:1e.0 [ 0.160727] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 7 9 10 11 12 14 15) [ 0.164091] ACPI: PCI Interrupt Link [LNKB] (IRQs *3 4 5 7 9 10 11 12 14 15) [ 0.168383] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 7 9 10 11 12 14 15) [ 0.171666] ACPI: PCI Interrupt Link [LNKD] (IRQs *3 4 5 7 9 10 11 12 14 15) [ 0.175277] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 9 10 *11 12 14 15) [ 0.179656] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 *9 10 11 12 14 15) [ 0.183050] ACPI: PCI Interrupt Link [LNK0] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled. 
[ 0.188275] ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 7 9 10 *11 12 14 15) [ 0.192014] Linux Plug and Play Support v0.97 (c) Adam Belay [ 0.193005] pnp: PnP ACPI init [ 0.193979] ACPI: bus type pnp registered [ 0.199104] pnpacpi: exceeded the max number of mem resources: 12 [ 0.200035] pnp: PnP ACPI: found 13 devices [ 0.200971] ACPI: ACPI bus type pnp unregistered [ 0.202306] PCI: Using ACPI for IRQ routing [ 0.202977] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report [ 0.226967] NetLabel: Initializing [ 0.226969] NetLabel: domain hash size = 128 [ 0.227968] NetLabel: protocols = UNLABELED CIPSOv4 [ 0.228981] NetLabel: unlabeled traffic allowed by default [ 0.230055] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 11 [ 0.232158] hpet0: 3 64-bit timers, 14318180 Hz [ 0.235002] ACPI: RTC can wake from S4 [ 0.235966] Time: tsc clocksource has been installed. [ 0.243966] system 00:01: ioport range 0xb78-0xb7b has been reserved [ 0.243969] system 00:01: ioport range 0xf78-0xf7b has been reserved [ 0.244971] system 00:01: ioport range 0xa78-0xa7b has been reserved [ 0.245968] system 00:01: ioport range 0xe78-0xe7b has been reserved [ 0.246967] system 00:01: ioport range 0xbbc-0xbbf has been reserved [ 0.247967] system 00:01: ioport range 0xfbc-0xfbf has been reserved [ 0.248968] system 00:01: ioport range 0x4d0-0x4d1 has been reserved [ 0.249967] system 00:01: ioport range 0x200-0x200 has been reserved [ 0.250967] system 00:01: ioport range 0x202-0x208 has been reserved [ 0.251967] system 00:01: ioport range 0x320-0x32f has been reserved [ 0.252966] system 00:01: ioport range 0x295-0x296 has been reserved [ 0.254970] system 00:0b: ioport range 0x400-0x4bf could not be reserved [ 0.255973] system 00:0c: iomem range 0xf0000-0xf3fff could not be reserved [ 0.256966] system 00:0c: iomem range 0xf4000-0xf7fff could not be reserved [ 0.257966] system 00:0c: iomem range 0xf8000-0xfbfff could not be reserved [ 0.258966] system 00:0c: iomem range 
0xfc000-0xfffff could not be reserved [ 0.259966] system 00:0c: iomem range 0x3fff0000-0x3fffffff could not be reserved [ 0.260965] system 00:0c: iomem range 0x0-0x9ffff could not be reserved [ 0.261965] system 00:0c: iomem range 0x100000-0x3ffeffff could not be reserved [ 0.262965] system 00:0c: iomem range 0xfec00000-0xfec00fff could not be reserved [ 0.263965] system 00:0c: iomem range 0xfec01000-0xfed8ffff could not be reserved [ 0.264965] system 00:0c: iomem range 0xfee00000-0xfee00fff could not be reserved [ 0.265965] system 00:0c: iomem range 0xffb00000-0xffbfffff could not be reserved [ 0.266964] system 00:0c: iomem range 0xfff00000-0xffffffff could not be reserved [ 0.298911] PCI: Bridge: 0000:00:01.0 [ 0.298960] IO window: a000-afff [ 0.299963] MEM window: 0xf8000000-0xf9ffffff [ 0.300961] PREFETCH window: 0x00000000e8000000-0x00000000f7ffffff [ 0.301963] PCI: Bridge: 0000:00:1e.0 [ 0.302959] IO window: b000-bfff [ 0.303961] MEM window: 0xfa000000-0xfa0fffff [ 0.304960] PREFETCH window: disabled. [ 0.305994] NET: Registered protocol family 2 [ 0.325957] IP route cache hash table entries: 32768 (order: 5, 131072 bytes) [ 0.326233] TCP established hash table entries: 131072 (order: 8, 1048576 bytes) [ 0.328385] TCP bind hash table entries: 65536 (order: 7, 524288 bytes) [ 0.329394] TCP: Hash tables configured (established 131072 bind 65536) [ 0.329963] TCP reno registered [ 0.337961] Unpacking initramfs... done [ 0.574326] Freeing initrd memory: 6128k freed [ 0.576000] Machine check exception polling timer started. 
[ 0.577268] audit: initializing netlink socket (disabled) [ 0.577937] type=2000 audit(1203584121.788:1): initialized [ 0.579060] Total HugeTLB memory allocated, 0 [ 0.579988] VFS: Disk quotas dquot_6.5.1 [ 0.580947] Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) [ 0.582937] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253) [ 0.583930] io scheduler noop registered [ 0.584925] io scheduler anticipatory registered [ 0.585924] io scheduler deadline registered [ 0.586932] io scheduler cfq registered (default) [ 0.589129] vesafb: framebuffer at 0xe8000000, mapped to 0xf0880000, using 1875k, total 16384k [ 0.589925] vesafb: mode is 800x600x16, linelength=1600, pages=16 [ 0.590924] vesafb: protected mode interface info at c000:b544 [ 0.591926] vesafb: pmi: set display start = b00cb5d2, set palette = b00cb612 [ 0.592923] vesafb: scrolling: redraw [ 0.593924] vesafb: Truecolor: size=0:5:6:5, shift=0:11:5:0 [ 0.612485] Console: switching to colour frame buffer device 100x37 [ 0.627919] fb0: VESA VGA frame buffer device [ 0.638247] Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled [ 0.639029] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A [ 0.641591] 00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A [ 0.642200] PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1 [ 0.642917] PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp [ 0.644280] serio: i8042 KBD port at 0x60,0x64 irq 1 [ 0.645234] mice: PS/2 mouse device common for all mice [ 0.675685] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0 [ 0.687684] rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0 [ 0.687709] rtc0: alarms up to one month [ 0.688747] cpuidle: using governor ladder [ 0.689689] cpuidle: using governor menu [ 0.690751] oprofile: using NMI interrupt. 
[ 0.693003] NET: Registered protocol family 1 [ 0.693759] p4-clockmod: P4/Xeon(TM) CPU On-Demand Clock Modulation available [ 0.694686] Using IPI No-Shortcut mode [ 0.695814] registered taskstats version 1 [ 0.696813] rtc_cmos 00:03: setting system clock to 2008-02-21 08:55:23 UTC (1203584123) [ 0.698738] Freeing unused kernel memory: 236k freed [ 0.699701] Write protecting the kernel text: 2000k [ 0.700690] Write protecting the kernel read-only data: 792k [ 0.762206] ACPI: ACPI0007:00 is registered as cooling_device0 [ 0.768027] ACPI: LNXTHERM:01 is registered as thermal_zone0 [ 0.768870] ACPI: Thermal Zone [THRM] (40 C) [ 0.785027] SCSI subsystem initialized [ 0.806767] ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 18 [ 0.808835] scsi0 : ata_piix [ 0.809721] scsi1 : ata_piix [ 0.812130] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14 [ 0.812702] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15 [ 1.290553] ata1.00: ATA-6: ST3160021A, 3.04, max UDMA/100 [ 1.290558] ata1.00: 312581808 sectors, multi 16: LBA48 [ 1.291580] ata1.01: ATAPI: BENQ DVD DD DW1625, BBIA, max UDMA/33 [ 1.314801] ata1.00: configured for UDMA/100 [ 1.473512] ata1.01: configured for UDMA/33 [ 3.788245] ata2.00: ATA-6: ST3120022A, 3.06, max UDMA/100 [ 3.788248] ata2.00: 234441648 sectors, multi 16: LBA48 [ 3.811575] ata2.00: configured for UDMA/100 [ 3.823087] scsi 0:0:0:0: Direct-Access ATA ST3160021A 3.04 PQ: 0 ANSI: 5 [ 3.825453] scsi 0:0:1:0: CD-ROM BENQ DVD DD DW1625 BBIA PQ: 0 ANSI: 5 [ 3.825587] scsi 1:0:0:0: Direct-Access ATA ST3120022A 3.06 PQ: 0 ANSI: 5 [ 3.831138] ACPI: PNP0C0B:00 is registered as cooling_device1 [ 3.831481] ACPI: Fan [FAN] (on) [ 3.856766] BIOS EDD facility v0.16 2004-Jun-25, 6 devices found [ 3.999333] Driver 'sd' needs updating - please use bus_type methods [ 3.999551] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB) [ 4.000464] sd 0:0:0:0: [sda] Write Protect is off [ 4.001459] sd 0:0:0:0: [sda] 
Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 4.002528] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB) [ 4.003475] sd 0:0:0:0: [sda] Write Protect is off [ 4.004458] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 4.005586] sda: sda1 sda2 < sda5 sda6 > [ 4.048569] sd 0:0:0:0: [sda] Attached SCSI disk [ 4.049519] sd 1:0:0:0: [sdb] 234441648 512-byte hardware sectors (120034 MB) [ 4.050439] sd 1:0:0:0: [sdb] Write Protect is off [ 4.051473] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 4.052505] sd 1:0:0:0: [sdb] 234441648 512-byte hardware sectors (120034 MB) [ 4.053436] sd 1:0:0:0: [sdb] Write Protect is off [ 4.054450] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 4.055498] sdb: sdb1 sdb2 sdb3 [ 4.063528] sd 1:0:0:0: [sdb] Attached SCSI disk [ 37.473476] SysRq : Show State [ 37.474338] task PC stack pid father [ 37.474338] init S ef836eac 0 1 0 [ 37.474338] ef836f00 00000082 ffffffff ef836eac b01224d3 00000000 00000000 00000001 [ 37.474338] e80cb97b 00000000 b0421180 b0421180 b0421180 ef835020 ef83527c b180b180 [ 37.474338] 00000000 ef836000 ee6e0c80 b1031ae0 00000000 00000000 ee5b9148 ef836f14 [ 37.474338] Call Trace: [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7 [ 37.474338] [<b016de52>] ? do_wp_page+0x2a0/0x3e5 [ 37.474338] [<b01d218e>] ? security_task_wait+0xf/0x11 [ 37.474338] [<b0129e1f>] do_wait+0x470/0xa1a [ 37.474338] [<b01249e9>] ? wake_up_new_task+0x77/0x91 [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd [ 37.474338] [<b012a431>] sys_wait4+0x68/0x9f [ 37.474338] [<b012a48f>] sys_waitpid+0x27/0x29 [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb [ 37.474338] [<b02f0000>] ? 
relay_hotcpu_callback+0x51/0xb1 [ 37.474338] ======================= [ 37.474338] kthreadd S 00000000 0 2 0 [ 37.474338] ef83bfc8 00000046 ef8b50a0 00000000 00000092 ef83bf80 00000000 00000000 [ 37.474338] ee5f776c 00000000 b0421180 b0421180 b0421180 ef83a0a0 ef83a2fc b180b180 [ 37.474338] 00000000 ef83b000 ee564d00 00000000 00000ae7 00000000 ee5f6e1c ee5f6e20 [ 37.474338] Call Trace: [ 37.474338] [<b011d295>] ? complete+0x43/0x4b [ 37.474338] [<b0139450>] kthreadd+0x13a/0x13f [ 37.474338] [<b0139316>] ? kthreadd+0x0/0x13f [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] migration/0 S 0102c262 0 3 2 [ 37.474338] ef83df98 00000046 ef83c120 0102c262 00000246 ef83df4c 00000000 ef83df4c [ 37.474338] 066fb2fc 00000000 b0421180 b0421180 b0421180 ef83c120 ef83c37c b180b180 [ 37.474338] 00000000 ef83d000 b03ba200 00000000 00000000 00000000 00000000 b0421180 [ 37.474338] Call Trace: [ 37.474338] [<b0122f2a>] ? migration_thread+0x0/0x210 [ 37.474338] [<b012304e>] migration_thread+0x124/0x210 [ 37.474338] [<b0122f2a>] ? migration_thread+0x0/0x210 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] ksoftirqd/0 R running 0 4 2 [ 37.474338] f4744940 00000008 ef841f14 b01424d9 f4650f99 00000008 00000000 01808638 [ 37.474338] b18086c0 b1808638 b1808600 ef841f50 b013ca83 b1808600 b1808604 f4744940 [ 37.474338] 00000008 f465092e 00000008 f465092e 00000046 b1807120 00000000 00000246 [ 37.474338] Call Trace: [ 37.474338] [<b01424d9>] ? tick_program_event+0x3f/0x61 [ 37.474338] [<b013ca83>] ? hrtimer_interrupt+0x13f/0x164 [ 37.474338] [<b012c055>] ? irq_exit+0x3f/0x79 [ 37.474338] [<b0113103>] ? smp_apic_timer_interrupt+0x5c/0x89 [ 37.474338] [<b012c055>] ? irq_exit+0x3f/0x79 [ 37.474338] [<b01057bc>] ? apic_timer_interrupt+0x28/0x30 [ 37.474338] [<b01079d9>] ? 
do_softirq+0x35/0x8c [ 37.474338] [<b012c515>] ? ksoftirqd+0xad/0x17f [ 37.474338] [<b012c468>] ? ksoftirqd+0x0/0x17f [ 37.474338] [<b01392f4>] ? kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] ? kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] events/0 R running 0 5 2 [ 37.474338] ef843fa0 00000046 ef8112c0 ef843f44 b0136d0e 00000a75 00000000 00000a75 [ 37.474338] 60e9afad 00000008 b0421180 b0421180 b0421180 ef8420a0 ef8422fc b180b180 [ 37.474338] 00000000 ef843000 ee6e0880 b18089e0 3399ec88 00000002 b0421180 b0421180 [ 37.474338] Call Trace: [ 37.474338] [<b0136d0e>] ? queue_delayed_work+0x40/0x48 [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] khelper S b02f12e5 0 6 2 [ 37.474338] ef845fa0 00000046 ef845f3c b02f12e5 ef836d40 ef836d44 00000000 ef845f44 [ 37.474338] 2a9ba799 00000000 b0421180 b0421180 b0421180 ef844120 ef84437c b180b180 [ 37.474338] 00000000 ef845000 ee6e0880 ef822480 00000000 00000000 b0421180 b0421180 [ 37.474338] Call Trace: [ 37.474338] [<b02f12e5>] ? preempt_schedule+0x40/0x55 [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? 
kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] kblockd/0 S ef887f4c 0 35 2 [ 37.474338] ef887fa0 00000046 00000246 ef887f4c b01224ed ef887f4c 00000000 00000000 [ 37.474338] 0800f107 00000000 b0421180 b0421180 b0421180 ef886120 ef88637c b180b180 [ 37.474338] 00000000 ef887000 b03ba200 08006f94 00000c39 00000000 b0421180 b0421180 [ 37.474338] Call Trace: [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7 [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] kacpid S ef88cf4c 0 37 2 [ 37.474338] ef88cfa0 00000046 00000246 ef88cf4c b01224ed ef88cf4c 00000000 00000000 [ 37.474338] 0801a506 00000000 b0421180 b0421180 b0421180 ef88b0a0 ef88b2fc b180b180 [ 37.474338] 00000000 ef88c000 b03ba200 08018266 00000000 00000000 b0421180 b0421180 [ 37.474338] Call Trace: [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7 [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] kacpi_notify S ef88ff4c 0 38 2 [ 37.474338] ef88ffa0 00000046 00000246 ef88ff4c b01224ed ef88ff4c 00000000 00000000 [ 37.474338] 0886f6cd 00000000 b0421180 b0421180 b0421180 ef88e120 ef88e37c b180b180 [ 37.474338] 00000000 ef88f000 b03ba200 0801c44e 00000000 00000000 b0421180 b0421180 [ 37.474338] Call Trace: [ 37.474338] [<b01224ed>] ? 
hrtick_set+0x99/0xf7 [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] cqueue/0 S ef8e0f4c 0 111 2 [ 37.474338] ef8e0fa0 00000046 00000246 ef8e0f4c b01224ed ef8e0f4c 00000000 00000000 [ 37.474338] 0c0b0eaf 00000000 b0421180 b0421180 b0421180 ef8bd0a0 ef8bd2fc b180b180 [ 37.474338] 00000000 ef8e0000 b03ba200 0c0abebc 00000cf2 00000000 b0421180 b0421180 [ 37.474338] Call Trace: [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7 [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] kseriod S b011d321 0 115 2 [ 37.474338] ef8e4f8c 00000046 ef8e4f34 b011d321 00000000 00000000 00000000 ef88d440 [ 37.474338] 28fd3c39 00000000 b0421180 b0421180 b0421180 ef874120 ef87437c b180b180 [ 37.474338] 00000000 ef8e4000 b03ba200 00000000 00000000 00000000 b03cc440 ef88d43c [ 37.474338] Call Trace: [ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42 [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d [ 37.474338] [<b0255c33>] serio_thread+0xc2/0x32f [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 [ 37.474338] [<b0255b71>] ? serio_thread+0x0/0x32f [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? 
kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] pdflush S 00000000 0 148 2 [ 37.474338] ee52cfa4 00000046 b180b180 00000000 ee52c000 ee52cf68 00000000 000001fa [ 37.474338] 22833768 00000000 b0421180 b0421180 b0421180 ef864120 ef86437c b180b180 [ 37.474338] 00000000 ee52c000 b03ba200 b012229c 00000000 00000000 ef864120 b180b180 [ 37.474338] Call Trace: [ 37.474338] [<b012229c>] ? set_cpus_allowed+0x50/0xb8 [ 37.474338] [<b02f2c57>] ? _spin_unlock_irqrestore+0x1f/0x21 [ 37.474338] [<b011eb19>] ? set_user_nice+0xcf/0xdf [ 37.474338] [<b0167dd1>] ? pdflush+0x0/0x1b4 [ 37.474338] [<b0167e82>] pdflush+0xb1/0x1b4 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] pdflush S 00000286 0 149 2 [ 37.474338] ee52dfa4 00000046 b03c16e0 00000286 ee52df54 b0130184 00000000 00000000 [ 37.474338] 60e9f95c 00000008 b0421180 b0421180 b0421180 ef8620a0 ef8622fc b180b180 [ 37.474338] 00000000 ee52d000 ee6e0680 00000000 00000000 00000000 00000000 00000000 [ 37.474338] Call Trace: [ 37.474338] [<b0130184>] ? __mod_timer+0xa0/0xaf [ 37.474338] [<b0167dd1>] ? pdflush+0x0/0x1b4 [ 37.474338] [<b0167e82>] pdflush+0xb1/0x1b4 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] kswapd0 S 00000000 0 150 2 [ 37.474338] ee52ef2c 00000046 00000000 00000000 00000000 00000000 00000000 000004fa [ 37.474338] 22918b1f 00000000 b0421180 b0421180 b0421180 ef85e020 ef85e27c b180b180 [ 37.474338] 00000000 ee52e000 b03ba200 b012229c 00000000 00000000 b1807b00 b180b180 [ 37.474338] Call Trace: [ 37.474338] [<b012229c>] ? set_cpus_allowed+0x50/0xb8 [ 37.474338] [<b011ab2c>] ? __dequeue_entity+0x31/0x35 [ 37.474338] [<b0139711>] ? 
prepare_to_wait+0x48/0x4d [ 37.474338] [<b016aefe>] ? kswapd+0x0/0x4a2 [ 37.474338] [<b016b38e>] kswapd+0x490/0x4a2 [ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 [ 37.474338] [<b011d295>] ? complete+0x43/0x4b [ 37.474338] [<b016aefe>] ? kswapd+0x0/0x4a2 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] aio/0 S ee530f4c 0 151 2 [ 37.474338] ee530fa0 00000046 00000246 ee530f4c b01224ed ee530f4c 00000000 00000000 [ 37.474338] 22bec65c 00000000 b0421180 b0421180 b0421180 ef85c120 ef85c37c b180b180 [ 37.474338] 00000000 ee530000 b03ba200 2291af16 00000000 00000000 b0421180 b0421180 [ 37.474338] Call Trace: [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7 [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] kpsmoused S ee653f4c 0 378 2 [ 37.474338] ee653fa0 00000046 00000246 ee653f4c b01224ed ee653f4c 00000000 00000000 [ 37.474338] 26811a10 00000000 b0421180 b0421180 b0421180 ee6ba020 ee6ba27c b180b180 [ 37.474338] 00000000 ee653000 b03ba200 2680eca0 00000000 00000000 b0421180 b0421180 [ 37.474338] Call Trace: [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7 [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? 
kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] kondemand/0 S ee5b0f4c 0 384 2 [ 37.474338] ee5b0fa0 00000046 00000246 ee5b0f4c b01224ed ee5b0f4c 00000000 00000000 [ 37.474338] 292b36fc 00000000 b0421180 b0421180 b0421180 ee5cc120 ee5cc37c b180b180 [ 37.474338] 00000000 ee5b0000 b03ba200 290d4c43 00000b94 00000000 b0421180 b0421180 [ 37.474338] Call Trace: [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7 [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] blogd S 00000000 0 423 1 [ 37.474338] ee6d0b10 00000082 000004b3 00000000 b1807120 15b2b2c0 00000000 ef8420a0 [ 37.474338] 60e9cf9d 00000008 b0421180 b0421180 b0421180 ee5e60a0 ee5e62fc b180b180 [ 37.474338] 00000000 ee6d0000 ee6e0680 b0130059 339a08bf 00000002 ee6d0b20 00000286 [ 37.474338] Call Trace: [ 37.474338] [<b0130059>] ? lock_timer_base+0x1f/0x40 [ 37.474338] [<b0130184>] ? __mod_timer+0xa0/0xaf [ 37.474338] [<b02f14ba>] schedule_timeout+0x44/0xa4 [ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36 [ 37.474338] [<b012fcda>] ? process_timeout+0x0/0xa [ 37.474338] [<b02f14b5>] ? schedule_timeout+0x3f/0xa4 [ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c [ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7 [ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e [ 37.474338] [<b020e60c>] ? cfb_fillrect+0x138/0x2bd [ 37.474338] [<b013c088>] ? 
enqueue_hrtimer+0x75/0xf3 [ 37.474338] [<b0109fbc>] ? read_tsc+0x8/0xa [ 37.474338] [<b013e472>] ? getnstimeofday+0x34/0xdf [ 37.474338] [<b013c90f>] ? ktime_get_ts+0x44/0x49 [ 37.474338] [<b013c925>] ? ktime_get+0x11/0x30 [ 37.474338] [<b011b82b>] ? hrtick_start_fair+0x10d/0x144 [ 37.474338] [<b011b900>] ? enqueue_task_fair+0x52/0x56 [ 37.474338] [<b011a3f1>] ? enqueue_task+0x4c/0x58 [ 37.474338] [<b011eb76>] ? try_to_wake_up+0x4d/0x1be [ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb [ 37.474338] [<b013c6ed>] ? hrtimer_start+0xc7/0x13c [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7 [ 37.474338] [<b01042b4>] ? do_notify_resume+0x55/0x79e [ 37.474338] [<b0127ad4>] ? release_console_sem+0x1c4/0x1d4 [ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42 [ 37.474338] [<b0232d6a>] ? tty_ldisc_deref+0x55/0x6e [ 37.474338] [<b018e3fc>] sys_select+0xd7/0x1a2 [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb [ 37.474338] ======================= [ 37.474338] blogd R running 0 424 1 [ 37.474338] ee6cbdf8 00000082 b18086c0 ee6cbda0 b01e99b7 b1808640 00000000 b180b5f4 [ 37.474338] 60e8dd8e 00000008 b0421180 b0421180 b0421180 ee5d70a0 ee5d72fc b180b180 [ 37.474338] 00000000 ee6cb000 ee6e0680 ee6cbdf8 375f44e1 00000003 ee6cbdec b041e600 [ 37.474338] Call Trace: [ 37.474338] [<b01e99b7>] ? rb_insert_color+0x77/0xd8 [ 37.474338] [<b0143bfe>] futex_wait+0x285/0x2d3 [ 37.474338] [<b013c31c>] ? hrtimer_wakeup+0x0/0x1c [ 37.474338] [<b0143b3d>] ? futex_wait+0x1c4/0x2d3 [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd [ 37.474338] [<b0144c2d>] do_futex+0x20d/0xa5f [ 37.474338] [<b0168667>] ? pagevec_lookup_tag+0x25/0x2e [ 37.474338] [<b016158a>] ? wait_on_page_writeback_range+0x5b/0xf0 [ 37.474338] [<b0109fbc>] ? read_tsc+0x8/0xa [ 37.474338] [<b013e472>] ? getnstimeofday+0x34/0xdf [ 37.474338] [<b013c90f>] ? ktime_get_ts+0x44/0x49 [ 37.474338] [<b013c925>] ? ktime_get+0x11/0x30 [ 37.474338] [<b0145502>] sys_futex+0x83/0xe8 [ 37.474338] [<b01ec504>] ? 
copy_to_user+0x2a/0x36 [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb [ 37.474338] ======================= [ 37.474338] ata/0 S ee7b3f8c 0 438 2 [ 37.474338] ee7b3fa0 00000046 00000002 ee7b3f8c ee7b3f80 ee7b3f44 00000000 002dae6f [ 37.474338] e3f3f21f 00000000 b0421180 b0421180 b0421180 ee6be120 ee6be37c b180b180 [ 37.474338] 00000000 ee7b3000 ee6e0a80 ef906480 00000000 00000000 b0421180 b0421180 [ 37.474338] Call Trace: [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] ata_aux S ee7b2f4c 0 439 2 [ 37.474338] ee7b2fa0 00000046 ffffffff ee7b2f4c b01224d3 00000000 00000000 00000000 [ 37.474338] 2fc09ed2 00000000 b0421180 b0421180 b0421180 ee5dd020 ee5dd27c b180b180 [ 37.474338] 00000000 ee7b2000 ee6e0280 2f81ef36 00000000 00000000 b0421180 b0421180 [ 37.474338] Call Trace: [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7 [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] scsi_eh_0 S ef906520 0 445 2 [ 37.474338] ee585f64 00000046 ef906524 ef906520 00000000 00000092 00000000 b011d321 [ 37.474338] e4331293 00000000 b0421180 b0421180 b0421180 ee5e4020 ee5e427c b180b180 [ 37.474338] 00000000 ee585000 ee6e0480 ee702008 00108a29 00000000 f084c295 ee702000 [ 37.474338] Call Trace: [ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42 [ 37.474338] [<f084c295>] ? 
__scsi_iterate_devices+0x5d/0x6b [scsi_mod] [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod] [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod] [ 37.474338] [<f085033a>] scsi_error_handler+0x37/0x4eb [scsi_mod] [ 37.474338] [<b011d295>] ? complete+0x43/0x4b [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod] [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] scsi_eh_1 S ee622520 0 446 2 [ 37.474338] ee5acf64 00000046 ee622524 ee622520 00000000 00000092 00000000 b011d321 [ 37.474338] e4333e8f 00000000 b0421180 b0421180 b0421180 ee5d10a0 ee5d12fc b180b180 [ 37.474338] 00000000 ee5ac000 ee6e0480 ee702808 00000000 00000000 f084c295 ee702800 [ 37.474338] Call Trace: [ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42 [ 37.474338] [<f084c295>] ? __scsi_iterate_devices+0x5d/0x6b [scsi_mod] [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod] [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod] [ 37.474338] [<f085033a>] scsi_error_handler+0x37/0x4eb [scsi_mod] [ 37.474338] [<b011d295>] ? complete+0x43/0x4b [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod] [ 37.474338] [<b01392f4>] kthread+0x37/0x59 [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 [ 37.474338] ======================= [ 37.474338] udevd S 00000000 0 461 1 [ 37.474338] ee5d8b10 00000082 00000000 00000000 00000000 00000000 00000000 ee5d8ab4 [ 37.474338] 93a637c5 00000001 b0421180 b0421180 b0421180 ee6c20a0 ee6c22fc b180b180 [ 37.474338] 00000000 ee5d8000 ee6e0480 b1807980 00000000 00000000 000f41a9 00000000 [ 37.474338] Call Trace: [ 37.474338] [<b011b920>] ? __update_rq_clock+0x1c/0x157 [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4 [ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36 [ 37.474338] [<b018e2c1>] ? 
__pollwait+0x67/0xcb [ 37.474338] [<b0186ef4>] ? pipe_poll+0x29/0x8f [ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c [ 37.474338] [<b013fdbc>] ? clocksource_get_next+0x3d/0x44 [ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd [ 37.474338] [<b01414ad>] ? clockevents_program_event+0x93/0x108 [ 37.474338] [<b01424d9>] ? tick_program_event+0x3f/0x61 [ 37.474338] [<b01eab55>] ? number+0x2a3/0x2b5 [ 37.474338] [<b01057bc>] ? apic_timer_interrupt+0x28/0x30 [ 37.474338] [<b019007b>] ? fcntl_getlk64+0x4e/0x159 [ 37.474338] [<b01eb3a7>] ? vsnprintf+0x2e8/0x5ea [ 37.474338] [<b0119234>] ? kmap_atomic_prot+0x47/0xa8 [ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443 [ 37.474338] [<b016625f>] ? __alloc_pages+0x57/0x32d [ 37.474338] [<b016c409>] ? __inc_zone_page_state+0x18/0x1a [ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb [ 37.474338] [<b0165c04>] ? free_hot_page+0xa/0xc [ 37.474338] [<b0168bf3>] ? put_page+0x2d/0xac [ 37.474338] [<b0176340>] ? free_page_and_swap_cache+0x1e/0x3e [ 37.474338] [<b016e44a>] ? unmap_vmas+0x317/0x54b [ 37.474338] [<b0117410>] ? pgd_dtor+0x0/0x4a [ 37.474338] [<b011740e>] ? check_pgt_cache+0x1e/0x20 [ 37.474338] [<b01711bf>] ? 
unmap_region+0xdc/0x12f [ 37.474338] [<b018e35e>] sys_select+0x39/0x1a2 [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb [ 37.474338] ======================= [ 37.474338] udevsettle R running 0 465 1 [ 37.474338] ee524f1c 00000082 ee524ed4 00000000 00010542 b1808640 00000000 ee6cbe14 [ 37.474338] 60e8f831 00000008 b0421180 b0421180 b0421180 ee5ce020 ee5ce27c b180b180 [ 37.474338] 00000000 ee524000 ee6e0880 ee524f1c 33bd800d 00000003 00000008 b041e600 [ 37.474338] Call Trace: [ 37.474338] [<b02f1a50>] do_nanosleep+0x70/0x9a [ 37.474338] [<b013c7ae>] hrtimer_nanosleep+0x4c/0xaf [ 37.474338] [<b013c31c>] ? hrtimer_wakeup+0x0/0x1c [ 37.474338] [<b02f1a3d>] ? do_nanosleep+0x5d/0x9a [ 37.474338] [<b013c868>] sys_nanosleep+0x57/0x5b [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb [ 37.474338] [<b02f0000>] ? relay_hotcpu_callback+0x51/0xb1 [ 37.474338] ======================= [ 37.474338] udevd S 00000000 0 873 461 [ 37.474338] e7c7eb10 00000082 00000000 00000000 00000000 00000000 00000000 00000000 [ 37.474338] 9385214a 00000001 b0421180 b0421180 b0421180 ef8d40a0 ef8d42fc b180b180 [ 37.474338] 00000000 e7c7e000 ee6e0280 00000000 00000000 00000000 00000000 00000000 [ 37.474338] Call Trace: [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4 [ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36 [ 37.474338] [<b018e2c1>] ? __pollwait+0x67/0xcb [ 37.474338] [<b0186ef4>] ? pipe_poll+0x29/0x8f [ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c [ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124 [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124 [ 37.474338] [<b0119234>] ? kmap_atomic_prot+0x47/0xa8 [ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443 [ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb [ 37.474338] [<b013c088>] ? enqueue_hrtimer+0x75/0xf3 [ 37.474338] [<b013c6ed>] ? hrtimer_start+0xc7/0x13c [ 37.474338] [<b01224d3>] ? 
hrtick_set+0x7f/0xf7 [ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e [ 37.474338] [<b016de52>] ? do_wp_page+0x2a0/0x3e5 [ 37.474338] [<b016f0e7>] ? handle_mm_fault+0x442/0x5d4 [ 37.474338] [<b02f12e5>] ? preempt_schedule+0x40/0x55 [ 37.474338] [<b018e35e>] sys_select+0x39/0x1a2 [ 37.474338] [<b017fc6f>] ? filp_close+0x43/0x69 [ 37.474338] [<b01ec504>] ? copy_to_user+0x2a/0x36 [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb [ 37.474338] ======================= [ 37.474338] udevd S 00000000 0 876 461 [ 37.474338] ee780b10 00000086 00000000 00000000 00000000 00000000 00000000 00000000 [ 37.474338] 93a2c2db 00000001 b0421180 b0421180 b0421180 ef8a1120 ef8a137c b180b180 [ 37.474338] 00000000 ee780000 ee564100 00000000 00000000 00000000 00000000 00000000 [ 37.474338] Call Trace: [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4 [ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36 [ 37.474338] [<b018e2c1>] ? __pollwait+0x67/0xcb [ 37.474338] [<b0186ef4>] ? pipe_poll+0x29/0x8f [ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c [ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd [ 37.474338] [<b011a8a0>] ? update_curr+0x12f/0x136 [ 37.474338] [<b012381e>] ? task_tick_fair+0x59/0x86 [ 37.474338] [<b01227b3>] ? scheduler_tick+0x268/0x3cc [ 37.474338] [<b0109fbc>] ? read_tsc+0x8/0xa [ 37.474338] [<b013e472>] ? getnstimeofday+0x34/0xdf [ 37.474338] [<b01414ad>] ? clockevents_program_event+0x93/0x108 [ 37.474338] [<b01424d9>] ? tick_program_event+0x3f/0x61 [ 37.474338] [<b013ca83>] ? hrtimer_interrupt+0x13f/0x164 [ 37.474338] [<b012c055>] ? irq_exit+0x3f/0x79 [ 37.474338] [<b0113103>] ? smp_apic_timer_interrupt+0x5c/0x89 [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124 [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124 [ 37.474338] [<b0119234>] ? kmap_atomic_prot+0x47/0xa8 [ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443 [ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb [ 37.474338] [<b013c088>] ? 
enqueue_hrtimer+0x75/0xf3 [ 37.474338] [<b013c6ed>] ? hrtimer_start+0xc7/0x13c [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7 [ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e [ 37.474338] [<b016de52>] ? do_wp_page+0x2a0/0x3e5 [ 37.474338] [<b016f0e7>] ? handle_mm_fault+0x442/0x5d4 [ 37.474338] [<b02f12e5>] ? preempt_schedule+0x40/0x55 [ 37.474338] [<b018e35e>] sys_select+0x39/0x1a2 [ 37.474338] [<b017fc6f>] ? filp_close+0x43/0x69 [ 37.474338] [<b01ec504>] ? copy_to_user+0x2a/0x36 [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb [ 37.474338] ======================= [ 37.474338] scsi_id D 00000000 0 878 873 [ 37.474338] ee60bb7c 00000086 00000046 00000000 00000000 ef904000 00000000 ee66d910 [ 37.474338] 93796336 00000001 b0421180 b0421180 b0421180 ee5d3120 ee5d337c b180b180 [ 37.474338] 00000000 ee60b000 ee564700 ee5c2800 0000065a 00000000 ee60bb5c b024e1b8 [ 37.474338] Call Trace: [ 37.474338] [<b024e1b8>] ? put_device+0xf/0x11 [ 37.474338] [<f0852878>] ? scsi_request_fn+0x218/0x345 [scsi_mod] [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4 [ 37.474338] [<b01d8b08>] ? elv_insert+0xf3/0x20b [ 37.474338] [<b01302da>] ? mod_timer+0x26/0x3d [ 37.474338] [<b01db15b>] ? blk_plug_device+0x42/0x9a [ 37.474338] [<b02f08e5>] wait_for_common+0x74/0x12e [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd [ 37.474338] [<b02f0a21>] wait_for_completion+0x12/0x14 [ 37.474338] [<b01dd237>] blk_execute_rq+0x5c/0x9c [ 37.474338] [<b01dd277>] ? blk_end_sync_rq+0x0/0x29 [ 37.474338] [<b01a3fe2>] ? bio_add_pc_page+0x24/0x2a [ 37.474338] [<b01d9672>] ? blk_rq_bio_prep+0x9e/0xb2 [ 37.474338] [<b01dceef>] ? blk_rq_append_bio+0x17/0x49 [ 37.474338] [<b01dd022>] ? blk_rq_map_user+0x101/0x1b4 [ 37.474338] [<b01e01f6>] sg_io+0x18c/0x2e1 [ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443 [ 37.474338] [<b01e05df>] scsi_cmd_ioctl+0x294/0x3d5 [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124 [ 37.474338] [<b0188770>] ? 
do_lookup+0x5a/0x162 [ 37.474338] [<f08799b5>] sd_ioctl+0x82/0xc9 [sd_mod] [ 37.474338] [<b01ddf93>] blkdev_driver_ioctl+0x55/0x5e [ 37.474338] [<b01de1bf>] blkdev_ioctl+0x223/0x814 [ 37.474338] [<b01e7574>] ? kobject_get+0x12/0x17 [ 37.474338] [<b0160fc3>] ? find_lock_page+0x72/0x8d [ 37.474338] [<b0163210>] ? filemap_fault+0x240/0x449 [ 37.474338] [<b01395a5>] ? wake_up_bit+0x17/0x1b [ 37.474338] [<b0160e75>] ? unlock_page+0x25/0x28 [ 37.474338] [<b016d9b5>] ? __do_fault+0x17a/0x377 [ 37.474338] [<b01a57b6>] ? blkdev_open+0x28/0x58 [ 37.474338] [<b016eda8>] ? handle_mm_fault+0x103/0x5d4 [ 37.474338] [<b01a4c39>] block_ioctl+0x1b/0x21 [ 37.474338] [<b01a4c1e>] ? block_ioctl+0x0/0x21 [ 37.474338] [<b018cc12>] vfs_ioctl+0x22/0x71 [ 37.474338] [<b018cea9>] do_vfs_ioctl+0x248/0x290 [ 37.474338] [<b0117ba2>] ? do_page_fault+0x13d/0x5fb [ 37.474338] [<b0189bfc>] ? putname+0x25/0x30 [ 37.474338] [<b01800e0>] ? do_sys_open+0xb1/0xc7 [ 37.474338] [<b018cf43>] sys_ioctl+0x52/0x63 [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb [ 37.474338] ======================= [ 37.474338] scsi_id D ee621bd4 0 879 876 [ 37.474338] ee636b7c 00000086 ee621bac ee621bd4 00000000 f084c4f9 00000000 ee66d0c8 [ 37.474338] 9394eaa1 00000001 b0421180 b0421180 b0421180 ef888020 ef88827c b180b180 [ 37.474338] 00000000 ee636000 ee564300 ee6e8c00 00000000 00000000 ee636b5c b024e1b8 [ 37.474338] Call Trace: [ 37.474338] [<f084c4f9>] ? scsi_done+0x0/0x19 [scsi_mod] [ 37.474338] [<b024e1b8>] ? put_device+0xf/0x11 [ 37.474338] [<f0852878>] ? scsi_request_fn+0x218/0x345 [scsi_mod] [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4 [ 37.474338] [<b01d8b08>] ? elv_insert+0xf3/0x20b [ 37.474338] [<b01302da>] ? mod_timer+0x26/0x3d [ 37.474338] [<b01db15b>] ? blk_plug_device+0x42/0x9a [ 37.474338] [<b02f08e5>] wait_for_common+0x74/0x12e [ 37.474338] [<b011ece7>] ? 
default_wake_function+0x0/0xd [ 37.474338] [<b02f0a21>] wait_for_completion+0x12/0x14 [ 37.474338] [<b01dd237>] blk_execute_rq+0x5c/0x9c [ 37.474338] [<b01dd277>] ? blk_end_sync_rq+0x0/0x29 [ 37.474338] [<b01a3fe2>] ? bio_add_pc_page+0x24/0x2a [ 37.474338] [<b01d9672>] ? blk_rq_bio_prep+0x9e/0xb2 [ 37.474338] [<b01dceef>] ? blk_rq_append_bio+0x17/0x49 [ 37.474338] [<b01dd022>] ? blk_rq_map_user+0x101/0x1b4 [ 37.474338] [<b01e01f6>] sg_io+0x18c/0x2e1 [ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443 [ 37.474338] [<b01e05df>] scsi_cmd_ioctl+0x294/0x3d5 [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124 [ 37.474338] [<b0188770>] ? do_lookup+0x5a/0x162 [ 37.474338] [<f08799b5>] sd_ioctl+0x82/0xc9 [sd_mod] [ 37.474338] [<b01ddf93>] blkdev_driver_ioctl+0x55/0x5e [ 37.474338] [<b01de1bf>] blkdev_ioctl+0x223/0x814 [ 37.474338] [<b0168bbe>] ? activate_page+0xb1/0xb9 [ 37.474338] [<b0168cce>] ? mark_page_accessed+0x27/0x2e [ 37.474338] [<b0163210>] ? filemap_fault+0x240/0x449 [ 37.474338] [<b01395a5>] ? wake_up_bit+0x17/0x1b [ 37.474338] [<b0160e75>] ? unlock_page+0x25/0x28 [ 37.474338] [<b016d9b5>] ? __do_fault+0x17a/0x377 [ 37.474338] [<b01a57b6>] ? blkdev_open+0x28/0x58 [ 37.474338] [<b016eda8>] ? handle_mm_fault+0x103/0x5d4 [ 37.474338] [<b01a4c39>] block_ioctl+0x1b/0x21 [ 37.474338] [<b01a4c1e>] ? block_ioctl+0x0/0x21 [ 37.474338] [<b018cc12>] vfs_ioctl+0x22/0x71 [ 37.474338] [<b018cea9>] do_vfs_ioctl+0x248/0x290 [ 37.474338] [<b0117ba2>] ? do_page_fault+0x13d/0x5fb [ 37.474338] [<b0189bfc>] ? putname+0x25/0x30 [ 37.474338] [<b01800e0>] ? 
do_sys_open+0xb1/0xc7 [ 37.474338] [<b018cf43>] sys_ioctl+0x52/0x63 [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb [ 37.474338] ======================= [ 37.474338] Sched Debug Version: v0.07, 2.6.25-rc2-smp #14 [ 37.474338] now at 44900.194289 msecs [ 37.474338] .sysctl_sched_latency : 20.000000 [ 37.474338] .sysctl_sched_min_granularity : 4.000000 [ 37.474338] .sysctl_sched_wakeup_granularity : 10.000000 [ 37.474338] .sysctl_sched_batch_wakeup_granularity : 10.000000 [ 37.474338] .sysctl_sched_child_runs_first : 0.000001 [ 37.474338] .sysctl_sched_features : 39 [ 37.474338] [ 37.474338] cpu#0, 2992.603 MHz [ 37.474338] .nr_running : 4 [ 37.474338] .load : 2048 [ 37.474338] .nr_switches : 3363 [ 37.474338] .nr_load_updates : 34557 [ 37.474338] .nr_uninterruptible : 2 [ 37.474338] .jiffies : 4294705811 [ 37.474338] .next_balance : 4294.705968 [ 37.474338] .curr->pid : 4 [ 37.474338] .clock : 37474.338672 [ 37.474338] .idle_clock : 3065.714769 [ 37.474338] .prev_clock_raw : 78772.978492 [ 37.474338] .clock_warps : 0 [ 37.474338] .clock_overflows : 3996 [ 37.474338] .clock_underflows : 31781 [ 37.474338] .clock_deep_idle_events : 1 [ 37.474338] .clock_max_delta : 0.999848 [ 37.474338] .cpu_load[0] : 2048 [ 37.474338] .cpu_load[1] : 2048 [ 37.474338] .cpu_load[2] : 2048 [ 37.474338] .cpu_load[3] : 2048 [ 37.474338] .cpu_load[4] : 2048 [ 37.474338] [ 37.474338] cfs_rq [ 37.474338] .exec_clock : 34293.783916 [ 37.474338] .MIN_vruntime : 0.000001 [ 37.474338] .min_vruntime : 17146.893105 [ 37.474338] .max_vruntime : 0.000001 [ 37.474338] .spread : 0.000000 [ 37.474338] .spread0 : 0.000000 [ 37.474338] .nr_running : 1 [ 37.474338] .load : 2048 [ 37.474338] .bkl_count : 405 [ 37.474338] .nr_spread_over : 0 [ 37.474338] [ 37.474338] cfs_rq [ 37.474338] .exec_clock : 34293.783916 [ 37.474338] .MIN_vruntime : 13830.833996 [ 37.474338] .min_vruntime : 17146.893105 [ 37.474338] .max_vruntime : 13830.833996 [ 37.474338] .spread : 0.000000 [ 37.474338] .spread0 : 0.000000 [ 
37.474338] .nr_running : 4 [ 37.474338] .load : 8290 [ 37.474338] .bkl_count : 405 [ 37.474338] .nr_spread_over : 6 [ 37.474338] [ 37.474338] runnable tasks: [ 37.474338] task PID tree-key switches prio exec-runtime sum-exec sum-sleep [ 37.474338] ---------------------------------------------------------------------------------------------------------- [ 37.474338] R ksoftirqd/0 4 14329.115724 37 115 14329.115724 33248.417353 4113.418770 [ 37.474338] events/0 5 13830.833996 35 115 13830.833996 0.284125 4265.748410 [ 37.474338] blogd 424 13830.833996 135 120 13830.833996 0.460905 1680.558420 [ 37.474338] udevsettle 465 13830.833996 17 120 13830.833996 0.838894 643.594622 [ 37.474338] [-- Attachment #2: config.gz --] [-- Type: application/x-gzip, Size: 13006 bytes --]
* Re: regression: CD burning (k3b) went broke From: Jens Axboe @ 2008-02-22 7:32 UTC To: Mike Galbraith; +Cc: LKML, Tejun Heo On Thu, Feb 21 2008, Mike Galbraith wrote: > Greetings, > > K3b recently (9a4c854..5d9c4a7 pull) began terminally griping about > buffer underrun upon every attempt to burn a CD. I can't fully bisect > the problem because intervening kernels hang soft during boot. Using > git bisect visualize, and converting to postable text: > > bisect/bad block: add request->raw_data_len (6b00769fe1502b4ad97bb327ef7ac971b208bfb5) > bisect block: update bio according to DMA alignment padding (40b01b9bbdf51ae543a04744283bf2d56c4a6afa) > libata: update ATAPI overflow draining > bisect/good-e164094964e6e20fe7fce418e06a9dce952bb7a4 Tejun? > > Serial console log of hung kernel 40b01b9bbdf51ae543a04744283bf2d56c4a6afa below > > [ 0.000000] Linux version 2.6.25-rc2-smp (root@homer) (gcc version 4.2.1 (SUSE Linux)) #14 SMP PREEMPT Thu Feb 21 08:49:51 CET 2008 > [ 0.000000] BIOS-provided physical RAM map: > [ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) > [ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) > [ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) > [ 0.000000] BIOS-e820: 0000000000100000 - 000000003fff0000 (usable) > [ 0.000000] BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS) > [ 0.000000] BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data) > [ 0.000000] BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved) > [ 0.000000] 0MB HIGHMEM available. > [ 0.000000] 1023MB LOWMEM available. > [ 0.000000] Scan SMP from b0000000 for 1024 bytes. > [ 0.000000] Scan SMP from b009fc00 for 1024 bytes. > [ 0.000000] Scan SMP from b00f0000 for 65536 bytes.
> [ 0.000000] found SMP MP-table at [b00f5320] 000f5320 > [ 0.000000] Zone PFN ranges: > [ 0.000000] DMA 0 -> 4096 > [ 0.000000] Normal 4096 -> 262128 > [ 0.000000] HighMem 262128 -> 262128 > [ 0.000000] Movable zone start PFN for each node > [ 0.000000] early_node_map[1] active PFN ranges > [ 0.000000] 0: 0 -> 262128 > [ 0.000000] DMI 2.3 present. > [ 0.000000] ACPI: RSDP 000F6CC0, 0014 (r0 IntelR) > [ 0.000000] ACPI: RSDT 3FFF3000, 002C (r1 IntelR AWRDACPI 42302E31 AWRD 0) > [ 0.000000] ACPI: FACP 3FFF3040, 0074 (r1 IntelR AWRDACPI 42302E31 AWRD 0) > [ 0.000000] ACPI: DSDT 3FFF30C0, 4139 (r1 INTELR AWRDACPI 1000 MSFT 100000E) > [ 0.000000] ACPI: FACS 3FFF0000, 0040 > [ 0.000000] ACPI: APIC 3FFF7200, 0068 (r1 IntelR AWRDACPI 42302E31 AWRD 0) > [ 0.000000] ACPI: PM-Timer IO Port: 0x408 > [ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) > [ 0.000000] Processor #0 15:2 APIC version 20 > [ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled) > [ 0.000000] Processor #1 15:2 APIC version 20 > [ 0.000000] WARNING: maxcpus limit of 1 reached. Processor ignored. > [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1]) > [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1]) > [ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0]) > [ 0.000000] IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23 > [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) > [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) > [ 0.000000] Enabling APIC mode: Flat. 
Using 1 I/O APICs > [ 0.000000] Using ACPI (MADT) for SMP configuration information > [ 0.000000] Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000) > [ 0.000000] PM: Registered nosave memory: 000000000009f000 - 00000000000a0000 > [ 0.000000] PM: Registered nosave memory: 00000000000a0000 - 00000000000f0000 > [ 0.000000] PM: Registered nosave memory: 00000000000f0000 - 0000000000100000 > [ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 260081 > [ 0.000000] Kernel command line: root=/dev/sdb3 rootflags=data=writeback vga=0x314 resume=/dev/sdb2 console=ttyS0,115200n8 console=tty splash=silent PROFILE=default 1 maxcpus=1 > [ 0.000000] Enabling fast FPU save and restore... done. > [ 0.000000] Enabling unmasked SIMD FPU exception support... done. > [ 0.000000] Initializing CPU#0 > [ 0.000000] Preemptible RCU implementation. > [ 0.000000] CPU 0 irqstacks, hard=b0427000 soft=b0425000 > [ 0.000000] PID hash table entries: 4096 (order: 12, 16384 bytes) > [ 0.000000] Detected 2992.603 MHz processor. > [ 0.000999] Console: colour dummy device 80x25 > [ 0.000999] console [tty0] enabled > [ 0.000999] console [ttyS0] enabled > [ 0.000999] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) > [ 0.000999] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) > [ 0.000999] Memory: 1028968k/1048512k available (1998k kernel code, 18904k reserved, 955k data, 236k init, 0k highmem) > [ 0.000999] virtual kernel memory layout: > [ 0.000999] fixmap : 0xfff9b000 - 0xfffff000 ( 400 kB) > [ 0.000999] pkmap : 0xff800000 - 0xffc00000 (4096 kB) > [ 0.000999] vmalloc : 0xf0800000 - 0xff7fe000 ( 239 MB) > [ 0.000999] lowmem : 0xb0000000 - 0xefff0000 (1023 MB) > [ 0.000999] .init : 0xb03e7000 - 0xb0422000 ( 236 kB) > [ 0.000999] .data : 0xb02f3b26 - 0xb03e29a8 ( 955 kB) > [ 0.000999] .text : 0xb0100000 - 0xb02f3b26 (1998 kB) > [ 0.000999] Checking if this processor honours the WP bit even in supervisor mode...Ok. 
> [ 0.060993] Calibrating delay using timer specific routine.. 5987.55 BogoMIPS (lpj=2993775) > [ 0.063022] Security Framework initialized > [ 0.064010] Mount-cache hash table entries: 512 > [ 0.065129] CPU: Trace cache: 12K uops, L1 D cache: 8K > [ 0.066992] CPU: L2 cache: 512K > [ 0.067992] CPU: Physical Processor ID: 0 > [ 0.068993] Intel machine check architecture supported. > [ 0.069994] Intel machine check reporting enabled on CPU#0. > [ 0.070991] CPU0: Intel P4/Xeon Extended MCE MSRs (12) available > [ 0.071992] CPU0: Thermal monitoring enabled > [ 0.072993] Compat vDSO mapped to ffffe000. > [ 0.073996] Checking 'hlt' instruction... OK. > [ 0.079743] SMP alternatives: switching to UP code > [ 0.079998] Freeing SMP alternatives: 9k freed > [ 0.080991] ACPI: Core revision 20070126 > [ 0.091025] CPU0: Intel(R) Pentium(R) 4 CPU 3.00GHz stepping 09 > [ 0.094019] Total of 1 processors activated (5987.55 BogoMIPS). > [ 0.095110] ENABLING IO-APIC IRQs > [ 0.096152] ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1 > [ 0.107983] Brought up 1 CPUs > [ 0.108208] net_namespace: 552 bytes > [ 0.110058] NET: Registered protocol family 16 > [ 0.111195] ACPI: bus type pci registered > [ 0.114122] PCI: PCI BIOS revision 2.10 entry at 0xfb980, last bus=2 > [ 0.114984] PCI: Using configuration type 1 > [ 0.115983] Setting up standard PCI resources > [ 0.139635] ACPI: Interpreter enabled > [ 0.139985] ACPI: (supports S0 S3 S4 S5) > [ 0.142261] ACPI: Using IOAPIC for interrupt routing > [ 0.148393] ACPI: PCI Root Bridge [PCI0] (0000:00) > [ 0.149406] pci 0000:00:1f.0: quirk: region 0400-047f claimed by ICH4 ACPI/GPIO/TCO > [ 0.149981] pci 0000:00:1f.0: quirk: region 0480-04bf claimed by ICH4 GPIO > [ 0.151395] PCI: Transparent bridge - 0000:00:1e.0 > [ 0.160727] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 7 9 10 11 12 14 15) > [ 0.164091] ACPI: PCI Interrupt Link [LNKB] (IRQs *3 4 5 7 9 10 11 12 14 15) > [ 0.168383] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 7 9 10 11 12 
14 15) > [ 0.171666] ACPI: PCI Interrupt Link [LNKD] (IRQs *3 4 5 7 9 10 11 12 14 15) > [ 0.175277] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 9 10 *11 12 14 15) > [ 0.179656] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 *9 10 11 12 14 15) > [ 0.183050] ACPI: PCI Interrupt Link [LNK0] (IRQs 3 4 5 7 9 10 11 12 14 15) *0, disabled. > [ 0.188275] ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 5 7 9 10 *11 12 14 15) > [ 0.192014] Linux Plug and Play Support v0.97 (c) Adam Belay > [ 0.193005] pnp: PnP ACPI init > [ 0.193979] ACPI: bus type pnp registered > [ 0.199104] pnpacpi: exceeded the max number of mem resources: 12 > [ 0.200035] pnp: PnP ACPI: found 13 devices > [ 0.200971] ACPI: ACPI bus type pnp unregistered > [ 0.202306] PCI: Using ACPI for IRQ routing > [ 0.202977] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report > [ 0.226967] NetLabel: Initializing > [ 0.226969] NetLabel: domain hash size = 128 > [ 0.227968] NetLabel: protocols = UNLABELED CIPSOv4 > [ 0.228981] NetLabel: unlabeled traffic allowed by default > [ 0.230055] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 11 > [ 0.232158] hpet0: 3 64-bit timers, 14318180 Hz > [ 0.235002] ACPI: RTC can wake from S4 > [ 0.235966] Time: tsc clocksource has been installed. 
> [ 0.243966] system 00:01: ioport range 0xb78-0xb7b has been reserved > [ 0.243969] system 00:01: ioport range 0xf78-0xf7b has been reserved > [ 0.244971] system 00:01: ioport range 0xa78-0xa7b has been reserved > [ 0.245968] system 00:01: ioport range 0xe78-0xe7b has been reserved > [ 0.246967] system 00:01: ioport range 0xbbc-0xbbf has been reserved > [ 0.247967] system 00:01: ioport range 0xfbc-0xfbf has been reserved > [ 0.248968] system 00:01: ioport range 0x4d0-0x4d1 has been reserved > [ 0.249967] system 00:01: ioport range 0x200-0x200 has been reserved > [ 0.250967] system 00:01: ioport range 0x202-0x208 has been reserved > [ 0.251967] system 00:01: ioport range 0x320-0x32f has been reserved > [ 0.252966] system 00:01: ioport range 0x295-0x296 has been reserved > [ 0.254970] system 00:0b: ioport range 0x400-0x4bf could not be reserved > [ 0.255973] system 00:0c: iomem range 0xf0000-0xf3fff could not be reserved > [ 0.256966] system 00:0c: iomem range 0xf4000-0xf7fff could not be reserved > [ 0.257966] system 00:0c: iomem range 0xf8000-0xfbfff could not be reserved > [ 0.258966] system 00:0c: iomem range 0xfc000-0xfffff could not be reserved > [ 0.259966] system 00:0c: iomem range 0x3fff0000-0x3fffffff could not be reserved > [ 0.260965] system 00:0c: iomem range 0x0-0x9ffff could not be reserved > [ 0.261965] system 00:0c: iomem range 0x100000-0x3ffeffff could not be reserved > [ 0.262965] system 00:0c: iomem range 0xfec00000-0xfec00fff could not be reserved > [ 0.263965] system 00:0c: iomem range 0xfec01000-0xfed8ffff could not be reserved > [ 0.264965] system 00:0c: iomem range 0xfee00000-0xfee00fff could not be reserved > [ 0.265965] system 00:0c: iomem range 0xffb00000-0xffbfffff could not be reserved > [ 0.266964] system 00:0c: iomem range 0xfff00000-0xffffffff could not be reserved > [ 0.298911] PCI: Bridge: 0000:00:01.0 > [ 0.298960] IO window: a000-afff > [ 0.299963] MEM window: 0xf8000000-0xf9ffffff > [ 0.300961] PREFETCH window: 
0x00000000e8000000-0x00000000f7ffffff > [ 0.301963] PCI: Bridge: 0000:00:1e.0 > [ 0.302959] IO window: b000-bfff > [ 0.303961] MEM window: 0xfa000000-0xfa0fffff > [ 0.304960] PREFETCH window: disabled. > [ 0.305994] NET: Registered protocol family 2 > [ 0.325957] IP route cache hash table entries: 32768 (order: 5, 131072 bytes) > [ 0.326233] TCP established hash table entries: 131072 (order: 8, 1048576 bytes) > [ 0.328385] TCP bind hash table entries: 65536 (order: 7, 524288 bytes) > [ 0.329394] TCP: Hash tables configured (established 131072 bind 65536) > [ 0.329963] TCP reno registered > [ 0.337961] Unpacking initramfs... done > [ 0.574326] Freeing initrd memory: 6128k freed > [ 0.576000] Machine check exception polling timer started. > [ 0.577268] audit: initializing netlink socket (disabled) > [ 0.577937] type=2000 audit(1203584121.788:1): initialized > [ 0.579060] Total HugeTLB memory allocated, 0 > [ 0.579988] VFS: Disk quotas dquot_6.5.1 > [ 0.580947] Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) > [ 0.582937] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 253) > [ 0.583930] io scheduler noop registered > [ 0.584925] io scheduler anticipatory registered > [ 0.585924] io scheduler deadline registered > [ 0.586932] io scheduler cfq registered (default) > [ 0.589129] vesafb: framebuffer at 0xe8000000, mapped to 0xf0880000, using 1875k, total 16384k > [ 0.589925] vesafb: mode is 800x600x16, linelength=1600, pages=16 > [ 0.590924] vesafb: protected mode interface info at c000:b544 > [ 0.591926] vesafb: pmi: set display start = b00cb5d2, set palette = b00cb612 > [ 0.592923] vesafb: scrolling: redraw > [ 0.593924] vesafb: Truecolor: size=0:5:6:5, shift=0:11:5:0 > [ 0.612485] Console: switching to colour frame buffer device 100x37 > [ 0.627919] fb0: VESA VGA frame buffer device > [ 0.638247] Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled > [ 0.639029] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A > [ 
0.641591] 00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A > [ 0.642200] PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1 > [ 0.642917] PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp > [ 0.644280] serio: i8042 KBD port at 0x60,0x64 irq 1 > [ 0.645234] mice: PS/2 mouse device common for all mice > [ 0.675685] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0 > [ 0.687684] rtc_cmos 00:03: rtc core: registered rtc_cmos as rtc0 > [ 0.687709] rtc0: alarms up to one month > [ 0.688747] cpuidle: using governor ladder > [ 0.689689] cpuidle: using governor menu > [ 0.690751] oprofile: using NMI interrupt. > [ 0.693003] NET: Registered protocol family 1 > [ 0.693759] p4-clockmod: P4/Xeon(TM) CPU On-Demand Clock Modulation available > [ 0.694686] Using IPI No-Shortcut mode > [ 0.695814] registered taskstats version 1 > [ 0.696813] rtc_cmos 00:03: setting system clock to 2008-02-21 08:55:23 UTC (1203584123) > [ 0.698738] Freeing unused kernel memory: 236k freed > [ 0.699701] Write protecting the kernel text: 2000k > [ 0.700690] Write protecting the kernel read-only data: 792k > [ 0.762206] ACPI: ACPI0007:00 is registered as cooling_device0 > [ 0.768027] ACPI: LNXTHERM:01 is registered as thermal_zone0 > [ 0.768870] ACPI: Thermal Zone [THRM] (40 C) > [ 0.785027] SCSI subsystem initialized > [ 0.806767] ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 18 > [ 0.808835] scsi0 : ata_piix > [ 0.809721] scsi1 : ata_piix > [ 0.812130] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0xf000 irq 14 > [ 0.812702] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0xf008 irq 15 > [ 1.290553] ata1.00: ATA-6: ST3160021A, 3.04, max UDMA/100 > [ 1.290558] ata1.00: 312581808 sectors, multi 16: LBA48 > [ 1.291580] ata1.01: ATAPI: BENQ DVD DD DW1625, BBIA, max UDMA/33 > [ 1.314801] ata1.00: configured for UDMA/100 > [ 1.473512] ata1.01: configured for UDMA/33 > [ 3.788245] ata2.00: ATA-6: 
ST3120022A, 3.06, max UDMA/100 > [ 3.788248] ata2.00: 234441648 sectors, multi 16: LBA48 > [ 3.811575] ata2.00: configured for UDMA/100 > [ 3.823087] scsi 0:0:0:0: Direct-Access ATA ST3160021A 3.04 PQ: 0 ANSI: 5 > [ 3.825453] scsi 0:0:1:0: CD-ROM BENQ DVD DD DW1625 BBIA PQ: 0 ANSI: 5 > [ 3.825587] scsi 1:0:0:0: Direct-Access ATA ST3120022A 3.06 PQ: 0 ANSI: 5 > [ 3.831138] ACPI: PNP0C0B:00 is registered as cooling_device1 > [ 3.831481] ACPI: Fan [FAN] (on) > [ 3.856766] BIOS EDD facility v0.16 2004-Jun-25, 6 devices found > [ 3.999333] Driver 'sd' needs updating - please use bus_type methods > [ 3.999551] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB) > [ 4.000464] sd 0:0:0:0: [sda] Write Protect is off > [ 4.001459] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA > [ 4.002528] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB) > [ 4.003475] sd 0:0:0:0: [sda] Write Protect is off > [ 4.004458] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA > [ 4.005586] sda: sda1 sda2 < sda5 sda6 > > [ 4.048569] sd 0:0:0:0: [sda] Attached SCSI disk > [ 4.049519] sd 1:0:0:0: [sdb] 234441648 512-byte hardware sectors (120034 MB) > [ 4.050439] sd 1:0:0:0: [sdb] Write Protect is off > [ 4.051473] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA > [ 4.052505] sd 1:0:0:0: [sdb] 234441648 512-byte hardware sectors (120034 MB) > [ 4.053436] sd 1:0:0:0: [sdb] Write Protect is off > [ 4.054450] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA > [ 4.055498] sdb: sdb1 sdb2 sdb3 > [ 4.063528] sd 1:0:0:0: [sdb] Attached SCSI disk > [ 37.473476] SysRq : Show State > [ 37.474338] task PC stack pid father > [ 37.474338] init S ef836eac 0 1 0 > [ 37.474338] ef836f00 00000082 ffffffff ef836eac b01224d3 00000000 00000000 00000001 > [ 37.474338] e80cb97b 00000000 b0421180 b0421180 b0421180 ef835020 ef83527c 
b180b180 > [ 37.474338] 00000000 ef836000 ee6e0c80 b1031ae0 00000000 00000000 ee5b9148 ef836f14 > [ 37.474338] Call Trace: > [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7 > [ 37.474338] [<b016de52>] ? do_wp_page+0x2a0/0x3e5 > [ 37.474338] [<b01d218e>] ? security_task_wait+0xf/0x11 > [ 37.474338] [<b0129e1f>] do_wait+0x470/0xa1a > [ 37.474338] [<b01249e9>] ? wake_up_new_task+0x77/0x91 > [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd > [ 37.474338] [<b012a431>] sys_wait4+0x68/0x9f > [ 37.474338] [<b012a48f>] sys_waitpid+0x27/0x29 > [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb > [ 37.474338] [<b02f0000>] ? relay_hotcpu_callback+0x51/0xb1 > [ 37.474338] ======================= > [ 37.474338] kthreadd S 00000000 0 2 0 > [ 37.474338] ef83bfc8 00000046 ef8b50a0 00000000 00000092 ef83bf80 00000000 00000000 > [ 37.474338] ee5f776c 00000000 b0421180 b0421180 b0421180 ef83a0a0 ef83a2fc b180b180 > [ 37.474338] 00000000 ef83b000 ee564d00 00000000 00000ae7 00000000 ee5f6e1c ee5f6e20 > [ 37.474338] Call Trace: > [ 37.474338] [<b011d295>] ? complete+0x43/0x4b > [ 37.474338] [<b0139450>] kthreadd+0x13a/0x13f > [ 37.474338] [<b0139316>] ? kthreadd+0x0/0x13f > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] migration/0 S 0102c262 0 3 2 > [ 37.474338] ef83df98 00000046 ef83c120 0102c262 00000246 ef83df4c 00000000 ef83df4c > [ 37.474338] 066fb2fc 00000000 b0421180 b0421180 b0421180 ef83c120 ef83c37c b180b180 > [ 37.474338] 00000000 ef83d000 b03ba200 00000000 00000000 00000000 00000000 b0421180 > [ 37.474338] Call Trace: > [ 37.474338] [<b0122f2a>] ? migration_thread+0x0/0x210 > [ 37.474338] [<b012304e>] migration_thread+0x124/0x210 > [ 37.474338] [<b0122f2a>] ? migration_thread+0x0/0x210 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? 
kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] ksoftirqd/0 R running 0 4 2 > [ 37.474338] f4744940 00000008 ef841f14 b01424d9 f4650f99 00000008 00000000 01808638 > [ 37.474338] b18086c0 b1808638 b1808600 ef841f50 b013ca83 b1808600 b1808604 f4744940 > [ 37.474338] 00000008 f465092e 00000008 f465092e 00000046 b1807120 00000000 00000246 > [ 37.474338] Call Trace: > [ 37.474338] [<b01424d9>] ? tick_program_event+0x3f/0x61 > [ 37.474338] [<b013ca83>] ? hrtimer_interrupt+0x13f/0x164 > [ 37.474338] [<b012c055>] ? irq_exit+0x3f/0x79 > [ 37.474338] [<b0113103>] ? smp_apic_timer_interrupt+0x5c/0x89 > [ 37.474338] [<b012c055>] ? irq_exit+0x3f/0x79 > [ 37.474338] [<b01057bc>] ? apic_timer_interrupt+0x28/0x30 > [ 37.474338] [<b01079d9>] ? do_softirq+0x35/0x8c > [ 37.474338] [<b012c515>] ? ksoftirqd+0xad/0x17f > [ 37.474338] [<b012c468>] ? ksoftirqd+0x0/0x17f > [ 37.474338] [<b01392f4>] ? kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 > [ 37.474338] [<b010593f>] ? kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] events/0 R running 0 5 2 > [ 37.474338] ef843fa0 00000046 ef8112c0 ef843f44 b0136d0e 00000a75 00000000 00000a75 > [ 37.474338] 60e9afad 00000008 b0421180 b0421180 b0421180 ef8420a0 ef8422fc b180b180 > [ 37.474338] 00000000 ef843000 ee6e0880 b18089e0 3399ec88 00000002 b0421180 b0421180 > [ 37.474338] Call Trace: > [ 37.474338] [<b0136d0e>] ? queue_delayed_work+0x40/0x48 > [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d > [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 > [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 > [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? 
kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] khelper S b02f12e5 0 6 2 > [ 37.474338] ef845fa0 00000046 ef845f3c b02f12e5 ef836d40 ef836d44 00000000 ef845f44 > [ 37.474338] 2a9ba799 00000000 b0421180 b0421180 b0421180 ef844120 ef84437c b180b180 > [ 37.474338] 00000000 ef845000 ee6e0880 ef822480 00000000 00000000 b0421180 b0421180 > [ 37.474338] Call Trace: > [ 37.474338] [<b02f12e5>] ? preempt_schedule+0x40/0x55 > [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d > [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 > [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 > [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] kblockd/0 S ef887f4c 0 35 2 > [ 37.474338] ef887fa0 00000046 00000246 ef887f4c b01224ed ef887f4c 00000000 00000000 > [ 37.474338] 0800f107 00000000 b0421180 b0421180 b0421180 ef886120 ef88637c b180b180 > [ 37.474338] 00000000 ef887000 b03ba200 08006f94 00000c39 00000000 b0421180 b0421180 > [ 37.474338] Call Trace: > [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7 > [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d > [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 > [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 > [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? 
kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] kacpid S ef88cf4c 0 37 2 > [ 37.474338] ef88cfa0 00000046 00000246 ef88cf4c b01224ed ef88cf4c 00000000 00000000 > [ 37.474338] 0801a506 00000000 b0421180 b0421180 b0421180 ef88b0a0 ef88b2fc b180b180 > [ 37.474338] 00000000 ef88c000 b03ba200 08018266 00000000 00000000 b0421180 b0421180 > [ 37.474338] Call Trace: > [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7 > [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d > [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 > [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 > [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] kacpi_notify S ef88ff4c 0 38 2 > [ 37.474338] ef88ffa0 00000046 00000246 ef88ff4c b01224ed ef88ff4c 00000000 00000000 > [ 37.474338] 0886f6cd 00000000 b0421180 b0421180 b0421180 ef88e120 ef88e37c b180b180 > [ 37.474338] 00000000 ef88f000 b03ba200 0801c44e 00000000 00000000 b0421180 b0421180 > [ 37.474338] Call Trace: > [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7 > [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d > [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 > [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 > [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? 
kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] cqueue/0 S ef8e0f4c 0 111 2 > [ 37.474338] ef8e0fa0 00000046 00000246 ef8e0f4c b01224ed ef8e0f4c 00000000 00000000 > [ 37.474338] 0c0b0eaf 00000000 b0421180 b0421180 b0421180 ef8bd0a0 ef8bd2fc b180b180 > [ 37.474338] 00000000 ef8e0000 b03ba200 0c0abebc 00000cf2 00000000 b0421180 b0421180 > [ 37.474338] Call Trace: > [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7 > [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d > [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 > [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 > [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] kseriod S b011d321 0 115 2 > [ 37.474338] ef8e4f8c 00000046 ef8e4f34 b011d321 00000000 00000000 00000000 ef88d440 > [ 37.474338] 28fd3c39 00000000 b0421180 b0421180 b0421180 ef874120 ef87437c b180b180 > [ 37.474338] 00000000 ef8e4000 b03ba200 00000000 00000000 00000000 b03cc440 ef88d43c > [ 37.474338] Call Trace: > [ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42 > [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d > [ 37.474338] [<b0255c33>] serio_thread+0xc2/0x32f > [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 > [ 37.474338] [<b0255b71>] ? serio_thread+0x0/0x32f > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? 
kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] pdflush S 00000000 0 148 2 > [ 37.474338] ee52cfa4 00000046 b180b180 00000000 ee52c000 ee52cf68 00000000 000001fa > [ 37.474338] 22833768 00000000 b0421180 b0421180 b0421180 ef864120 ef86437c b180b180 > [ 37.474338] 00000000 ee52c000 b03ba200 b012229c 00000000 00000000 ef864120 b180b180 > [ 37.474338] Call Trace: > [ 37.474338] [<b012229c>] ? set_cpus_allowed+0x50/0xb8 > [ 37.474338] [<b02f2c57>] ? _spin_unlock_irqrestore+0x1f/0x21 > [ 37.474338] [<b011eb19>] ? set_user_nice+0xcf/0xdf > [ 37.474338] [<b0167dd1>] ? pdflush+0x0/0x1b4 > [ 37.474338] [<b0167e82>] pdflush+0xb1/0x1b4 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] pdflush S 00000286 0 149 2 > [ 37.474338] ee52dfa4 00000046 b03c16e0 00000286 ee52df54 b0130184 00000000 00000000 > [ 37.474338] 60e9f95c 00000008 b0421180 b0421180 b0421180 ef8620a0 ef8622fc b180b180 > [ 37.474338] 00000000 ee52d000 ee6e0680 00000000 00000000 00000000 00000000 00000000 > [ 37.474338] Call Trace: > [ 37.474338] [<b0130184>] ? __mod_timer+0xa0/0xaf > [ 37.474338] [<b0167dd1>] ? pdflush+0x0/0x1b4 > [ 37.474338] [<b0167e82>] pdflush+0xb1/0x1b4 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] kswapd0 S 00000000 0 150 2 > [ 37.474338] ee52ef2c 00000046 00000000 00000000 00000000 00000000 00000000 000004fa > [ 37.474338] 22918b1f 00000000 b0421180 b0421180 b0421180 ef85e020 ef85e27c b180b180 > [ 37.474338] 00000000 ee52e000 b03ba200 b012229c 00000000 00000000 b1807b00 b180b180 > [ 37.474338] Call Trace: > [ 37.474338] [<b012229c>] ? set_cpus_allowed+0x50/0xb8 > [ 37.474338] [<b011ab2c>] ? 
__dequeue_entity+0x31/0x35 > [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d > [ 37.474338] [<b016aefe>] ? kswapd+0x0/0x4a2 > [ 37.474338] [<b016b38e>] kswapd+0x490/0x4a2 > [ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e > [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 > [ 37.474338] [<b011d295>] ? complete+0x43/0x4b > [ 37.474338] [<b016aefe>] ? kswapd+0x0/0x4a2 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] aio/0 S ee530f4c 0 151 2 > [ 37.474338] ee530fa0 00000046 00000246 ee530f4c b01224ed ee530f4c 00000000 00000000 > [ 37.474338] 22bec65c 00000000 b0421180 b0421180 b0421180 ef85c120 ef85c37c b180b180 > [ 37.474338] 00000000 ee530000 b03ba200 2291af16 00000000 00000000 b0421180 b0421180 > [ 37.474338] Call Trace: > [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7 > [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d > [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 > [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 > [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] kpsmoused S ee653f4c 0 378 2 > [ 37.474338] ee653fa0 00000046 00000246 ee653f4c b01224ed ee653f4c 00000000 00000000 > [ 37.474338] 26811a10 00000000 b0421180 b0421180 b0421180 ee6ba020 ee6ba27c b180b180 > [ 37.474338] 00000000 ee653000 b03ba200 2680eca0 00000000 00000000 b0421180 b0421180 > [ 37.474338] Call Trace: > [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7 > [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d > [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 > [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 > [ 37.474338] [<b0136d3c>] ? 
worker_thread+0x0/0xd7 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] kondemand/0 S ee5b0f4c 0 384 2 > [ 37.474338] ee5b0fa0 00000046 00000246 ee5b0f4c b01224ed ee5b0f4c 00000000 00000000 > [ 37.474338] 292b36fc 00000000 b0421180 b0421180 b0421180 ee5cc120 ee5cc37c b180b180 > [ 37.474338] 00000000 ee5b0000 b03ba200 290d4c43 00000b94 00000000 b0421180 b0421180 > [ 37.474338] Call Trace: > [ 37.474338] [<b01224ed>] ? hrtick_set+0x99/0xf7 > [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d > [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 > [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 > [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] blogd S 00000000 0 423 1 > [ 37.474338] ee6d0b10 00000082 000004b3 00000000 b1807120 15b2b2c0 00000000 ef8420a0 > [ 37.474338] 60e9cf9d 00000008 b0421180 b0421180 b0421180 ee5e60a0 ee5e62fc b180b180 > [ 37.474338] 00000000 ee6d0000 ee6e0680 b0130059 339a08bf 00000002 ee6d0b20 00000286 > [ 37.474338] Call Trace: > [ 37.474338] [<b0130059>] ? lock_timer_base+0x1f/0x40 > [ 37.474338] [<b0130184>] ? __mod_timer+0xa0/0xaf > [ 37.474338] [<b02f14ba>] schedule_timeout+0x44/0xa4 > [ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36 > [ 37.474338] [<b012fcda>] ? process_timeout+0x0/0xa > [ 37.474338] [<b02f14b5>] ? schedule_timeout+0x3f/0xa4 > [ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c > [ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb > [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd > [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd > [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd > [ 37.474338] [<b01224d3>] ? 
hrtick_set+0x7f/0xf7 > [ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e > [ 37.474338] [<b020e60c>] ? cfb_fillrect+0x138/0x2bd > [ 37.474338] [<b013c088>] ? enqueue_hrtimer+0x75/0xf3 > [ 37.474338] [<b0109fbc>] ? read_tsc+0x8/0xa > [ 37.474338] [<b013e472>] ? getnstimeofday+0x34/0xdf > [ 37.474338] [<b013c90f>] ? ktime_get_ts+0x44/0x49 > [ 37.474338] [<b013c925>] ? ktime_get+0x11/0x30 > [ 37.474338] [<b011b82b>] ? hrtick_start_fair+0x10d/0x144 > [ 37.474338] [<b011b900>] ? enqueue_task_fair+0x52/0x56 > [ 37.474338] [<b011a3f1>] ? enqueue_task+0x4c/0x58 > [ 37.474338] [<b011eb76>] ? try_to_wake_up+0x4d/0x1be > [ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb > [ 37.474338] [<b013c6ed>] ? hrtimer_start+0xc7/0x13c > [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7 > [ 37.474338] [<b01042b4>] ? do_notify_resume+0x55/0x79e > [ 37.474338] [<b0127ad4>] ? release_console_sem+0x1c4/0x1d4 > [ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42 > [ 37.474338] [<b0232d6a>] ? tty_ldisc_deref+0x55/0x6e > [ 37.474338] [<b018e3fc>] sys_select+0xd7/0x1a2 > [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb > [ 37.474338] ======================= > [ 37.474338] blogd R running 0 424 1 > [ 37.474338] ee6cbdf8 00000082 b18086c0 ee6cbda0 b01e99b7 b1808640 00000000 b180b5f4 > [ 37.474338] 60e8dd8e 00000008 b0421180 b0421180 b0421180 ee5d70a0 ee5d72fc b180b180 > [ 37.474338] 00000000 ee6cb000 ee6e0680 ee6cbdf8 375f44e1 00000003 ee6cbdec b041e600 > [ 37.474338] Call Trace: > [ 37.474338] [<b01e99b7>] ? rb_insert_color+0x77/0xd8 > [ 37.474338] [<b0143bfe>] futex_wait+0x285/0x2d3 > [ 37.474338] [<b013c31c>] ? hrtimer_wakeup+0x0/0x1c > [ 37.474338] [<b0143b3d>] ? futex_wait+0x1c4/0x2d3 > [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd > [ 37.474338] [<b0144c2d>] do_futex+0x20d/0xa5f > [ 37.474338] [<b0168667>] ? pagevec_lookup_tag+0x25/0x2e > [ 37.474338] [<b016158a>] ? wait_on_page_writeback_range+0x5b/0xf0 > [ 37.474338] [<b0109fbc>] ? read_tsc+0x8/0xa > [ 37.474338] [<b013e472>] ? 
getnstimeofday+0x34/0xdf > [ 37.474338] [<b013c90f>] ? ktime_get_ts+0x44/0x49 > [ 37.474338] [<b013c925>] ? ktime_get+0x11/0x30 > [ 37.474338] [<b0145502>] sys_futex+0x83/0xe8 > [ 37.474338] [<b01ec504>] ? copy_to_user+0x2a/0x36 > [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb > [ 37.474338] ======================= > [ 37.474338] ata/0 S ee7b3f8c 0 438 2 > [ 37.474338] ee7b3fa0 00000046 00000002 ee7b3f8c ee7b3f80 ee7b3f44 00000000 002dae6f > [ 37.474338] e3f3f21f 00000000 b0421180 b0421180 b0421180 ee6be120 ee6be37c b180b180 > [ 37.474338] 00000000 ee7b3000 ee6e0a80 ef906480 00000000 00000000 b0421180 b0421180 > [ 37.474338] Call Trace: > [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d > [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 > [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 > [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] ata_aux S ee7b2f4c 0 439 2 > [ 37.474338] ee7b2fa0 00000046 ffffffff ee7b2f4c b01224d3 00000000 00000000 00000000 > [ 37.474338] 2fc09ed2 00000000 b0421180 b0421180 b0421180 ee5dd020 ee5dd27c b180b180 > [ 37.474338] 00000000 ee7b2000 ee6e0280 2f81ef36 00000000 00000000 b0421180 b0421180 > [ 37.474338] Call Trace: > [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7 > [ 37.474338] [<b0139711>] ? prepare_to_wait+0x48/0x4d > [ 37.474338] [<b0136df5>] worker_thread+0xb9/0xd7 > [ 37.474338] [<b01395a9>] ? autoremove_wake_function+0x0/0x38 > [ 37.474338] [<b0136d3c>] ? worker_thread+0x0/0xd7 > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? 
kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] scsi_eh_0 S ef906520 0 445 2 > [ 37.474338] ee585f64 00000046 ef906524 ef906520 00000000 00000092 00000000 b011d321 > [ 37.474338] e4331293 00000000 b0421180 b0421180 b0421180 ee5e4020 ee5e427c b180b180 > [ 37.474338] 00000000 ee585000 ee6e0480 ee702008 00108a29 00000000 f084c295 ee702000 > [ 37.474338] Call Trace: > [ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42 > [ 37.474338] [<f084c295>] ? __scsi_iterate_devices+0x5d/0x6b [scsi_mod] > [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod] > [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod] > [ 37.474338] [<f085033a>] scsi_error_handler+0x37/0x4eb [scsi_mod] > [ 37.474338] [<b011d295>] ? complete+0x43/0x4b > [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod] > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] scsi_eh_1 S ee622520 0 446 2 > [ 37.474338] ee5acf64 00000046 ee622524 ee622520 00000000 00000092 00000000 b011d321 > [ 37.474338] e4333e8f 00000000 b0421180 b0421180 b0421180 ee5d10a0 ee5d12fc b180b180 > [ 37.474338] 00000000 ee5ac000 ee6e0480 ee702808 00000000 00000000 f084c295 ee702800 > [ 37.474338] Call Trace: > [ 37.474338] [<b011d321>] ? __wake_up+0x3a/0x42 > [ 37.474338] [<f084c295>] ? __scsi_iterate_devices+0x5d/0x6b [scsi_mod] > [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod] > [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod] > [ 37.474338] [<f085033a>] scsi_error_handler+0x37/0x4eb [scsi_mod] > [ 37.474338] [<b011d295>] ? complete+0x43/0x4b > [ 37.474338] [<f0850303>] ? scsi_error_handler+0x0/0x4eb [scsi_mod] > [ 37.474338] [<b01392f4>] kthread+0x37/0x59 > [ 37.474338] [<b01392bd>] ? 
kthread+0x0/0x59 > [ 37.474338] [<b010593f>] kernel_thread_helper+0x7/0x18 > [ 37.474338] ======================= > [ 37.474338] udevd S 00000000 0 461 1 > [ 37.474338] ee5d8b10 00000082 00000000 00000000 00000000 00000000 00000000 ee5d8ab4 > [ 37.474338] 93a637c5 00000001 b0421180 b0421180 b0421180 ee6c20a0 ee6c22fc b180b180 > [ 37.474338] 00000000 ee5d8000 ee6e0480 b1807980 00000000 00000000 000f41a9 00000000 > [ 37.474338] Call Trace: > [ 37.474338] [<b011b920>] ? __update_rq_clock+0x1c/0x157 > [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4 > [ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36 > [ 37.474338] [<b018e2c1>] ? __pollwait+0x67/0xcb > [ 37.474338] [<b0186ef4>] ? pipe_poll+0x29/0x8f > [ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c > [ 37.474338] [<b013fdbc>] ? clocksource_get_next+0x3d/0x44 > [ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb > [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd > [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd > [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd > [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd > [ 37.474338] [<b01414ad>] ? clockevents_program_event+0x93/0x108 > [ 37.474338] [<b01424d9>] ? tick_program_event+0x3f/0x61 > [ 37.474338] [<b01eab55>] ? number+0x2a3/0x2b5 > [ 37.474338] [<b01057bc>] ? apic_timer_interrupt+0x28/0x30 > [ 37.474338] [<b019007b>] ? fcntl_getlk64+0x4e/0x159 > [ 37.474338] [<b01eb3a7>] ? vsnprintf+0x2e8/0x5ea > [ 37.474338] [<b0119234>] ? kmap_atomic_prot+0x47/0xa8 > [ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443 > [ 37.474338] [<b016625f>] ? __alloc_pages+0x57/0x32d > [ 37.474338] [<b016c409>] ? __inc_zone_page_state+0x18/0x1a > [ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb > [ 37.474338] [<b0165c04>] ? free_hot_page+0xa/0xc > [ 37.474338] [<b0168bf3>] ? put_page+0x2d/0xac > [ 37.474338] [<b0176340>] ? free_page_and_swap_cache+0x1e/0x3e > [ 37.474338] [<b016e44a>] ? unmap_vmas+0x317/0x54b > [ 37.474338] [<b0117410>] ? 
pgd_dtor+0x0/0x4a > [ 37.474338] [<b011740e>] ? check_pgt_cache+0x1e/0x20 > [ 37.474338] [<b01711bf>] ? unmap_region+0xdc/0x12f > [ 37.474338] [<b018e35e>] sys_select+0x39/0x1a2 > [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb > [ 37.474338] ======================= > [ 37.474338] udevsettle R running 0 465 1 > [ 37.474338] ee524f1c 00000082 ee524ed4 00000000 00010542 b1808640 00000000 ee6cbe14 > [ 37.474338] 60e8f831 00000008 b0421180 b0421180 b0421180 ee5ce020 ee5ce27c b180b180 > [ 37.474338] 00000000 ee524000 ee6e0880 ee524f1c 33bd800d 00000003 00000008 b041e600 > [ 37.474338] Call Trace: > [ 37.474338] [<b02f1a50>] do_nanosleep+0x70/0x9a > [ 37.474338] [<b013c7ae>] hrtimer_nanosleep+0x4c/0xaf > [ 37.474338] [<b013c31c>] ? hrtimer_wakeup+0x0/0x1c > [ 37.474338] [<b02f1a3d>] ? do_nanosleep+0x5d/0x9a > [ 37.474338] [<b013c868>] sys_nanosleep+0x57/0x5b > [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb > [ 37.474338] [<b02f0000>] ? relay_hotcpu_callback+0x51/0xb1 > [ 37.474338] ======================= > [ 37.474338] udevd S 00000000 0 873 461 > [ 37.474338] e7c7eb10 00000082 00000000 00000000 00000000 00000000 00000000 00000000 > [ 37.474338] 9385214a 00000001 b0421180 b0421180 b0421180 ef8d40a0 ef8d42fc b180b180 > [ 37.474338] 00000000 e7c7e000 ee6e0280 00000000 00000000 00000000 00000000 00000000 > [ 37.474338] Call Trace: > [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4 > [ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36 > [ 37.474338] [<b018e2c1>] ? __pollwait+0x67/0xcb > [ 37.474338] [<b0186ef4>] ? pipe_poll+0x29/0x8f > [ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c > [ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb > [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd > [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124 > [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124 > [ 37.474338] [<b0119234>] ? kmap_atomic_prot+0x47/0xa8 > [ 37.474338] [<b0165eb1>] ? 
get_page_from_freelist+0x25f/0x443 > [ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb > [ 37.474338] [<b013c088>] ? enqueue_hrtimer+0x75/0xf3 > [ 37.474338] [<b013c6ed>] ? hrtimer_start+0xc7/0x13c > [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7 > [ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e > [ 37.474338] [<b016de52>] ? do_wp_page+0x2a0/0x3e5 > [ 37.474338] [<b016f0e7>] ? handle_mm_fault+0x442/0x5d4 > [ 37.474338] [<b02f12e5>] ? preempt_schedule+0x40/0x55 > [ 37.474338] [<b018e35e>] sys_select+0x39/0x1a2 > [ 37.474338] [<b017fc6f>] ? filp_close+0x43/0x69 > [ 37.474338] [<b01ec504>] ? copy_to_user+0x2a/0x36 > [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb > [ 37.474338] ======================= > [ 37.474338] udevd S 00000000 0 876 461 > [ 37.474338] ee780b10 00000086 00000000 00000000 00000000 00000000 00000000 00000000 > [ 37.474338] 93a2c2db 00000001 b0421180 b0421180 b0421180 ef8a1120 ef8a137c b180b180 > [ 37.474338] 00000000 ee780000 ee564100 00000000 00000000 00000000 00000000 00000000 > [ 37.474338] Call Trace: > [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4 > [ 37.474338] [<b01397ad>] ? add_wait_queue+0x2f/0x36 > [ 37.474338] [<b018e2c1>] ? __pollwait+0x67/0xcb > [ 37.474338] [<b0186ef4>] ? pipe_poll+0x29/0x8f > [ 37.474338] [<b018dce0>] do_select+0x4b6/0x53c > [ 37.474338] [<b018e25a>] ? __pollwait+0x0/0xcb > [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd > [ 37.474338] [<b011a8a0>] ? update_curr+0x12f/0x136 > [ 37.474338] [<b012381e>] ? task_tick_fair+0x59/0x86 > [ 37.474338] [<b01227b3>] ? scheduler_tick+0x268/0x3cc > [ 37.474338] [<b0109fbc>] ? read_tsc+0x8/0xa > [ 37.474338] [<b013e472>] ? getnstimeofday+0x34/0xdf > [ 37.474338] [<b01414ad>] ? clockevents_program_event+0x93/0x108 > [ 37.474338] [<b01424d9>] ? tick_program_event+0x3f/0x61 > [ 37.474338] [<b013ca83>] ? hrtimer_interrupt+0x13f/0x164 > [ 37.474338] [<b012c055>] ? irq_exit+0x3f/0x79 > [ 37.474338] [<b0113103>] ? 
smp_apic_timer_interrupt+0x5c/0x89 > [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124 > [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124 > [ 37.474338] [<b0119234>] ? kmap_atomic_prot+0x47/0xa8 > [ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443 > [ 37.474338] [<b018df10>] core_sys_select+0x1aa/0x2bb > [ 37.474338] [<b013c088>] ? enqueue_hrtimer+0x75/0xf3 > [ 37.474338] [<b013c6ed>] ? hrtimer_start+0xc7/0x13c > [ 37.474338] [<b01224d3>] ? hrtick_set+0x7f/0xf7 > [ 37.474338] [<b02f0d71>] ? schedule+0x34e/0x82e > [ 37.474338] [<b016de52>] ? do_wp_page+0x2a0/0x3e5 > [ 37.474338] [<b016f0e7>] ? handle_mm_fault+0x442/0x5d4 > [ 37.474338] [<b02f12e5>] ? preempt_schedule+0x40/0x55 > [ 37.474338] [<b018e35e>] sys_select+0x39/0x1a2 > [ 37.474338] [<b017fc6f>] ? filp_close+0x43/0x69 > [ 37.474338] [<b01ec504>] ? copy_to_user+0x2a/0x36 > [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb > [ 37.474338] ======================= > [ 37.474338] scsi_id D 00000000 0 878 873 > [ 37.474338] ee60bb7c 00000086 00000046 00000000 00000000 ef904000 00000000 ee66d910 > [ 37.474338] 93796336 00000001 b0421180 b0421180 b0421180 ee5d3120 ee5d337c b180b180 > [ 37.474338] 00000000 ee60b000 ee564700 ee5c2800 0000065a 00000000 ee60bb5c b024e1b8 > [ 37.474338] Call Trace: > [ 37.474338] [<b024e1b8>] ? put_device+0xf/0x11 > [ 37.474338] [<f0852878>] ? scsi_request_fn+0x218/0x345 [scsi_mod] > [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4 > [ 37.474338] [<b01d8b08>] ? elv_insert+0xf3/0x20b > [ 37.474338] [<b01302da>] ? mod_timer+0x26/0x3d > [ 37.474338] [<b01db15b>] ? blk_plug_device+0x42/0x9a > [ 37.474338] [<b02f08e5>] wait_for_common+0x74/0x12e > [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd > [ 37.474338] [<b02f0a21>] wait_for_completion+0x12/0x14 > [ 37.474338] [<b01dd237>] blk_execute_rq+0x5c/0x9c > [ 37.474338] [<b01dd277>] ? blk_end_sync_rq+0x0/0x29 > [ 37.474338] [<b01a3fe2>] ? bio_add_pc_page+0x24/0x2a > [ 37.474338] [<b01d9672>] ? 
blk_rq_bio_prep+0x9e/0xb2 > [ 37.474338] [<b01dceef>] ? blk_rq_append_bio+0x17/0x49 > [ 37.474338] [<b01dd022>] ? blk_rq_map_user+0x101/0x1b4 > [ 37.474338] [<b01e01f6>] sg_io+0x18c/0x2e1 > [ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443 > [ 37.474338] [<b01e05df>] scsi_cmd_ioctl+0x294/0x3d5 > [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124 > [ 37.474338] [<b0188770>] ? do_lookup+0x5a/0x162 > [ 37.474338] [<f08799b5>] sd_ioctl+0x82/0xc9 [sd_mod] > [ 37.474338] [<b01ddf93>] blkdev_driver_ioctl+0x55/0x5e > [ 37.474338] [<b01de1bf>] blkdev_ioctl+0x223/0x814 > [ 37.474338] [<b01e7574>] ? kobject_get+0x12/0x17 > [ 37.474338] [<b0160fc3>] ? find_lock_page+0x72/0x8d > [ 37.474338] [<b0163210>] ? filemap_fault+0x240/0x449 > [ 37.474338] [<b01395a5>] ? wake_up_bit+0x17/0x1b > [ 37.474338] [<b0160e75>] ? unlock_page+0x25/0x28 > [ 37.474338] [<b016d9b5>] ? __do_fault+0x17a/0x377 > [ 37.474338] [<b01a57b6>] ? blkdev_open+0x28/0x58 > [ 37.474338] [<b016eda8>] ? handle_mm_fault+0x103/0x5d4 > [ 37.474338] [<b01a4c39>] block_ioctl+0x1b/0x21 > [ 37.474338] [<b01a4c1e>] ? block_ioctl+0x0/0x21 > [ 37.474338] [<b018cc12>] vfs_ioctl+0x22/0x71 > [ 37.474338] [<b018cea9>] do_vfs_ioctl+0x248/0x290 > [ 37.474338] [<b0117ba2>] ? do_page_fault+0x13d/0x5fb > [ 37.474338] [<b0189bfc>] ? putname+0x25/0x30 > [ 37.474338] [<b01800e0>] ? do_sys_open+0xb1/0xc7 > [ 37.474338] [<b018cf43>] sys_ioctl+0x52/0x63 > [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb > [ 37.474338] ======================= > [ 37.474338] scsi_id D ee621bd4 0 879 876 > [ 37.474338] ee636b7c 00000086 ee621bac ee621bd4 00000000 f084c4f9 00000000 ee66d0c8 > [ 37.474338] 9394eaa1 00000001 b0421180 b0421180 b0421180 ef888020 ef88827c b180b180 > [ 37.474338] 00000000 ee636000 ee564300 ee6e8c00 00000000 00000000 ee636b5c b024e1b8 > [ 37.474338] Call Trace: > [ 37.474338] [<f084c4f9>] ? scsi_done+0x0/0x19 [scsi_mod] > [ 37.474338] [<b024e1b8>] ? put_device+0xf/0x11 > [ 37.474338] [<f0852878>] ? 
scsi_request_fn+0x218/0x345 [scsi_mod] > [ 37.474338] [<b02f14e0>] schedule_timeout+0x6a/0xa4 > [ 37.474338] [<b01d8b08>] ? elv_insert+0xf3/0x20b > [ 37.474338] [<b01302da>] ? mod_timer+0x26/0x3d > [ 37.474338] [<b01db15b>] ? blk_plug_device+0x42/0x9a > [ 37.474338] [<b02f08e5>] wait_for_common+0x74/0x12e > [ 37.474338] [<b011ece7>] ? default_wake_function+0x0/0xd > [ 37.474338] [<b02f0a21>] wait_for_completion+0x12/0x14 > [ 37.474338] [<b01dd237>] blk_execute_rq+0x5c/0x9c > [ 37.474338] [<b01dd277>] ? blk_end_sync_rq+0x0/0x29 > [ 37.474338] [<b01a3fe2>] ? bio_add_pc_page+0x24/0x2a > [ 37.474338] [<b01d9672>] ? blk_rq_bio_prep+0x9e/0xb2 > [ 37.474338] [<b01dceef>] ? blk_rq_append_bio+0x17/0x49 > [ 37.474338] [<b01dd022>] ? blk_rq_map_user+0x101/0x1b4 > [ 37.474338] [<b01e01f6>] sg_io+0x18c/0x2e1 > [ 37.474338] [<b0165eb1>] ? get_page_from_freelist+0x25f/0x443 > [ 37.474338] [<b01e05df>] scsi_cmd_ioctl+0x294/0x3d5 > [ 37.474338] [<b01920af>] ? __d_lookup+0xba/0x124 > [ 37.474338] [<b0188770>] ? do_lookup+0x5a/0x162 > [ 37.474338] [<f08799b5>] sd_ioctl+0x82/0xc9 [sd_mod] > [ 37.474338] [<b01ddf93>] blkdev_driver_ioctl+0x55/0x5e > [ 37.474338] [<b01de1bf>] blkdev_ioctl+0x223/0x814 > [ 37.474338] [<b0168bbe>] ? activate_page+0xb1/0xb9 > [ 37.474338] [<b0168cce>] ? mark_page_accessed+0x27/0x2e > [ 37.474338] [<b0163210>] ? filemap_fault+0x240/0x449 > [ 37.474338] [<b01395a5>] ? wake_up_bit+0x17/0x1b > [ 37.474338] [<b0160e75>] ? unlock_page+0x25/0x28 > [ 37.474338] [<b016d9b5>] ? __do_fault+0x17a/0x377 > [ 37.474338] [<b01a57b6>] ? blkdev_open+0x28/0x58 > [ 37.474338] [<b016eda8>] ? handle_mm_fault+0x103/0x5d4 > [ 37.474338] [<b01a4c39>] block_ioctl+0x1b/0x21 > [ 37.474338] [<b01a4c1e>] ? block_ioctl+0x0/0x21 > [ 37.474338] [<b018cc12>] vfs_ioctl+0x22/0x71 > [ 37.474338] [<b018cea9>] do_vfs_ioctl+0x248/0x290 > [ 37.474338] [<b0117ba2>] ? do_page_fault+0x13d/0x5fb > [ 37.474338] [<b0189bfc>] ? putname+0x25/0x30 > [ 37.474338] [<b01800e0>] ? 
do_sys_open+0xb1/0xc7 > [ 37.474338] [<b018cf43>] sys_ioctl+0x52/0x63 > [ 37.474338] [<b0104d52>] syscall_call+0x7/0xb > [ 37.474338] ======================= > [ 37.474338] Sched Debug Version: v0.07, 2.6.25-rc2-smp #14 > [ 37.474338] now at 44900.194289 msecs > [ 37.474338] .sysctl_sched_latency : 20.000000 > [ 37.474338] .sysctl_sched_min_granularity : 4.000000 > [ 37.474338] .sysctl_sched_wakeup_granularity : 10.000000 > [ 37.474338] .sysctl_sched_batch_wakeup_granularity : 10.000000 > [ 37.474338] .sysctl_sched_child_runs_first : 0.000001 > [ 37.474338] .sysctl_sched_features : 39 > [ 37.474338] > [ 37.474338] cpu#0, 2992.603 MHz > [ 37.474338] .nr_running : 4 > [ 37.474338] .load : 2048 > [ 37.474338] .nr_switches : 3363 > [ 37.474338] .nr_load_updates : 34557 > [ 37.474338] .nr_uninterruptible : 2 > [ 37.474338] .jiffies : 4294705811 > [ 37.474338] .next_balance : 4294.705968 > [ 37.474338] .curr->pid : 4 > [ 37.474338] .clock : 37474.338672 > [ 37.474338] .idle_clock : 3065.714769 > [ 37.474338] .prev_clock_raw : 78772.978492 > [ 37.474338] .clock_warps : 0 > [ 37.474338] .clock_overflows : 3996 > [ 37.474338] .clock_underflows : 31781 > [ 37.474338] .clock_deep_idle_events : 1 > [ 37.474338] .clock_max_delta : 0.999848 > [ 37.474338] .cpu_load[0] : 2048 > [ 37.474338] .cpu_load[1] : 2048 > [ 37.474338] .cpu_load[2] : 2048 > [ 37.474338] .cpu_load[3] : 2048 > [ 37.474338] .cpu_load[4] : 2048 > [ 37.474338] > [ 37.474338] cfs_rq > [ 37.474338] .exec_clock : 34293.783916 > [ 37.474338] .MIN_vruntime : 0.000001 > [ 37.474338] .min_vruntime : 17146.893105 > [ 37.474338] .max_vruntime : 0.000001 > [ 37.474338] .spread : 0.000000 > [ 37.474338] .spread0 : 0.000000 > [ 37.474338] .nr_running : 1 > [ 37.474338] .load : 2048 > [ 37.474338] .bkl_count : 405 > [ 37.474338] .nr_spread_over : 0 > [ 37.474338] > [ 37.474338] cfs_rq > [ 37.474338] .exec_clock : 34293.783916 > [ 37.474338] .MIN_vruntime : 13830.833996 > [ 37.474338] .min_vruntime : 17146.893105 > [ 
37.474338] .max_vruntime : 13830.833996 > [ 37.474338] .spread : 0.000000 > [ 37.474338] .spread0 : 0.000000 > [ 37.474338] .nr_running : 4 > [ 37.474338] .load : 8290 > [ 37.474338] .bkl_count : 405 > [ 37.474338] .nr_spread_over : 6 > [ 37.474338] > [ 37.474338] runnable tasks: > [ 37.474338] task PID tree-key switches prio exec-runtime sum-exec sum-sleep > [ 37.474338] ---------------------------------------------------------------------------------------------------------- > [ 37.474338] R ksoftirqd/0 4 14329.115724 37 115 14329.115724 33248.417353 4113.418770 > [ 37.474338] events/0 5 13830.833996 35 115 13830.833996 0.284125 4265.748410 > [ 37.474338] blogd 424 13830.833996 135 120 13830.833996 0.460905 1680.558420 > [ 37.474338] udevsettle 465 13830.833996 17 120 13830.833996 0.838894 643.594622 > [ 37.474338] > > > > -- Jens Axboe ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: regression: CD burning (k3b) went broke 2008-02-22 7:32 ` Jens Axboe @ 2008-02-23 7:42 ` Mike Galbraith 2008-02-24 7:54 ` Mike Galbraith 0 siblings, 1 reply; 109+ messages in thread From: Mike Galbraith @ 2008-02-23 7:42 UTC (permalink / raw) To: Jens Axboe; +Cc: LKML, Tejun Heo On Fri, 2008-02-22 at 08:32 +0100, Jens Axboe wrote: > On Thu, Feb 21 2008, Mike Galbraith wrote: > > Greetings, > > > > K3b recently (9a4c854..5d9c4a7 pull) began terminally griping about > > buffer underrun upon every attempt to burn a CD. I can't fully bisect > > the problem because intervening kernels hang soft during boot. Using > > git bisect visualize, and converting to postable text: > > > > bisect/bad block: add request->raw_data_len (6b00769fe1502b4ad97bb327ef7ac971b208bfb5) > > bisect block: update bio according to DMA alignment padding (40b01b9bbdf51ae543a04744283bf2d56c4a6afa) > > libata: update ATAPI overflow draining > > bisect/good-e164094964e6e20fe7fce418e06a9dce952bb7a4 > > Tejun? <crickets chirping> He must be off having a life or something ;-) Meanwhile back at the ranch, reverting 6b00769fe1502b4ad97bb327ef7ac971b208bfb5 40b01b9bbdf51ae543a04744283bf2d56c4a6afa and the one entangled line from dde2020754aeb14e17052d61784dcb37f252aac2 did restore my burner. -Mike ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: regression: CD burning (k3b) went broke 2008-02-23 7:42 ` Mike Galbraith @ 2008-02-24 7:54 ` Mike Galbraith 2008-02-26 9:48 ` Mike Galbraith 0 siblings, 1 reply; 109+ messages in thread From: Mike Galbraith @ 2008-02-24 7:54 UTC (permalink / raw) To: Jens Axboe; +Cc: LKML, Tejun Heo On Sat, 2008-02-23 at 08:42 +0100, Mike Galbraith wrote: > Meanwhile back at the ranch, reverting > 6b00769fe1502b4ad97bb327ef7ac971b208bfb5 > 40b01b9bbdf51ae543a04744283bf2d56c4a6afa and the one entangled line from > dde2020754aeb14e17052d61784dcb37f252aac2 did restore my burner. It looks like the reason for boot failure with 40b01b9bbdf51ae543a04744283bf2d56c4a6afa may be that one hunk of 6b00769fe1502b4ad97bb327ef7ac971b208bfb5 was supposed to land in 40b01b9bbdf51ae543a04744283bf2d56c4a6afa (per comment); diff --git a/block/blk-map.c b/block/blk-map.c index a7cf63c..09f7fd0 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -154,6 +155,7 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq, bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len; bio->bi_size += pad_len; + rq->data_len += pad_len; } rq->buffer = rq->data = NULL; Something else looks funny with 6b00769fe1502b4ad97bb327ef7ac971b208bfb5, did something go missing? 
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 135c1d0..ba21d97 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -1014,10 +1014,6 @@ static int scsi_init_sgtable(struct request *req, struct scsi_data_buffer *sdb, } req->buffer = NULL; - if (blk_pc_request(req)) - sdb->length = req->data_len; - else - sdb->length = req->nr_sectors << 9; /* * Next, walk the list, and fill in the addresses and sizes of <== here @@ -1026,6 +1022,10 @@ static int scsi_init_sgtable(struct request *req, struct scsi_data_buffer *sdb, count = blk_rq_map_sg(req->q, req, sdb->table.sgl); BUG_ON(count > sdb->table.nents); sdb->table.nents = count; + if (blk_pc_request(req)) + sdb->length = req->data_len; + else + sdb->length = req->nr_sectors << 9; return BLKPREP_OK; } ^ permalink raw reply related [flat|nested] 109+ messages in thread
* Re: regression: CD burning (k3b) went broke 2008-02-24 7:54 ` Mike Galbraith @ 2008-02-26 9:48 ` Mike Galbraith 2008-02-26 13:36 ` Mike Galbraith 0 siblings, 1 reply; 109+ messages in thread From: Mike Galbraith @ 2008-02-26 9:48 UTC (permalink / raw) To: Jens Axboe; +Cc: LKML, Tejun Heo [-- Attachment #1: Type: text/plain, Size: 9615 bytes --] Greetings, I straced both a good and a bad kernel (good being .git with attached revert patch applied) and filtered/diffed/merged the output. Scroll down to "HERE" to see the problem (resid). I'm poking around, but not having much luck. --- good 2008-02-26 09:11:08.000000000 +0100 +++ bad 2008-02-26 09:03:44.000000000 +0100 @@ -1,48 +1,44 @@ open("/dev/sr0", O_RDWR|O_NONBLOCK) = 3 fcntl64(3, F_GETFL) = 0x8802 (flags O_RDWR|O_NONBLOCK|O_LARGEFILE) fcntl64(3, F_SETFL, O_RDWR|O_LARGEFILE) = 0 -ioctl(3, CDROMAUDIOBUFSIZ or SCSI_IOCTL_GET_IDLUN, 0xaf8d9194) = 0 -ioctl(3, SCSI_IOCTL_GET_BUS_NUMBER, 0xaf8d9190) = 0 -ioctl(3, SG_GET_VERSION_NUM, 0xaf8d9198) = 0 +ioctl(3, CDROMAUDIOBUFSIZ or SCSI_IOCTL_GET_IDLUN, 0xafa1a2d4) = 0 +ioctl(3, SCSI_IOCTL_GET_BUS_NUMBER, 0xafa1a2d0) = 0 +ioctl(3, SG_GET_VERSION_NUM, 0xafa1a2d8) = 0 write(2, "Linux sg driver version: 3.5.27\n", 32Linux sg driver version: 3.5.27 ) = 32 -ioctl(3, CDROMAUDIOBUFSIZ or SCSI_IOCTL_GET_IDLUN, 0xaf8d9134) = 0 -ioctl(3, SCSI_IOCTL_GET_BUS_NUMBER, 0xaf8d9130) = 0 -ioctl(3, SG_SET_TIMEOUT, 0xaf8d9030) = 0 -fstat64(3, {st_dev=makedev(0, 13), st_ino=4758, st_mode=S_IFBLK|0640, st_nlink=1, st_uid=0, st_gid=6, st_blksize=4096, st_blocks=0, st_rdev=makedev(11, 0), st_atime=2008/02/26-08:45:17, st_mtime=2008/02/26-08:45:17, st_ctime=2008/02/26-08:45:17}) = 0 +ioctl(3, CDROMAUDIOBUFSIZ or SCSI_IOCTL_GET_IDLUN, 0xafa1a274) = 0 +ioctl(3, SCSI_IOCTL_GET_BUS_NUMBER, 0xafa1a270) = 0 +ioctl(3, SG_SET_TIMEOUT, 0xafa1a170) = 0 +fstat64(3, {st_dev=makedev(0, 13), st_ino=4572, st_mode=S_IFBLK|0640, st_nlink=1, st_uid=0, st_gid=6, st_blksize=4096, st_blocks=0, st_rdev=makedev(11, 0), 
st_atime=2008/02/26-09:36:43, st_mtime=2008/02/26-09:36:43, st_ctime=2008/02/26-09:36:43}) = 0 geteuid32() = 0 getuid32() = 0 write(1, "Using libscg version \'schily-0.9"..., 35) = 35 write(1, "Driveropts: \'burnfree\'\n", 23) = 23 -ioctl(3, SG_GET_RESERVED_SIZE, 0xaf8d93d4) = 0 -ioctl(3, SG_GET_RESERVED_SIZE, 0xaf8d93d8) = 0 -ioctl(3, SG_GET_PACK_ID, 0xaf8d93d0) = -1 ENOTTY (Inappropriate ioctl for device) +ioctl(3, SG_GET_RESERVED_SIZE, 0xafa1a514) = 0 +ioctl(3, SG_GET_RESERVED_SIZE, 0xafa1a518) = 0 +ioctl(3, SG_GET_PACK_ID, 0xafa1a510) = -1 ENOTTY (Inappropriate ioctl for device) write(2, "SCSI buffer size: 64512\n", 24SCSI buffer size: 64512 ) = 24 -ioctl(3, SG_GET_RESERVED_SIZE, 0xaf8d93b4) = 0 -ioctl(3, SG_GET_RESERVED_SIZE, 0xaf8d93b8) = 0 -ioctl(3, SG_GET_PACK_ID, 0xaf8d93b0) = -1 ENOTTY (Inappropriate ioctl for device) -brk(0x9520000) = 0x9520000 -ioctl(3, SG_EMULATED_HOST, 0xaf8d93ec) = 0 +ioctl(3, SG_GET_RESERVED_SIZE, 0xafa1a4f4) = 0 +ioctl(3, SG_GET_RESERVED_SIZE, 0xafa1a4f8) = 0 +ioctl(3, SG_GET_PACK_ID, 0xafa1a4f0) = -1 ENOTTY (Inappropriate ioctl for device) +brk(0x9fa8000) = 0x9fa8000 +ioctl(3, SG_EMULATED_HOST, 0xafa1a52c) = 0 HERE write(1, "atapi: 1\n", 9) = 9 ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0 ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0 -ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[6]=[12, 00, 00, 00, 24, 00], mx_sb_len=16, iovec_count=0, dxfer_len=36, timeout=200000, flags=0x1, data[36]=["\5\200\0052[\0\0\0BENQ DVD DD DW1625 "...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0 
+ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[6]=[12, 00, 00, 00, 24, 00], mx_sb_len=16, iovec_count=0, dxfer_len=36, timeout=200000, flags=0x1, data[36]=["\5\200\0052[\0\0\0BENQ DVD DD DW1625 "...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=36, duration=2, info=0}) = 0 ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0 -ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 3f, 00, 00, 00, 00, 00, 08, 00], mx_sb_len=16, iovec_count=0, dxfer_len=8, timeout=200000, flags=0x1, data[8]=["\1\36\21\0\0\0\0\0"], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=6, info=0}) = 0 +ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 3f, 00, 00, 00, 00, 00, 08, 00], mx_sb_len=16, iovec_count=0, dxfer_len=8, timeout=200000, flags=0x1, data[8]=["\1\36\21\0\0\0\0\0"], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=8, duration=4, info=0}) = 0 ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0 -ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 2a, 00, 00, 00, 00, 00, 02, 00], mx_sb_len=16, iovec_count=0, dxfer_len=2, timeout=200000, flags=0x1, data[2]=["\0>"], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=2, info=0}) = 0 +ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 2a, 00, 00, 00, 00, 00, 02, 00], mx_sb_len=16, iovec_count=0, dxfer_len=2, timeout=200000, flags=0x1, data[2]=["\0>"], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=2, duration=3, info=0}) = 0 -ioctl(3, SG_IO, {'S', SG_DXFER_NONE, 
cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0 +ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=2, info=0}) = 0 -ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 2a, 00, 00, 00, 00, 00, 40, 00], mx_sb_len=16, iovec_count=0, dxfer_len=64, timeout=200000, flags=0x1, data[64]=["\0>\21\0\0\0\0\0*6\37\27\365g) \33\220\0\2\10\0\33\220\0\0\33\220\33\220\0\1"...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=3, info=0}) = 0 +ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 2a, 00, 00, 00, 00, 00, 40, 00], mx_sb_len=16, iovec_count=0, dxfer_len=64, timeout=200000, flags=0x1, data[64]=["\0>\21\0\0\0\0\0*6\37\27\365g) \33\220\0\2\10\0\33\220\0\0\33\220\33\220\0\1"...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=64, duration=3, info=0}) = 0 -ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0 +ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0 -ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 2a, 00, 00, 00, 00, 00, 40, 00], mx_sb_len=16, iovec_count=0, dxfer_len=64, timeout=200000, flags=0x1, data[64]=["\0>\21\0\0\0\0\0*6\37\27\365g) \33\220\0\2\10\0\33\220\0\0\33\220\33\220\0\1"...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, 
resid=0, duration=3, info=0}) = 0 +ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 2a, 00, 00, 00, 00, 00, 0a, 00], mx_sb_len=16, iovec_count=0, dxfer_len=10, timeout=200000, flags=0x1, data[10]=["\0>\21\0\0\0\0\0*6"], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=10, duration=3, info=0}) = 0 -ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=1, info=0}) = 0 +ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[00, 00, 00, 00, 00, 00], mx_sb_len=16, iovec_count=0, dxfer_len=0, timeout=200000, flags=0x1, status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=0, duration=0, info=0}) = 0 - -write(1, "Device type : Removable CD-RO"..., 34) = 34 -write(1, "Version : 5\n", 19) = 19 -write(1, "Response Format: 2\n", 19) = 19 -write(1, "Capabilities : \n", 18) = 18 -write(1, "Vendor_info : \'BENQ \'\n", 28) = 28 -write(1, "Identifikation : \'DVD DD DW1625 "..., 36) = 36 -write(1, "Revision : \'BBIA\'\n", 24) = 24 -write(1, "Device seems to be: Generic mmc2"..., 55) = 55 +ioctl(3, SG_IO, {'S', SG_DXFER_FROM_DEV, cmd[10]=[5a, 00, 2a, 00, 00, 00, 00, 00, 40, 00], mx_sb_len=16, iovec_count=0, dxfer_len=64, timeout=200000, flags=0x1, data[64]=["\0>\21\0\0\0\0\0*6\37\27\365g) \33\220\0\2\10\0\33\220\0\0\33\220\33\220\0\1"...], status=00, masked_status=00, sb[0]=[], host_status=0, driver_status=0, resid=64, duration=3, info=0}) = 0 +write(2, "/usr/bin/cdrecord: Warning: cont"..., 80/usr/bin/cdrecord: Warning: controller returns zero sized CD capabilities page. +) = 80 +write(2, "/usr/bin/cdrecord: Warning: cont"..., 91/usr/bin/cdrecord: Warning: controller returns wrong page 0 for CD capabilities page (2A). 
+) = 91 [-- Attachment #2: revert_add_raw_data_len.diff --] [-- Type: text/x-patch, Size: 3753 bytes --] diff --git a/block/blk-core.c b/block/blk-core.c index 775c851..c013ca2 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -127,7 +127,6 @@ void rq_init(struct request_queue *q, struct request *rq) rq->nr_hw_segments = 0; rq->ioprio = 0; rq->special = NULL; - rq->raw_data_len = 0; rq->buffer = NULL; rq->tag = -1; rq->errors = 0; @@ -2016,7 +2015,6 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq, rq->hard_cur_sectors = rq->current_nr_sectors; rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio); rq->buffer = bio_data(bio); - rq->raw_data_len = bio->bi_size; rq->data_len = bio->bi_size; rq->bio = rq->biotail = bio; diff --git a/block/blk-map.c b/block/blk-map.c index 09f7fd0..bc5ce60 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_queue *q, struct request *rq, rq->biotail->bi_next = bio; rq->biotail = bio; - rq->raw_data_len += bio->bi_size; rq->data_len += bio->bi_size; } return 0; diff --git a/block/blk-merge.c b/block/blk-merge.c index 7506c4f..a15d0ee 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -231,7 +231,6 @@ new_segment: ((unsigned long)q->dma_drain_buffer) & (PAGE_SIZE - 1)); nsegs++; - rq->data_len += q->dma_drain_size; } if (sg) diff --git a/block/bsg.c b/block/bsg.c index 7f3c095..8917c51 100644 --- a/block/bsg.c +++ b/block/bsg.c @@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr, } if (rq->next_rq) { - hdr->dout_resid = rq->raw_data_len; - hdr->din_resid = rq->next_rq->raw_data_len; + hdr->dout_resid = rq->data_len; + hdr->din_resid = rq->next_rq->data_len; blk_rq_unmap_user(bidi_bio); blk_put_request(rq->next_rq); } else if (rq_data_dir(rq) == READ) - hdr->din_resid = rq->raw_data_len; + hdr->din_resid = rq->data_len; else - hdr->dout_resid = rq->raw_data_len; + hdr->dout_resid = rq->data_len; /* * If the 
request generated a negative error number, return it diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c index e993cac..9675b34 100644 --- a/block/scsi_ioctl.c +++ b/block/scsi_ioctl.c @@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr, hdr->info = 0; if (hdr->masked_status || hdr->host_status || hdr->driver_status) hdr->info |= SG_INFO_CHECK; - hdr->resid = rq->raw_data_len; + hdr->resid = rq->data_len; hdr->sb_len_wr = 0; if (rq->sense_len && hdr->sbp) { @@ -528,7 +528,6 @@ static int __blk_send_generic(struct request_queue *q, struct gendisk *bd_disk, rq = blk_get_request(q, WRITE, __GFP_WAIT); rq->cmd_type = REQ_TYPE_BLOCK_PC; rq->data = NULL; - rq->raw_data_len = 0; rq->data_len = 0; rq->timeout = BLK_DEFAULT_SG_TIMEOUT; memset(rq->cmd, 0, sizeof(rq->cmd)); diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index f888bab..a3baf69 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -2539,7 +2539,7 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc) * want to set it properly, and for DMA where it is * effectively meaningless. */ - nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024); + nbytes = min(qc->nbytes, (unsigned int)63 * 1024); /* Most ATAPI devices which honor transfer chunk size don't * behave according to the spec when odd chunk size which diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 6fe67d1..094eba2 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -216,7 +216,6 @@ struct request { unsigned int cmd_len; unsigned char cmd[BLK_MAX_CDB]; - unsigned int raw_data_len; unsigned int data_len; unsigned int sense_len; void *data; ^ permalink raw reply related [flat|nested] 109+ messages in thread
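For readers following the strace diff: on the bad kernel every SG_DXFER_FROM_DEV command comes back with resid equal to dxfer_len, even though the data buffer is clearly filled in. A minimal sketch of why that is fatal, assuming the caller derives the transfer size as "requested minus residual". The struct and helper below are hypothetical stand-ins, not the real struct sg_io_hdr or any libscg code; only the two field names are taken from the trace.

```c
#include <assert.h>

/* Hypothetical two-field stand-in for the SG_IO header seen in the trace. */
struct toy_sg_hdr {
    unsigned int dxfer_len; /* bytes the caller asked for */
    int resid;              /* bytes reported as NOT transferred */
};

/*
 * Bytes the caller believes arrived. A negative resid would mean the
 * device sent more than requested, so only resid > dxfer_len is rejected.
 */
static inline int toy_bytes_transferred(const struct toy_sg_hdr *hdr)
{
    assert(hdr->resid <= (int)hdr->dxfer_len);
    return (int)hdr->dxfer_len - hdr->resid;
}
```

Plugging in the INQUIRY from the trace (dxfer_len=36), the good kernel's resid=0 gives 36 bytes and the bad kernel's resid=36 gives 0 bytes, which would match cdrecord's "zero sized CD capabilities page" warning for the later MODE SENSE.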
* Re: regression: CD burning (k3b) went broke 2008-02-26 9:48 ` Mike Galbraith @ 2008-02-26 13:36 ` Mike Galbraith 2008-02-26 23:08 ` Andrew Morton 0 siblings, 1 reply; 109+ messages in thread From: Mike Galbraith @ 2008-02-26 13:36 UTC (permalink / raw) To: Jens Axboe; +Cc: LKML, Tejun Heo On Tue, 2008-02-26 at 10:48 +0100, Mike Galbraith wrote: > Greetings, > > I straced both a good and a bad kernel (good being .git with attached > revert patch applied) and filtered/diffed/merged the output. Scroll > down to "HERE" to see the problem (resid). > > I'm poking around, but not having much luck. Seems the problem is data_len changes, but raw_data_len doesn't. I've not the foggiest IO-land clue, but k3b works again, so the below may have some diagnostic value. diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index ba21d97..7a6f784 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -871,7 +871,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) scsi_end_bidi_request(cmd); return; } - req->data_len = scsi_get_resid(cmd); + req->data_len = req->raw_data_len = scsi_get_resid(cmd); } BUG_ON(blk_bidi_rq(req)); /* bidi not support for !blk_pc_request yet */ -Mike ^ permalink raw reply related [flat|nested] 109+ messages in thread
* Re: regression: CD burning (k3b) went broke 2008-02-26 13:36 ` Mike Galbraith @ 2008-02-26 23:08 ` Andrew Morton 2008-02-27 0:46 ` Jeff Garzik 2008-02-27 2:24 ` Mike Galbraith 0 siblings, 2 replies; 109+ messages in thread From: Andrew Morton @ 2008-02-26 23:08 UTC (permalink / raw) To: Mike Galbraith; +Cc: Jens Axboe, LKML, Tejun Heo, linux-ide, linux-scsi On Tue, 26 Feb 2008 14:36:43 +0100 Mike Galbraith <efault@gmx.de> wrote: > > On Tue, 2008-02-26 at 10:48 +0100, Mike Galbraith wrote: > > Greetings, > > > > I straced both a good and a bad kernel (good being .git with attached > > revert patch applied) and filtered/diffed/merged the output. Scroll > > down to "HERE" to see the problem (resid). > > > > I'm poking around, but not having much luck. cc's added. I'm told this is part of "Tejun's DMA drain handling". > Seems the problem is data_len changes, but raw_data_len doesn't. I've > not the foggiest IO-land clue, but k3b works again, so the below may > have some diagnostic value. So this change fixes a bug? Can we have a recap of how it does this? > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > index ba21d97..7a6f784 100644 > --- a/drivers/scsi/scsi_lib.c > +++ b/drivers/scsi/scsi_lib.c > @@ -871,7 +871,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) > scsi_end_bidi_request(cmd); > return; > } > - req->data_len = scsi_get_resid(cmd); > + req->data_len = req->raw_data_len = scsi_get_resid(cmd); > } > > BUG_ON(blk_bidi_rq(req)); /* bidi not support for !blk_pc_request yet */ > Thanks. ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: regression: CD burning (k3b) went broke 2008-02-26 23:08 ` Andrew Morton @ 2008-02-27 0:46 ` Jeff Garzik 2008-02-27 2:58 ` Mike Galbraith 2008-02-27 2:24 ` Mike Galbraith 1 sibling, 1 reply; 109+ messages in thread From: Jeff Garzik @ 2008-02-27 0:46 UTC (permalink / raw) To: Andrew Morton Cc: Mike Galbraith, Jens Axboe, LKML, Tejun Heo, linux-ide, linux-scsi Andrew Morton wrote: > On Tue, 26 Feb 2008 14:36:43 +0100 Mike Galbraith <efault@gmx.de> wrote: > >> On Tue, 2008-02-26 at 10:48 +0100, Mike Galbraith wrote: >>> Greetings, >>> >>> I straced both a good and a bad kernel (good being .git with attached >>> revert patch applied) and filtered/diffed/merged the output. Scroll >>> down to "HERE" to see the problem (resid). >>> >>> I'm poking around, but not having much luck. > > cc's added. > > I'm told this is part of "Tejun's DMA drain handling". Correct. >> Seems the problem is data_len changes, but raw_data_len doesn't. I've >> not the foggiest IO-land clue, but k3b works again, so the below may >> have some diagnostic value. > > So this change fixes a bug? Can we have a recap of how it does this? > >> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c >> index ba21d97..7a6f784 100644 >> --- a/drivers/scsi/scsi_lib.c >> +++ b/drivers/scsi/scsi_lib.c >> @@ -871,7 +871,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) >> scsi_end_bidi_request(cmd); >> return; >> } >> - req->data_len = scsi_get_resid(cmd); >> + req->data_len = req->raw_data_len = scsi_get_resid(cmd); >> } I would love to get an answer as to what data_len (and of course raw_data_len) should be set to AFTER the command completes, which is what is going on here. I can see the above being correct -- scsi_get_resid() returns the length of the left-over data after the command is processed -- but I am mainly curious why setting [raw_]data_len matters after I/O completion. Jeff ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: regression: CD burning (k3b) went broke 2008-02-27 0:46 ` Jeff Garzik @ 2008-02-27 2:58 ` Mike Galbraith 0 siblings, 0 replies; 109+ messages in thread From: Mike Galbraith @ 2008-02-27 2:58 UTC (permalink / raw) To: Jeff Garzik Cc: Andrew Morton, Jens Axboe, LKML, Tejun Heo, linux-ide, linux-scsi On Tue, 2008-02-26 at 19:46 -0500, Jeff Garzik wrote: > I would love to get an answer as to what data_len (and of course > raw_data_len) should be set to AFTER the command completes, which is > what is going on here. Yeah, blk_complete_sghdr_rq() used to do hdr->resid = rq->data_len, which is modified down lower. How/where that hdr->resid percolates back up, and turns into a retry/nogo, I don't know. -Mike ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: regression: CD burning (k3b) went broke 2008-02-26 23:08 ` Andrew Morton 2008-02-27 0:46 ` Jeff Garzik @ 2008-02-27 2:24 ` Mike Galbraith 2008-02-27 6:00 ` Mike Galbraith 1 sibling, 1 reply; 109+ messages in thread From: Mike Galbraith @ 2008-02-27 2:24 UTC (permalink / raw) To: Andrew Morton; +Cc: Jens Axboe, LKML, Tejun Heo, linux-ide, linux-scsi On Tue, 2008-02-26 at 15:08 -0800, Andrew Morton wrote: > On Tue, 26 Feb 2008 14:36:43 +0100 Mike Galbraith <efault@gmx.de> wrote: > > Seems the problem is data_len changes, but raw_data_len doesn't. I've > > not the foggiest IO-land clue, but k3b works again, so the below may > > have some diagnostic value. > > So this change fixes a bug? Can we have a recap of how it does this? Yeah, it fixes the problem. (wrt recap, if I could write it, it would be a changelog;) -Mike ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: regression: CD burning (k3b) went broke 2008-02-27 2:24 ` Mike Galbraith @ 2008-02-27 6:00 ` Mike Galbraith 2008-02-27 7:07 ` Mike Galbraith 0 siblings, 1 reply; 109+ messages in thread From: Mike Galbraith @ 2008-02-27 6:00 UTC (permalink / raw) To: Andrew Morton; +Cc: Jens Axboe, LKML, Tejun Heo, linux-ide, linux-scsi On Wed, 2008-02-27 at 03:24 +0100, Mike Galbraith wrote: > On Tue, 2008-02-26 at 15:08 -0800, Andrew Morton wrote: > > So this change fixes a bug? Can we have a recap of how it does this? > > Yeah, it fixes the problem. (wrt recap, if I could write it, it would > be a changelog;) Hm. After rummaging around some more in both kernel and userland, I think this patchlet is not only functional, but (random accident) technically correct. What the heck, let's see if it flies... snippet from userland: /* * Return the residual DMA count for last command. * If this count is < 0, then a DMA overrun occured. */ EXPORT int scg_getresid(scgp) SCSI *scgp; { return (scgp->scmd->resid); } This function is used all over the place in cdrecord to determine transfer size. (patchlet takes wing, and... goes splat?) Fix CD burning regression introduced by 6b00769fe1502b4ad97bb327ef7ac971b208bfb5. raw_data_len must be updated to reflect residual data upon IO completion because it is used by blk_complete_sghdr_rq() to set hdr->resid which eventually becomes visible to userland. Signed-off-by: Mike Galbraith <efault@gmx.de> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index ba21d97..7a6f784 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -871,7 +871,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) scsi_end_bidi_request(cmd); return; } - req->data_len = scsi_get_resid(cmd); + req->data_len = req->raw_data_len = scsi_get_resid(cmd); } BUG_ON(blk_bidi_rq(req)); /* bidi not support for !blk_pc_request yet */ -Mike ^ permalink raw reply related [flat|nested] 109+ messages in thread
* Re: regression: CD burning (k3b) went broke 2008-02-27 6:00 ` Mike Galbraith @ 2008-02-27 7:07 ` Mike Galbraith 2008-02-28 7:43 ` Tejun Heo 0 siblings, 1 reply; 109+ messages in thread From: Mike Galbraith @ 2008-02-27 7:07 UTC (permalink / raw) To: Andrew Morton; +Cc: Jens Axboe, LKML, Tejun Heo, linux-ide, linux-scsi On Wed, 2008-02-27 at 07:00 +0100, Mike Galbraith wrote: > (patchlet takes wing, and... goes splat?) Bugger, went splat... forgot preformat for patchlet insert. <quiltuple checks> Fix CD burning regression introduced by 6b00769fe1502b4ad97bb327ef7ac971b208bfb5. raw_data_len must be updated to reflect residual data upon IO completion because it is used by blk_complete_sghdr_rq() to set hdr->resid which eventually becomes visible to userland. Signed-off-by: Mike Galbraith <efault@gmx.de> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index ba21d97..7a6f784 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -871,7 +871,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes) scsi_end_bidi_request(cmd); return; } - req->data_len = scsi_get_resid(cmd); + req->data_len = req->raw_data_len = scsi_get_resid(cmd); } BUG_ON(blk_bidi_rq(req)); /* bidi not support for !blk_pc_request yet */ ^ permalink raw reply related [flat|nested] 109+ messages in thread
* Re: regression: CD burning (k3b) went broke 2008-02-27 7:07 ` Mike Galbraith @ 2008-02-28 7:43 ` Tejun Heo 2008-02-28 8:20 ` Mike Galbraith 0 siblings, 1 reply; 109+ messages in thread From: Tejun Heo @ 2008-02-28 7:43 UTC (permalink / raw) To: Mike Galbraith Cc: Andrew Morton, Jens Axboe, LKML, linux-ide, linux-scsi, Jeff Garzik [-- Attachment #1: Type: text/plain, Size: 1011 bytes --] Hello, all. Sorry about the delay. Was buried under other stuff. Mike, thanks a lot for reporting and analyzing the problem; however, the patch is slightly incorrect. rq->data_len is rq->raw_data_len + extra stuff for alignment and padding, so the correct thing to do is... req->raw_data_len -= req->data_len - scsi_get_resid(cmd); req->data_len = scsi_get_resid(cmd); which is ugly and error-prone. In addition, this isn't the only place where resid is set. Other block drivers do this too. This definitely should be done in block layer. With rq->data_len and rq->raw_data_len, it's impossible to translate resid of rq->data_len to resid of rq->raw_data_len as block layer doesn't know how much was extra data after rq->data_len is modified. The attached patch substitutes rq->raw_data_len w/ rq->extra_len and adds blk_rq_raw_data_len(). Things look cleaner this way and the resid problem should be solved with this. Can you please verify the attached patch fixes the problem? Thanks. 
-- tejun [-- Attachment #2: patch --] [-- Type: text/plain, Size: 4608 bytes --] diff --git a/block/blk-core.c b/block/blk-core.c index 775c851..929ab61 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -127,7 +127,7 @@ void rq_init(struct request_queue *q, struct request *rq) rq->nr_hw_segments = 0; rq->ioprio = 0; rq->special = NULL; - rq->raw_data_len = 0; + rq->extra_len = 0; rq->buffer = NULL; rq->tag = -1; rq->errors = 0; @@ -2016,7 +2016,6 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq, rq->hard_cur_sectors = rq->current_nr_sectors; rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio); rq->buffer = bio_data(bio); - rq->raw_data_len = bio->bi_size; rq->data_len = bio->bi_size; rq->bio = rq->biotail = bio; diff --git a/block/blk-map.c b/block/blk-map.c index 09f7fd0..c67a75f 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_queue *q, struct request *rq, rq->biotail->bi_next = bio; rq->biotail = bio; - rq->raw_data_len += bio->bi_size; rq->data_len += bio->bi_size; } return 0; @@ -156,6 +155,7 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq, bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len; bio->bi_size += pad_len; rq->data_len += pad_len; + rq->extra_len += pad_len; } rq->buffer = rq->data = NULL; diff --git a/block/blk-merge.c b/block/blk-merge.c index 7506c4f..efb5b4d 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -232,6 +232,7 @@ new_segment: (PAGE_SIZE - 1)); nsegs++; rq->data_len += q->dma_drain_size; + rq->extra_len += q->dma_drain_size; } if (sg) diff --git a/block/bsg.c b/block/bsg.c index 7f3c095..81b2133 100644 --- a/block/bsg.c +++ b/block/bsg.c @@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr, } if (rq->next_rq) { - hdr->dout_resid = rq->raw_data_len; - hdr->din_resid = rq->next_rq->raw_data_len; + hdr->dout_resid = blk_rq_raw_data_len(rq); + hdr->din_resid = 
blk_rq_raw_data_len(rq->next_rq); blk_rq_unmap_user(bidi_bio); blk_put_request(rq->next_rq); } else if (rq_data_dir(rq) == READ) - hdr->din_resid = rq->raw_data_len; + hdr->din_resid = blk_rq_raw_data_len(rq); else - hdr->dout_resid = rq->raw_data_len; + hdr->dout_resid = blk_rq_raw_data_len(rq); /* * If the request generated a negative error number, return it diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c index e993cac..32424b3 100644 --- a/block/scsi_ioctl.c +++ b/block/scsi_ioctl.c @@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr, hdr->info = 0; if (hdr->masked_status || hdr->host_status || hdr->driver_status) hdr->info |= SG_INFO_CHECK; - hdr->resid = rq->raw_data_len; + hdr->resid = blk_rq_raw_data_len(rq); hdr->sb_len_wr = 0; if (rq->sense_len && hdr->sbp) { @@ -528,8 +528,8 @@ static int __blk_send_generic(struct request_queue *q, struct gendisk *bd_disk, rq = blk_get_request(q, WRITE, __GFP_WAIT); rq->cmd_type = REQ_TYPE_BLOCK_PC; rq->data = NULL; - rq->raw_data_len = 0; rq->data_len = 0; + rq->extra_len = 0; rq->timeout = BLK_DEFAULT_SG_TIMEOUT; memset(rq->cmd, 0, sizeof(rq->cmd)); rq->cmd[0] = cmd; diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index 0562b0a..5cab84c 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -2539,7 +2539,8 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc) * want to set it properly, and for DMA where it is * effectively meaningless. 
*/ - nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024); + nbytes = min(blk_rq_raw_data_len(scmd->request), + (unsigned int)63 * 1024); /* Most ATAPI devices which honor transfer chunk size don't * behave according to the spec when odd chunk size which diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 6fe67d1..57e2a9e 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -216,8 +216,8 @@ struct request { unsigned int cmd_len; unsigned char cmd[BLK_MAX_CDB]; - unsigned int raw_data_len; unsigned int data_len; + unsigned int extra_len; unsigned int sense_len; void *data; void *sense; @@ -477,6 +477,11 @@ enum { #define rq_data_dir(rq) ((rq)->cmd_flags & 1) +static inline unsigned int blk_rq_raw_data_len(struct request *rq) +{ + return rq->data_len - min(rq->extra_len, rq->data_len); +} + /* * We regard a request as sync, if it's a READ or a SYNC write. */ ^ permalink raw reply related [flat|nested] 109+ messages in thread
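The arithmetic behind blk_rq_raw_data_len() in the patch above can be tried outside the kernel. What follows is only a minimal user-space sketch: struct rq_sketch and the rq_sketch_*() helpers are hypothetical stand-ins invented for illustration, not kernel code, and they model nothing beyond the two counters the patch touches.

```c
#include <assert.h>

/* Hypothetical stand-in for the two struct request fields the patch
 * touches; this is not the kernel structure. */
struct rq_sketch {
	unsigned int data_len;   /* payload plus padding/drain bytes */
	unsigned int extra_len;  /* padding/drain bytes only */
};

/* Same arithmetic as blk_rq_raw_data_len() in the patch: strip the
 * extra bytes, clamping so the result never drops below zero. */
static unsigned int rq_sketch_raw_data_len(const struct rq_sketch *rq)
{
	unsigned int extra = rq->extra_len < rq->data_len ?
			     rq->extra_len : rq->data_len;

	return rq->data_len - extra;
}

/* Mimics the blk-map.c and blk-merge.c hunks: both counters grow together. */
static void rq_sketch_add_extra(struct rq_sketch *rq, unsigned int len)
{
	assert(rq->data_len + len >= rq->data_len);	/* no wraparound */
	rq->data_len += len;
	rq->extra_len += len;
}

/* Mimics a low-level driver reporting a residual by overwriting data_len. */
static void rq_sketch_set_resid(struct rq_sketch *rq, unsigned int resid)
{
	rq->data_len = resid;
}
```

With a 30-byte buffer padded by 2 bytes, the helper reports 30 before completion. Once a driver stores a residual of 12 in data_len it reports 10, and a residual of 1 clamps to 0 instead of wrapping.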
* Re: regression: CD burning (k3b) went broke
  2008-02-28 7:43 ` Tejun Heo
@ 2008-02-28 8:20 ` Mike Galbraith
  2008-02-28 8:50 ` [PATCH] block: fix residual byte count handling Tejun Heo
  0 siblings, 1 reply; 109+ messages in thread
From: Mike Galbraith @ 2008-02-28 8:20 UTC (permalink / raw)
To: Tejun Heo
Cc: Andrew Morton, Jens Axboe, LKML, linux-ide, linux-scsi, Jeff Garzik

On Thu, 2008-02-28 at 16:43 +0900, Tejun Heo wrote:
> Hello, all.
>
> Sorry about the delay. Was buried under other stuff. Mike, thanks a
> lot for reporting and analyzing the problem; however, the patch is
> slightly incorrect. rq->data_len is rq->raw_data_len + extra stuff for
> alignment and padding, so the correct thing to do is...
>
>   req->raw_data_len -= req->data_len - scsi_get_resid(cmd);
>   req->data_len = scsi_get_resid(cmd);

Ah, close but no banana. (feeds poor wingless patchlet to bit-wolf)

> which is ugly and error-prone. In addition, this isn't the only place
> where resid is set. Other block drivers do this too. This definitely
> should be done in block layer.
>
> With rq->data_len and rq->raw_data_len, it's impossible to translate
> resid of rq->data_len to resid of rq->raw_data_len as block layer
> doesn't know how much was extra data after rq->data_len is modified.
> The attached patch substitutes rq->raw_data_len w/ rq->extra_len and
> adds blk_rq_raw_data_len(). Things look cleaner this way and the resid
> problem should be solved with this.
>
> Can you please verify the attached patch fixes the problem?
>
> Thanks.

Thank you, works fine.

> plain text document attachment (patch) > diff --git a/block/blk-core.c b/block/blk-core.c > index 775c851..929ab61 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -127,7 +127,7 @@ void rq_init(struct request_queue *q, struct request *rq) > rq->nr_hw_segments = 0; > rq->ioprio = 0; > rq->special = NULL; > - rq->raw_data_len = 0; > + rq->extra_len = 0; > rq->buffer = NULL; > rq->tag = -1; > rq->errors = 0; > @@ -2016,7 +2016,6 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq, > rq->hard_cur_sectors = rq->current_nr_sectors; > rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio); > rq->buffer = bio_data(bio); > - rq->raw_data_len = bio->bi_size; > rq->data_len = bio->bi_size; > > rq->bio = rq->biotail = bio; > diff --git a/block/blk-map.c b/block/blk-map.c > index 09f7fd0..c67a75f 100644 > --- a/block/blk-map.c > +++ b/block/blk-map.c > @@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_queue *q, struct request *rq, > rq->biotail->bi_next = bio; > rq->biotail = bio; > > - rq->raw_data_len += bio->bi_size; > rq->data_len += bio->bi_size; > } > return 0; > @@ -156,6 +155,7 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq, > bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len; > bio->bi_size += pad_len; > rq->data_len += pad_len; > + rq->extra_len += pad_len; > } > > rq->buffer = rq->data = NULL; > diff --git a/block/blk-merge.c b/block/blk-merge.c > index 7506c4f..efb5b4d 100644 > --- a/block/blk-merge.c > +++ b/block/blk-merge.c > @@ -232,6 +232,7 @@ new_segment: > (PAGE_SIZE - 1)); > nsegs++; > rq->data_len += q->dma_drain_size; > + rq->extra_len += q->dma_drain_size; > } > > if (sg) > diff --git a/block/bsg.c b/block/bsg.c > index 7f3c095..81b2133 100644 > --- a/block/bsg.c > +++ b/block/bsg.c > @@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr, > } > > if (rq->next_rq) { > - hdr->dout_resid = rq->raw_data_len; > - hdr->din_resid = rq->next_rq->raw_data_len; > 
+ hdr->dout_resid = blk_rq_raw_data_len(rq); > + hdr->din_resid = blk_rq_raw_data_len(rq->next_rq); > blk_rq_unmap_user(bidi_bio); > blk_put_request(rq->next_rq); > } else if (rq_data_dir(rq) == READ) > - hdr->din_resid = rq->raw_data_len; > + hdr->din_resid = blk_rq_raw_data_len(rq); > else > - hdr->dout_resid = rq->raw_data_len; > + hdr->dout_resid = blk_rq_raw_data_len(rq); > > /* > * If the request generated a negative error number, return it > diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c > index e993cac..32424b3 100644 > --- a/block/scsi_ioctl.c > +++ b/block/scsi_ioctl.c > @@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr, > hdr->info = 0; > if (hdr->masked_status || hdr->host_status || hdr->driver_status) > hdr->info |= SG_INFO_CHECK; > - hdr->resid = rq->raw_data_len; > + hdr->resid = blk_rq_raw_data_len(rq); > hdr->sb_len_wr = 0; > > if (rq->sense_len && hdr->sbp) { > @@ -528,8 +528,8 @@ static int __blk_send_generic(struct request_queue *q, struct gendisk *bd_disk, > rq = blk_get_request(q, WRITE, __GFP_WAIT); > rq->cmd_type = REQ_TYPE_BLOCK_PC; > rq->data = NULL; > - rq->raw_data_len = 0; > rq->data_len = 0; > + rq->extra_len = 0; > rq->timeout = BLK_DEFAULT_SG_TIMEOUT; > memset(rq->cmd, 0, sizeof(rq->cmd)); > rq->cmd[0] = cmd; > diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c > index 0562b0a..5cab84c 100644 > --- a/drivers/ata/libata-scsi.c > +++ b/drivers/ata/libata-scsi.c > @@ -2539,7 +2539,8 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc) > * want to set it properly, and for DMA where it is > * effectively meaningless. 
> */ > - nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024); > + nbytes = min(blk_rq_raw_data_len(scmd->request), > + (unsigned int)63 * 1024); > > /* Most ATAPI devices which honor transfer chunk size don't > * behave according to the spec when odd chunk size which > diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > index 6fe67d1..57e2a9e 100644 > --- a/include/linux/blkdev.h > +++ b/include/linux/blkdev.h > @@ -216,8 +216,8 @@ struct request { > unsigned int cmd_len; > unsigned char cmd[BLK_MAX_CDB]; > > - unsigned int raw_data_len; > unsigned int data_len; > + unsigned int extra_len; > unsigned int sense_len; > void *data; > void *sense; > @@ -477,6 +477,11 @@ enum { > > #define rq_data_dir(rq) ((rq)->cmd_flags & 1) > > +static inline unsigned int blk_rq_raw_data_len(struct request *rq) > +{ > + return rq->data_len - min(rq->extra_len, rq->data_len); > +} > + > /* > * We regard a request as sync, if it's a READ or a SYNC write. > */ ^ permalink raw reply [flat|nested] 109+ messages in thread
* [PATCH] block: fix residual byte count handling
  2008-02-28 8:20 ` Mike Galbraith
@ 2008-02-28 8:50 ` Tejun Heo
  2008-02-28 15:35 ` Jens Axboe
  0 siblings, 1 reply; 109+ messages in thread
From: Tejun Heo @ 2008-02-28 8:50 UTC (permalink / raw)
To: Mike Galbraith, Andrew Morton, Jens Axboe
Cc: LKML, linux-ide, linux-scsi, Jeff Garzik

rq->raw_data_len introduced for block layer padding and draining
(commit 6b00769fe1502b4ad97bb327ef7ac971b208bfb5) broke residual byte
count handling. Block drivers modify rq->data_len to notify residual
byte count to the block layer which blindly reported unmodified
rq->raw_data_len to userland.

To keep block drivers dealing only with rq->data_len, this should be
handled inside block layer. However, how much extra buffer was
appended is lost after rq->data_len is modified.

This patch replaces rq->raw_data_len with rq->extra_len and adds
blk_rq_raw_data_len() helper to calculate raw data size from
rq->data_len and rq->extra_len. The helper returns correct raw
residual byte count when called on a rq whose data_len is modified to
carry residual byte count.

This problem was reported and diagnosed by Mike Galbraith.

Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Mike Galbraith <efault@gmx.de> --- block/blk-core.c | 3 +-- block/blk-map.c | 2 +- block/blk-merge.c | 1 + block/bsg.c | 8 ++++---- block/scsi_ioctl.c | 4 ++-- drivers/ata/libata-scsi.c | 3 ++- include/linux/blkdev.h | 8 +++++++- 7 files changed, 18 insertions(+), 11 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 775c851..929ab61 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -127,7 +127,7 @@ void rq_init(struct request_queue *q, struct request *rq) rq->nr_hw_segments = 0; rq->ioprio = 0; rq->special = NULL; - rq->raw_data_len = 0; + rq->extra_len = 0; rq->buffer = NULL; rq->tag = -1; rq->errors = 0; @@ -2016,7 +2016,6 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq, rq->hard_cur_sectors = rq->current_nr_sectors; rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio); rq->buffer = bio_data(bio); - rq->raw_data_len = bio->bi_size; rq->data_len = bio->bi_size; rq->bio = rq->biotail = bio; diff --git a/block/blk-map.c b/block/blk-map.c index 09f7fd0..c67a75f 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_queue *q, struct request *rq, rq->biotail->bi_next = bio; rq->biotail = bio; - rq->raw_data_len += bio->bi_size; rq->data_len += bio->bi_size; } return 0; @@ -156,6 +155,7 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq, bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len; bio->bi_size += pad_len; rq->data_len += pad_len; + rq->extra_len += pad_len; } rq->buffer = rq->data = NULL; diff --git a/block/blk-merge.c b/block/blk-merge.c index 7506c4f..efb5b4d 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -232,6 +232,7 @@ new_segment: (PAGE_SIZE - 1)); nsegs++; rq->data_len += q->dma_drain_size; + rq->extra_len += q->dma_drain_size; } if (sg) diff --git a/block/bsg.c b/block/bsg.c index 7f3c095..81b2133 100644 --- a/block/bsg.c +++ b/block/bsg.c @@ -437,14 +437,14 @@ static int 
blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr, } if (rq->next_rq) { - hdr->dout_resid = rq->raw_data_len; - hdr->din_resid = rq->next_rq->raw_data_len; + hdr->dout_resid = blk_rq_raw_data_len(rq); + hdr->din_resid = blk_rq_raw_data_len(rq->next_rq); blk_rq_unmap_user(bidi_bio); blk_put_request(rq->next_rq); } else if (rq_data_dir(rq) == READ) - hdr->din_resid = rq->raw_data_len; + hdr->din_resid = blk_rq_raw_data_len(rq); else - hdr->dout_resid = rq->raw_data_len; + hdr->dout_resid = blk_rq_raw_data_len(rq); /* * If the request generated a negative error number, return it diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c index e993cac..32424b3 100644 --- a/block/scsi_ioctl.c +++ b/block/scsi_ioctl.c @@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr, hdr->info = 0; if (hdr->masked_status || hdr->host_status || hdr->driver_status) hdr->info |= SG_INFO_CHECK; - hdr->resid = rq->raw_data_len; + hdr->resid = blk_rq_raw_data_len(rq); hdr->sb_len_wr = 0; if (rq->sense_len && hdr->sbp) { @@ -528,8 +528,8 @@ static int __blk_send_generic(struct request_queue *q, struct gendisk *bd_disk, rq = blk_get_request(q, WRITE, __GFP_WAIT); rq->cmd_type = REQ_TYPE_BLOCK_PC; rq->data = NULL; - rq->raw_data_len = 0; rq->data_len = 0; + rq->extra_len = 0; rq->timeout = BLK_DEFAULT_SG_TIMEOUT; memset(rq->cmd, 0, sizeof(rq->cmd)); rq->cmd[0] = cmd; diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index 0562b0a..5cab84c 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -2539,7 +2539,8 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc) * want to set it properly, and for DMA where it is * effectively meaningless. 
*/ - nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024); + nbytes = min(blk_rq_raw_data_len(scmd->request), + (unsigned int)63 * 1024); /* Most ATAPI devices which honor transfer chunk size don't * behave according to the spec when odd chunk size which diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 6fe67d1..917b97f 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -216,8 +216,8 @@ struct request { unsigned int cmd_len; unsigned char cmd[BLK_MAX_CDB]; - unsigned int raw_data_len; unsigned int data_len; + unsigned int extra_len; /* length of alignment and padding */ unsigned int sense_len; void *data; void *sense; @@ -477,6 +477,12 @@ enum { #define rq_data_dir(rq) ((rq)->cmd_flags & 1) +/* data_len of the request sans extra stuff for alignment and padding */ +static inline unsigned int blk_rq_raw_data_len(struct request *rq) +{ + return rq->data_len - min(rq->extra_len, rq->data_len); +} + /* * We regard a request as sync, if it's a READ or a SYNC write. */ ^ permalink raw reply related [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling
  2008-02-28 8:50 ` [PATCH] block: fix residual byte count handling Tejun Heo
@ 2008-02-28 15:35 ` Jens Axboe
  2008-02-28 15:46 ` Tejun Heo
  0 siblings, 1 reply; 109+ messages in thread
From: Jens Axboe @ 2008-02-28 15:35 UTC (permalink / raw)
To: Tejun Heo
Cc: Mike Galbraith, Andrew Morton, LKML, linux-ide, linux-scsi, Jeff Garzik

On Thu, Feb 28 2008, Tejun Heo wrote:
> rq->raw_data_len introduced for block layer padding and draining
> (commit 6b00769fe1502b4ad97bb327ef7ac971b208bfb5) broke residual byte
> count handling. Block drivers modify rq->data_len to notify residual
> byte count to the block layer which blindly reported unmodified
> rq->raw_data_len to userland.
>
> To keep block drivers dealing only with rq->data_len, this should be
> handled inside block layer. However, how much extra buffer was
> appended is lost after rq->data_len is modified.
>
> This patch replaces rq->raw_data_len with rq->extra_len and adds
> blk_rq_raw_data_len() helper to calculate raw data size from
> rq->data_len and rq->extra_len. The helper returns correct raw
> residual byte count when called on a rq whose data_len is modified to
> carry residual byte count.
>
> This problem was reported and diagnosed by Mike Galbraith.

Tejun, this patch isn't much cleaner at all. It really shows the pain of
these two separate, yet related, variables.

--
Jens Axboe

^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling
  2008-02-28 15:35 ` Jens Axboe
@ 2008-02-28 15:46 ` Tejun Heo
  2008-02-29 16:47 ` James Bottomley
  0 siblings, 1 reply; 109+ messages in thread
From: Tejun Heo @ 2008-02-28 15:46 UTC (permalink / raw)
To: Jens Axboe
Cc: Mike Galbraith, Andrew Morton, LKML, linux-ide, linux-scsi, Jeff Garzik

Jens Axboe wrote:
>> This problem was reported and diagnosed by Mike Galbraith.
>
> Tejun, this patch isn't much cleaner at all. It really shows the pain of
> these two separate, yet related, variables.

Not much cleaner compared to what? I think padding stuff is bound to be
somewhat complex. It's a nasty thing in nature. I think ->extra_len is
better than ->raw_data_len because ->extra_len only needs to be updated
where the dirty jobs are done and extra buffer areas are added. Any
better suggestions?

Thanks.

--
tejun

^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling
  2008-02-28 15:46 ` Tejun Heo
@ 2008-02-29 16:47 ` James Bottomley
  2008-02-29 20:11 ` Jens Axboe
  0 siblings, 1 reply; 109+ messages in thread
From: James Bottomley @ 2008-02-29 16:47 UTC (permalink / raw)
To: Tejun Heo
Cc: Jens Axboe, Mike Galbraith, Andrew Morton, LKML, linux-ide, linux-scsi, Jeff Garzik

On Fri, 2008-02-29 at 00:46 +0900, Tejun Heo wrote:
> Jens Axboe wrote:
> >> This problem was reported and diagnosed by Mike Galbraith.
> >
> > Tejun, this patch isn't much cleaner at all. It really shows the pain of
> > these two separate, yet related, variables.
>
> Not much cleaner compared to what? I think padding stuff is bound to be
> somewhat complex. It's a nasty thing in nature. I think ->extra_len is
> better than ->raw_data_len because ->extra_len only needs to be updated
> where the dirty jobs are done and extra buffer areas are added. Any
> better suggestions?

Well, I just investigated a bug report in the SCSI transport class. Our
SMP handler is broken in exactly the same way. We rely on the incoming
reported request lengths to size our request data, and they've blown up
from the true length to 512 bytes (the size of our alignment).

With the original patch, I have to run through the whole of libsas and
scsi_transport_sas doing

  s/data_len/raw_data_len/

With your update it looks like I have to run through them all doing

  s/data_len/data_len - extra_len/

which is even worse. Can't we put things back to a point where data_len
means exactly that and extra_len means how much we have spare on the
end, so you know you can DMA up to data_len + extra_len if need be?

That way we don't have to sweep through every block driver altering the
way it uses data_len.

James

^ permalink raw reply [flat|nested] 109+ messages in thread
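The semantics James proposes above (data_len stays the true length, extra_len records the spare room at the end) can be pictured with a small user-space sketch. Everything in it is an assumption made for illustration: struct rq_alt_sketch, pad_bytes(), rq_alt_map() and rq_alt_dma_len() are hypothetical names, and the rounding rule is a generic power-of-two round-up, not the block layer's actual mapping code.

```c
#include <assert.h>

/* Hypothetical request under the proposed meaning of the two fields. */
struct rq_alt_sketch {
	unsigned int data_len;   /* true length of the request */
	unsigned int extra_len;  /* spare bytes available after the data */
};

/* Bytes needed to round len up to a multiple of align (a power of two). */
static unsigned int pad_bytes(unsigned int len, unsigned int align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (align - (len & (align - 1))) & (align - 1);
}

/* Map a buffer: the padding goes into extra_len, data_len is untouched. */
static void rq_alt_map(struct rq_alt_sketch *rq, unsigned int len,
		       unsigned int align)
{
	rq->data_len = len;
	rq->extra_len = pad_bytes(len, align);
}

/* Upper bound a driver may DMA, should it want to use the spare room. */
static unsigned int rq_alt_dma_len(const struct rq_alt_sketch *rq)
{
	return rq->data_len + rq->extra_len;
}
```

For a 36-byte frame and 512-byte alignment this yields data_len 36, extra_len 476 and a DMA limit of 512, so code that sizes a frame from data_len keeps seeing 36 and needs no change.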
* Re: [PATCH] block: fix residual byte count handling
  2008-02-29 16:47 ` James Bottomley
@ 2008-02-29 20:11 ` Jens Axboe
  2008-03-01 6:17 ` Tejun Heo
  0 siblings, 1 reply; 109+ messages in thread
From: Jens Axboe @ 2008-02-29 20:11 UTC (permalink / raw)
To: James Bottomley
Cc: Tejun Heo, Mike Galbraith, Andrew Morton, LKML, linux-ide, linux-scsi, Jeff Garzik

On Fri, Feb 29 2008, James Bottomley wrote:
>
> On Fri, 2008-02-29 at 00:46 +0900, Tejun Heo wrote:
> > Jens Axboe wrote:
> > >> This problem was reported and diagnosed by Mike Galbraith.
> > >
> > > Tejun, this patch isn't much cleaner at all. It really shows the pain of
> > > these two separate, yet related, variables.
> >
> > Not much cleaner compared to what? I think padding stuff is bound to be
> > somewhat complex. It's a nasty thing in nature. I think ->extra_len is
> > better than ->raw_data_len because ->extra_len only needs to be updated
> > where the dirty jobs are done and extra buffer areas are added. Any
> > better suggestions?
>
> Well, I just investigated a bug report in the SCSI transport class. Our
> SMP handler is broken in exactly the same way. We rely on the incoming
> reported request lengths to size our request data, and they've blown up
> from the true length to 512 bytes (the size of our alignment).
>
> With the original patch, I have to run through the whole of libsas and
> scsi_transport_sas doing
>
>   s/data_len/raw_data_len/
>
> With your update it looks like I have to run through them all doing
>
>   s/data_len/data_len - extra_len/
>
> which is even worse. Can't we put things back to a point where data_len
> means exactly that and extra_len means how much we have spare on the
> end, so you know you can DMA up to data_len + extra_len if need be?
>
> That way we don't have to sweep through every block driver altering the
> way it uses data_len.

Fully agree. The reason why I think it's so ugly is that you have to
keep these two separate variables in sync. The burning was just one bug,
there will be others...

--
Jens Axboe

^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling
  2008-02-29 20:11 ` Jens Axboe
@ 2008-03-01 6:17 ` Tejun Heo
  2008-03-01 15:19 ` James Bottomley
  2008-03-02 14:52 ` FUJITA Tomonori
  0 siblings, 2 replies; 109+ messages in thread
From: Tejun Heo @ 2008-03-01 6:17 UTC (permalink / raw)
To: Jens Axboe
Cc: James Bottomley, Mike Galbraith, Andrew Morton, LKML, linux-ide, linux-scsi, Jeff Garzik

Hello, Jens, James.

Jens Axboe wrote:
>> With the original patch, I have to run through the whole of libsas and
>> scsi_transport_sas doing
>>
>>   s/data_len/raw_data_len/
>>
>> With your update it looks like I have to run through them all doing
>>
>>   s/data_len/data_len - extra_len/

blk_rq_raw_data_len() should do.

>> which is even worse. Can't we put things back to a point where data_len
>> means exactly that and extra_len means how much we have spare on the
>> end, so you know you can DMA up to data_len + extra_len if need be?
>>
>> That way we don't have to sweep through every block driver altering the
>> way it uses data_len.

If SMP is broken because it needs start address alignment but not
padding to align the size, what should be done is to make that exact
requirement visible to the block layer. Say,
blk_queue_dma_start_alignment() or maybe change
blk_queue_dma_alignment() such that it only indicates start address
alignment and add blk_queue_dma_size_alignment() for drivers which
require size to be aligned too. I think those are few.

I think the decision which value rq->data_len represents comes down to
which size is used more in low level drivers because no matter which way
we choose we'll have to update some of the drivers which expect the
other thing from rq->data_len.

blk_rq_raw_data_len() is needed iff a driver needs dummy buffers
attached at the end and still needs to know the original request size
which isn't the common case.

> Fully agree. The reason why I think it's so ugly is that you have to
> keep these two separate variables in sync. The burning was just one bug,
> there will be others...

The posted modification isn't too bad as the maintenance of the two
variables is at places where the nasty things happen. I think what
rq->data_len should represent when seen from LLDs is more important and
please note that if SMP is broken because it simply doesn't require
512-byte size alignment, it's a different issue.

As long as both raw_data_len and data_len are accessible, I'm okay
either way. My biggest reluctance is against breaking sum(sg) ==
rq->data_len. I think this can lead to much more subtle problems such
as programming the controller w/ wrong byte count and wrapped-around
resid calculation.

Thanks.

--
tejun

^ permalink raw reply [flat|nested] 109+ messages in thread
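The "wrapped-around resid calculation" worry in the message above can be pictured with plain unsigned arithmetic. This is a minimal sketch under an assumed scenario, not taken from any driver: a controller counts the bytes it moved against the padded scatterlist total, while the residual is computed from an unpadded data_len. Both function names are hypothetical.

```c
/* Residual as a bare unsigned subtraction: wraps to a huge value when
 * the transferred count exceeds data_len. */
static unsigned int resid_naive(unsigned int data_len, unsigned int xferred)
{
	return data_len - xferred;
}

/* Same calculation, clamped at zero so it cannot wrap. */
static unsigned int resid_clamped(unsigned int data_len, unsigned int xferred)
{
	return xferred < data_len ? data_len - xferred : 0;
}
```

With data_len at 30 and a controller that moved all 32 padded bytes, the bare subtraction wraps to a value just below the top of the unsigned range, whereas the clamped form returns 0.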
* Re: [PATCH] block: fix residual byte count handling
  2008-03-01 6:17 ` Tejun Heo
@ 2008-03-01 15:19 ` James Bottomley
  2008-03-02 14:52 ` FUJITA Tomonori
  1 sibling, 0 replies; 109+ messages in thread
From: James Bottomley @ 2008-03-01 15:19 UTC (permalink / raw)
To: Tejun Heo
Cc: Jens Axboe, Mike Galbraith, Andrew Morton, LKML, linux-ide, linux-scsi, Jeff Garzik

On Sat, 2008-03-01 at 15:17 +0900, Tejun Heo wrote:
> Hello, Jens, James.
>
> Jens Axboe wrote:
> >> With the original patch, I have to run through the whole of libsas and
> >> scsi_transport_sas doing
> >>
> >>   s/data_len/raw_data_len/
> >>
> >> With your update it looks like I have to run through them all doing
> >>
> >>   s/data_len/data_len - extra_len/
>
> blk_rq_raw_data_len() should do.

I know we *could* sweep through all the block drivers altering them; my
point is that I don't think we *should*. Fundamentally, every driver
that cares is assuming req->data_len is the length of the request that
came down. The fact that it got padded is irrelevant (and actually
detrimental) to most of them as the SMP driver illustrates. We use a
high dma_alignment not because we care about padding, but because we
want to avoid scatter gather. So we care about alignment of the start
of the buffer (to avoid sg), but fundamentally, we need to know what its
true length (not its padded length) is. The true length feeds into the
smp frame size and is checked by the interfaces, which is why the
changes caused an SMP failure. Just for the principle of least
surprise, can we not keep req->data_len what it has always been, namely
the true data length of the request and express the fact that we've
padded it by req->extra_len or something, so we don't have to do all of
these driver changes.

> >> which is even worse. Can't we put things back to a point where data_len
> >> means exactly that and extra_len means how much we have spare on the
> >> end, so you know you can DMA up to data_len + extra_len if need be?
> >>
> >> That way we don't have to sweep through every block driver altering the
> >> way it uses data_len.
>
> If SMP is broken because it needs start address alignment but not
> padding to align the size, what should be done is to make that exact
> requirement visible to the block layer. Say,
> blk_queue_dma_start_alignment() or maybe change
> blk_queue_dma_alignment() such that it only indicates start address
> alignment and add blk_queue_dma_size_alignment() for drivers which
> require size to be aligned too. I think those are few.

But this is true of *every* current user of the block layer apart from
IDE ... we all care about alignment not padding. Any current user that
actually cares about padding will be doing their own adjustments, so
they need changing anyway. We can frame that with a different API, but
blk_queue_dma_alignment() better be the common case (start but not pad
alignment).

> I think the decision which value rq->data_len represents comes down to
> which size is used more in low level drivers because no matter which way
> we choose we'll have to update some of the drivers which expect the
> other thing from rq->data_len.

Right, and currently, apart from IDE, they all want it to mean the true
data length.

> blk_rq_raw_data_len() is needed iff a driver needs dummy buffers
> attached at the end and still needs to know the original request size
> which isn't the common case.

I think it *is* the common case.

> > Fully agree. The reason why I think it's so ugly is that you have to
> > keep these two separate variables in sync. The burning was just one bug,
> > there will be others...
>
> The posted modification isn't too bad as the maintenance of the two
> variables is at places where the nasty things happen. I think what
> rq->data_len should represent when seen from LLDs is more important and
> please note that if SMP is broken because it simply doesn't require
> 512-byte size alignment, it's a different issue.

But we still have to find all the bugs this causes in all the block
drivers ... that's my biggest concern right now.

> As long as both raw_data_len and data_len are accessible, I'm okay
> either way. My biggest reluctance is against breaking sum(sg) ==
> rq->data_len. I think this can lead to much more subtle problems such
> as programming the controller w/ wrong byte count and wrapped-around
> resid calculation.

OK, so can we go back to data_len being the true value and add an
extra_len for drivers who want to know where the padding lies?

James

^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling
  2008-03-01 6:17 ` Tejun Heo
  2008-03-01 15:19 ` James Bottomley
@ 2008-03-02 14:52 ` FUJITA Tomonori
  2008-03-02 18:46 ` Mike Christie
  2008-03-03 2:40 ` Tejun Heo
  1 sibling, 2 replies; 109+ messages in thread
From: FUJITA Tomonori @ 2008-03-02 14:52 UTC (permalink / raw)
To: htejun
Cc: jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, fujita.tomonori

On Sat, 01 Mar 2008 15:17:32 +0900 Tejun Heo <htejun@gmail.com> wrote:
> Hello, Jens, James.
>
> Jens Axboe wrote:
> >> With the original patch, I have to run through the whole of libsas and
> >> scsi_transport_sas doing
> >>
> >>   s/data_len/raw_data_len/
> >>
> >> With your update it looks like I have to run through them all doing
> >>
> >>   s/data_len/data_len - extra_len/
>
> blk_rq_raw_data_len() should do.
>
> >> which is even worse. Can't we put things back to a point where data_len
> >> means exactly that and extra_len means how much we have spare on the
> >> end, so you know you can DMA up to data_len + extra_len if need be?
> >>
> >> That way we don't have to sweep through every block driver altering the
> >> way it uses data_len.
>
> If SMP is broken because it needs start address alignment but not
> padding to align the size, what should be done is to make that exact
> requirement visible to the block layer. Say,
> blk_queue_dma_start_alignment() or maybe change
> blk_queue_dma_alignment() such that it only indicates start address
> alignment and add blk_queue_dma_size_alignment() for drivers which
> require size to be aligned too. I think those are few.
>
> I think the decision which value rq->data_len represents comes down to
> which size is used more in low level drivers because no matter which way
> we choose we'll have to update some of the drivers which expect the
> other thing from rq->data_len.
>
> blk_rq_raw_data_len() is needed iff a driver needs dummy buffers
> attached at the end and still needs to know the original request size
> which isn't the common case.
>
> > Fully agree. The reason why I think it's so ugly is that you have to
> > keep these two separate variables in sync. The burning was just one bug,
> > there will be others...
>
> The posted modification isn't too bad as the maintenance of the two
> variables is at places where the nasty things happen. I think what
> rq->data_len should represent when seen from LLDs is more important and
> please note that if SMP is broken because it simply doesn't require
> 512-byte size alignment, it's a different issue.
>
> As long as both raw_data_len and data_len are accessible, I'm okay
> either way. My biggest reluctance is against breaking sum(sg) ==
> rq->data_len. I think this can lead to much more subtle problems such
> as programming the controller w/ wrong byte count and wrapped-around
> resid calculation.

sum(sg) == rq->data_len is already broken; sg sends such requests
(though it would be nice if it didn't).

I've not followed the earlier discussion (because I thought the drain
buffer stuff affected only libata, but it seems it doesn't ...). Why did
we need to change the meaning of rq->data_len? rq->data_len meant the
true data length and the patch to change it doesn't seem to make
anything simpler.

Can we revert the meaning of rq->data_len? I'm not sure that we need to
add rq->extra_len but it's fine as long as it's only for drivers that
want to use it.

This is only compile tested.

= diff --git a/block/blk-core.c b/block/blk-core.c index 775c851..bfec406 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -127,7 +127,6 @@ void rq_init(struct request_queue *q, struct request *rq) rq->nr_hw_segments = 0; rq->ioprio = 0; rq->special = NULL; - rq->raw_data_len = 0; rq->buffer = NULL; rq->tag = -1; rq->errors = 0; @@ -135,6 +134,7 @@ void rq_init(struct request_queue *q, struct request *rq) rq->cmd_len = 0; memset(rq->cmd, 0, sizeof(rq->cmd)); rq->data_len = 0; + rq->extra_len = 0; rq->sense_len = 0; rq->data = NULL; rq->sense = NULL; @@ -2016,7 +2016,6 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq, rq->hard_cur_sectors = rq->current_nr_sectors; rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio); rq->buffer = bio_data(bio); - rq->raw_data_len = bio->bi_size; rq->data_len = bio->bi_size; rq->bio = rq->biotail = bio; diff --git a/block/blk-map.c b/block/blk-map.c index 09f7fd0..3287637 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_queue *q, struct request *rq, rq->biotail->bi_next = bio; rq->biotail = bio; - rq->raw_data_len += bio->bi_size; rq->data_len += bio->bi_size; } return 0; @@ -155,7 +154,7 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq, bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len; bio->bi_size += pad_len; - rq->data_len += pad_len; + rq->extra_len += pad_len; } rq->buffer = rq->data = NULL; diff --git a/block/blk-merge.c b/block/blk-merge.c index 7506c4f..0f58616 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -231,7 +231,7 @@ new_segment: ((unsigned long)q->dma_drain_buffer) & (PAGE_SIZE - 1)); nsegs++; - rq->data_len += q->dma_drain_size; + rq->extra_len += q->dma_drain_size; } if (sg) diff --git a/block/bsg.c b/block/bsg.c index 7f3c095..8917c51 100644 --- a/block/bsg.c +++ b/block/bsg.c @@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr, } if (rq->next_rq) { - 
hdr->dout_resid = rq->raw_data_len; - hdr->din_resid = rq->next_rq->raw_data_len; + hdr->dout_resid = rq->data_len; + hdr->din_resid = rq->next_rq->data_len; blk_rq_unmap_user(bidi_bio); blk_put_request(rq->next_rq); } else if (rq_data_dir(rq) == READ) - hdr->din_resid = rq->raw_data_len; + hdr->din_resid = rq->data_len; else - hdr->dout_resid = rq->raw_data_len; + hdr->dout_resid = rq->data_len; /* * If the request generated a negative error number, return it diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c index e993cac..a2c3a93 100644 --- a/block/scsi_ioctl.c +++ b/block/scsi_ioctl.c @@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr, hdr->info = 0; if (hdr->masked_status || hdr->host_status || hdr->driver_status) hdr->info |= SG_INFO_CHECK; - hdr->resid = rq->raw_data_len; + hdr->resid = rq->data_len; hdr->sb_len_wr = 0; if (rq->sense_len && hdr->sbp) { @@ -528,8 +528,8 @@ static int __blk_send_generic(struct request_queue *q, struct gendisk *bd_disk, rq = blk_get_request(q, WRITE, __GFP_WAIT); rq->cmd_type = REQ_TYPE_BLOCK_PC; rq->data = NULL; - rq->raw_data_len = 0; rq->data_len = 0; + rq->extra_len = 0; rq->timeout = BLK_DEFAULT_SG_TIMEOUT; memset(rq->cmd, 0, sizeof(rq->cmd)); rq->cmd[0] = cmd; diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index 7b1f1ee..fe47922 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -2538,7 +2538,7 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc) } qc->tf.command = ATA_CMD_PACKET; - qc->nbytes = scsi_bufflen(scmd); + qc->nbytes = scsi_bufflen(scmd) + scmd->request->extra_len; /* check whether ATAPI DMA is safe */ if (!using_pio && ata_check_atapi_dma(qc)) @@ -2549,7 +2549,7 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc) * want to set it properly, and for DMA where it is * effectively meaningless. 
*/ - nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024); + nbytes = min(scmd->request->data_len, (unsigned int)63 * 1024); /* Most ATAPI devices which honor transfer chunk size don't * behave according to the spec when odd chunk size which @@ -2875,7 +2875,7 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc) * TODO: find out if we need to do more here to * cover scatter/gather case. */ - qc->nbytes = scsi_bufflen(scmd); + qc->nbytes = scsi_bufflen(scmd) + scmd->request->extra_len; /* request result TF and be quiet about device error */ qc->flags |= ATA_QCFLAG_RESULT_TF | ATA_QCFLAG_QUIET; diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 6fe67d1..b72526c 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -216,8 +216,8 @@ struct request { unsigned int cmd_len; unsigned char cmd[BLK_MAX_CDB]; - unsigned int raw_data_len; unsigned int data_len; + unsigned int extra_len; /* length of alignment and padding */ unsigned int sense_len; void *data; void *sense; ^ permalink raw reply related [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-02 14:52 ` FUJITA Tomonori @ 2008-03-02 18:46 ` Mike Christie 2008-03-03 3:27 ` Mike Galbraith 2008-03-03 2:40 ` Tejun Heo 1 sibling, 1 reply; 109+ messages in thread From: Mike Christie @ 2008-03-02 18:46 UTC (permalink / raw) To: FUJITA Tomonori Cc: htejun, jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, fujita.tomonori FUJITA Tomonori wrote: > sum(sg) == rq->data_len is already broken; sg sends such requests > (though it would be nice if it doesn't). > Actually, I think I was half wrong on that when you asked about scsi_debug. The scatterlist that sg.c uses is never seen by the block layer or scsi layer. It is just used as a container to hold segments. sg.c and st.c use their scatterlist to manage their preallocated pages/segments. When they pass it to scsi_execute_async, that function will create a request struct and add bios for the pages. In 2.6.24 and below, sg.c will send a scatterlist length that does not match the IO length, and scsi_execute_async will goof up and send a rq->data_len that does not match the sum of the bios. That is what I was trying to fix in 2.6.24, but the patch got messed up. In 2.6.25-rc2 and above that is fixed and scsi_execute_async will catch sg.c doing this and set rq->data_len and the bio lengths correctly. So hopefully that helps any fixes you might have planned. ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-02 18:46 ` Mike Christie @ 2008-03-03 3:27 ` Mike Galbraith 0 siblings, 0 replies; 109+ messages in thread From: Mike Galbraith @ 2008-03-03 3:27 UTC (permalink / raw) To: Mike Christie Cc: Jens Axboe, Mike Galbraith, Andrew Morton, LKML, linux-ide, linux-scsi, Jeff Garzik Is it possible to teach Thunderturd to NOT munge the cc line? It stripped names from cc addresses, and here when that happens the message lands (intentionally) in my spam grinder. I just happened to see this one before flushing, but now, thanks to Thunderturd, every follow-up will also land there. (well, not any more since I restored it) -Mike ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-02 14:52 ` FUJITA Tomonori 2008-03-02 18:46 ` Mike Christie @ 2008-03-03 2:40 ` Tejun Heo 2008-03-03 3:59 ` FUJITA Tomonori 1 sibling, 1 reply; 109+ messages in thread From: Tejun Heo @ 2008-03-03 2:40 UTC (permalink / raw) To: FUJITA Tomonori Cc: jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, fujita.tomonori FUJITA Tomonori wrote: > sum(sg) == rq->data_len is already broken; sg sends such requests > (though it would be nice if it doesn't). > > I've not followed the earlier discussion (because I thought the drain > buffer stuff affected only libata but seems it doesn't ...). Why did > we need to change the meaning of rq->data_len? At this point, it's not clear what the original meaning of rq->data_len is because before moving alignment and padding to block layer, rq->data_len equaled both the requested data length and the length of sg list. AFAIK, it's SCSI midlayer which makes sg list and data length mismatch not block layer. From the POV of the block layer, as now it might extend the sg list, it has to decide what rq->data_len means in this case - the requested transfer length from userland or the length of mapped sg list. I think that currently the biggest problem is that drivers which don't require request size adjustment are getting it because alignment setting doesn't distinguish between start address alignment and size alignment. For drivers which don't require data size adjustment from block layer, data_len or raw_data_len doesn't matter. They're equal anyway. I'm prepping a patch for this now. For drivers which do require request size adjustments, I think it's better to keep rq->data_len in line with the size of mapped sg list. The rationales are... - Those are dumb controllers which want to see requests which meet certain size requirements and they're likely to care more about actual data buffer size than user requested buffer size. 
IOW, they wanna see sizes which meet certain requirements, so give them those values. - I think bugs caused by using raw_data_len instead of data_len are more subtle than the other way around. Using data_len instead of raw_data_len usually affects the application layer while using raw_data_len instead of data_len affects the DMA engine and transport layer. > rq->data_len meant the true data length and the patch to change it > doesn't look to make anything simple. Can we revert the meaning of > rq->data_len? I'm not sure that we need to add rq->extra_len but it's > fine as long as it's only for drivers that want to use it. > > This is only compile tested. If we're gonna go this way, we'll need blk_rq_total_data_len() and use it for drivers which requires request size adjustments. Thanks. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-03 2:40 ` Tejun Heo @ 2008-03-03 3:59 ` FUJITA Tomonori 2008-03-03 4:09 ` Tejun Heo 0 siblings, 1 reply; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-03 3:59 UTC (permalink / raw) To: htejun Cc: tomof, jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, fujita.tomonori On Mon, 03 Mar 2008 11:40:08 +0900 Tejun Heo <htejun@gmail.com> wrote: > FUJITA Tomonori wrote: > > sum(sg) == rq->data_len is already broken; sg sends such requests > > (though it would be nice if it doesn't). > > > > I've not followed the earlier discussion (because I thought the drain > > buffer stuff affected only libata but seems it doesn't ...). Why did > > we need to change the meaning of rq->data_len? > > At this point, it's not clear what the original meaning of rq->data_len > is because before moving alignment and padding to block layer, > rq->data_len equaled both the requested data length and the length of sg > list. AFAIK, it's SCSI midlayer which makes sg list and data length > mismatch not block layer. > > From the POV of the block layer, as now it might extend the sg list, it > has to decide what rq->data_len means in this case - the requested > transfer length from userland or the length of mapped sg list. Yeah. It meant the requested transfer length from userland in the past and I think that changing it to the length of mapped sg list doesn't make anything simpler. > I think that currently the biggest problem is that drivers which don't > require request size adjustment are getting it because alignment setting > doesn't distinguish between start address alignment and size alignment. > For drivers which don't require data size adjustment from block layer, > data_len or raw_data_len doesn't matter. They're equal anyway. I'm > prepping a patch for this now. 
> > For drivers which do require request size adjustments, I think it's > better to keep rq->data_len in line with the size of mapped sg list. > The rationales are... > > - Those are dumb controllers which want to see requests which meet > certain size requirements and they're likely to care more about actual > data buffer size than user requested buffer size. IOW, they wanna see > sizes which meet certain requirements, so give them those values. The drivers care about the actual data buffer size. The drivers for dumb controllers can get what they want by using rq->extra_len or by walking through the sg list. > - I think bugs caused by using raw_data_len instead of data_len are more > subtle than the other way around. Using data_len instead of > raw_data_len usually affects the application layer while using > raw_data_len instead of data_len affects the DMA engine and transport layer. If we add extra_len, we can get what raw_data_len and data_len provide. I can't see what changing the meaning of rq->data_len (and investigating all the block drivers) gives us. > > rq->data_len meant the true data length and the patch to change it > > doesn't look to make anything simple. Can we revert the meaning of > > rq->data_len? I'm not sure that we need to add rq->extra_len but it's > > fine as long as it's only for drivers that want to use it. > > > > This is only compile tested. > > If we're gonna go this way, we'll need blk_rq_total_data_len() and use > it for drivers which requires request size adjustments. No problem. It would be much better to add blk_rq_total_data_len() rather than changing the meaning of rq->data_len and all the block drivers. Here's an updated patch (I forgot to remove the bi_size adjustment in blk_rq_map_user in the previous patch). Can we agree on it if we add blk_rq_total_data_len()? 
diff --git a/block/blk-core.c b/block/blk-core.c index 775c851..bfec406 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -127,7 +127,6 @@ void rq_init(struct request_queue *q, struct request *rq) rq->nr_hw_segments = 0; rq->ioprio = 0; rq->special = NULL; - rq->raw_data_len = 0; rq->buffer = NULL; rq->tag = -1; rq->errors = 0; @@ -135,6 +134,7 @@ void rq_init(struct request_queue *q, struct request *rq) rq->cmd_len = 0; memset(rq->cmd, 0, sizeof(rq->cmd)); rq->data_len = 0; + rq->extra_len = 0; rq->sense_len = 0; rq->data = NULL; rq->sense = NULL; @@ -2016,7 +2016,6 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq, rq->hard_cur_sectors = rq->current_nr_sectors; rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio); rq->buffer = bio_data(bio); - rq->raw_data_len = bio->bi_size; rq->data_len = bio->bi_size; rq->bio = rq->biotail = bio; diff --git a/block/blk-map.c b/block/blk-map.c index 09f7fd0..f559832 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_queue *q, struct request *rq, rq->biotail->bi_next = bio; rq->biotail = bio; - rq->raw_data_len += bio->bi_size; rq->data_len += bio->bi_size; } return 0; @@ -151,11 +150,8 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq, */ if (len & queue_dma_alignment(q)) { unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1; - struct bio *bio = rq->biotail; - bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len; - bio->bi_size += pad_len; - rq->data_len += pad_len; + rq->extra_len += pad_len; } rq->buffer = rq->data = NULL; diff --git a/block/blk-merge.c b/block/blk-merge.c index 7506c4f..0f58616 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -231,7 +231,7 @@ new_segment: ((unsigned long)q->dma_drain_buffer) & (PAGE_SIZE - 1)); nsegs++; - rq->data_len += q->dma_drain_size; + rq->extra_len += q->dma_drain_size; } if (sg) diff --git a/block/bsg.c b/block/bsg.c index 7f3c095..8917c51 100644 --- a/block/bsg.c +++ 
b/block/bsg.c @@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr, } if (rq->next_rq) { - hdr->dout_resid = rq->raw_data_len; - hdr->din_resid = rq->next_rq->raw_data_len; + hdr->dout_resid = rq->data_len; + hdr->din_resid = rq->next_rq->data_len; blk_rq_unmap_user(bidi_bio); blk_put_request(rq->next_rq); } else if (rq_data_dir(rq) == READ) - hdr->din_resid = rq->raw_data_len; + hdr->din_resid = rq->data_len; else - hdr->dout_resid = rq->raw_data_len; + hdr->dout_resid = rq->data_len; /* * If the request generated a negative error number, return it diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c index e993cac..a2c3a93 100644 --- a/block/scsi_ioctl.c +++ b/block/scsi_ioctl.c @@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr, hdr->info = 0; if (hdr->masked_status || hdr->host_status || hdr->driver_status) hdr->info |= SG_INFO_CHECK; - hdr->resid = rq->raw_data_len; + hdr->resid = rq->data_len; hdr->sb_len_wr = 0; if (rq->sense_len && hdr->sbp) { @@ -528,8 +528,8 @@ static int __blk_send_generic(struct request_queue *q, struct gendisk *bd_disk, rq = blk_get_request(q, WRITE, __GFP_WAIT); rq->cmd_type = REQ_TYPE_BLOCK_PC; rq->data = NULL; - rq->raw_data_len = 0; rq->data_len = 0; + rq->extra_len = 0; rq->timeout = BLK_DEFAULT_SG_TIMEOUT; memset(rq->cmd, 0, sizeof(rq->cmd)); rq->cmd[0] = cmd; diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index 7b1f1ee..fe47922 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -2538,7 +2538,7 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc) } qc->tf.command = ATA_CMD_PACKET; - qc->nbytes = scsi_bufflen(scmd); + qc->nbytes = scsi_bufflen(scmd) + scmd->request->extra_len; /* check whether ATAPI DMA is safe */ if (!using_pio && ata_check_atapi_dma(qc)) @@ -2549,7 +2549,7 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc) * want to set it properly, and for DMA where it is * 
effectively meaningless. */ - nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024); + nbytes = min(scmd->request->data_len, (unsigned int)63 * 1024); /* Most ATAPI devices which honor transfer chunk size don't * behave according to the spec when odd chunk size which @@ -2875,7 +2875,7 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc) * TODO: find out if we need to do more here to * cover scatter/gather case. */ - qc->nbytes = scsi_bufflen(scmd); + qc->nbytes = scsi_bufflen(scmd) + scmd->request->extra_len; /* request result TF and be quiet about device error */ qc->flags |= ATA_QCFLAG_RESULT_TF | ATA_QCFLAG_QUIET; diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 6fe67d1..b72526c 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -216,8 +216,8 @@ struct request { unsigned int cmd_len; unsigned char cmd[BLK_MAX_CDB]; - unsigned int raw_data_len; unsigned int data_len; + unsigned int extra_len; /* length of alignment and padding */ unsigned int sense_len; void *data; void *sense; ^ permalink raw reply related [flat|nested] 109+ messages in thread
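Editorial sketch of the proposal in the message above, under stated assumptions: rq->data_len keeps the caller's true length, padding and drain bytes are accounted only in rq->extra_len, and a driver that needs the full buffer size adds the two. The thread names blk_rq_total_data_len() but never shows a definition, so the body below is a guess at its intent; struct mock_request and mock_pad() are hypothetical stand-ins, not kernel code.

```c
#include <assert.h>

/* Hypothetical mock of the two struct request fields the patch touches. */
struct mock_request {
	unsigned int data_len;   /* true data length requested by the caller */
	unsigned int extra_len;  /* alignment padding plus drain buffer */
};

/* Assumed definition: the thread only mentions the helper by name. */
static unsigned int blk_rq_total_data_len(const struct mock_request *rq)
{
	return rq->data_len + rq->extra_len;
}

/*
 * Mirrors the blk_rq_map_user() hunk of the patch: when len is not a
 * multiple of (mask + 1), the pad is added to extra_len and data_len
 * is left alone. mask must be of the form 2^n - 1.
 */
static void mock_pad(struct mock_request *rq, unsigned int len,
		     unsigned int mask)
{
	assert((mask & (mask + 1)) == 0);
	if (len & mask)
		rq->extra_len += (mask & ~len) + 1;
}
```

With a 10-byte request and a mask of 3, data_len stays at 10, extra_len becomes 2, and the total that a controller would be programmed with is 12.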
* Re: [PATCH] block: fix residual byte count handling 2008-03-03 3:59 ` FUJITA Tomonori @ 2008-03-03 4:09 ` Tejun Heo 2008-03-03 6:08 ` [PATCH 1/2] " Tejun Heo 2008-03-03 8:26 ` [PATCH] block: fix residual byte count handling FUJITA Tomonori 0 siblings, 2 replies; 109+ messages in thread From: Tejun Heo @ 2008-03-03 4:09 UTC (permalink / raw) To: FUJITA Tomonori Cc: tomof, jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik FUJITA Tomonori wrote: >> - I think bugs caused by using raw_data_len instead of data_len are more >> subtle than the other way around. Using data_len instead of >> raw_data_len usually affects the application layer while using >> raw_data_len instead of data_len affects the DMA engine and transport layer. > > If we add extra_len, we can get what raw_data_len and data_len > provide. > > I can't see what changing the meaning of rq->data_len (and > investigating all the block drivers) gives us. No matter which way you go, you change the meaning of rq->data_len and you MUST inspect rq->data_len usage whichever way you go. Apply your patch and try to do sg IO on IDE cdrom w/ various transfer lengths. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* [PATCH 1/2] block: fix residual byte count handling 2008-03-03 4:09 ` Tejun Heo @ 2008-03-03 6:08 ` Tejun Heo 2008-03-03 6:10 ` [PATCH] block: separate out padding from alignment Tejun Heo 2008-03-03 8:26 ` [PATCH] block: fix residual byte count handling FUJITA Tomonori 1 sibling, 1 reply; 109+ messages in thread From: Tejun Heo @ 2008-03-03 6:08 UTC (permalink / raw) To: FUJITA Tomonori Cc: tomof, jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik rq->raw_data_len introduced for block layer padding and draining (commit 6b00769fe1502b4ad97bb327ef7ac971b208bfb5) broke residual byte count handling. Block drivers modify rq->data_len to notify residual byte count to the block layer which blindly reported unmodified rq->raw_data_len to userland. To keep block drivers dealing only with rq->data_len, this should be handled inside block layer. However, how much extra buffer was appended is lost after rq->data_len is modified. This patch replaces rq->raw_data_len with rq->extra_len and adds a blk_rq_raw_data_len() helper to calculate raw data size from rq->data_len and rq->extra_len. The helper returns correct raw residual byte count when called on a rq whose data_len is modified to carry residual byte count. This problem was reported and diagnosed by Mike Galbraith. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Mike Galbraith <efault@gmx.de> --- Comments updated compared to the previous version. 
block/blk-core.c | 3 +-- block/blk-map.c | 2 +- block/blk-merge.c | 1 + block/blk-settings.c | 4 ++++ block/bsg.c | 8 ++++---- block/scsi_ioctl.c | 4 ++-- drivers/ata/libata-scsi.c | 3 ++- include/linux/blkdev.h | 8 +++++++- 8 files changed, 22 insertions(+), 11 deletions(-) Index: work/block/blk-core.c =================================================================== --- work.orig/block/blk-core.c +++ work/block/blk-core.c @@ -127,7 +127,7 @@ void rq_init(struct request_queue *q, st rq->nr_hw_segments = 0; rq->ioprio = 0; rq->special = NULL; - rq->raw_data_len = 0; + rq->extra_len = 0; rq->buffer = NULL; rq->tag = -1; rq->errors = 0; @@ -2016,7 +2016,6 @@ void blk_rq_bio_prep(struct request_queu rq->hard_cur_sectors = rq->current_nr_sectors; rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio); rq->buffer = bio_data(bio); - rq->raw_data_len = bio->bi_size; rq->data_len = bio->bi_size; rq->bio = rq->biotail = bio; Index: work/block/blk-map.c =================================================================== --- work.orig/block/blk-map.c +++ work/block/blk-map.c @@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_que rq->biotail->bi_next = bio; rq->biotail = bio; - rq->raw_data_len += bio->bi_size; rq->data_len += bio->bi_size; } return 0; @@ -156,6 +155,7 @@ int blk_rq_map_user(struct request_queue bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len; bio->bi_size += pad_len; rq->data_len += pad_len; + rq->extra_len += pad_len; } rq->buffer = rq->data = NULL; Index: work/block/blk-merge.c =================================================================== --- work.orig/block/blk-merge.c +++ work/block/blk-merge.c @@ -232,6 +232,7 @@ new_segment: (PAGE_SIZE - 1)); nsegs++; rq->data_len += q->dma_drain_size; + rq->extra_len += q->dma_drain_size; } if (sg) Index: work/block/bsg.c =================================================================== --- work.orig/block/bsg.c +++ work/block/bsg.c @@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(stru } if 
(rq->next_rq) { - hdr->dout_resid = rq->raw_data_len; - hdr->din_resid = rq->next_rq->raw_data_len; + hdr->dout_resid = blk_rq_raw_data_len(rq); + hdr->din_resid = blk_rq_raw_data_len(rq->next_rq); blk_rq_unmap_user(bidi_bio); blk_put_request(rq->next_rq); } else if (rq_data_dir(rq) == READ) - hdr->din_resid = rq->raw_data_len; + hdr->din_resid = blk_rq_raw_data_len(rq); else - hdr->dout_resid = rq->raw_data_len; + hdr->dout_resid = blk_rq_raw_data_len(rq); /* * If the request generated a negative error number, return it Index: work/block/scsi_ioctl.c =================================================================== --- work.orig/block/scsi_ioctl.c +++ work/block/scsi_ioctl.c @@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct hdr->info = 0; if (hdr->masked_status || hdr->host_status || hdr->driver_status) hdr->info |= SG_INFO_CHECK; - hdr->resid = rq->raw_data_len; + hdr->resid = blk_rq_raw_data_len(rq); hdr->sb_len_wr = 0; if (rq->sense_len && hdr->sbp) { @@ -528,8 +528,8 @@ static int __blk_send_generic(struct req rq = blk_get_request(q, WRITE, __GFP_WAIT); rq->cmd_type = REQ_TYPE_BLOCK_PC; rq->data = NULL; - rq->raw_data_len = 0; rq->data_len = 0; + rq->extra_len = 0; rq->timeout = BLK_DEFAULT_SG_TIMEOUT; memset(rq->cmd, 0, sizeof(rq->cmd)); rq->cmd[0] = cmd; Index: work/drivers/ata/libata-scsi.c =================================================================== --- work.orig/drivers/ata/libata-scsi.c +++ work/drivers/ata/libata-scsi.c @@ -2549,7 +2549,8 @@ static unsigned int atapi_xlat(struct at * want to set it properly, and for DMA where it is * effectively meaningless. 
*/ - nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024); + nbytes = min(blk_rq_raw_data_len(scmd->request), + (unsigned int)63 * 1024); /* Most ATAPI devices which honor transfer chunk size don't * behave according to the spec when odd chunk size which Index: work/include/linux/blkdev.h =================================================================== --- work.orig/include/linux/blkdev.h +++ work/include/linux/blkdev.h @@ -216,8 +216,8 @@ struct request { unsigned int cmd_len; unsigned char cmd[BLK_MAX_CDB]; - unsigned int raw_data_len; unsigned int data_len; + unsigned int extra_len; /* length of padding and draining buffers */ unsigned int sense_len; void *data; void *sense; @@ -477,6 +477,12 @@ enum { #define rq_data_dir(rq) ((rq)->cmd_flags & 1) +/* data_len of the request sans extra stuff for padding and draining */ +static inline unsigned int blk_rq_raw_data_len(struct request *rq) +{ + return rq->data_len - min(rq->extra_len, rq->data_len); +} + /* * We regard a request as sync, if it's a READ or a SYNC write. */ Index: work/block/blk-settings.c =================================================================== --- work.orig/block/blk-settings.c +++ work/block/blk-settings.c @@ -309,6 +309,10 @@ EXPORT_SYMBOL(blk_queue_stack_limits); * does is adjust the queue so that the buf is always appended * silently to the scatterlist. * + * Appending draining buffer to a request modifies ->data_len such + * that it includes the drain buffer. The original requested data + * length can be obtained using blk_rq_raw_data_len(). + * * Note: This routine adjusts max_hw_segments to make room for * appending the drain buffer. If you call * blk_queue_max_hw_segments() or blk_queue_max_phys_segments() after ^ permalink raw reply [flat|nested] 109+ messages in thread
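Editorial sketch of how the helper in the patch above behaves, assuming a userspace mock: the arithmetic in blk_rq_raw_data_len() is copied from the blkdev.h hunk, while struct mock_request, min_uint() and mock_set_resid() are hypothetical names introduced only for illustration. It shows the same helper serving both before completion (data_len includes the extra bytes) and after a driver has overwritten data_len with a residual count.

```c
#include <assert.h>

/* Hypothetical mock of the two struct request fields the patch touches. */
struct mock_request {
	unsigned int data_len;   /* includes padding/drain; drivers store resid here */
	unsigned int extra_len;  /* length of padding and draining buffers */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Same arithmetic as the helper added to include/linux/blkdev.h. */
static unsigned int blk_rq_raw_data_len(const struct mock_request *rq)
{
	return rq->data_len - min_uint(rq->extra_len, rq->data_len);
}

/* Stand-in for a low-level driver reporting a residual via data_len. */
static void mock_set_resid(struct mock_request *rq, unsigned int resid)
{
	assert(resid <= rq->data_len);
	rq->data_len = resid;
}
```

The min() clamp is what keeps the subtraction from wrapping around when the reported residual is smaller than extra_len; in that case the raw residual is simply 0.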
* [PATCH] block: separate out padding from alignment 2008-03-03 6:08 ` [PATCH 1/2] " Tejun Heo @ 2008-03-03 6:10 ` Tejun Heo 2008-03-03 18:27 ` James Bottomley 0 siblings, 1 reply; 109+ messages in thread From: Tejun Heo @ 2008-03-03 6:10 UTC (permalink / raw) To: FUJITA Tomonori Cc: tomof, jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik Block layer alignment was used for two different purposes - memory alignment and padding. This causes problems in lower layers because drivers which only require memory alignment end up with adjusted rq->data_len. Separate out padding such that padding occurs iff driver explicitly requests it. Signed-off-by: Tejun Heo <htejun@gmail.com> --- As I wrote before, the major problem was that drivers which don't want size adjustment got it accidentally by mixing up aligning and padding which are two conceptually separate things. Let padding occur iff the driver explicitly requested it. This makes both parties happy. block/blk-map.c | 16 +++++++++------- block/blk-settings.c | 17 +++++++++++++++++ drivers/ata/libata-scsi.c | 3 ++- include/linux/blkdev.h | 2 ++ 4 files changed, 30 insertions(+), 8 deletions(-) Index: work/block/blk-settings.c =================================================================== --- work.orig/block/blk-settings.c +++ work/block/blk-settings.c @@ -293,6 +293,23 @@ void blk_queue_stack_limits(struct reque EXPORT_SYMBOL(blk_queue_stack_limits); /** + * blk_queue_dma_pad - set pad mask + * @q: the request queue for the device + * @mask: pad mask + * + * Set pad mask. Direct IO requests are padded to the mask specified. + * + * Appending pad buffer to a request modifies ->data_len such that it + * includes the pad buffer. The original requested data length can be + * obtained using blk_rq_raw_data_len(). 
+ **/ +void blk_queue_dma_pad(struct request_queue *q, unsigned int mask) +{ + q->dma_pad_mask = mask; +} +EXPORT_SYMBOL(blk_queue_dma_pad); + +/** * blk_queue_dma_drain - Set up a drain buffer for excess dma. * * @q: the request queue for the device Index: work/block/blk-map.c =================================================================== --- work.orig/block/blk-map.c +++ work/block/blk-map.c @@ -43,6 +43,7 @@ static int __blk_rq_map_user(struct requ void __user *ubuf, unsigned int len) { unsigned long uaddr; + unsigned int alignment; struct bio *bio, *orig_bio; int reading, ret; @@ -53,8 +54,8 @@ static int __blk_rq_map_user(struct requ * direct dma. else, set up kernel bounce buffers */ uaddr = (unsigned long) ubuf; - if (!(uaddr & queue_dma_alignment(q)) && - !(len & queue_dma_alignment(q))) + alignment = queue_dma_alignment(q) | q->dma_pad_mask; + if (!(uaddr & alignment) && !(len & alignment)) bio = bio_map_user(q, NULL, uaddr, len, reading); else bio = bio_copy_user(q, uaddr, len, reading); @@ -141,15 +142,16 @@ int blk_rq_map_user(struct request_queue /* * __blk_rq_map_user() copies the buffers if starting address - * or length isn't aligned. As the copied buffer is always - * page aligned, we know that there's enough room for padding. - * Extend the last bio and update rq->data_len accordingly. + * or length isn't aligned to dma_pad_mask. As the copied + * buffer is always page aligned, we know that there's enough + * room for padding. Extend the last bio and update + * rq->data_len accordingly. * * On unmap, bio_uncopy_user() will use unmodified * bio_map_data pointed to by bio->bi_private. 
*/ - if (len & queue_dma_alignment(q)) { - unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1; + if (len & q->dma_pad_mask) { + unsigned int pad_len = (q->dma_pad_mask & ~len) + 1; struct bio *bio = rq->biotail; bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len; Index: work/include/linux/blkdev.h =================================================================== --- work.orig/include/linux/blkdev.h +++ work/include/linux/blkdev.h @@ -362,6 +362,7 @@ struct request_queue unsigned long seg_boundary_mask; void *dma_drain_buffer; unsigned int dma_drain_size; + unsigned int dma_pad_mask; unsigned int dma_alignment; struct blk_queue_tag *queue_tags; @@ -707,6 +708,7 @@ extern void blk_queue_max_hw_segments(st extern void blk_queue_max_segment_size(struct request_queue *, unsigned int); extern void blk_queue_hardsect_size(struct request_queue *, unsigned short); extern void blk_queue_stack_limits(struct request_queue *t, struct request_queue *b); +extern void blk_queue_dma_pad(struct request_queue *, unsigned int); extern int blk_queue_dma_drain(struct request_queue *q, dma_drain_needed_fn *dma_drain_needed, void *buf, unsigned int size); Index: work/drivers/ata/libata-scsi.c =================================================================== --- work.orig/drivers/ata/libata-scsi.c +++ work/drivers/ata/libata-scsi.c @@ -862,9 +862,10 @@ static int ata_scsi_dev_config(struct sc struct request_queue *q = sdev->request_queue; void *buf; - /* set the min alignment */ + /* set the min alignment and padding */ blk_queue_update_dma_alignment(sdev->request_queue, ATA_DMA_PAD_SZ - 1); + blk_queue_dma_pad(sdev->request_queue, ATA_DMA_PAD_SZ - 1); /* configure draining */ buf = kmalloc(ATAPI_MAX_DRAIN, q->bounce_gfp | GFP_KERNEL); ^ permalink raw reply [flat|nested] 109+ messages in thread
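Editorial sketch of the two decisions the patch above separates, assuming a userspace mock: whether a user buffer can be mapped directly, and how many pad bytes get appended. The mask arithmetic follows the blk-map.c hunks; struct mock_queue, can_map_directly() and pad_len_for() are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mock of the two request_queue fields involved. */
struct mock_queue {
	unsigned int dma_alignment; /* alignment mask, e.g. 511 */
	unsigned int dma_pad_mask;  /* pad mask; 0 means the driver asked for no padding */
};

/*
 * Mirrors the check in __blk_rq_map_user(): map the user buffer directly
 * only if both its start address and its length pass the combined mask;
 * otherwise the data goes through a kernel bounce buffer.
 */
static bool can_map_directly(const struct mock_queue *q,
			     unsigned long uaddr, unsigned int len)
{
	unsigned int alignment = q->dma_alignment | q->dma_pad_mask;

	return !(uaddr & alignment) && !(len & alignment);
}

/*
 * Mirrors the padding step in blk_rq_map_user(): bytes needed to round
 * len up to the next multiple of (dma_pad_mask + 1), or 0 if len is
 * already a multiple. The mask must be of the form 2^n - 1.
 */
static unsigned int pad_len_for(const struct mock_queue *q, unsigned int len)
{
	assert((q->dma_pad_mask & (q->dma_pad_mask + 1)) == 0);
	if (len & q->dma_pad_mask)
		return (q->dma_pad_mask & ~len) + 1;
	return 0;
}
```

A queue that sets only dma_alignment still bounces odd-sized buffers but never sees its length grown, while a queue that also calls blk_queue_dma_pad(), as libata does in the patch, gets the length rounded up.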
* Re: [PATCH] block: separate out padding from alignment 2008-03-03 6:10 ` [PATCH] block: separate out padding from alignment Tejun Heo @ 2008-03-03 18:27 ` James Bottomley 0 siblings, 0 replies; 109+ messages in thread From: James Bottomley @ 2008-03-03 18:27 UTC (permalink / raw) To: Tejun Heo Cc: FUJITA Tomonori, tomof, jens.axboe, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik On Mon, 2008-03-03 at 15:10 +0900, Tejun Heo wrote: > Block layer alignment was used for two different purposes - memory > alignment and padding. This causes problems in lower layers because > drivers which only require memory alignment ends up with adjusted > rq->data_len. Separate out padding such that padding occurs iff > driver explicitly requests it. This puts the libsas SMP handler back into a working state again. Thanks, James ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-03 4:09 ` Tejun Heo 2008-03-03 6:08 ` [PATCH 1/2] " Tejun Heo @ 2008-03-03 8:26 ` FUJITA Tomonori 2008-03-03 9:21 ` Tejun Heo 1 sibling, 1 reply; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-03 8:26 UTC (permalink / raw) To: htejun Cc: fujita.tomonori, tomof, jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik On Mon, 03 Mar 2008 13:09:13 +0900 Tejun Heo <htejun@gmail.com> wrote: > FUJITA Tomonori wrote: > >> - I think bugs caused by using raw_data_len instead of data_len are more > >> subtle than the other way around. Using data_len instead of > >> raw_data_len usually affects the application layer while using > >> raw_data_len instead of data_len affects the DMA engine and transport layer. > > > > If we add extra_len, we can get what raw_data_len and data_len > > provide. > > > > I can't see what changing the meaning of rq->data_len (and > > investigating all the block drivers) gives us. > > No matter which way you go, you change the meaning of rq->data_len and > you MUST inspect rq->data_len usage whichever way you go. The patch doesn't change that rq->data_len means the true data length. But yeah, it breaks rq->data_len == sum(sg). So it might break some drivers. > Apply your patch and try to do sg IO on IDE cdrom w/ various > transfer lengths. I've just tried the patch with both ata and libata and it seems to work. For anyone hitting this problem, please try the following patch: http://lkml.org/lkml/2008/3/2/218 Thanks, ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-03 8:26 ` [PATCH] block: fix residual byte count handling FUJITA Tomonori @ 2008-03-03 9:21 ` Tejun Heo 2008-03-03 12:17 ` FUJITA Tomonori 0 siblings, 1 reply; 109+ messages in thread From: Tejun Heo @ 2008-03-03 9:21 UTC (permalink / raw) To: FUJITA Tomonori Cc: tomof, jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik FUJITA Tomonori wrote: >>> I can't see what changing the meaning of rq->data_len (and >>> investigating all the block drivers) gives us. >> No matter which way you go, you change the meaning of rq->data_len and >> you MUST inspect rq->data_len usage whichever way you go. > > The patch doens't change that rq->data_len means the true data > length. But yeah, it breaks rq->data_len == sum(sg). So it might break > some drivers. Yeah, that's what I was saying. You end up breaking one of the two assumptions. As sglist is getting modified for any driver if it has DMA alignment set, whether rq->data_len is adjusted together or not, sglist and data_len usages have to be audited. >> Apply your patch and try to do sg IO on IDE cdrom w/ various >> transfer lengths. > > I've just tried the patch with both ata and libata and it seems to > work. Right, I missed you added extra_len in libata and IDE isn't using block layer stuff yet. > For anyone hitting this problem, please try the following patch: > > http://lkml.org/lkml/2008/3/2/218 Whether rq->data_len stays with requested data buffer size or sum(sg), I think we need to separate out padding from address alignment; otherwise, we'll have to audit every block driver to make sure they can deal with extended sglist no matter which value rq->data_len ends up indicating. If padding is applied iff explicitly requested, rq->data_len indicates matters only to the drivers which want to see the data length adjusted, so most of the problems go away. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-03 9:21 ` Tejun Heo @ 2008-03-03 12:17 ` FUJITA Tomonori 2008-03-03 13:38 ` Tejun Heo 0 siblings, 1 reply; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-03 12:17 UTC (permalink / raw) To: htejun Cc: fujita.tomonori, tomof, jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, fujita.tomonori On Mon, 03 Mar 2008 18:21:13 +0900 Tejun Heo <htejun@gmail.com> wrote: > FUJITA Tomonori wrote: > >>> I can't see what changing the meaning of rq->data_len (and > >>> investigating all the block drivers) gives us. > >> No matter which way you go, you change the meaning of rq->data_len and > >> you MUST inspect rq->data_len usage whichever way you go. > > > > The patch doens't change that rq->data_len means the true data > > length. But yeah, it breaks rq->data_len == sum(sg). So it might break > > some drivers. > > Yeah, that's what I was saying. You end up breaking one of the two > assumptions. As sglist is getting modified for any driver if it has DMA > alignment set, whether rq->data_len is adjusted together or not, sglist > and data_len usages have to be audited. My patch (well, James' original approach) doesn't affect drivers that don't use drain buffer. rq->data_len still means the true data length and rq->data_len is equal to sum(sg) for them. So right now we need to audit only libata. But your patch changes the meaning of rq->data_len. It affects all the drivers. So it breaks non libata stuff, like the SMP handler. We need to audit all the drivers. ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-03 12:17 ` FUJITA Tomonori @ 2008-03-03 13:38 ` Tejun Heo 2008-03-03 13:50 ` FUJITA Tomonori 0 siblings, 1 reply; 109+ messages in thread From: Tejun Heo @ 2008-03-03 13:38 UTC (permalink / raw) To: FUJITA Tomonori Cc: fujita.tomonori, jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik FUJITA Tomonori wrote: > On Mon, 03 Mar 2008 18:21:13 +0900 > Tejun Heo <htejun@gmail.com> wrote: > >> FUJITA Tomonori wrote: >>>>> I can't see what changing the meaning of rq->data_len (and >>>>> investigating all the block drivers) gives us. >>>> No matter which way you go, you change the meaning of rq->data_len and >>>> you MUST inspect rq->data_len usage whichever way you go. >>> The patch doens't change that rq->data_len means the true data >>> length. But yeah, it breaks rq->data_len == sum(sg). So it might break >>> some drivers. >> Yeah, that's what I was saying. You end up breaking one of the two >> assumptions. As sglist is getting modified for any driver if it has DMA >> alignment set, whether rq->data_len is adjusted together or not, sglist >> and data_len usages have to be audited. > > My patch (well, James' original approach) doesn't affect drivers that > don't use drain buffer. rq->data_len still means the true data length > and rq->data_len is equal to sum(sg) for them. So right now we need to > audit only libata. Your patch does change sglist for any driver which sets DMA alignment. You'll definitely need to audit more than libata. > But your patch changes the meaning of rq->data_len. It affects all the > drivers. So it breaks non libata stuff, like the SMP handler. We need > to audit all the drivers. With both patches applied, sglist and data_len are adjusted only for libata, so only drivers which explicitly requested buffer size manipulation (currently only libata) need to be audited / updated. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-03 13:38 ` Tejun Heo @ 2008-03-03 13:50 ` FUJITA Tomonori 2008-03-03 13:55 ` Tejun Heo 0 siblings, 1 reply; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-03 13:50 UTC (permalink / raw) To: htejun Cc: tomof, fujita.tomonori, jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, fujita.tomonori On Mon, 03 Mar 2008 22:38:55 +0900 Tejun Heo <htejun@gmail.com> wrote: > FUJITA Tomonori wrote: > > On Mon, 03 Mar 2008 18:21:13 +0900 > > Tejun Heo <htejun@gmail.com> wrote: > > > >> FUJITA Tomonori wrote: > >>>>> I can't see what changing the meaning of rq->data_len (and > >>>>> investigating all the block drivers) gives us. > >>>> No matter which way you go, you change the meaning of rq->data_len and > >>>> you MUST inspect rq->data_len usage whichever way you go. > >>> The patch doens't change that rq->data_len means the true data > >>> length. But yeah, it breaks rq->data_len == sum(sg). So it might break > >>> some drivers. > >> Yeah, that's what I was saying. You end up breaking one of the two > >> assumptions. As sglist is getting modified for any driver if it has DMA > >> alignment set, whether rq->data_len is adjusted together or not, sglist > >> and data_len usages have to be audited. > > > > My patch (well, James' original approach) doesn't affect drivers that > > don't use drain buffer. rq->data_len still means the true data length > > and rq->data_len is equal to sum(sg) for them. So right now we need to > > audit only libata. > > Your patch does change sglist for any driver which sets DMA alignment. I overlooked it. Where does it change the sglist? ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-03 13:50 ` FUJITA Tomonori @ 2008-03-03 13:55 ` Tejun Heo 2008-03-03 14:01 ` FUJITA Tomonori 0 siblings, 1 reply; 109+ messages in thread From: Tejun Heo @ 2008-03-03 13:55 UTC (permalink / raw) To: FUJITA Tomonori Cc: fujita.tomonori, jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik FUJITA Tomonori wrote: >>>> FUJITA Tomonori wrote: >>>>>>> I can't see what changing the meaning of rq->data_len (and >>>>>>> investigating all the block drivers) gives us. >>>>>> No matter which way you go, you change the meaning of rq->data_len and >>>>>> you MUST inspect rq->data_len usage whichever way you go. >>>>> The patch doens't change that rq->data_len means the true data >>>>> length. But yeah, it breaks rq->data_len == sum(sg). So it might break >>>>> some drivers. >>>> Yeah, that's what I was saying. You end up breaking one of the two >>>> assumptions. As sglist is getting modified for any driver if it has DMA >>>> alignment set, whether rq->data_len is adjusted together or not, sglist >>>> and data_len usages have to be audited. >>> My patch (well, James' original approach) doesn't affect drivers that >>> don't use drain buffer. rq->data_len still means the true data length >>> and rq->data_len is equal to sum(sg) for them. So right now we need to >>> audit only libata. >> Your patch does change sglist for any driver which sets DMA alignment. > > I overlook it. Where does it changes sglist? At the end of blk_rq_map_user() together with data_len / extra_len mangling or were you talking about James' original patch? -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-03 13:55 ` Tejun Heo @ 2008-03-03 14:01 ` FUJITA Tomonori 2008-03-03 14:22 ` Tejun Heo 0 siblings, 1 reply; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-03 14:01 UTC (permalink / raw) To: htejun Cc: tomof, fujita.tomonori, jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, fujita.tomonori On Mon, 03 Mar 2008 22:55:56 +0900 Tejun Heo <htejun@gmail.com> wrote: > FUJITA Tomonori wrote: > >>>> FUJITA Tomonori wrote: > >>>>>>> I can't see what changing the meaning of rq->data_len (and > >>>>>>> investigating all the block drivers) gives us. > >>>>>> No matter which way you go, you change the meaning of rq->data_len and > >>>>>> you MUST inspect rq->data_len usage whichever way you go. > >>>>> The patch doens't change that rq->data_len means the true data > >>>>> length. But yeah, it breaks rq->data_len == sum(sg). So it might break > >>>>> some drivers. > >>>> Yeah, that's what I was saying. You end up breaking one of the two > >>>> assumptions. As sglist is getting modified for any driver if it has DMA > >>>> alignment set, whether rq->data_len is adjusted together or not, sglist > >>>> and data_len usages have to be audited. > >>> My patch (well, James' original approach) doesn't affect drivers that > >>> don't use drain buffer. rq->data_len still means the true data length > >>> and rq->data_len is equal to sum(sg) for them. So right now we need to > >>> audit only libata. > >> Your patch does change sglist for any driver which sets DMA alignment. > > > > I overlook it. Where does it changes sglist? > > At the end of blk_rq_map_user() together with data_len / extra_len > mangling or were you talking about James' original patch? With my patch, at the end of blk_rq_map_user, we have: if (len & queue_dma_alignment(q)) { unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1; rq->extra_len += pad_len; } So no change as compared with 2.6.24? 
^ permalink raw reply [flat|nested] 109+ messages in thread
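The pad_len expression quoted in the message above can be checked in isolation. The following is a sketch in plain C, not kernel code; toy_pad_len() is a hypothetical helper, and the mask is assumed to have the form 2^n - 1 (the libata hunk earlier in the thread passes ATA_DMA_PAD_SZ - 1). The expression yields the number of bytes needed to round len up to the next multiple of mask + 1.

```c
#include <assert.h>

/*
 * Bytes needed to round len up to the next multiple of (mask + 1).
 * Mirrors the expression from the thread; toy_pad_len() itself is a
 * hypothetical helper, and mask is assumed to be of the form 2^n - 1.
 */
static unsigned int toy_pad_len(unsigned int len, unsigned int mask)
{
	assert((mask & (mask + 1)) == 0);

	if (len & mask)
		return (mask & ~len) + 1;
	return 0;	/* already a multiple, nothing to add */
}
```

For example, with mask = 3 and len = 5: ~5 & 3 is 2, plus 1 gives 3, so the padded length is 8. A len of 8 takes the second branch and needs no padding.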
* Re: [PATCH] block: fix residual byte count handling 2008-03-03 14:01 ` FUJITA Tomonori @ 2008-03-03 14:22 ` Tejun Heo 2008-03-03 14:52 ` FUJITA Tomonori 0 siblings, 1 reply; 109+ messages in thread From: Tejun Heo @ 2008-03-03 14:22 UTC (permalink / raw) To: FUJITA Tomonori Cc: fujita.tomonori, jens.axboe, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik FUJITA Tomonori wrote: >> At the end of blk_rq_map_user() together with data_len / extra_len >> mangling or were you talking about James' original patch? > > With my patch, at the end of blk_rq_map_user, we have: > > if (len & queue_dma_alignment(q)) { > unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1; > > rq->extra_len += pad_len; > } > > > So no change as compared with 2.6.24? Oh.. you killed sg list manipulation. Many controllers do allow odd bytes as the last sg entry but not all. Also, if you append drain buffer after it, it ends up with unaligned sg entry in the middle and rq->data_len + rq->extra_len will overrun the sg entry after the drain page which is really dangerous. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-03 14:22 ` Tejun Heo @ 2008-03-03 14:52 ` FUJITA Tomonori 2008-03-03 22:44 ` Tejun Heo 0 siblings, 1 reply; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-03 14:52 UTC (permalink / raw) To: htejun, jens.axboe Cc: tomof, fujita.tomonori, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik On Mon, 03 Mar 2008 23:22:46 +0900 Tejun Heo <htejun@gmail.com> wrote: > FUJITA Tomonori wrote: > >> At the end of blk_rq_map_user() together with data_len / extra_len > >> mangling or were you talking about James' original patch? > > > > With my patch, at the end of blk_rq_map_user, we have: > > > > if (len & queue_dma_alignment(q)) { > > unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1; > > > > rq->extra_len += pad_len; > > } > > > > > > So no change as compared with 2.6.24? > > Oh.. you killed sg list manipulation. Many controllers do allow odd > bytes as the last sg entry but not all. Also, if you append drain Until 2.6.24, these drivers have taken care of the issue by themselves. There is no change as compared with 2.6.24. > buffer after it, it ends up with unaligned sg entry in the middle and > rq->data_len + rq->extra_len will overrun the sg entry after the drain > page which is really dangerous. The drivers know that they use drain buffer. They can take care of this themselves too. If we want to do this explicitly, we could have rq->pad_len and rq->drain_len instead of rq->extra_len, though I think that we are fine without these values because these drivers already tell the block layer what they want and know that the block layer gives it. Jens, what's your verdict on this? ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-03 14:52 ` FUJITA Tomonori @ 2008-03-03 22:44 ` Tejun Heo 2008-03-04 2:11 ` FUJITA Tomonori 0 siblings, 1 reply; 109+ messages in thread From: Tejun Heo @ 2008-03-03 22:44 UTC (permalink / raw) To: FUJITA Tomonori Cc: jens.axboe, fujita.tomonori, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik FUJITA Tomonori wrote: > On Mon, 03 Mar 2008 23:22:46 +0900 > Tejun Heo <htejun@gmail.com> wrote: > >> FUJITA Tomonori wrote: >>>> At the end of blk_rq_map_user() together with data_len / extra_len >>>> mangling or were you talking about James' original patch? >>> With my patch, at the end of blk_rq_map_user, we have: >>> >>> if (len & queue_dma_alignment(q)) { >>> unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1; >>> >>> rq->extra_len += pad_len; >>> } >>> >>> >>> So no change as compared with 2.6.24? >> Oh.. you killed sg list manipulation. Many controllers do allow odd >> bytes as the last sg entry but not all. Also, if you append drain > > Until 2.6.24, these drivers have taken care about the issue by > themselves. There is no change as compared with 2.6.24. Yeah, libata did its own padding and needed to add draining. Private implementation was complex as hell and James suggested moving them to block layer. Are you suggesting moving them back to drivers? >> buffer after it, it ends up with unaligned sg entry in the middle and >> rq->data_len + rq->extra_len will overrun the sg entry after the drain >> page which is really dangerous. > > The drivers know that they use drain buffer. They can take care about > themselves on this too. If we want to do explicitly, we could have > rq->pad_len and rq->drain_len instead of rq->extra_len, though I think > that we are fine without these values because these drivers already > tell the block layer what they want and know that the block layer > gives it. 
So, if a driver has requested aligning and draining, the driver should extend the sg entry before the last one by the alignment if draining was used for the request and extend the last sg if the draining wasn't used. I'd rather just implement them in the drivers. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-03 22:44 ` Tejun Heo @ 2008-03-04 2:11 ` FUJITA Tomonori 2008-03-04 2:32 ` Tejun Heo 0 siblings, 1 reply; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-04 2:11 UTC (permalink / raw) To: htejun Cc: tomof, jens.axboe, fujita.tomonori, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik On Tue, 04 Mar 2008 07:44:13 +0900 Tejun Heo <htejun@gmail.com> wrote: > FUJITA Tomonori wrote: > > On Mon, 03 Mar 2008 23:22:46 +0900 > > Tejun Heo <htejun@gmail.com> wrote: > > > >> FUJITA Tomonori wrote: > >>>> At the end of blk_rq_map_user() together with data_len / extra_len > >>>> mangling or were you talking about James' original patch? > >>> With my patch, at the end of blk_rq_map_user, we have: > >>> > >>> if (len & queue_dma_alignment(q)) { > >>> unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1; > >>> > >>> rq->extra_len += pad_len; > >>> } > >>> > >>> > >>> So no change as compared with 2.6.24? > >> Oh.. you killed sg list manipulation. Many controllers do allow odd > >> bytes as the last sg entry but not all. Also, if you append drain > > > > Until 2.6.24, these drivers have taken care about the issue by > > themselves. There is no change as compared with 2.6.24. > > Yeah, libata did its own padding and needed to add draining. Private > implementation was complex as hell and James suggested moving them to > block layer. Are you suggesting moving them back to drivers? No, I'm not. I've been working on the IOMMUs to remove such workarounds in LLDs. What drivers need to do on this is just adding a padding length, that is, drivers don't need to change the structure of the sg list (like splitting a sg entry), right? And it doesn't break the SAS drivers that support SATAPI, does it? But I agree that drivers want to get a complete sglist so I'm fine with adjusting sglist entries in the block layer with your second patch (separate out padding from alignment). 
As we discussed, I'm fine with breaking sum(sg) == rq->data_len as long as rq->data_len means the true data length. > >> buffer after it, it ends up with unaligned sg entry in the middle and > >> rq->data_len + rq->extra_len will overrun the sg entry after the drain > >> page which is really dangerous. > > > > The drivers know that they use drain buffer. They can take care about > > themselves on this too. If we want to do explicitly, we could have > > rq->pad_len and rq->drain_len instead of rq->extra_len, though I think > > that we are fine without these values because these drivers already > > tell the block layer what they want and know that the block layer > > gives it. > > So, if a driver has requested aligning and draining, the driver should > extend the sg entry before the last one by the alignment if draining was > used for the request and extent the last sg if the draining wasn't used. > I'd rather just implement them in the drivers. The block layer extends the sg entry? The drivers just adjust sg->length? ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 2:11 ` FUJITA Tomonori @ 2008-03-04 2:32 ` Tejun Heo 2008-03-04 8:53 ` FUJITA Tomonori 0 siblings, 1 reply; 109+ messages in thread From: Tejun Heo @ 2008-03-04 2:32 UTC (permalink / raw) To: FUJITA Tomonori Cc: jens.axboe, fujita.tomonori, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, Bartlomiej Zolnierkiewicz FUJITA Tomonori wrote: >> Yeah, libata did its own padding and needed to add draining. Private >> implementation was complex as hell and James suggested moving them to >> block layer. Are you suggesting moving them back to drivers? > > No, I'm not. I've been working on the IOMMUs to remove such > workarounds in LLDs. > > What drivers need to do on this is just adding a padding length, that > is, drivers don't need to change the structure of the sg list (like > splitting a sg entry), right? And it doesn't break the SAS drivers > that support SATAPI, does it? > > But I agree that drivers want to get a complete sglist so I'm fine > with adjusting sglist entries in the block layer with your secode > patch (separate out padding from alignment). As we discussed, I'm fine > with breaking sum(sg) == rq->data_len as long as rq->data_len means > the true data length. As long as the second patch is in, what value rq->data_len indicates doesn't matter to drivers which don't use explicit padding or draining, so the situation is much more controlled. I don't care which value rq->data_len would indicate. I'd prefer it equal sum(sg) as that value is what IDE and libata which will be the major users of padding and/or draining expect in rq->data_len but fixing up that shouldn't be too difficult. I guess this can be determined by Jens. If Jens likes rq->data_len to contain requested transfer size, I'll post updated patches. 
>>>> buffer after it, it ends up with unaligned sg entry in the middle and >>>> rq->data_len + rq->extra_len will overrun the sg entry after the drain >>>> page which is really dangerous. >>> The drivers know that they use drain buffer. They can take care about >>> themselves on this too. If we want to do explicitly, we could have >>> rq->pad_len and rq->drain_len instead of rq->extra_len, though I think >>> that we are fine without these values because these drivers already >>> tell the block layer what they want and know that the block layer >>> gives it. >> So, if a driver has requested aligning and draining, the driver should >> extend the sg entry before the last one by the alignment if draining was >> used for the request and extent the last sg if the draining wasn't used. >> I'd rather just implement them in the drivers. > > The block layer extends the sg entry? The drivers just adjust > sg->length? Still, do you really wanna force such things into low level drivers? That will be one extremely fragile API and will be really difficult to tell when things go wrong. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 2:32 ` Tejun Heo @ 2008-03-04 8:53 ` FUJITA Tomonori 2008-03-04 8:59 ` Jens Axboe 0 siblings, 1 reply; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-04 8:53 UTC (permalink / raw) To: htejun Cc: tomof, jens.axboe, fujita.tomonori, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, 04 Mar 2008 11:32:56 +0900 Tejun Heo <htejun@gmail.com> wrote: > FUJITA Tomonori wrote: > >> Yeah, libata did its own padding and needed to add draining. Private > >> implementation was complex as hell and James suggested moving them to > >> block layer. Are you suggesting moving them back to drivers? > > > > No, I'm not. I've been working on the IOMMUs to remove such > > workarounds in LLDs. > > > > What drivers need to do on this is just adding a padding length, that > > is, drivers don't need to change the structure of the sg list (like > > splitting a sg entry), right? And it doesn't break the SAS drivers > > that support SATAPI, does it? > > > > But I agree that drivers want to get a complete sglist so I'm fine > > with adjusting sglist entries in the block layer with your secode > > patch (separate out padding from alignment). As we discussed, I'm fine > > with breaking sum(sg) == rq->data_len as long as rq->data_len means > > the true data length. > > As long as the second patch is in, what value rq->data_len indicates > doesn't matter to drivers which don't use explicit padding or draining, > so the situation is much more controlled. I don't care which value > rq->data_len would indicate. I'd prefer it equal sum(sg) as that value > is what IDE and libata which will be the major users of padding and/or > draining expect in rq->data_len but fixing up that shouldn't be too > difficult. I guess this can be determined by Jens. If Jens likes > rq->data_len to contain requested transfer size, I'll post updated patches. 
OK, I prefer that rq->data_len means the true data length, though you prefer that it means the allocated buffer length (the true data length plus padding and drain). We agree on other things. We can live with either way. Jens, what's your preference? > >>>> buffer after it, it ends up with unaligned sg entry in the middle and > >>>> rq->data_len + rq->extra_len will overrun the sg entry after the drain > >>>> page which is really dangerous. > >>> The drivers know that they use drain buffer. They can take care about > >>> themselves on this too. If we want to do explicitly, we could have > >>> rq->pad_len and rq->drain_len instead of rq->extra_len, though I think > >>> that we are fine without these values because these drivers already > >>> tell the block layer what they want and know that the block layer > >>> gives it. > >> So, if a driver has requested aligning and draining, the driver should > >> extend the sg entry before the last one by the alignment if draining was > >> used for the request and extent the last sg if the draining wasn't used. > >> I'd rather just implement them in the drivers. > > > > The block layer extends the sg entry? The drivers just adjust > > sg->length? > > Still, do you really wanna force such things into low level drivers? > That will be one extremely fragile API and will be really difficult to > tell when things go wrong. No, I don't, as I explained above. As long as rq->data_len means the true data length, I'm fine. I knew that James' drain buffer patch breaks rq->data_len == sum(sg). I don't care about it. I can understand that drivers want a perfect sglist. ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 8:53 ` FUJITA Tomonori @ 2008-03-04 8:59 ` Jens Axboe 2008-03-04 9:06 ` FUJITA Tomonori 2008-03-04 9:29 ` [PATCH] block: fix residual byte count handling Tejun Heo 0 siblings, 2 replies; 109+ messages in thread From: Jens Axboe @ 2008-03-04 8:59 UTC (permalink / raw) To: FUJITA Tomonori Cc: htejun, tomof, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, Mar 04 2008, FUJITA Tomonori wrote: > On Tue, 04 Mar 2008 11:32:56 +0900 > Tejun Heo <htejun@gmail.com> wrote: > > > FUJITA Tomonori wrote: > > >> Yeah, libata did its own padding and needed to add draining. Private > > >> implementation was complex as hell and James suggested moving them to > > >> block layer. Are you suggesting moving them back to drivers? > > > > > > No, I'm not. I've been working on the IOMMUs to remove such > > > workarounds in LLDs. > > > > > > What drivers need to do on this is just adding a padding length, that > > > is, drivers don't need to change the structure of the sg list (like > > > splitting a sg entry), right? And it doesn't break the SAS drivers > > > that support SATAPI, does it? > > > > > > But I agree that drivers want to get a complete sglist so I'm fine > > > with adjusting sglist entries in the block layer with your secode > > > patch (separate out padding from alignment). As we discussed, I'm fine > > > with breaking sum(sg) == rq->data_len as long as rq->data_len means > > > the true data length. > > > > As long as the second patch is in, what value rq->data_len indicates > > doesn't matter to drivers which don't use explicit padding or draining, > > so the situation is much more controlled. I don't care which value > > rq->data_len would indicate. I'd prefer it equal sum(sg) as that value > > is what IDE and libata which will be the major users of padding and/or > > draining expect in rq->data_len but fixing up that shouldn't be too > > difficult. 
I guess this can be determined by Jens. If Jens likes > > rq->data_len to contain requested transfer size, I'll post updated patches. > > OK, I prefer rq->data_len means the true data length though you prefer > rq->data_len means the allocated buffer length (the true data length > plus padding and drain). We agree on other things. We can live with > either way. > > Jens, what's your preference? I completely agree with you, ->data_len meaning true data length is way cleaner imho. Only the driver should care for the padded length, all other parts of the kernel only need to know what they actually got. -- Jens Axboe ^ permalink raw reply [flat|nested] 109+ messages in thread
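The convention agreed on above (data_len carries the true data length, and only the driver looks at the padded length) can be illustrated with a small sketch. This is plain C under stated assumptions, not the code of either patch: struct toy_request, toy_dma_len() and toy_residual() are hypothetical names, and extra_len stands for whatever padding plus drain the block layer added for the device.

```c
#include <assert.h>

/* Hypothetical, reduced view of the two length fields discussed above. */
struct toy_request {
	unsigned int data_len;	/* true data length requested by the caller */
	unsigned int extra_len;	/* padding plus drain added for the device */
};

/* Length a driver would program into its DMA engine. */
static unsigned int toy_dma_len(const struct toy_request *rq)
{
	return rq->data_len + rq->extra_len;
}

/*
 * Residual to report upward, given how many bytes the device moved.
 * Bytes beyond data_len landed in padding or drain space and are not
 * counted as payload.
 */
static unsigned int toy_residual(const struct toy_request *rq,
				 unsigned int transferred)
{
	assert(transferred <= toy_dma_len(rq));

	if (transferred >= rq->data_len)
		return 0;
	return rq->data_len - transferred;
}
```

The point of the split is that callers such as SG_IO or bsg would only ever see values derived from data_len, while the sum of both fields stays private to the driver that asked for padding or draining.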
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 8:59 ` Jens Axboe @ 2008-03-04 9:06 ` FUJITA Tomonori 2008-03-04 9:22 ` FUJITA Tomonori 2008-03-04 9:29 ` [PATCH] block: fix residual byte count handling Tejun Heo 1 sibling, 1 reply; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-04 9:06 UTC (permalink / raw) To: jens.axboe Cc: fujita.tomonori, htejun, tomof, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, 4 Mar 2008 09:59:46 +0100 Jens Axboe <jens.axboe@oracle.com> wrote: > On Tue, Mar 04 2008, FUJITA Tomonori wrote: > > On Tue, 04 Mar 2008 11:32:56 +0900 > > Tejun Heo <htejun@gmail.com> wrote: > > > > > FUJITA Tomonori wrote: > > > >> Yeah, libata did its own padding and needed to add draining. Private > > > >> implementation was complex as hell and James suggested moving them to > > > >> block layer. Are you suggesting moving them back to drivers? > > > > > > > > No, I'm not. I've been working on the IOMMUs to remove such > > > > workarounds in LLDs. > > > > > > > > What drivers need to do on this is just adding a padding length, that > > > > is, drivers don't need to change the structure of the sg list (like > > > > splitting a sg entry), right? And it doesn't break the SAS drivers > > > > that support SATAPI, does it? > > > > > > > > But I agree that drivers want to get a complete sglist so I'm fine > > > > with adjusting sglist entries in the block layer with your secode > > > > patch (separate out padding from alignment). As we discussed, I'm fine > > > > with breaking sum(sg) == rq->data_len as long as rq->data_len means > > > > the true data length. > > > > > > As long as the second patch is in, what value rq->data_len indicates > > > doesn't matter to drivers which don't use explicit padding or draining, > > > so the situation is much more controlled. I don't care which value > > > rq->data_len would indicate. 
I'd prefer it equal sum(sg) as that value
> > > is what IDE and libata which will be the major users of padding and/or
> > > draining expect in rq->data_len but fixing up that shouldn't be too
> > > difficult. I guess this can be determined by Jens. If Jens likes
> > > rq->data_len to contain requested transfer size, I'll post updated patches.
> >
> > OK, I prefer rq->data_len means the true data length though you prefer
> > rq->data_len means the allocated buffer length (the true data length
> > plus padding and drain). We agree on other things. We can live with
> > either way.
> >
> > Jens, what's your preference?
>
> I completely agree with you, ->data_len meaning true data length is way
> cleaner imho. Only the driver should care for the padded length, all
> other parts of the kernel only need to know what they actually got.

OK, now we can fix the whole SG_IO (and bsg handler) mess.

Here's my patch with a proper description, which several people have
already tested (thanks!). Then we need an updated version of Tejun's
separate out padding from alignment patch.

=
From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Subject: [PATCH] block: restore the meaning of rq->data_len to the true data length

The meaning of rq->data_len was changed to the length of an allocated
buffer from the true data length. It breaks SG_IO friends and bsg.
This patch restores the meaning of rq->data_len to the true data
length and adds rq->extra_len to store an extended length (due to
drain buffer and padding).

This patch also removes the code to update bio in blk_rq_map_user
introduced by the commit 40b01b9bbdf51ae543a04744283bf2d56c4a6afa.
The commit adjusts bio according to memory alignment
(queue_dma_alignment). However, memory alignment is NOT padding
alignment. This adjustment also breaks SG_IO friends and bsg. Padding
alignment needs to be fixed in a proper way (by a separate patch).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> --- block/blk-core.c | 3 +-- block/blk-map.c | 6 +----- block/blk-merge.c | 2 +- block/bsg.c | 8 ++++---- block/scsi_ioctl.c | 4 ++-- drivers/ata/libata-scsi.c | 6 +++--- include/linux/blkdev.h | 2 +- 7 files changed, 13 insertions(+), 18 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 775c851..bfec406 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -127,7 +127,6 @@ void rq_init(struct request_queue *q, struct request *rq) rq->nr_hw_segments = 0; rq->ioprio = 0; rq->special = NULL; - rq->raw_data_len = 0; rq->buffer = NULL; rq->tag = -1; rq->errors = 0; @@ -135,6 +134,7 @@ void rq_init(struct request_queue *q, struct request *rq) rq->cmd_len = 0; memset(rq->cmd, 0, sizeof(rq->cmd)); rq->data_len = 0; + rq->extra_len = 0; rq->sense_len = 0; rq->data = NULL; rq->sense = NULL; @@ -2016,7 +2016,6 @@ void blk_rq_bio_prep(struct request_queue *q, struct request *rq, rq->hard_cur_sectors = rq->current_nr_sectors; rq->hard_nr_sectors = rq->nr_sectors = bio_sectors(bio); rq->buffer = bio_data(bio); - rq->raw_data_len = bio->bi_size; rq->data_len = bio->bi_size; rq->bio = rq->biotail = bio; diff --git a/block/blk-map.c b/block/blk-map.c index 09f7fd0..f559832 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -19,7 +19,6 @@ int blk_rq_append_bio(struct request_queue *q, struct request *rq, rq->biotail->bi_next = bio; rq->biotail = bio; - rq->raw_data_len += bio->bi_size; rq->data_len += bio->bi_size; } return 0; @@ -151,11 +150,8 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq, */ if (len & queue_dma_alignment(q)) { unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1; - struct bio *bio = rq->biotail; - bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len; - bio->bi_size += pad_len; - rq->data_len += pad_len; + rq->extra_len += pad_len; } rq->buffer = rq->data = NULL; diff --git a/block/blk-merge.c b/block/blk-merge.c index 7506c4f..0f58616 100644 --- 
a/block/blk-merge.c +++ b/block/blk-merge.c @@ -231,7 +231,7 @@ new_segment: ((unsigned long)q->dma_drain_buffer) & (PAGE_SIZE - 1)); nsegs++; - rq->data_len += q->dma_drain_size; + rq->extra_len += q->dma_drain_size; } if (sg) diff --git a/block/bsg.c b/block/bsg.c index 7f3c095..8917c51 100644 --- a/block/bsg.c +++ b/block/bsg.c @@ -437,14 +437,14 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr, } if (rq->next_rq) { - hdr->dout_resid = rq->raw_data_len; - hdr->din_resid = rq->next_rq->raw_data_len; + hdr->dout_resid = rq->data_len; + hdr->din_resid = rq->next_rq->data_len; blk_rq_unmap_user(bidi_bio); blk_put_request(rq->next_rq); } else if (rq_data_dir(rq) == READ) - hdr->din_resid = rq->raw_data_len; + hdr->din_resid = rq->data_len; else - hdr->dout_resid = rq->raw_data_len; + hdr->dout_resid = rq->data_len; /* * If the request generated a negative error number, return it diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c index e993cac..a2c3a93 100644 --- a/block/scsi_ioctl.c +++ b/block/scsi_ioctl.c @@ -266,7 +266,7 @@ static int blk_complete_sghdr_rq(struct request *rq, struct sg_io_hdr *hdr, hdr->info = 0; if (hdr->masked_status || hdr->host_status || hdr->driver_status) hdr->info |= SG_INFO_CHECK; - hdr->resid = rq->raw_data_len; + hdr->resid = rq->data_len; hdr->sb_len_wr = 0; if (rq->sense_len && hdr->sbp) { @@ -528,8 +528,8 @@ static int __blk_send_generic(struct request_queue *q, struct gendisk *bd_disk, rq = blk_get_request(q, WRITE, __GFP_WAIT); rq->cmd_type = REQ_TYPE_BLOCK_PC; rq->data = NULL; - rq->raw_data_len = 0; rq->data_len = 0; + rq->extra_len = 0; rq->timeout = BLK_DEFAULT_SG_TIMEOUT; memset(rq->cmd, 0, sizeof(rq->cmd)); rq->cmd[0] = cmd; diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index 7b1f1ee..fe47922 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -2538,7 +2538,7 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc) } qc->tf.command = 
ATA_CMD_PACKET; - qc->nbytes = scsi_bufflen(scmd); + qc->nbytes = scsi_bufflen(scmd) + scmd->request->extra_len; /* check whether ATAPI DMA is safe */ if (!using_pio && ata_check_atapi_dma(qc)) @@ -2549,7 +2549,7 @@ static unsigned int atapi_xlat(struct ata_queued_cmd *qc) * want to set it properly, and for DMA where it is * effectively meaningless. */ - nbytes = min(scmd->request->raw_data_len, (unsigned int)63 * 1024); + nbytes = min(scmd->request->data_len, (unsigned int)63 * 1024); /* Most ATAPI devices which honor transfer chunk size don't * behave according to the spec when odd chunk size which @@ -2875,7 +2875,7 @@ static unsigned int ata_scsi_pass_thru(struct ata_queued_cmd *qc) * TODO: find out if we need to do more here to * cover scatter/gather case. */ - qc->nbytes = scsi_bufflen(scmd); + qc->nbytes = scsi_bufflen(scmd) + scmd->request->extra_len; /* request result TF and be quiet about device error */ qc->flags |= ATA_QCFLAG_RESULT_TF | ATA_QCFLAG_QUIET; diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 6fe67d1..b72526c 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -216,8 +216,8 @@ struct request { unsigned int cmd_len; unsigned char cmd[BLK_MAX_CDB]; - unsigned int raw_data_len; unsigned int data_len; + unsigned int extra_len; /* length of alignment and padding */ unsigned int sense_len; void *data; void *sense; -- 1.5.3.6 ^ permalink raw reply related [flat|nested] 109+ messages in thread
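A rough userspace sketch of the length split made by the patch above, for illustration only. It is not kernel code: `struct model_request` and the `model_*` helpers are hypothetical names invented here, and the sketch assumes that `scsi_bufflen()` mirrors `rq->data_len` for these requests. It only shows the bookkeeping: padding and drain go into `extra_len`, the driver's transfer size is the sum of both fields, and the ATAPI byte-count hint is derived from the true length, capped at 63 KiB as in the `atapi_xlat()` hunk.

```c
#include <assert.h>

/* Hypothetical userspace model; NOT the kernel's struct request. */
struct model_request {
	unsigned int data_len;   /* true data length the caller asked for */
	unsigned int extra_len;  /* padding and drain bytes added for the device */
};

/* Account for padding or drain bytes without touching data_len. */
static void model_add_extra(struct model_request *rq, unsigned int bytes)
{
	assert(rq->extra_len + bytes >= rq->extra_len); /* no wraparound */
	rq->extra_len += bytes;
}

/* Size a driver would program for the transfer: data plus padding/drain. */
static unsigned int model_transfer_len(const struct model_request *rq)
{
	return rq->data_len + rq->extra_len;
}

/* ATAPI byte-count hint: based on the true length, capped at 63 KiB. */
static unsigned int model_atapi_byte_limit(const struct model_request *rq)
{
	unsigned int cap = 63u * 1024u;

	return rq->data_len < cap ? rq->data_len : cap;
}
```

In this model, a 2-byte request that picks up 2 bytes of padding and a 16-byte drain yields a 20-byte transfer while `data_len` stays at 2, so anything reported back through SG_IO or bsg is still based on what the caller actually asked for.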
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 9:06 ` FUJITA Tomonori @ 2008-03-04 9:22 ` FUJITA Tomonori 2008-03-04 9:30 ` Tejun Heo 2008-03-04 9:35 ` Jens Axboe 0 siblings, 2 replies; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-04 9:22 UTC (permalink / raw) To: jens.axboe, htejun Cc: tomof, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, 04 Mar 2008 18:06:48 +0900 FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote: > On Tue, 4 Mar 2008 09:59:46 +0100 > Jens Axboe <jens.axboe@oracle.com> wrote: > > > On Tue, Mar 04 2008, FUJITA Tomonori wrote: > > > On Tue, 04 Mar 2008 11:32:56 +0900 > > > Tejun Heo <htejun@gmail.com> wrote: > > > > > > > FUJITA Tomonori wrote: > > > > >> Yeah, libata did its own padding and needed to add draining. Private > > > > >> implementation was complex as hell and James suggested moving them to > > > > >> block layer. Are you suggesting moving them back to drivers? > > > > > > > > > > No, I'm not. I've been working on the IOMMUs to remove such > > > > > workarounds in LLDs. > > > > > > > > > > What drivers need to do on this is just adding a padding length, that > > > > > is, drivers don't need to change the structure of the sg list (like > > > > > splitting a sg entry), right? And it doesn't break the SAS drivers > > > > > that support SATAPI, does it? > > > > > > > > > > But I agree that drivers want to get a complete sglist so I'm fine > > > > > with adjusting sglist entries in the block layer with your secode > > > > > patch (separate out padding from alignment). As we discussed, I'm fine > > > > > with breaking sum(sg) == rq->data_len as long as rq->data_len means > > > > > the true data length. > > > > > > > > As long as the second patch is in, what value rq->data_len indicates > > > > doesn't matter to drivers which don't use explicit padding or draining, > > > > so the situation is much more controlled. 
I don't care which value
> > > > rq->data_len would indicate. I'd prefer it equal sum(sg) as that value
> > > > is what IDE and libata which will be the major users of padding and/or
> > > > draining expect in rq->data_len but fixing up that shouldn't be too
> > > > difficult. I guess this can be determined by Jens. If Jens likes
> > > > rq->data_len to contain requested transfer size, I'll post updated patches.
> > >
> > > OK, I prefer rq->data_len means the true data length though you prefer
> > > rq->data_len means the allocated buffer length (the true data length
> > > plus padding and drain). We agree on other things. We can live with
> > > either way.
> > >
> > > Jens, what's your preference?
> >
> > I completely agree with you, ->data_len meaning true data length is way
> > cleaner imho. Only the driver should care for the padded length, all
> > other parts of the kernel only need to know what they actually got.
>
> OK, now we can fix the whole SG_IO (and bsg handler) mess.
>
> Here's my patch with a proper description, which several people have
> already tested (thanks!). Then we need an updated version of Tejun's
> separate out padding from alignment patch.

OK, I've updated his patch. Tejun, can you audit this?

Thanks,

=
From: Tejun Heo <htejun@gmail.com>
Subject: [PATCH] block: separate out padding from alignment

Block layer alignment was used for two different purposes - memory
alignment and padding. This causes problems in lower layers because
drivers which only require memory alignment end up with adjusted
rq->data_len. Separate out padding such that padding occurs iff
driver explicitly requests it.

Tomo: restore the code to update bio in blk_rq_map_user
introduced by the commit 40b01b9bbdf51ae543a04744283bf2d56c4a6afa
according to padding alignment.
Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> --- block/blk-map.c | 20 +++++++++++++------- block/blk-settings.c | 17 +++++++++++++++++ drivers/ata/libata-scsi.c | 3 ++- include/linux/blkdev.h | 2 ++ 4 files changed, 34 insertions(+), 8 deletions(-) diff --git a/block/blk-map.c b/block/blk-map.c index f559832..4e17dfd 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -43,6 +43,7 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq, void __user *ubuf, unsigned int len) { unsigned long uaddr; + unsigned int alignment; struct bio *bio, *orig_bio; int reading, ret; @@ -53,8 +54,8 @@ static int __blk_rq_map_user(struct request_queue *q, struct request *rq, * direct dma. else, set up kernel bounce buffers */ uaddr = (unsigned long) ubuf; - if (!(uaddr & queue_dma_alignment(q)) && - !(len & queue_dma_alignment(q))) + alignment = queue_dma_alignment(q) | q->dma_pad_mask; + if (!(uaddr & alignment) && !(len & alignment)) bio = bio_map_user(q, NULL, uaddr, len, reading); else bio = bio_copy_user(q, uaddr, len, reading); @@ -141,15 +142,20 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq, /* * __blk_rq_map_user() copies the buffers if starting address - * or length isn't aligned. As the copied buffer is always - * page aligned, we know that there's enough room for padding. - * Extend the last bio and update rq->data_len accordingly. + * or length isn't aligned to dma_pad_mask. As the copied + * buffer is always page aligned, we know that there's enough + * room for padding. Extend the last bio and update + * rq->data_len accordingly. * * On unmap, bio_uncopy_user() will use unmodified * bio_map_data pointed to by bio->bi_private. 
*/ - if (len & queue_dma_alignment(q)) { - unsigned int pad_len = (queue_dma_alignment(q) & ~len) + 1; + if (len & q->dma_pad_mask) { + unsigned int pad_len = (q->dma_pad_mask & ~len) + 1; + struct bio *bio = rq->biotail; + + bio->bi_io_vec[bio->bi_vcnt - 1].bv_len += pad_len; + bio->bi_size += pad_len; rq->extra_len += pad_len; } diff --git a/block/blk-settings.c b/block/blk-settings.c index 9a8ffdd..5fcb625 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -293,6 +293,23 @@ void blk_queue_stack_limits(struct request_queue *t, struct request_queue *b) EXPORT_SYMBOL(blk_queue_stack_limits); /** + * blk_queue_dma_pad - set pad mask + * @q: the request queue for the device + * @mask: pad mask + * + * Set pad mask. Direct IO requests are padded to the mask specified. + * + * Appending pad buffer to a request modifies ->data_len such that it + * includes the pad buffer. The original requested data length can be + * obtained using blk_rq_raw_data_len(). + **/ +void blk_queue_dma_pad(struct request_queue *q, unsigned int mask) +{ + q->dma_pad_mask = mask; +} +EXPORT_SYMBOL(blk_queue_dma_pad); + +/** * blk_queue_dma_drain - Set up a drain buffer for excess dma. 
* * @q: the request queue for the device diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index fe47922..8f0e8f2 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -862,9 +862,10 @@ static int ata_scsi_dev_config(struct scsi_device *sdev, struct request_queue *q = sdev->request_queue; void *buf; - /* set the min alignment */ + /* set the min alignment and padding */ blk_queue_update_dma_alignment(sdev->request_queue, ATA_DMA_PAD_SZ - 1); + blk_queue_dma_pad(sdev->request_queue, ATA_DMA_PAD_SZ - 1); /* configure draining */ buf = kmalloc(ATAPI_MAX_DRAIN, q->bounce_gfp | GFP_KERNEL); diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index b72526c..6f79d40 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -362,6 +362,7 @@ struct request_queue unsigned long seg_boundary_mask; void *dma_drain_buffer; unsigned int dma_drain_size; + unsigned int dma_pad_mask; unsigned int dma_alignment; struct blk_queue_tag *queue_tags; @@ -701,6 +702,7 @@ extern void blk_queue_max_hw_segments(struct request_queue *, unsigned short); extern void blk_queue_max_segment_size(struct request_queue *, unsigned int); extern void blk_queue_hardsect_size(struct request_queue *, unsigned short); extern void blk_queue_stack_limits(struct request_queue *t, struct request_queue *b); +extern void blk_queue_dma_pad(struct request_queue *, unsigned int); extern int blk_queue_dma_drain(struct request_queue *q, dma_drain_needed_fn *dma_drain_needed, void *buf, unsigned int size); -- 1.5.3.6 ^ permalink raw reply related [flat|nested] 109+ messages in thread
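A small standalone sketch of the two bit tricks used in the patch above, for illustration only. The `model_*` function names are hypothetical, and the mask value 3 in the usage note assumes a 4-byte pad size purely as an example. The first helper reproduces the `(mask & ~len) + 1` expression that rounds a length up to the next multiple of `mask + 1`; the second reproduces the combined alignment test that decides between mapping the user buffer directly and falling back to a bounce copy.

```c
#include <assert.h>

/*
 * Bytes needed to round len up to the next multiple of (mask + 1).
 * mask must be of the form 2^n - 1, as with a pad mask of PAD_SZ - 1.
 */
static unsigned int model_pad_len(unsigned int len, unsigned int mask)
{
	assert((mask & (mask + 1)) == 0); /* mask is 2^n - 1 */
	if (!(len & mask))
		return 0;                 /* already a multiple, no padding */
	return (mask & ~len) + 1;
}

/*
 * Nonzero when both the address and the length satisfy the combined
 * memory-alignment and pad masks, i.e. no bounce copy would be needed.
 */
static int model_can_map_direct(unsigned long uaddr, unsigned int len,
				unsigned int dma_alignment,
				unsigned int pad_mask)
{
	unsigned int alignment = dma_alignment | pad_mask;

	return !(uaddr & alignment) && !(len & alignment);
}
```

With a mask of 3, a 5-byte request needs 3 bytes of padding and a 6-byte request needs 2, both ending up at 8 bytes. A queue that sets only a memory alignment mask and leaves the pad mask at 0 never triggers the padding branch, which is the point of separating the two.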
* Re: [PATCH] block: fix residual byte count handling
  2008-03-04 9:22 ` FUJITA Tomonori
@ 2008-03-04 9:30 ` Tejun Heo
  2008-03-04 9:35 ` Jens Axboe
  1 sibling, 0 replies; 109+ messages in thread
From: Tejun Heo @ 2008-03-04 9:30 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: jens.axboe, tomof, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier

FUJITA Tomonori wrote:
> OK, I've updated his patch. Tejun, can you audit this?

Looks good to me. Thanks.

--
tejun

^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 9:22 ` FUJITA Tomonori 2008-03-04 9:30 ` Tejun Heo @ 2008-03-04 9:35 ` Jens Axboe 2008-03-04 9:40 ` Tejun Heo 2008-03-04 12:37 ` Mike Galbraith 1 sibling, 2 replies; 109+ messages in thread From: Jens Axboe @ 2008-03-04 9:35 UTC (permalink / raw) To: FUJITA Tomonori Cc: htejun, tomof, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, Mar 04 2008, FUJITA Tomonori wrote: > On Tue, 04 Mar 2008 18:06:48 +0900 > FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote: > > > On Tue, 4 Mar 2008 09:59:46 +0100 > > Jens Axboe <jens.axboe@oracle.com> wrote: > > > > > On Tue, Mar 04 2008, FUJITA Tomonori wrote: > > > > On Tue, 04 Mar 2008 11:32:56 +0900 > > > > Tejun Heo <htejun@gmail.com> wrote: > > > > > > > > > FUJITA Tomonori wrote: > > > > > >> Yeah, libata did its own padding and needed to add draining. Private > > > > > >> implementation was complex as hell and James suggested moving them to > > > > > >> block layer. Are you suggesting moving them back to drivers? > > > > > > > > > > > > No, I'm not. I've been working on the IOMMUs to remove such > > > > > > workarounds in LLDs. > > > > > > > > > > > > What drivers need to do on this is just adding a padding length, that > > > > > > is, drivers don't need to change the structure of the sg list (like > > > > > > splitting a sg entry), right? And it doesn't break the SAS drivers > > > > > > that support SATAPI, does it? > > > > > > > > > > > > But I agree that drivers want to get a complete sglist so I'm fine > > > > > > with adjusting sglist entries in the block layer with your secode > > > > > > patch (separate out padding from alignment). As we discussed, I'm fine > > > > > > with breaking sum(sg) == rq->data_len as long as rq->data_len means > > > > > > the true data length. 
> > > > > > > > > > As long as the second patch is in, what value rq->data_len indicates > > > > > doesn't matter to drivers which don't use explicit padding or draining, > > > > > so the situation is much more controlled. I don't care which value > > > > > rq->data_len would indicate. I'd prefer it equal sum(sg) as that value > > > > > is what IDE and libata which will be the major users of padding and/or > > > > > draining expect in rq->data_len but fixing up that shouldn't be too > > > > > difficult. I guess this can be determined by Jens. If Jens likes > > > > > rq->data_len to contain requested transfer size, I'll post updated patches. > > > > > > > > OK, I prefer rq->data_len means the true data length though you prefer > > > > rq->data_len means the allocated buffer length (the true data length > > > > plus padding and drain). We agree on other things. We can live with > > > > either way. > > > > > > > > Jens, what's your preference? > > > > > > I completely agree with you, ->data_len meaning true data length is way > > > cleaner imho. Only the driver should care for the padded length, all > > > other parts of the kernel only need to know what they actually got. > > > > OK, now we can fix the whole SG_IO (and bsg handler) mess. > > > > Here's my patch with a proper description. which several people have > > already tested (thanks!). Then we need an updated version of Tejun's > > separate out padding from alignment patch. > > OK, I've updated his patch. Tejun, can you audit this? Looks excellent to me, has a variant of this been tested as OK by the users reporting the regression? -- Jens Axboe ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling
  2008-03-04 9:35 ` Jens Axboe
@ 2008-03-04 9:40 ` Tejun Heo
  2008-03-04 9:46 ` Jens Axboe
  2008-03-04 12:37 ` Mike Galbraith
  1 sibling, 1 reply; 109+ messages in thread
From: Tejun Heo @ 2008-03-04 9:40 UTC (permalink / raw)
  To: Jens Axboe
  Cc: FUJITA Tomonori, tomof, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier

Jens Axboe wrote:
> Looks excellent to me, has a variant of this been tested as OK by the
> users reporting the regression?

Yeah, the other version which added extra_len to data_len has been
verified to work. The only difference is now libata is adding
extra_len, so this one should be safe.

--
tejun

^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling
  2008-03-04 9:40 ` Tejun Heo
@ 2008-03-04 9:46 ` Jens Axboe
  0 siblings, 0 replies; 109+ messages in thread
From: Jens Axboe @ 2008-03-04 9:46 UTC (permalink / raw)
  To: Tejun Heo
  Cc: FUJITA Tomonori, tomof, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier

On Tue, Mar 04 2008, Tejun Heo wrote:
> Jens Axboe wrote:
> > Looks excellent to me, has a variant of this been tested as OK by the
> > users reporting the regression?
>
> Yeah, the other version which added extra_len to data_len has been
> verified to work. The only difference is now libata is adding
> extra_len, so this one should be safe.

Great, since we all agree, I'll merge it up and pass it on.

--
Jens Axboe

^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling
  2008-03-04 9:35 ` Jens Axboe
  2008-03-04 9:40 ` Tejun Heo
@ 2008-03-04 12:37 ` Mike Galbraith
  2008-03-04 12:39 ` Jens Axboe
  2008-03-04 12:40 ` Tejun Heo
  1 sibling, 2 replies; 109+ messages in thread
From: Mike Galbraith @ 2008-03-04 12:37 UTC (permalink / raw)
  To: Jens Axboe
  Cc: FUJITA Tomonori, htejun, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier

On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote:

> Looks excellent to me, has a variant of this been tested as OK by the
> users reporting the regression?

K3b burning seems to be a nogo here. This is git pulled this morning
though, so it's a somewhat different tree than previously tested fwtw.

[ 136.440021] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
[ 136.440043] ata1.01: cmd a0/00:00:00:00:00/00:00:00:00:00/b0 tag 0
[ 136.440045] cdb 51 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
[ 136.440047] res 58/00:02:00:02:00/00:00:00:00:00/b0 Emask 0x2 (HSM violation)
[ 136.440053] ata1.01: status: { DRDY DRQ }
[ 136.440086] ata1: soft resetting link
[ 165.327627] ata1.01: qc timeout (cmd 0xa1)
[ 165.327627] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 165.327627] ata1.01: revalidation failed (errno=-5)
[ 165.327627] ata1: failed to recover some devices, retrying in 5 secs
[ 177.272373] ata1: port is slow to respond, please be patient (Status 0x80)
[ 180.388879] ata1: device not ready (errno=-16), forcing hardreset
[ 180.388879] ata1: soft resetting link
[ 210.832471] ata1.01: qc timeout (cmd 0xa1)
[ 210.832471] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 210.832471] ata1.01: revalidation failed (errno=-5)
[ 210.832471] ata1: failed to recover some devices, retrying in 5 secs
[ 223.392899] ata1: port is slow to respond, please be patient (Status 0x80)
[ 225.920376] ata1: device not ready (errno=-16), forcing hardreset
[ 225.920376] ata1: soft resetting link
[ 256.542565] ata1.01: qc timeout (cmd 0xa1)
[ 256.542565] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4)
[ 256.542565] ata1.01: revalidation failed (errno=-5)
[ 256.542565] ata1.01: disabled
[ 259.995199] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x40)
[ 259.995214] ata1.00: revalidation failed (errno=-5)
[ 259.995219] ata1: failed to recover some devices, retrying in 5 secs
[ 265.047502] ata1: soft resetting link
[ 262.397570] ata1.00: limited to UDMA/33 due to 40-wire cable
[ 262.420039] ata1.00: configured for UDMA/33
[ 262.420039] sr 0:0:1:0: [sr0] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
[ 262.420039] sr 0:0:1:0: [sr0] Sense Key : Aborted Command [current] [descriptor]
[ 262.420039] Descriptor sense data with sense descriptors (in hex):
[ 262.420039] 72 0b 47 00 00 00 00 0e 09 0c 00 00 00 02 00 00
[ 262.420039] 00 02 00 00 b0 58
[ 262.420039] sr 0:0:1:0: [sr0] Add. Sense: Scsi parity error
[ 262.420039] ata1: EH complete
[ 262.420257] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
[ 262.420320] sd 0:0:0:0: [sda] Write Protect is off
[ 262.420326] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 262.420390] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling
  2008-03-04 12:37 ` Mike Galbraith
@ 2008-03-04 12:39 ` Jens Axboe
  2008-03-04 12:43 ` Mike Galbraith
  ` (2 more replies)
  2008-03-04 12:40 ` Tejun Heo
  1 sibling, 3 replies; 109+ messages in thread
From: Jens Axboe @ 2008-03-04 12:39 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: FUJITA Tomonori, htejun, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier

On Tue, Mar 04 2008, Mike Galbraith wrote:
>
> On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote:
>
> > Looks excellent to me, has a variant of this been tested as OK by the
> > users reporting the regression?
>
> K3b burning seems to be a nogo here. This is git pulled this morning
> though, so it's a somewhat different tree than previously tested fwtw.

can you please try git as of this morning without any patches applied,
and then pull

  git://git.kernel.dk/linux-2.6-block.git for-linus

into that and see if that works?

--
Jens Axboe

^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling
  2008-03-04 12:39 ` Jens Axboe
@ 2008-03-04 12:43 ` Mike Galbraith
  2008-03-04 12:58 ` Mike Galbraith
  2008-03-04 16:04 ` James Bottomley
  2008-03-04 17:34 ` walt
  2 siblings, 1 reply; 109+ messages in thread
From: Mike Galbraith @ 2008-03-04 12:43 UTC (permalink / raw)
  To: Jens Axboe
  Cc: FUJITA Tomonori, htejun, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier

On Tue, 2008-03-04 at 13:39 +0100, Jens Axboe wrote:
> On Tue, Mar 04 2008, Mike Galbraith wrote:
> >
> > On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote:
> >
> > > Looks excellent to me, has a variant of this been tested as OK by the
> > > users reporting the regression?
> >
> > K3b burning seems to be a nogo here. This is git pulled this morning
> > though, so it's a somewhat different tree than previously tested fwtw.
>
> can you please try git as of this morning without any patches applied,
> and then pull
>
> git://git.kernel.dk/linux-2.6-block.git for-linus
>
> into that and see if that works?

I'll give it a shot in a bit.

-Mike

^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling
  2008-03-04 12:43 ` Mike Galbraith
@ 2008-03-04 12:58 ` Mike Galbraith
  2008-03-04 13:03 ` Jens Axboe
  0 siblings, 1 reply; 109+ messages in thread
From: Mike Galbraith @ 2008-03-04 12:58 UTC (permalink / raw)
  To: Jens Axboe
  Cc: FUJITA Tomonori, htejun, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier

On Tue, 2008-03-04 at 13:43 +0100, Mike Galbraith wrote:

> > can you please try git as of this morning without any patches applied,
> > and then pull
> >
> > git://git.kernel.dk/linux-2.6-block.git for-linus
> >
> > into that and see if that works?
>
> I'll give it a shot in a bit.

Aw poo, so many choices.
I did:
  git remote add block-for-linus git://git.kernel.dk/linux-2.6-block.git
  git remote update
Now, which one do I check out? block-for-linus/master maybe, or
block-for-linus/for-linus?

homer:..git/linux-2.6 # git checkout block-for-linus
error: pathspec 'block-for-linus' did not match any file(s) known to git.
Did you forget to 'git add'?
homer:..git/linux-2.6 # git branch -a
* master
  x86/master
  x86/mm
  block-for-linus/blktrace
  block-for-linus/cmdfilter
  block-for-linus/dynpipe
  block-for-linus/fcache
  block-for-linus/for-akpm
  block-for-linus/for-linus
  block-for-linus/io-cpu-affinity
  block-for-linus/io-cpu-affinity-kthread
  block-for-linus/loop-extent_map
  block-for-linus/loop-fastfs
  block-for-linus/master
  block-for-linus/plug
  block-for-linus/splice
  block-for-linus/syslet
  block-for-linus/syslet-share
  block-for-linus/timeout
  linux-next/master
  linux-next/stable
  origin/HEAD
  origin/master
  x86/base
  x86/for-akpm
  x86/for-linus
  x86/latest
  x86/master
  x86/mm
  x86/origin
  x86/testing

^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling
  2008-03-04 12:58 ` Mike Galbraith
@ 2008-03-04 13:03 ` Jens Axboe
  2008-03-04 14:25 ` Mike Galbraith
  0 siblings, 1 reply; 109+ messages in thread
From: Jens Axboe @ 2008-03-04 13:03 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: FUJITA Tomonori, htejun, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier

On Tue, Mar 04 2008, Mike Galbraith wrote:
>
> On Tue, 2008-03-04 at 13:43 +0100, Mike Galbraith wrote:
>
> > > can you please try git as of this morning without any patches applied,
> > > and then pull
> > >
> > > git://git.kernel.dk/linux-2.6-block.git for-linus
> > >
> > > into that and see if that works?
> >
> > I'll give it a shot in a bit.
>
> Aw poo, so many choices.
> I did:
> git remote add block-for-linus git://git.kernel.dk/linux-2.6-block.git
> git remote update
> Now, which one do I check out? block-for-linus/master maybe, or
> block-for-linus/for-linus?

Re-read my original mail! It states that you should just pull:

  git://git.kernel.dk/linux-2.6-block.git for-linus

into your linus branch, or just create a test branch off linus' master
and pull into that. IOW, it's the for-linus branch that you should pull,
nothing else.

--
Jens Axboe

^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling
  2008-03-04 13:03 ` Jens Axboe
@ 2008-03-04 14:25 ` Mike Galbraith
  2008-03-04 18:17 ` Jens Axboe
  0 siblings, 1 reply; 109+ messages in thread
From: Mike Galbraith @ 2008-03-04 14:25 UTC (permalink / raw)
  To: Jens Axboe
  Cc: FUJITA Tomonori, htejun, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier

On Tue, 2008-03-04 at 14:03 +0100, Jens Axboe wrote:

> Re-read my original mail! It states that you should just pull:
>
> git://git.kernel.dk/linux-2.6-block.git for-linus
>
> into your linus branch, or just create a test branch off linus' master
> and pull into that. IOW, it's the for-linus branch that you should pull,
> nothing else.

Well, I had a good reason. You know how to un-pull, I know how to
un-remote to get back to pristine after I'm done testing... guaranteed
without whimpering pathetically on the git list ;-)

Anyway, I checked out the one with the big-fat-hint in its name
(block-for-linus/for-linus).
Same error. Git this morning with patches...
  restore_meaning_of_data_len.diff
  seperate_out_padding_from_alignment.diff
...reverted restored me to the originally reported k3b error, nothing
new noted.

If I tested the wrong branch, whack me upside the head, and I'll follow
your pull destructions, and figure out how to un-pull later.

-Mike

^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling
  2008-03-04 14:25 ` Mike Galbraith
@ 2008-03-04 18:17 ` Jens Axboe
  2008-03-04 18:29 ` Jens Axboe
  2008-03-04 18:35 ` Mike Galbraith
  0 siblings, 2 replies; 109+ messages in thread
From: Jens Axboe @ 2008-03-04 18:17 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: FUJITA Tomonori, htejun, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier

On Tue, Mar 04 2008, Mike Galbraith wrote:
>
> On Tue, 2008-03-04 at 14:03 +0100, Jens Axboe wrote:
>
> > Re-read my original mail! It states that you should just pull:
> >
> > git://git.kernel.dk/linux-2.6-block.git for-linus
> >
> > into your linus branch, or just create a test branch off linus' master
> > and pull into that. IOW, it's the for-linus branch that you should pull,
> > nothing else.
>
> Well, I had a good reason. You know how to un-pull, I know how to
> un-remote to get back to pristine after I'm done testing... guaranteed
> without whimpering pathetically on the git list ;-)

OK, if you're on master, it's pretty easy:

$ git branch test-branch
$ git checkout test-branch
$ git pull git://git.kernel.dk/linux-2.6-block.git for-linus

[build, boot, test]
$ git checkout master
$ git branch -D test-branch

> Anyway, I checked out the one with the big-fat-hint in its name
> (block-for-linus/for-linus).
> Same error. Git this morning with patches...
> restore_meaning_of_data_len.diff
> seperate_out_padding_from_alignment.diff
> ...reverted restored me to the originally reported k3b error, nothing
> new noted.
>
> If I tested the wrong branch, whack me upside the head, and I'll follow
> your pull destructions, and figure out how to un-pull later.

for-linus is the right branch, but I'm just a little worried that you
didn't test what you think you tested. What does cat .git/HEAD say? If
that is a ref to a file (eg refs/heads/master), what does that file
contain?

--
Jens Axboe

^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 18:17 ` Jens Axboe @ 2008-03-04 18:29 ` Jens Axboe 2008-03-04 18:35 ` Mike Galbraith 1 sibling, 0 replies; 109+ messages in thread From: Jens Axboe @ 2008-03-04 18:29 UTC (permalink / raw) To: Mike Galbraith Cc: FUJITA Tomonori, htejun, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, Mar 04 2008, Jens Axboe wrote: > On Tue, Mar 04 2008, Mike Galbraith wrote: > > > > On Tue, 2008-03-04 at 14:03 +0100, Jens Axboe wrote: > > > > > Re-read my original mail! It states that you should just pull: > > > > > > git://git.kernel.dk/linux-2.6-block.git for-linus > > > > > > into your linus branch, or just create a test branch off linus' master > > > and pull into that. IOW, it's the for-linus branch that you should pull, > > > nothing else. > > > > Well, I had a good reason. You know how to un-pull, I know how to > > un-remote to get back to pristine after I'm done testing... guaranteed > > without whimpering pathetically on the git list ;-) > > OK, if you're on master, it's pretty easy: > > $ git branch test-branch > $ git checkout test-branch > $ git pull git://git.kernel.dk/linux-2.6-block.git for-linus > > [build, boot, test] > $ git checkout master > $ git branch -D test-branch > > > Anyway, I checked out the one with the big-fat-hint in it's name > > (block-for-linus/for-linus). > > Same error. Git this morning with patches... > > restore_meaning_of_data_len.diff > > seperate_out_padding_from_alignment.diff > > ...reverted restored me to the originally reported k3b error, nothing > > new noted. > > > > If I tested the wrong branch, whack me upside the head, and I'll follow > > your pull destructions, and figure out how to un-pull later. > > for-linus is the right branch, but I'm just a little worried that you > didn't test what you think you tested. What does cat .git/HEAD say? If > that is a ref to a file (eg refs/heads/master), what does that file > contain? 
Or just re-pull Linus' tree, the stuff is in now. -- Jens Axboe ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 18:17 ` Jens Axboe 2008-03-04 18:29 ` Jens Axboe @ 2008-03-04 18:35 ` Mike Galbraith 2008-03-04 18:45 ` Jens Axboe 1 sibling, 1 reply; 109+ messages in thread From: Mike Galbraith @ 2008-03-04 18:35 UTC (permalink / raw) To: Jens Axboe Cc: FUJITA Tomonori, htejun, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, 2008-03-04 at 19:17 +0100, Jens Axboe wrote: > On Tue, Mar 04 2008, Mike Galbraith wrote: > > > > On Tue, 2008-03-04 at 14:03 +0100, Jens Axboe wrote: > > > > > Re-read my original mail! It states that you should just pull: > > > > > > git://git.kernel.dk/linux-2.6-block.git for-linus > > > > > > into your linus branch, or just create a test branch off linus' master > > > and pull into that. IOW, it's the for-linus branch that you should pull, > > > nothing else. > > > > Well, I had a good reason. You know how to un-pull, I know how to > > un-remote to get back to pristine after I'm done testing... guaranteed > > without whimpering pathetically on the git list ;-) > > OK, if you're on master, it's pretty easy: > > $ git branch test-branch > $ git checkout test-branch > $ git pull git://git.kernel.dk/linux-2.6-block.git for-linus > > [build, boot, test] > $ git checkout master > $ git branch -D test-branch Hm, that's simple enough. I'll do this for the edification. Thanks. Maybe some day, I'll cease to be so paranoid that my test setup may become compromised. (at which time...) > > Anyway, I checked out the one with the big-fat-hint in it's name > > (block-for-linus/for-linus). > > Same error. Git this morning with patches... > > restore_meaning_of_data_len.diff > > seperate_out_padding_from_alignment.diff > > ...reverted restored me to the originally reported k3b error, nothing > > new noted. > > > > If I tested the wrong branch, whack me upside the head, and I'll follow > > your pull destructions, and figure out how to un-pull later. 
> > for-linus is the right branch, but I'm just a little worried that you > didn't test what you think you tested. What does cat .git/HEAD say? If > that is a ref to a file (eg refs/heads/master), what does that file > contain? That wouldn't surprise me one bit. (ergo...) It says cc66b4512cae8df4ed1635483210aabf7690ec27... kewpie doll? -Mike ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 18:35 ` Mike Galbraith @ 2008-03-04 18:45 ` Jens Axboe 2008-03-04 18:49 ` Mike Galbraith 0 siblings, 1 reply; 109+ messages in thread From: Jens Axboe @ 2008-03-04 18:45 UTC (permalink / raw) To: Mike Galbraith Cc: FUJITA Tomonori, htejun, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, Mar 04 2008, Mike Galbraith wrote: > > On Tue, 2008-03-04 at 19:17 +0100, Jens Axboe wrote: > > On Tue, Mar 04 2008, Mike Galbraith wrote: > > > > > > On Tue, 2008-03-04 at 14:03 +0100, Jens Axboe wrote: > > > > > > > Re-read my original mail! It states that you should just pull: > > > > > > > > git://git.kernel.dk/linux-2.6-block.git for-linus > > > > > > > > into your linus branch, or just create a test branch off linus' master > > > > and pull into that. IOW, it's the for-linus branch that you should pull, > > > > nothing else. > > > > > > Well, I had a good reason. You know how to un-pull, I know how to > > > un-remote to get back to pristine after I'm done testing... guaranteed > > > without whimpering pathetically on the git list ;-) > > > > OK, if you're on master, it's pretty easy: > > > > $ git branch test-branch > > $ git checkout test-branch > > $ git pull git://git.kernel.dk/linux-2.6-block.git for-linus > > > > [build, boot, test] > > $ git checkout master > > $ git branch -D test-branch > > Hm, that's simple enough. I'll do this for the edification. Thanks. > Maybe some day, I'll cease to be so paranoid that my test setup may > become compromised. (at which time...) > > > > Anyway, I checked out the one with the big-fat-hint in it's name > > > (block-for-linus/for-linus). > > > Same error. Git this morning with patches... > > > restore_meaning_of_data_len.diff > > > seperate_out_padding_from_alignment.diff > > > ...reverted restored me to the originally reported k3b error, nothing > > > new noted. 
> > > > > > If I tested the wrong branch, whack me upside the head, and I'll follow > > > your pull destructions, and figure out how to un-pull later. > > > > for-linus is the right branch, but I'm just a little worried that you > > didn't test what you think you tested. What does cat .git/HEAD say? If > > that is a ref to a file (eg refs/heads/master), what does that file > > contain? > > That wouldn't surprise me one bit. (ergo...) > > It says cc66b4512cae8df4ed1635483210aabf7690ec27... kewpie doll? That looks right, then perhaps there's still an issue there :/ Logs? -- Jens Axboe ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 18:45 ` Jens Axboe @ 2008-03-04 18:49 ` Mike Galbraith 2008-03-04 18:54 ` Jens Axboe 0 siblings, 1 reply; 109+ messages in thread From: Mike Galbraith @ 2008-03-04 18:49 UTC (permalink / raw) To: Jens Axboe Cc: FUJITA Tomonori, htejun, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, 2008-03-04 at 19:45 +0100, Jens Axboe wrote: > > It says cc66b4512cae8df4ed1635483210aabf7690ec27... kewpie doll? > > That looks right, then perhaps there's still an issue there :/ > Logs? Tejun's patchlet (below) fixed it here. Date: Wed, 05 Mar 2008 01:42:45 +0900 From: Tejun Heo <htejun@gmail.com> To: FUJITA Tomonori <tomof@acm.org> CC: efault@gmx.de, jens.axboe@oracle.com, fujita.tomonori@lab.ntt.co.jp, James.Bottomley@HansenPartnership.com, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-ide@vger.kernel.org, linux-scsi@vger.kernel.org, jgarzik@pobox.com, bzolnier@gmail.com Subject: Re: [PATCH] block: fix residual byte count handling Tejun Heo wrote: > Tejun Heo wrote: >> FUJITA Tomonori wrote: >>>> Aiee... device going down after timing out on READ_DISC_INFO. That's >>>> gruesome. Can you please try the other patches? >>> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No? >> The extra_len you added to qc->nbytes should be it. The only other >> place to pay attention is the ATAPI transfer chunk size and your patch >> seems to get it right. >> >>> Now Jens' git tree should work with all the non libata stuff, ide, >>> firewire, bsg, etc. But I'm not sure about libata. >> With the second patch, all others should be fine no matter what. I'll >> go check libata part again. > > I can reproduce the problem here and it's very weird. I'll report back > when I know more. Okay, I got it. Heh, it turns out SCSI and/or block layer is not ready for rq->data_len != sum(sg). 
When adjusted command completes, SCSI midlayer completes the command with rq->data_len for PC commands which eventually ends up in __end_that_request_first(). As there are extra sg area left after completing rq->data_len, blk layer says so to SCSI layer and SCSI layer retries the command only with the appended area. The following patch gets the writing going. I really think it's a serious mistake to break rq->data_len == sum(sg). If we break rq->data_len == requested size, the worst bugs are giving wrong size when issuing commands to application layer of devices which is relatively easy to spot and not all that common anyway. Breaking rq->data_len == sum(sg), bugs will be in internal mechanics, DMA engine programming and transport layer. Oh well... diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index fecba05..32439ac 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd) "Notifying upper driver of completion " "(result %x)\n", cmd->result)); - good_bytes = scsi_bufflen(cmd); + good_bytes = scsi_bufflen(cmd) + cmd->request->data_len; if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) { drv = scsi_cmd_to_driver(cmd); if (drv->done) ^ permalink raw reply related [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 18:49 ` Mike Galbraith @ 2008-03-04 18:54 ` Jens Axboe 2008-03-04 19:26 ` Mike Galbraith 0 siblings, 1 reply; 109+ messages in thread From: Jens Axboe @ 2008-03-04 18:54 UTC (permalink / raw) To: Mike Galbraith Cc: FUJITA Tomonori, htejun, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, Mar 04 2008, Mike Galbraith wrote: > > On Tue, 2008-03-04 at 19:45 +0100, Jens Axboe wrote: > > > > It says cc66b4512cae8df4ed1635483210aabf7690ec27... kewpie doll? > > > > That looks right, then perhaps there's still an issue there :/ > > Logs? > > Tejuns patchlet (below) fixed it here. OK, can you try changing that to good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len; and retest? -- Jens Axboe ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 18:54 ` Jens Axboe @ 2008-03-04 19:26 ` Mike Galbraith 2008-03-04 19:28 ` Jens Axboe 0 siblings, 1 reply; 109+ messages in thread From: Mike Galbraith @ 2008-03-04 19:26 UTC (permalink / raw) To: Jens Axboe Cc: FUJITA Tomonori, htejun, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, 2008-03-04 at 19:54 +0100, Jens Axboe wrote: > On Tue, Mar 04 2008, Mike Galbraith wrote: > > > > On Tue, 2008-03-04 at 19:45 +0100, Jens Axboe wrote: > > > > > > It says cc66b4512cae8df4ed1635483210aabf7690ec27... kewpie doll? > > > > > > That looks right, then perhaps there's still an issue there :/ > > > Logs? > > > > Tejuns patchlet (below) fixed it here. > > OK, can you try changing that to > > good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len; > > and retest? Yup, disk #42 is happily burning away. -Mike ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 19:26 ` Mike Galbraith @ 2008-03-04 19:28 ` Jens Axboe 0 siblings, 0 replies; 109+ messages in thread From: Jens Axboe @ 2008-03-04 19:28 UTC (permalink / raw) To: Mike Galbraith Cc: FUJITA Tomonori, htejun, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, Mar 04 2008, Mike Galbraith wrote: > > On Tue, 2008-03-04 at 19:54 +0100, Jens Axboe wrote: > > On Tue, Mar 04 2008, Mike Galbraith wrote: > > > > > > On Tue, 2008-03-04 at 19:45 +0100, Jens Axboe wrote: > > > > > > > > It says cc66b4512cae8df4ed1635483210aabf7690ec27... kewpie doll? > > > > > > > > That looks right, then perhaps there's still an issue there :/ > > > > Logs? > > > > > > Tejuns patchlet (below) fixed it here. > > > > OK, can you try changing that to > > > > good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len; > > > > and retest? > > Yup, disk #42 is happily burning away. Super, patch heading to Linus now. Thanks for all your testing, Mike! -- Jens Axboe ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 12:39 ` Jens Axboe 2008-03-04 12:43 ` Mike Galbraith @ 2008-03-04 16:04 ` James Bottomley 2008-03-04 18:46 ` Jens Axboe 2008-03-04 17:34 ` walt 2 siblings, 1 reply; 109+ messages in thread From: James Bottomley @ 2008-03-04 16:04 UTC (permalink / raw) To: Jens Axboe Cc: Mike Galbraith, FUJITA Tomonori, htejun, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, 2008-03-04 at 13:39 +0100, Jens Axboe wrote: > On Tue, Mar 04 2008, Mike Galbraith wrote: > > > > On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote: > > > > > Looks excellent to me, has a variant of this been tested as OK by the > > > users reporting the regression? > > > > K3b burning seems to be a nogo here. This is git pulled this morning > > though, so it's a somewhat different tree than previously tested fwtw. > > can you please try git as of this morning without any patches applied, > and then pull > > git://git.kernel.dk/linux-2.6-block.git for-linus > > into that and see if that works? Works for me with the SAS SMP handler. Both input request and output response frame sizes are picked up and returned with the correct residues. James ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 16:04 ` James Bottomley @ 2008-03-04 18:46 ` Jens Axboe 0 siblings, 0 replies; 109+ messages in thread From: Jens Axboe @ 2008-03-04 18:46 UTC (permalink / raw) To: James Bottomley Cc: Mike Galbraith, FUJITA Tomonori, htejun, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, Mar 04 2008, James Bottomley wrote: > On Tue, 2008-03-04 at 13:39 +0100, Jens Axboe wrote: > > On Tue, Mar 04 2008, Mike Galbraith wrote: > > > > > > On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote: > > > > > > > Looks excellent to me, has a variant of this been tested as OK by the > > > > users reporting the regression? > > > > > > K3b burning seems to be a nogo here. This is git pulled this morning > > > though, so it's a somewhat different tree than previously tested fwtw. > > > > can you please try git as of this morning without any patches applied, > > and then pull > > > > git://git.kernel.dk/linux-2.6-block.git for-linus > > > > into that and see if that works? > > Works for me with the SAS SMP handler. Both input request and output > response frame sizes are picked up and returned with the correct > residues. Goodie, now we just need to figure out why it doesn't work for Mike yet... -- Jens Axboe ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 12:39 ` Jens Axboe 2008-03-04 12:43 ` Mike Galbraith 2008-03-04 16:04 ` James Bottomley @ 2008-03-04 17:34 ` walt 2008-03-04 17:59 ` Tejun Heo 2008-03-04 19:42 ` Kiyoshi Ueda 2 siblings, 2 replies; 109+ messages in thread From: walt @ 2008-03-04 17:34 UTC (permalink / raw) To: linux-kernel; +Cc: linux-scsi, linux-ide Jens Axboe wrote: > On Tue, Mar 04 2008, Mike Galbraith wrote: >> On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote: >> >>> Looks excellent to me, has a variant of this been tested as OK by the >>> users reporting the regression? >> K3b burning seems to be a nogo here. This is git pulled this morning >> though, so it's a somewhat different tree than previously tested fwtw. > > can you please try git as of this morning without any patches applied, > and then pull > > git://git.kernel.dk/linux-2.6-block.git for-linus > > into that and see if that works? Unfortunately this doesn't fix a problem I've discussed off-list with Kiyoshi Ueda, who suggested that I should follow this thread and try any patches posted here. Here is what happens when I try to mount a CD (before and after I pull 'for-linus'): hdc: ide_cd_check_ireason: wrong transfer direction! 
cdrom: failed setting lba address space hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } ide: failed opcode was: unknown hdc: drive not ready for command hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } ide: failed opcode was: unknown hdc: drive not ready for command hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } ide: failed opcode was: unknown hdc: drive not ready for command hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } ide: failed opcode was: unknown hdc: drive not ready for command hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } ide: failed opcode was: unknown hdc: drive not ready for command hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } ide: failed opcode was: unknown hdc: drive not ready for command hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } ide: failed opcode was: unknown hdc: drive not ready for command hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } ide: failed opcode was: unknown hdc: drive not ready for command hdc: status timeout: status=0xd0 { Busy } ide: failed opcode was: unknown hdc: DMA disabled hdc: drive not ready for command hdc: ATAPI reset complete ISO 9660 Extensions: Microsoft Joliet Level 3 ISOFS: changing to secondary root VFS: busy inodes on changed media. The mount can take from 5 seconds on up to a minute or so before the CD can be accessed. ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 17:34 ` walt @ 2008-03-04 17:59 ` Tejun Heo 2008-03-04 19:42 ` Kiyoshi Ueda 1 sibling, 0 replies; 109+ messages in thread From: Tejun Heo @ 2008-03-04 17:59 UTC (permalink / raw) To: walt; +Cc: linux-ide, linux-scsi, linux-kernel walt wrote: > Jens Axboe wrote: >> On Tue, Mar 04 2008, Mike Galbraith wrote: >>> On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote: >>> >>>> Looks excellent to me, has a variant of this been tested as OK by the >>>> users reporting the regression? >>> K3b burning seems to be a nogo here. This is git pulled this morning >>> though, so it's a somewhat different tree than previously tested fwtw. >> >> can you please try git as of this morning without any patches applied, >> and then pull >> >> git://git.kernel.dk/linux-2.6-block.git for-linus >> >> into that and see if that works? > > Unfortunately this doesn't fix a problem I've discussed off-list with > Kiyoshi Ueda, who suggested that I should follow this thread and try > any patches posted here. > > Here is what happens when I try to mount a CD (before and after I > pull 'for-linus'): > > hdc: ide_cd_check_ireason: wrong transfer direction! 
> cdrom: failed setting lba address space > hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } > ide: failed opcode was: unknown > hdc: drive not ready for command > hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } > ide: failed opcode was: unknown > hdc: drive not ready for command > hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } > ide: failed opcode was: unknown > hdc: drive not ready for command > hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } > ide: failed opcode was: unknown > hdc: drive not ready for command > hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } > ide: failed opcode was: unknown > hdc: drive not ready for command > hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } > ide: failed opcode was: unknown > hdc: drive not ready for command > hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } > ide: failed opcode was: unknown > hdc: drive not ready for command > hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest } > ide: failed opcode was: unknown > hdc: drive not ready for command > hdc: status timeout: status=0xd0 { Busy } > ide: failed opcode was: unknown > hdc: DMA disabled > hdc: drive not ready for command > hdc: ATAPI reset complete > ISO 9660 Extensions: Microsoft Joliet Level 3 > ISOFS: changing to secondary root > VFS: busy inodes on changed media. > > The mount can take from 5 seconds on up to a minute or so before the > CD can be accessed. Which version did you try? There was a recent IDE bug fix which affected CD recording. Commit bcd88ac3b2ff2eae3d0fa57a6b02d4fce5392f32 which is included in 2.6.25-rc3. Also, did 2.6.24 work? -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 17:34 ` walt 2008-03-04 17:59 ` Tejun Heo @ 2008-03-04 19:42 ` Kiyoshi Ueda 1 sibling, 0 replies; 109+ messages in thread From: Kiyoshi Ueda @ 2008-03-04 19:42 UTC (permalink / raw) To: w41ter; +Cc: linux-scsi, linux-kernel, linux-ide Hi, On Tue, 04 Mar 2008 09:34:56 -0800, walt wrote: > Jens Axboe wrote: > > On Tue, Mar 04 2008, Mike Galbraith wrote: > >> On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote: > >> > >>> Looks excellent to me, has a variant of this been tested as OK by the > >>> users reporting the regression? > >> K3b burning seems to be a nogo here. This is git pulled this morning > >> though, so it's a somewhat different tree than previously tested fwtw. > > > > can you please try git as of this morning without any patches applied, > > and then pull > > > > git://git.kernel.dk/linux-2.6-block.git for-linus > > > > into that and see if that works? > > Unfortunately this doesn't fix a problem I've discussed off-list with > Kiyoshi Ueda, who suggested that I should follow this thread and try > any patches posted here. I think there was misunderstanding between us. On off-list, I meant: o Your original problem was CD burning, and it looked same problem being discussed on this thread, according to this message: > cdrecord: Warning: controller returns zero sized CD capabilities page. > cdrecord: Warning: controller returns wrong page 0 for CD capabilities page (2A). So I suggested you to watch this thread and try patches of this thread for CD burning problem. o The problem of ide_cd_check_ireason looked different from CD burning one. So I suggested you to report it as a different problem. Thanks, Kiyoshi Ueda ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 12:37 ` Mike Galbraith 2008-03-04 12:39 ` Jens Axboe @ 2008-03-04 12:40 ` Tejun Heo 2008-03-04 12:45 ` Mike Galbraith 2008-03-04 13:30 ` FUJITA Tomonori 1 sibling, 2 replies; 109+ messages in thread From: Tejun Heo @ 2008-03-04 12:40 UTC (permalink / raw) To: Mike Galbraith Cc: Jens Axboe, FUJITA Tomonori, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier Mike Galbraith wrote: > On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote: > >> Looks excellent to me, has a variant of this been tested as OK by the >> users reporting the regression? > > K3b burning seems to be a nogo here. This is git pulled this morning > though, so it's a somewhat different tree than previously tested fwtw. > > [ 136.440021] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen > [ 136.440043] ata1.01: cmd a0/00:00:00:00:00/00:00:00:00:00/b0 tag 0 > [ 136.440045] cdb 51 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 > [ 136.440047] res 58/00:02:00:02:00/00:00:00:00:00/b0 Emask 0x2 (HSM violation) > [ 136.440053] ata1.01: status: { DRDY DRQ } > [ 136.440086] ata1: soft resetting link > [ 165.327627] ata1.01: qc timeout (cmd 0xa1) > [ 165.327627] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4) > [ 165.327627] ata1.01: revalidation failed (errno=-5) > [ 165.327627] ata1: failed to recover some devices, retrying in 5 secs > [ 177.272373] ata1: port is slow to respond, please be patient (Status 0x80) > [ 180.388879] ata1: device not ready (errno=-16), forcing hardreset > [ 180.388879] ata1: soft resetting link > [ 210.832471] ata1.01: qc timeout (cmd 0xa1) > [ 210.832471] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4) > [ 210.832471] ata1.01: revalidation failed (errno=-5) > [ 210.832471] ata1: failed to recover some devices, retrying in 5 secs > [ 223.392899] ata1: port is slow to respond, please be patient (Status 0x80) > [ 225.920376] ata1: device not ready (errno=-16), 
forcing hardreset > [ 225.920376] ata1: soft resetting link > [ 256.542565] ata1.01: qc timeout (cmd 0xa1) > [ 256.542565] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4) > [ 256.542565] ata1.01: revalidation failed (errno=-5) > [ 256.542565] ata1.01: disabled Aiee... device going down after timing out on READ_DISC_INFO. That's gruesome. Can you please try the other patches? Thanks. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 12:40 ` Tejun Heo @ 2008-03-04 12:45 ` Mike Galbraith 2008-03-04 13:30 ` FUJITA Tomonori 1 sibling, 0 replies; 109+ messages in thread From: Mike Galbraith @ 2008-03-04 12:45 UTC (permalink / raw) To: Tejun Heo Cc: Jens Axboe, FUJITA Tomonori, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, 2008-03-04 at 21:40 +0900, Tejun Heo wrote: > Aiee... device going down after timing out on READ_DISC_INFO. That's > gruesome. Can you please try the other patches? I tried your last yesterday, and k3b worked fine. -Mike ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 12:40 ` Tejun Heo 2008-03-04 12:45 ` Mike Galbraith @ 2008-03-04 13:30 ` FUJITA Tomonori 2008-03-04 13:50 ` Tejun Heo 1 sibling, 1 reply; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-04 13:30 UTC (permalink / raw) To: htejun Cc: efault, jens.axboe, fujita.tomonori, tomof, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, 04 Mar 2008 21:40:53 +0900 Tejun Heo <htejun@gmail.com> wrote: > Mike Galbraith wrote: > > On Tue, 2008-03-04 at 10:35 +0100, Jens Axboe wrote: > > > >> Looks excellent to me, has a variant of this been tested as OK by the > >> users reporting the regression? > > > > K3b burning seems to be a nogo here. This is git pulled this morning > > though, so it's a somewhat different tree than previously tested fwtw. > > > > [ 136.440021] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen > > [ 136.440043] ata1.01: cmd a0/00:00:00:00:00/00:00:00:00:00/b0 tag 0 > > [ 136.440045] cdb 51 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 > > [ 136.440047] res 58/00:02:00:02:00/00:00:00:00:00/b0 Emask 0x2 (HSM violation) > > [ 136.440053] ata1.01: status: { DRDY DRQ } > > [ 136.440086] ata1: soft resetting link > > [ 165.327627] ata1.01: qc timeout (cmd 0xa1) > > [ 165.327627] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4) > > [ 165.327627] ata1.01: revalidation failed (errno=-5) > > [ 165.327627] ata1: failed to recover some devices, retrying in 5 secs > > [ 177.272373] ata1: port is slow to respond, please be patient (Status 0x80) > > [ 180.388879] ata1: device not ready (errno=-16), forcing hardreset > > [ 180.388879] ata1: soft resetting link > > [ 210.832471] ata1.01: qc timeout (cmd 0xa1) > > [ 210.832471] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4) > > [ 210.832471] ata1.01: revalidation failed (errno=-5) > > [ 210.832471] ata1: failed to recover some devices, retrying in 5 secs > > [ 223.392899] ata1: port is slow to 
respond, please be patient (Status 0x80) > > [ 225.920376] ata1: device not ready (errno=-16), forcing hardreset > > [ 225.920376] ata1: soft resetting link > > [ 256.542565] ata1.01: qc timeout (cmd 0xa1) > > [ 256.542565] ata1.01: failed to IDENTIFY (I/O error, err_mask=0x4) > > [ 256.542565] ata1.01: revalidation failed (errno=-5) > > [ 256.542565] ata1.01: disabled > > Aiee... device going down after timing out on READ_DISC_INFO. That's > gruesome. Can you please try the other patches? Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No? Now Jens' git tree should work with all the non libata stuff, ide, firewire, bsg, etc. But I'm not sure about libata. ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 13:30 ` FUJITA Tomonori @ 2008-03-04 13:50 ` Tejun Heo 2008-03-04 16:17 ` Tejun Heo 0 siblings, 1 reply; 109+ messages in thread From: Tejun Heo @ 2008-03-04 13:50 UTC (permalink / raw) To: FUJITA Tomonori Cc: efault, jens.axboe, fujita.tomonori, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier FUJITA Tomonori wrote: >> Aiee... device going down after timing out on READ_DISC_INFO. That's >> gruesome. Can you please try the other patches? > > Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No? The extra_len you added to qc->nbytes should be it. The only other place to pay attention is the ATAPI transfer chunk size and your patch seems to get it right. > Now Jens' git tree should work with all the non libata stuff, ide, > firewire, bsg, etc. But I'm not sure about libata. With the second patch, all others should be fine no matter what. I'll go check libata part again. Thanks. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 13:50 ` Tejun Heo @ 2008-03-04 16:17 ` Tejun Heo 2008-03-04 16:42 ` Tejun Heo 0 siblings, 1 reply; 109+ messages in thread From: Tejun Heo @ 2008-03-04 16:17 UTC (permalink / raw) To: FUJITA Tomonori Cc: efault, jens.axboe, fujita.tomonori, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier Tejun Heo wrote: > FUJITA Tomonori wrote: >>> Aiee... device going down after timing out on READ_DISC_INFO. That's >>> gruesome. Can you please try the other patches? >> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No? > > The extra_len you added to qc->nbytes should be it. The only other > place to pay attention is the ATAPI transfer chunk size and your patch > seems to get it right. > >> Now Jens' git tree should work with all the non libata stuff, ide, >> firewire, bsg, etc. But I'm not sure about libata. > > With the second patch, all others should be fine no matter what. I'll > go check libata part again. I can reproduce the problem here and it's very weird. I'll report back when I know more. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 16:17 ` Tejun Heo @ 2008-03-04 16:42 ` Tejun Heo 2008-03-04 18:26 ` Boaz Harrosh ` (3 more replies) 0 siblings, 4 replies; 109+ messages in thread From: Tejun Heo @ 2008-03-04 16:42 UTC (permalink / raw) To: FUJITA Tomonori Cc: efault, jens.axboe, fujita.tomonori, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier Tejun Heo wrote: > Tejun Heo wrote: >> FUJITA Tomonori wrote: >>>> Aiee... device going down after timing out on READ_DISC_INFO. That's >>>> gruesome. Can you please try the other patches? >>> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No? >> The extra_len you added to qc->nbytes should be it. The only other >> place to pay attention is the ATAPI transfer chunk size and your patch >> seems to get it right. >> >>> Now Jens' git tree should work with all the non libata stuff, ide, >>> firewire, bsg, etc. But I'm not sure about libata. >> With the second patch, all others should be fine no matter what. I'll >> go check libata part again. > > I can reproduce the problem here and it's very weird. I'll report back > when I know more. Okay, I got it. Heh, it turns out SCSI and/or block layer is not ready for rq->data_len != sum(sg). When adjusted command completes, SCSI midlayer completes the command with rq->data_len for PC commands which eventually ends up in __end_that_request_first(). As there are extra sg area left after completing rq->data_len, blk layer says so to SCSI layer and SCSI layer retries the command only with the appended area. The following patch gets the writing going. I really think it's a serious mistake to break rq->data_len == sum(sg). If we break rq->data_len == requested size, the worst bugs are giving wrong size when issuing commands to application layer of devices which is relatively easy to spot and not all that common anyway. 
If we break rq->data_len == sum(sg), the bugs will be in internal mechanics, DMA engine programming and transport layer. Oh well...

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index fecba05..32439ac 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
 				"Notifying upper driver of completion "
 				"(result %x)\n", cmd->result));
 
-	good_bytes = scsi_bufflen(cmd);
+	good_bytes = scsi_bufflen(cmd) + cmd->request->data_len;
 	if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
 		drv = scsi_cmd_to_driver(cmd);
 		if (drv->done)

-- tejun ^ permalink raw reply related [flat|nested] 109+ messages in thread
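The retry described in this message can be reproduced in a toy model. The sketch below is not kernel code: struct toy_request and toy_end_request_first() are hypothetical stand-ins for struct request and __end_that_request_first(), and the only assumption carried over from the thread is that the mapped area covers data_len plus extra_len bytes while the completion is issued for data_len bytes.

```c
#include <assert.h>

/* Hypothetical stand-in for a block request; not the kernel's struct request. */
struct toy_request {
	unsigned int data_len;	/* bytes the submitter asked for */
	unsigned int extra_len;	/* padding/drain bytes appended to the sg list */
	unsigned int pending;	/* mapped bytes not yet completed */
};

static void toy_request_init(struct toy_request *rq,
			     unsigned int data_len, unsigned int extra_len)
{
	rq->data_len = data_len;
	rq->extra_len = extra_len;
	/* The mapped area covers the data plus the appended bytes. */
	rq->pending = data_len + extra_len;
}

/*
 * Toy analogue of __end_that_request_first(): retire nr_bytes of the
 * mapped area. Returns 0 if the request is finished, 1 if part of the
 * mapped area is still pending and the caller would have to reissue it.
 */
static int toy_end_request_first(struct toy_request *rq, unsigned int nr_bytes)
{
	assert(nr_bytes <= rq->pending);
	rq->pending -= nr_bytes;
	return rq->pending != 0;
}
```

With arbitrary sizes such as data_len = 32 and extra_len = 4, completing data_len bytes leaves the 4 appended bytes pending, which corresponds to the command being retried "only with the appended area". Completing data_len + extra_len bytes finishes the request.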
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 16:42 ` Tejun Heo @ 2008-03-04 18:26 ` Boaz Harrosh 2008-03-04 18:35 ` Tejun Heo 2008-03-04 18:27 ` James Bottomley ` (2 subsequent siblings) 3 siblings, 1 reply; 109+ messages in thread From: Boaz Harrosh @ 2008-03-04 18:26 UTC (permalink / raw) To: Tejun Heo Cc: FUJITA Tomonori, efault, jens.axboe, fujita.tomonori, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, Mar 04 2008 at 18:42 +0200, Tejun Heo <htejun@gmail.com> wrote: > Tejun Heo wrote: >> Tejun Heo wrote: >>> FUJITA Tomonori wrote: >>>>> Aiee... device going down after timing out on READ_DISC_INFO. That's >>>>> gruesome. Can you please try the other patches? >>>> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No? >>> The extra_len you added to qc->nbytes should be it. The only other >>> place to pay attention is the ATAPI transfer chunk size and your patch >>> seems to get it right. >>> >>>> Now Jens' git tree should work with all the non libata stuff, ide, >>>> firewire, bsg, etc. But I'm not sure about libata. >>> With the second patch, all others should be fine no matter what. I'll >>> go check libata part again. >> I can reproduce the problem here and it's very weird. I'll report back >> when I know more. > > Okay, I got it. Heh, it turns out SCSI and/or block layer is not > ready for rq->data_len != sum(sg). When adjusted command completes, > SCSI midlayer completes the command with rq->data_len for PC commands > which eventually ends up in __end_that_request_first(). As there are > extra sg area left after completing rq->data_len, blk layer says so to > SCSI layer and SCSI layer retries the command only with the appended > area. > > The following patch gets the writing going. I really think it's a > serious mistake to break rq->data_len == sum(sg). 
If we break > rq->data_len == requested size, the worst bugs are giving wrong size > when issuing commands to application layer of devices which is > relatively easy to spot and not all that command anyway. Breaking > rq->data_len == sum(sg), bugs will be in internal mechanics, DMA > engine programming and transport layer. Oh well... > > diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c > index fecba05..32439ac 100644 > --- a/drivers/scsi/scsi.c > +++ b/drivers/scsi/scsi.c > @@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd) > "Notifying upper driver of completion " > "(result %x)\n", cmd->result)); > > - good_bytes = scsi_bufflen(cmd); > + good_bytes = scsi_bufflen(cmd) + cmd->request->data_len; Are you sure? is it not: + good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len > if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) { > drv = scsi_cmd_to_driver(cmd); > if (drv->done) > > I hate this patch. I wish you could maybe take the extra_len into account inside blk_end_request. The padding should be transparent to all concerned but the requesting LLD and the internals of the block layer. If block layer added padding it should take that into account on completion. My $0.2. Boaz ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 18:26 ` Boaz Harrosh @ 2008-03-04 18:35 ` Tejun Heo 0 siblings, 0 replies; 109+ messages in thread From: Tejun Heo @ 2008-03-04 18:35 UTC (permalink / raw) To: Boaz Harrosh Cc: FUJITA Tomonori, efault, jens.axboe, fujita.tomonori, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier Boaz Harrosh wrote: >> - good_bytes = scsi_bufflen(cmd); >> + good_bytes = scsi_bufflen(cmd) + cmd->request->data_len; > > Are you sure? is it not: > + good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len You're right. Sorry about the confusion. >> if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) { >> drv = scsi_cmd_to_driver(cmd); >> if (drv->done) >> >> > > I hate this patch. I wish you could maybe take the extra_len into > account inside blk_end_request. The padding should be transparent > to all concerned but the requesting LLD and the internals of the > block layer. If block layer added padding it should take that into > account on completion. My $0.2. Yeah, I hate it too. As I've been saying all along, I think it just should be rq->data_len. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 16:42 ` Tejun Heo 2008-03-04 18:26 ` Boaz Harrosh @ 2008-03-04 18:27 ` James Bottomley 2008-03-04 18:33 ` Tejun Heo 2008-03-04 18:45 ` Mike Galbraith 2008-03-04 19:19 ` FUJITA Tomonori 3 siblings, 1 reply; 109+ messages in thread From: James Bottomley @ 2008-03-04 18:27 UTC (permalink / raw) To: Tejun Heo Cc: FUJITA Tomonori, efault, jens.axboe, fujita.tomonori, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Wed, 2008-03-05 at 01:42 +0900, Tejun Heo wrote: > Tejun Heo wrote: > > Tejun Heo wrote: > >> FUJITA Tomonori wrote: > >>>> Aiee... device going down after timing out on READ_DISC_INFO. That's > >>>> gruesome. Can you please try the other patches? > >>> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No? > >> The extra_len you added to qc->nbytes should be it. The only other > >> place to pay attention is the ATAPI transfer chunk size and your patch > >> seems to get it right. > >> > >>> Now Jens' git tree should work with all the non libata stuff, ide, > >>> firewire, bsg, etc. But I'm not sure about libata. > >> With the second patch, all others should be fine no matter what. I'll > >> go check libata part again. > > > > I can reproduce the problem here and it's very weird. I'll report back > > when I know more. > > Okay, I got it. Heh, it turns out SCSI and/or block layer is not > ready for rq->data_len != sum(sg). When adjusted command completes, > SCSI midlayer completes the command with rq->data_len for PC commands > which eventually ends up in __end_that_request_first(). As there are > extra sg area left after completing rq->data_len, blk layer says so to > SCSI layer and SCSI layer retries the command only with the appended > area. > > The following patch gets the writing going. I really think it's a > serious mistake to break rq->data_len == sum(sg). 
If we break > rq->data_len == requested size, the worst bugs are giving wrong size > when issuing commands to application layer of devices which is > relatively easy to spot and not all that command anyway. Breaking > rq->data_len == sum(sg), bugs will be in internal mechanics, DMA > engine programming and transport layer. Oh well... > > diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c > index fecba05..32439ac 100644 > --- a/drivers/scsi/scsi.c > +++ b/drivers/scsi/scsi.c > @@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd) > "Notifying upper driver of completion " > "(result %x)\n", cmd->result)); > > - good_bytes = scsi_bufflen(cmd); > + good_bytes = scsi_bufflen(cmd) + cmd->request->data_len; This doesn't look right. scsi_bufflen(cmd) is req->data_len for PC commands ... did you mean to add extra_len here? James ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 18:27 ` James Bottomley @ 2008-03-04 18:33 ` Tejun Heo 0 siblings, 0 replies; 109+ messages in thread From: Tejun Heo @ 2008-03-04 18:33 UTC (permalink / raw) To: James Bottomley Cc: FUJITA Tomonori, efault, jens.axboe, fujita.tomonori, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier James Bottomley wrote: > On Wed, 2008-03-05 at 01:42 +0900, Tejun Heo wrote: >> Tejun Heo wrote: >>> Tejun Heo wrote: >>>> FUJITA Tomonori wrote: >>>>>> Aiee... device going down after timing out on READ_DISC_INFO. That's >>>>>> gruesome. Can you please try the other patches? >>>>> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No? >>>> The extra_len you added to qc->nbytes should be it. The only other >>>> place to pay attention is the ATAPI transfer chunk size and your patch >>>> seems to get it right. >>>> >>>>> Now Jens' git tree should work with all the non libata stuff, ide, >>>>> firewire, bsg, etc. But I'm not sure about libata. >>>> With the second patch, all others should be fine no matter what. I'll >>>> go check libata part again. >>> I can reproduce the problem here and it's very weird. I'll report back >>> when I know more. >> Okay, I got it. Heh, it turns out SCSI and/or block layer is not >> ready for rq->data_len != sum(sg). When adjusted command completes, >> SCSI midlayer completes the command with rq->data_len for PC commands >> which eventually ends up in __end_that_request_first(). As there are >> extra sg area left after completing rq->data_len, blk layer says so to >> SCSI layer and SCSI layer retries the command only with the appended >> area. >> >> The following patch gets the writing going. I really think it's a >> serious mistake to break rq->data_len == sum(sg). 
If we break >> rq->data_len == requested size, the worst bugs are giving wrong size >> when issuing commands to application layer of devices which is >> relatively easy to spot and not all that command anyway. Breaking >> rq->data_len == sum(sg), bugs will be in internal mechanics, DMA >> engine programming and transport layer. Oh well... >> >> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c >> index fecba05..32439ac 100644 >> --- a/drivers/scsi/scsi.c >> +++ b/drivers/scsi/scsi.c >> @@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd) >> "Notifying upper driver of completion " >> "(result %x)\n", cmd->result)); >> >> - good_bytes = scsi_bufflen(cmd); >> + good_bytes = scsi_bufflen(cmd) + cmd->request->data_len; > > This doesn't look right. scsi_bufflen(cmd) is req->data_len for PC > commands ... did you mean to add extra_len here? Yeap, sorry about the confusion. Adding two times data_len accidentally worked tho. :-) -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 16:42 ` Tejun Heo 2008-03-04 18:26 ` Boaz Harrosh 2008-03-04 18:27 ` James Bottomley @ 2008-03-04 18:45 ` Mike Galbraith 2008-03-04 19:25 ` Jens Axboe 2008-03-04 19:19 ` FUJITA Tomonori 3 siblings, 1 reply; 109+ messages in thread From: Mike Galbraith @ 2008-03-04 18:45 UTC (permalink / raw) To: Tejun Heo Cc: FUJITA Tomonori, jens.axboe, fujita.tomonori, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Wed, 2008-03-05 at 01:42 +0900, Tejun Heo wrote: > The following patch gets the writing going. Bingo. -Mike ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 18:45 ` Mike Galbraith @ 2008-03-04 19:25 ` Jens Axboe 2008-03-04 19:33 ` Mike Galbraith 0 siblings, 1 reply; 109+ messages in thread From: Jens Axboe @ 2008-03-04 19:25 UTC (permalink / raw) To: Mike Galbraith Cc: Tejun Heo, FUJITA Tomonori, fujita.tomonori, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, Mar 04 2008, Mike Galbraith wrote: > > On Wed, 2008-03-05 at 01:42 +0900, Tejun Heo wrote: > > > The following patch gets the writing going. > > Bingo.

Pretty please test this on top of current -git? I'll merge this up, it should do the trick. Would just be nice if you could verify! :-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index fecba05..e5c6f6a 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd)
 				"Notifying upper driver of completion "
 				"(result %x)\n", cmd->result));
 
-	good_bytes = scsi_bufflen(cmd);
+	good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len;
 	if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) {
 		drv = scsi_cmd_to_driver(cmd);
 		if (drv->done)

-- Jens Axboe ^ permalink raw reply related [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 19:25 ` Jens Axboe @ 2008-03-04 19:33 ` Mike Galbraith 2008-03-04 19:34 ` Jens Axboe 0 siblings, 1 reply; 109+ messages in thread From: Mike Galbraith @ 2008-03-04 19:33 UTC (permalink / raw) To: Jens Axboe Cc: Tejun Heo, FUJITA Tomonori, fujita.tomonori, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, 2008-03-04 at 20:25 +0100, Jens Axboe wrote: > On Tue, Mar 04 2008, Mike Galbraith wrote: > > > > On Wed, 2008-03-05 at 01:42 +0900, Tejun Heo wrote: > > > > > The following patch gets the writing going. > > > > Bingo. > > Pretty please test this on top of current -git? ? That's the patch I just tested, and the tree. Oh.. 976dde0..87baa2b just a sec.... -Mike ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 19:33 ` Mike Galbraith @ 2008-03-04 19:34 ` Jens Axboe 0 siblings, 0 replies; 109+ messages in thread From: Jens Axboe @ 2008-03-04 19:34 UTC (permalink / raw) To: Mike Galbraith Cc: Tejun Heo, FUJITA Tomonori, fujita.tomonori, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Tue, Mar 04 2008, Mike Galbraith wrote: > > On Tue, 2008-03-04 at 20:25 +0100, Jens Axboe wrote: > > On Tue, Mar 04 2008, Mike Galbraith wrote: > > > > > > On Wed, 2008-03-05 at 01:42 +0900, Tejun Heo wrote: > > > > > > > The following patch gets the writing going. > > > > > > Bingo. > > > > Pretty please test this on top of current -git? > > ? That's the patch I just tested, and the tree. Oh.. 976dde0..87baa2b > just a sec.... Yeah it is, mid-air collision of emails. So just disregard this one! -- Jens Axboe ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 16:42 ` Tejun Heo ` (2 preceding siblings ...) 2008-03-04 18:45 ` Mike Galbraith @ 2008-03-04 19:19 ` FUJITA Tomonori 2008-03-04 23:33 ` Tejun Heo 3 siblings, 1 reply; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-04 19:19 UTC (permalink / raw) To: htejun Cc: tomof, efault, jens.axboe, fujita.tomonori, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier, fujita.tomonori On Wed, 05 Mar 2008 01:42:45 +0900 Tejun Heo <htejun@gmail.com> wrote: > Tejun Heo wrote: > > Tejun Heo wrote: > >> FUJITA Tomonori wrote: > >>>> Aiee... device going down after timing out on READ_DISC_INFO. That's > >>>> gruesome. Can you please try the other patches? > >>> Tejun, I thought that libata needs a fix for sum(sg) != rq->data_len. No? > >> The extra_len you added to qc->nbytes should be it. The only other > >> place to pay attention is the ATAPI transfer chunk size and your patch > >> seems to get it right. > >> > >>> Now Jens' git tree should work with all the non libata stuff, ide, > >>> firewire, bsg, etc. But I'm not sure about libata. > >> With the second patch, all others should be fine no matter what. I'll > >> go check libata part again. > > > > I can reproduce the problem here and it's very weird. I'll report back > > when I know more. > > Okay, I got it. Heh, it turns out SCSI and/or block layer is not > ready for rq->data_len != sum(sg). When adjusted command completes, > SCSI midlayer completes the command with rq->data_len for PC commands > which eventually ends up in __end_that_request_first(). As there are > extra sg area left after completing rq->data_len, blk layer says so to > SCSI layer and SCSI layer retries the command only with the appended > area. Ah, thanks! > The following patch gets the writing going. I really think it's a > serious mistake to break rq->data_len == sum(sg). 
If we break > rq->data_len == requested size, the worst bugs are giving wrong size > when issuing commands to application layer of devices which is > relatively easy to spot and not all that command anyway. Breaking > rq->data_len == sum(sg), bugs will be in internal mechanics, DMA > engine programming and transport layer. Oh well... > > diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c > index fecba05..32439ac 100644 > --- a/drivers/scsi/scsi.c > +++ b/drivers/scsi/scsi.c > @@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd) > "Notifying upper driver of completion " > "(result %x)\n", cmd->result)); > > - good_bytes = scsi_bufflen(cmd); > + good_bytes = scsi_bufflen(cmd) + cmd->request->data_len; > if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) { > drv = scsi_cmd_to_driver(cmd); > if (drv->done) > > Hmm, does SCSI mid-layer need to care about how many bytes the block layer allocates? I don't think that extra_len is NOT good_bytes. I think that the block layer had better take care about it (fix __end_that_request_first?). ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 19:19 ` FUJITA Tomonori @ 2008-03-04 23:33 ` Tejun Heo 2008-03-04 23:54 ` Tejun Heo 2008-03-05 0:26 ` FUJITA Tomonori 0 siblings, 2 replies; 109+ messages in thread From: Tejun Heo @ 2008-03-04 23:33 UTC (permalink / raw) To: FUJITA Tomonori Cc: efault, jens.axboe, fujita.tomonori, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier FUJITA Tomonori wrote: > Hmm, does SCSI mid-layer need to care about how many bytes the block > layer allocates? I don't think that extra_len is NOT good_bytes. > > I think that the block layer had better take care about it (fix > __end_that_request_first?). Yeah, probably calling completion functions w/o bytes count is the right thing to do, but what I was talking about was what could break when the semantics of rq->data_len changed. If we keep rq->data_len == sum(sg), we keep it business as usual for all the rest except for the device application layer; if we don't, we do the reverse, and SCSI midlayer completion was a good example, I think. Things going the other way is fine with me but I at least want to hear a valid rationale. Till now all I got is "because that's the true size" which doesn't really make much sense to me. Thanks. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 23:33 ` Tejun Heo @ 2008-03-04 23:54 ` Tejun Heo 2008-03-05 0:26 ` FUJITA Tomonori 1 sibling, 0 replies; 109+ messages in thread From: Tejun Heo @ 2008-03-04 23:54 UTC (permalink / raw) To: FUJITA Tomonori Cc: efault, jens.axboe, fujita.tomonori, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier Tejun Heo wrote: > FUJITA Tomonori wrote: >> Hmm, does SCSI mid-layer need to care about how many bytes the block >> layer allocates? I don't think that extra_len is NOT good_bytes. >> >> I think that the block layer had better take care about it (fix >> __end_that_request_first?). > > Yeah, probably calling completion functions w/o bytes count is the right > thing to do but what I was talking about was what could break when the > semantics of rq->data_len changed. If we keep rq->data_len() == > sum(sg), we keep it business as usual for all the rest except for the > device application layer if we don't we do the reverse and SCSI midlayer > completion was a good example, I think. > > Things going the other way is fine with me but I at least want to hear a > valid rationale. Till now all I got is "because that's the true size" > which doesn't really make much sense to me. I'm giving it another shot. When the padding / draining thing was in libata (or IDE, for that matter), the whole thing looked like this.

user - blk - SCSI - libata - LLD - controller - device
<---------------------><----------------------><----->
           a                      b               c

a: Uses the 'true' request size and matching sg
b: Requires adjusted request size and matching sg
c: Doesn't really care about sg, but sometimes needs the true size.

For anything which gets attached behind ATA and which may require padding, transfer size is sent in the CDB as well, which not all devices honor, and that's one of the reasons why size adjustment is necessary. If we move the adjustment to block layer and keep data_len == sum(sg), it looks like this.
user - blk - SCSI - libata - LLD - controller - device
<------><-------------------------------------><----->
   a                       b                      c

And a, b and c stay the same. If we keep the requested size in data_len, the whole of b gets inconsistent values in the middle while c gets the value it wants in data_len, so we're risking much more to keep the true size in rq->data_len when we could simply make it mean sum(sg). Before, the only thing which needed updating was to correctly determine the transfer size to feed to the device. Now we need to audit the whole of b. In addition, such adjustments are made only when the driver explicitly requested it, so for all others it doesn't really matter. Thanks. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 23:33 ` Tejun Heo 2008-03-04 23:54 ` Tejun Heo @ 2008-03-05 0:26 ` FUJITA Tomonori 2008-03-05 0:44 ` Tejun Heo 2008-03-05 10:16 ` [PATCH] blk: missing add of padded bytes to io completion byte count Boaz Harrosh 1 sibling, 2 replies; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-05 0:26 UTC (permalink / raw) To: htejun Cc: tomof, efault, jens.axboe, fujita.tomonori, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Wed, 05 Mar 2008 08:33:05 +0900 Tejun Heo <htejun@gmail.com> wrote: > FUJITA Tomonori wrote: > > Hmm, does SCSI mid-layer need to care about how many bytes the block > > layer allocates? I don't think that extra_len is NOT good_bytes. > > > > I think that the block layer had better take care about it (fix > > __end_that_request_first?). > > Yeah, probably calling completion functions w/o bytes count is the right > thing to do but what I was talking about was what could break when the > semantics of rq->data_len changed. If we keep rq->data_len() == > sum(sg), we keep it business as usual for all the rest except for the > device application layer if we don't we do the reverse and SCSI midlayer > completion was a good example, I think. sglist is a low-level I/O representation for device drivers. SCSI midlayer should not care about sglist. We should not fix SCSI midlayer for rq->data_len != sum(sg) change (so I can't agree with your diagrams in another mail). When we change a rule, we need to fix something. If we keep rq->data_len == sum(sg), we need to fix the device application layer. If we keep rq->data_len == the true data length, we need to fix the low-level drivers. Now I'm fine with the commit e97a294ef6938512b655b1abf17656cf2b26f709 since we are in -rc stages. But I plan to send a patch to revert it and fix this issue in the block layer. I'd like to test it in -mm for a while. The only sglist stuff in the SCSI midlayer is scsi_req_map_sg now.
As you know, we really want to remove it. > Things going the other way is fine with me but I at least want to hear a > valid rationale. Till now all I got is "because that's the true size" > which doesn't really make much sense to me. Most users of the request structure care only about the real data length and don't care about padding and drain length. Why should they have to bother with a helper function to get the real data length? ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-05 0:26 ` FUJITA Tomonori @ 2008-03-05 0:44 ` Tejun Heo 2008-03-06 4:56 ` FUJITA Tomonori 2008-03-05 10:16 ` [PATCH] blk: missing add of padded bytes to io completion byte count Boaz Harrosh 1 sibling, 1 reply; 109+ messages in thread From: Tejun Heo @ 2008-03-05 0:44 UTC (permalink / raw) To: FUJITA Tomonori Cc: tomof, efault, jens.axboe, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier Hello, FUJITA Tomonori wrote: > sglist is a low-level I/O representation for device drivers. SCSI > midlayer should not care about sglist. We should not fix SCSI midlayer > for rq->data_len != sum(sg) change (so I can't agree with your > diagrams in another mail). But that's not the way things currently are. > When if we change a rule, we need to fix something. > > If we keep rq->data_len == sum(sg), we need to fix the device > application layer. If we keep rq->data_len == the true data length, we > need to fix the low-level drivers. Basically everything under block layer. > Now I'm fine with the commit e97a294ef6938512b655b1abf17656cf2b26f709 > since we are in -rc stages. But I plan to send a patch to revert it > and fix this issue in the block layer. I'd like to test it in -mm for > a while. > > Only sglist stuff in SCSI midlayer is scsi_req_map_sg now. As you > know, we really want to remove it. If the way forward is to make anything but the low level drivers not care about sglist, in the long term, the current scheme is fine but I still don't think this way of doing things is safe one. We're affecting large portion of code based on what things should be in future not what they currently are. >> Things going the other way is fine with me but I at least want to hear a >> valid rationale. Till now all I got is "because that's the true size" >> which doesn't really make much sense to me. 
> > Most of users of request structure care about only the real data > length, don't care about padding and drain length. Why do they bother > to use a helper function to get the real data length? I think this is where the difference comes from. To me, internal usage seems more widespread and more delicate; not too many care about the true size, and when they do, only in well-defined places. Maybe it comes from the difference between your most and my most. Thanks. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-05 0:44 ` Tejun Heo @ 2008-03-06 4:56 ` FUJITA Tomonori 2008-03-06 5:02 ` Tejun Heo 0 siblings, 1 reply; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-06 4:56 UTC (permalink / raw) To: htejun Cc: fujita.tomonori, tomof, efault, jens.axboe, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Wed, 05 Mar 2008 09:44:01 +0900 Tejun Heo <htejun@gmail.com> wrote: > >> Things going the other way is fine with me but I at least want to hear a > >> valid rationale. Till now all I got is "because that's the true size" > >> which doesn't really make much sense to me. > > > > Most of users of request structure care about only the real data > > length, don't care about padding and drain length. Why do they bother > > to use a helper function to get the real data length? > > I think this is where the difference comes from. To me it seems > internal usage seems more wide-spread and more delicate and not too many > care about the true size and when they do only in well defined places. > Maybe it comes from the difference between your most and my most. I don't think that they do so only in well-defined places. If you look at the SCSI mid-layer (and LLDs), you can find several places that use rq->data_len as the true data length. Breaking rq->data_len == the true data length is theoretically wrong. Even if it affects only libata now, it will hurt us, I think. ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-06 4:56 ` FUJITA Tomonori @ 2008-03-06 5:02 ` Tejun Heo 0 siblings, 0 replies; 109+ messages in thread From: Tejun Heo @ 2008-03-06 5:02 UTC (permalink / raw) To: FUJITA Tomonori Cc: tomof, efault, jens.axboe, James.Bottomley, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier Hello, FUJITA. FUJITA Tomonori wrote: >>>> Things going the other way is fine with me but I at least want to hear a >>>> valid rationale. Till now all I got is "because that's the true size" >>>> which doesn't really make much sense to me. >>> Most of users of request structure care about only the real data >>> length, don't care about padding and drain length. Why do they bother >>> to use a helper function to get the real data length? >> I think this is where the difference comes from. To me it seems >> internal usage seems more wide-spread and more delicate and not too many >> care about the true size and when they do only in well defined places. >> Maybe it comes from the difference between your most and my most. > > I don't think that they only in well defined places. > > If you see scsi mid-layer (and LLDs), you can find several places that > use rq->data_len as the true data length. > > Breaking rq->data_len == the true data length theoretically > wrong. Even if it affects only libata now, it will hurt us, I think. Yeap, I fully agree it's much better not to break any of the two assumptions except when it's actually needed. Both padding and draining are requirements from low level driver which usually stems from hardware kinkiness, so adjusting sg and length there and let the rest of system not care about it sounds like a good idea to me. Maybe something good can come out of this long thread. :-) Thanks. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-05 0:26 ` FUJITA Tomonori 2008-03-05 0:44 ` Tejun Heo @ 2008-03-05 10:16 ` Boaz Harrosh 2008-03-05 12:28 ` Mike Galbraith ` (2 more replies) 1 sibling, 3 replies; 109+ messages in thread From: Boaz Harrosh @ 2008-03-05 10:16 UTC (permalink / raw) To: FUJITA Tomonori, Tejun Heo, Mike Galbraith, jens.axboe, James.Bottomley Cc: tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Wed, Mar 05 2008 at 2:26 +0200, FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote: > On Wed, 05 Mar 2008 08:33:05 +0900 > Tejun Heo <htejun@gmail.com> wrote: > >> FUJITA Tomonori wrote: >>> Hmm, does SCSI mid-layer need to care about how many bytes the block >>> layer allocates? I don't think that extra_len is NOT good_bytes. >>> >>> I think that the block layer had better take care about it (fix >>> __end_that_request_first?). >> Yeah, probably calling completion functions w/o bytes count is the right >> thing to do but what I was talking about was what could break when the >> semantics of rq->data_len changed. If we keep rq->data_len() == >> sum(sg), we keep it business as usual for all the rest except for the >> device application layer if we don't we do the reverse and SCSI midlayer >> completion was a good example, I think. > > sglist is a low-level I/O representation for device drivers. SCSI > midlayer should not care about sglist. We should not fix SCSI midlayer > for rq->data_len != sum(sg) change (so I can't agree with your > diagrams in another mail). > > When if we change a rule, we need to fix something. > > If we keep rq->data_len == sum(sg), we need to fix the device > application layer. If we keep rq->data_len == the true data length, we > need to fix the low-level drivers. > > Now I'm fine with the commit e97a294ef6938512b655b1abf17656cf2b26f709 > since we are in -rc stages. But I plan to send a patch to revert it > and fix this issue in the block layer. 
I'd like to test it in -mm for > a while. No, this commit is a serious bug, and the only fix is like you suggested in __end_that_request_first. This is because it breaks that scsi-ml loop where scsi_bufflen() can be less than blk_rq_bytes(). In that case this commit causes data corruption. > Only sglist stuff in SCSI midlayer is scsi_req_map_sg now. As you > know, we really want to remove it. > > >> Things going the other way is fine with me but I at least want to hear a >> valid rationale. Till now all I got is "because that's the true size" >> which doesn't really make much sense to me. > > Most of users of request structure care about only the real data > length, don't care about padding and drain length. Why do they bother > to use a helper function to get the real data length? > -- Submitted is the right fix to this problem, as pointed out by TOMO. Please test that it solves the CD burning problem. (The patch includes the revert of commit e97a294e) --- From: Boaz Harrosh <bharrosh@panasas.com> Date: Wed, 5 Mar 2008 12:07:12 +0200 Subject: [PATCH] blk: missing add of padded bytes to io completion byte count The commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is because scsi-ml supports the ability to split a request into smaller chunks, in which case scsi_bufflen() is smaller than the request length. Then at completion time the remainder can be issued as a new scsi command. In that case the above commit causes data corruption. Also, in this fix all users of the block layer are taken care of, and not only scsi devices.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> --- block/blk-core.c | 4 ++++ drivers/scsi/scsi.c | 2 +- 2 files changed, 5 insertions(+), 1 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 2a438a9..37fcccc 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error, nr_bytes >> 9, req->sector); } + if (nr_bytes >= blk_rq_bytes(req)) + nr_bytes += req->extra_len; + total_bytes = bio_nbytes = 0; while ((bio = req->bio) != NULL) { int nbytes; @@ -1616,6 +1619,7 @@ static int __end_that_request_first(struct request *req, int error, if (!req->bio) return 0; + BUG_ON(total_bytes >= blk_rq_bytes(req)); /* * if the request wasn't completed, update state */ diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index e5c6f6a..fecba05 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd) "Notifying upper driver of completion " "(result %x)\n", cmd->result)); - good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len; + good_bytes = scsi_bufflen(cmd); if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) { drv = scsi_cmd_to_driver(cmd); if (drv->done) -- 1.5.3.3 ^ permalink raw reply related [flat|nested] 109+ messages in thread
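[Editor's sketch] The accounting argument in the patch description above can be modelled in a few lines of userspace C. This is a simplified sketch, not kernel code: struct model_request, model_end_raw(), model_finish_old() and model_finish_fixed() are hypothetical names, and the model assumes a split request that carries a nonzero extra_len, which the thread does not confirm happens in practice. The "old" path adds extra_len to every completion, as commit e97a294e did in scsi_finish_command(); the "fixed" path adds it only when the completion covers the whole remaining request.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct request; not kernel code. */
struct model_request {
    unsigned int data_len;   /* true data bytes still outstanding */
    unsigned int extra_len;  /* padding/drain bytes added for the device */
};

/* Mirrors blk_rq_bytes() only in spirit: the true data length left. */
static unsigned int model_rq_bytes(const struct model_request *rq)
{
    assert(rq != NULL);
    return rq->data_len;
}

/*
 * Complete nr_bytes with no extra_len handling of its own.
 * Returns 1 if the request still has bytes pending, 0 if it is finished.
 */
static int model_end_raw(struct model_request *rq, unsigned int nr_bytes)
{
    unsigned int total = rq->data_len + rq->extra_len;

    if (nr_bytes >= total) {
        rq->data_len = 0;
        rq->extra_len = 0;
        return 0;
    }
    if (nr_bytes > rq->data_len) {
        /* data is consumed first, the rest eats into the extra bytes */
        rq->extra_len -= nr_bytes - rq->data_len;
        rq->data_len = 0;
    } else {
        rq->data_len -= nr_bytes;
    }
    return 1;
}

/* Old behaviour: the caller always adds extra_len to the good byte count. */
static int model_finish_old(struct model_request *rq, unsigned int bufflen)
{
    return model_end_raw(rq, bufflen + rq->extra_len);
}

/* Patched behaviour: extra_len is added only when the whole request is done. */
static int model_finish_fixed(struct model_request *rq, unsigned int bufflen)
{
    unsigned int nr_bytes = bufflen;

    if (nr_bytes >= model_rq_bytes(rq))
        nr_bytes += rq->extra_len;
    return model_end_raw(rq, nr_bytes);
}
```

With data_len = 8192, extra_len = 4 and a first command that carried 4096 bytes, the old path leaves 4092 data bytes outstanding, so 4 bytes are marked complete that were never transferred. The patched path leaves 4096 outstanding and finishes the request on the second 4096-byte completion.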
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-05 10:16 ` [PATCH] blk: missing add of padded bytes to io completion byte count Boaz Harrosh @ 2008-03-05 12:28 ` Mike Galbraith 2008-03-05 12:33 ` Jens Axboe 2008-03-06 5:02 ` FUJITA Tomonori 2 siblings, 0 replies; 109+ messages in thread From: Mike Galbraith @ 2008-03-05 12:28 UTC (permalink / raw) To: Boaz Harrosh Cc: FUJITA Tomonori, Tejun Heo, jens.axboe, James.Bottomley, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Wed, 2008-03-05 at 12:16 +0200, Boaz Harrosh wrote: > Please test it solves the CD burning problem. Works for me. -Mike > (The patch includes the revert of commit e97a294e) > --- > From: Boaz Harrosh <bharrosh@panasas.com> > Date: Wed, 5 Mar 2008 12:07:12 +0200 > Subject: [PATCH] blk: missing add of padded bytes to io completion byte count > > the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is > because scsi-ml supports the ability to split a request into smaller chunks, > in which case scsi_bufflen() is smaller then request length. Then at completion > time the remainder can be issued as a new scsi command. In that case the above > commit is a data corruption. > > Also in this fix all users of block layer are taken care of, and not only > scsi devices. 
> > Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> > Signed-off-by: Benny Halevy <bhalevy@panasas.com> > --- > block/blk-core.c | 4 ++++ > drivers/scsi/scsi.c | 2 +- > 2 files changed, 5 insertions(+), 1 deletions(-) > > diff --git a/block/blk-core.c b/block/blk-core.c > index 2a438a9..37fcccc 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error, > nr_bytes >> 9, req->sector); > } > > + if (nr_bytes >= blk_rq_bytes(req)) > + nr_bytes += req->extra_len; > + > total_bytes = bio_nbytes = 0; > while ((bio = req->bio) != NULL) { > int nbytes; > @@ -1616,6 +1619,7 @@ static int __end_that_request_first(struct request *req, int error, > if (!req->bio) > return 0; > > + BUG_ON(total_bytes >= blk_rq_bytes(req)); > /* > * if the request wasn't completed, update state > */ > diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c > index e5c6f6a..fecba05 100644 > --- a/drivers/scsi/scsi.c > +++ b/drivers/scsi/scsi.c > @@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd) > "Notifying upper driver of completion " > "(result %x)\n", cmd->result)); > > - good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len; > + good_bytes = scsi_bufflen(cmd); > if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) { > drv = scsi_cmd_to_driver(cmd); > if (drv->done) ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-05 10:16 ` [PATCH] blk: missing add of padded bytes to io completion byte count Boaz Harrosh 2008-03-05 12:28 ` Mike Galbraith @ 2008-03-05 12:33 ` Jens Axboe 2008-03-05 12:46 ` Boaz Harrosh 2008-03-06 5:02 ` FUJITA Tomonori 2 siblings, 1 reply; 109+ messages in thread From: Jens Axboe @ 2008-03-05 12:33 UTC (permalink / raw) To: Boaz Harrosh Cc: FUJITA Tomonori, Tejun Heo, Mike Galbraith, James.Bottomley, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Wed, Mar 05 2008, Boaz Harrosh wrote: > On Wed, Mar 05 2008 at 2:26 +0200, FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote: > > On Wed, 05 Mar 2008 08:33:05 +0900 > > Tejun Heo <htejun@gmail.com> wrote: > > > >> FUJITA Tomonori wrote: > >>> Hmm, does SCSI mid-layer need to care about how many bytes the block > >>> layer allocates? I don't think that extra_len is NOT good_bytes. > >>> > >>> I think that the block layer had better take care about it (fix > >>> __end_that_request_first?). > >> Yeah, probably calling completion functions w/o bytes count is the right > >> thing to do but what I was talking about was what could break when the > >> semantics of rq->data_len changed. If we keep rq->data_len() == > >> sum(sg), we keep it business as usual for all the rest except for the > >> device application layer if we don't we do the reverse and SCSI midlayer > >> completion was a good example, I think. > > > > sglist is a low-level I/O representation for device drivers. SCSI > > midlayer should not care about sglist. We should not fix SCSI midlayer > > for rq->data_len != sum(sg) change (so I can't agree with your > > diagrams in another mail). > > > > When if we change a rule, we need to fix something. > > > > If we keep rq->data_len == sum(sg), we need to fix the device > > application layer. If we keep rq->data_len == the true data length, we > > need to fix the low-level drivers. 
> > > > Now I'm fine with the commit e97a294ef6938512b655b1abf17656cf2b26f709 > > since we are in -rc stages. But I plan to send a patch to revert it > > and fix this issue in the block layer. I'd like to test it in -mm for > > a while. > > No this commit is a serious bug, and the only fix is like you suggested > in __end_that_request_first. This is because it breaks that scsi-ml loop > where scsi_bufflen() can be less then blk_rq_bytes(). In that case this > commit is a data corruption. > > > Only sglist stuff in SCSI midlayer is scsi_req_map_sg now. As you > > know, we really want to remove it. > > > > > >> Things going the other way is fine with me but I at least want to hear a > >> valid rationale. Till now all I got is "because that's the true size" > >> which doesn't really make much sense to me. > > > > Most of users of request structure care about only the real data > > length, don't care about padding and drain length. Why do they bother > > to use a helper function to get the real data length? > > -- > > Submitted is the right fix to this problem, as pointed out by TOMO. > Please test it solves the CD burning problem. > (The patch includes the revert of commit e97a294e) > --- > From: Boaz Harrosh <bharrosh@panasas.com> > Date: Wed, 5 Mar 2008 12:07:12 +0200 > Subject: [PATCH] blk: missing add of padded bytes to io completion byte count > > the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is > because scsi-ml supports the ability to split a request into smaller chunks, > in which case scsi_bufflen() is smaller then request length. Then at completion > time the remainder can be issued as a new scsi command. In that case the above > commit is a data corruption. We needed something for -rc4, so it had to be rushed a bit... > Also in this fix all users of block layer are taken care of, and not only > scsi devices. 
> > Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> > Signed-off-by: Benny Halevy <bhalevy@panasas.com> > --- > block/blk-core.c | 4 ++++ > drivers/scsi/scsi.c | 2 +- > 2 files changed, 5 insertions(+), 1 deletions(-) > > diff --git a/block/blk-core.c b/block/blk-core.c > index 2a438a9..37fcccc 100644 > --- a/block/blk-core.c > +++ b/block/blk-core.c > @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error, > nr_bytes >> 9, req->sector); > } > > + if (nr_bytes >= blk_rq_bytes(req)) > + nr_bytes += req->extra_len; > + > total_bytes = bio_nbytes = 0; > while ((bio = req->bio) != NULL) { > int nbytes; > @@ -1616,6 +1619,7 @@ static int __end_that_request_first(struct request *req, int error, > if (!req->bio) > return 0; > > + BUG_ON(total_bytes >= blk_rq_bytes(req)); Make that a WARN_ON() first please. It's indeed a bug, but it wont be critical and it's not fair killing everything since this padding stuff is so fresh and may still need a tweak or two. I'd be fine with making it a BUG_ON() post 2.6.25. -- Jens Axboe ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-05 12:33 ` Jens Axboe @ 2008-03-05 12:46 ` Boaz Harrosh 2008-03-05 12:48 ` Jens Axboe 0 siblings, 1 reply; 109+ messages in thread From: Boaz Harrosh @ 2008-03-05 12:46 UTC (permalink / raw) To: Jens Axboe Cc: FUJITA Tomonori, Tejun Heo, Mike Galbraith, James.Bottomley, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Wed, Mar 05 2008 at 14:33 +0200, Jens Axboe <jens.axboe@oracle.com> wrote: > On Wed, Mar 05 2008, Boaz Harrosh wrote: >> On Wed, Mar 05 2008 at 2:26 +0200, FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote: >>> On Wed, 05 Mar 2008 08:33:05 +0900 >>> Tejun Heo <htejun@gmail.com> wrote: >>> >>>> FUJITA Tomonori wrote: >>>>> Hmm, does SCSI mid-layer need to care about how many bytes the block >>>>> layer allocates? I don't think that extra_len is NOT good_bytes. >>>>> >>>>> I think that the block layer had better take care about it (fix >>>>> __end_that_request_first?). >>>> Yeah, probably calling completion functions w/o bytes count is the right >>>> thing to do but what I was talking about was what could break when the >>>> semantics of rq->data_len changed. If we keep rq->data_len() == >>>> sum(sg), we keep it business as usual for all the rest except for the >>>> device application layer if we don't we do the reverse and SCSI midlayer >>>> completion was a good example, I think. >>> sglist is a low-level I/O representation for device drivers. SCSI >>> midlayer should not care about sglist. We should not fix SCSI midlayer >>> for rq->data_len != sum(sg) change (so I can't agree with your >>> diagrams in another mail). >>> >>> When if we change a rule, we need to fix something. >>> >>> If we keep rq->data_len == sum(sg), we need to fix the device >>> application layer. If we keep rq->data_len == the true data length, we >>> need to fix the low-level drivers. 
>>> >>> Now I'm fine with the commit e97a294ef6938512b655b1abf17656cf2b26f709 >>> since we are in -rc stages. But I plan to send a patch to revert it >>> and fix this issue in the block layer. I'd like to test it in -mm for >>> a while. >> No this commit is a serious bug, and the only fix is like you suggested >> in __end_that_request_first. This is because it breaks that scsi-ml loop >> where scsi_bufflen() can be less then blk_rq_bytes(). In that case this >> commit is a data corruption. >> >>> Only sglist stuff in SCSI midlayer is scsi_req_map_sg now. As you >>> know, we really want to remove it. >>> >>> >>>> Things going the other way is fine with me but I at least want to hear a >>>> valid rationale. Till now all I got is "because that's the true size" >>>> which doesn't really make much sense to me. >>> Most of users of request structure care about only the real data >>> length, don't care about padding and drain length. Why do they bother >>> to use a helper function to get the real data length? >>> -- >> Submitted is the right fix to this problem, as pointed out by TOMO. >> Please test it solves the CD burning problem. >> (The patch includes the revert of commit e97a294e) >> --- >> From: Boaz Harrosh <bharrosh@panasas.com> >> Date: Wed, 5 Mar 2008 12:07:12 +0200 >> Subject: [PATCH] blk: missing add of padded bytes to io completion byte count >> >> the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is >> because scsi-ml supports the ability to split a request into smaller chunks, >> in which case scsi_bufflen() is smaller then request length. Then at completion >> time the remainder can be issued as a new scsi command. In that case the above >> commit is a data corruption. > > We needed something for -rc4, so it had to be rushed a bit... > >> Also in this fix all users of block layer are taken care of, and not only >> scsi devices. 
>> >> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> >> Signed-off-by: Benny Halevy <bhalevy@panasas.com> >> --- >> block/blk-core.c | 4 ++++ >> drivers/scsi/scsi.c | 2 +- >> 2 files changed, 5 insertions(+), 1 deletions(-) >> >> diff --git a/block/blk-core.c b/block/blk-core.c >> index 2a438a9..37fcccc 100644 >> --- a/block/blk-core.c >> +++ b/block/blk-core.c >> @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error, >> nr_bytes >> 9, req->sector); >> } >> >> + if (nr_bytes >= blk_rq_bytes(req)) >> + nr_bytes += req->extra_len; >> + >> total_bytes = bio_nbytes = 0; >> while ((bio = req->bio) != NULL) { >> int nbytes; >> @@ -1616,6 +1619,7 @@ static int __end_that_request_first(struct request *req, int error, >> if (!req->bio) >> return 0; >> >> + BUG_ON(total_bytes >= blk_rq_bytes(req)); > > Make that a WARN_ON() first please. It's indeed a bug, but it wont be > critical and it's not fair killing everything since this padding stuff > is so fresh and may still need a tweak or two. > > I'd be fine with making it a BUG_ON() post 2.6.25. > Updated, you are absolutely right, thanks. Will you commit below patch for 2.6.25? I know that, at the time, I have seen this scsi-ml-loop in action on a sata drive here in the lab, on an x86_64 machine. The current solution will silently corrupt data, which is very hard to find. Boaz --- From: Boaz Harrosh <bharrosh@panasas.com> Date: Wed, 5 Mar 2008 12:07:12 +0200 Subject: [PATCH] blk: missing add of padded bytes to io completion byte count the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is because scsi-ml supports the ability to split a request into smaller chunks, in which case scsi_bufflen() is smaller then request length. Then at completion time the remainder can be issued as a new scsi command. In that case the above commit is a data corruption. Also in this fix all users of block layer are taken care of, and not only scsi devices. 
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> --- block/blk-core.c | 4 ++++ drivers/scsi/scsi.c | 2 +- 2 files changed, 5 insertions(+), 1 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 2a438a9..c82e68a 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error, nr_bytes >> 9, req->sector); } + if (nr_bytes >= blk_rq_bytes(req)) + nr_bytes += req->extra_len; + total_bytes = bio_nbytes = 0; while ((bio = req->bio) != NULL) { int nbytes; @@ -1616,6 +1619,7 @@ static int __end_that_request_first(struct request *req, int error, if (!req->bio) return 0; + WARN_ON(total_bytes >= blk_rq_bytes(req)); /* * if the request wasn't completed, update state */ diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index e5c6f6a..fecba05 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd) "Notifying upper driver of completion " "(result %x)\n", cmd->result)); - good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len; + good_bytes = scsi_bufflen(cmd); if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) { drv = scsi_cmd_to_driver(cmd); if (drv->done) -- 1.5.3.3 ^ permalink raw reply related [flat|nested] 109+ messages in thread
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-05 12:46 ` Boaz Harrosh @ 2008-03-05 12:48 ` Jens Axboe 2008-03-05 13:45 ` Tejun Heo 0 siblings, 1 reply; 109+ messages in thread From: Jens Axboe @ 2008-03-05 12:48 UTC (permalink / raw) To: Boaz Harrosh Cc: FUJITA Tomonori, Tejun Heo, Mike Galbraith, James.Bottomley, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Wed, Mar 05 2008, Boaz Harrosh wrote: > On Wed, Mar 05 2008 at 14:33 +0200, Jens Axboe <jens.axboe@oracle.com> wrote: > > On Wed, Mar 05 2008, Boaz Harrosh wrote: > >> On Wed, Mar 05 2008 at 2:26 +0200, FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote: > >>> On Wed, 05 Mar 2008 08:33:05 +0900 > >>> Tejun Heo <htejun@gmail.com> wrote: > >>> > >>>> FUJITA Tomonori wrote: > >>>>> Hmm, does SCSI mid-layer need to care about how many bytes the block > >>>>> layer allocates? I don't think that extra_len is NOT good_bytes. > >>>>> > >>>>> I think that the block layer had better take care about it (fix > >>>>> __end_that_request_first?). > >>>> Yeah, probably calling completion functions w/o bytes count is the right > >>>> thing to do but what I was talking about was what could break when the > >>>> semantics of rq->data_len changed. If we keep rq->data_len() == > >>>> sum(sg), we keep it business as usual for all the rest except for the > >>>> device application layer if we don't we do the reverse and SCSI midlayer > >>>> completion was a good example, I think. > >>> sglist is a low-level I/O representation for device drivers. SCSI > >>> midlayer should not care about sglist. We should not fix SCSI midlayer > >>> for rq->data_len != sum(sg) change (so I can't agree with your > >>> diagrams in another mail). > >>> > >>> When if we change a rule, we need to fix something. > >>> > >>> If we keep rq->data_len == sum(sg), we need to fix the device > >>> application layer. 
If we keep rq->data_len == the true data length, we > >>> need to fix the low-level drivers. > >>> > >>> Now I'm fine with the commit e97a294ef6938512b655b1abf17656cf2b26f709 > >>> since we are in -rc stages. But I plan to send a patch to revert it > >>> and fix this issue in the block layer. I'd like to test it in -mm for > >>> a while. > >> No this commit is a serious bug, and the only fix is like you suggested > >> in __end_that_request_first. This is because it breaks that scsi-ml loop > >> where scsi_bufflen() can be less then blk_rq_bytes(). In that case this > >> commit is a data corruption. > >> > >>> Only sglist stuff in SCSI midlayer is scsi_req_map_sg now. As you > >>> know, we really want to remove it. > >>> > >>> > >>>> Things going the other way is fine with me but I at least want to hear a > >>>> valid rationale. Till now all I got is "because that's the true size" > >>>> which doesn't really make much sense to me. > >>> Most of users of request structure care about only the real data > >>> length, don't care about padding and drain length. Why do they bother > >>> to use a helper function to get the real data length? > >>> -- > >> Submitted is the right fix to this problem, as pointed out by TOMO. > >> Please test it solves the CD burning problem. > >> (The patch includes the revert of commit e97a294e) > >> --- > >> From: Boaz Harrosh <bharrosh@panasas.com> > >> Date: Wed, 5 Mar 2008 12:07:12 +0200 > >> Subject: [PATCH] blk: missing add of padded bytes to io completion byte count > >> > >> the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is > >> because scsi-ml supports the ability to split a request into smaller chunks, > >> in which case scsi_bufflen() is smaller then request length. Then at completion > >> time the remainder can be issued as a new scsi command. In that case the above > >> commit is a data corruption. > > > > We needed something for -rc4, so it had to be rushed a bit... 
> > > >> Also in this fix all users of block layer are taken care of, and not only > >> scsi devices. > >> > >> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> > >> Signed-off-by: Benny Halevy <bhalevy@panasas.com> > >> --- > >> block/blk-core.c | 4 ++++ > >> drivers/scsi/scsi.c | 2 +- > >> 2 files changed, 5 insertions(+), 1 deletions(-) > >> > >> diff --git a/block/blk-core.c b/block/blk-core.c > >> index 2a438a9..37fcccc 100644 > >> --- a/block/blk-core.c > >> +++ b/block/blk-core.c > >> @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error, > >> nr_bytes >> 9, req->sector); > >> } > >> > >> + if (nr_bytes >= blk_rq_bytes(req)) > >> + nr_bytes += req->extra_len; > >> + > >> total_bytes = bio_nbytes = 0; > >> while ((bio = req->bio) != NULL) { > >> int nbytes; > >> @@ -1616,6 +1619,7 @@ static int __end_that_request_first(struct request *req, int error, > >> if (!req->bio) > >> return 0; > >> > >> + BUG_ON(total_bytes >= blk_rq_bytes(req)); > > > > Make that a WARN_ON() first please. It's indeed a bug, but it wont be > > critical and it's not fair killing everything since this padding stuff > > is so fresh and may still need a tweak or two. > > > > I'd be fine with making it a BUG_ON() post 2.6.25. > > > Updated, you are absolutely right, thanks. > > Will you commit below patch for 2.6.25? I know that, at the time, I have > seen this scsi-ml-loop in action on a sata drive here in the lab, on an > x86_64 machine. The current solution will silently corrupt data, which > is very hard to find. Yes, was just hoping you'd resend with the above corrected, so thanks! I'll add it to the pending queue for 2.6.25. -- Jens Axboe ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-05 12:48 ` Jens Axboe @ 2008-03-05 13:45 ` Tejun Heo 2008-03-05 13:51 ` Jens Axboe 2008-03-05 14:46 ` Boaz Harrosh 0 siblings, 2 replies; 109+ messages in thread From: Tejun Heo @ 2008-03-05 13:45 UTC (permalink / raw) To: Jens Axboe Cc: Boaz Harrosh, FUJITA Tomonori, Mike Galbraith, James.Bottomley, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier Hello, Jens, Boaz. Jens Axboe wrote: >>>> From: Boaz Harrosh <bharrosh@panasas.com> >>>> Date: Wed, 5 Mar 2008 12:07:12 +0200 >>>> Subject: [PATCH] blk: missing add of padded bytes to io completion byte count >>>> >>>> the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is >>>> because scsi-ml supports the ability to split a request into smaller chunks, >>>> in which case scsi_bufflen() is smaller then request length. Then at completion >>>> time the remainder can be issued as a new scsi command. In that case the above >>>> commit is a data corruption. Thanks for catching the stupidity. Did it actually happen? PC commands are not completed in pieces and padding / draining should only happen for those. qc->extra_len should be zero where commands can be splitted for all current cases. >>> We needed something for -rc4, so it had to be rushed a bit... >>> >>>> Also in this fix all users of block layer are taken care of, and not only >>>> scsi devices. 
>>>> >>>> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> >>>> Signed-off-by: Benny Halevy <bhalevy@panasas.com> >>>> --- >>>> block/blk-core.c | 4 ++++ >>>> drivers/scsi/scsi.c | 2 +- >>>> 2 files changed, 5 insertions(+), 1 deletions(-) >>>> >>>> diff --git a/block/blk-core.c b/block/blk-core.c >>>> index 2a438a9..37fcccc 100644 >>>> --- a/block/blk-core.c >>>> +++ b/block/blk-core.c >>>> @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error, >>>> nr_bytes >> 9, req->sector); >>>> } >>>> >>>> + if (nr_bytes >= blk_rq_bytes(req)) >>>> + nr_bytes += req->extra_len; >>>> + This is getting insanely subtle. Let's say there's PIO driver which transfer certain sized chunks at a time and completes request partially after completing each chunk and the driver uses draining to eat up whatever excess data, which seems like a legit use case to me. But it won't work because __end_that_request_first() will terminate when it reaches reaches the 'true' transfer size. That's just broken API. FWIW, Nacked-by: Tejun Heo <htejun@gmail.com> -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-05 13:45 ` Tejun Heo @ 2008-03-05 13:51 ` Jens Axboe 2008-03-05 14:08 ` Tejun Heo 2008-03-05 15:21 ` James Bottomley 2008-03-05 14:46 ` Boaz Harrosh 1 sibling, 2 replies; 109+ messages in thread From: Jens Axboe @ 2008-03-05 13:51 UTC (permalink / raw) To: Tejun Heo Cc: Boaz Harrosh, FUJITA Tomonori, Mike Galbraith, James.Bottomley, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Wed, Mar 05 2008, Tejun Heo wrote: > Hello, Jens, Boaz. > > Jens Axboe wrote: > >>>> From: Boaz Harrosh <bharrosh@panasas.com> > >>>> Date: Wed, 5 Mar 2008 12:07:12 +0200 > >>>> Subject: [PATCH] blk: missing add of padded bytes to io completion byte count > >>>> > >>>> the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is > >>>> because scsi-ml supports the ability to split a request into smaller chunks, > >>>> in which case scsi_bufflen() is smaller then request length. Then at completion > >>>> time the remainder can be issued as a new scsi command. In that case the above > >>>> commit is a data corruption. > > Thanks for catching the stupidity. Did it actually happen? PC commands > are not completed in pieces and padding / draining should only happen > for those. qc->extra_len should be zero where commands can be splitted > for all current cases. > > >>> We needed something for -rc4, so it had to be rushed a bit... > >>> > >>>> Also in this fix all users of block layer are taken care of, and not only > >>>> scsi devices. 
> >>>> > >>>> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> > >>>> Signed-off-by: Benny Halevy <bhalevy@panasas.com> > >>>> --- > >>>> block/blk-core.c | 4 ++++ > >>>> drivers/scsi/scsi.c | 2 +- > >>>> 2 files changed, 5 insertions(+), 1 deletions(-) > >>>> > >>>> diff --git a/block/blk-core.c b/block/blk-core.c > >>>> index 2a438a9..37fcccc 100644 > >>>> --- a/block/blk-core.c > >>>> +++ b/block/blk-core.c > >>>> @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error, > >>>> nr_bytes >> 9, req->sector); > >>>> } > >>>> > >>>> + if (nr_bytes >= blk_rq_bytes(req)) > >>>> + nr_bytes += req->extra_len; > >>>> + > > This is getting insanely subtle. Let's say there's PIO driver which > transfer certain sized chunks at a time and completes request partially > after completing each chunk and the driver uses draining to eat up > whatever excess data, which seems like a legit use case to me. But it > won't work because __end_that_request_first() will terminate when it > reaches reaches the 'true' transfer size. That's just broken API. FWIW, > > Nacked-by: Tejun Heo <htejun@gmail.com> Yeah, I think I may have gone a bit overboard in applying this so quickly. It's just not a good interface, silently adding the extra length if asked to complete more. It may even happen right now, for a driver that does no padding (it probably wont do any harm here either, but still). I'll try and see if I can come up with something cleaner. My basic design paradigm for this is that the _driver_ (or mid layer, if SCSI wants to handle it) should care about the padding. So make it easy for them to pad, but have it 'unrolled' by completion time. We should NOT need any extra_len checks or additions in the block/ directory, period. -- Jens Axboe ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-05 13:51 ` Jens Axboe @ 2008-03-05 14:08 ` Tejun Heo 2008-03-05 15:21 ` James Bottomley 1 sibling, 0 replies; 109+ messages in thread From: Tejun Heo @ 2008-03-05 14:08 UTC (permalink / raw) To: Jens Axboe Cc: Boaz Harrosh, FUJITA Tomonori, Mike Galbraith, James.Bottomley, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier Hello, Jens. Jens Axboe wrote: >> This is getting insanely subtle. Let's say there's PIO driver which >> transfer certain sized chunks at a time and completes request partially >> after completing each chunk and the driver uses draining to eat up >> whatever excess data, which seems like a legit use case to me. But it >> won't work because __end_that_request_first() will terminate when it >> reaches reaches the 'true' transfer size. That's just broken API. FWIW, >> >> Nacked-by: Tejun Heo <htejun@gmail.com> > > Yeah, I think I may have gone a bit overboard in applying this so > quickly. It's just not a good interface, silently adding the extra > length if asked to complete more. It may even happen right now, for a > driver that does no padding (it probably wont do any harm here either, > but still). Unless it explicitly requests padding, it shouldn't be a problem extra_len will always be zero and currently the only driver which uses padding and draining is libata. > I'll try and see if I can come up with something cleaner. > > My basic design paradigm for this is that the _driver_ (or mid layer, if > SCSI wants to handle it) should care about the padding. So make it easy > for them to pad, but have it 'unrolled' by completion time. We should > NOT need any extra_len checks or additions in the block/ directory, > period. Maybe I'm from Mars but I don't really understand all this fuss. 
The two patches I posted way back work perfectly fine and don't have any of these problems and, as I have said again and again, that's because they don't break the assumption which our internal mechanics depend on. Can you please put the "true" size aside for a while and consider those patches? There's nothing fundamentally wrong with letting rq->data_len be sum(sg), which can differ from the user-requested data length if and only if the low-level driver requests so. If you can come up with something nicer, that will be great too but I really don't think the current scheme will work. Thanks. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
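[Editor's sketch] One reading of the PIO objection quoted above, as a small userspace model. Everything here is hypothetical (struct pio_request, pio_complete(), pio_run()); it only imitates the patched rule "if nr_bytes covers the remaining true length, fold in extra_len and finish", and assumes the true length left is decremented on each partial completion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a request seen by a chunked PIO driver. */
struct pio_request {
    unsigned int data_left;  /* true data bytes not yet completed */
    unsigned int extra_len;  /* drain bytes the driver still expects to eat */
};

/*
 * Models the patched completion rule.
 * Returns 1 while the request is still pending, 0 once it is finished.
 */
static int pio_complete(struct pio_request *rq, unsigned int nr_bytes)
{
    assert(rq != NULL);
    if (nr_bytes >= rq->data_left) {
        /* extra_len is folded in silently and the request ends here */
        rq->data_left = 0;
        rq->extra_len = 0;
        return 0;
    }
    rq->data_left -= nr_bytes;
    return 1;
}

/*
 * Driver loop: reports `chunk` bytes per step until the request ends.
 * Returns the number of bytes the driver got to report.
 */
static unsigned int pio_run(struct pio_request *rq, unsigned int chunk)
{
    unsigned int reported = 0;
    int pending = 1;

    assert(chunk > 0);
    while (pending) {
        pending = pio_complete(rq, chunk);
        reported += chunk;
    }
    return reported;
}
```

For 1024 data bytes, 512 drain bytes and 512-byte chunks, the request is finished after the driver has reported 1024 bytes, before it has had a chance to report the drain chunk. That matches the complaint that completion terminates at the 'true' transfer size.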
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-05 13:51 ` Jens Axboe 2008-03-05 14:08 ` Tejun Heo @ 2008-03-05 15:21 ` James Bottomley 2008-03-06 4:41 ` FUJITA Tomonori 1 sibling, 1 reply; 109+ messages in thread From: James Bottomley @ 2008-03-05 15:21 UTC (permalink / raw) To: Jens Axboe Cc: Tejun Heo, Boaz Harrosh, FUJITA Tomonori, Mike Galbraith, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Wed, 2008-03-05 at 14:51 +0100, Jens Axboe wrote: > On Wed, Mar 05 2008, Tejun Heo wrote: > > This is getting insanely subtle. Let's say there's PIO driver which > > transfer certain sized chunks at a time and completes request partially > > after completing each chunk and the driver uses draining to eat up > > whatever excess data, which seems like a legit use case to me. But it > > won't work because __end_that_request_first() will terminate when it > > reaches reaches the 'true' transfer size. That's just broken API. FWIW, > > > > Nacked-by: Tejun Heo <htejun@gmail.com> > > Yeah, I think I may have gone a bit overboard in applying this so > quickly. It's just not a good interface, silently adding the extra > length if asked to complete more. It may even happen right now, for a > driver that does no padding (it probably wont do any harm here either, > but still). > > I'll try and see if I can come up with something cleaner. > > My basic design paradigm for this is that the _driver_ (or mid layer, if > SCSI wants to handle it) should care about the padding. So make it easy > for them to pad, but have it 'unrolled' by completion time. We should > NOT need any extra_len checks or additions in the block/ directory, > period. Right, that's why my original proposal was to do nothing for padding (other than ensure the driver could adjust the length if it wanted to) and to add an extra element always for draining, which the driver could ignore. It basically pushed the use paradigm onto the driver. 
If we want the use paradigm shared between block and driver, then I think the best approach is to keep all the bios the same (so not adjust for padding), but do adjust in the blk_rq_map_sg(). That way we have the padding and draining unwind information by comparing with the bio. For passing on to the driver: req->data_len still needs to be the input (bio) length. req->extra_len can record how much padding and draining was added. The completion length also needs to be in terms of the true (bio) length. Now, here's the subtlety. Because of the way transfers work, we expect the padded length not to contribute to overrun (because it represents transfers that were successfully completed at the correct length), but we *do* expect drain usage to be recorded as overrun. However, if we keep the bios intact, we have all the information to make this determination in the block layer at completion time, with the expectation that the lower layers report the exact amount they transferred. James ^ permalink raw reply [flat|nested] 109+ messages in thread
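A minimal sketch of the accounting James describes above, assuming a toy model rather than the kernel's struct request: the struct xfer_report type and both helper names are hypothetical and exist only for illustration. Padding bytes are treated as part of a correctly sized transfer, while anything that lands in the drain area is reported as overrun.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; these are NOT kernel structures or kernel APIs. */
struct xfer_report {
    size_t data_len;   /* true (bio) length requested by the caller */
    size_t pad_len;    /* bytes appended to satisfy the DMA padding mask */
    size_t drain_len;  /* size of the drain buffer appended to the sg list */
};

/* Bytes to credit against the bios: never more than the true length. */
static size_t completed_bytes(const struct xfer_report *r, size_t transferred)
{
    return transferred < r->data_len ? transferred : r->data_len;
}

/*
 * Overrun as described in the mail: padding does not count, but any
 * byte that spills past data + padding into the drain buffer does.
 */
static size_t overrun_bytes(const struct xfer_report *r, size_t transferred)
{
    size_t limit = r->data_len + r->pad_len;

    /* The device cannot move more than the sg list can hold. */
    assert(transferred <= limit + r->drain_len);
    return transferred > limit ? transferred - limit : 0;
}
```

With a 14-byte request padded by 2 bytes, a device that reports 16 bytes transferred completes all 14 true bytes with no overrun; a report of 20 bytes yields 4 bytes of overrun absorbed by the drain buffer.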
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-05 15:21 ` James Bottomley @ 2008-03-06 4:41 ` FUJITA Tomonori 2008-03-06 13:41 ` Jens Axboe 0 siblings, 1 reply; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-06 4:41 UTC (permalink / raw) To: James.Bottomley Cc: jens.axboe, htejun, bharrosh, fujita.tomonori, efault, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Wed, 05 Mar 2008 09:21:24 -0600 James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > On Wed, 2008-03-05 at 14:51 +0100, Jens Axboe wrote: > > On Wed, Mar 05 2008, Tejun Heo wrote: > > > This is getting insanely subtle. Let's say there's PIO driver which > > > transfer certain sized chunks at a time and completes request partially > > > after completing each chunk and the driver uses draining to eat up > > > whatever excess data, which seems like a legit use case to me. But it > > > won't work because __end_that_request_first() will terminate when it > > > reaches reaches the 'true' transfer size. That's just broken API. FWIW, > > > > > > Nacked-by: Tejun Heo <htejun@gmail.com> > > > > Yeah, I think I may have gone a bit overboard in applying this so > > quickly. It's just not a good interface, silently adding the extra > > length if asked to complete more. It may even happen right now, for a > > driver that does no padding (it probably wont do any harm here either, > > but still). > > > > I'll try and see if I can come up with something cleaner. > > > > My basic design paradigm for this is that the _driver_ (or mid layer, if > > SCSI wants to handle it) should care about the padding. So make it easy > > for them to pad, but have it 'unrolled' by completion time. We should > > NOT need any extra_len checks or additions in the block/ directory, > > period. 
> > Right, that's why my original proposal was to do nothing for padding > (other than ensure the driver could adjust the length if it wanted to) > and to add an extra element always for draining, which the driver could > ignore. It basically pushed the use paradigm onto the driver. > > If we want the use paradigm shared between block and driver, then I > think the best approach is to keep all the bios the same (so not adjust > for padding), but do adjust in the blk_rq_map_sg(). That way we have > the padding and draining unwind information by comparing with the bio. Adjusting only sg in blk_rq_map_sg (like drain) looks much better. This works with libata for me. diff --git a/block/blk-map.c b/block/blk-map.c index c07d9c8..e949969 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -140,26 +140,6 @@ int blk_rq_map_user(struct request_queue *q, struct request *rq, ubuf += ret; } - /* - * __blk_rq_map_user() copies the buffers if starting address - * or length isn't aligned to dma_pad_mask. As the copied - * buffer is always page aligned, we know that there's enough - * room for padding. Extend the last bio and update - * rq->data_len accordingly. - * - * On unmap, bio_uncopy_user() will use unmodified - * bio_map_data pointed to by bio->bi_private. 
- */ - if (len & q->dma_pad_mask) { - unsigned int pad_len = (q->dma_pad_mask & ~len) + 1; - struct bio *tail = rq->biotail; - - tail->bi_io_vec[tail->bi_vcnt - 1].bv_len += pad_len; - tail->bi_size += pad_len; - - rq->extra_len += pad_len; - } - rq->buffer = rq->data = NULL; return 0; unmap_rq: diff --git a/block/blk-merge.c b/block/blk-merge.c index 0f58616..2a81c87 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -220,6 +220,13 @@ new_segment: bvprv = bvec; } /* segments in rq */ + if (sg && (q->dma_pad_mask & rq->data_len)) { + unsigned int pad_len = (q->dma_pad_mask & ~rq->data_len) + 1; + + sg->length += pad_len; + rq->extra_len += pad_len; + } + if (q->dma_drain_size && q->dma_drain_needed(rq)) { if (rq->cmd_flags & REQ_RW) memset(q->dma_drain_buffer, 0, q->dma_drain_size); diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c index e5c6f6a..fecba05 100644 --- a/drivers/scsi/scsi.c +++ b/drivers/scsi/scsi.c @@ -757,7 +757,7 @@ void scsi_finish_command(struct scsi_cmnd *cmd) "Notifying upper driver of completion " "(result %x)\n", cmd->result)); - good_bytes = scsi_bufflen(cmd) + cmd->request->extra_len; + good_bytes = scsi_bufflen(cmd); if (cmd->request->cmd_type != REQ_TYPE_BLOCK_PC) { drv = scsi_cmd_to_driver(cmd); if (drv->done) ^ permalink raw reply related [flat|nested] 109+ messages in thread
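The padding arithmetic used in the patch above can be checked in isolation. The following is a sketch only: pad_len_for() is a hypothetical stand-alone helper, not a kernel function, and it assumes the pad mask has the usual "power of two minus one" form (for example 3 for 4-byte padding).

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the expression from the patch:
 *     pad_len = (pad_mask & ~len) + 1
 * guarded by the same "len & pad_mask" test. Without the guard, an
 * already aligned length would be extended by a full pad_mask + 1 bytes.
 */
static unsigned int pad_len_for(unsigned int len, unsigned int pad_mask)
{
    /* Assumption: pad_mask is 2^n - 1. */
    assert(((pad_mask + 1) & pad_mask) == 0);

    if (!(len & pad_mask))
        return 0;               /* already aligned, nothing to add */
    return (pad_mask & ~len) + 1;
}
```

For a mask of 3, a 14-byte transfer gets 2 bytes of padding and a 13-byte transfer gets 3, both ending at 16 bytes; a 16-byte transfer is left alone because of the guard.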
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-06 4:41 ` FUJITA Tomonori @ 2008-03-06 13:41 ` Jens Axboe 2008-03-07 0:07 ` Tejun Heo 2008-03-20 12:54 ` FUJITA Tomonori 0 siblings, 2 replies; 109+ messages in thread From: Jens Axboe @ 2008-03-06 13:41 UTC (permalink / raw) To: FUJITA Tomonori Cc: James.Bottomley, htejun, bharrosh, efault, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Thu, Mar 06 2008, FUJITA Tomonori wrote: > On Wed, 05 Mar 2008 09:21:24 -0600 > James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > > On Wed, 2008-03-05 at 14:51 +0100, Jens Axboe wrote: > > > On Wed, Mar 05 2008, Tejun Heo wrote: > > > > This is getting insanely subtle. Let's say there's PIO driver which > > > > transfer certain sized chunks at a time and completes request partially > > > > after completing each chunk and the driver uses draining to eat up > > > > whatever excess data, which seems like a legit use case to me. But it > > > > won't work because __end_that_request_first() will terminate when it > > > > reaches reaches the 'true' transfer size. That's just broken API. FWIW, > > > > > > > > Nacked-by: Tejun Heo <htejun@gmail.com> > > > > > > Yeah, I think I may have gone a bit overboard in applying this so > > > quickly. It's just not a good interface, silently adding the extra > > > length if asked to complete more. It may even happen right now, for a > > > driver that does no padding (it probably wont do any harm here either, > > > but still). > > > > > > I'll try and see if I can come up with something cleaner. > > > > > > My basic design paradigm for this is that the _driver_ (or mid layer, if > > > SCSI wants to handle it) should care about the padding. So make it easy > > > for them to pad, but have it 'unrolled' by completion time. We should > > > NOT need any extra_len checks or additions in the block/ directory, > > > period. 
> > > > Right, that's why my original proposal was to do nothing for padding > > (other than ensure the driver could adjust the length if it wanted to) > > and to add an extra element always for draining, which the driver could > > ignore. It basically pushed the use paradigm onto the driver. > > > > If we want the use paradigm shared between block and driver, then I > > think the best approach is to keep all the bios the same (so not adjust > > for padding), but do adjust in the blk_rq_map_sg(). That way we have > > the padding and draining unwind information by comparing with the bio. > > Adjusting only sg in blk_rq_map_sg (like drain) looks much > better. This works with libata for me. Looks like a much better solution to me. Anyone have any valid objections against moving the padding to the sg map time? -- Jens Axboe ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-06 13:41 ` Jens Axboe @ 2008-03-07 0:07 ` Tejun Heo 2008-03-07 15:07 ` FUJITA Tomonori 2008-03-20 12:54 ` FUJITA Tomonori 1 sibling, 1 reply; 109+ messages in thread From: Tejun Heo @ 2008-03-07 0:07 UTC (permalink / raw) To: Jens Axboe Cc: FUJITA Tomonori, James.Bottomley, bharrosh, efault, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier Jens Axboe wrote: >>> If we want the use paradigm shared between block and driver, then I >>> think the best approach is to keep all the bios the same (so not adjust >>> for padding), but do adjust in the blk_rq_map_sg(). That way we have >>> the padding and draining unwind information by comparing with the bio. >> Adjusting only sg in blk_rq_map_sg (like drain) looks much >> better. This works with libata for me. > > Looks like a much better solution to me. Anyone have any valid > objections against moving the padding to the sg map time? Not necessarily objections but some concerns. * As completion is done in bio terms, it makes completion from LLDs a bit cumbersome, but this is unavoidable if we break sum(bio) == sum(sg). * I've been wondering why we are not using sg chain / table or whatever directly in bios and maybe rq_map_sg can go away in future. How about separating out the padding / draining adjustment into a separate interface? Say, blk_rq_apply_extra() and blk_rq_undo_extra() and make it the responsibility of the LLD which requested padding/draining to apply and undo the adjustments? It can undo the adjustments when it returns the the request to its upper layer. If rq completion is handled by upper layer, it will do the right thing. If rq completion is handled by LLD, it can see the bio it wants to see. Thanks. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-07 0:07 ` Tejun Heo @ 2008-03-07 15:07 ` FUJITA Tomonori 2008-03-08 1:06 ` Tejun Heo 0 siblings, 1 reply; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-07 15:07 UTC (permalink / raw) To: htejun Cc: jens.axboe, fujita.tomonori, James.Bottomley, bharrosh, efault, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Fri, 07 Mar 2008 09:07:23 +0900 Tejun Heo <htejun@gmail.com> wrote: > Jens Axboe wrote: > >>> If we want the use paradigm shared between block and driver, then I > >>> think the best approach is to keep all the bios the same (so not adjust > >>> for padding), but do adjust in the blk_rq_map_sg(). That way we have > >>> the padding and draining unwind information by comparing with the bio. > >> Adjusting only sg in blk_rq_map_sg (like drain) looks much > >> better. This works with libata for me. > > > > Looks like a much better solution to me. Anyone have any valid > > objections against moving the padding to the sg map time? > > Not necessarily objections but some concerns. > > * As completion is done in bio terms, it makes completion from LLDs a > bit cumbersome, but this is unavoidable if we break sum(bio) == sum(sg). What do you mean? How does sub(bio) affect LLDs? > * I've been wondering why we are not using sg chain / table or whatever > directly in bios and maybe rq_map_sg can go away in future. You mean that LLDs use bios directly? For me, sg and bio have very different objectives and it's a clean layer separation. > How about separating out the padding / draining adjustment into a > separate interface? Say, blk_rq_apply_extra() and blk_rq_undo_extra() > and make it the responsibility of the LLD which requested > padding/draining to apply and undo the adjustments? It can undo the > adjustments when it returns the the request to its upper layer. If rq > completion is handled by upper layer, it will do the right thing. 
If rq > completion is handled by LLD, it can see the bio it wants to see. If possible, I'd like to avoid creating APIs for them. I think that the current approach is much better than such APIs. ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-07 15:07 ` FUJITA Tomonori @ 2008-03-08 1:06 ` Tejun Heo 0 siblings, 0 replies; 109+ messages in thread From: Tejun Heo @ 2008-03-08 1:06 UTC (permalink / raw) To: FUJITA Tomonori Cc: jens.axboe, fujita.tomonori, James.Bottomley, bharrosh, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier FUJITA Tomonori wrote: > On Fri, 07 Mar 2008 09:07:23 +0900 > Tejun Heo <htejun@gmail.com> wrote: > >> Jens Axboe wrote: >>>>> If we want the use paradigm shared between block and driver, then I >>>>> think the best approach is to keep all the bios the same (so not adjust >>>>> for padding), but do adjust in the blk_rq_map_sg(). That way we have >>>>> the padding and draining unwind information by comparing with the bio. >>>> Adjusting only sg in blk_rq_map_sg (like drain) looks much >>>> better. This works with libata for me. >>> Looks like a much better solution to me. Anyone have any valid >>> objections against moving the padding to the sg map time? >> Not necessarily objections but some concerns. >> >> * As completion is done in bio terms, it makes completion from LLDs a >> bit cumbersome, but this is unavoidable if we break sum(bio) == sum(sg). > > What do you mean? How does sub(bio) affect LLDs? LLDs which loop over sg's trying to complete rq incrementally will see rq going away sooner than it expected. >> * I've been wondering why we are not using sg chain / table or whatever >> directly in bios and maybe rq_map_sg can go away in future. > > You mean that LLDs use bios directly? For me, sg and bio have very > different objectives and it's a clean layer separation. Actually the other way, block layer use sg instead of bio_vec in bio. Layer separation doesn't necessarily require copying about the same information to differently formatted data structure. I'm not sure it will be a clean win tho. 
Requests hang longer in the scheduler queue, and bio_vec is smaller than scatterlist. The thing is that, to me, blk_rq_map_sg() doesn't really look necessary; it can be done just as well when the request is fetched from the queue by the block driver. (continued below...) >> How about separating out the padding / draining adjustment into a >> separate interface? Say, blk_rq_apply_extra() and blk_rq_undo_extra() >> and make it the responsibility of the LLD which requested >> padding/draining to apply and undo the adjustments? It can undo the >> adjustments when it returns the the request to its upper layer. If rq >> completion is handled by upper layer, it will do the right thing. If rq >> completion is handled by LLD, it can see the bio it wants to see. > > If possible, I'd like to avoid creating APIs for them. I think that > the current approach is much better than such APIs. And, so, I'm not too sure whether putting more mechanisms into it is a good idea. Thanks. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
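The apply/undo interface floated in this sub-thread was only ever a suggestion, so the following is one possible reading of it on a toy model, not an implementation of anything that exists. struct toy_rq, toy_apply_extra() and toy_undo_extra() are hypothetical names; the sketch assumes that "apply" grows both the request length and the last sg entry so that data_len == sum(sg) holds while the low-level driver owns the request, and that "undo" restores the true length before the request goes back up.

```c
#include <assert.h>

/* Hypothetical toy request; NOT the kernel's struct request. */
struct toy_rq {
    unsigned int data_len;     /* length visible to whoever holds the rq */
    unsigned int last_sg_len;  /* length of the final scatterlist entry */
    unsigned int extra_len;    /* padding currently applied, 0 if none */
};

/* Called by the driver that asked for padding, before issuing the command. */
static void toy_apply_extra(struct toy_rq *rq, unsigned int pad_mask)
{
    assert(rq->extra_len == 0);          /* do not apply twice */

    if (rq->data_len & pad_mask) {
        unsigned int pad = (pad_mask & ~rq->data_len) + 1;

        rq->data_len += pad;
        rq->last_sg_len += pad;
        rq->extra_len = pad;
    }
}

/* Called by the same driver before handing the request back up. */
static void toy_undo_extra(struct toy_rq *rq)
{
    rq->data_len -= rq->extra_len;
    rq->last_sg_len -= rq->extra_len;
    rq->extra_len = 0;
}
```

Under these assumptions the upper layers only ever see the true length, at the cost of every padding-aware driver having to pair the two calls correctly, which is the extra mechanism the objections in the thread are about.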
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-06 13:41 ` Jens Axboe 2008-03-07 0:07 ` Tejun Heo @ 2008-03-20 12:54 ` FUJITA Tomonori 1 sibling, 0 replies; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-20 12:54 UTC (permalink / raw) To: jens.axboe Cc: fujita.tomonori, James.Bottomley, htejun, bharrosh, efault, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier From: Jens Axboe <jens.axboe@oracle.com> Subject: Re: [PATCH] blk: missing add of padded bytes to io completion byte count Date: Thu, 6 Mar 2008 14:41:39 +0100 > On Thu, Mar 06 2008, FUJITA Tomonori wrote: > > On Wed, 05 Mar 2008 09:21:24 -0600 > > James Bottomley <James.Bottomley@HansenPartnership.com> wrote: > > > > > On Wed, 2008-03-05 at 14:51 +0100, Jens Axboe wrote: > > > > On Wed, Mar 05 2008, Tejun Heo wrote: > > > > > This is getting insanely subtle. Let's say there's PIO driver which > > > > > transfer certain sized chunks at a time and completes request partially > > > > > after completing each chunk and the driver uses draining to eat up > > > > > whatever excess data, which seems like a legit use case to me. But it > > > > > won't work because __end_that_request_first() will terminate when it > > > > > reaches reaches the 'true' transfer size. That's just broken API. FWIW, > > > > > > > > > > Nacked-by: Tejun Heo <htejun@gmail.com> > > > > > > > > Yeah, I think I may have gone a bit overboard in applying this so > > > > quickly. It's just not a good interface, silently adding the extra > > > > length if asked to complete more. It may even happen right now, for a > > > > driver that does no padding (it probably wont do any harm here either, > > > > but still). > > > > > > > > I'll try and see if I can come up with something cleaner. > > > > > > > > My basic design paradigm for this is that the _driver_ (or mid layer, if > > > > SCSI wants to handle it) should care about the padding. 
So make it easy > > > > for them to pad, but have it 'unrolled' by completion time. We should > > > > NOT need any extra_len checks or additions in the block/ directory, > > > > period. > > > > > > Right, that's why my original proposal was to do nothing for padding > > > (other than ensure the driver could adjust the length if it wanted to) > > > and to add an extra element always for draining, which the driver could > > > ignore. It basically pushed the use paradigm onto the driver. > > > > > > If we want the use paradigm shared between block and driver, then I > > > think the best approach is to keep all the bios the same (so not adjust > > > for padding), but do adjust in the blk_rq_map_sg(). That way we have > > > the padding and draining unwind information by comparing with the bio. > > > > Adjusting only sg in blk_rq_map_sg (like drain) looks much > > better. This works with libata for me. > > Looks like a much better solution to me. Anyone have any valid > objections against moving the padding to the sg map time? What's the situation with this fix? ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-05 13:45 ` Tejun Heo 2008-03-05 13:51 ` Jens Axboe @ 2008-03-05 14:46 ` Boaz Harrosh 2008-03-05 15:11 ` Tejun Heo 1 sibling, 1 reply; 109+ messages in thread From: Boaz Harrosh @ 2008-03-05 14:46 UTC (permalink / raw) To: Tejun Heo Cc: Jens Axboe, FUJITA Tomonori, Mike Galbraith, James.Bottomley, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Wed, Mar 05 2008 at 15:45 +0200, Tejun Heo <htejun@gmail.com> wrote: > Hello, Jens, Boaz. > > Jens Axboe wrote: >>>>> From: Boaz Harrosh <bharrosh@panasas.com> >>>>> Date: Wed, 5 Mar 2008 12:07:12 +0200 >>>>> Subject: [PATCH] blk: missing add of padded bytes to io completion byte count >>>>> >>>>> the commit e97a294ef6938512b655b1abf17656cf2b26f709 was very wrong. This is >>>>> because scsi-ml supports the ability to split a request into smaller chunks, >>>>> in which case scsi_bufflen() is smaller then request length. Then at completion >>>>> time the remainder can be issued as a new scsi command. In that case the above >>>>> commit is a data corruption. > > Thanks for catching the stupidity. Did it actually happen? PC commands > are not completed in pieces and padding / draining should only happen > for those. qc->extra_len should be zero where commands can be splitted > for all current cases. So qc->extra_len == 0 and nothing is done in that case. > >>>> We needed something for -rc4, so it had to be rushed a bit... >>>> >>>>> Also in this fix all users of block layer are taken care of, and not only >>>>> scsi devices. 
>>>>> >>>>> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> >>>>> Signed-off-by: Benny Halevy <bhalevy@panasas.com> >>>>> --- >>>>> block/blk-core.c | 4 ++++ >>>>> drivers/scsi/scsi.c | 2 +- >>>>> 2 files changed, 5 insertions(+), 1 deletions(-) >>>>> >>>>> diff --git a/block/blk-core.c b/block/blk-core.c >>>>> index 2a438a9..37fcccc 100644 >>>>> --- a/block/blk-core.c >>>>> +++ b/block/blk-core.c >>>>> @@ -1549,6 +1549,9 @@ static int __end_that_request_first(struct request *req, int error, >>>>> nr_bytes >> 9, req->sector); >>>>> } >>>>> >>>>> + if (nr_bytes >= blk_rq_bytes(req)) >>>>> + nr_bytes += req->extra_len; >>>>> + > > This is getting insanely subtle. Let's say there's PIO driver which > transfer certain sized chunks at a time and completes request partially > after completing each chunk and the driver uses draining to eat up > whatever excess data, which seems like a legit use case to me. But it > won't work because __end_that_request_first() will terminate when it > reaches reaches the 'true' transfer size. That's just broken API. FWIW, > > Nacked-by: Tejun Heo <htejun@gmail.com> > I don't understand? Drivers can still do that. Do you mean That it wants to also complete the draining portion in smaller chunks? I thought the draining is always done at once, at most. Is that theoretical or is it so in any of the drivers. Any way Nack from my side on the scsi_finish_command(), it makes too many assumptions that are unchecked anywhere. And it's a terrible layering violation. scsi is a pass-threw block device, the fix should be in block or in using device drivers (eg libata), that know what is going on. Any way you are always saying req->data_len == sum(sg) but that was certainly never true for scsi_bufflen() == sum(sg) so leave that alone please. Any other block layer fixes are welcome. But for now this is the best fix we have that only breaks theoretical, yet to be submitted drivers. Boaz ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-05 14:46 ` Boaz Harrosh @ 2008-03-05 15:11 ` Tejun Heo 0 siblings, 0 replies; 109+ messages in thread From: Tejun Heo @ 2008-03-05 15:11 UTC (permalink / raw) To: Boaz Harrosh Cc: Jens Axboe, FUJITA Tomonori, Mike Galbraith, James.Bottomley, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier Hello, Boaz. Boaz Harrosh wrote: >> This is getting insanely subtle. Let's say there's PIO driver which >> transfer certain sized chunks at a time and completes request partially >> after completing each chunk and the driver uses draining to eat up >> whatever excess data, which seems like a legit use case to me. But it >> won't work because __end_that_request_first() will terminate when it >> reaches reaches the 'true' transfer size. That's just broken API. FWIW, >> >> Nacked-by: Tejun Heo <htejun@gmail.com> >> > > I don't understand? Drivers can still do that. Do you mean That it wants > to also complete the draining portion in smaller chunks? I thought the draining > is always done at once, at most. Is that theoretical or is it so in any of the > drivers. Ah... I wasn't really Nacking your patch specifically. I was trying to say "this scheme isn't gonna work". Your patch does make good sense given the situation (and I think I did acknowledge that above). Sorry about the miscommunication. > Any way Nack from my side on the scsi_finish_command(), it makes too many > assumptions that are unchecked anywhere. And it's a terrible layering violation. > scsi is a pass-threw block device, the fix should be in block or in using device > drivers (eg libata), that know what is going on. Yeap, completely agreed. That one gets my big Nack-You-Idiot. > Any way you are always saying req->data_len == sum(sg) but that was certainly > never true for scsi_bufflen() == sum(sg) so leave that alone please. I don't really care about scsi_bufflen() and I'm not willing to change any of that. 
If SCSI LLDs are happy with scsi_bufflen() != sum(sg), no problem at all. What I'm against is pushing that into block layer, which until now had "true" size == rq->data_len == sum(sg). We're about to break one of the two equals if we're gonna do sg manipulation in block layer (Jens seems to be planning something different) and all I'm saying is we're far better off breaking the former one. First, I don't really think SCSI LLDs will make much use of explicit padding or draining. Secondly, even when such need arises, keeping scsi_bufflen() at the "true" size is easy no matter which way we go with rq->data_len. Anyways, let's wait and see what Jens comes up with. > Any other block layer fixes are welcome. But for now this is the best fix we have > that only breaks theoretical, yet to be submitted drivers. Yeap, given the current code, I agree. Thanks. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread
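The chunked-PIO concern raised in this exchange can be made concrete with a small simulation. This is a sketch under stated assumptions, not kernel code: late_completions() is a hypothetical helper that models a driver walking the whole sg list (true data plus padding or drain) in fixed-size chunks and completing the request after each chunk, while the request itself only tracks the true length.

```c
#include <assert.h>

/*
 * Count how many chunk completions the driver would issue AFTER the
 * request has already been fully completed in terms of its true length.
 *
 *   data_len : true length tracked by the request
 *   sg_total : total bytes the driver moves (data + padding/drain)
 *   chunk    : bytes the driver moves and completes per step
 */
static unsigned int late_completions(unsigned int data_len,
                                     unsigned int sg_total,
                                     unsigned int chunk)
{
    unsigned int remaining = data_len;
    unsigned int moved = 0;
    unsigned int late = 0;

    assert(chunk > 0 && sg_total >= data_len);

    while (moved < sg_total) {
        unsigned int left = sg_total - moved;
        unsigned int n = left < chunk ? left : chunk;

        if (remaining == 0)
            late++;                       /* request is already gone */
        else
            remaining -= n < remaining ? n : remaining;
        moved += n;
    }
    return late;
}
```

With 14 true bytes, a 16-byte drain area (30 bytes of sg in total) and 8-byte chunks, the request is finished after the second chunk and the driver still has two more completions to make, which is the "rq going away sooner than expected" case. With only 2 bytes of padding (16 bytes of sg) there are no late completions in this model.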
* Re: [PATCH] blk: missing add of padded bytes to io completion byte count 2008-03-05 10:16 ` [PATCH] blk: missing add of padded bytes to io completion byte count Boaz Harrosh 2008-03-05 12:28 ` Mike Galbraith 2008-03-05 12:33 ` Jens Axboe @ 2008-03-06 5:02 ` FUJITA Tomonori 2 siblings, 0 replies; 109+ messages in thread From: FUJITA Tomonori @ 2008-03-06 5:02 UTC (permalink / raw) To: bharrosh Cc: fujita.tomonori, htejun, efault, jens.axboe, James.Bottomley, tomof, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier On Wed, 05 Mar 2008 12:16:15 +0200 Boaz Harrosh <bharrosh@panasas.com> wrote: > On Wed, Mar 05 2008 at 2:26 +0200, FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote: > > On Wed, 05 Mar 2008 08:33:05 +0900 > > Tejun Heo <htejun@gmail.com> wrote: > > > >> FUJITA Tomonori wrote: > >>> Hmm, does SCSI mid-layer need to care about how many bytes the block > >>> layer allocates? I don't think that extra_len is NOT good_bytes. > >>> > >>> I think that the block layer had better take care about it (fix > >>> __end_that_request_first?). > >> Yeah, probably calling completion functions w/o bytes count is the right > >> thing to do but what I was talking about was what could break when the > >> semantics of rq->data_len changed. If we keep rq->data_len() == > >> sum(sg), we keep it business as usual for all the rest except for the > >> device application layer if we don't we do the reverse and SCSI midlayer > >> completion was a good example, I think. > > > > sglist is a low-level I/O representation for device drivers. SCSI > > midlayer should not care about sglist. We should not fix SCSI midlayer > > for rq->data_len != sum(sg) change (so I can't agree with your > > diagrams in another mail). > > > > When if we change a rule, we need to fix something. > > > > If we keep rq->data_len == sum(sg), we need to fix the device > > application layer. If we keep rq->data_len == the true data length, we > > need to fix the low-level drivers. 
> > > > Now I'm fine with the commit e97a294ef6938512b655b1abf17656cf2b26f709 > > since we are in -rc stages. But I plan to send a patch to revert it > > and fix this issue in the block layer. I'd like to test it in -mm for > > a while. > > No this commit is a serious bug, and the only fix is like you suggested > in __end_that_request_first. This is because it breaks that scsi-ml loop > where scsi_bufflen() can be less then blk_rq_bytes(). In that case this > commit is a data corruption. Ah, I knew that the patch doesn't work with partial completion but I thought that it doesn't happen with PC commands... And touching __end_that_request_first looked really hacky so I didn't send such patch. Moving the padding adjustment to blk_rq_map_sg (James' proposal) looks fine. Maybe Jens will come up with something better. ^ permalink raw reply [flat|nested] 109+ messages in thread
* Re: [PATCH] block: fix residual byte count handling 2008-03-04 8:59 ` Jens Axboe 2008-03-04 9:06 ` FUJITA Tomonori @ 2008-03-04 9:29 ` Tejun Heo 1 sibling, 0 replies; 109+ messages in thread From: Tejun Heo @ 2008-03-04 9:29 UTC (permalink / raw) To: Jens Axboe Cc: FUJITA Tomonori, tomof, James.Bottomley, efault, akpm, linux-kernel, linux-ide, linux-scsi, jgarzik, bzolnier Hello, Jens. Jens Axboe wrote: > I completely agree with you, ->data_len meaning true data length is way > cleaner imho. Only the driver should care for the padded length, all > other parts of the kernel only need to know what they actually got. Oh well, I guess I'm the one with strange taste here. My logic is that the only thing below the block layer is the driver which requested size adjustment. This means residual bytes calculation is pushed to low level drivers which isn't anything major but still. Anyways, I'll review FUJITA's modified patch. Thanks. -- tejun ^ permalink raw reply [flat|nested] 109+ messages in thread