* [KEXEC][2.5.69] kexec for 2.5.69 available
@ 2003-05-09 19:04 Andy Pfiffer
  2003-05-09 20:04 ` ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69 available) Christophe Saout
  0 siblings, 1 reply; 28+ messages in thread
From: Andy Pfiffer @ 2003-05-09 19:04 UTC (permalink / raw)
  To: Eric Biederman; +Cc: fastboot, linux-kernel, Suparna Bhattacharya

Eric,

I have a patch set available for kexec for 2.5.69.  I had an unrelated
delay in posting this due to some strange behavior of late with LILO and
my ext3-mounted /boot partition (/sbin/lilo would say that it updated,
but a subsequent reboot would not include my new kernel).

The patches are available for download from OSDL's patch lifecycle
manager ( http://www.osdl.org/cgi-bin/plm/ ).

Patch stack for kexec for 2.5.69:

kexec base for 2.5.69:
http://www.osdl.org/cgi-bin/plm?module=patch_info&patch_id=1828

kexec hwfixes for 2.5.69:
http://www.osdl.org/cgi-bin/plm?module=patch_info&patch_id=1829

kexec usemm change (allowed 2-way to work for me):
http://www.osdl.org/cgi-bin/plm?module=patch_info&patch_id=1830

optional change to defconfig (sets CONFIG_KEXEC=y):
http://www.osdl.org/cgi-bin/plm?module=patch_info&patch_id=1831

The patches are also available (with matching kexec-tools-1.8) from this
link pending a crontab update:
http://www.osdl.org/archive/andyp/kexec/2.5.69/

Andrew Morton's tree now also contains kexec, and you can pick up his
patch here:
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.69/

I'll put together another release area for a matching kexec-tools for
-mm trees (different kexec syscall number between 2.5.* and 2.5.*-mm*)
as soon as I get -mm trees built and booted on my kexec test machines.

Regards,
Andy

To All: if you try kexec, a quick reply of success or failure to
fastboot@osdl.org would be appreciated.  If it doesn't work for you,
please include the output of lspci in your email.
Kexec has worked for me on these systems:

single P3-800MHz, 640MB:
00:00.0 Host bridge: ServerWorks CNB20LE Host Bridge (rev 06)
00:00.1 Host bridge: ServerWorks CNB20LE Host Bridge (rev 06)
00:01.0 VGA compatible controller: S3 Inc. Savage 4 (rev 04)
00:09.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
00:0f.0 ISA bridge: ServerWorks OSB4 South Bridge (rev 50)
00:0f.1 IDE interface: ServerWorks OSB4 IDE Controller
00:0f.2 USB Controller: ServerWorks OSB4/CSB5 OHCI USB Controller (rev 04)
01:03.0 SCSI storage controller: Adaptec AIC-7892P U160/m (rev 02)

dual P3-866MHz, 256MB:
00:00.0 Host bridge: ServerWorks CNB20LE Host Bridge (rev 05)
00:00.1 Host bridge: ServerWorks CNB20LE Host Bridge (rev 05)
00:02.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
00:04.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
00:07.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 65)
00:0f.0 ISA bridge: ServerWorks OSB4 South Bridge (rev 4f)
00:0f.1 IDE interface: ServerWorks OSB4 IDE Controller
00:0f.2 USB Controller: ServerWorks OSB4/CSB5 USB Controller (rev 04)
01:05.0 SCSI storage controller: LSI Logic / Symbios Logic 53c896 (rev 07)
01:05.1 SCSI storage controller: LSI Logic / Symbios Logic 53c896 (rev 07)

dual P4-1.7GHz Xeon, 512MB:
00:00.0 Host bridge: Intel Corp. 82850 860 (Wombat) Chipset Host Bridge (MCH) (rev 04)
00:01.0 PCI bridge: Intel Corp. 82850/82860 850/860 (Tehama/Wombat) Chipset AGP Bridge (rev 04)
00:02.0 PCI bridge: Intel Corp. 82860 860 (Wombat) Chipset PCI Bridge (rev 04)
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA PCI Bridge (rev 04)
00:1f.0 ISA bridge: Intel Corp. 82801BA ISA Bridge (LPC) (rev 04)
00:1f.1 IDE interface: Intel Corp. 82801BA IDE U100 (rev 04)
00:1f.2 USB Controller: Intel Corp. 82801BA/BAM USB (Hub (rev 04)
00:1f.3 SMBus: Intel Corp. 82801BA/BAM SMBus (rev 04)
00:1f.5 Multimedia audio controller: Intel Corp. 82801BA/BAM AC'97 Audio (rev 04)
01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400 AGP (rev 85)
02:1f.0 PCI bridge: Intel Corp. 82806AA PCI64 Hub PCI Bridge (rev 03)
03:00.0 PIC: Intel Corp. 82806AA PCI64 Hub Advanced Programmable Interrupt Controller (rev 01)
04:04.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0c)

^ permalink raw reply [flat|nested] 28+ messages in thread
* ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69 available)
  2003-05-09 19:04 [KEXEC][2.5.69] kexec for 2.5.69 available Andy Pfiffer
@ 2003-05-09 20:04 ` Christophe Saout
  2003-05-09 20:55 ` Andy Pfiffer
  0 siblings, 1 reply; 28+ messages in thread
From: Christophe Saout @ 2003-05-09 20:04 UTC
  To: linux-kernel; +Cc: Andy Pfiffer

On Fri, 2003-05-09 at 21:04, Andy Pfiffer wrote:

> [...]
> I had an unrelated
> delay in posting this due to some strange behavior of late with LILO and
> my ext3-mounted /boot partition (/sbin/lilo would say that it updated,
> but a subsequent reboot would not include my new kernel)

So I'm not the only one having this problem... I think I first saw this
with 2.5.68 but I'm not sure.

My boot partition is a small ext3 partition on an lvm2 volume accessed
over device-mapper (I've written a lilo patch for that, and the patch is
working), but I don't think that has anything to do with the problem.

When syncing, unmounting and waiting some time after running lilo, the
changes sometimes seem to be correctly written to disk; I don't know
exactly when.

Could it be that the location of /boot/map is not written to the
partition sector of /dev/hda? Or not flushed correctly or something?

After reboot the old kernel came up again (though it was moved to
vmlinuz.old).

-- 
Christophe Saout <christophe@saout.de>
* Re: ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69 available) 2003-05-09 20:04 ` ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69 available) Christophe Saout @ 2003-05-09 20:55 ` Andy Pfiffer 2003-05-09 20:46 ` ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69available) Riley Williams 2003-06-11 22:08 ` ext[23]/lilo/2.5.{68,69,70} -- IDE Problem? Andy Pfiffer 0 siblings, 2 replies; 28+ messages in thread From: Andy Pfiffer @ 2003-05-09 20:55 UTC (permalink / raw) To: Christophe Saout; +Cc: linux-kernel On Fri, 2003-05-09 at 13:04, Christophe Saout wrote: > Am Fre, 2003-05-09 um 21.04 schrieb Andy Pfiffer: > > > [...] > > I had an unrelated > > delay in posting this due to some strange behavior of late with LILO and > > my ext3-mounted /boot partition (/sbin/lilo would say that it updated, > > but a subsequent reboot would not include my new kernel) > > So I'm not the only one having this problem... I think I first saw this > with 2.5.68 but I'm not sure. Well, that makes two of us for sure. > > My boot partition is a small ext3 partition on a lvm2 volume accessed > over device-mapper (I've written a lilo patch for that, but the patch is > working and) but I don't think that has something to do with the > problem. > > When syncing, unmounting and waiting some time after running lilo, the > changes sometimes seem correctly written to disk, I don't know when > exactly. My /boot is an ext3 partition on an IDE disk. My symptoms and your symptoms match -- wait awhile, and it works okay. If you don't wait "long enough" the changes made in /etc/lilo.conf are not reflected in the after running /sbin/lilo and rebooting normally. I have been unable to reproduce this on a uniproc system with SCSI disks. 2.5.67 seems to work in this regard as expected. > Could it be that the location of /boot/map is not written to the > partition sector of /dev/hda? Or not flushed correctly or something? 
> > After reboot the old kernel came up again (though it was moved to > vmlinuz.old). I don't know -- I haven't isolated it yet. Anyone else? ^ permalink raw reply [flat|nested] 28+ messages in thread
* RE: ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69available) 2003-05-09 20:55 ` Andy Pfiffer @ 2003-05-09 20:46 ` Riley Williams 2003-05-09 22:39 ` Joe Korty 2003-05-09 23:39 ` Andy Pfiffer 2003-06-11 22:08 ` ext[23]/lilo/2.5.{68,69,70} -- IDE Problem? Andy Pfiffer 1 sibling, 2 replies; 28+ messages in thread From: Riley Williams @ 2003-05-09 20:46 UTC (permalink / raw) To: Andy Pfiffer, Christophe Saout; +Cc: linux-kernel Hi Andy, Christophe. >>> I had an unrelated delay in posting this due to some strange >>> behavior of late with LILO and my ext3-mounted /boot partition >>> (/sbin/lilo would say that it updated, but a subsequent reboot >>> would not include my new kernel) >> So I'm not the only one having this problem... I think I first >> saw this with 2.5.68 but I'm not sure. > Well, that makes two of us for sure. >> My boot partition is a small ext3 partition on a lvm2 volume >> accessed over device-mapper (I've written a lilo patch for >> that, but the patch is working and) but I don't think that has >> something to do with the problem. >> >> When syncing, unmounting and waiting some time after running >> lilo, the changes sometimes seem correctly written to disk, I >> don't know when exactly. > > My /boot is an ext3 partition on an IDE disk. My symptoms and > your symptoms match -- wait awhile, and it works okay. If you > don't wait "long enough" the changes made in /etc/lilo.conf are > not reflected in the after running /sbin/lilo and rebooting > normally. One suggestion: ext3 is a journalled version of ext2, so if you can boot with whatever is needed to specify that the boot partition is to be mounted as ext2 rather than ext3, you can isolate the journal system: If the problem's still there in ext2 then the journal is not involved, but if the problem vanishes there, it's something to do with the journal. 

I have to admit that the above sounds very much like the details
are being recorded in the journal, but the journal isn't being
played back to update the actual files.

Best wishes from Riley.
---
 * Nothing as pretty as a smile, nothing as ugly as a frown.
* Re: ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69available)
  2003-05-09 20:46 ` ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69available) Riley Williams
@ 2003-05-09 22:39 ` Joe Korty
  2003-05-09 23:39 ` Andy Pfiffer
  1 sibling, 0 replies; 28+ messages in thread
From: Joe Korty @ 2003-05-09 22:39 UTC
  To: Riley Williams; +Cc: Andy Pfiffer, Christophe Saout, linux-kernel

> One suggestion: ext3 is a journalled version of ext2, so if you can
> boot with whatever is needed to specify that the boot partition is
> to be mounted as ext2 rather than ext3, you can isolate the journal
> system: If the problem's still there in ext2 then the journal is
> not involved, but if the problem vanishes there, it's something to
> do with the journal.
>
> I have to admit that the above sounds very much like the details
> are being recorded in the journal, but the journal isn't being
> played back to update the actual files.

I recall reading on lkml once that an ext3 sync(2) merely pushes
volatile data/metadata out to the journal rather than to the files
themselves.

Joe
* RE: ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69available) 2003-05-09 20:46 ` ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69available) Riley Williams 2003-05-09 22:39 ` Joe Korty @ 2003-05-09 23:39 ` Andy Pfiffer 1 sibling, 0 replies; 28+ messages in thread From: Andy Pfiffer @ 2003-05-09 23:39 UTC (permalink / raw) To: Riley Williams; +Cc: Christophe Saout, linux-kernel On Fri, 2003-05-09 at 13:46, Riley Williams wrote: > Hi Andy, Christophe. > > >>> I had an unrelated delay in posting this due to some strange > >>> behavior of late with LILO and my ext3-mounted /boot partition > >>> (/sbin/lilo would say that it updated, but a subsequent reboot > >>> would not include my new kernel) > > >> So I'm not the only one having this problem... I think I first > >> saw this with 2.5.68 but I'm not sure. > > > Well, that makes two of us for sure. > > >> My boot partition is a small ext3 partition on a lvm2 volume > >> accessed over device-mapper (I've written a lilo patch for > >> that, but the patch is working and) but I don't think that has > >> something to do with the problem. > >> > >> When syncing, unmounting and waiting some time after running > >> lilo, the changes sometimes seem correctly written to disk, I > >> don't know when exactly. > > > > My /boot is an ext3 partition on an IDE disk. My symptoms and > > your symptoms match -- wait awhile, and it works okay. If you > > don't wait "long enough" the changes made in /etc/lilo.conf are > > not reflected in the after running /sbin/lilo and rebooting > > normally. > > One suggestion: ext3 is a journalled version of ext2, so if you can > boot with whatever is needed to specify that the boot partition is > to be mounted as ext2 rather than ext3, you can isolate the journal > system: If the problem's still there in ext2 then the journal is > not involved, but if the problem vanishes there, it's something to > do with the journal. 

Changing the "ext3" to "ext2" in /etc/fstab and rebooting did not change
the behavior (i.e., edit /etc/lilo.conf, run /sbin/lilo, reboot cleanly,
changes not there).  I did see the warning about mounting an ext3
filesystem as ext2, however.

Strange.
* Re: ext[23]/lilo/2.5.{68,69,70} -- IDE Problem? 2003-05-09 20:55 ` Andy Pfiffer 2003-05-09 20:46 ` ext3/lilo/2.5.6[89] (was: [KEXEC][2.5.69] kexec for 2.5.69available) Riley Williams @ 2003-06-11 22:08 ` Andy Pfiffer 2003-06-11 23:21 ` Christophe Saout 1 sibling, 1 reply; 28+ messages in thread From: Andy Pfiffer @ 2003-06-11 22:08 UTC (permalink / raw) To: Christophe Saout; +Cc: linux-kernel On Fri, 2003-05-09 at 13:55, Andy Pfiffer wrote: > On Fri, 2003-05-09 at 13:04, Christophe Saout wrote: > > Am Fre, 2003-05-09 um 21.04 schrieb Andy Pfiffer: > > > > > [...] > > > I had an unrelated > > > delay in posting this due to some strange behavior of late with LILO and > > > my ext3-mounted /boot partition (/sbin/lilo would say that it updated, > > > but a subsequent reboot would not include my new kernel) > > > > So I'm not the only one having this problem... I think I first saw this > > with 2.5.68 but I'm not sure. > > Well, that makes two of us for sure. > > > > > My boot partition is a small ext3 partition on a lvm2 volume accessed > > over device-mapper (I've written a lilo patch for that, but the patch is > > working and) but I don't think that has something to do with the > > problem. > > > > When syncing, unmounting and waiting some time after running lilo, the > > changes sometimes seem correctly written to disk, I don't know when > > exactly. > > My /boot is an ext3 partition on an IDE disk. My symptoms and your > symptoms match -- wait awhile, and it works okay. If you don't wait > "long enough" the changes made in /etc/lilo.conf are not reflected in > the after running /sbin/lilo and rebooting normally. > > I have been unable to reproduce this on a uniproc system with SCSI > disks. > > 2.5.67 seems to work in this regard as expected. > > > Could it be that the location of /boot/map is not written to the > > partition sector of /dev/hda? Or not flushed correctly or something? 
> > > > After reboot the old kernel came up again (though it was moved to > > vmlinuz.old). > > I don't know -- I haven't isolated it yet. > > Anyone else? I have taken another look at this, and can confirm the following: 1. 2.5.67 works as expected. 2. 2.5.68, 2.5.69, and 2.5.70 do not. 3. ext2 vs. ext3 for /boot: no effect (ie, .68, .69, .70 demonstrate the problem independent of the filesystem used for /boot). Relative to a 2.5.68 pure BK tree, the deltas from 2.5.67 to 2.5.68 are: 1.971.76.10 /* 2.5.67 */ 1.1124 /* 2.5.68 */ The patch exported by BK between these 2 revs is 297K lines ( a sizeable haystack ). Any ideas about where I should dig for my needle first would be welcomed... Gory details about my hardware & software follow... % lilo -v LILO version 22.1, Copyright (C) 1992-1998 Werner Almesberger Development beyond version 21 Copyright (C) 1999-2001 John Coffman Released 31-Oct-2001 and compiled at 20:50:13 on Mar 25 2002. MAX_IMAGES = 27 CPUs: processor : 0 vendor_id : GenuineIntel cpu family : 15 model : 1 model name : Intel(R) Xeon(TM) CPU 1.70GHz stepping : 2 cpu MHz : 1685.926 cache size : 256 KB physical id : 0 siblings : 1 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm bogomips : 3317.76 processor : 1 vendor_id : GenuineIntel cpu family : 15 model : 0 model name : Intel(R) Xeon(TM) CPU 1700MHz stepping : 10 cpu MHz : 1685.926 cache size : 256 KB physical id : 0 siblings : 1 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm bogomips : 3366.91 Two IDE hard drives (I haven't cracked the case to identify the manufacturer): /dev/hda: HDIO_GETGEO_BIG failed: Inappropriate ioctl for device 
Model=CI530L04VARE700- , FwRev=REO44AA5, SerialNo= S PXXTYH2351 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs } RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=40 BuffType=DualPortCache, BuffSize=1916kB, MaxMultSect=16, MultSect=16 CurCHS=16383/16/63, CurSects=-66060037, LBA=yes, LBAsects=78156288 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120} PIO modes: pio0 pio1 pio2 pio3 pio4 DMA modes: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5 AdvancedPM=yes: disabled (255) Drive Supports : ATA/ATAPI-5 T13 1321D revision 1 : ATA-2 ATA-3 ATA-4 ATA-5 /dev/hdb: HDIO_GETGEO_BIG failed: Inappropriate ioctl for device Model=CI530L02VARE700- , FwRev=REO24AA5, SerialNo= S PVVTFT0B17 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs } RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=40 BuffType=DualPortCache, BuffSize=1916kB, MaxMultSect=16, MultSect=16 CurCHS=16383/16/63, CurSects=-66060037, LBA=yes, LBAsects=39876480 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120} PIO modes: pio0 pio1 pio2 pio3 pio4 DMA modes: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5 AdvancedPM=yes: disabled (255) Drive Supports : ATA/ATAPI-5 T13 1321D revision 1 : ATA-2 ATA-3 ATA-4 ATA-5 The PCI hardware on this system: 00:00.0 Host bridge: Intel Corp. 82850 860 (Wombat) Chipset Host Bridge (MCH) (rev 04) Subsystem: IBM: Unknown device 2531 Flags: bus master, fast devsel, latency 0 Memory at f0000000 (32-bit, prefetchable) [size=64M] Capabilities: [a0] AGP version 2.0 00:01.0 PCI bridge: Intel Corp. 82850/82860 850/860 (Tehama/Wombat) Chipset AGP Bridge (rev 04) (prog-if 00 [Normal decode]) Flags: bus master, 66Mhz, fast devsel, latency 64 Bus: primary=00, secondary=01, subordinate=01, sec-latency=32 Memory behind bridge: f6000000-f8ffffff Prefetchable memory behind bridge: f4000000-f5ffffff 00:02.0 PCI bridge: Intel Corp. 
82860 860 (Wombat) Chipset PCI Bridge (rev 04) (prog-if 00 [Normal decode]) Flags: bus master, 66Mhz, fast devsel, latency 32 Bus: primary=00, secondary=02, subordinate=03, sec-latency=0 Memory behind bridge: fb000000-fb0fffff 00:1e.0 PCI bridge: Intel Corp. 82801BA/CA PCI Bridge (rev 04) (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0 Bus: primary=00, secondary=04, subordinate=04, sec-latency=32 I/O behind bridge: 0000c000-0000cfff Memory behind bridge: f9000000-faffffff 00:1f.0 ISA bridge: Intel Corp. 82801BA ISA Bridge (LPC) (rev 04) Flags: bus master, medium devsel, latency 0 00:1f.1 IDE interface: Intel Corp. 82801BA IDE U100 (rev 04) (prog-if 80 [Master]) Subsystem: IBM: Unknown device 2442 Flags: bus master, medium devsel, latency 0 I/O ports at f000 [size=16] 00:1f.2 USB Controller: Intel Corp. 82801BA/BAM USB (Hub (rev 04) (prog-if 00 [UHCI]) Subsystem: IBM: Unknown device 2442 Flags: bus master, medium devsel, latency 0, IRQ 19 I/O ports at d000 [size=32] 00:1f.3 SMBus: Intel Corp. 82801BA/BAM SMBus (rev 04) Subsystem: IBM: Unknown device 2442 Flags: medium devsel, IRQ 17 I/O ports at 5000 [size=16] 00:1f.5 Multimedia audio controller: Intel Corp. 82801BA/BAM AC'97 Audio (rev 04) Subsystem: IBM: Unknown device 0224 Flags: bus master, medium devsel, latency 0, IRQ 17 I/O ports at d800 [size=256] I/O ports at dc00 [size=64] 01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400 AGP (rev 85) (prog-if 00 [VGA]) Subsystem: Matrox Graphics, Inc. Millennium G450 Dual Head Flags: bus master, medium devsel, latency 64, IRQ 22 Memory at f4000000 (32-bit, prefetchable) [size=32M] Memory at f6000000 (32-bit, non-prefetchable) [size=16K] Memory at f7000000 (32-bit, non-prefetchable) [size=8M] Expansion ROM at <unassigned> [disabled] [size=128K] Capabilities: [dc] Power Management version 2 Capabilities: [f0] AGP version 2.0 02:1f.0 PCI bridge: Intel Corp. 
82806AA PCI64 Hub PCI Bridge (rev 03) (prog-if 00 [Normal decode]) Flags: bus master, 66Mhz, fast devsel, latency 0 Bus: primary=02, secondary=03, subordinate=03, sec-latency=32 Memory behind bridge: fb000000-fb0fffff 03:00.0 PIC: Intel Corp. 82806AA PCI64 Hub Advanced Programmable Interrupt Controller (rev 01) (prog-if 20 [IO(X)-APIC]) Subsystem: Intel Corp. 82806AA PCI64 Hub Advanced Programmable Interrupt Controller Flags: bus master, fast devsel, latency 0 Memory at fb000000 (32-bit, non-prefetchable) [size=4K] 04:04.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0c) Subsystem: IBM: Unknown device 0207 Flags: bus master, medium devsel, latency 32, IRQ 16 Memory at fa020000 (32-bit, non-prefetchable) [size=4K] I/O ports at c000 [size=64] Memory at fa000000 (32-bit, non-prefetchable) [size=128K] Expansion ROM at <unassigned> [disabled] [size=64K] Capabilities: [dc] Power Management version 2 ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: ext[23]/lilo/2.5.{68,69,70} -- IDE Problem?
  2003-06-11 22:08 ` ext[23]/lilo/2.5.{68,69,70} -- IDE Problem? Andy Pfiffer
@ 2003-06-11 23:21 ` Christophe Saout
  2003-06-11 23:40 ` Bartlomiej Zolnierkiewicz
  2003-06-12  0:20 ` ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? Andy Pfiffer
  0 siblings, 2 replies; 28+ messages in thread
From: Christophe Saout @ 2003-06-11 23:21 UTC
  To: Andy Pfiffer; +Cc: linux-kernel

On Thu, 2003-06-12 at 00:08, Andy Pfiffer wrote:

> On Fri, 2003-05-09 at 13:55, Andy Pfiffer wrote:
> > On Fri, 2003-05-09 at 13:04, Christophe Saout wrote:
> > > On Fri, 2003-05-09 at 21:04, Andy Pfiffer wrote:
> > >
> > > > [...]
> > > > I had an unrelated
> > > > delay in posting this due to some strange behavior of late with LILO and
> > > > my ext3-mounted /boot partition (/sbin/lilo would say that it updated,
> > > > but a subsequent reboot would not include my new kernel)
> > >
> > > So I'm not the only one having this problem... I think I first saw this
> > > with 2.5.68 but I'm not sure.
> >
> > Well, that makes two of us for sure.
> >
> > >
> > > My boot partition is a small ext3 partition on a lvm2 volume accessed
> > > over device-mapper (I've written a lilo patch for that, but the patch is
> > > working and) but I don't think that has something to do with the
> > > problem.
> > >
> > > When syncing, unmounting and waiting some time after running lilo, the
> > > changes sometimes seem correctly written to disk, I don't know when
> > > exactly.
> >
> > My /boot is an ext3 partition on an IDE disk.  My symptoms and your
> > symptoms match -- wait awhile, and it works okay.  If you don't wait
> > "long enough" the changes made in /etc/lilo.conf are not reflected in
> > the after running /sbin/lilo and rebooting normally.
> >
> > I have been unable to reproduce this on a uniproc system with SCSI
> > disks.
> >
> > 2.5.67 seems to work in this regard as expected.
> >
> > > Could it be that the location of /boot/map is not written to the
> > > partition sector of /dev/hda? Or not flushed correctly or something?
> > >
> > > After reboot the old kernel came up again (though it was moved to
> > > vmlinuz.old).
> >
> > I don't know -- I haven't isolated it yet.
> >
> > Anyone else?
>
> I have taken another look at this, and can confirm the following:
>
> 1. 2.5.67 works as expected.
> 2. 2.5.68, 2.5.69, and 2.5.70 do not.
> 3. ext2 vs. ext3 for /boot: no effect (ie, .68, .69, .70 demonstrate the
> problem independent of the filesystem used for /boot).

I found out that flushb /dev/<boot_device> helps; syncing doesn't. I'm
not 100% sure that's right, because right now I'm always doing both, but
I remember having only synced before and that didn't help.

> Relative to a 2.5.68 pure BK tree, the deltas from 2.5.67 to 2.5.68 are:
> 1.971.76.10 /* 2.5.67 */
> 1.1124 /* 2.5.68 */
>
> The patch exported by BK between these 2 revs is 297K lines ( a sizeable
> haystack ).  Any ideas about where I should dig for my needle first
> would be welcomed...

There don't seem to be too many changes in /drivers/block or /fs, mostly
cleanups. I personally have no idea where to start, except trying out
each -bk version in between. Hmmm. And I'm not going to do that now...
:-/

-- 
Christophe Saout <christophe@saout.de>
* Re: ext[23]/lilo/2.5.{68,69,70} -- IDE Problem? 2003-06-11 23:21 ` Christophe Saout @ 2003-06-11 23:40 ` Bartlomiej Zolnierkiewicz 2003-06-12 0:20 ` ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? Andy Pfiffer 1 sibling, 0 replies; 28+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2003-06-11 23:40 UTC (permalink / raw) To: Christophe Saout; +Cc: Andy Pfiffer, linux-kernel mm/msync.c: <...> * MS_ASYNC does not start I/O (it used to, up to 2.5.67). <...> You can revert changes to mm/msync.c from 2.5.68 patch and see whether it helps. Regards, -- Bartlomiej On 12 Jun 2003, Christophe Saout wrote: > Am Don, 2003-06-12 um 00.08 schrieb Andy Pfiffer: > > On Fri, 2003-05-09 at 13:55, Andy Pfiffer wrote: > > > On Fri, 2003-05-09 at 13:04, Christophe Saout wrote: > > > > Am Fre, 2003-05-09 um 21.04 schrieb Andy Pfiffer: > > > > > > > > > [...] > > > > > I had an unrelated > > > > > delay in posting this due to some strange behavior of late with LILO and > > > > > my ext3-mounted /boot partition (/sbin/lilo would say that it updated, > > > > > but a subsequent reboot would not include my new kernel) > > > > > > > > So I'm not the only one having this problem... I think I first saw this > > > > with 2.5.68 but I'm not sure. > > > > > > Well, that makes two of us for sure. > > > > > > > > > > > My boot partition is a small ext3 partition on a lvm2 volume accessed > > > > over device-mapper (I've written a lilo patch for that, but the patch is > > > > working and) but I don't think that has something to do with the > > > > problem. > > > > > > > > When syncing, unmounting and waiting some time after running lilo, the > > > > changes sometimes seem correctly written to disk, I don't know when > > > > exactly. > > > > > > My /boot is an ext3 partition on an IDE disk. My symptoms and your > > > symptoms match -- wait awhile, and it works okay. 
If you don't wait > > > "long enough" the changes made in /etc/lilo.conf are not reflected in > > > the after running /sbin/lilo and rebooting normally. > > > > > > I have been unable to reproduce this on a uniproc system with SCSI > > > disks. > > > > > > 2.5.67 seems to work in this regard as expected. > > > > > > > Could it be that the location of /boot/map is not written to the > > > > partition sector of /dev/hda? Or not flushed correctly or something? > > > > > > > > After reboot the old kernel came up again (though it was moved to > > > > vmlinuz.old). > > > > > > I don't know -- I haven't isolated it yet. > > > > > > Anyone else? > > > > I have taken another look at this, and can confirm the following: > > > > 1. 2.5.67 works as expected. > > 2. 2.5.68, 2.5.69, and 2.5.70 do not. > > 3. ext2 vs. ext3 for /boot: no effect (ie, .68, .69, .70 demonstrate the > > problem independent of the filesystem used for /boot). > > I found out that flushb /dev/<boot_device> helps, syncing doesn't. Not > 100% sure if that's right, because right now I'm always doing both, but > I remember having only synced before and that didn't help. > > > Relative to a 2.5.68 pure BK tree, the deltas from 2.5.67 to 2.5.68 are: > > 1.971.76.10 /* 2.5.67 */ > > 1.1124 /* 2.5.68 */ > > > > The patch exported by BK between these 2 revs is 297K lines ( a sizeable > > haystack ). Any ideas about where I should dig for my needle first > > would be welcomed... > > There don't seem to be too much changes in /drivers/block or /fs, mostly > cleanups. I personally have no idea where to start, except trying out > each -bk version inbetween. Hmmm. And I'm not going to do that now... > :-/ > > -- > Christophe Saout <christophe@saout.de> ^ permalink raw reply [flat|nested] 28+ messages in thread
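
As background to the mm/msync.c comment quoted in Bartlomiej's message, here is a minimal user-space sketch in C of the difference that comment describes: data written through a shared mapping is forced out with msync(MS_SYNC), which waits for the write-back, instead of MS_ASYNC, which per the quoted comment no longer starts I/O after 2.5.67. Whether LILO relies on MS_ASYNC is not established anywhere in this thread, so this is an illustration under that assumption only. The helper name write_via_mmap_sync and the file path in the usage note are hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): write `len` bytes to `path`
 * through a shared mapping and force them out with msync(MS_SYNC).
 * Returns 0 on success, -1 on any failure.
 */
int write_via_mmap_sync(const char *path, const char *data, size_t len)
{
	int fd;
	void *map;
	int rc = -1;

	if (len == 0)
		return -1;	/* mmap() rejects zero-length mappings */

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* The file must be at least `len` bytes long before it is mapped. */
	if (ftruncate(fd, (off_t)len) < 0)
		goto out_close;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	memcpy(map, data, len);

	/*
	 * MS_SYNC blocks until the write-back has completed.  MS_ASYNC does
	 * not wait, and per the mm/msync.c comment quoted above it no longer
	 * starts I/O after 2.5.67.
	 */
	if (msync(map, len, MS_SYNC) == 0)
		rc = 0;

	if (munmap(map, len) < 0)
		rc = -1;
out_close:
	if (close(fd) < 0)
		rc = -1;
	return rc;
}
```

A caller would use it as write_via_mmap_sync("/tmp/example.bin", buf, n) and treat a return of -1 as "data may not be on disk". Note that this only covers the mapped file; it says nothing about writes made directly to a block device node.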
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?
  2003-06-11 23:21 ` Christophe Saout
  2003-06-11 23:40 ` Bartlomiej Zolnierkiewicz
@ 2003-06-12  0:20 ` Andy Pfiffer
  2003-06-12  0:29 ` Andrew Morton
  1 sibling, 1 reply; 28+ messages in thread
From: Andy Pfiffer @ 2003-06-12  0:20 UTC
  To: Christophe Saout, adam; +Cc: linux-kernel

On Wed, 2003-06-11 at 16:21, Christophe Saout wrote:
> On Thu, 2003-06-12 at 00:08, Andy Pfiffer wrote:
> > On Fri, 2003-05-09 at 13:55, Andy Pfiffer wrote:
> > > On Fri, 2003-05-09 at 13:04, Christophe Saout wrote:
> > > > On Fri, 2003-05-09 at 21:04, Andy Pfiffer wrote:
> > > >
> > > > > [...]
> > > > > I had an unrelated
> > > > > delay in posting this due to some strange behavior of late with LILO and
> > > > > my ext3-mounted /boot partition (/sbin/lilo would say that it updated,
> > > > > but a subsequent reboot would not include my new kernel)
> > > >
> > > > So I'm not the only one having this problem... I think I first saw this
> > > > with 2.5.68 but I'm not sure.

<snip>

> > I have taken another look at this, and can confirm the following:
> >
> > 1. 2.5.67 works as expected.
> > 2. 2.5.68, 2.5.69, and 2.5.70 do not.
> > 3. ext2 vs. ext3 for /boot: no effect (ie, .68, .69, .70 demonstrate the
> > problem independent of the filesystem used for /boot).
>
> I found out that flushb /dev/<boot_device> helps, syncing doesn't. Not
> 100% sure if that's right, because right now I'm always doing both, but
> I remember having only synced before and that didn't help.

<snip>

A little more digging reveals this thread from May 14, 2003:
http://marc.theaimsgroup.com/?l=linux-kernel&m=105296774516509&w=2

Applying the kludge in Adam's message:

--- linux-2.5.69/fs/block_dev.c.orig	2003-05-14 17:43:40.000000000 -0700
+++ linux-2.5.69/fs/block_dev.c	2003-05-14 17:44:29.000000000 -0700
@@ -635,14 +635,24 @@
 int blkdev_put(struct block_device *bdev, int kind)
 {
 	int ret = 0;
 	struct inode *bd_inode = bdev->bd_inode;
 	struct gendisk *disk = bdev->bd_disk;
 
 	down(&bdev->bd_sem);
+
+	/* AJR start */
+	switch (kind) {
+	case BDEV_FILE:
+	case BDEV_FS:
+		sync_blockdev(bd_inode->i_bdev);
+		break;
+	}
+	/* AJR end */
+
 	lock_kernel();
 	if (!--bdev->bd_openers) {
 		switch (kind) {
 		case BDEV_FILE:
 		case BDEV_FS:
 			sync_blockdev(bd_inode->i_bdev);
 			break;

made things work for me in 2.5.68.  I suspect it will make things work
for .70 as well.

So now the important question: is it wrong to not sync_blockdev() until
the count drops to 0?

Andy
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem?
  2003-06-12  0:20 ` ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? Andy Pfiffer
@ 2003-06-12  0:29 ` Andrew Morton
  2003-06-12 10:42 ` Christophe Saout
  2003-06-12 17:27 ` Andy Pfiffer
  0 siblings, 2 replies; 28+ messages in thread
From: Andrew Morton @ 2003-06-12  0:29 UTC
  To: Andy Pfiffer; +Cc: christophe, adam, linux-kernel

Andy Pfiffer <andyp@osdl.org> wrote:
>
> So now the important question: is it wrong to not sync_blockdev() until
> the count drops to 0?

Should be OK.  The close will not sync anything if someone else has the
blockdev open (ie: there's a filesystem mounted there).

But sync() should certainly write everything out, and lilo does perform a
sync.

I'd be interested in seeing the contents of /proc/meminfo immediately after
the lilo run, see if there's any dirty memory left around.
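
The behaviour discussed in this exchange (sync only when the opener count reaches zero, versus the kludge that syncs on every put) can be pictured with a small toy model in C. This is only a sketch of the counting logic: struct toy_bdev and the toy_* functions are hypothetical names invented for illustration, no real kernel structures or locking are involved, and the "sync" is just a counter.

```c
#include <stdbool.h>

/* Toy stand-in for a block device; not the kernel's struct block_device. */
struct toy_bdev {
	int openers;	/* how many opens are outstanding */
	int syncs;	/* how many times the toy "sync" ran */
};

/* Record one more opener (e.g. a mounted filesystem, or a tool like lilo). */
void toy_get(struct toy_bdev *bdev)
{
	bdev->openers++;
}

/*
 * Shape of the unpatched logic: the sync runs only when the opener
 * count drops to zero.  Returns true if this put performed a sync.
 */
bool toy_put(struct toy_bdev *bdev)
{
	if (--bdev->openers == 0) {
		bdev->syncs++;
		return true;
	}
	return false;
}

/*
 * Shape of the kludge: an unconditional sync is added in front, and the
 * original last-close sync is left in place.  Always returns true.
 */
bool toy_put_kludge(struct toy_bdev *bdev)
{
	bdev->syncs++;			/* added unconditional sync */
	if (--bdev->openers == 0)
		bdev->syncs++;		/* original last-close sync */
	return true;
}
```

With two openers (say, a mounted filesystem plus one tool holding the device open), the first toy_put() returns false and leaves syncs at 0, while toy_put_kludge() in the same situation bumps syncs to 1. That matches the observation in the thread that a close does not sync while someone else still has the device open.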
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-12 0:29 ` Andrew Morton @ 2003-06-12 10:42 ` Christophe Saout 2003-06-12 10:54 ` Andrew Morton 2003-06-12 17:27 ` Andy Pfiffer 1 sibling, 1 reply; 28+ messages in thread
From: Christophe Saout @ 2003-06-12 10:42 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andy Pfiffer, adam, linux-kernel

On Thu, 2003-06-12 at 02:29, Andrew Morton wrote:

> But sync() should certainly write everything out, and lilo does perform a
> sync.

Yep.

> I'd be interested in seeing the contents of /proc/meminfo immediately after
> the lilo run, see if there's any dirty memory left around.

Yes, one page. After running lilo, there are 4k dirty, running sync
doesn't get it below 4k. Only flushb /dev/hda does (or waiting several
minutes).

If you're interested, I've put an annotated version of

( cat /proc/meminfo; lilo; cat /proc/meminfo; sync; cat /proc/meminfo;
  flushb /dev/hda; cat /proc/meminfo ) | buffer > meminfo.out.txt

on my web space: http://www.saout.de/files/meminfo.out.txt

(the kernel used was 2.5.70-mm7 with some unrelated patches backed out)

BTW: I found out that now strace lilo freezes the machine...

--
Christophe Saout <christophe@saout.de>

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-12 10:42 ` Christophe Saout @ 2003-06-12 10:54 ` Andrew Morton 2003-06-12 11:12 ` Christophe Saout ` (2 more replies) 0 siblings, 3 replies; 28+ messages in thread
From: Andrew Morton @ 2003-06-12 10:54 UTC (permalink / raw)
To: Christophe Saout; +Cc: andyp, adam, linux-kernel

Christophe Saout <christophe@saout.de> wrote:
>
> On Thu, 2003-06-12 at 02:29, Andrew Morton wrote:
>
> > But sync() should certainly write everything out, and lilo does perform a
> > sync.
>
> Yep.
>
> > I'd be interested in seeing the contents of /proc/meminfo immediately after
> > the lilo run, see if there's any dirty memory left around.
>
> Yes, one page. After running lilo, there are 4k dirty, running sync
> doesn't get it below 4k.

That would tend to imply that a page got onto the wrong list. But if that
were so, nothing would be able to write it.

> Only flushb /dev/hda does (or waiting several minutes).

What is flushb?

I use `lilo ; reboot -f' about 1000 times a day, no probs. There's
something different.

Adam was doing strange things with an initrd and pivot_root. Are you doing
anything unconventional?

>
> BTW: I found out that now strace lilo freezes the machine...

Works OK here. Try `strace strace lilo' ;)

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-12 10:54 ` Andrew Morton @ 2003-06-12 11:12 ` Christophe Saout 2003-06-12 11:24 ` ext[23]/lilo/2.5.{68,69,70} -- strace lilo - freeze Christophe Saout 2003-06-12 12:44 ` ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? Herbert Xu 2 siblings, 0 replies; 28+ messages in thread
From: Christophe Saout @ 2003-06-12 11:12 UTC (permalink / raw)
To: Andrew Morton; +Cc: andyp, adam, linux-kernel

On Thu, 2003-06-12 at 12:54, Andrew Morton wrote:
> Christophe Saout <christophe@saout.de> wrote:
> >
> > On Thu, 2003-06-12 at 02:29, Andrew Morton wrote:
> >
> > > I'd be interested in seeing the contents of /proc/meminfo immediately after
> > > the lilo run, see if there's any dirty memory left around.
> >
> > Yes, one page. After running lilo, there are 4k dirty, running sync
> > doesn't get it below 4k.
>
> That would tend to imply that a page got onto the wrong list. But if that
> were so, nothing would be able to write it.
>
> > Only flushb /dev/hda does (or waiting several minutes).
>
> What is flushb?

A program that does a flush ioctl on a block device:

open("/dev/hda", O_RDONLY) = 3
ioctl(3, BLKFLSBUF, 0) = 0

> I use `lilo ; reboot -f' about 1000 times a day, no probs. There's
> something different.
>
> Adam was doing strange things with an initrd and pivot_root. Are you doing
> anything unconventional?

I'm using an initrd (but no pivot_root) that initializes my LVM2 volumes
(using device-mapper). /boot and / are on device-mapper devices.

> > BTW: I found out that now strace lilo freezes the machine...
>
> Works OK here. Try `strace strace lilo' ;)

I'll try to find out what happens. Not interested in crashing my system
while answering emails now. ;)

--
Christophe Saout <christophe@saout.de>

^ permalink raw reply [flat|nested] 28+ messages in thread
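For anyone without flushb at hand, the two calls in the strace output above are easy to reproduce. What follows is a minimal sketch, not the actual flushb source: the function name flush_block_device is made up for illustration, and it assumes a Linux system where <linux/fs.h> provides BLKFLSBUF. The ioctl normally needs root, so expect it to fail for an ordinary user.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* BLKFLSBUF */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a block device read-only and ask the kernel
 * to flush its buffer cache, mirroring the open()/ioctl() pair seen in
 * the strace output.  Returns 0 on success, -1 on failure with errno
 * left as set by open() or ioctl().
 */
int flush_block_device(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = ioctl(fd, BLKFLSBUF, 0);
    int saved_errno = errno;    /* close() may overwrite errno */

    close(fd);
    errno = saved_errno;
    return rc < 0 ? -1 : 0;
}
```

Calling flush_block_device("/dev/hda") as root after the lilo run would correspond to the flushb /dev/hda step discussed in this thread.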
* Re: ext[23]/lilo/2.5.{68,69,70} -- strace lilo - freeze 2003-06-12 10:54 ` Andrew Morton 2003-06-12 11:12 ` Christophe Saout @ 2003-06-12 11:24 ` Christophe Saout 2003-06-12 12:44 ` ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? Herbert Xu 2 siblings, 0 replies; 28+ messages in thread
From: Christophe Saout @ 2003-06-12 11:24 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel

On Thu, 2003-06-12 at 12:54, Andrew Morton wrote:

> > BTW: I found out that now strace lilo freezes the machine...
> Works OK here. Try `strace strace lilo' ;)

Since we are already talking about syncing...

The last thing "strace lilo" shows is:

fsync(5

--
Christophe Saout <christophe@saout.de>

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-12 10:54 ` Andrew Morton 2003-06-12 11:12 ` Christophe Saout 2003-06-12 11:24 ` ext[23]/lilo/2.5.{68,69,70} -- strace lilo - freeze Christophe Saout @ 2003-06-12 12:44 ` Herbert Xu 2 siblings, 0 replies; 28+ messages in thread From: Herbert Xu @ 2003-06-12 12:44 UTC (permalink / raw) To: Andrew Morton, linux-kernel Andrew Morton <akpm@digeo.com> wrote: > > I use `lilo ; reboot -f' about 1000 times a day, no probs. There's > something different. > > Adam was doing strange things with an initrd and pivot_root. Are you doing > anything unconventional? I see exactly the same problem with lilo and I too use initrd + pivot_root. I think Adam's post referred to elsewhere in this thread already identified the problem as initrd-only. -- Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ ) Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-12 0:29 ` Andrew Morton 2003-06-12 10:42 ` Christophe Saout @ 2003-06-12 17:27 ` Andy Pfiffer 2003-06-12 17:53 ` Andrew Morton 1 sibling, 1 reply; 28+ messages in thread
From: Andy Pfiffer @ 2003-06-12 17:27 UTC (permalink / raw)
To: Andrew Morton; +Cc: christophe, adam, linux-kernel

On Wed, 2003-06-11 at 17:29, Andrew Morton wrote:
> Andy Pfiffer <andyp@osdl.org> wrote:
> >
> > So now the important question: is it wrong to not sync_blockdev() until
> > the count drops to 0?
>
> Should be OK. The close will not sync anything if someone else has the
> blockdev open (ie: there's a filesystem mounted there).
>
> But sync() should certainly write everything out, and lilo does perform a
> sync.
>
> I'd be interested in seeing the contents of /proc/meminfo immediately after
> the lilo run, see if there's any dirty memory left around.

How I measured:

# cat measure
#!/bin/sh
sync
cat /proc/meminfo
/sbin/lilo
cat /proc/meminfo
# ./measure | dd bs=1024k > out

2.5.68 pure:

Before:                        After:
MemTotal:       514304 kB      MemTotal:       514304 kB
MemFree:        388720 kB      MemFree:        385648 kB
Buffers:         10056 kB      Buffers:         12092 kB
Cached:          54000 kB      Cached:          54956 kB
SwapCached:          0 kB      SwapCached:          0 kB
Active:          62928 kB      Active:          63240 kB
Inactive:        34416 kB      Inactive:        37120 kB
HighTotal:           0 kB      HighTotal:           0 kB
HighFree:            0 kB      HighFree:            0 kB
LowTotal:       514304 kB      LowTotal:       514304 kB
LowFree:        388720 kB      LowFree:        385648 kB
SwapTotal:      787144 kB      SwapTotal:      787144 kB
SwapFree:       787144 kB      SwapFree:       787144 kB
Dirty:               0 kB      Dirty:               8 kB  <---
Writeback:           0 kB      Writeback:           0 kB
Mapped:          45484 kB      Mapped:          45488 kB
Slab:            11880 kB      Slab:            12100 kB
Committed_AS:   146184 kB      Committed_AS:   146184 kB
PageTables:        656 kB      PageTables:        656 kB
VmallocTotal:   516040 kB      VmallocTotal:   516040 kB
VmallocUsed:     42608 kB      VmallocUsed:     42608 kB
VmallocChunk:   473432 kB      VmallocChunk:   473432 kB

2.5.68+kludge:

Before:                        After:
MemTotal:       514304 kB      MemTotal:       514304 kB
MemFree:        390416 kB      MemFree:        387216 kB
Buffers:          9844 kB      Buffers:         11892 kB
Cached:          52920 kB      Cached:          53864 kB
SwapCached:          0 kB      SwapCached:          0 kB
Active:          62136 kB      Active:          62452 kB
Inactive:        33908 kB      Inactive:        36600 kB
HighTotal:           0 kB      HighTotal:           0 kB
HighFree:            0 kB      HighFree:            0 kB
LowTotal:       514304 kB      LowTotal:       514304 kB
LowFree:        390416 kB      LowFree:        387216 kB
SwapTotal:      787144 kB      SwapTotal:      787144 kB
SwapFree:       787144 kB      SwapFree:       787144 kB
Dirty:               0 kB      Dirty:               4 kB  <---
Writeback:           0 kB      Writeback:           0 kB
Mapped:          45448 kB      Mapped:          45452 kB
Slab:            11564 kB      Slab:            11756 kB
Committed_AS:   146192 kB      Committed_AS:   146132 kB
PageTables:        656 kB      PageTables:        656 kB
VmallocTotal:   516040 kB      VmallocTotal:   516040 kB
VmallocUsed:     42608 kB      VmallocUsed:     42608 kB
VmallocChunk:   473432 kB      VmallocChunk:   473432 kB

^ permalink raw reply [flat|nested] 28+ messages in thread
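The Dirty: line is the only one that really matters in the listings above. A small helper along these lines could pull it out before and after the lilo run instead of eyeballing the full dump. It is only a sketch: both function names are hypothetical, and it assumes /proc/meminfo keeps the "Name: value kB" layout shown in this message.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find "key:" at the start of a line in a
 * meminfo-style text buffer and return its value in kB, or -1 if the
 * key is missing or has no number after it.
 */
long meminfo_field_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;
            /* %ld skips the padding between the colon and the number */
            if (sscanf(line + klen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');  /* advance to the next line */
        if (line)
            line++;
    }
    return -1;
}

/* Read /proc/meminfo once and look up a single field, e.g. "Dirty". */
long read_meminfo_field_kb(const char *key)
{
    char buf[8192];
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f)
        return -1;

    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return meminfo_field_kb(buf, key);
}
```

Calling read_meminfo_field_kb("Dirty") right after sync and again right after /sbin/lilo would give the two numbers marked with <--- in the tables.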
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-12 17:27 ` Andy Pfiffer @ 2003-06-12 17:53 ` Andrew Morton 2003-06-12 18:03 ` Andy Pfiffer 0 siblings, 1 reply; 28+ messages in thread From: Andrew Morton @ 2003-06-12 17:53 UTC (permalink / raw) To: Andy Pfiffer; +Cc: christophe, adam, linux-kernel Andy Pfiffer <andyp@osdl.org> wrote: > > Dirty: 0 kB Dirty: 4 kB <--- OK. And are you using initrd as well? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-12 17:53 ` Andrew Morton @ 2003-06-12 18:03 ` Andy Pfiffer 2003-06-12 18:10 ` Andrew Morton 2003-06-12 18:25 ` Andy Pfiffer 0 siblings, 2 replies; 28+ messages in thread From: Andy Pfiffer @ 2003-06-12 18:03 UTC (permalink / raw) To: Andrew Morton; +Cc: christophe, adam, linux-kernel On Thu, 2003-06-12 at 10:53, Andrew Morton wrote: > Andy Pfiffer <andyp@osdl.org> wrote: > > > > Dirty: 0 kB Dirty: 4 kB <--- > > OK. And are you using initrd as well? It is specified in lilo.conf (SuSE 8.0 distro) but I don't see any reason to keep it. I'll yank it and see if it makes a difference. Andy ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-12 18:03 ` Andy Pfiffer @ 2003-06-12 18:10 ` Andrew Morton 2003-06-12 18:53 ` Christophe Saout 2003-06-12 18:25 ` Andy Pfiffer 1 sibling, 1 reply; 28+ messages in thread
From: Andrew Morton @ 2003-06-12 18:10 UTC (permalink / raw)
To: Andy Pfiffer; +Cc: christophe, adam, linux-kernel

Andy Pfiffer <andyp@osdl.org> wrote:
>
> On Thu, 2003-06-12 at 10:53, Andrew Morton wrote:
> > Andy Pfiffer <andyp@osdl.org> wrote:
> > >
> > > Dirty: 0 kB Dirty: 4 kB <---
> >
> > OK. And are you using initrd as well?
>
> It is specified in lilo.conf (SuSE 8.0 distro) but I don't see any
> reason to keep it. I'll yank it and see if it makes a difference.
>

That would be interesting.

Also, what about this shot in the dark?

--- 25/fs/fs-writeback.c~a      2003-06-12 11:08:34.000000000 -0700
+++ 25-akpm/fs/fs-writeback.c   2003-06-12 11:08:39.000000000 -0700
@@ -368,7 +368,7 @@ void sync_inodes_sb(struct super_block *
        };

        get_page_state(&ps);
-       wbc.nr_to_write = ps.nr_dirty + ps.nr_unstable +
+       wbc.nr_to_write = ps.nr_dirty + ps.nr_unstable + 1024 +
                        (ps.nr_dirty + ps.nr_unstable) / 4;
        spin_lock(&inode_lock);
        sync_sb_inodes(sb, &wbc);

_

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-12 18:10 ` Andrew Morton @ 2003-06-12 18:53 ` Christophe Saout 0 siblings, 0 replies; 28+ messages in thread
From: Christophe Saout @ 2003-06-12 18:53 UTC (permalink / raw)
To: Andrew Morton; +Cc: Andy Pfiffer, adam, linux-kernel

On Thu, 2003-06-12 at 20:10, Andrew Morton wrote:

> Also, what about this shot in the dark?
> -       wbc.nr_to_write = ps.nr_dirty + ps.nr_unstable +
> +       wbc.nr_to_write = ps.nr_dirty + ps.nr_unstable + 1024 +

Nope, still 4k dirty left after lilo.

--
Christophe Saout <christophe@saout.de>

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-12 18:03 ` Andy Pfiffer 2003-06-12 18:10 ` Andrew Morton @ 2003-06-12 18:25 ` Andy Pfiffer 2003-06-13 8:01 ` Andrew Morton 1 sibling, 1 reply; 28+ messages in thread
From: Andy Pfiffer @ 2003-06-12 18:25 UTC (permalink / raw)
To: Andrew Morton; +Cc: christophe, adam, linux-kernel

On Thu, 2003-06-12 at 11:03, Andy Pfiffer wrote:
> On Thu, 2003-06-12 at 10:53, Andrew Morton wrote:
> > Andy Pfiffer <andyp@osdl.org> wrote:
> > >
> > > Dirty: 0 kB Dirty: 4 kB <---
> >
> > OK. And are you using initrd as well?
>
> It is specified in lilo.conf (SuSE 8.0 distro) but I don't see any
> reason to keep it. I'll yank it and see if it makes a difference.

pure   == 2.5.68
kludge == 2.5.68+kludge in blkdev_put()

% grep Dirt =noinitrd-*
=noinitrd-kludge=:Dirty:    0 kB    # before
=noinitrd-kludge=:Dirty:    4 kB    # after
=noinitrd-pure=:Dirty:      0 kB    # before
=noinitrd-pure=:Dirty:      4 kB    # after

So it would appear to me that initrd is the common denominator among
those of us reporting similar symptoms.

Andy

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-12 18:25 ` Andy Pfiffer @ 2003-06-13 8:01 ` Andrew Morton 2003-06-13 9:57 ` Herbert Xu ` (3 more replies) 0 siblings, 4 replies; 28+ messages in thread
From: Andrew Morton @ 2003-06-13 8:01 UTC (permalink / raw)
To: Andy Pfiffer
Cc: christophe, adam, linux-kernel, Herbert Xu, Unai Garro Arrazola, Max Valdez, Eduardo Pereira Habkost

This should fix it.

Once the blockdev inode for /dev/ram0 is dirtied we have a memory-backed
inode on the blockdev superblock's s_dirty list.

sync_sb_inodes() sees the memory-backed inode on the superblock and assumes
that all the other inodes on the superblock are also memory-backed. This
is not true for the blockdev superblock! We forget to write out dirty
pages against the following blockdevs.

Fix this by just leaving the inode dirty and moving on to inspect the other
blockdev inodes on sb->s_io.

(This is a little inefficient: an alternative is to leave dirtied
memory-backed inodes on inode_in_use, so nobody ever even considers them
for writeout. But that introduces an inconsistency and is a bit kludgey).

 fs/fs-writeback.c |   15 ++++++++++++++-
 1 files changed, 14 insertions(+), 1 deletion(-)

diff -puN fs/fs-writeback.c~writeback-memory-backed-fix fs/fs-writeback.c
--- 25/fs/fs-writeback.c~writeback-memory-backed-fix    2003-06-12 23:12:28.000000000 -0700
+++ 25-akpm/fs/fs-writeback.c   2003-06-12 23:14:07.000000000 -0700
@@ -260,8 +260,21 @@ sync_sb_inodes(struct super_block *sb, s
                struct address_space *mapping = inode->i_mapping;
                struct backing_dev_info *bdi = mapping->backing_dev_info;

-               if (bdi->memory_backed)
+               if (bdi->memory_backed) {
+                       if (sb == blockdev_superblock) {
+                               /*
+                                * Dirty memory-backed blockdev: the ramdisk
+                                * driver does this.
+                                */
+                               list_move(&inode->i_list, &sb->s_dirty);
+                               continue;
+                       }
+                       /*
+                        * Assume that all inodes on this superblock are memory
+                        * backed.  Skip the superblock.
+                        */
                        break;
+               }

                if (wbc->nonblocking && bdi_write_congested(bdi)) {
                        wbc->encountered_congestion = 1;

_

^ permalink raw reply [flat|nested] 28+ messages in thread
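The failure mode described in the message above can be reproduced in a toy user-space model. This is only a sketch and not kernel code: struct toy_inode and both functions are invented for illustration, and a plain array stands in for the superblock's list of dirty inodes, with a ramdisk inode queued ahead of a disk-backed one.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an inode on a superblock's dirty list (hypothetical). */
struct toy_inode {
    const char *name;
    bool memory_backed;   /* models bdi->memory_backed */
    bool dirty;
};

/*
 * Models the old behaviour: the first memory-backed inode makes the
 * loop give up on the whole superblock.  Returns inodes written.
 */
size_t toy_sync_old(struct toy_inode *inodes, size_t n)
{
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        if (inodes[i].memory_backed)
            break;                      /* everything after is skipped */
        if (inodes[i].dirty) {
            inodes[i].dirty = false;    /* "write it out" */
            written++;
        }
    }
    return written;
}

/*
 * Models the patched behaviour: on the blockdev superblock a
 * memory-backed inode is left dirty and the loop moves on.
 */
size_t toy_sync_new(struct toy_inode *inodes, size_t n, bool is_blockdev_sb)
{
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        if (inodes[i].memory_backed) {
            if (is_blockdev_sb)
                continue;               /* leave it dirty, keep going */
            break;                      /* purely memory-backed superblock */
        }
        if (inodes[i].dirty) {
            inodes[i].dirty = false;
            written++;
        }
    }
    return written;
}
```

With a list of { ram0 (memory-backed, dirty), hda (disk-backed, dirty) }, the old loop writes nothing and leaves hda dirty, which would match the stray dirty page people saw after lilo; the new loop skips ram0 and writes hda.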
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-13 8:01 ` Andrew Morton @ 2003-06-13 9:57 ` Herbert Xu 2003-06-13 14:42 ` Eduardo Pereira Habkost ` (2 subsequent siblings) 3 siblings, 0 replies; 28+ messages in thread From: Herbert Xu @ 2003-06-13 9:57 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel On Fri, Jun 13, 2003 at 01:01:49AM -0700, Andrew Morton wrote: > > Fix this by just leaving the inode dirty and moving on to inspect the other > blockdev inodes on sb->s_io. This fixes it for me. Thanks Andrew. -- Debian GNU/Linux 3.0 is out! ( http://www.debian.org/ ) Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-13 8:01 ` Andrew Morton 2003-06-13 9:57 ` Herbert Xu @ 2003-06-13 14:42 ` Eduardo Pereira Habkost 2003-06-13 17:17 ` Andy Pfiffer 2003-06-13 22:12 ` Unai Garro Arrazola 3 siblings, 0 replies; 28+ messages in thread From: Eduardo Pereira Habkost @ 2003-06-13 14:42 UTC (permalink / raw) To: Andrew Morton Cc: Andy Pfiffer, christophe, adam, linux-kernel, Herbert Xu, Unai Garro Arrazola, Max Valdez [-- Attachment #1: Type: text/plain, Size: 138 bytes --] On Fri, Jun 13, 2003 at 01:01:49AM -0700, Andrew Morton wrote: > > This should fix it. > It worked here. Thanks! -- Eduardo [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-13 8:01 ` Andrew Morton 2003-06-13 9:57 ` Herbert Xu 2003-06-13 14:42 ` Eduardo Pereira Habkost @ 2003-06-13 17:17 ` Andy Pfiffer 2003-06-13 22:12 ` Unai Garro Arrazola 3 siblings, 0 replies; 28+ messages in thread From: Andy Pfiffer @ 2003-06-13 17:17 UTC (permalink / raw) To: Andrew Morton Cc: christophe, adam, linux-kernel, Herbert Xu, Unai Garro Arrazola, Max Valdez, Eduardo Pereira Habkost On Fri, 2003-06-13 at 01:01, Andrew Morton wrote: > Fix this by just leaving the inode dirty and moving on to inspect the other > blockdev inodes on sb->s_io. Yup, this fixed it for me, too. Thanks for your help. --Andy ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-13 8:01 ` Andrew Morton ` (2 preceding siblings ...) 2003-06-13 17:17 ` Andy Pfiffer @ 2003-06-13 22:12 ` Unai Garro Arrazola 2003-06-13 22:28 ` Andrew Morton 3 siblings, 1 reply; 28+ messages in thread
From: Unai Garro Arrazola @ 2003-06-13 22:12 UTC (permalink / raw)
To: Andrew Morton, Andy Pfiffer
Cc: christophe, adam, linux-kernel, Herbert Xu, Max Valdez, Eduardo Pereira Habkost

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I just got the time to check. It works great, thanks!

Where can I send this box of chocolates as gratitude? ;-)

On Friday 13 June 2003 09:01, Andrew Morton wrote:
> This should fix it.
>
>
>
> Once the blockdev inode for /dev/ram0 is dirtied we have a memory-backed
> inode on the blockdev superblock's s_dirty list.
>
> sync_sb_inodes() sees the memory-backed inode on the superblock and assumes
> that all the other inodes on the superblock are also memory-backed. This
> is not true for the blockdev superblock! We forget to write out dirty
> pages against the following blockdevs.
>
> Fix this by just leaving the inode dirty and moving on to inspect the other
> blockdev inodes on sb->s_io.
>
> (This is a little inefficient: an alternative is to leave dirtied
> memory-backed inodes on inode_in_use, so nobody ever even considers them
> for writeout. But that introduces an inconsistency and is a bit kludgey).
> > > > fs/fs-writeback.c | 15 ++++++++++++++- > 1 files changed, 14 insertions(+), 1 deletion(-) > > diff -puN fs/fs-writeback.c~writeback-memory-backed-fix fs/fs-writeback.c > --- 25/fs/fs-writeback.c~writeback-memory-backed-fix 2003-06-12 > 23:12:28.000000000 -0700 +++ 25-akpm/fs/fs-writeback.c 2003-06-12 > 23:14:07.000000000 -0700 > @@ -260,8 +260,21 @@ sync_sb_inodes(struct super_block *sb, s > struct address_space *mapping = inode->i_mapping; > struct backing_dev_info *bdi = mapping->backing_dev_info; > > - if (bdi->memory_backed) > + if (bdi->memory_backed) { > + if (sb == blockdev_superblock) { > + /* > + * Dirty memory-backed blockdev: the ramdisk > + * driver does this. > + */ > + list_move(&inode->i_list, &sb->s_dirty); > + continue; > + } > + /* > + * Assume that all inodes on this superblock are memory > + * backed. Skip the superblock. > + */ > break; > + } > > if (wbc->nonblocking && bdi_write_congested(bdi)) { > wbc->encountered_congestion = 1; > > _ - -- Coincidences are spiritual puns. -- G.K. Chesterton -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQE+6kxjhxDfDIoZlaURAsHrAKCRFnHCpzdBbtJ8C9vrY6P7T9+dYACgg+fL XYizhhJD8KZ3bO4O/YzXr2c= =Rwik -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: ext[23]/lilo/2.5.{68,69,70} -- blkdev_put() problem? 2003-06-13 22:12 ` Unai Garro Arrazola @ 2003-06-13 22:28 ` Andrew Morton 0 siblings, 0 replies; 28+ messages in thread
From: Andrew Morton @ 2003-06-13 22:28 UTC (permalink / raw)
To: Unai Garro Arrazola
Cc: andyp, christophe, adam, linux-kernel, herbert, maxvalde, ehabkost

Unai Garro Arrazola <Unai.Garro@ee.ed.ac.uk> wrote:
>
> I just got the time to check. It works great, thanks!

Thanks for following up.

> Where can I send this box of chocolates as gratitude? ;-)

Not to the guy who broke it in the first place ;)

^ permalink raw reply [flat|nested] 28+ messages in thread