linux-kernel.vger.kernel.org archive mirror
* [PATCH] laptop-mode-2.6.0, version 5
       [not found] <20040102025509.91753.qmail@mail.com>
@ 2004-01-02 13:01 ` Bart Samwel
  2004-01-02 21:05   ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Bart Samwel @ 2004-01-02 13:01 UTC (permalink / raw)
  To: Simon Mackinlay; +Cc: Andrew Morton, linux-kernel, Jens Axboe

[-- Attachment #1: Type: text/plain, Size: 3790 bytes --]

Simon Mackinlay wrote:
>>I think what you want to do is to increase dirty_background_ratio and
>>dirty_ratio, so the system buffers _more_ dirty data before lighting up the
>>disk.  But it shouldn't be disabled altogether.
> 
> The greater component of functional gain from laptop-mode derives
> from the notion that we want to schedule writeout events with some
> controlled period (default in documentation is 10min interval, at
> next journal commit event, or at next read event, whichever arrives
> first).
> 
> Is it possible instead to put a ceiling on the amount of data we're
> willing to accept for writeout and somehow apply back-pressure
> to the task (or even, more coarsely, to the system as a whole),
> rather than arbitrarily stretching these ratios to ${MAGIC NUMBER}
> when laptop mode is active - given that the overall goal is to
> control what triggers may result in disk activity.
> 
> In other words - what happens at present if we're unable to
> accept more data into the system buffers because we're already
> out of memory (and quite possibly now awaiting completion of
> a writeout), and are we not able to just allow this condition
> (of being apparently out of memory - and presumably stalling
> writers) to persist until either of...
> 
>    * the timers controlled via laptop-mode permit a writeout event
>    * a journal commit event
>    * a read event
> 
> ... occurs?
> 
> Apologies in advance for the naivete - but this functionality
> seems too useful to ignore, particularly considering that MTBF
> figures and warranties for notebook drives tend to put a ceiling
> against the number of operational hours for which the unit will be
> supported - also considering that the periodicity of both writeout
> and journal commit are tunable (with safe defaults), and that the
> mode itself is not default (ie, notebook users, here's a gun,
> don't shoot your feet would seem to apply?).

I think Andrew has a point. The writeback thresholds (dirty_ratio and 
dirty_background_ratio) should simply be set to possibly higher and, more 
importantly, *equal* values. This will effectively disable background 
writeouts (which are good for speed, but bad for battery power because 
they spin up your disk sooner than necessary), but it will not make the 
system keep accumulating dirty pages ad infinitum like the current 
laptop_mode patch does. IMO, if a writer writes enough to have filled up 
40% of all available memory with dirty pages, the disk deserves to spin up.

Find attached version 5 of the laptop-mode patch. It includes the 
following changes:

* Fix for supporting 64-bit sector_t (thanks hugang!)

* Simplified the design a bit, saved us a timer. It now only wakes up 
kupdate to write back stuff: kupdate disregards age while laptop_mode is 
active, so it writes back everything anyway.

* balance_dirty_pages does its job again in laptop mode. However, when 
it decides to write back pages, it now also calls disk_is_spun_up(), 
which makes sure that the remainder of the dirty pages are written 
immediately as well. That means that if a writer writes an amount of 
data that is about 50% of memory, and dirty_ratio is 40%, the disk 
will spin up after 4/5ths of the data is written, will write the full 
4/5ths of the data at once, and then the disk can spin down again 
because the remaining 1/5th (10% of memory) will not be written until 
the next spin-up.

* The control script now sets dirty_background_ratio to the same value 
as dirty_ratio, so that background writes are effectively disabled. This 
enables a writer to fill up to dirty_ratio (default 40%) of the 
memory with dirty blocks before the disk is spun up.

* Includes control script (scripts/laptop_mode).

* Includes docs (Documentation/laptop-mode.txt).



-- Bart

[-- Attachment #2: laptop-mode-2.6.0-5.patch --]
[-- Type: text/x-patch, Size: 11277 bytes --]

diff -Nbaur linux-2.6.0/Documentation/laptop-mode.txt linux-2.6.0-withlaptopmode/Documentation/laptop-mode.txt
--- linux-2.6.0/Documentation/laptop-mode.txt	1970-01-01 01:00:00.000000000 +0100
+++ linux-2.6.0-withlaptopmode/Documentation/laptop-mode.txt	2004-01-02 13:31:46.000000000 +0100
@@ -0,0 +1,69 @@
+How to conserve battery power using laptop-mode
+-----------------------------------------------
+
+Document Author: Bart Samwel (bart@samwel.tk)
+Date created: January 2, 2004
+
+Introduction
+------------
+
+Laptop mode is used to minimize the time that the hard disk needs to be spun up,
+to conserve battery power on laptops. It has been reported to cause significant
+power savings.
+
+The short story
+---------------
+
+If you just want to use it, run the laptop_mode control script as follows:
+
+# laptop_mode start
+
+and then remount all filesystems on your harddisk with a higher commit value,
+e.g.:
+
+mount /dev/hda2 / -t ext3 -o remount,defaults,noatime,commit=600
+
+Then set your harddisk spindown time to a relatively low value with hdparm:
+
+hdparm -S 4 /dev/hda
+
+The value -S 4 means 20 seconds idle time before spindown. Your harddisk will
+now only spin up when a disk cache miss occurs, or at least once every 10
+minutes to write back any pending changes.
+
+To stop laptop_mode, remount your filesystems with regular commit intervals
+(e.g., 5 seconds), and run "laptop_mode stop".
+
+CAVEAT: The downside of laptop mode is that you have a chance of losing up
+to 10 minutes of work. If you cannot afford this, don't use it!
+
+The details
+-----------
+
+Laptop-mode is controlled by the flag /proc/sys/vm/laptop_mode. When this
+flag is set, any physical disk read operation (that might have caused the
+hard disk to spin up) causes Linux to flush all dirty blocks. The result
+of this is that after a disk has spun down, it will not be spun up anymore
+to write dirty blocks, because those blocks had already been written
+immediately after the most recent read operation.
+
+To increase the effectiveness of the laptop_mode strategy, the laptop_mode
+control script increases dirty_expire_centisecs and dirty_writeback_centisecs in
+/proc/sys/vm to about 10 minutes (by default), which means that pages that are
+dirtied are not forced to be written to disk as often. The control script also changes the dirty background ratio, so that background writeback of dirty pages
+is not done anymore. Combined with a higher commit value (also 10 minutes) for
+ext3 or ReiserFS filesystems, this results in concentration of disk activity in
+a small time interval which occurs only once every 10 minutes, or whenever the
+disk is forced to spin up by a cache miss. The disk can then be spun down in the periods of inactivity.
+
+If you want to find out which process caused the disk to spin up, you can
+gather information by setting the flag /proc/sys/vm/block_dump. When this flag
+is set, Linux reports all disk read and write operations that take place, and
+all block dirtyings done to files. This makes it possible to debug why a disk
+needs to spin up, and to increase battery life even more.
+
+If 10 minutes is too much or too little downtime for you, you can configure
+this downtime as follows. In the control script, set the MAX_AGE value to the
+maximum number of seconds of disk downtime that you would like. You should
+then set your filesystem's commit interval to the same value. The dirty ratio
+is also configurable from the control script.
\ No newline at end of file
diff -Nbaur linux-2.6.0/drivers/block/ll_rw_blk.c linux-2.6.0-withlaptopmode/drivers/block/ll_rw_blk.c
--- linux-2.6.0/drivers/block/ll_rw_blk.c	2003-12-24 05:19:46.000000000 +0100
+++ linux-2.6.0-withlaptopmode/drivers/block/ll_rw_blk.c	2004-01-02 13:47:06.000000000 +0100
@@ -27,6 +27,7 @@
 #include <linux/completion.h>
 #include <linux/slab.h>
 #include <linux/swap.h>
+#include <linux/writeback.h>
 
 static void blk_unplug_work(void *data);
 static void blk_unplug_timeout(unsigned long data);
@@ -2307,6 +2308,15 @@
 		mod_page_state(pgpgout, count);
 	else
 		mod_page_state(pgpgin, count);
+
+	if (unlikely(block_dump)) {
+		char b[BDEVNAME_SIZE];
+		printk("%s(%d): %s block %Lu on %s\n",
+			current->comm, current->pid,
+			(rw & WRITE) ? "WRITE" : "READ",
+			(u64)bio->bi_sector, bdevname(bio->bi_bdev,b));
+	}
+
 	generic_make_request(bio);
 	return 1;
 }
@@ -2598,6 +2608,11 @@
 			disk_stat_add(disk, write_ticks, duration);
 			break;
 		    case READ:
+			/*
+			 * schedule the writeout of pending dirty data when the disk is idle
+			 */
+			if (unlikely(laptop_mode))
+				disk_is_spun_up();
 			disk_stat_inc(disk, reads);
 			disk_stat_add(disk, read_ticks, duration);
 			break;
diff -Nbaur linux-2.6.0/fs/buffer.c linux-2.6.0-withlaptopmode/fs/buffer.c
--- linux-2.6.0/fs/buffer.c	2003-12-24 05:19:46.000000000 +0100
+++ linux-2.6.0-withlaptopmode/fs/buffer.c	2003-12-24 15:46:30.000000000 +0100
@@ -855,10 +855,13 @@
 		struct buffer_head *bh = head;
 
 		do {
-			if (buffer_uptodate(bh))
+			if (buffer_uptodate(bh)) {
 				set_buffer_dirty(bh);
-			else
+				if (unlikely(block_dump))
+					printk("%s(%d): dirtied buffer\n", current->comm, current->pid);
+			} else {
 				buffer_error();
+			}
 			bh = bh->b_this_page;
 		} while (bh != head);
 	}
diff -Nbaur linux-2.6.0/include/linux/sysctl.h linux-2.6.0-withlaptopmode/include/linux/sysctl.h
--- linux-2.6.0/include/linux/sysctl.h	2003-12-24 05:19:46.000000000 +0100
+++ linux-2.6.0-withlaptopmode/include/linux/sysctl.h	2003-12-24 03:17:36.000000000 +0100
@@ -154,6 +154,8 @@
 	VM_SWAPPINESS=19,	/* Tendency to steal mapped memory */
 	VM_LOWER_ZONE_PROTECTION=20,/* Amount of protection of lower zones */
 	VM_MIN_FREE_KBYTES=21,	/* Minimum free kilobytes to maintain */
+	VM_LAPTOP_MODE=22,      /* vm laptop mode */
+	VM_BLOCK_DUMP=23,       /* block dump mode */
 };
 
 
diff -Nbaur linux-2.6.0/include/linux/writeback.h linux-2.6.0-withlaptopmode/include/linux/writeback.h
--- linux-2.6.0/include/linux/writeback.h	2003-12-24 05:19:46.000000000 +0100
+++ linux-2.6.0-withlaptopmode/include/linux/writeback.h	2004-01-02 13:07:50.000000000 +0100
@@ -71,12 +71,15 @@
  * mm/page-writeback.c
  */
 int wakeup_bdflush(long nr_pages);
+void disk_is_spun_up(void);
 
-/* These 5 are exported to sysctl. */
+/* These are exported to sysctl. */
 extern int dirty_background_ratio;
 extern int vm_dirty_ratio;
 extern int dirty_writeback_centisecs;
 extern int dirty_expire_centisecs;
+extern int block_dump;
+extern int laptop_mode;
 
 struct ctl_table;
 struct file;
diff -Nbaur linux-2.6.0/kernel/sysctl.c linux-2.6.0-withlaptopmode/kernel/sysctl.c
--- linux-2.6.0/kernel/sysctl.c	2003-12-24 05:19:46.000000000 +0100
+++ linux-2.6.0-withlaptopmode/kernel/sysctl.c	2003-12-24 06:24:53.000000000 +0100
@@ -700,6 +700,26 @@
 		.strategy	= &sysctl_intvec,
 		.extra1		= &zero,
 	},
+	{
+		.ctl_name	= VM_LAPTOP_MODE,
+		.procname	= "laptop_mode",
+		.data		= &laptop_mode,
+		.maxlen		= sizeof(laptop_mode),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+		.strategy	= &sysctl_intvec,
+		.extra1		= &zero,
+	},
+	{
+		.ctl_name	= VM_BLOCK_DUMP,
+		.procname	= "block_dump",
+		.data		= &block_dump,
+		.maxlen		= sizeof(block_dump),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+		.strategy	= &sysctl_intvec,
+		.extra1		= &zero,
+	},
 	{ .ctl_name = 0 }
 };
 
diff -Nbaur linux-2.6.0/mm/page-writeback.c linux-2.6.0-withlaptopmode/mm/page-writeback.c
--- linux-2.6.0/mm/page-writeback.c	2003-12-24 05:19:46.000000000 +0100
+++ linux-2.6.0-withlaptopmode/mm/page-writeback.c	2004-01-02 13:51:40.000000000 +0100
@@ -28,6 +28,7 @@
 #include <linux/smp.h>
 #include <linux/sysctl.h>
 #include <linux/cpu.h>
+#include <linux/quotaops.h>
 
 /*
  * The maximum number of pages to writeout in a single bdflush/kupdate
@@ -81,6 +82,16 @@
  */
 int dirty_expire_centisecs = 30 * 100;
 
+/*
+ * Flag that makes the machine dump writes/reads and block dirtyings.
+ */
+int block_dump;
+
+/*
+ * Flag that puts the machine in "laptop mode".
+ */
+int laptop_mode;
+
 /* End of sysctl-exported parameters */
 
 
@@ -180,6 +191,8 @@
 		 */
 		if (nr_reclaimable) {
 			writeback_inodes(&wbc);
+			if (unlikely(laptop_mode))
+				disk_is_spun_up(); /* Schedule full writeout. */
 			get_dirty_limits(&ps, &background_thresh,
 					&dirty_thresh);
 			nr_reclaimable = ps.nr_dirty + ps.nr_unstable;
@@ -327,6 +340,8 @@
 	oldest_jif = jiffies - (dirty_expire_centisecs * HZ) / 100;
 	start_jif = jiffies;
 	next_jif = start_jif + (dirty_writeback_centisecs * HZ) / 100;
+	if (laptop_mode)
+		wbc.older_than_this = NULL;
 	nr_to_write = ps.nr_dirty + ps.nr_unstable +
 			(inodes_stat.nr_inodes - inodes_stat.nr_unused);
 	while (nr_to_write > 0) {
@@ -343,6 +358,11 @@
 	}
 	if (time_before(next_jif, jiffies + HZ))
 		next_jif = jiffies + HZ;
+	if (laptop_mode) {
+		sync_inodes(0);
+		sync_filesystems(0);
+		DQUOT_SYNC(NULL);
+	}
 	if (dirty_writeback_centisecs)
 		mod_timer(&wb_timer, next_jif);
 }
@@ -363,6 +383,15 @@
 	return 0;
 }
 
+/*
+ * We've spun up the disk and we're in laptop mode: schedule writeback
+ * of all dirty data in 5 seconds.
+ */
+void disk_is_spun_up(void)
+{
+	mod_timer(&wb_timer, jiffies + 5 * HZ);
+}
+
 static void wb_timer_fn(unsigned long unused)
 {
 	if (pdflush_operation(wb_kupdate, 0) < 0)
@@ -525,6 +554,8 @@
 				__mark_inode_dirty(mapping->host,
 							I_DIRTY_PAGES);
 		}
+		if (unlikely(block_dump))
+			printk("%s(%d): dirtied page\n", current->comm, current->pid);
 	}
 	return ret;
 }
diff -Nbaur linux-2.6.0/scripts/laptop_mode linux-2.6.0-withlaptopmode/scripts/laptop_mode
--- linux-2.6.0/scripts/laptop_mode	1970-01-01 01:00:00.000000000 +0100
+++ linux-2.6.0-withlaptopmode/scripts/laptop_mode	2004-01-02 13:10:55.000000000 +0100
@@ -0,0 +1,57 @@
+#!/bin/sh
+#
+# start or stop laptop mode, best run by a power management daemon when
+# ac gets connected/disconnected from a laptop
+
+# age time, in seconds. should be put into a sysconfig file
+MAX_AGE=600
+
+# Allowed dirty ratio, in pct. should be put into a sysconfig file as well.
+DIRTY_RATIO=40
+
+# kernel default dirty buffer age
+DEF_AGE=30
+DEF_UPDATE=5
+DEF_DIRTY_BACKGROUND_RATIO=10
+DEF_DIRTY_RATIO=40
+
+
+if [ ! -e /proc/sys/vm/laptop_mode ]; then
+	echo "Kernel is not patched with laptop_mode patch."
+	exit 1
+fi
+
+if [ ! -w /proc/sys/vm/laptop_mode ]; then
+	echo "You do not have enough privileges to enable laptop mode."
+	exit 1
+fi
+
+case "$1" in
+	start)
+		AGE=$((100*$MAX_AGE))
+		echo -n "Starting laptop mode"
+		echo "1" > /proc/sys/vm/laptop_mode
+		echo "$AGE" > /proc/sys/vm/dirty_expire_centisecs
+		echo "$AGE" > /proc/sys/vm/dirty_writeback_centisecs
+		echo "$DIRTY_RATIO" > /proc/sys/vm/dirty_ratio
+		echo "$DIRTY_RATIO" > /proc/sys/vm/dirty_background_ratio
+		echo "."
+		;;
+	stop)
+		U_AGE=$((100*$DEF_UPDATE))
+		B_AGE=$((100*$DEF_AGE))
+		echo -n "Stopping laptop mode"
+		echo "0" > /proc/sys/vm/laptop_mode
+		echo "$U_AGE" > /proc/sys/vm/dirty_writeback_centisecs
+		echo "$B_AGE" > /proc/sys/vm/dirty_expire_centisecs
+		echo "$DEF_DIRTY_RATIO" > /proc/sys/vm/dirty_ratio
+		echo "$DEF_DIRTY_BACKGROUND_RATIO" > /proc/sys/vm/dirty_background_ratio
+		echo "."
+		;;
+	*)
+		echo "$0 {start|stop}"
+		;;
+
+esac
+
+exit 0

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] laptop-mode-2.6.0, version 5
  2004-01-02 13:01 ` [PATCH] laptop-mode-2.6.0, version 5 Bart Samwel
@ 2004-01-02 21:05   ` Andrew Morton
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2004-01-02 21:05 UTC (permalink / raw)
  To: Bart Samwel; +Cc: smackinlay, linux-kernel, axboe

Bart Samwel <bart@samwel.tk> wrote:
>
> Find attached version 5 of the laptop-mode patch. It includes the 
> following changes:
> 
> * Fix for supporting 64-bit sector_t (thanks hugang!)

%Lu expects an unsigned long long argument and nothing else, so we should
cast this sector_t to unsigned long long, not u64.

With u64 this code will generate a warning on ppc64 (at least), because
ppc64's u64 is unsigned long.

> * Simplified the design a bit, saved us a timer. It now only wakes up 
> kupdate to write back stuff: kupdate disregards age while laptop_mode is 
> active, so it writes back everything anyway.
> 
> * balance_dirty_pages does its job again in laptop mode. However, when 
> it decides to write back pages, it now also calls disk_is_spun_up(), 
> which makes sure that the remainder of the dirty pages are written 
> immediately as well. That means that if a writer writes an amount of 
> data that is about 50% of memory, and dirty_ratio is 40%, the disk 
> will spin up after 4/5ths of the data is written, will write the full 
> 4/5ths of the data at once, and then the disk can spin down again 
> because the remaining 1/5th (10% of memory) will not be written until 
> the next spin-up.

hum, OK, I'll take a look at that, thanks.

> * The control script now sets dirty_background_ratio to the same value 
> as dirty_ratio, so that background writes are effectively disabled. This 
> enables a writer to fill up to dirty_ratio (default 40%) of the 
> memory with dirty blocks before the disk is spun up.
> 
> * Includes control script (scripts/laptop_mode).
>
> * Includes docs (Documentation/laptop-mode.txt).

Can you please place the control script inside laptop-mode.txt rather than
in scripts/?  The scripts directory is for kernel build tools, not for
kernel runtime support things.

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] laptop-mode-2.6.0 version 5
  2004-01-04  9:52                   ` Peter Chubb
@ 2004-01-04 10:31                     ` Jens Axboe
  2004-03-29 15:42                     ` Jens Axboe
  1 sibling, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2004-01-04 10:31 UTC (permalink / raw)
  To: Peter Chubb
  Cc: Hugang, Bart Samwel, Andrew Morton, smackinlay, Bartek Kania,
	linux-kernel

On Sun, Jan 04 2004, Peter Chubb wrote:
> >>>>> "Jens" == Jens Axboe <axboe@suse.de> writes:
> 
> Jens> The dump printk() needs to be changed anyways, the rw
> Jens> deciphering is not correct. Something like this is more
> Jens> appropriate:
> 
> Jens>	if (unlikely(block_dump)) {
> Jens>		char b[BDEVNAME_SIZE];
> Jens>		printk("%s(%d): %s block %Lu on %s\n", current->comm,
> Jens>			current->pid, (rw & WRITE) ? "WRITE" : "READ",
> Jens>			(u64) bio->bi_sector, bdevname(bio->bi_bdev, b));
> Jens>	}
> 
> Please cast to (unsigned long long) not (u64) because on 64-bit
> architectures u64 is unsigned long, and you'll get a compiler warning.

Yeah Andrew noted the same thing, my bad.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] laptop-mode-2.6.0 version 5
  2004-01-02 12:03                 ` Jens Axboe
@ 2004-01-04  9:52                   ` Peter Chubb
  2004-01-04 10:31                     ` Jens Axboe
  2004-03-29 15:42                     ` Jens Axboe
  2004-03-29 15:42                   ` Peter Chubb
  1 sibling, 2 replies; 10+ messages in thread
From: Peter Chubb @ 2004-01-04  9:52 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Hugang, Bart Samwel, Andrew Morton, smackinlay, Bartek Kania,
	linux-kernel

>>>>> "Jens" == Jens Axboe <axboe@suse.de> writes:

Jens> The dump printk() needs to be changed anyways, the rw
Jens> deciphering is not correct. Something like this is more
Jens> appropriate:

Jens>	if (unlikely(block_dump)) {
Jens>		char b[BDEVNAME_SIZE];
Jens>		printk("%s(%d): %s block %Lu on %s\n", current->comm,
Jens>			current->pid, (rw & WRITE) ? "WRITE" : "READ",
Jens>			(u64) bio->bi_sector, bdevname(bio->bi_bdev, b));
Jens>	}

Please cast to (unsigned long long) not (u64) because on 64-bit
architectures u64 is unsigned long, and you'll get a compiler warning.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] laptop-mode-2.6.0 version 5
  2004-01-02 11:38               ` Hugang
@ 2004-01-02 12:03                 ` Jens Axboe
  2004-01-04  9:52                   ` Peter Chubb
  2004-03-29 15:42                   ` Peter Chubb
  0 siblings, 2 replies; 10+ messages in thread
From: Jens Axboe @ 2004-01-02 12:03 UTC (permalink / raw)
  To: Hugang; +Cc: Bart Samwel, Andrew Morton, smackinlay, Bartek Kania, linux-kernel

On Fri, Jan 02 2004, Hugang wrote:
> On Fri, 2 Jan 2004 12:27:33 +0100
> Jens Axboe <axboe@suse.de> wrote:
> 
> > I dunno, I can't possibly tell since you haven't given any info about
> > this crash. Where does it crash, do you have an oops? All I could say
> > from your report + patch is that it wasn't valid. There's just no way
> > for current->comm to be NULL, so your patch couldn't possibly have made
> > a difference.
> 
> The attached file is the dmesg from the crash. When I disable CONFIG_LBD,
> the problem no longer occurs.

Ah there you go, then it's just the missing cast to u64. It has nothing
to do with current->comm at all. The compiler should have warned you
about this error, did it not?

The dump printk() needs to be changed anyways, the rw deciphering is not
correct. Something like this is more appropriate:

	if (unlikely(block_dump)) {
		char b[BDEVNAME_SIZE];

		printk("%s(%d): %s block %Lu on %s\n", current->comm,
			current->pid, (rw & WRITE) ? "WRITE" : "READ",
			(u64) bio->bi_sector, bdevname(bio->bi_bdev, b));
	}

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] laptop-mode-2.6.0 version 5
  2004-01-02 11:27             ` Jens Axboe
@ 2004-01-02 11:38               ` Hugang
  2004-01-02 12:03                 ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Hugang @ 2004-01-02 11:38 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Bart Samwel, Andrew Morton, smackinlay, Bartek Kania, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 610 bytes --]

On Fri, 2 Jan 2004 12:27:33 +0100
Jens Axboe <axboe@suse.de> wrote:

> I dunno, I can't possibly tell since you haven't given any info about
> this crash. Where does it crash, do you have an oops? All I could say
> from your report + patch is that it wasn't valid. There's just no way
> for current->comm to be NULL, so your patch couldn't possibly have made
> a difference.

The attached file is the dmesg from the crash. When I disable CONFIG_LBD,
the problem no longer occurs.

thanks.

-- 
Hu Gang / Steve
RLU#          : 204016 [1999] (Registered Linux user)
GPG Public Key: http://soulinfo.com/~hugang/HuGang.asc

[-- Attachment #2: crash.log --]
[-- Type: application/octet-stream, Size: 8906 bytes --]

Total memory = 256MB; using 512kB for hash table (at c0280000)
Linux version 2.6.0-test11-wli-3 (hugang@hugang.soulinfo.com) (gcc version 2.95.4 20011002 (Debian prerelease)) #13 Fri Jan 2 19:24:29 CST 2004
Found Uninorth memory controller & host bridge, revision: 210
Mapped at 0xfdf3c000
Found a Intrepid mac-io controller, rev: 0, mapped at 0xfdebc000
Processor NAP mode on idle enabled.
PowerMac motherboard: PowerBook G4 12"
Found UniNorth PCI host bridge at 0xf0000000. Firmware bus number: 0->0
Found UniNorth PCI host bridge at 0xf2000000. Firmware bus number: 0->0
Found UniNorth PCI host bridge at 0xf4000000. Firmware bus number: 0->0
PMU driver 2 initialized for Core99, firmware: 0c
On node 0 totalpages: 65536
  DMA zone: 65536 pages, LIFO batch:256
  Normal zone: 0 pages, LIFO batch:1
  HighMem zone: 0 pages, LIFO batch:1
Building zonelist for node : 0
Kernel command line: root=/dev/hda13 ro 
PowerMac using OpenPIC irq controller
OpenPIC Version 1.2 (4 CPUs and 64 IRQ sources) at fc66e000
OpenPIC timer frequency is 4.166666 MHz
PID hash table entries: 2048 (order 11: 16384 bytes)
GMT Delta read from XPRAM: 0 minutes, DST: off
time_init: decrementer frequency = 33.280357 MHz
Console: colour dummy device 80x25
Memory: 250880k available (1472k kernel code, 928k data, 144k init, 0k highmem)
AGP special page: 0xcffff000
Calibrating delay loop... 530.84 BogoMIPS
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
POSIX conformance testing by UNIFIX
NET: Registered protocol family 16
PCI: Probing PCI hardware
Registering openpic with sysfs...
Using unsupported 1024x768 NVDA,Display-A at 94004000, depth=8, pitch=1024
fb0: Open Firmware frame buffer device on /pci@f0000000/NVDA,Parent@10/NVDA,Display-A@0
ikconfig 0.7 with /proc/config*
Initializing Cryptographic API
Console: switching to colour frame buffer device 128x48
pty: 256 Unix98 ptys configured
Generic RTC Driver v1.07
MacIO PCI driver attached to Intrepid chipset
PowerMac Z8530 serial driver version 2.0
tty00 at 0xd10c1020 (irq = 22) is a Z8530 ESCC
tty01 at 0xd10c8000 (irq = 23) is a Z8530 ESCC
Macintosh non-volatile memory driver v1.0
input: Macintosh mouse button emulation
apm_emu: APM Emulation 0.5 initialized.
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PCI: Enabling device 0002:02:0d.0 (0000 -> 0002)
ide0: Found Apple UniNorth ATA-6 controller, bus ID 3
Probing IDE interface ide0...
hda: TOSHIBA MK4021GAS, ATA DISK drive
hda: Enabling Ultra DMA 5
Using anticipatory io scheduler
ide0 at 0xd10d0000-0xd10d0007,0xd10d0160 on irq 39
ide1: Found Apple KeyLargo ATA-3 controller, bus ID 0
Probing IDE interface ide1...
hdc: MATSHITADVD-R UJ-815, ATAPI CD/DVD-ROM drive
hdc: MDMA, cycleTime: 120, accessTime: 90, recTime: 30
hdc: Set MDMA timing for mode 2, reg: 0x00011d26
hdc: Enabling MultiWord DMA 2
ide1 at 0xd10d3000-0xd10d3007,0xd10d3160 on irq 24
hda: max request size: 128KiB
adb: starting probe task...
adb devices: [2]: 2 c3 [3]: 3 1 [7]: 7 1f
ADB keyboard at 2, handler 1
Detected ADB keyboard, type ANSI.
input: ADB keyboard on adb2:2.c3/input
input: ADB Powerbook buttons on adb7:7.1f/input
ADB mouse at 3, handler set to 4 (trackpad)
input: ADB mouse on adb3:3.01/input
adb: finished probe task...
hda: 78140160 sectors (40007 MB), CHS=65535/16/63, UDMA(100)
 hda: [mac] hda1 hda2 hda3 hda4 hda5 hda6 hda7 hda8 hda9 hda10 hda11 hda12 hda13 hda14
Console: switching to colour frame buffer device 128x48
mice: PS/2 mouse device common for all mice
NET: Registered protocol family 2
IP: routing cache hash table of 2048 buckets, 16Kbytes
TCP: Hash tables configured (established 16384 bind 16384)
NET: Registered protocol family 1
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 144k init 4k chrp 8k prep
Adding 524280k swap on /dev/hda10.  Priority:-1 extents:1
EXT3 FS on hda13, internal journal
sungem.c:v0.98 8/24/03 David S. Miller (davem@redhat.com)
eth0: Sun GEM (PCI) 10/100/1000BaseT Ethernet 00:0a:95:7f:e0:de 
eth0: Found BCM5221 PHY
drivers/usb/core/usb.c: registered new driver usbfs
drivers/usb/core/usb.c: registered new driver hub
drivers/usb/core/usb.c: registered new driver hid
drivers/usb/input/hid-core.c: v2.0:USB HID core driver
ohci1394: $Rev: 1045 $ Ben Collins <bcollins@debian.org>
PCI: Enabling device 0002:02:0e.0 (0000 -> 0002)
ohci1394_0: Unexpected PCI resource length of 1000!
ohci1394_0: OHCI-1394 1.1 (PCI): IRQ=[40]  MMIO=[f5000000-f50007ff]  Max Packet=[2048]
Found KeyWest i2c on "mac-io", 1 channel, stepping: 4 bits
Found KeyWest i2c on "uni-n", 2 channels, stepping: 4 bits
ieee1394: Host added: ID:BUS[0-00:1023]  GUID[000a95fffe7fe0de]
registering 0-0035
Linux agpgart interface v0.100 (c) Dave Jones
agpgart: Detected Apple UniNorth 2 chipset
agpgart: Maximum main memory to use for agp memory: 204M
agpgart: configuring for size idx: 4
agpgart: AGP aperture is 16M @ 0x0
ohci_hcd: 2003 Oct 13 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ohci_hcd: block sizes: ed 64 td 64
PCI: Enabling device 0001:01:18.0 (0000 -> 0002)
ohci_hcd 0001:01:18.0: OHCI Host Controller
ohci_hcd 0001:01:18.0: irq 27, pci mem d139a000
ohci_hcd 0001:01:18.0: new USB bus registered, assigned bus number 1
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
PCI: Enabling device 0001:01:19.0 (0000 -> 0002)
ohci_hcd 0001:01:19.0: OHCI Host Controller
ohci_hcd 0001:01:19.0: irq 28, pci mem d13a2000
ohci_hcd 0001:01:19.0: new USB bus registered, assigned bus number 2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
PCI: Enabling device 0001:01:1a.0 (0000 -> 0002)
ohci_hcd 0001:01:1a.0: OHCI Host Controller
ohci_hcd 0001:01:1a.0: irq 29, pci mem d13a4000
ohci_hcd 0001:01:1a.0: new USB bus registered, assigned bus number 3
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
hub 1-0:1.0: new USB device on port 2, assigned address 2
device-mapper: 4.0.0-ioctl (2003-06-04) initialised: dm@uk.sistina.com
found reiserfs format "3.6" with standard journal
Reiserfs journal params: device dm-3, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 600, max trans age 30
reiserfs: checking transaction log (dm-3) for (dm-3)
Using r5 hash to sort names
found reiserfs format "3.6" with standard journal
Reiserfs journal params: device dm-0, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 600, max trans age 30
reiserfs: checking transaction log (dm-0) for (dm-0)
Using r5 hash to sort names
found reiserfs format "3.6" with standard journal
Reiserfs journal params: device dm-4, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 600, max trans age 30
reiserfs: checking transaction log (dm-4) for (dm-4)
Using r5 hash to sort names
found reiserfs format "3.6" with standard journal
Reiserfs journal params: device dm-5, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 600, max trans age 30
reiserfs: checking transaction log (dm-5) for (dm-5)
Using r5 hash to sort names
eth0: Link is up at 100 Mbps, full-duplex.
eth0: Pause is disabled
NET: Registered protocol family 17
process `named' is using obsolete setsockopt SO_BSDCOMPAT
process `lwresd' is using obsolete setsockopt SO_BSDCOMPAT
process `named' is using obsolete setsockopt SO_BSDCOMPAT
process `lwresd' is using obsolete setsockopt SO_BSDCOMPAT
nfs warning: mount version older than kernel
hda: Set PIO timing for mode 4, reg: 0x00758000
Oops: kernel access of bad area, sig: 11 [#1]
NIP: C00AB674 LR: C00AC2A0 SP: C0729A10 REGS: c0729960 TRAP: 0301    Not tainted
MSR: 00001032 EE: 0 PR: 0 FP: 0 ME: 1 IR/DR: 11
DAR: 0003E740, DSISR: 40000000
TASK = cfd810c0[378] 'bash' Last syscall: 195 
GPR00: C0729A58 C0729A10 CFD810C0 0003E740 FFFFFFFE C0194C30 00000000 0000000A 
GPR08: FFFFFFFF 0003E740 C00AC254 00000006 22028024 
Call trace:
 [c0018d50] printk+0x98/0x1a4
 [c00d2c50] submit_bio+0x110/0x130
 [c005ac38] submit_bh+0x1b4/0x1d0
 [c00586d8] __bread_slow+0xa0/0xfc
 [c0058a04] __bread+0x20/0x30
 [d14da4fc] search_by_key+0xa8/0x1e0c [reiserfs]
 [d14c26d0] reiserfs_read_locked_inode+0x68/0x110 [reiserfs]
 [d14c2820] reiserfs_iget+0x68/0xb8 [reiserfs]
 [d14bd61c] reiserfs_lookup+0x88/0x100 [reiserfs]
 [c0063da0] real_lookup+0x8c/0x14c
 [c0064110] do_lookup+0x68/0xd0
 [c00643f0] link_path_walk+0x278/0x908
 [c0064ddc] path_lookup+0x17c/0x190
 [c0064fd4] __user_walk+0x44/0x70
 [c005fb5c] vfs_stat+0x20/0x5c
init(1): dirtied page
init(1): dirtied page

* Re: [PATCH] laptop-mode-2.6.0 version 5
  2004-01-02  9:02           ` [PATCH] laptop-mode-2.6.0 " Hugang
@ 2004-01-02 11:27             ` Jens Axboe
  2004-01-02 11:38               ` Hugang
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2004-01-02 11:27 UTC (permalink / raw)
  To: Hugang; +Cc: Bart Samwel, Andrew Morton, smackinlay, Bartek Kania, linux-kernel

On Fri, Jan 02 2004, Hugang wrote:
> On Thu, 1 Jan 2004 19:35:45 +0100
> Jens Axboe <axboe@suse.de> wrote:
> 
> > Patch is obviously bogus, just look at the comm definition in sched.h:
> > 
> > 	char comm[16];
> > 
> > IO submission must happen in process context, so we also know that
> > current is valid.
> 
> You are right. The reason I added that patch is that my laptop did not
> crash with it when I enabled block dump, so I tried to find where the
> bug really is. In the end, the bug

I dunno, I can't possibly tell since you haven't given any info about
this crash. Where does it crash, do you have an oops? All I could say
from your report + patch is that it wasn't valid. There's just no way
for current->comm to be NULL, so your patch couldn't possibly have made
a difference.
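
The point about comm can be illustrated outside the kernel. The sketch
below is a userspace stand-in, not kernel code: struct task_sketch,
set_comm() and task_comm() are hypothetical names, and only the
char comm[16] member is taken from the message above. Because comm is an
array embedded in the struct, t->comm is the struct's address plus a
fixed offset, so it cannot be NULL while the task pointer itself is
valid.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's task structure.
 * Only the comm[16] member mirrors the definition quoted above. */
struct task_sketch {
    int pid;
    char comm[16];  /* storage lives inside the struct itself */
};

/* Copy a name into comm, truncating to 15 characters plus the NUL. */
void set_comm(struct task_sketch *t, const char *name)
{
    assert(t != NULL && name != NULL);
    strncpy(t->comm, name, sizeof t->comm - 1);
    t->comm[sizeof t->comm - 1] = '\0';
}

/* Return the name. The array decays to a pointer into *t, so the
 * result is non-NULL whenever t is non-NULL. */
const char *task_comm(const struct task_sketch *t)
{
    assert(t != NULL);
    return t->comm;
}
```

A NULL check on t->comm therefore tests nothing; the only pointer that
could be invalid is the task pointer, and IO submission in process
context rules that out.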

> is in sector_t: I had enabled CONFIG_LBD, so sector_t is u64, and the
> code has to change when CONFIG_LBD is enabled.
> 
> I like the 2.4 style, so I added the count to the printk.
> 
> Here is the patch that fixes it:
> +
> +   if (unlikely(block_dump)) {
> +       char b[BDEVNAME_SIZE];
> +       printk("%s(%d): %s block %llu/%u on %s\n",
> +           current->comm, current->pid,
> +           (rw & WRITE) ? "WRITE" : (rw == READA ? "READA" : "READ"),
> +           (u64)bio->bi_sector, count, bdevname(bio->bi_bdev,b));
> +   }

It's best to keep the line as minimal as possible, count isn't really
very interesting. What is interesting is process, offset (for finding
the file, if you need to), and data direction.
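
As a rough illustration of such a minimal line, here is a userspace
sketch that formats only process, offset and direction (plus the device
name). It is not the kernel code: format_block_dump() and enum
sketch_dir are hypothetical names, and snprintf() stands in for
printk().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical direction codes for this sketch; these are not the
 * kernel's READ/WRITE/READA values. */
enum sketch_dir { SKETCH_READ, SKETCH_WRITE };

/* Format one block-dump line: process name and pid, data direction,
 * sector offset and device name. The sector is taken as
 * unsigned long long so that "%llu" matches on every target.
 * Returns what snprintf() returns. */
int format_block_dump(char *buf, size_t len, const char *comm, int pid,
                      enum sketch_dir dir, unsigned long long sector,
                      const char *dev)
{
    assert(buf != NULL && comm != NULL && dev != NULL);
    return snprintf(buf, len, "%s(%d): %s block %llu on %s",
                    comm, pid,
                    dir == SKETCH_WRITE ? "WRITE" : "READ",
                    sector, dev);
}
```

Called with "bash", 378, SKETCH_READ, sector 5000000000 and "hda", this
yields "bash(378): READ block 5000000000 on hda".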

> I think 2.4.23 has the same bug. Here is the patch for it; I hope it helps.
> Index: linux-2.4.23/drivers/block/ll_rw_blk.c
> ===================================================================
> --- linux-2.4.23/drivers/block/ll_rw_blk.c      (revision 4)
> +++ linux-2.4.23/drivers/block/ll_rw_blk.c      (working copy)
> @@ -1298,7 +1298,7 @@
>                 wake_up(&bh->b_wait);
>  
>         if (block_dump)
> -               printk(KERN_DEBUG "%s: %s block %lu/%u on %s\n", current->comm, rw == WRITE ? "WRITE" : "READ", bh->b_rsector, count, kdevname(bh->b_rdev));
> +               printk(KERN_DEBUG "%s: %s block %llu/%u on %s\n", current->comm, rw == WRITE ? "WRITE" : "READ", (u64)bh->b_rsector, count, kdevname(bh->b_rdev));

2.4 stock doesn't have 64-bit sectors, please consult (again) the
canonical source (include file). There's no need to cast.

-- 
Jens Axboe


* Re: [PATCH] laptop-mode-2.6.0 version 5
       [not found]         ` <20040101183545.GD5523@suse.de>
@ 2004-01-02  9:02           ` Hugang
  2004-01-02 11:27             ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Hugang @ 2004-01-02  9:02 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Bart Samwel, Andrew Morton, smackinlay, Bartek Kania, linux-kernel

On Thu, 1 Jan 2004 19:35:45 +0100
Jens Axboe <axboe@suse.de> wrote:

> Patch is obviously bogus, just look at the comm definition in sched.h:
> 
> 	char comm[16];
> 
> IO submission must happen in process context, so we also know that
> current is valid.

You are right. The reason I added that patch is that my laptop did not
crash with it when I enabled block dump, so I tried to find where the
bug really is. In the end, the bug is in sector_t: I had enabled
CONFIG_LBD, so sector_t is u64, and the code has to change when
CONFIG_LBD is enabled.

I like the 2.4 style, so I added the count to the printk.

Here is the patch that fixes it:
+
+   if (unlikely(block_dump)) {
+       char b[BDEVNAME_SIZE];
+       printk("%s(%d): %s block %llu/%u on %s\n",
+           current->comm, current->pid,
+           (rw & WRITE) ? "WRITE" : (rw == READA ? "READA" : "READ"),
+           (u64)bio->bi_sector, count, bdevname(bio->bi_bdev,b));
+   }
+
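
The width problem can be sketched in userspace as follows. This is an
illustration under stated assumptions, not the kernel's definition:
SKETCH_LBD is a hypothetical stand-in for CONFIG_LBD, and
sketch_sector_t and format_sector() are made-up names. On a 32-bit
target, passing a 64-bit value where the format string says %lu makes
the formatter read the variadic arguments at the wrong width, so a later
%s can end up treating part of the sector number as a pointer. That
would be consistent with an oops inside printk, though the thread does
not confirm the exact mechanism. Casting to unsigned long long and
printing with %llu is correct for either width.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* SKETCH_LBD is a hypothetical stand-in for CONFIG_LBD: when defined,
 * sectors are 64 bits wide even on a 32-bit target. */
#ifdef SKETCH_LBD
typedef unsigned long long sketch_sector_t;
#else
typedef unsigned long sketch_sector_t;
#endif

/* Print a sector number without caring which typedef is active.
 * The cast widens the value to unsigned long long, so the "%llu"
 * conversion always matches the argument actually passed. */
int format_sector(char *buf, size_t len, sketch_sector_t sector)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%llu", (unsigned long long)sector);
}
```

Building with -DSKETCH_LBD switches the typedef to 64 bits; the call
site does not change.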


I think 2.4.23 has the same bug. Here is the patch for it; I hope it helps.
Index: linux-2.4.23/drivers/block/ll_rw_blk.c
===================================================================
--- linux-2.4.23/drivers/block/ll_rw_blk.c      (revision 4)
+++ linux-2.4.23/drivers/block/ll_rw_blk.c      (working copy)
@@ -1298,7 +1298,7 @@
                wake_up(&bh->b_wait);
 
        if (block_dump)
-               printk(KERN_DEBUG "%s: %s block %lu/%u on %s\n", current->comm, rw == WRITE ? "WRITE" : "READ", bh->b_rsector, count, kdevname(bh->b_rdev));
+               printk(KERN_DEBUG "%s: %s block %llu/%u on %s\n", current->comm, rw == WRITE ? "WRITE" : "READ", (u64)bh->b_rsector, count, kdevname(bh->b_rdev));
 
        put_bh(bh);
        switch (rw) {

-- 
Hu Gang / Steve
RLU#          : 204016 [1999] (Registered Linux user)
GPG Public Key: http://soulinfo.com/~hugang/HuGang.asc

Thread overview: 10+ messages
     [not found] <20040102025509.91753.qmail@mail.com>
2004-01-02 13:01 ` [PATCH] laptop-mode-2.6.0, version 5 Bart Samwel
2004-01-02 21:05   ` Andrew Morton
     [not found] <20031231210756.315.qmail@mail.com>
     [not found] ` <3FF3887C.90404@samwel.tk>
     [not found]   ` <20031231184830.1168b8ff.akpm@osdl.org>
     [not found]     ` <3FF43BAF.7040704@samwel.tk>
     [not found]       ` <3FF457C0.2040303@samwel.tk>
     [not found]         ` <20040101183545.GD5523@suse.de>
2004-01-02  9:02           ` [PATCH] laptop-mode-2.6.0 " Hugang
2004-01-02 11:27             ` Jens Axboe
2004-01-02 11:38               ` Hugang
2004-01-02 12:03                 ` Jens Axboe
2004-01-04  9:52                   ` Peter Chubb
2004-01-04 10:31                     ` Jens Axboe
2004-03-29 15:42                     ` Jens Axboe
2004-03-29 15:42                   ` Peter Chubb
