linux-kernel.vger.kernel.org archive mirror
* Re: Suspend 2 merge
@ 2004-11-27 22:07 linux
  2004-11-27 22:25 ` Pavel Machek
  2004-11-28  8:29 ` Wichert Akkerman
  0 siblings, 2 replies; 51+ messages in thread
From: linux @ 2004-11-27 22:07 UTC (permalink / raw)
  To: linux-kernel; +Cc: pavel

> My machine suspends in 7 seconds, and that's swsusp1. According to
> your numbers, suspend2 should suspend it in 1 second and LZE
> compressed should be .5 second.
> 
> I'd say "who cares". 7 seconds seems like fast enough for me.  And I'm
> *not* going to add 2000 lines of code for 500msec speedup during
> suspend.

Lucky you.  My machine takes minutes.
(To be precise, it prints about a line and a half of dots in the
count_data_pages() loop, and often takes 2 seconds per dot.)

AMD Athlon XP, 1066 MHz, 768M RAM, VIA KT133 chipset.
Stock 2.6.10-rc1.

I could really use a speedup.


Remember, Linux is the aggregate of a lot of people scratching their
itches.  It's okay to criticize *how* people go about addressing
what's annoying them, since that has a long-term maintenance effect,
if nothing else.  But complaining that it doesn't annoy *you* isn't the
most valid argument.

That's what's fundamentally wrong with people complaining about
wanting to "stabilize" 2.6.x.  Stability is in the eye of the beholder.
Unless you want no changes at all (and you can get that easily enough),
what it means is that the bugs that particularly annoy you get fixed.

But the point is, every bug fixed particularly annoys *someone*;
that's why it's getting fixed.

* Re: Suspend 2 merge
  2004-11-27 22:07 Suspend 2 merge linux
@ 2004-11-27 22:25 ` Pavel Machek
  2004-11-28  8:29 ` Wichert Akkerman
  1 sibling, 0 replies; 51+ messages in thread
From: Pavel Machek @ 2004-11-27 22:25 UTC (permalink / raw)
  To: linux; +Cc: linux-kernel

Hi!

> > My machine suspends in 7 seconds, and that's swsusp1. According to
> > your numbers, suspend2 should suspend it in 1 second and LZE
> > compressed should be .5 second.
> > 
> > I'd say "who cares". 7 seconds seems like fast enough for me.  And I'm
> > *not* going to add 2000 lines of code for 500msec speedup during
> > suspend.
> 
> Lucky you.  My machine takes minutes.
> (To be precise, it prints about a line and a half of dots in the
> count_data_pages() loop, and often takes 2 seconds per dot.)
> 
> AMD Athlon XP, 1066 MHz, 768M RAM, VIA KT133 chipset.
> Stock 2.6.10-rc1.
> 
> I could really use a speedup.

Yep, that's an O(n^2) algorithm slowing it down. I have a fix for it, but
2.6.10 is now too frozen for a performance fix to go in. See the "bigdiff" I
sent to hugang, or wait a few minutes and you'll get a really ugly diff in
private email that should solve it, too.

[I'll be glad if you report results. It should bring count_data_pages
under 1 second.]

> if nothing else.  But complaining that it doesn't annoy *you* isn't the
> most valid argument.

Ok, it is a matter of scale. A half-second speedup is not enough to justify
a new compression algorithm in the kernel.
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

* Re: Suspend 2 merge
  2004-11-27 22:07 Suspend 2 merge linux
  2004-11-27 22:25 ` Pavel Machek
@ 2004-11-28  8:29 ` Wichert Akkerman
  2004-11-28 11:37   ` Pavel Machek
  1 sibling, 1 reply; 51+ messages in thread
From: Wichert Akkerman @ 2004-11-28  8:29 UTC (permalink / raw)
  To: linux-kernel

Previously linux@horizon.com wrote:
> Lucky you.  My machine takes minutes.
> (To be precise, it prints about a line and a half of dots in the
> count_data_pages() loop, and often takes 2 seconds per dot.)

It also seems to vary wildly. Most of the time it goes pretty fast for
me (under one minute) but occasionally it will take well over 10 minutes.
Never managed to time it exactly since my battery tends to run out in
the middle of a suspend when that happens.

Wichert.

-- 
Wichert Akkerman <wichert@wiggy.net>    It is simple to make things.
http://www.wiggy.net/                   It is hard to make things simple.

* Re: Suspend 2 merge
  2004-11-28  8:29 ` Wichert Akkerman
@ 2004-11-28 11:37   ` Pavel Machek
  2004-11-28 16:23     ` software suspend patch [1/6] hugang
  0 siblings, 1 reply; 51+ messages in thread
From: Pavel Machek @ 2004-11-28 11:37 UTC (permalink / raw)
  To: linux-kernel

Hi!

> > Lucky you.  My machine takes minutes.
> > (To be precise, it prints about a line and a half of dots in the
> > count_data_pages() loop, and often takes 2 seconds per dot.)
> 
> It also seems to vary wildly. Most of the time it goes pretty fast for
> me (under one minute) but occasionally it will take well over 10 minutes.
> Never managed to time it exactly since my battery tends to run out in
> the middle of a suspend when that happens.

It depends on memory fragmentation; after updatedb it tends to be slow.
Patch exists, see archives.
				Pavel
-- 
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms         


* software suspend patch [1/6]
  2004-11-28 11:37   ` Pavel Machek
@ 2004-11-28 16:23     ` hugang
  2004-11-28 16:24       ` software suspend patch [2/6] hugang
                         ` (5 more replies)
  0 siblings, 6 replies; 51+ messages in thread
From: hugang @ 2004-11-28 16:23 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

Hi Pavel Machek, Nigel Cunningham:

 device-tree.diff
   Based on suspend2, with a few small changes.

 core.diff
  1: redefine struct pbe so that the pagedir need not be contiguous.
  2: shrink memory as little as possible.
  3: use a bitmap to speed up the collision check when relocating pages.
  4: pagecache saving ready.

 i386.diff
 ppc.diff
  i386 and powerpc suspend updates.

 pagecachs_addon.diff
  Required if pagecache saving is enabled; it makes saving the
  pagecache safe. Idea from suspend2.

  ppcfix.diff
  Fixes a compile error.
  $ gcc -v
   .... 
   gcc version 2.95.4 20011002 (Debian prerelease)

I'm using 2.6.9-ck3. With the above patches, swsusp1 works perfectly on my
PowerPC and x86 PC with the highmem and preempt options enabled.

I hope core.diff@1,@2,@3, i386.diff and ppc.diff will be merged into the
mainline kernel ASAP :). From my point of view, device-tree.diff is
very useful when using pagecache saving, and pagecachs_addon.diff
is really a hack to make pagecache saving safe.
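
To illustrate item 3 of core.diff, here is a minimal userspace sketch of the
idea, not the code from the patch. It assumes "collide" means that a frame
chosen to hold loaded image data is also the final destination of some saved
page. All names here (pfn_bitmap, pfn_collides, ...) are hypothetical. One bit
per page frame turns the per-page check from a scan over every destination
into a single lookup.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical bitmap with one bit per page frame number (pfn). */
struct pfn_bitmap {
	unsigned long *words;
	unsigned long nr_pages;
};

/* Allocate a zeroed bitmap covering nr_pages frames; returns 0 on success. */
static int pfn_bitmap_init(struct pfn_bitmap *bm, unsigned long nr_pages)
{
	unsigned long nr_words = (nr_pages + BITS_PER_WORD - 1) / BITS_PER_WORD;

	bm->words = calloc(nr_words ? nr_words : 1, sizeof(unsigned long));
	bm->nr_pages = nr_pages;
	return bm->words ? 0 : -1;
}

static void pfn_bitmap_free(struct pfn_bitmap *bm)
{
	free(bm->words);
	bm->words = NULL;
}

/* Mark a frame as the final destination of some saved page. */
static void pfn_bitmap_set(struct pfn_bitmap *bm, unsigned long pfn)
{
	assert(pfn < bm->nr_pages);
	bm->words[pfn / BITS_PER_WORD] |= 1UL << (pfn % BITS_PER_WORD);
}

/* O(1) collision test: is this frame the destination of a saved page? */
static int pfn_collides(const struct pfn_bitmap *bm, unsigned long pfn)
{
	assert(pfn < bm->nr_pages);
	return (int)((bm->words[pfn / BITS_PER_WORD] >> (pfn % BITS_PER_WORD)) & 1UL);
}

/* The O(n) check that the bitmap replaces: scan every destination frame. */
static int pfn_collides_slow(const unsigned long *dest, unsigned long n,
			     unsigned long pfn)
{
	unsigned long i;

	for (i = 0; i < n; i++)
		if (dest[i] == pfn)
			return 1;
	return 0;
}
```

Calling the slow check once per page costs O(n^2) over the whole image.
Filling the bitmap once and then testing each candidate frame is O(n) in
total, at the price of one bit of memory per page frame.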


--- 2.6.9-lzf//drivers/base/class.c	2004-11-25 14:13:02.000000000 +0800
+++ 2.6.9/drivers/base/class.c	2004-11-28 23:17:00.000000000 +0800
@@ -465,6 +465,25 @@ void class_device_put(struct class_devic
 	kobject_put(&class_dev->kobj);
 }
 
+struct class * class_find(char * name)
+{
+	struct class * this_class;
+
+	if (!name)
+		return NULL;
+
+	down_read(&class_subsys.rwsem);
+	list_for_each_entry(this_class, &class_subsys.kset.list, subsys.kset.kobj.entry) {
+		if (!(strcmp(this_class->name, name))) {
+			class_get(this_class);
+			up_read(&class_subsys.rwsem);
+			return this_class;
+		}
+	}
+	up_read(&class_subsys.rwsem);
+
+	return NULL;
+}
 
 int class_interface_register(struct class_interface *class_intf)
 {
@@ -547,3 +566,5 @@ EXPORT_SYMBOL(class_device_remove_file);
 
 EXPORT_SYMBOL(class_interface_register);
 EXPORT_SYMBOL(class_interface_unregister);
+
+EXPORT_SYMBOL(class_find);
--- 2.6.9-lzf//drivers/base/power/Makefile	2004-11-25 14:13:03.000000000 +0800
+++ 2.6.9/drivers/base/power/Makefile	2004-11-28 23:17:01.000000000 +0800
@@ -1,5 +1,5 @@
 obj-y			:= shutdown.o
-obj-$(CONFIG_PM)	+= main.o suspend.o resume.o runtime.o sysfs.o
+obj-$(CONFIG_PM)	+= main.o suspend.o resume.o runtime.o sysfs.o tree.o
 
 ifeq ($(CONFIG_DEBUG_DRIVER),y)
 EXTRA_CFLAGS += -DDEBUG
--- 2.6.9-lzf//drivers/base/power/main.c	2004-11-25 14:13:02.000000000 +0800
+++ 2.6.9/drivers/base/power/main.c	2004-11-28 23:17:01.000000000 +0800
@@ -4,6 +4,9 @@
  * Copyright (c) 2003 Patrick Mochel
  * Copyright (c) 2003 Open Source Development Lab
  *
+ * Partial tree additions
+ * Copyright (c) 2004 Nigel Cunningham
+ *
  * This file is released under the GPLv2
  *
  *
@@ -23,10 +26,18 @@
 #include <linux/device.h>
 #include "power.h"
 
-LIST_HEAD(dpm_active);
-LIST_HEAD(dpm_off);
-LIST_HEAD(dpm_off_irq);
-
+struct partial_device_tree default_device_tree =
+{ 
+	.dpm_active	= LIST_HEAD_INIT(default_device_tree.dpm_active),
+	.dpm_off	= LIST_HEAD_INIT(default_device_tree.dpm_off),
+	.dpm_off_irq	= LIST_HEAD_INIT(default_device_tree.dpm_off_irq),
+};
+EXPORT_SYMBOL(default_device_tree);
+
+/* 
+ * One mutex for all trees because we can be moving items
+ * between trees.
+ */
 DECLARE_MUTEX(dpm_sem);
 
 /*
@@ -76,7 +87,9 @@ int device_pm_add(struct device * dev)
 		 dev->bus ? dev->bus->name : "No Bus", dev->kobj.name);
 	atomic_set(&dev->power.pm_users, 0);
 	down(&dpm_sem);
-	list_add_tail(&dev->power.entry, &dpm_active);
+	list_add_tail(&dev->power.entry, &default_device_tree.dpm_active);
+	dev->current_list = DEVICE_LIST_DPM_ACTIVE;
+	dev->tree = &default_device_tree;
 	device_pm_set_parent(dev, dev->parent);
 	if ((error = dpm_sysfs_add(dev)))
 		list_del(&dev->power.entry);
@@ -92,6 +105,7 @@ void device_pm_remove(struct device * de
 	dpm_sysfs_remove(dev);
 	device_pm_release(dev->power.pm_parent);
 	list_del(&dev->power.entry);
+	dev->current_list = DEVICE_LIST_NONE;
 	up(&dpm_sem);
 }
 
--- 2.6.9-lzf//drivers/base/power/power.h	2004-11-28 23:17:29.000000000 +0800
+++ 2.6.9/drivers/base/power/power.h	2004-11-28 23:17:00.000000000 +0800
@@ -30,10 +30,22 @@ extern struct semaphore dpm_sem;
 /*
  * The PM lists.
  */
-extern struct list_head dpm_active;
-extern struct list_head dpm_off;
-extern struct list_head dpm_off_irq;
 
+struct partial_device_tree 
+{
+	struct list_head dpm_active;
+	struct list_head dpm_off;
+	struct list_head dpm_off_irq;
+};
+
+enum {
+	DEVICE_LIST_NONE,
+	DEVICE_LIST_DPM_ACTIVE,
+	DEVICE_LIST_DPM_OFF,
+	DEVICE_LIST_DPM_OFF_IRQ,
+};
+
+extern struct partial_device_tree default_device_tree;
 
 static inline struct dev_pm_info * to_pm_info(struct list_head * entry)
 {
@@ -59,7 +71,9 @@ extern void dpm_sysfs_remove(struct devi
  * resume.c
  */
 
+extern void dpm_resume_tree(struct partial_device_tree * tree);
 extern void dpm_resume(void);
+extern void dpm_power_up_tree(struct partial_device_tree * tree);
 extern void dpm_power_up(void);
 extern int resume_device(struct device *);
 
--- 2.6.9-lzf//drivers/base/power/resume.c	2004-11-28 23:17:29.000000000 +0800
+++ 2.6.9/drivers/base/power/resume.c	2004-11-28 23:17:00.000000000 +0800
@@ -29,20 +29,25 @@ int resume_device(struct device * dev)
 
 
 
-void dpm_resume(void)
+void dpm_resume_tree(struct partial_device_tree * tree)
 {
-	while(!list_empty(&dpm_off)) {
-		struct list_head * entry = dpm_off.next;
+	while(!list_empty(&tree->dpm_off)) {
+		struct list_head * entry = tree->dpm_off.next;
 		struct device * dev = to_device(entry);
 		list_del_init(entry);
 
 		if (dev->power.prev_state == PMSG_ON)
 			resume_device(dev);
 
-		list_add_tail(entry, &dpm_active);
+		list_add_tail(entry, &tree->dpm_active);
+		dev->current_list = DEVICE_LIST_DPM_ACTIVE;
 	}
 }
 
+void dpm_resume(void)
+{
+	dpm_resume_tree(&default_device_tree);
+}
 
 /**
  *	device_resume - Restore state of each device in system.
@@ -60,6 +65,14 @@ void device_resume(void)
 
 EXPORT_SYMBOL(device_resume);
 
+void device_resume_tree(struct partial_device_tree * tree)
+{
+	down(&dpm_sem);
+	dpm_resume_tree(tree);
+	up(&dpm_sem);
+}
+
+EXPORT_SYMBOL(device_resume_tree);
 
 /**
  *	device_power_up_irq - Power on some devices.
@@ -72,16 +85,23 @@ EXPORT_SYMBOL(device_resume);
  *	Interrupts must be disabled when calling this.
  */
 
-void dpm_power_up(void)
+void dpm_power_up_tree(struct partial_device_tree * tree)
 {
-	while(!list_empty(&dpm_off_irq)) {
-		struct list_head * entry = dpm_off_irq.next;
+	while(!list_empty(&tree->dpm_off_irq)) {
+		struct list_head * entry = tree->dpm_off_irq.next;
+		struct device * dev = to_device(entry);
 		list_del_init(entry);
-		resume_device(to_device(entry));
-		list_add_tail(entry, &dpm_active);
+		resume_device(dev);
+		list_add_tail(entry, &tree->dpm_active);
+		dev->current_list = DEVICE_LIST_DPM_ACTIVE;
 	}
 }
+EXPORT_SYMBOL(dpm_power_up_tree);
 
+void dpm_power_up(void)
+{
+	dpm_power_up_tree(&default_device_tree);
+}
 
 /**
  *	device_pm_power_up - Turn on all devices that need special attention.
@@ -97,6 +117,58 @@ void device_power_up(void)
 	dpm_power_up();
 }
 
+#if 0
+
+/**
+ *
+ * 	pci_find_class_storage
+ *
+ *	Find a PCI storage device.
+ *	Based upon pci_find_class, but less strict.
+ */
+
+static struct pci_dev *
+pci_find_class_storage(unsigned int class, const struct pci_dev *from)
+{
+	struct list_head *n;
+	struct pci_dev *dev;
+
+	spin_lock(&pci_bus_lock);
+	n = from ? from->global_list.next : pci_devices.next;
+
+	while (n && (n != &pci_devices)) {
+		dev = pci_dev_g(n);
+		if ((dev->class >> 16) == class)
+			goto exit;
+		n = n->next;
+	}
+	dev = NULL;
+exit:
+	spin_unlock(&pci_bus_lock);
+	return dev;
+}
+
+
+/**
+ *	device_resume_type - Resume some devices.
+ *
+ *	Resume devices of a specific type and their parents.
+ *	Interrupts must be disabled when calling this.
+ *
+ *	Note that we only handle pci devices at the moment.
+ *	We have no way that I can tell of getting the class
+ *	of devices not on the pci bus.
+ */
+void device_resume_type(type)
+{
+	struct device * dev_dev;
+	struct pci_dev * pci_dev = NULL;
+	
+	while ((dev = pci_find_class(PCI_BASE_CLASS_STORAGE, dev))) {
+	}
+}
+#endif
+
 EXPORT_SYMBOL(device_power_up);
 
 
--- 2.6.9-lzf//drivers/base/power/shutdown.c	2004-11-28 23:17:29.000000000 +0800
+++ 2.6.9/drivers/base/power/shutdown.c	2004-11-28 23:17:01.000000000 +0800
@@ -66,3 +66,4 @@ void device_shutdown(void)
 	sysdev_shutdown();
 }
 
+EXPORT_SYMBOL(device_shutdown);
--- 2.6.9-lzf//drivers/base/power/suspend.c	2004-11-28 23:17:29.000000000 +0800
+++ 2.6.9/drivers/base/power/suspend.c	2004-11-28 23:17:00.000000000 +0800
@@ -51,7 +51,7 @@ int suspend_device(struct device * dev, 
 
 
 /**
- *	device_suspend - Save state and stop all devices in system.
+ *	device_suspend_tree - Save state and stop all devices in system.
  *	@state:		Power state to put each device in.
  *
  *	Walk the dpm_active list, call ->suspend() for each device, and move
@@ -60,7 +60,7 @@ int suspend_device(struct device * dev, 
  *	the device to the dpm_off list. If it returns -EAGAIN, we move it to
  *	the dpm_off_irq list. If we get a different error, try and back out.
  *
- *	If we hit a failure with any of the devices, call device_resume()
+ *	If we hit a failure with any of the devices, call device_resume_tree()
  *	above to bring the suspended devices back to life.
  *
  *	Note this function leaves dpm_sem held to
@@ -70,22 +70,24 @@ int suspend_device(struct device * dev, 
  *
  */
 
-int device_suspend(pm_message_t state)
+int device_suspend_tree(pm_message_t state, struct partial_device_tree * tree)
 {
 	int error = 0;
 
 	down(&dpm_sem);
-	while(!list_empty(&dpm_active)) {
-		struct list_head * entry = dpm_active.prev;
+	while(!list_empty(&tree->dpm_active)) {
+		struct list_head * entry = tree->dpm_active.prev;
 		struct device * dev = to_device(entry);
 		error = suspend_device(dev, state);
 
 		if (!error) {
 			list_del(&dev->power.entry);
-			list_add(&dev->power.entry, &dpm_off);
+			list_add(&dev->power.entry, &tree->dpm_off);
+			dev->current_list = DEVICE_LIST_DPM_OFF;
 		} else if (error == -EAGAIN) {
 			list_del(&dev->power.entry);
-			list_add(&dev->power.entry, &dpm_off_irq);
+			list_add(&dev->power.entry, &tree->dpm_off_irq);
+			dev->current_list = DEVICE_LIST_DPM_OFF_IRQ;
 		} else {
 			printk(KERN_ERR "Could not suspend device %s: "
 				"error %d\n", kobject_name(&dev->kobj), error);
@@ -96,10 +98,15 @@ int device_suspend(pm_message_t state)
 	up(&dpm_sem);
 	return error;
  Error:
-	dpm_resume();
+	dpm_resume_tree(tree);
 	goto Done;
 }
+EXPORT_SYMBOL(device_suspend_tree);
 
+int device_suspend(pm_message_t state)
+{
+	return device_suspend_tree(state, &default_device_tree);
+}
 EXPORT_SYMBOL(device_suspend);
 
 
@@ -112,19 +119,17 @@ EXPORT_SYMBOL(device_suspend);
  *	done, power down system devices.
  */
 
-int device_power_down(pm_message_t state)
+int device_power_down_tree(pm_message_t state, struct partial_device_tree * tree)
 {
 	int error = 0;
 	struct device * dev;
 
-	list_for_each_entry_reverse(dev, &dpm_off_irq, power.entry) {
+	list_for_each_entry_reverse(dev, &tree->dpm_off_irq, power.entry) {
 		if ((error = suspend_device(dev, state)))
 			break;
 	}
 	if (error)
 		goto Error;
-	if ((error = sysdev_suspend(state)))
-		goto Error;
  Done:
 	return error;
  Error:
@@ -132,5 +137,14 @@ int device_power_down(pm_message_t state
 	goto Done;
 }
 
-EXPORT_SYMBOL(device_power_down);
+EXPORT_SYMBOL(device_power_down_tree);
 
+int device_power_down(pm_message_t state)
+{
+	int error;
+
+	if (!(error = device_power_down_tree(state, &default_device_tree)))
+		error = sysdev_suspend(state);
+	return error;
+}
+EXPORT_SYMBOL(device_power_down);
--- /dev/null	2004-06-07 18:45:47.000000000 +0800
+++ 2.6.9/drivers/base/power/tree.c	2004-11-28 23:17:00.000000000 +0800
@@ -0,0 +1,105 @@
+/*
+ * tree.c - Functions for moving devices between trees.
+ *
+ * Copyright (c) 2004 Nigel Cunningham
+ *
+ * This file is released under the GPLv2
+ *
+ */
+
+#include <linux/device.h>
+#include <linux/err.h>
+#include "power.h"
+
+/*
+ *	device_merge_tree - Move an entire tree into another tree
+ *	@source: The tree to be moved
+ *	@dest  : The destination tree
+ */
+
+void device_merge_tree(	struct partial_device_tree * source,
+			struct partial_device_tree * dest)
+{
+	down(&dpm_sem);
+	list_splice_init(&source->dpm_active, &dest->dpm_active);
+	list_splice_init(&source->dpm_off, &dest->dpm_off);
+	list_splice_init(&source->dpm_off_irq, &dest->dpm_off_irq);
+	up(&dpm_sem);
+}
+EXPORT_SYMBOL(device_merge_tree);
+
+/*
+ * 	device_switch_trees - Move a device and its ancestors to a new tree
+ * 	@dev:	The lowest device to be moved.
+ * 	@tree:	The destination tree.
+ *
+ * 	Note that siblings can be left in the original tree. This is because
+ * 	we want to be able to keep part of a tree in one state while part is
+ * 	in another.
+ *
+ * 	Since we iterate all the way back to the top, and may move entries
+ * 	already in the destination tree, we will never violate the depth
+ * 	first property of the destination tree.
+ */
+
+void device_switch_trees(struct device * dev, struct partial_device_tree * tree)
+{
+	down(&dpm_sem);
+	while (dev) {
+		list_del(&dev->power.entry);
+		switch (dev->current_list) {
+			case DEVICE_LIST_DPM_ACTIVE:
+				list_add(&dev->power.entry, &tree->dpm_active);
+				break;
+			case DEVICE_LIST_DPM_OFF:
+				list_add(&dev->power.entry, &tree->dpm_off);
+				break;
+			case DEVICE_LIST_DPM_OFF_IRQ:
+				list_add(&dev->power.entry, &tree->dpm_off_irq);
+				break;
+		}
+			
+		dev = dev->parent;
+	}
+	up(&dpm_sem);
+}
+
+EXPORT_SYMBOL(device_switch_trees);
+
+/*
+ * 	device_create_tree - Create a new device tree
+ */
+
+struct partial_device_tree * device_create_tree(void)
+{
+	struct partial_device_tree * new_tree;
+
+	new_tree = (struct partial_device_tree *) 
+		kmalloc(sizeof(struct partial_device_tree),  GFP_ATOMIC);
+
+	if (new_tree) {
+		INIT_LIST_HEAD(&new_tree->dpm_active);
+		INIT_LIST_HEAD(&new_tree->dpm_off);
+		INIT_LIST_HEAD(&new_tree->dpm_off_irq);
+	}
+
+	return new_tree;
+}
+EXPORT_SYMBOL(device_create_tree);
+
+/*
+ * 	device_destroy_tree - Destroy a dynamically created tree
+ */
+
+void device_destroy_tree(struct partial_device_tree * tree)
+{
+	BUG_ON(tree == &default_device_tree);
+	
+	BUG_ON(!list_empty(&tree->dpm_active));
+	BUG_ON(!list_empty(&tree->dpm_off));
+	BUG_ON(!list_empty(&tree->dpm_off_irq));
+
+	kfree(tree);
+}
+
+EXPORT_SYMBOL(device_destroy_tree);
--- 2.6.9-lzf//drivers/base/sys.c	2004-11-25 14:13:03.000000000 +0800
+++ 2.6.9/drivers/base/sys.c	2004-11-28 23:17:01.000000000 +0800
@@ -337,7 +337,7 @@ int sysdev_suspend(u32 state)
 	}
 	return 0;
 }
-
+EXPORT_SYMBOL(sysdev_suspend);
 
 /**
  *	sysdev_resume - Bring system devices back to life.
@@ -384,6 +384,7 @@ int sysdev_resume(void)
 	}
 	return 0;
 }
+EXPORT_SYMBOL(sysdev_resume);
 
 
 int __init system_bus_init(void)
--- 2.6.9-lzf//include/linux/pm.h	2004-11-28 23:17:16.000000000 +0800
+++ 2.6.9/include/linux/pm.h	2004-11-28 23:16:55.000000000 +0800
@@ -231,13 +231,25 @@ struct dev_pm_info {
 };
 
 extern void device_pm_set_parent(struct device * dev, struct device * parent);
+struct partial_device_tree;
+extern struct partial_device_tree default_device_tree;
 
 extern int device_suspend(pm_message_t state);
+extern int device_suspend_tree(pm_message_t state, struct partial_device_tree * tree);
 extern int device_power_down(pm_message_t state);
+extern int device_power_down_tree(pm_message_t state, struct partial_device_tree * tree);
 extern void device_power_up(void);
+extern void device_power_up_tree(struct partial_device_tree * tree);
 extern void device_resume(void);
-
-
+extern void device_resume_tree(struct partial_device_tree * tree);
+extern void device_merge_tree(struct partial_device_tree * source,
+		struct partial_device_tree * dest);
+extern void device_switch_trees(struct device * dev, struct partial_device_tree * tree);
+extern void dpm_power_up_tree(struct partial_device_tree * tree);
+extern int sysdev_suspend(u32 state);
+extern int sysdev_resume(void);
+extern struct partial_device_tree * device_create_tree(void);
+extern void device_destroy_tree(struct partial_device_tree * tree);
 #endif /* __KERNEL__ */
 
 #endif /* _LINUX_PM_H */
--- 2.6.9-lzf//include/linux/device.h	2004-11-28 23:17:16.000000000 +0800
+++ 2.6.9/include/linux/device.h	2004-11-28 23:16:56.000000000 +0800
@@ -162,6 +162,7 @@ extern void class_unregister(struct clas
 
 extern struct class * class_get(struct class *);
 extern void class_put(struct class *);
+extern struct class * class_find(char * name);
 
 
 struct class_attribute {
@@ -288,6 +289,11 @@ struct device {
 					     override */
 
 	void	(*release)(struct device * dev);
+
+	struct partial_device_tree * tree; /* Which tree of devices this
+					      device is in */
+	int	current_list;		/* Which list within the tree the
+					   device is on (speeds moving) */
 };
 
 static inline struct device *
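
For readers who want to experiment with the partial-tree idea outside the
kernel, here is a small userspace model of device_switch_trees(). It is a
sketch under assumptions, not the patch code: the list helpers are simplified
stand-ins for the kernel's list_head API, there is no locking, and all names
(struct tree, struct dev, switch_trees, ...) are hypothetical. Unlike the
function posted above, the sketch also updates the device's tree back-pointer
and skips devices that are on no list.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_unlink(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = e->prev = e;
}

static void list_push_front(struct list_head *e, struct list_head *h)
{
	e->next = h->next;
	e->prev = h;
	h->next->prev = e;
	h->next = e;
}

static void list_push_back(struct list_head *e, struct list_head *h)
{
	list_push_front(e, h->prev);
}

enum { LIST_NONE, LIST_ACTIVE, LIST_OFF, LIST_OFF_IRQ };

struct tree { struct list_head active, off, off_irq; };

struct dev {
	struct list_head entry;
	struct dev *parent;
	struct tree *tree;
	int current_list;
};

static void tree_init(struct tree *t)
{
	list_init(&t->active);
	list_init(&t->off);
	list_init(&t->off_irq);
}

/* Register a device on the active list of a tree. */
static void dev_add(struct dev *d, struct dev *parent, struct tree *t)
{
	d->parent = parent;
	d->tree = t;
	d->current_list = LIST_ACTIVE;
	list_init(&d->entry);
	list_push_back(&d->entry, &t->active);
}

/* Move a device and all of its ancestors to the same-named list in dst. */
static void switch_trees(struct dev *d, struct tree *dst)
{
	for (; d; d = d->parent) {
		struct list_head *to;

		switch (d->current_list) {
		case LIST_ACTIVE:  to = &dst->active;  break;
		case LIST_OFF:     to = &dst->off;     break;
		case LIST_OFF_IRQ: to = &dst->off_irq; break;
		default:           continue;	/* on no list: leave it alone */
		}
		list_unlink(&d->entry);
		/* Pushing to the front keeps each parent ahead of its child. */
		list_push_front(&d->entry, to);
		d->tree = dst;
	}
}
```

Moving a disk this way drags its bus and root along, while a sibling on the
same bus stays behind in the original tree. That is the property the comment
above device_switch_trees() describes.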

--
Hu Gang / Steve
Linux Registered User 204016
GPG Public Key: http://soulinfo.com/~hugang/hugang.asc

* Re: software suspend patch [2/6]
  2004-11-28 16:23     ` software suspend patch [1/6] hugang
@ 2004-11-28 16:24       ` hugang
  2004-11-28 17:03         ` Pavel Machek
  2004-11-28 16:24       ` software suspend patch [3/6] hugang
                         ` (4 subsequent siblings)
  5 siblings, 1 reply; 51+ messages in thread
From: hugang @ 2004-11-28 16:24 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

On Mon, Nov 29, 2004 at 12:23:20AM +0800, hugang@soulinfo.com wrote:
> Hi Pavel Machek, Nigel Cunningham:
> 
>  device-tree.diff
>    Based on suspend2, with a few small changes.
> 
>  core.diff
>   1: redefine struct pbe so that the pagedir need not be contiguous.
>   2: shrink memory as little as possible.
>   3: use a bitmap to speed up the collision check when relocating pages.
>   4: pagecache saving ready.
> 
>  i386.diff
>  ppc.diff
>   i386 and powerpc suspend updates.
> 
>  pagecachs_addon.diff
>   Required if pagecache saving is enabled; it makes saving the
>   pagecache safe. Idea from suspend2.
> 
>   ppcfix.diff
>   Fixes a compile error.
>   $ gcc -v
>    .... 
>    gcc version 2.95.4 20011002 (Debian prerelease)
> 
> I'm using 2.6.9-ck3. With the above patches, swsusp1 works perfectly on my
> PowerPC and x86 PC with the highmem and preempt options enabled.
> 
> I hope core.diff@1,@2,@3, i386.diff and ppc.diff will be merged into the
> mainline kernel ASAP :). From my point of view, device-tree.diff is
> very useful when using pagecache saving, and pagecachs_addon.diff
> is really a hack to make pagecache saving safe.
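
Item 1 of core.diff is about a pagedir that no longer has to be one
contiguous allocation. A rough userspace sketch of that layout follows: pbe
entries are stored in page-sized blocks that are chained together, and an
entry is looked up by index. This is an illustration under assumptions, not
the patch code. In particular, the next_block field, the BLOCK_SIZE constant
and the function names are hypothetical, and the sketch assumes the link is
kept in the first entry of each block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE 4096UL	/* stand-in for PAGE_SIZE */

struct pbe {
	unsigned long address;		/* where the copy lives */
	unsigned long orig_address;	/* where the page must be restored */
	struct pbe *next_block;		/* hypothetical link, used only in the
					   first entry of each block */
};

#define PBES_PER_BLOCK (BLOCK_SIZE / sizeof(struct pbe))

/* Free every block in the chain. */
static void pagedir_free(struct pbe *head)
{
	while (head) {
		struct pbe *next = head->next_block;

		free(head);
		head = next;
	}
}

/* Allocate enough chained, zeroed blocks for nr entries; NULL on failure. */
static struct pbe *pagedir_alloc(unsigned long nr)
{
	struct pbe *head = NULL, **link = &head;
	unsigned long blocks = (nr + PBES_PER_BLOCK - 1) / PBES_PER_BLOCK;

	while (blocks--) {
		struct pbe *blk = calloc(PBES_PER_BLOCK, sizeof(struct pbe));

		if (!blk) {
			pagedir_free(head);
			return NULL;
		}
		*link = blk;
		link = &blk->next_block;
	}
	return head;
}

/* Return the entry at index, or NULL if the chain is too short. */
static struct pbe *pagedir_find(struct pbe *head, unsigned long index)
{
	unsigned long skip = index / PBES_PER_BLOCK;

	while (head && skip--)
		head = head->next_block;
	return head ? head + index % PBES_PER_BLOCK : NULL;
}
```

Each block is a single page-sized allocation, so building the pagedir never
needs a high-order contiguous allocation. The cost is that lookup by index
walks the chain, one step per block.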

--- 2.6.9-lzf//include/linux/reboot.h	2004-11-26 12:33:39.000000000 +0800
+++ 2.6.9/include/linux/reboot.h	2004-11-28 23:16:56.000000000 +0800
@@ -42,6 +42,8 @@
 extern int register_reboot_notifier(struct notifier_block *);
 extern int unregister_reboot_notifier(struct notifier_block *);
 
+/* For use by swsusp only */
+extern struct notifier_block *reboot_notifier_list;
 
 /*
  * Architecture-specific implementations of sys_reboot commands.
--- 2.6.9-lzf//include/linux/suspend.h	2004-11-28 23:17:18.000000000 +0800
+++ 2.6.9/include/linux/suspend.h	2004-11-28 23:16:56.000000000 +0800
@@ -1,7 +1,7 @@
 #ifndef _LINUX_SWSUSP_H
 #define _LINUX_SWSUSP_H
 
-#ifdef CONFIG_X86
+#if (defined(CONFIG_X86)) || (defined (CONFIG_PPC32))
 #include <asm/suspend.h>
 #endif
 #include <linux/swap.h>
--- 2.6.9-lzf//include/linux/sysctl.h	2004-11-28 23:17:15.000000000 +0800
+++ 2.6.9/include/linux/sysctl.h	2004-11-28 23:16:55.000000000 +0800
@@ -170,6 +170,7 @@ enum
 	VM_VFS_CACHE_PRESSURE=26, /* dcache/icache reclaim pressure */
 	VM_LEGACY_VA_LAYOUT=27, /* legacy/compatibility virtual address space layout */
 	VM_HARDMAPLIMIT=28,	/* Make mapped a hard limit */
+	VM_SWSUSP_PAGECACHE=29,	/* Enable/Disable Suspend PageCaches */
 };
 
 
--- 2.6.9-lzf//kernel/power/disk.c	2004-11-28 23:17:11.000000000 +0800
+++ 2.6.9/kernel/power/disk.c	2004-11-28 23:16:54.000000000 +0800
@@ -16,10 +16,11 @@
 #include <linux/device.h>
 #include <linux/delay.h>
 #include <linux/fs.h>
+#include <linux/reboot.h>
 #include <linux/device.h>
 #include "power.h"
 
-
+extern struct partial_device_tree *swsusp_dev_tree;
 extern suspend_disk_method_t pm_disk_mode;
 extern struct pm_ops * pm_ops;
 
@@ -29,6 +30,8 @@ extern int swsusp_read(void);
 extern int swsusp_resume(void);
 extern int swsusp_free(void);
 
+extern int swsusp_prepare_suspend(void);
+extern int swsusp_post_resume(void);
 
 static int noresume = 0;
 char resume_file[256] = CONFIG_PM_STD_PARTITION;
@@ -48,19 +51,20 @@ static void power_down(suspend_disk_meth
 	unsigned long flags;
 	int error = 0;
 
-	local_irq_save(flags);
 	switch(mode) {
 	case PM_DISK_PLATFORM:
- 		device_power_down(PMSG_SUSPEND);
+		local_irq_save(flags);
 		error = pm_ops->enter(PM_SUSPEND_DISK);
+		local_irq_restore(flags);
 		break;
 	case PM_DISK_SHUTDOWN:
 		printk("Powering off system\n");
-		device_shutdown();
+		notifier_call_chain(&reboot_notifier_list, SYS_POWER_OFF, NULL);
+		device_suspend_tree(PMSG_FREEZE, swsusp_dev_tree);
 		machine_power_off();
 		break;
 	case PM_DISK_REBOOT:
-		device_shutdown();
+		device_suspend_tree(PMSG_FREEZE, swsusp_dev_tree);
 		machine_restart(NULL);
 		break;
 	}
@@ -74,38 +78,6 @@ static void power_down(suspend_disk_meth
 
 static int in_suspend __nosavedata = 0;
 
-
-/**
- *	free_some_memory -  Try to free as much memory as possible
- *
- *	... but do not OOM-kill anyone
- *
- *	Notice: all userland should be stopped at this point, or
- *	livelock is possible.
- */
-
-static void free_some_memory(void)
-{
-	int i;
-	for (i=0; i<5; i++) {
-		int i = 0, tmp;
-		long pages = 0;
-		char *p = "-\\|/";
-
-		printk("Freeing memory...  ");
-		while ((tmp = shrink_all_memory(10000))) {
-			pages += tmp;
-			printk("\b%c", p[i]);
-			i++;
-			if (i > 3)
-				i = 0;
-		}
-		printk("\bdone (%li pages freed)\n", pages);
-		current->state = TASK_INTERRUPTIBLE;
-		schedule_timeout(HZ/5);
-	}
-}
-
 static inline void platform_finish(void)
 {
 	if (pm_disk_mode == PM_DISK_PLATFORM) {
@@ -116,7 +88,7 @@ static inline void platform_finish(void)
 
 static void finish(void)
 {
-	device_resume();
+	swsusp_post_resume();
 	platform_finish();
 	enable_nonboot_cpus();
 	thaw_processes();
@@ -124,7 +96,7 @@ static void finish(void)
 }
 
 
-static int prepare(void)
+static int prepare(int resume)
 {
 	int error;
 
@@ -143,14 +115,11 @@ static int prepare(void)
 		}
 	}
 
-	/* Free memory before shutting down devices. */
-	free_some_memory();
-
 	disable_nonboot_cpus();
-	if ((error = device_suspend(PMSG_FREEZE))) {
-		printk("Some devices failed to suspend\n");
-		goto Finish;
-	}
+	if (!resume)
+		if ((error = swsusp_prepare_suspend())) {
+			goto Finish;
+		}
 
 	return 0;
  Finish:
@@ -176,7 +145,7 @@ int pm_suspend_disk(void)
 {
 	int error;
 
-	if ((error = prepare()))
+	if ((error = prepare(0)))
 		return error;
 
 	pr_debug("PM: Attempting to suspend to disk.\n");
@@ -233,7 +202,7 @@ static int software_resume(void)
 
 	pr_debug("PM: Preparing system for restore.\n");
 
-	if ((error = prepare()))
+	if ((error = prepare(1)))
 		goto Free;
 
 	barrier();
@@ -241,7 +210,7 @@ static int software_resume(void)
 
 	pr_debug("PM: Restoring saved image.\n");
 	swsusp_resume();
-	pr_debug("PM: Restore failed, recovering.n");
+	pr_debug("PM: Restore failed, recovering.\n");
 	finish();
  Free:
 	swsusp_free();
--- 2.6.9-lzf//kernel/power/main.c	2004-11-28 23:17:11.000000000 +0800
+++ 2.6.9/kernel/power/main.c	2004-11-28 23:16:54.000000000 +0800
@@ -4,7 +4,7 @@
  * Copyright (c) 2003 Patrick Mochel
  * Copyright (c) 2003 Open Source Development Lab
  * 
- * This file is release under the GPLv2
+ * This file is released under the GPLv2
  *
  */
 
--- 2.6.9-lzf//kernel/power/swsusp.c	2004-11-28 23:17:11.000000000 +0800
+++ 2.6.9/kernel/power/swsusp.c	2004-11-28 23:16:54.000000000 +0800
@@ -63,6 +63,7 @@
 #include <linux/console.h>
 #include <linux/highmem.h>
 #include <linux/bio.h>
+#include <linux/preempt.h>
 
 #include <asm/uaccess.h>
 #include <asm/mmu_context.h>
@@ -74,11 +75,8 @@
 /* References to section boundaries */
 extern char __nosave_begin, __nosave_end;
 
-/* Variables to be preserved over suspend */
-static int pagedir_order_check;
-
 extern char resume_file[];
-static dev_t resume_device;
+static dev_t swsusp_resume_device;
 /* Local variables that should not be affected by save */
 unsigned int nr_copy_pages __nosavedata = 0;
 
@@ -97,7 +95,6 @@ unsigned int nr_copy_pages __nosavedata 
  */
 suspend_pagedir_t *pagedir_nosave __nosavedata = NULL;
 static suspend_pagedir_t *pagedir_save;
-static int pagedir_order __nosavedata = 0;
 
 #define SWSUSP_SIG	"S1SUSPEND"
 
@@ -168,10 +165,11 @@ static int is_resume_device(const struct
 	struct inode *inode = file->f_dentry->d_inode;
 
 	return S_ISBLK(inode->i_mode) &&
-		resume_device == MKDEV(imajor(inode), iminor(inode));
+		swsusp_resume_device == MKDEV(imajor(inode), iminor(inode));
 }
 
-int swsusp_swap_check(void) /* This is called before saving image */
+/* This is called before saving image */
+int swsusp_swap_check(struct partial_device_tree *suspend_device_tree) 
 {
 	int i, len;
 	
@@ -195,6 +193,7 @@ int swsusp_swap_check(void) /* This is c
 				if (is_resume_device(&swap_info[i])) {
 					swapfile_used[i] = SWAPFILE_SUSPEND;
 					root_swap = i;
+					device_switch_trees((swap_info[i].bdev)->bd_disk->driverfs_dev, suspend_device_tree);
 				} else {
 				  	swapfile_used[i] = SWAPFILE_IGNORED;
 				}
@@ -222,8 +221,105 @@ static void lock_swapdevices(void)
 		}
 	swap_list_unlock();
 }
+	
+#define ONE_PAGE_PBE_NUM	(PAGE_SIZE/sizeof(struct pbe))
+#define PBE_IS_PAGE_END(x)  \
+	( PAGE_SIZE - sizeof(struct pbe) == ((x) - ((~(PAGE_SIZE - 1)) & (x))) )
+
+#define pgdir_for_each_safe(pos, n, head) \
+	for(pos = head, n = pos ? (suspend_pagedir_t*)pos->dummy.val : NULL; \
+		pos != NULL; \
+		pos = n, n = pos ? (suspend_pagedir_t *)pos->dummy.val : NULL)
+
+#define pbe_for_each_safe(pos, n, index, max, head) \
+	for(pos = head, index = 0, \
+			n = pos ? (struct pbe *)pos->dummy.val : NULL; \
+		(pos != NULL) && (index < max); \
+		pos = (PBE_IS_PAGE_END((unsigned long)pos)) ? n : \
+			((struct pbe *)((unsigned long)pos + sizeof(struct pbe))), \
+			index ++, \
+			n = pos ? (struct pbe*)pos->dummy.val : NULL)
+
+/* free pagedir */
+static void pagedir_free(suspend_pagedir_t *head)
+{
+	suspend_pagedir_t *next, *cur;
+	pgdir_for_each_safe(cur, next, head) {
+		free_page((unsigned long)cur);
+	}
+}
+
+/* for_each_pbe_copy_back 
+ *
+ * A C reference version, kept to help when writing the assembly code.
+ *
+ */
+/*#define CREATE_ASM_CODE */
+#ifdef CREATE_ASM_CODE
+#if 0
+#define GET_ADDRESS(x) __pa(x) 
+#else
+#define GET_ADDRESS(x) (x)
+#endif
+asmlinkage void for_each_pbe_copy_back(void)
+{
+	struct pbe *pgdir, *next;
+
+	pgdir = pagedir_nosave;
+	while (pgdir != NULL) {
+		unsigned long nums, i;
+		pgdir = (struct pbe *)GET_ADDRESS(pgdir);
+		next = (struct pbe*)pgdir->dummy.val;
+		for (nums = 0; nums < ONE_PAGE_PBE_NUM; nums++) {
+			register unsigned long *orig, *copy;
+			orig = (unsigned long *)pgdir->orig_address;
+			if (orig == 0) goto end;
+			orig = (unsigned long *)GET_ADDRESS(orig);
+			copy = (unsigned long *)GET_ADDRESS(pgdir->address);
+#if 0
+			memcpy(orig, copy, PAGE_SIZE);
+#else
+			for (i = 0; i < PAGE_SIZE / sizeof(unsigned long); i+=4) {
+				*(orig + i) = *(copy + i);
+				*(orig + i+1) = *(copy + i+1);
+				*(orig + i+2) = *(copy + i+2);
+				*(orig + i+3) = *(copy + i+3);
+			}
+#endif
+			pgdir ++;
+		}
+		pgdir = next;
+	}
+end:
+	panic("just asm code");
+}
+#endif
 
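+/*
+ * find_pbe_by_index - look up a pbe by its position in the chained pagedir
+ * @pgdir: the pgdir head
+ * @index: zero-based index of the wanted entry, counted across all pages
+ *
+ * @return: pointer to the entry, or NULL if the chain is shorter than @index
+ */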
+static struct pbe *find_pbe_by_index(struct pbe *pgdir, int index)
+{
+	unsigned long p = 0;
+	struct pbe *pbe, *next;
 
+	pr_debug("find_pbe_by_index: %p, 0x%03x", pgdir, index); 
+	pgdir_for_each_safe(pbe, next, pgdir) {
+		if (p == index / ONE_PAGE_PBE_NUM) {
+			pbe = (struct pbe *)((unsigned long)pbe + 
+					(index % ONE_PAGE_PBE_NUM) * sizeof(struct pbe));
+			pr_debug(" %p, o{%p} c{%p}\n",
+					pbe, (void*)pbe->orig_address, (void*)pbe->address);
+			return pbe;
+		}
+		p ++;
+	}
+	return (NULL);
+}
 
 /**
  *	write_swap_page - Write one page to a fresh swap location.
@@ -257,7 +353,6 @@ static int write_page(unsigned long addr
 	return error;
 }
 
-
 /**
  *	data_free - Free the swap entries used by the saved image.
  *
@@ -267,43 +362,82 @@ static int write_page(unsigned long addr
 
 static void data_free(void)
 {
-	swp_entry_t entry;
-	int i;
+	int index;
+	struct pbe *pos, *next;
 
-	for (i = 0; i < nr_copy_pages; i++) {
-		entry = (pagedir_nosave + i)->swap_address;
+	pbe_for_each_safe(pos, next, index, nr_copy_pages, pagedir_nosave) {
+		swp_entry_t entry;
+
+		entry = pos->swap_address;
 		if (entry.val)
 			swap_free(entry);
-		else
-			break;
-		(pagedir_nosave + i)->swap_address = (swp_entry_t){0};
+		pos->swap_address = (swp_entry_t){0};
 	}
 }
 
+static int mod_progress = 1;
+
+static void inline mod_printk_progress(int i)
+{
+	if (mod_progress == 0) mod_progress = 1;
+	if (!(i%100))
+		printk( "\b\b\b\b%3d%%", i / mod_progress );
+}
+
+static int write_one_pbe(struct pbe *p, void *data, int cur)
+{
+	int error = 0;
+
+	mod_printk_progress(cur);
+
+	pr_debug("write_one_pbe: %p, o{%p} c{%p} %d ",
+			p, (void *)p->orig_address, (void *)p->address, cur);
+	error = write_page((unsigned long)data, &p->swap_address);
+	if (error) return error;
+
+	pr_debug("%lu\n", swp_offset(p->swap_address));
+
+	return 0;
+}
+
+static int bio_read_page(pgoff_t page_off, void * page);
+
+static int read_one_pbe(struct pbe *p, void *data, int cur)
+{
+	int error = 0;
+
+	mod_printk_progress(cur);
+
+	pr_debug("read_one_pbe: %p, o{%p} c{%p} %lu\n",
+			p, (void *)p->orig_address, data, 
+			swp_offset(p->swap_address));
+
+	error = bio_read_page(swp_offset(p->swap_address), data);
+	if (error) return error;
+
+	return 0;
+}
 
 /**
  *	data_write - Write saved image to swap.
  *
  *	Walk the list of pages in the image and sync each one to swap.
  */
-
 static int data_write(void)
 {
-	int error = 0;
-	int i;
-	unsigned int mod = nr_copy_pages / 100;
-
-	if (!mod)
-		mod = 1;
+	int error = 0, index;
+	struct pbe *pos, *next;
+	
+	mod_progress = nr_copy_pages / 100;
 
-	printk( "Writing data to swap (%d pages)...     ", nr_copy_pages );
-	for (i = 0; i < nr_copy_pages && !error; i++) {
-		if (!(i%mod))
-			printk( "\b\b\b\b%3d%%", i / mod );
-		error = write_page((pagedir_nosave+i)->address,
-					  &((pagedir_nosave+i)->swap_address));
+	printk( "Writing data to swap (%d pages)...     ", nr_copy_pages);
+	pbe_for_each_safe(pos, next, index, nr_copy_pages, pagedir_nosave) {
+		BUG_ON(pos->orig_address == 0);
+		error = write_one_pbe(pos, (void*)pos->address, index);
+		if (error) break;
 	}
 	printk("\b\b\b\bdone\n");
+
 	return error;
 }
 
@@ -363,7 +497,6 @@ static void free_pagedir_entries(void)
 		swap_free(swsusp_info.pagedir[i]);
 }
 
-
 /**
  *	write_pagedir - Write the array of pages holding the page directory.
  *	@last:	Last swap entry we write (needed for header).
@@ -371,15 +504,19 @@ static void free_pagedir_entries(void)
 
 static int write_pagedir(void)
 {
-	unsigned long addr = (unsigned long)pagedir_nosave;
-	int error = 0;
-	int n = SUSPEND_PD_PAGES(nr_copy_pages);
-	int i;
+	int error = 0, n = 0;
+	suspend_pagedir_t *pgdir, *next;
 
-	swsusp_info.pagedir_pages = n;
+	pgdir_for_each_safe(pgdir, next, pagedir_nosave) {
+		error = write_page((unsigned long)pgdir, &swsusp_info.pagedir[n]);
+		if (error) { 
+			break;
+		}
+		n++;
+	} 
 	printk( "Writing pagedir (%d pages)\n", n);
-	for (i = 0; i < n && !error; i++, addr += PAGE_SIZE)
-		error = write_page(addr, &swsusp_info.pagedir[i]);
+	swsusp_info.pagedir_pages = n;
+
 	return error;
 }
 
@@ -410,7 +547,6 @@ static int write_suspend_image(void)
 	goto Done;
 }
 
-
 #ifdef CONFIG_HIGHMEM
 struct highmem_page {
 	char *data;
@@ -503,7 +639,533 @@ static int restore_highmem(void)
 #endif
 	return 0;
 }
+struct partial_device_tree *swsusp_dev_tree = NULL;
+
+static int free_suspend_device_tree(void)
+{
+	if (swsusp_dev_tree) {
+		device_merge_tree(swsusp_dev_tree, &default_device_tree);
+		device_destroy_tree(swsusp_dev_tree);
+	}
+	swsusp_dev_tree = NULL;
+	return 0;
+}
+
+static int setup_suspend_device_tree(void)
+{
+	struct class * class = NULL;
+
+	swsusp_dev_tree = device_create_tree();
+	if (IS_ERR(swsusp_dev_tree)) {
+		swsusp_dev_tree = NULL;
+		return -ENOMEM;
+	}
+	/* Now check for graphics class devices, so we can 
+	 * keep the display on while suspending */
+	class = class_find("graphics");
+	if (class) {
+		struct class_device * class_dev;
+		list_for_each_entry(class_dev, &class->children, node)
+			device_switch_trees(class_dev->dev, swsusp_dev_tree);
+		class_put(class);
+	}
+
+	return (0);
+}
+
+typedef int (*do_page_t)(struct page *page, int p);
+
+static int foreach_zone_page(struct zone *zone, do_page_t fun, int p)
+{
+	int inactive = 0, active = 0;
+
+	spin_lock_irq(&zone->lru_lock); 
+	if (zone->nr_inactive) {
+		struct list_head * entry = zone->inactive_list.prev;
+		while (entry != &zone->inactive_list) {
+			if (fun) {
+				struct page * page = list_entry(entry, struct page, lru);
+				inactive += fun(page, p);
+			} else { 
+				inactive ++;
+			}
+			entry = entry->prev;
+		}
+	}
+	if (zone->nr_active) {
+		struct list_head * entry = zone->active_list.prev;
+		while (entry != &zone->active_list) {
+			if (fun) {
+				struct page * page = list_entry(entry, struct page, lru);
+				active += fun(page, p);
+			} else {
+				active ++;
+			}
+			entry = entry->prev;
+		}
+	}
+	spin_unlock_irq(&zone->lru_lock);
+
+	return (active + inactive);
+}
+
+/* enable/disable pagecache suspend */
+int swsusp_pagecache = 0;
+
+/* I'll move this to include/linux/page-flags.h */
+#define PG_page_caches (PG_nosave_free + 1)
+
+#define SetPagePcs(page)    set_bit(PG_page_caches, &(page)->flags)
+#define ClearPagePcs(page)  clear_bit(PG_page_caches, &(page)->flags)
+#define PagePcs(page)   test_bit(PG_page_caches, &(page)->flags)
+
+static suspend_pagedir_t *pagedir_cache = NULL;
+static int nr_copy_page_caches = 0;
+
+static int setup_page_caches_pe(struct page *page, int setup)
+{
+	unsigned long pfn = page_to_pfn(page);
+
+	BUG_ON(PageReserved(page) && PageNosave(page));
+	if (!pfn_valid(pfn)) {
+		printk("not valid page\n");
+		return 0;
+	}
+	if (PageNosave(page)) {
+		printk("nosave\n");
+		return 0;
+	}
+	if (PageReserved(page) /*&& pfn_is_nosave(pfn)*/) {
+		printk("[nosave]\n");
+		return 0;
+	}
+	if (PageSlab(page)) {
+		printk("slab\n");
+		return 0;
+	}
+	if (setup) {
+		struct pbe *p = find_pbe_by_index(pagedir_cache, nr_copy_page_caches);
+		BUG_ON(p == NULL);
+		p->address = (long)page_address(page);
+		BUG_ON(p->address == 0);
+		/*pr_debug("setup_page_caches: cur %p, o{%p}, d{%p}, nr %u\n",
+				(void*)p, (void*)p->orig_address,
+				(void*)p->address, nr_copy_page_caches);*/
+		nr_copy_page_caches ++;
+	}
+	SetPagePcs(page);
+
+	return (1);
+}
+
+static int count_page_caches(struct zone *zone, int p)
+{
+	if (swsusp_pagecache)
+		return foreach_zone_page(zone, setup_page_caches_pe, p);
+	return 0;
+}
+
+#define pointer2num(x)  (((x) - 0xc0000000) >> 12)
+#define num2pointer(x)  (((x) << 12) + 0xc0000000)
+
+static inline void collide_set_bit(unsigned char *bitmap, 
+		unsigned long bitnum)
+{
+	bitnum = pointer2num(bitnum); 
+	bitmap[bitnum / 8] |= (1 << (bitnum%8));
+}
+
+static inline int collide_is_bit_set(unsigned char *bitmap, 
+		unsigned long bitnum)
+{               
+	bitnum = pointer2num(bitnum); 
+	return !!(bitmap[bitnum / 8] & (1 << (bitnum%8)));
+}
+
+static void collide_bitmap_free(unsigned char *bitmap)
+{
+	free_pages((unsigned long)bitmap, 2);
+}
+
+/* 
+ * Four pages (an order-2 allocation) are enough for the bitmap.
+ *
+ */
+static unsigned char *collide_bitmap_init(struct pbe *pgdir)
+{
+	unsigned char *bitmap = 
+		(unsigned char *)__get_free_pages(GFP_ATOMIC | __GFP_COLD, 2);
+	struct pbe *next;
+
+	if (bitmap == NULL) {
+		return NULL;
+	}
+	memset(bitmap, 0, 4 * PAGE_SIZE);
+
+	/* do base check */
+	BUG_ON(collide_is_bit_set(bitmap, (unsigned long)bitmap) == 1);
+	collide_set_bit(bitmap, (unsigned long)bitmap);
+	BUG_ON(collide_is_bit_set(bitmap, (unsigned long)bitmap) == 0);
+	
+	while (pgdir != NULL) {
+		unsigned long nums;
+		next = (struct pbe*)pgdir->dummy.val;
+		for (nums = 0; nums < ONE_PAGE_PBE_NUM; nums++) {
+			collide_set_bit(bitmap, (unsigned long)pgdir);
+			collide_set_bit(bitmap, (unsigned long)pgdir->orig_address);
+			pgdir ++;
+		}
+		pgdir = next;
+	}
+
+	return bitmap;
+}
+static void **eaten_memory = NULL;
+
+static void *swsusp_get_safe_free_page(unsigned char *collide)
+{
+	void *addr = NULL;
+	void **c = eaten_memory;
+
+	do {
+		if (addr) {
+			eaten_memory = (void**)addr;
+			*eaten_memory = c;
+			c = eaten_memory;
+		}
+		addr = (void*)__get_free_pages(GFP_ATOMIC | __GFP_COLD, 0);
+		if (!addr) 
+			return NULL;
+	} while (collide && collide_is_bit_set(collide, (unsigned long)addr));
+
+	return addr;
+}
+/*
+ * Redefined for the PageCache pagedir:
+ *
+ * struct pbe {
+ * unsigned long address;
+ * unsigned long orig_address; pointer of next struct pbe
+ * swp_entry_t swap_address;
+ * swp_entry_t dummy;          current index
+ * }
+ *
+ */
+static suspend_pagedir_t * alloc_one_pagedir(suspend_pagedir_t *prev, 
+		unsigned char *collide)
+{
+	suspend_pagedir_t *pgdir = NULL;
+	int i;
+
+	pgdir = (suspend_pagedir_t *)swsusp_get_safe_free_page(collide);
+
+	if (!pgdir) /* out of memory; alloc_pagedir() checks for NULL */
+		return NULL;
+	for (i = 0; i < ONE_PAGE_PBE_NUM; i++) {
+		pgdir[i].dummy.val = 0;
+		pgdir[i].address = 0;
+		pgdir[i].orig_address = 0;
+		if (prev)
+			prev[i].dummy.val= (unsigned long)pgdir;
+	}
+
+	return (pgdir);
+}
+
+/* calc_nums - Determine the number of allocations needed for pagedir_save. */
+static int calc_nums(int nr_copy)
+{
+	int diff = 0, ret = 0;
+	do {
+		diff = (nr_copy / ONE_PAGE_PBE_NUM) - ret + 1;
+		if (diff) {
+			ret += diff;
+			nr_copy += diff;
+		}
+	} while (diff);
+	return nr_copy;
+}
+
+
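+/*
+ * alloc_pagedir - allocate a chain of pagedir pages
+ *
+ * @param pbe        receives the head of the chain (NULL on failure)
+ * @param pbe_nums   number of pbe entries requested
+ * @param collide    bitmap of pages to avoid, or NULL
+ * @param page_nums  number of pages to allocate; 0 means derive it from pbe_nums
+ * Returns the number of surplus entries allocated, or -ENOMEM.
+ */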
+static int alloc_pagedir(struct pbe **pbe, int pbe_nums, 
+		unsigned char *collide, int page_nums)
+{
+	unsigned int nums = 0;
+	unsigned int after_alloc = pbe_nums;
+	suspend_pagedir_t *prev = NULL, *cur = NULL;
+
+	if (page_nums)
+		after_alloc = ONE_PAGE_PBE_NUM * page_nums;
+	else 
+		after_alloc = calc_nums(after_alloc);
+
+	pr_debug("alloc_pagedir: %d, %d\n", pbe_nums, after_alloc);
+	for (nums = 0 ; nums < after_alloc ; nums += ONE_PAGE_PBE_NUM) {
+		cur = alloc_one_pagedir(prev, collide);
+		pr_debug("alloc_one_pagedir: %p\n", cur);
+		if (!cur) { /* get page failed */
+			goto no_mem;
+		}
+		if (nums == 0) { /* setup the head */
+			*pbe = cur;
+		}
+		prev = cur;
+	}
+	return after_alloc - pbe_nums;
+
+no_mem:
+	pagedir_free(*pbe);
+	*pbe = NULL;
+
+	return (-ENOMEM);
+}
+
+static char *page_cache_buf = NULL;
+static int alloc_pagecache_buf(void)
+{
+	page_cache_buf = (char *)__get_free_pages(GFP_ATOMIC /*| __GFP_NOWARN*/, 0);
+	if (!page_cache_buf) {
+		/* FIXME try shrink memory */
+		return -ENOMEM;
+	}
+	return 0;
+}
+static int free_pagecache_buf(void)
+{
+	free_page((unsigned long)page_cache_buf);
+	return 0;
+}
+
+int swsusp_post_resume(void)
+{
+	int error = 0, index;
+	struct pbe *pos, *next;
+
+#ifdef CONFIG_PREEMPT
+	preempt_enable();
+#endif
+	if (swsusp_pagecache == 0) {
+		goto end;
+	}
+	
+	local_irq_disable();
+	dpm_power_up_tree(swsusp_dev_tree);
+	local_irq_enable();
+	device_resume_tree(swsusp_dev_tree);
+
+	mod_progress = nr_copy_page_caches / 100;
+
+	printk( "Reading PageCaches from swap (%d pages)...     ", 
+			nr_copy_page_caches);
+	pbe_for_each_safe(pos, next, index, nr_copy_page_caches, 
+			pagedir_cache) {
+		swp_entry_t entry;
+
+		error = read_one_pbe(pos, page_cache_buf, index);
+		if (error) break;
+		memcpy((void*)pos->address, page_cache_buf, PAGE_SIZE);
+		entry = pos->swap_address;
+		if (entry.val)
+			swap_free(entry);
+	}
+	printk("\b\b\b\bdone\n");
+
+	free_pagecache_buf();
+	swsusp_pagecache = 1;
+end:
+	local_irq_disable();
+	dpm_power_up_tree(&default_device_tree);
+	local_irq_enable();
+	device_resume_tree(&default_device_tree);
+	device_resume_tree(&default_device_tree);
+	free_suspend_device_tree();
+
+	return error;
+}
+
+static int page_caches_write(void)
+{
+	int error = 0, index;
+	struct pbe *pos, *next;
+	
+	mod_progress = nr_copy_page_caches / 100;
+
+	printk( "Writing PageCaches to swap (%d pages)...     ", 
+			nr_copy_page_caches);
+	pbe_for_each_safe(pos, next, index, nr_copy_page_caches,
+			pagedir_cache) {
+		memcpy(page_cache_buf, (void*)pos->address, PAGE_SIZE);
+		error = write_one_pbe(pos, page_cache_buf, index);
+		if (error) break;
+	}
+	printk("\b\b\b\bdone\n");
+
+	return error;
+}
+
+static int setup_pagedir_pbe(void)
+{
+	struct zone *zone;
+
+	nr_copy_page_caches = 0;
+	for_each_zone(zone) {
+		if (!is_highmem(zone)) {
+			count_page_caches(zone, 1);
+		}
+	}
+
+	return 0;
+}
+
+static void count_data_pages(void);
+static int swsusp_alloc(void);
+
+static int page_caches_recal(int resume)
+{
+	struct zone *zone;
+	int i;
+
+	if (swsusp_pagecache == 0 || resume == 1) return 0;
+
+	for (i = 0; i < max_mapnr; i++)
+		ClearPagePcs(mem_map+i);
+
+	nr_copy_page_caches = 0;
+	drain_local_pages();
+	for_each_zone(zone) {
+		if (!is_highmem(zone)) {
+			nr_copy_page_caches += count_page_caches(zone, 0);
+		}
+	}
+	i = calc_nums(nr_copy_page_caches);
+
+	return (i / ONE_PAGE_PBE_NUM + 1);
+}
+
+static int inline swsusp_need_pages(int resume)
+{
+	return nr_copy_pages + page_caches_recal(resume) + PAGES_FOR_IO;
+}
+
+static int swsusp_check_memory(int resume)
+{
+	int retry = 20 * 5; /* while nothing can be freed, retry for up to 20 sec */
+
+	if (!resume) {
+		count_data_pages();
+	}
+
+	printk("swsusp: need %d + %d pages, freed %d pages ...     ", 
+			nr_copy_pages + PAGES_FOR_IO, page_caches_recal(resume), 
+			nr_free_pages());
+	if (nr_free_pages() > swsusp_need_pages(resume)) {
+		printk(" done\n");
+		return 0;
+	}
+
+	do {
+		int diff = swsusp_need_pages(resume) - nr_free_pages();
+		
+		if (diff < 0) break;
+		if (shrink_all_memory(diff * 2) == 0) {
+			retry --;
+		} else {
+			retry = 0;
+		}
+		current->state = TASK_INTERRUPTIBLE;
+		schedule_timeout(HZ/5);
+		if (!resume) {
+			drain_local_pages();
+			count_data_pages();
+		}
+		printk("\b\b\b\b\b%5d", diff);
+	} while (retry);
+	
+	printk("swsusp: need %d + %d pages, freed %d pages ... ", 
+			nr_copy_pages + PAGES_FOR_IO, page_caches_recal(resume), 
+			nr_free_pages());
+
+	if (nr_free_pages() < swsusp_need_pages(resume)) {
+		printk("  failed\n");
+		return -ENOMEM;
+	} 
+	printk("   done\n");
+
+	return 0;
+}
+
+int swsusp_prepare_suspend(void)
+{
+	int error = 0;
+
+	if ((error = setup_suspend_device_tree())) {
+		return error;
+	}
+	if (swsusp_check_memory(0)) {
+		free_suspend_device_tree();
+		return -ENOMEM;
+	}
+	/* exclude the swap device and its parent from the tree */
+	if ((error = swsusp_swap_check(swsusp_dev_tree))) {
+		free_suspend_device_tree();
+		return error;
+	}
+
+	/* power down all devices except the swap device and its parent */
+	BUG_ON(irqs_disabled());
+	device_suspend_tree(PMSG_FREEZE, &default_device_tree);
+	local_irq_disable();
+	device_power_down_tree(PMSG_FREEZE, &default_device_tree);
+	local_irq_enable();
+
+	if (swsusp_pagecache) {
+		if ((error = alloc_pagecache_buf())) {
+			swsusp_pagecache = 0;
+		}
+	}
+	if (swsusp_pagecache) {
+		if (alloc_pagedir(&pagedir_cache, nr_copy_page_caches, NULL, 0) < 0)
+			swsusp_pagecache = 0;
+		else 
+			swsusp_pagecache = 2;
+	}
+
+	drain_local_pages();
+	count_data_pages();
+	error = swsusp_alloc();
+	if (error) {
+		printk("swsusp_alloc failed, %d\n", error);
+		free_suspend_device_tree();
+		return error;
+	}
 
+	drain_local_pages();
+	count_data_pages();
+	printk("swsusp: need to copy %u pages, %u page_caches\n",
+			nr_copy_pages, nr_copy_page_caches);
+
+	if (swsusp_pagecache) {
+		setup_pagedir_pbe();
+		pr_debug("after setup_pagedir_pbe \n");
+
+		error = page_caches_write();
+		if (error)  {
+			free_suspend_device_tree();
+			return error;
+		}
+	}
+
+	return 0;
+}
 
 static int pfn_is_nosave(unsigned long pfn)
 {
@@ -539,7 +1201,10 @@ static int saveable(struct zone * zone, 
 	}
 	if (PageNosaveFree(page))
 		return 0;
-
+	if (PagePcs(page) && swsusp_pagecache) {
+		BUG_ON(zone->nr_inactive == 0 && zone->nr_active == 0);
+		return 0;
+	}
 	return 1;
 }
 
@@ -559,12 +1224,10 @@ static void count_data_pages(void)
 	}
 }
 
-
 static void copy_data_pages(void)
 {
 	struct zone *zone;
 	unsigned long zone_pfn;
-	struct pbe * pbe = pagedir_nosave;
 	int pages_copied = 0;
 	
 	for_each_zone(zone) {
@@ -574,11 +1237,16 @@ static void copy_data_pages(void)
 		for (zone_pfn = 0; zone_pfn < zone->spanned_pages; ++zone_pfn) {
 			if (saveable(zone, &zone_pfn)) {
 				struct page * page;
+				struct pbe * pbe = find_pbe_by_index(pagedir_nosave, 
+						pages_copied);
+				BUG_ON(pbe == NULL);
+				if (pbe->address == 0) 
+					panic("copy_data_pages: %d copied\n", pages_copied);
 				page = pfn_to_page(zone_pfn + zone->zone_start_pfn);
 				pbe->orig_address = (long) page_address(page);
+				BUG_ON(pbe->orig_address == 0);
 				/* copy_page is not usable for copying task structs. */
 				memcpy((void *)pbe->address, (void *)pbe->orig_address, PAGE_SIZE);
-				pbe++;
 				pages_copied++;
 			}
 		}
@@ -587,85 +1255,18 @@ static void copy_data_pages(void)
 	nr_copy_pages = pages_copied;
 }
 
-
-/**
- *	calc_order - Determine the order of allocation needed for pagedir_save.
- *
- *	This looks tricky, but is just subtle. Please fix it some time.
- *	Since there are %nr_copy_pages worth of pages in the snapshot, we need
- *	to allocate enough contiguous space to hold 
- *		(%nr_copy_pages * sizeof(struct pbe)), 
- *	which has the saved/orig locations of the page.. 
- *
- *	SUSPEND_PD_PAGES() tells us how many pages we need to hold those 
- *	structures, then we call get_bitmask_order(), which will tell us the
- *	last bit set in the number, starting with 1. (If we need 30 pages, that
- *	is 0x0000001e in hex. The last bit is the 5th, which is the order we 
- *	would use to allocate 32 contiguous pages).
- *
- *	Since we also need to save those pages, we add the number of pages that
- *	we need to nr_copy_pages, and in case of an overflow, do the 
- *	calculation again to update the number of pages needed. 
- *
- *	With this model, we will tend to waste a lot of memory if we just cross
- *	an order boundary. Plus, the higher the order of allocation that we try
- *	to do, the more likely we are to fail in a low-memory situtation 
- *	(though	we're unlikely to get this far in such a case, since swsusp 
- *	requires half of memory to be free anyway).
- */
-
-
-static void calc_order(void)
-{
-	int diff = 0;
-	int order = 0;
-
-	do {
-		diff = get_bitmask_order(SUSPEND_PD_PAGES(nr_copy_pages)) - order;
-		if (diff) {
-			order += diff;
-			nr_copy_pages += 1 << diff;
-		}
-	} while(diff);
-	pagedir_order = order;
-}
-
-
-/**
- *	alloc_pagedir - Allocate the page directory.
- *
- *	First, determine exactly how many contiguous pages we need and
- *	allocate them.
- */
-
-static int alloc_pagedir(void)
-{
-	calc_order();
-	pagedir_save = (suspend_pagedir_t *)__get_free_pages(GFP_ATOMIC | __GFP_COLD,
-							     pagedir_order);
-	if (!pagedir_save)
-		return -ENOMEM;
-	memset(pagedir_save, 0, (1 << pagedir_order) * PAGE_SIZE);
-	pagedir_nosave = pagedir_save;
-	return 0;
-}
-
 /**
  *	free_image_pages - Free pages allocated for snapshot
  */
-
 static void free_image_pages(void)
 {
-	struct pbe * p;
-	int i;
+	struct pbe *pos, *next;
+	int index;
 
-	p = pagedir_save;
-	for (i = 0, p = pagedir_save; i < nr_copy_pages; i++, p++) {
-		if (p->address) {
-			ClearPageNosave(virt_to_page(p->address));
-			free_page(p->address);
-			p->address = 0;
-		}
+	pbe_for_each_safe(pos, next, index, nr_copy_pages, pagedir_save) {
+		ClearPageNosave(virt_to_page(pos->address));
+		free_page(pos->address);
+		pos->address = 0;
 	}
 }
 
@@ -673,17 +1274,16 @@ static void free_image_pages(void)
  *	alloc_image_pages - Allocate pages for the snapshot.
  *
  */
-
 static int alloc_image_pages(void)
 {
-	struct pbe * p;
-	int i;
+	struct pbe *pos, *next;
+	int index;
 
-	for (i = 0, p = pagedir_save; i < nr_copy_pages; i++, p++) {
-		p->address = get_zeroed_page(GFP_ATOMIC | __GFP_COLD);
-		if (!p->address)
+	pbe_for_each_safe(pos, next, index, nr_copy_pages, pagedir_save) {
+		pos->address = (unsigned long)get_zeroed_page(GFP_ATOMIC | __GFP_COLD);
+		if (!pos->address)
 			return -ENOMEM;
-		SetPageNosave(virt_to_page(p->address));
+		SetPageNosave(virt_to_page(pos->address));
 	}
 	return 0;
 }
@@ -693,28 +1293,9 @@ void swsusp_free(void)
 	BUG_ON(PageNosave(virt_to_page(pagedir_save)));
 	BUG_ON(PageNosaveFree(virt_to_page(pagedir_save)));
 	free_image_pages();
-	free_pages((unsigned long) pagedir_save, pagedir_order);
+	pagedir_free(pagedir_save);
 }
 
-
-/**
- *	enough_free_mem - Make sure we enough free memory to snapshot.
- *
- *	Returns TRUE or FALSE after checking the number of available 
- *	free pages.
- */
-
-static int enough_free_mem(void)
-{
-	if (nr_free_pages() < (nr_copy_pages + PAGES_FOR_IO)) {
-		pr_debug("swsusp: Not enough free pages: Have %d\n",
-			 nr_free_pages());
-		return 0;
-	}
-	return 1;
-}
-
-
 /**
  *	enough_swap - Make sure we have enough swap to save the image.
  *
@@ -730,7 +1311,7 @@ static int enough_swap(void)
 	struct sysinfo i;
 
 	si_swapinfo(&i);
-	if (i.freeswap < (nr_copy_pages + PAGES_FOR_IO))  {
+	if (i.freeswap < (nr_copy_pages + nr_copy_page_caches + PAGES_FOR_IO))  {
 		pr_debug("swsusp: Not enough swap. Need %ld\n",i.freeswap);
 		return 0;
 	}
@@ -741,34 +1322,30 @@ static int swsusp_alloc(void)
 {
 	int error;
 
-	pr_debug("suspend: (pages needed: %d + %d free: %d)\n",
-		 nr_copy_pages, PAGES_FOR_IO, nr_free_pages());
-
 	pagedir_nosave = NULL;
-	if (!enough_free_mem())
-		return -ENOMEM;
 
 	if (!enough_swap())
 		return -ENOSPC;
-
-	if ((error = alloc_pagedir())) {
-		pr_debug("suspend: Allocating pagedir failed.\n");
-		return error;
+	error = alloc_pagedir(&pagedir_save, nr_copy_pages, NULL, 0);
+	if (error < 0) {
+		printk("suspend: Allocating pagedir failed.\n");
+		return -ENOMEM;
 	}
+	pr_debug("alloc_pagedir: addon %d\n", error);
+	nr_copy_pages += error;
 	if ((error = alloc_image_pages())) {
-		pr_debug("suspend: Allocating image pages failed.\n");
+		printk("suspend: Allocating image pages failed.\n");
 		swsusp_free();
 		return error;
 	}
+	pagedir_nosave = pagedir_save;
 
-	pagedir_order_check = pagedir_order;
 	return 0;
 }
 
 int suspend_prepare_image(void)
 {
-	unsigned int nr_needed_pages;
-	int error;
+	BUG_ON(!irqs_disabled());
 
 	pr_debug("swsusp: critical section: \n");
 	if (save_highmem()) {
@@ -777,15 +1354,6 @@ int suspend_prepare_image(void)
 		return -ENOMEM;
 	}
 
-	drain_local_pages();
-	count_data_pages();
-	printk("swsusp: Need to copy %u pages\n",nr_copy_pages);
-	nr_needed_pages = nr_copy_pages + PAGES_FOR_IO;
-
-	error = swsusp_alloc();
-	if (error)
-		return error;
-	
 	/* During allocating of suspend pagedir, new cold pages may appear. 
 	 * Kill them.
 	 */
@@ -811,7 +1379,6 @@ int suspend_prepare_image(void)
 int swsusp_write(void)
 {
 	int error;
-	device_resume();
 	lock_swapdevices();
 	error = write_suspend_image();
 	/* This will unlock ignored swap devices since writing is finished */
@@ -820,17 +1387,11 @@ int swsusp_write(void)
 
 }
 
-
 extern asmlinkage int swsusp_arch_suspend(void);
 extern asmlinkage int swsusp_arch_resume(void);
 
-
 asmlinkage int swsusp_save(void)
 {
-	int error = 0;
-
-	if ((error = swsusp_swap_check()))
-		return error;
 	return suspend_prepare_image();
 }
 
@@ -839,34 +1400,66 @@ int swsusp_suspend(void)
 	int error;
 	if ((error = arch_prepare_suspend()))
 		return error;
+
+	BUG_ON(irqs_disabled());
+	/* suspend swap device */
+	device_suspend_tree(PMSG_FREEZE, swsusp_dev_tree);
+
+	mb();
+	barrier();
+
+#ifdef CONFIG_PREEMPT
+	preempt_disable();
+#endif
 	local_irq_disable();
+	device_power_down_tree(PMSG_FREEZE, swsusp_dev_tree);
 	sysdev_suspend(PMSG_FREEZE);
+
 	save_processor_state();
 	error = swsusp_arch_suspend();
 	/* Restore control flow magically appears here */
 	restore_processor_state();
 	restore_highmem();
+
+	BUG_ON(!irqs_disabled());
 	sysdev_resume();
+	
+	dpm_power_up_tree(swsusp_dev_tree);
 	local_irq_enable();
+	device_resume_tree(swsusp_dev_tree);
+
 	return error;
 }
 
 
 asmlinkage int swsusp_restore(void)
 {
-	BUG_ON (pagedir_order_check != pagedir_order);
-	
 	/* Even mappings of "global" things (vmalloc) need to be fixed */
+#if defined(CONFIG_X86) || defined(CONFIG_X86_64)
 	__flush_tlb_global();
 	wbinvd();	/* Nigel says wbinvd here is good idea... */
+#endif
 	return 0;
 }
 
 int swsusp_resume(void)
 {
 	int error;
+
+	/* power down all devices except the swap device and its parent */
+	BUG_ON(irqs_disabled());
+	device_suspend_tree(PMSG_FREEZE, &default_device_tree);
+	local_irq_disable();
+	device_power_down_tree(PMSG_FREEZE, &default_device_tree);
+	local_irq_enable();
+
+#ifdef CONFIG_PREEMPT
+	preempt_disable();
+#endif
+	
 	local_irq_disable();
 	sysdev_suspend(PMSG_FREEZE);
+
 	/* We'll ignore saved state, but this gets preempt count (etc) right */
 	save_processor_state();
 	error = swsusp_arch_resume();
@@ -881,99 +1474,6 @@ int swsusp_resume(void)
 	return error;
 }
 
-
-
-/* More restore stuff */
-
-#define does_collide(addr) does_collide_order(pagedir_nosave, addr, 0)
-
-/*
- * Returns true if given address/order collides with any orig_address 
- */
-static int __init does_collide_order(suspend_pagedir_t *pagedir, unsigned long addr,
-		int order)
-{
-	int i;
-	unsigned long addre = addr + (PAGE_SIZE<<order);
-	
-	for (i=0; i < nr_copy_pages; i++)
-		if ((pagedir+i)->orig_address >= addr &&
-			(pagedir+i)->orig_address < addre)
-			return 1;
-
-	return 0;
-}
-
-/*
- * We check here that pagedir & pages it points to won't collide with pages
- * where we're going to restore from the loaded pages later
- */
-static int __init check_pagedir(void)
-{
-	int i;
-
-	for(i=0; i < nr_copy_pages; i++) {
-		unsigned long addr;
-
-		do {
-			addr = get_zeroed_page(GFP_ATOMIC);
-			if(!addr)
-				return -ENOMEM;
-		} while (does_collide(addr));
-
-		(pagedir_nosave+i)->address = addr;
-	}
-	return 0;
-}
-
-static int __init swsusp_pagedir_relocate(void)
-{
-	/*
-	 * We have to avoid recursion (not to overflow kernel stack),
-	 * and that's why code looks pretty cryptic 
-	 */
-	suspend_pagedir_t *old_pagedir = pagedir_nosave;
-	void **eaten_memory = NULL;
-	void **c = eaten_memory, *m, *f;
-	int ret = 0;
-
-	printk("Relocating pagedir ");
-
-	if (!does_collide_order(old_pagedir, (unsigned long)old_pagedir, pagedir_order)) {
-		printk("not necessary\n");
-		return check_pagedir();
-	}
-
-	while ((m = (void *) __get_free_pages(GFP_ATOMIC, pagedir_order)) != NULL) {
-		if (!does_collide_order(old_pagedir, (unsigned long)m, pagedir_order))
-			break;
-		eaten_memory = m;
-		printk( "." ); 
-		*eaten_memory = c;
-		c = eaten_memory;
-	}
-
-	if (!m) {
-		printk("out of memory\n");
-		ret = -ENOMEM;
-	} else {
-		pagedir_nosave =
-			memcpy(m, old_pagedir, PAGE_SIZE << pagedir_order);
-	}
-
-	c = eaten_memory;
-	while (c) {
-		printk(":");
-		f = c;
-		c = *c;
-		free_pages((unsigned long)f, pagedir_order);
-	}
-	if (ret)
-		return ret;
-	printk("|\n");
-	return check_pagedir();
-}
-
 /**
  *	Using bio to read from swap.
  *	This code requires a bit more work than just using buffer heads
@@ -1038,12 +1538,12 @@ static int submit(int rw, pgoff_t page_o
 	return error;
 }
 
-int bio_read_page(pgoff_t page_off, void * page)
+static int bio_read_page(pgoff_t page_off, void * page)
 {
 	return submit(READ, page_off, page);
 }
 
-int bio_write_page(pgoff_t page_off, void * page)
+static int bio_write_page(pgoff_t page_off, void * page)
 {
 	return submit(WRITE, page_off, page);
 }
@@ -1088,7 +1588,6 @@ static int __init check_header(void)
 		return -EPERM;
 	}
 	nr_copy_pages = swsusp_info.image_pages;
-	pagedir_order = get_bitmask_order(SUSPEND_PD_PAGES(nr_copy_pages));
 	return error;
 }
 
@@ -1115,62 +1614,167 @@ static int __init check_sig(void)
 	return error;
 }
 
+
+static void __init eat_progress(void)
+{
+	char *eaten_progess = "-\\|/";
+	static int eaten_i = 0;
+
+	printk("\b%c", eaten_progess[eaten_i]);
+	eaten_i ++;
+	if (eaten_i > 3) eaten_i = 0;
+}
+
+static int __init check_one_pbe(struct pbe *p, void *collide, int cur)
+{
+	unsigned long addr = 0;
+
+	pr_debug("check_one_pbe: %p %lu o{%p} ", 
+			p, p->swap_address.val, (void*)p->orig_address);
+	addr = (unsigned long)swsusp_get_safe_free_page(collide);
+	if(!addr)
+		return -ENOMEM;
+	pr_debug("c{%p} done\n", (void*)addr);
+	p->address = addr;
+
+	return 0;
+}
+
+static void __init swsusp_copy_pagedir(suspend_pagedir_t *d_pgdir, 
+		suspend_pagedir_t *s_pgdir)
+{
+	int i = 0;
+
+	while (s_pgdir != NULL) {
+		suspend_pagedir_t *s_next = (suspend_pagedir_t *)s_pgdir->dummy.val;
+		suspend_pagedir_t *d_next = (suspend_pagedir_t *)d_pgdir->dummy.val;
+		for (i = 0; i < ONE_PAGE_PBE_NUM; i++) {
+			d_pgdir->address = s_pgdir->address;
+			d_pgdir->orig_address = s_pgdir->orig_address;
+			d_pgdir->swap_address = s_pgdir->swap_address;
+			s_pgdir ++; d_pgdir ++;
+		}
+		d_pgdir = d_next;
+		s_pgdir = s_next;
+	}
+}
+/*
+ * We check here that pagedir & pages it points to won't collide with pages
+ * where we're going to restore from the loaded pages later
+ */
+static int __init check_pagedir(void)
+{
+	void **c, *f;
+	struct pbe *next, *pos;
+	int error, index;
+	suspend_pagedir_t *addr = NULL;
+	unsigned char *bitmap = collide_bitmap_init(pagedir_nosave);
+	
+	BUG_ON(bitmap == NULL);
+
+	printk("Relocating pagedir ... ");
+	error = alloc_pagedir(&addr, nr_copy_pages, bitmap, 
+			swsusp_info.pagedir_pages);
+	if (error < 0) {
+		return error;
+	}
+	swsusp_copy_pagedir(addr, pagedir_nosave);
+	pagedir_free(pagedir_nosave);
+
+	/* check copy address */
+	pbe_for_each_safe(pos, next, index, nr_copy_pages, addr) {
+		error = check_one_pbe(pos, bitmap, index);
+		BUG_ON(error);
+	}
+
+	/* free eaten memory */
+	c = eaten_memory;
+	while (c) {
+		eat_progress();
+		f = c;
+		c = *c;
+		free_pages((unsigned long)f, 0);
+	}
+	/* free unused memory */
+	collide_bitmap_free(bitmap);
+	printk("     done\n");
+		
+	pagedir_nosave = addr;
+
+	return 0;
+}
+
 /**
  *	swsusp_read_data - Read image pages from swap.
  *
- *	You do not need to check for overlaps, check_pagedir()
- *	already did that.
  */
-
 static int __init data_read(void)
 {
-	struct pbe * p;
-	int error;
-	int i;
-	int mod = nr_copy_pages / 100;
-
-	if (!mod)
-		mod = 1;
+	int error = 0, index;
+	struct pbe *pos, *next;
 
-	if ((error = swsusp_pagedir_relocate()))
+	if ((error = swsusp_check_memory(1))) {
 		return error;
+	}
+
+	if ((error = check_pagedir())) {
+		return -ENOMEM;
+	}
+
+	mod_progress = nr_copy_pages / 100;
 
 	printk( "Reading image data (%d pages):     ", nr_copy_pages );
-	for(i = 0, p = pagedir_nosave; i < nr_copy_pages && !error; i++, p++) {
-		if (!(i%mod))
-			printk( "\b\b\b\b%3d%%", i / mod );
-		error = bio_read_page(swp_offset(p->swap_address),
-				  (void *)p->address);
+	pbe_for_each_safe(pos, next, index, nr_copy_pages, pagedir_nosave) {
+		error = read_one_pbe(pos, (void*)pos->address, index);
+		if (error) break;
 	}
-	printk(" %d done.\n",i);
-	return error;
+	printk(" %d done.\n", index);
 
+	return error;
 }
 
 extern dev_t __init name_to_dev_t(const char *line);
 
-static int __init read_pagedir(void)
+static int __init read_one_pagedir(suspend_pagedir_t *pgdir, int i)
 {
-	unsigned long addr;
-	int i, n = swsusp_info.pagedir_pages;
+	unsigned long offset = swp_offset(swsusp_info.pagedir[i]);
+	unsigned long next;
 	int error = 0;
 
-	addr = __get_free_pages(GFP_ATOMIC, pagedir_order);
-	if (!addr)
-		return -ENOMEM;
-	pagedir_nosave = (struct pbe *)addr;
+	next = pgdir->dummy.val;
+	pr_debug("read_one_pagedir: %p, %d, %lu, %p\n", 
+			pgdir, i, offset, (void*)next);
+	if ((error = bio_read_page(offset, (void *)pgdir))) {
+		return error;
+	}
+	pgdir->dummy.val = next;
 
-	pr_debug("pmdisk: Reading pagedir (%d Pages)\n",n);
+	return error;
+}
 
-	for (i = 0; i < n && !error; i++, addr += PAGE_SIZE) {
-		unsigned long offset = swp_offset(swsusp_info.pagedir[i]);
-		if (offset)
-			error = bio_read_page(offset, (void *)addr);
-		else
-			error = -EFAULT;
-	}
-	if (error)
-		free_pages((unsigned long)pagedir_nosave, pagedir_order);
+/*
+ * reading pagedir from swap device 
+ */
+static int __init read_pagedir(void)
+{
+	int i = 0, n = swsusp_info.pagedir_pages;
+	int error = 0;
+	suspend_pagedir_t *pgdir, *next;
+	
+	error = alloc_pagedir(&pagedir_nosave, nr_copy_pages, NULL, n);
+	if (error < 0)
+		return -ENOMEM;
+
+	printk("pmdisk: Reading pagedir (%d Pages)\n",n);
+	pgdir_for_each_safe(pgdir, next, pagedir_nosave) {
+		error = read_one_pagedir(pgdir, i);
+		if (error) break;
+		i++;
+	}
+	BUG_ON(!error && i != n);
+	if (error)	
+		pagedir_free(pagedir_nosave);
+	
 	return error;
 }
 
@@ -1185,7 +1789,7 @@ static int __init read_suspend_image(voi
 	if ((error = read_pagedir()))
 		return error;
 	if ((error = data_read()))
-		free_pages((unsigned long)pagedir_nosave, pagedir_order);
+		pagedir_free(pagedir_nosave);
 	return error;
 }
 
@@ -1200,14 +1804,14 @@ int __init swsusp_read(void)
 	if (!strlen(resume_file))
 		return -ENOENT;
 
-	resume_device = name_to_dev_t(resume_file);
+	swsusp_resume_device = name_to_dev_t(resume_file);
 	pr_debug("swsusp: Resume From Partition: %s\n", resume_file);
 
-	resume_bdev = open_by_devnum(resume_device, FMODE_READ);
+	resume_bdev = open_by_devnum(swsusp_resume_device, FMODE_READ);
 	if (!IS_ERR(resume_bdev)) {
 		set_blocksize(resume_bdev, PAGE_SIZE);
 		error = read_suspend_image();
-		blkdev_put(resume_bdev);
+		/* blkdev_put(resume_bdev); */
 	} else
 		error = PTR_ERR(resume_bdev);
 
--- 2.6.9-lzf//kernel/sys.c	2004-11-28 23:17:11.000000000 +0800
+++ 2.6.9/kernel/sys.c	2004-11-28 23:16:53.000000000 +0800
@@ -84,7 +84,7 @@ int cad_pid = 1;
  *	and the like. 
  */
 
-static struct notifier_block *reboot_notifier_list;
+struct notifier_block *reboot_notifier_list;
 rwlock_t notifier_lock = RW_LOCK_UNLOCKED;
 
 /**
--- 2.6.9-lzf//kernel/sysctl.c	2004-11-28 23:17:12.000000000 +0800
+++ 2.6.9/kernel/sysctl.c	2004-11-28 23:16:55.000000000 +0800
@@ -66,6 +66,10 @@ extern int min_free_kbytes;
 extern int printk_ratelimit_jiffies;
 extern int printk_ratelimit_burst;
 
+#if defined(CONFIG_SOFTWARE_SUSPEND)
+extern int swsusp_pagecache;
+#endif
+
 #if defined(CONFIG_X86_LOCAL_APIC) && defined(__i386__)
 int unknown_nmi_panic;
 extern int proc_unknown_nmi_panic(ctl_table *, int, struct file *,
@@ -792,6 +796,18 @@ static ctl_table vm_table[] = {
 		.strategy	= &sysctl_intvec,
 		.extra1		= &zero,
 	},
+#if defined(CONFIG_SOFTWARE_SUSPEND)
+	{
+		.ctl_name	= VM_SWSUSP_PAGECACHE,
+		.procname	= "swsusp_pagecache",
+		.data		= &swsusp_pagecache,
+		.maxlen		= sizeof(swsusp_pagecache),
+		.mode		= 0644,
+		.proc_handler	= &proc_dointvec,
+		.strategy	= &sysctl_intvec,
+		.extra1		= &zero,
+	},
+#endif
 	{
 		.ctl_name	= VM_BLOCK_DUMP,
 		.procname	= "block_dump",
-- 
--
Hu Gang / Steve
Linux Registered User 204016
GPG Public Key: http://soulinfo.com/~hugang/hugang.asc

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: software suspend patch [3/6]
  2004-11-28 16:23     ` software suspend patch [1/6] hugang
  2004-11-28 16:24       ` software suspend patch [2/6] hugang
@ 2004-11-28 16:24       ` hugang
  2004-11-28 16:25       ` software suspend patch [4/6] hugang
                         ` (3 subsequent siblings)
  5 siblings, 0 replies; 51+ messages in thread
From: hugang @ 2004-11-28 16:24 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

On Mon, Nov 29, 2004 at 12:23:20AM +0800, hugang@soulinfo.com wrote:
> Hi Pavel Machek, Nigel Cunningham:
> 
>  device-tree.diff 
>    based on suspend2, with a few changes.
> 
>  core.diff
>   1: redefine struct pbe so that the pagedir need not be contiguous.
>   2: shrink memory as little as possible.
>   3: use a bitmap to speed up the collision check when relocating pages.
>   4: pagecache saving is ready.
> 
>  i386.diff
>  ppc.diff
>   i386 and powerpc suspend update.
> 
>  pagecachs_addon.diff
>   must be used if pagecache saving is enabled; it makes saving the
>   pagecache safe. Idea from suspend2.
> 
>   ppcfix.diff
>   fixes a compile error.
>   $ gcc -v
>    ....
>    gcc version 2.95.4 20011002 (Debian prerelease)
> 
> I'm using 2.6.9-ck3. With the above patches, swsusp1 works perfectly on my
> PowerPC and x86 PC with the highmem and preempt options enabled.
> 
> I hope core.diff@1,@2,@3, i386.diff and ppc.diff will be merged into the
> mainline kernel ASAP :). From my point of view, device-tree.diff is
> very useful when using pagecache saving, and pagecachs_addon.diff
> is really a hack for making pagecache saving safe.
> 
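
A rough C rendering of the copy loop in the swsusp.S hunk below may help
when reading the assembly. It is only a sketch: the struct layout (four
pointer-sized fields, with the fourth slot of the first entry on each
pagedir page pointing at the next pagedir page) is inferred from the
offsets 0x0, 0x4 and 0xc used in the assembly, and every sketch_ name is
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_PBES_PER_PAGE	256u	/* the assembly counts entries 0..255 */

/* Hypothetical stand-in for the patch's struct pbe. */
struct sketch_pbe {
	void *address;			/* where the saved copy sits now */
	void *orig_address;		/* where it belongs; NULL ends the list */
	unsigned long swap_address;	/* unused by the copy loop */
	struct sketch_pbe *next_page;	/* "dummy" slot, read from entry 0 only */
};

/* Walk the chained pagedir pages and copy every saved page back home.
 * Returns the number of pages copied. */
size_t sketch_restore_pages(struct sketch_pbe *pagedir)
{
	size_t copied = 0;

	while (pagedir != NULL) {
		/* Fetch the link first, as the assembly does with 0xc(%eax). */
		struct sketch_pbe *next = pagedir[0].next_page;
		unsigned int i;

		for (i = 0; i < SKETCH_PBES_PER_PAGE; i++) {
			if (pagedir[i].orig_address == NULL)
				return copied;	/* empty slot: end of image */
			assert(pagedir[i].address != NULL);
			memcpy(pagedir[i].orig_address, pagedir[i].address,
			       SKETCH_PAGE_SIZE);
			copied++;
		}
		pagedir = next;
	}
	return copied;
}
```

On a 32-bit build each entry is 16 bytes, so 256 of them fill exactly one
4096-byte page, which is why the assembly advances by $0x10 per entry.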

--- 2.6.9-lzf//arch/i386/kernel/signal.c	2004-11-28 23:17:23.000000000 +0800
+++ 2.6.9/arch/i386/kernel/signal.c	2004-11-28 23:16:59.000000000 +0800
@@ -587,6 +587,7 @@ int fastcall do_signal(struct pt_regs *r
 
 	if (current->flags & PF_FREEZE) {
 		refrigerator(0);
+		recalc_sigpending();
 		if (!signal_pending(current))
 			goto no_signal;
 	}
--- 2.6.9-lzf//arch/i386/power/swsusp.S	2004-11-26 12:32:45.000000000 +0800
+++ 2.6.9/arch/i386/power/swsusp.S	2004-11-28 23:16:59.000000000 +0800
@@ -31,24 +31,33 @@ ENTRY(swsusp_arch_resume)
 	movl $swsusp_pg_dir-__PAGE_OFFSET,%ecx
 	movl %ecx,%cr3
 
-	movl	pagedir_nosave, %ebx
-	xorl	%eax, %eax
-	xorl	%edx, %edx
-	.p2align 4,,7
-
-copy_loop:
-	movl	4(%ebx,%edx),%edi
-	movl	(%ebx,%edx),%esi
-
-	movl	$1024, %ecx
-	rep
-	movsl
-
-	incl	%eax
-	addl	$16, %edx
-	cmpl	nr_copy_pages,%eax
-	jb copy_loop
-	.p2align 4,,7
+	movl  pagedir_nosave, %eax
+	test %eax, %eax
+	je   copy_loop_end
+	movl  $1024, %edx
+
+copy_loop_start:
+	movl  0xc(%eax), %ebp
+	xorl  %ebx, %ebx
+	leal  0x0(%esi),%esi
+
+copy_one_pgdir:
+	movl  0x4(%eax),%edi
+	test %edi, %edi
+	je   copy_loop_end
+
+	movl  (%eax), %esi
+	movl  %edx, %ecx
+	repz movsl %ds:(%esi),%es:(%edi)
+
+	incl  %ebx
+	addl  $0x10, %eax
+	cmpl  $0xff, %ebx
+	jbe  copy_one_pgdir
+	test %ebp, %ebp
+	movl  %ebp, %eax
+	jne  copy_loop_start
+copy_loop_end:
 
 	movl saved_context_esp, %esp
 	movl saved_context_ebp, %ebp
-- 
--
Hu Gang / Steve
Linux Registered User 204016
GPG Public Key: http://soulinfo.com/~hugang/hugang.asc

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: software suspend patch [4/6]
  2004-11-28 16:23     ` software suspend patch [1/6] hugang
  2004-11-28 16:24       ` software suspend patch [2/6] hugang
  2004-11-28 16:24       ` software suspend patch [3/6] hugang
@ 2004-11-28 16:25       ` hugang
  2004-11-28 17:08         ` Pavel Machek
  2004-11-28 16:25       ` software suspend patch [5/6] hugang
                         ` (2 subsequent siblings)
  5 siblings, 1 reply; 51+ messages in thread
From: hugang @ 2004-11-28 16:25 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

On Mon, Nov 29, 2004 at 12:23:20AM +0800, hugang@soulinfo.com wrote:
> Hi Pavel Machek, Nigel Cunningham:
> 
>  device-tree.diff 
>    based on suspend2, with a few changes.
> 
>  core.diff
>   1: redefine struct pbe so that the pagedir need not be contiguous.
>   2: shrink memory as little as possible.
>   3: use a bitmap to speed up the collision check when relocating pages.
>   4: pagecache saving is ready.
> 
>  i386.diff
>  ppc.diff
>   i386 and powerpc suspend update.
> 
>  pagecachs_addon.diff
>   must be used if pagecache saving is enabled; it makes saving the
>   pagecache safe. Idea from suspend2.
> 
>   ppcfix.diff
>   fixes a compile error.
>   $ gcc -v
>    ....
>    gcc version 2.95.4 20011002 (Debian prerelease)
> 
> I'm using 2.6.9-ck3. With the above patches, swsusp1 works perfectly on my
> PowerPC and x86 PC with the highmem and preempt options enabled.
> 
> I hope core.diff@1,@2,@3, i386.diff and ppc.diff will be merged into the
> mainline kernel ASAP :). From my point of view, device-tree.diff is
> very useful when using pagecache saving, and pagecachs_addon.diff
> is really a hack for making pagecache saving safe.
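
Item 3 of core.diff is easier to picture with a small example. The sketch
below assumes one bit per page frame, set for every frame the image will
later be restored into, so that a candidate page can be tested in constant
time instead of scanning the whole pagedir. The internals of
collide_bitmap_init() are not shown in this part of the thread, so the
sketch_ names and the 4 KiB page shift are assumptions, not the patch's
actual code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* Hypothetical bitmap: one bit per page frame number (pfn). */
struct sketch_collide_map {
	unsigned char *bits;
	size_t nr_frames;
};

/* Allocate a zeroed map covering nr_frames frames; 0 on success, -1 on failure. */
int sketch_map_init(struct sketch_collide_map *map, size_t nr_frames)
{
	map->bits = calloc((nr_frames + CHAR_BIT - 1) / CHAR_BIT, 1);
	map->nr_frames = nr_frames;
	return map->bits ? 0 : -1;
}

void sketch_map_free(struct sketch_collide_map *map)
{
	free(map->bits);
	map->bits = NULL;
	map->nr_frames = 0;
}

/* Record that the frame containing addr is a restore target. */
void sketch_map_mark(struct sketch_collide_map *map, unsigned long addr)
{
	size_t pfn = addr >> SKETCH_PAGE_SHIFT;

	assert(pfn < map->nr_frames);
	map->bits[pfn / CHAR_BIT] |= (unsigned char)(1u << (pfn % CHAR_BIT));
}

/* Non-zero if the frame containing addr is a restore target. */
int sketch_map_collides(const struct sketch_collide_map *map, unsigned long addr)
{
	size_t pfn = addr >> SKETCH_PAGE_SHIFT;

	if (pfn >= map->nr_frames)
		return 0;
	return (map->bits[pfn / CHAR_BIT] >> (pfn % CHAR_BIT)) & 1;
}
```

A relocation loop would then keep allocating pages until
sketch_map_collides() returns 0 for the page it got, setting aside the
colliding ones until relocation is finished.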

--- 2.6.9-lzf/drivers/ide/ppc/pmac.c	2004-11-26 12:33:06.000000000 +0800
+++ 2.6.9/drivers/ide/ppc/pmac.c	2004-11-28 23:17:00.000000000 +0800
@@ -32,6 +32,7 @@
 #include <linux/notifier.h>
 #include <linux/reboot.h>
 #include <linux/pci.h>
+#include <linux/pm.h>
 #include <linux/adb.h>
 #include <linux/pmu.h>
 
@@ -1364,7 +1365,7 @@ pmac_ide_macio_suspend(struct macio_dev 
 	ide_hwif_t	*hwif = (ide_hwif_t *)dev_get_drvdata(&mdev->ofdev.dev);
 	int		rc = 0;
 
-	if (state != mdev->ofdev.dev.power_state && state >= 2) {
+	if (state != mdev->ofdev.dev.power_state && state == PM_SUSPEND_MEM) {
 		rc = pmac_ide_do_suspend(hwif);
 		if (rc == 0)
 			mdev->ofdev.dev.power_state = state;
@@ -1472,7 +1473,7 @@ pmac_ide_pci_suspend(struct pci_dev *pde
 	ide_hwif_t	*hwif = (ide_hwif_t *)pci_get_drvdata(pdev);
 	int		rc = 0;
 	
-	if (state != pdev->dev.power_state && state >= 2) {
+	if (state != pdev->dev.power_state && state == PM_SUSPEND_MEM ) {
 		rc = pmac_ide_do_suspend(hwif);
 		if (rc == 0)
 			pdev->dev.power_state = state;
--- 2.6.9-lzf/drivers/macintosh/Kconfig	2004-11-26 12:33:06.000000000 +0800
+++ 2.6.9/drivers/macintosh/Kconfig	2004-11-28 23:17:00.000000000 +0800
@@ -80,7 +80,7 @@ config ADB_PMU
 
 config PMAC_PBOOK
 	bool "Power management support for PowerBooks"
-	depends on ADB_PMU
+	depends on PM && ADB_PMU
 	---help---
 	  This provides support for putting a PowerBook to sleep; it also
 	  enables media bay support.  Power management works on the
@@ -97,11 +97,6 @@ config PMAC_PBOOK
 	  have it autoloaded. The act of removing the module shuts down the
 	  sound hardware for more power savings.
 
-config PM
-	bool
-	depends on PPC_PMAC && ADB_PMU && PMAC_PBOOK
-	default y
-
 config PMAC_APM_EMU
 	tristate "APM emulation"
 	depends on PMAC_PBOOK
--- 2.6.9-lzf/drivers/macintosh/via-pmu.c	2004-11-26 12:33:07.000000000 +0800
+++ 2.6.9/drivers/macintosh/via-pmu.c	2004-11-28 23:17:00.000000000 +0800
@@ -43,6 +43,7 @@
 #include <linux/init.h>
 #include <linux/interrupt.h>
 #include <linux/device.h>
+#include <linux/sysdev.h>
 #include <linux/suspend.h>
 #include <linux/syscalls.h>
 #include <asm/prom.h>
@@ -2326,7 +2327,7 @@ pmac_suspend_devices(void)
 	/* Sync the disks. */
 	/* XXX It would be nice to have some way to ensure that
 	 * nobody is dirtying any new buffers while we wait. That
-	 * could be acheived using the refrigerator for processes
+	 * could be achieved using the refrigerator for processes
 	 * that swsusp uses
 	 */
 	sys_sync();
@@ -2379,7 +2380,6 @@ pmac_suspend_devices(void)
 
 	/* Wait for completion of async backlight requests */
 	while (!bright_req_1.complete || !bright_req_2.complete ||
-
 			!batt_req.complete)
 		pmu_poll();
 
@@ -3048,6 +3048,88 @@ pmu_polled_request(struct adb_request *r
 }
 #endif /* DEBUG_SLEEP */
 
+
+/* FIXME: This is a temporary set of callbacks to enable us
+ * to do suspend-to-disk.
+ */
+
+#ifdef CONFIG_PM
+
+static int pmu_sys_suspended = 0;
+
+static int pmu_sys_suspend(struct sys_device *sysdev, pm_message_t state)
+{
+	if (state != PMSG_FREEZE || pmu_sys_suspended)
+		return 0;
+
+	/* Suspend PMU event interrupts */
+	pmu_suspend();
+
+	pmu_sys_suspended = 1;
+	return 0;
+}
+
+static int pmu_sys_resume(struct sys_device *sysdev)
+{
+	struct adb_request req;
+
+	if (!pmu_sys_suspended)
+		return 0;
+
+	/* Tell PMU we are ready */
+	pmu_request(&req, NULL, 2, PMU_SYSTEM_READY, 2);
+	pmu_wait_complete(&req);
+
+	/* Resume PMU event interrupts */
+	pmu_resume();
+
+	pmu_sys_suspended = 0;
+
+	return 0;
+}
+
+#endif /* CONFIG_PM */
+
+static struct sysdev_class pmu_sysclass = {
+	set_kset_name("pmu"),
+};
+
+static struct sys_device device_pmu = {
+	.id		= 0,
+	.cls		= &pmu_sysclass,
+};
+
+static struct sysdev_driver driver_pmu = {
+#ifdef CONFIG_PM
+	.suspend	= &pmu_sys_suspend,
+	.resume		= &pmu_sys_resume,
+#endif /* CONFIG_PM */
+};
+
+static int __init init_pmu_sysfs(void)
+{
+	int rc;
+
+	rc = sysdev_class_register(&pmu_sysclass);
+	if (rc) {
+		printk(KERN_ERR "Failed registering PMU sys class\n");
+		return -ENODEV;
+	}
+	rc = sysdev_register(&device_pmu);
+	if (rc) {
+		printk(KERN_ERR "Failed registering PMU sys device\n");
+		return -ENODEV;
+	}
+	rc = sysdev_driver_register(&pmu_sysclass, &driver_pmu);
+	if (rc) {
+		printk(KERN_ERR "Failed registering PMU sys driver\n");
+		return -ENODEV;
+	}
+	return 0;
+}
+
+subsys_initcall(init_pmu_sysfs);
+
 EXPORT_SYMBOL(pmu_request);
 EXPORT_SYMBOL(pmu_poll);
 EXPORT_SYMBOL(pmu_poll_adb);
--- /dev/null	2004-06-07 18:45:47.000000000 +0800
+++ 2.6.9/include/asm-ppc/suspend.h	2004-11-28 23:16:57.000000000 +0800
@@ -0,0 +1,12 @@
+static inline int arch_prepare_suspend(void)
+{
+	return 0;
+}
+
+static inline void save_processor_state(void)
+{
+}
+
+static inline void restore_processor_state(void)
+{
+}
--- 2.6.9-lzf/arch/ppc/Kconfig	2004-11-26 12:32:56.000000000 +0800
+++ 2.6.9/arch/ppc/Kconfig	2004-11-28 23:16:58.000000000 +0800
@@ -983,6 +983,8 @@ config PROC_HARDWARE
 
 source "drivers/zorro/Kconfig"
 
+source kernel/power/Kconfig
+
 endmenu
 
 menu "Bus options"
--- 2.6.9-lzf/arch/ppc/kernel/Makefile	2004-11-26 12:32:56.000000000 +0800
+++ 2.6.9/arch/ppc/kernel/Makefile	2004-11-28 23:16:58.000000000 +0800
@@ -16,6 +16,7 @@ obj-y				:= entry.o traps.o irq.o idle.o
 					semaphore.o syscalls.o setup.o \
 					cputable.o ppc_htab.o
 obj-$(CONFIG_6xx)		+= l2cr.o cpu_setup_6xx.o
+obj-$(CONFIG_SOFTWARE_SUSPEND)	+= swsusp.o
 obj-$(CONFIG_POWER4)		+= cpu_setup_power4.o
 obj-$(CONFIG_MODULES)		+= module.o ppc_ksyms.o
 obj-$(CONFIG_NOT_COHERENT_CACHE)	+= dma-mapping.o
--- 2.6.9-lzf/arch/ppc/kernel/signal.c	2004-11-26 12:32:57.000000000 +0800
+++ 2.6.9/arch/ppc/kernel/signal.c	2004-11-28 23:16:58.000000000 +0800
@@ -28,6 +28,7 @@
 #include <linux/elf.h>
 #include <linux/tty.h>
 #include <linux/binfmts.h>
+#include <linux/suspend.h>
 #include <asm/ucontext.h>
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
@@ -604,6 +605,15 @@ int do_signal(sigset_t *oldset, struct p
 	unsigned long frame, newsp;
 	int signr, ret;
 
+	if (current->flags & PF_FREEZE) {
+		refrigerator(0);
+		signr = 0;
+		ret = regs->gpr[3];
+		recalc_sigpending();
+		if (!signal_pending(current))
+			goto no_signal;
+	}
+
 	if (!oldset)
 		oldset = &current->blocked;
 
@@ -626,6 +636,7 @@ int do_signal(sigset_t *oldset, struct p
 			regs->gpr[3] = EINTR;
 			/* note that the cr0.SO bit is already set */
 		} else {
+no_signal:
 			regs->nip -= 4;	/* Back up & retry system call */
 			regs->result = 0;
 			regs->trap = 0;
--- /dev/null	2004-06-07 18:45:47.000000000 +0800
+++ 2.6.9/arch/ppc/kernel/swsusp.S	2004-11-28 23:16:57.000000000 +0800
@@ -0,0 +1,366 @@
+#include <linux/config.h>
+#include <linux/threads.h>
+#include <asm/processor.h>
+#include <asm/page.h>
+#include <asm/cputable.h>
+#include <asm/thread_info.h>
+#include <asm/ppc_asm.h>
+#include <asm/offsets.h>
+
+
+/*
+ * Structure for storing CPU registers on the save area.
+ */
+#define SL_SP		0
+#define SL_PC		4
+#define SL_MSR		8
+#define SL_SDR1		0xc
+#define SL_SPRG0	0x10	/* 4 sprg's */
+#define SL_DBAT0	0x20
+#define SL_IBAT0	0x28
+#define SL_DBAT1	0x30
+#define SL_IBAT1	0x38
+#define SL_DBAT2	0x40
+#define SL_IBAT2	0x48
+#define SL_DBAT3	0x50
+#define SL_IBAT3	0x58
+#define SL_TB		0x60
+#define SL_R2		0x68
+#define SL_CR		0x6c
+#define SL_LR		0x70
+#define SL_R12		0x74	/* r12 to r31 */
+#define SL_SIZE		(SL_R12 + 80)
+
+	.section .data
+	.align	5
+
+_GLOBAL(swsusp_save_area)
+	.space	SL_SIZE
+
+
+	.section .text
+	.align	5
+
+_GLOBAL(swsusp_arch_suspend)
+
+	lis	r11,swsusp_save_area@h
+	ori	r11,r11,swsusp_save_area@l
+
+	mflr	r0
+	stw	r0,SL_LR(r11)
+	mfcr	r0
+	stw	r0,SL_CR(r11)
+	stw	r1,SL_SP(r11)
+	stw	r2,SL_R2(r11)
+	stmw	r12,SL_R12(r11)
+
+	/* Save MSR & SDR1 */
+	mfmsr	r4
+	stw	r4,SL_MSR(r11)
+	mfsdr1	r4
+	stw	r4,SL_SDR1(r11)
+
+	/* Get a stable timebase and save it */
+1:	mftbu	r4
+	stw	r4,SL_TB(r11)
+	mftb	r5
+	stw	r5,SL_TB+4(r11)
+	mftbu	r3
+	cmpw	r3,r4
+	bne	1b
+
+	/* Save SPRGs */
+	mfsprg	r4,0
+	stw	r4,SL_SPRG0(r11)
+	mfsprg	r4,1
+	stw	r4,SL_SPRG0+4(r11)
+	mfsprg	r4,2
+	stw	r4,SL_SPRG0+8(r11)
+	mfsprg	r4,3
+	stw	r4,SL_SPRG0+12(r11)
+
+	/* Save BATs */
+	mfdbatu	r4,0
+	stw	r4,SL_DBAT0(r11)
+	mfdbatl	r4,0
+	stw	r4,SL_DBAT0+4(r11)
+	mfdbatu	r4,1
+	stw	r4,SL_DBAT1(r11)
+	mfdbatl	r4,1
+	stw	r4,SL_DBAT1+4(r11)
+	mfdbatu	r4,2
+	stw	r4,SL_DBAT2(r11)
+	mfdbatl	r4,2
+	stw	r4,SL_DBAT2+4(r11)
+	mfdbatu	r4,3
+	stw	r4,SL_DBAT3(r11)
+	mfdbatl	r4,3
+	stw	r4,SL_DBAT3+4(r11)
+	mfibatu	r4,0
+	stw	r4,SL_IBAT0(r11)
+	mfibatl	r4,0
+	stw	r4,SL_IBAT0+4(r11)
+	mfibatu	r4,1
+	stw	r4,SL_IBAT1(r11)
+	mfibatl	r4,1
+	stw	r4,SL_IBAT1+4(r11)
+	mfibatu	r4,2
+	stw	r4,SL_IBAT2(r11)
+	mfibatl	r4,2
+	stw	r4,SL_IBAT2+4(r11)
+	mfibatu	r4,3
+	stw	r4,SL_IBAT3(r11)
+	mfibatl	r4,3
+	stw	r4,SL_IBAT3+4(r11)
+
+#if  0
+	/* Backup various CPU config stuffs */
+	bl	__save_cpu_setup
+#endif
+	/* Call the low level suspend stuff (we should probably have made
+	 * a stackframe...)
+	 */
+	bl	swsusp_save
+
+	/* Restore LR from the save area */
+	lis	r11,swsusp_save_area@h
+	ori	r11,r11,swsusp_save_area@l
+	lwz	r0,SL_LR(r11)
+	mtlr	r0
+
+	blr
+
+
+/* Resume code */
+_GLOBAL(swsusp_arch_resume)
+
+	/* Stop pending AltiVec streams and memory accesses */
+BEGIN_FTR_SECTION
+	DSSALL
+END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
+ 	sync
+
+	/* Disable MSR:DR to make sure we don't take a TLB or
+	 * hash miss during the copy, as our hash table will
+	 * for a while be unusable. For .text, we assume we are
+	 * covered by a BAT. This works only for non-G5 at this
+	 * point. G5 will need a better approach, possibly using
+	 * a small temporary hash table filled with large mappings,
+	 * disabling the MMU completely isn't a good option for
+	 * performance reasons.
+	 * (Note that 750's may have the same performance issue as
+	 * the G5 in this case, we should investigate using moving
+	 * BATs for these CPUs)
+	 */
+	mfmsr	r0
+	sync
+	rlwinm	r0,r0,0,28,26		/* clear MSR_DR */
+	mtmsr	r0
+	sync
+	isync
+
+	/* Load pointer to the list of pages to copy into r9 */
+	lis	    r9,pagedir_nosave@ha
+	addi    r9,r9,pagedir_nosave@l
+	tophys(r9,r9)
+	lwz     r9, 0(r9)
+#if 0
+	twi     31,r0,0 /* trigger trap */
+#endif
+	cmpwi   r9, 0
+	beq copy_loop_end
+copy_loop:
+	tophys(r9,r9)
+	lwz    r6, 12(r9)
+	li     r10, 0
+copy_one_pgdir:
+	lwz    r11, 4(r9)
+	addi   r8,r10,1
+	cmpwi  r11, 0
+	addi   r7,r9,16
+	beq copy_loop_end
+	li     r0, 256
+	mtctr  r0
+	lwz    r9,0(r9)
+#if 0
+	twi    31,r0,0 /* trigger trap */
+#endif
+	tophys(r10,r11)
+	tophys(r11,r9)
+copy_one_page:
+	lwz    r0, 0(r11)
+	stw    r0, 0(r10)
+	lwz    r9, 4(r11)
+	stw    r9, 4(r10)
+	lwz    r0, 8(r11)
+	stw    r0, 8(r10)
+	lwz    r9, 12(r11)
+	addi   r11,r11,16
+	stw    r9, 12(r10)
+	addi   r10,r10,16
+	bdnz copy_one_page
+	mr     r10, r8
+	cmplwi r10, 255
+	mr     r9, r7
+	ble copy_one_pgdir
+	mr     r9, r6
+	bne copy_loop
+copy_loop_end:
+
+	/* Do a very simple cache flush/inval of the L1 to ensure
+	 * coherency of the icache
+	 */
+	lis	r3,0x0002
+	mtctr	r3
+	li	r3, 0
+1:
+	lwz	r0,0(r3)
+	addi	r3,r3,0x0020
+	bdnz	1b
+	isync
+	sync
+
+	/* Now flush those cache lines */
+	lis	r3,0x0002
+	mtctr	r3
+	li	r3, 0
+1:
+	dcbf	0,r3
+	addi	r3,r3,0x0020
+	bdnz	1b
+	sync
+
+	/* Ok, we are now running with the kernel data of the old
+	 * kernel fully restored. We can get to the save area
+	 * easily now. As for the rest of the code, it assumes the
+	 * loader kernel and the booted one are exactly identical
+	 */
+	lis	r11,swsusp_save_area@h
+	ori	r11,r11,swsusp_save_area@l
+	tophys(r11,r11)
+
+#if 0
+	/* Restore various CPU config stuffs */
+	bl	__restore_cpu_setup
+#endif
+	/* Restore the BATs, and SDR1.  Then we can turn on the MMU.
+	 * This is a bit hairy as we are running out of those BATs,
+	 * but first, our code is probably in the icache, and we are
+	 * writing the same value to the BAT, so that should be fine,
+	 * though a better solution will have to be found long-term
+	 */
+	lwz	r4,SL_SDR1(r11)
+	mtsdr1	r4
+	lwz	r4,SL_SPRG0(r11)
+	mtsprg	0,r4
+	lwz	r4,SL_SPRG0+4(r11)
+	mtsprg	1,r4
+	lwz	r4,SL_SPRG0+8(r11)
+	mtsprg	2,r4
+	lwz	r4,SL_SPRG0+12(r11)
+	mtsprg	3,r4
+
+#if 0
+	lwz	r4,SL_DBAT0(r11)
+	mtdbatu	0,r4
+	lwz	r4,SL_DBAT0+4(r11)
+	mtdbatl	0,r4
+	lwz	r4,SL_DBAT1(r11)
+	mtdbatu	1,r4
+	lwz	r4,SL_DBAT1+4(r11)
+	mtdbatl	1,r4
+	lwz	r4,SL_DBAT2(r11)
+	mtdbatu	2,r4
+	lwz	r4,SL_DBAT2+4(r11)
+	mtdbatl	2,r4
+	lwz	r4,SL_DBAT3(r11)
+	mtdbatu	3,r4
+	lwz	r4,SL_DBAT3+4(r11)
+	mtdbatl	3,r4
+	lwz	r4,SL_IBAT0(r11)
+	mtibatu	0,r4
+	lwz	r4,SL_IBAT0+4(r11)
+	mtibatl	0,r4
+	lwz	r4,SL_IBAT1(r11)
+	mtibatu	1,r4
+	lwz	r4,SL_IBAT1+4(r11)
+	mtibatl	1,r4
+	lwz	r4,SL_IBAT2(r11)
+	mtibatu	2,r4
+	lwz	r4,SL_IBAT2+4(r11)
+	mtibatl	2,r4
+	lwz	r4,SL_IBAT3(r11)
+	mtibatu	3,r4
+	lwz	r4,SL_IBAT3+4(r11)
+	mtibatl	3,r4
+#endif
+
+BEGIN_FTR_SECTION
+	li	r4,0
+	mtspr	SPRN_DBAT4U,r4
+	mtspr	SPRN_DBAT4L,r4
+	mtspr	SPRN_DBAT5U,r4
+	mtspr	SPRN_DBAT5L,r4
+	mtspr	SPRN_DBAT6U,r4
+	mtspr	SPRN_DBAT6L,r4
+	mtspr	SPRN_DBAT7U,r4
+	mtspr	SPRN_DBAT7L,r4
+	mtspr	SPRN_IBAT4U,r4
+	mtspr	SPRN_IBAT4L,r4
+	mtspr	SPRN_IBAT5U,r4
+	mtspr	SPRN_IBAT5L,r4
+	mtspr	SPRN_IBAT6U,r4
+	mtspr	SPRN_IBAT6L,r4
+	mtspr	SPRN_IBAT7U,r4
+	mtspr	SPRN_IBAT7L,r4
+END_FTR_SECTION_IFSET(CPU_FTR_HAS_HIGH_BATS)
+
+	/* Flush all TLBs */
+	lis	r4,0x1000
+1:	addic.	r4,r4,-0x1000
+	tlbie	r4
+	blt	1b
+	sync
+
+	/* restore the MSR and turn on the MMU */
+	lwz	r3,SL_MSR(r11)
+	bl	turn_on_mmu
+	tovirt(r11,r11)
+
+	/* Restore TB */
+	li	r3,0
+	mttbl	r3
+	lwz	r3,SL_TB(r11)
+	lwz	r4,SL_TB+4(r11)
+	mttbu	r3
+	mttbl	r4
+
+	/* Kick decrementer */
+	li	r0,1
+	mtdec	r0
+
+	/* Restore the callee-saved registers and return */
+	lwz	r0,SL_CR(r11)
+	mtcr	r0
+	lwz	r2,SL_R2(r11)
+	lmw	r12,SL_R12(r11)
+	lwz	r1,SL_SP(r11)
+	lwz	r0,SL_LR(r11)
+	mtlr	r0
+
+	// XXX Note: we don't really need to call swsusp_resume
+
+	li	r3,0
+	blr
+
+/* FIXME:This construct is actually not useful since we don't shut
+ * down the instruction MMU, we could just flip back MSR-DR on.
+ */
+turn_on_mmu:
+	mflr	r4
+	mtsrr0	r4
+	mtsrr1	r3
+	sync
+	isync
+	rfi
+
--- 2.6.9-lzf/arch/ppc/kernel/vmlinux.lds.S	2004-11-26 12:32:57.000000000 +0800
+++ 2.6.9/arch/ppc/kernel/vmlinux.lds.S	2004-11-28 23:16:58.000000000 +0800
@@ -74,6 +74,12 @@ SECTIONS
     CONSTRUCTORS
   }
 
+  . = ALIGN(4096);
+  __nosave_begin = .;
+  .data_nosave : { *(.data.nosave) }
+  . = ALIGN(4096);
+  __nosave_end = .;
+
   . = ALIGN(32);
   .data.cacheline_aligned : { *(.data.cacheline_aligned) }
 
--- 2.6.9-lzf/arch/ppc/platforms/pmac_feature.c	2004-11-27 17:33:17.000000000 +0800
+++ 2.6.9/arch/ppc/platforms/pmac_feature.c	2004-11-28 23:16:59.000000000 +0800
@@ -2146,7 +2146,7 @@ static struct pmac_mb_def pmac_mb_defs[]
 	},
 	{	"PowerBook6,1",			"PowerBook G4 12\"",
 		PMAC_TYPE_UNKNOWN_INTREPID,	intrepid_features,
-		PMAC_MB_HAS_FW_POWER | PMAC_MB_MOBILE,
+		PMAC_MB_CAN_SLEEP | PMAC_MB_HAS_FW_POWER | PMAC_MB_MOBILE,
 	},
 	{	"PowerBook6,2",			"PowerBook G4",
 		PMAC_TYPE_UNKNOWN_INTREPID,	intrepid_features,
--- 2.6.9-lzf/arch/ppc/platforms/pmac_setup.c	2004-11-26 12:32:57.000000000 +0800
+++ 2.6.9/arch/ppc/platforms/pmac_setup.c	2004-11-28 23:16:59.000000000 +0800
@@ -51,6 +51,7 @@
 #include <linux/irq.h>
 #include <linux/seq_file.h>
 #include <linux/root_dev.h>
+#include <linux/suspend.h>
 
 #include <asm/reg.h>
 #include <asm/sections.h>
@@ -70,6 +71,8 @@
 #include <asm/pmac_feature.h>
 #include <asm/time.h>
 #include <asm/of_device.h>
+#include <asm/mmu_context.h>
+
 #include "pmac_pic.h"
 #include "mem_pieces.h"
 
@@ -420,11 +423,67 @@ find_boot_device(void)
 #endif
 }
 
+/* TODO: Merge the suspend-to-ram with the common code !!!
+ * currently, this is a stub implementation for suspend-to-disk
+ * only
+ */
+
+#ifdef CONFIG_PM
+
+extern void enable_kernel_altivec(void);
+
+static int pmac_pm_prepare(suspend_state_t state)
+{
+	printk(KERN_DEBUG "pmac_pm_prepare(%d)\n", state);
+
+	return 0;
+}
+
+static int pmac_pm_enter(suspend_state_t state)
+{
+	printk(KERN_DEBUG "pmac_pm_enter(%d)\n", state);
+
+	/* Give up the lazy FPU & vec so we don't have to back them
+	 * up from the low level code
+	 */
+	enable_kernel_fp();
+
+#ifdef CONFIG_ALTIVEC
+	if (cur_cpu_spec[0]->cpu_features & CPU_FTR_ALTIVEC)
+		enable_kernel_altivec();
+#endif /* CONFIG_ALTIVEC */
+
+	return 0;
+}
+
+static int pmac_pm_finish(suspend_state_t state)
+{
+	printk(KERN_DEBUG "pmac_pm_finish(%d)\n", state);
+
+	/* Restore userland MMU context */
+	set_context(current->active_mm->context, current->active_mm->pgd);
+
+	return 0;
+}
+
+static struct pm_ops pmac_pm_ops = {
+	.pm_disk_mode	= PM_DISK_SHUTDOWN,
+	.prepare	= pmac_pm_prepare,
+	.enter		= pmac_pm_enter,
+	.finish		= pmac_pm_finish,
+};
+
+#endif /* CONFIG_PM */
+
 static int initializing = 1;
 
 static int pmac_late_init(void)
 {
 	initializing = 0;
+
+#ifdef CONFIG_PM
+	pm_set_ops(&pmac_pm_ops);
+#endif /* CONFIG_PM */
 	return 0;
 }
 
-- 
--
Hu Gang / Steve
Linux Registered User 204016
GPG Public Key: http://soulinfo.com/~hugang/hugang.asc

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: software suspend patch [5/6]
  2004-11-28 16:23     ` software suspend patch [1/6] hugang
                         ` (2 preceding siblings ...)
  2004-11-28 16:25       ` software suspend patch [4/6] hugang
@ 2004-11-28 16:25       ` hugang
  2004-11-28 16:25       ` software suspend patch [1/6] hugang
  2004-11-28 16:58       ` Pavel Machek
  5 siblings, 0 replies; 51+ messages in thread
From: hugang @ 2004-11-28 16:25 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

On Mon, Nov 29, 2004 at 12:23:20AM +0800, hugang@soulinfo.com wrote:
> Hi Pavel Machek, Nigel Cunningham:
> 
>  device-tree.diff 
>    based on suspend2, with a few changes.
> 
>  core.diff
>   1: redefine struct pbe so that the pagedir need not be contiguous.
>   2: shrink memory as little as possible.
>   3: use a bitmap to speed up the collision check when relocating pages.
>   4: pagecache saving is ready.
> 
>  i386.diff
>  ppc.diff
>   i386 and powerpc suspend update.
> 
>  pagecachs_addon.diff
>   must be used if pagecache saving is enabled; it makes saving the
>   pagecache safe. Idea from suspend2.
> 
>   ppcfix.diff
>   fixes a compile error.
>   $ gcc -v
>    ....
>    gcc version 2.95.4 20011002 (Debian prerelease)
> 
> I'm using 2.6.9-ck3. With the above patches, swsusp1 works perfectly on my
> PowerPC and x86 PC with the highmem and preempt options enabled.
> 
> I hope core.diff@1,@2,@3, i386.diff and ppc.diff will be merged into the
> mainline kernel ASAP :). From my point of view, device-tree.diff is
> very useful when using pagecache saving, and pagecachs_addon.diff
> is really a hack for making pagecache saving safe.
> 
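
The page-writeback.c hunk below makes wb_kupdate() skip its writeback pass
while swsusp_pagecache is 2 and only work out when to run next. A minimal
sketch of that rescheduling arithmetic follows; the function name and its
parameters are hypothetical stand-ins for jiffies, HZ and
dirty_writeback_centisecs, not kernel API.

```c
#include <assert.h>

/*
 * Hypothetical helper: tick at which the writeback timer should fire next.
 *   now       - current tick count (jiffies in the kernel)
 *   hz        - ticks per second
 *   centisecs - writeback interval in units of 1/100 s
 */
unsigned long sketch_next_writeback(unsigned long now, unsigned long hz,
				    unsigned long centisecs)
{
	unsigned long next = now + (centisecs * hz) / 100;

	assert(hz > 0);
	/* Never reschedule sooner than one second from now. The signed
	 * difference keeps the comparison valid across counter wrap-around,
	 * which is the idea behind the kernel's time_before(). */
	if ((long)(next - (now + hz)) < 0)
		next = now + hz;
	return next;
}
```

With an interval of 500 centiseconds and 1000 ticks per second the next
run lands 5000 ticks after now; an interval shorter than one second is
clamped to one second.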

--- 2.6.9-lzf/kernel/sched.c	2004-11-28 23:17:11.000000000 +0800
+++ 2.6.9/kernel/sched.c	2004-11-28 23:16:54.000000000 +0800
@@ -2656,6 +2656,12 @@ asmlinkage void __sched schedule(void)
 	 * Otherwise, whine if we are scheduling when we should not be.
 	 */
 	if (likely(!(current->state & (TASK_DEAD | TASK_ZOMBIE)))) {
+#ifdef CONFIG_PM
+		extern int swsusp_pagecache;
+		if (unlikely(swsusp_pagecache == 2)) /* silence warning message when
+												writing pagecache */
+#endif
+
 		if (unlikely(in_atomic())) {
 			printk(KERN_ERR "bad: scheduling while atomic!\n");
 			dump_stack();
--- 2.6.9-lzf/mm/page-writeback.c	2004-11-25 14:06:02.000000000 +0800
+++ 2.6.9/mm/page-writeback.c	2004-11-29 00:07:13.000000000 +0800
@@ -359,6 +359,9 @@ static void wb_kupdate(unsigned long arg
 	unsigned long start_jif;
 	unsigned long next_jif;
 	long nr_to_write;
+#ifdef CONFIG_PM
+	extern int swsusp_pagecache;
+#endif
 	struct writeback_state wbs;
 	struct writeback_control wbc = {
 		.bdi		= NULL,
@@ -369,6 +372,14 @@ static void wb_kupdate(unsigned long arg
 		.for_kupdate	= 1,
 	};
 
+#ifdef CONFIG_PM
+	if (unlikely(swsusp_pagecache == 2)) {
+		start_jif = jiffies;
+		next_jif = start_jif + (dirty_writeback_centisecs * HZ) / 100;
+		goto out;
+	}
+#endif
+
 	sync_supers();
 
 	get_writeback_state(&wbs);
@@ -389,6 +400,7 @@ static void wb_kupdate(unsigned long arg
 		}
 		nr_to_write -= MAX_WRITEBACK_PAGES - wbc.nr_to_write;
 	}
+out:
 	if (time_before(next_jif, jiffies + HZ))
 		next_jif = jiffies + HZ;
 	if (dirty_writeback_centisecs)
-- 
--
Hu Gang / Steve
Linux Registered User 204016
GPG Public Key: http://soulinfo.com/~hugang/hugang.asc

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: software suspend patch [1/6]
  2004-11-28 16:23     ` software suspend patch [1/6] hugang
                         ` (3 preceding siblings ...)
  2004-11-28 16:25       ` software suspend patch [5/6] hugang
@ 2004-11-28 16:25       ` hugang
  2004-11-28 17:11         ` Pavel Machek
  2004-11-28 16:58       ` Pavel Machek
  5 siblings, 1 reply; 51+ messages in thread
From: hugang @ 2004-11-28 16:25 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

On Mon, Nov 29, 2004 at 12:23:20AM +0800, hugang@soulinfo.com wrote:
> Hi Pavel Machek, Nigel Cunningham:
> 
>  device-tree.diff 
>    based on suspend2, with a few changes.
> 
>  core.diff
>   1: redefine struct pbe so that the pagedir need not be contiguous.
>   2: shrink memory as little as possible.
>   3: use a bitmap to speed up the collision check when relocating pages.
>   4: pagecache saving is ready.
> 
>  i386.diff
>  ppc.diff
>   i386 and powerpc suspend update.
> 
>  pagecachs_addon.diff
>   must be used if pagecache saving is enabled; it makes saving the
>   pagecache safe. Idea from suspend2.
> 
>   ppcfix.diff
>   fixes a compile error.
>   $ gcc -v
>    ....
>    gcc version 2.95.4 20011002 (Debian prerelease)
> 
> I'm using 2.6.9-ck3. With the above patches, swsusp1 works perfectly on my
> PowerPC and x86 PC with the highmem and preempt options enabled.
> 
> I hope core.diff@1,@2,@3, i386.diff and ppc.diff will be merged into the
> mainline kernel ASAP :). From my point of view, device-tree.diff is
> very useful when using pagecache saving, and pagecachs_addon.diff
> is really a hack for making pagecache saving safe.
> 

--- 2.6.9-lzf/arch/ppc/syslib/open_pic.c	2004-11-26 12:32:58.000000000 +0800
+++ 2.6.9/arch/ppc/syslib/open_pic.c	2004-11-28 23:16:58.000000000 +0800
@@ -776,7 +776,8 @@ static void openpic_mapirq(u_int irq, cp
 	if (ISR[irq] == 0)
 		return;
 	if (!cpus_empty(keepmask)) {
-		cpumask_t irqdest = { .bits[0] = openpic_read(&ISR[irq]->Destination) };
+		cpumask_t irqdest;
+		irqdest.bits[0] = openpic_read(&ISR[irq]->Destination);
 		cpus_and(irqdest, irqdest, keepmask);
 		cpus_or(physmask, physmask, irqdest);
 	}
-- 
--
Hu Gang / Steve
Linux Registered User 204016
GPG Public Key: http://soulinfo.com/~hugang/hugang.asc

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: software suspend patch [1/6]
  2004-11-28 16:23     ` software suspend patch [1/6] hugang
                         ` (4 preceding siblings ...)
  2004-11-28 16:25       ` software suspend patch [1/6] hugang
@ 2004-11-28 16:58       ` Pavel Machek
  2004-11-29 15:43         ` hugang
  5 siblings, 1 reply; 51+ messages in thread
From: Pavel Machek @ 2004-11-28 16:58 UTC (permalink / raw)
  To: hugang; +Cc: linux-kernel

Hi!

I cannot merge anything before 2.6.10. As you have seen, I have quite
a lot of patches in my tree, and I do not want to mix them with these...

>  device-tree.diff 
>    based on suspend2, with small changes.

I do not want this one.

>  core.diff
>   1: redefine struct pbe so the pagedir can use non-contiguous pages.

Can I get this one as a separate diff?

>   2: shrink memory as little as possible.
>   3: use a bitmap to speed up the collision check when relocating pages.
>   4: ready for pagecache saving.
> 
>  i386.diff
>  ppc.diff
>   i386 and powerpc suspend update.

The ppc changes look good; you should send them to the ppc maintainer...

>  pagecachs_addon.diff
>   required if pagecache saving is enabled; it makes saving the
>   pagecache safe. The idea is from suspend2.
> 
>   ppcfix.diff
>   fix compile error. 
>   $ gcc -v
>    .... 
>    gcc version 2.95.4 20011002 (Debian prerelease)

Send this one to Andrew Morton now; it is a bugfix.
								Pavel

-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: software suspend patch [2/6]
  2004-11-28 16:24       ` software suspend patch [2/6] hugang
@ 2004-11-28 17:03         ` Pavel Machek
  0 siblings, 0 replies; 51+ messages in thread
From: Pavel Machek @ 2004-11-28 17:03 UTC (permalink / raw)
  To: hugang; +Cc: linux-kernel

Hi!

> @@ -222,8 +221,105 @@ static void lock_swapdevices(void)
>  		}
>  	swap_list_unlock();
>  }
> +	
> +#define ONE_PAGE_PBE_NUM	(PAGE_SIZE/sizeof(struct pbe))
> +#define PBE_IS_PAGE_END(x)  \
> +	( PAGE_SIZE - sizeof(struct pbe) == ((x) - ((~(PAGE_SIZE - 1)) & (x))) )
> +
> +#define pgdir_for_each_safe(pos, n, head) \
> +	for(pos = head, n = pos ? (suspend_pagedir_t*)pos->dummy.val : NULL; \
> +		pos != NULL; \
> +		pos = n, n = pos ? (suspend_pagedir_t *)pos->dummy.val : NULL)
> +
> +#define pbe_for_each_safe(pos, n, index, max, head) \
> +	for(pos = head, index = 0, \
> +			n = pos ? (struct pbe *)pos->dummy.val : NULL; \
> +		(pos != NULL) && (index < max); \
> +		pos = (PBE_IS_PAGE_END((unsigned long)pos)) ? n : \
> +			((struct pbe *)((unsigned long)pos + sizeof(struct pbe))), \
> +			index ++, \
> +			n = pos ? (struct pbe*)pos->dummy.val : NULL)
> +

The _safe suffix means it is safe to delete entries while traversing.
I do not think your macros can handle that, so they should not have
the _safe suffix.
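
For readers unfamiliar with the convention, here is a minimal userspace
sketch of what a _safe iterator is expected to guarantee. The names
(struct node, node_for_each, node_for_each_safe, push, sum, free_all)
are made up for illustration and do not come from the patch; the only
point is that the successor is fetched before the loop body runs, so
the body may free the current element.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked node, used only for this illustration. */
struct node {
	int val;
	struct node *next;
};

/* Plain iterator: reads pos->next after the body, so the body must not free pos. */
#define node_for_each(pos, head) \
	for (pos = (head); pos != NULL; pos = pos->next)

/* _safe iterator: caches the successor in n before the body runs. */
#define node_for_each_safe(pos, n, head) \
	for (pos = (head), n = pos ? pos->next : NULL; \
	     pos != NULL; \
	     pos = n, n = pos ? pos->next : NULL)

/* Prepend a value and return the new head. */
static struct node *push(struct node *head, int val)
{
	struct node *nd = malloc(sizeof(*nd));

	if (!nd)
		abort();
	nd->val = val;
	nd->next = head;
	return nd;
}

/* Read-only walk: the plain iterator is enough. */
static int sum(struct node *head)
{
	struct node *pos;
	int total = 0;

	node_for_each(pos, head)
		total += pos->val;
	return total;
}

/* Frees every node during the walk; only valid with the _safe form. */
static int free_all(struct node *head)
{
	struct node *pos, *n;
	int freed = 0;

	node_for_each_safe(pos, n, head) {
		free(pos);
		freed++;
	}
	return freed;
}
```

An iterator that advances by pointer arithmetic inside a page, and only
follows a link at the page boundary, needs extra care before it can make
the same promise when whole pages are freed in the loop body.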
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: software suspend patch [4/6]
  2004-11-28 16:25       ` software suspend patch [4/6] hugang
@ 2004-11-28 17:08         ` Pavel Machek
  0 siblings, 0 replies; 51+ messages in thread
From: Pavel Machek @ 2004-11-28 17:08 UTC (permalink / raw)
  To: hugang; +Cc: linux-kernel

Hi!

> --- 2.6.9-lzf/drivers/ide/ppc/pmac.c	2004-11-26 12:33:06.000000000 +0800
> +++ 2.6.9/drivers/ide/ppc/pmac.c	2004-11-28 23:17:00.000000000 +0800
> @@ -32,6 +32,7 @@
>  #include <linux/notifier.h>
>  #include <linux/reboot.h>
>  #include <linux/pci.h>
> +#include <linux/pm.h>
>  #include <linux/adb.h>
>  #include <linux/pmu.h>
>  
> @@ -1364,7 +1365,7 @@ pmac_ide_macio_suspend(struct macio_dev 
>  	ide_hwif_t	*hwif = (ide_hwif_t *)dev_get_drvdata(&mdev->ofdev.dev);
>  	int		rc = 0;
>  
> -	if (state != mdev->ofdev.dev.power_state && state >= 2) {
> +	if (state != mdev->ofdev.dev.power_state && state == PM_SUSPEND_MEM) {
>  		rc = pmac_ide_do_suspend(hwif);
>  		if (rc == 0)
>  			mdev->ofdev.dev.power_state = state;
> @@ -1472,7 +1473,7 @@ pmac_ide_pci_suspend(struct pci_dev *pde
>  	ide_hwif_t	*hwif = (ide_hwif_t *)pci_get_drvdata(pdev);
>  	int		rc = 0;
>  	
> -	if (state != pdev->dev.power_state && state >= 2) {
> +	if (state != pdev->dev.power_state && state == PM_SUSPEND_MEM ) {
>  		rc = pmac_ide_do_suspend(hwif);
>  		if (rc == 0)
>  			pdev->dev.power_state = state;

Please wait with this one.

> --- 2.6.9-lzf/drivers/macintosh/Kconfig	2004-11-26 12:33:06.000000000 +0800
> +++ 2.6.9/drivers/macintosh/Kconfig	2004-11-28 23:17:00.000000000 +0800
> @@ -80,7 +80,7 @@ config ADB_PMU
>  
>  config PMAC_PBOOK
>  	bool "Power management support for PowerBooks"
> -	depends on ADB_PMU
> +	depends on PM && ADB_PMU
>  	---help---
>  	  This provides support for putting a PowerBook to sleep; it also
>  	  enables media bay support.  Power management works on the
> @@ -97,11 +97,6 @@ config PMAC_PBOOK
>  	  have it autoloaded. The act of removing the module shuts down the
>  	  sound hardware for more power savings.
>  
> -config PM
> -	bool
> -	depends on PPC_PMAC && ADB_PMU && PMAC_PBOOK
> -	default y
> -
>  config PMAC_APM_EMU
>  	tristate "APM emulation"
>  	depends on PMAC_PBOOK

Ok, merge with BenH.

> --- 2.6.9-lzf/drivers/macintosh/via-pmu.c	2004-11-26 12:33:07.000000000 +0800
> +++ 2.6.9/drivers/macintosh/via-pmu.c	2004-11-28 23:17:00.000000000 +0800
> @@ -43,6 +43,7 @@
>  #include <linux/init.h>
>  #include <linux/interrupt.h>
>  #include <linux/device.h>
> +#include <linux/sysdev.h>
>  #include <linux/suspend.h>
>  #include <linux/syscalls.h>
>  #include <asm/prom.h>
> @@ -2326,7 +2327,7 @@ pmac_suspend_devices(void)
>  	/* Sync the disks. */
>  	/* XXX It would be nice to have some way to ensure that
>  	 * nobody is dirtying any new buffers while we wait. That
> -	 * could be acheived using the refrigerator for processes
> +	 * could be achieved using the refrigerator for processes
>  	 * that swsusp uses
>  	 */
>  	sys_sync();
> @@ -2379,7 +2380,6 @@ pmac_suspend_devices(void)
>  
>  	/* Wait for completion of async backlight requests */
>  	while (!bright_req_1.complete || !bright_req_2.complete ||
> -
>  			!batt_req.complete)
>  		pmu_poll();
>  
> @@ -3048,6 +3048,88 @@ pmu_polled_request(struct adb_request *r
>  }
>  #endif /* DEBUG_SLEEP */
>  
> +
> +/* FIXME: This is a temporary set of callbacks to enable us
> + * to do suspend-to-disk.
> + */
> +
> +#ifdef CONFIG_PM
> +
> +static int pmu_sys_suspended = 0;
> +
> +static int pmu_sys_suspend(struct sys_device *sysdev, pm_message_t state)
> +{
> +	if (state != PMSG_FREEZE || pmu_sys_suspended)
> +		return 0;
> +
> +	/* Suspend PMU event interrupts */
> +	pmu_suspend();
> +
> +	pmu_sys_suspended = 1;
> +	return 0;
> +}
> +
> +static int pmu_sys_resume(struct sys_device *sysdev)
> +{
> +	struct adb_request req;
> +
> +	if (!pmu_sys_suspended)
> +		return 0;
> +
> +	/* Tell PMU we are ready */
> +	pmu_request(&req, NULL, 2, PMU_SYSTEM_READY, 2);
> +	pmu_wait_complete(&req);
> +
> +	/* Resume PMU event interrupts */
> +	pmu_resume();
> +
> +	pmu_sys_suspended = 0;
> +
> +	return 0;
> +}
> +
> +#endif /* CONFIG_PM */
> +
> +static struct sysdev_class pmu_sysclass = {
> +	set_kset_name("pmu"),
> +};
> +
> +static struct sys_device device_pmu = {
> +	.id		= 0,
> +	.cls		= &pmu_sysclass,
> +};
> +
> +static struct sysdev_driver driver_pmu = {
> +#ifdef CONFIG_PM
> +	.suspend	= &pmu_sys_suspend,
> +	.resume		= &pmu_sys_resume,
> +#endif /* CONFIG_PM */
> +};
> +
> +static int __init init_pmu_sysfs(void)
> +{
> +	int rc;
> +
> +	rc = sysdev_class_register(&pmu_sysclass);
> +	if (rc) {
> +		printk(KERN_ERR "Failed registering PMU sys class\n");
> +		return -ENODEV;
> +	}
> +	rc = sysdev_register(&device_pmu);
> +	if (rc) {
> +		printk(KERN_ERR "Failed registering PMU sys device\n");
> +		return -ENODEV;
> +	}
> +	rc = sysdev_driver_register(&pmu_sysclass, &driver_pmu);
> +	if (rc) {
> +		printk(KERN_ERR "Failed registering PMU sys driver\n");
> +		return -ENODEV;
> +	}
> +	return 0;
> +}

The error handling is not okay: 
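
The comment above does not spell out the objection. One plausible
reading (an assumption, not something stated in this thread) is that
each failure path replaces the real error code with -ENODEV and leaves
the earlier registrations in place. A minimal userspace sketch of the
usual goto-unwind idiom follows; fake_register(), fake_unregister(),
fail_at and registered are hypothetical stand-ins for the three sysdev
registration steps, not kernel APIs.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical test knobs: which step (1..3) fails, and what is registered. */
static int fail_at;
static int registered;	/* bitmask, bit N set while step N is registered */

/* Stand-in for one registration call; fails with -EBUSY at step fail_at. */
static int fake_register(int step)
{
	if (step == fail_at)
		return -EBUSY;
	registered |= 1 << step;
	return 0;
}

/* Stand-in for the matching unregistration call. */
static void fake_unregister(int step)
{
	registered &= ~(1 << step);
}

/* Register class (1), device (2), driver (3); undo earlier steps on failure. */
static int init_sketch(void)
{
	int rc;

	rc = fake_register(1);
	if (rc)
		return rc;		/* nothing to undo yet */
	rc = fake_register(2);
	if (rc)
		goto err_class;
	rc = fake_register(3);
	if (rc)
		goto err_device;
	return 0;

err_device:
	fake_unregister(2);
err_class:
	fake_unregister(1);
	return rc;			/* propagate the original error code */
}
```

Whether this is the change the reviewer had in mind is a guess; the
sketch only shows the general shape of propagating rc and unwinding in
reverse order.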

> --- /dev/null	2004-06-07 18:45:47.000000000 +0800
> +++ 2.6.9/include/asm-ppc/suspend.h	2004-11-28 23:16:57.000000000 +0800
> @@ -0,0 +1,12 @@
> +static inline int arch_prepare_suspend(void)
> +{
> +	return 0;
> +}
> +
> +static inline void save_processor_state(void)
> +{
> +}
> +
> +static inline void restore_processor_state(void)
> +{
> +}
> --- 2.6.9-lzf/arch/ppc/Kconfig	2004-11-26 12:32:56.000000000 +0800
> +++ 2.6.9/arch/ppc/Kconfig	2004-11-28 23:16:58.000000000 +0800
> @@ -983,6 +983,8 @@ config PROC_HARDWARE
>  
>  source "drivers/zorro/Kconfig"
>  
> +source kernel/power/Kconfig
> +
>  endmenu
>  
>  menu "Bus options"
> --- 2.6.9-lzf/arch/ppc/kernel/Makefile	2004-11-26 12:32:56.000000000 +0800
> +++ 2.6.9/arch/ppc/kernel/Makefile	2004-11-28 23:16:58.000000000 +0800
> @@ -16,6 +16,7 @@ obj-y				:= entry.o traps.o irq.o idle.o
>  					semaphore.o syscalls.o setup.o \
>  					cputable.o ppc_htab.o
>  obj-$(CONFIG_6xx)		+= l2cr.o cpu_setup_6xx.o
> +obj-$(CONFIG_SOFTWARE_SUSPEND)	+= swsusp.o
>  obj-$(CONFIG_POWER4)		+= cpu_setup_power4.o
>  obj-$(CONFIG_MODULES)		+= module.o ppc_ksyms.o
>  obj-$(CONFIG_NOT_COHERENT_CACHE)	+= dma-mapping.o
> --- 2.6.9-lzf/arch/ppc/kernel/signal.c	2004-11-26 12:32:57.000000000 +0800
> +++ 2.6.9/arch/ppc/kernel/signal.c	2004-11-28 23:16:58.000000000 +0800
> @@ -28,6 +28,7 @@
>  #include <linux/elf.h>
>  #include <linux/tty.h>
>  #include <linux/binfmts.h>
> +#include <linux/suspend.h>
>  #include <asm/ucontext.h>
>  #include <asm/uaccess.h>
>  #include <asm/pgtable.h>
> @@ -604,6 +605,15 @@ int do_signal(sigset_t *oldset, struct p
>  	unsigned long frame, newsp;
>  	int signr, ret;
>  
> +	if (current->flags & PF_FREEZE) {
> +		refrigerator(0);
> +		signr = 0;
> +		ret = regs->gpr[3];
> +		recalc_sigpending();
> +		if (!signal_pending(current))
> +			goto no_signal;
> +	}
> +
>  	if (!oldset)
>  		oldset = &current->blocked;
>  
> @@ -626,6 +636,7 @@ int do_signal(sigset_t *oldset, struct p
>  			regs->gpr[3] = EINTR;
>  			/* note that the cr0.SO bit is already set */
>  		} else {
> +no_signal:
>  			regs->nip -= 4;	/* Back up & retry system call */
>  			regs->result = 0;
>  			regs->trap = 0;

Ok, merge with BenH.

> --- /dev/null	2004-06-07 18:45:47.000000000 +0800
> +++ 2.6.9/arch/ppc/kernel/swsusp.S	2004-11-28 23:16:57.000000000 +0800
> @@ -0,0 +1,366 @@
> +#include <linux/config.h>
> +#include <linux/threads.h>
> +#include <asm/processor.h>
> +#include <asm/page.h>
> +#include <asm/cputable.h>
> +#include <asm/thread_info.h>
> +#include <asm/ppc_asm.h>
> +#include <asm/offsets.h>
> +
> +
> +/*
> + * Structure for storing CPU registers on the save area.
> + */
> +#define SL_SP		0
> +#define SL_PC		4
> +#define SL_MSR		8
> +#define SL_SDR1		0xc
> +#define SL_SPRG0	0x10	/* 4 sprg's */
> +#define SL_DBAT0	0x20
> +#define SL_IBAT0	0x28
> +#define SL_DBAT1	0x30
> +#define SL_IBAT1	0x38
> +#define SL_DBAT2	0x40
> +#define SL_IBAT2	0x48
> +#define SL_DBAT3	0x50
> +#define SL_IBAT3	0x58
> +#define SL_TB		0x60
> +#define SL_R2		0x68
> +#define SL_CR		0x6c
> +#define SL_LR		0x70
> +#define SL_R12		0x74	/* r12 to r31 */
> +#define SL_SIZE		(SL_R12 + 80)
> +
> +	.section .data
> +	.align	5
> +
> +_GLOBAL(swsusp_save_area)
> +	.space	SL_SIZE
> +
> +
> +	.section .text
> +	.align	5
> +
> +_GLOBAL(swsusp_arch_suspend)
> +
> +	lis	r11,swsusp_save_area@h
> +	ori	r11,r11,swsusp_save_area@l
> +
> +	mflr	r0
> +	stw	r0,SL_LR(r11)
> +	mfcr	r0
> +	stw	r0,SL_CR(r11)
> +	stw	r1,SL_SP(r11)
> +	stw	r2,SL_R2(r11)
> +	stmw	r12,SL_R12(r11)
> +
> +	/* Save MSR & SDR1 */
> +	mfmsr	r4
> +	stw	r4,SL_MSR(r11)
> +	mfsdr1	r4
> +	stw	r4,SL_SDR1(r11)
> +
> +	/* Get a stable timebase and save it */
> +1:	mftbu	r4
> +	stw	r4,SL_TB(r11)
> +	mftb	r5
> +	stw	r5,SL_TB+4(r11)
> +	mftbu	r3
> +	cmpw	r3,r4
> +	bne	1b
> +
> +	/* Save SPRGs */
> +	mfsprg	r4,0
> +	stw	r4,SL_SPRG0(r11)
> +	mfsprg	r4,1
> +	stw	r4,SL_SPRG0+4(r11)
> +	mfsprg	r4,2
> +	stw	r4,SL_SPRG0+8(r11)
> +	mfsprg	r4,3
> +	stw	r4,SL_SPRG0+12(r11)
> +
> +	/* Save BATs */
> +	mfdbatu	r4,0
> +	stw	r4,SL_DBAT0(r11)
> +	mfdbatl	r4,0
> +	stw	r4,SL_DBAT0+4(r11)
> +	mfdbatu	r4,1
> +	stw	r4,SL_DBAT1(r11)
> +	mfdbatl	r4,1
> +	stw	r4,SL_DBAT1+4(r11)
> +	mfdbatu	r4,2
> +	stw	r4,SL_DBAT2(r11)
> +	mfdbatl	r4,2
> +	stw	r4,SL_DBAT2+4(r11)
> +	mfdbatu	r4,3
> +	stw	r4,SL_DBAT3(r11)
> +	mfdbatl	r4,3
> +	stw	r4,SL_DBAT3+4(r11)
> +	mfibatu	r4,0
> +	stw	r4,SL_IBAT0(r11)
> +	mfibatl	r4,0
> +	stw	r4,SL_IBAT0+4(r11)
> +	mfibatu	r4,1
> +	stw	r4,SL_IBAT1(r11)
> +	mfibatl	r4,1
> +	stw	r4,SL_IBAT1+4(r11)
> +	mfibatu	r4,2
> +	stw	r4,SL_IBAT2(r11)
> +	mfibatl	r4,2
> +	stw	r4,SL_IBAT2+4(r11)
> +	mfibatu	r4,3
> +	stw	r4,SL_IBAT3(r11)
> +	mfibatl	r4,3
> +	stw	r4,SL_IBAT3+4(r11)
> +
> +#if  0
> +	/* Backup various CPU config stuffs */
> +	bl	__save_cpu_setup
> +#endif
> +	/* Call the low level suspend stuff (we should probably have made
> +	 * a stackframe...
> +	 */
> +	bl	swsusp_save
> +
> +	/* Restore LR from the save area */
> +	lis	r11,swsusp_save_area@h
> +	ori	r11,r11,swsusp_save_area@l
> +	lwz	r0,SL_LR(r11)
> +	mtlr	r0
> +
> +	blr
> +
> +
> +/* Resume code */
> +_GLOBAL(swsusp_arch_resume)
> +
> +	/* Stop pending altivec streams and memory accesses */
> +BEGIN_FTR_SECTION
> +	DSSALL
> +END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
> + 	sync
> +
> +	/* Disable MSR:DR to make sure we don't take a TLB or
> +	 * hash miss during the copy, as our hash table will
> +	 * for a while be unusable. For .text, we assume we are
> +	 * covered by a BAT. This works only for non-G5 at this
> +	 * point. G5 will need a better approach, possibly using
> +	 * a small temporary hash table filled with large mappings,
> +	 * disabling the MMU completely isn't a good option for
> +	 * performance reasons.
> +	 * (Note that 750's may have the same performance issue as
> +	 * the G5 in this case, we should investigate using moving
> +	 * BATs for these CPUs)
> +	 */
> +	mfmsr	r0
> +	sync
> +	rlwinm	r0,r0,0,28,26		/* clear MSR_DR */
> +	mtmsr	r0
> +	sync
> +	isync
> +
> +	/* Load pointer to the list of pages to copy into r9 */
> +	lis	    r9,pagedir_nosave@ha
> +	addi    r9,r9,pagedir_nosave@l
> +	tophys(r9,r9)
> +	lwz     r9, 0(r9)
> +#if 0
> +	twi     31,r0,0 /* trigger trap */
> +#endif
> +	cmpwi   r9, 0
> +	beq copy_loop_end
> +copy_loop:
> +	tophys(r9,r9)
> +	lwz    r6, 12(r9)
> +	li     r10, 0
> +copy_one_pgdir:
> +	lwz    r11, 4(r9)
> +	addi   r8,r10,1
> +	cmpwi  r11, 0
> +	addi   r7,r9,16
> +	beq copy_loop_end
> +	li     r0, 256
> +	mtctr  r0
> +	lwz    r9,0(r9)
> +#if 0
> +	twi    31,r0,0 /* trigger trap */
> +#endif
> +	tophys(r10,r11)
> +	tophys(r11,r9)
> +copy_one_page:
> +	lwz    r0, 0(r11)
> +	stw    r0, 0(r10)
> +	lwz    r9, 4(r11)
> +	stw    r9, 4(r10)
> +	lwz    r0, 8(r11)
> +	stw    r0, 8(r10)
> +	lwz    r9, 12(r11)
> +	addi   r11,r11,16
> +	stw    r9, 12(r10)
> +	addi   r10,r10,16
> +	bdnz copy_one_page
> +	mr     r10, r8
> +	cmplwi r10, 255
> +	mr     r9, r7
> +	ble copy_one_pgdir
> +	mr     r9, r6
> +	bne copy_loop
> +copy_loop_end:
> +
> +	/* Do a very simple cache flush/inval of the L1 to ensure
> +	 * coherency of the icache
> +	 */
> +	lis	r3,0x0002
> +	mtctr	r3
> +	li	r3, 0
> +1:
> +	lwz	r0,0(r3)
> +	addi	r3,r3,0x0020
> +	bdnz	1b
> +	isync
> +	sync
> +
> +	/* Now flush those cache lines */
> +	lis	r3,0x0002
> +	mtctr	r3
> +	li	r3, 0
> +1:
> +	dcbf	0,r3
> +	addi	r3,r3,0x0020
> +	bdnz	1b
> +	sync
> +
> +	/* Ok, we are now running with the kernel data of the old
> +	 * kernel fully restored. We can get to the save area
> +	 * easily now. As for the rest of the code, it assumes the
> +	 * loader kernel and the booted one are exactly identical
> +	 */
> +	lis	r11,swsusp_save_area@h
> +	ori	r11,r11,swsusp_save_area@l
> +	tophys(r11,r11)
> +
> +#if 0
> +	/* Restore various CPU config stuffs */
> +	bl	__restore_cpu_setup
> +#endif
> +	/* Restore the BATs, and SDR1.  Then we can turn on the MMU.
> +	 * This is a bit hairy as we are running out of those BATs,
> +	 * but first, our code is probably in the icache, and we are
> +	 * writing the same value to the BAT, so that should be fine,
> +	 * though a better solution will have to be found long-term
> +	 */
> +	lwz	r4,SL_SDR1(r11)
> +	mtsdr1	r4
> +	lwz	r4,SL_SPRG0(r11)
> +	mtsprg	0,r4
> +	lwz	r4,SL_SPRG0+4(r11)
> +	mtsprg	1,r4
> +	lwz	r4,SL_SPRG0+8(r11)
> +	mtsprg	2,r4
> +	lwz	r4,SL_SPRG0+12(r11)
> +	mtsprg	3,r4
> +
> +#if 0
> +	lwz	r4,SL_DBAT0(r11)
> +	mtdbatu	0,r4
> +	lwz	r4,SL_DBAT0+4(r11)
> +	mtdbatl	0,r4
> +	lwz	r4,SL_DBAT1(r11)
> +	mtdbatu	1,r4
> +	lwz	r4,SL_DBAT1+4(r11)
> +	mtdbatl	1,r4
> +	lwz	r4,SL_DBAT2(r11)
> +	mtdbatu	2,r4
> +	lwz	r4,SL_DBAT2+4(r11)
> +	mtdbatl	2,r4
> +	lwz	r4,SL_DBAT3(r11)
> +	mtdbatu	3,r4
> +	lwz	r4,SL_DBAT3+4(r11)
> +	mtdbatl	3,r4
> +	lwz	r4,SL_IBAT0(r11)
> +	mtibatu	0,r4
> +	lwz	r4,SL_IBAT0+4(r11)
> +	mtibatl	0,r4
> +	lwz	r4,SL_IBAT1(r11)
> +	mtibatu	1,r4
> +	lwz	r4,SL_IBAT1+4(r11)
> +	mtibatl	1,r4
> +	lwz	r4,SL_IBAT2(r11)
> +	mtibatu	2,r4
> +	lwz	r4,SL_IBAT2+4(r11)
> +	mtibatl	2,r4
> +	lwz	r4,SL_IBAT3(r11)
> +	mtibatu	3,r4
> +	lwz	r4,SL_IBAT3+4(r11)
> +	mtibatl	3,r4
> +#endif
> +
> +BEGIN_FTR_SECTION
> +	li	r4,0
> +	mtspr	SPRN_DBAT4U,r4
> +	mtspr	SPRN_DBAT4L,r4
> +	mtspr	SPRN_DBAT5U,r4
> +	mtspr	SPRN_DBAT5L,r4
> +	mtspr	SPRN_DBAT6U,r4
> +	mtspr	SPRN_DBAT6L,r4
> +	mtspr	SPRN_DBAT7U,r4
> +	mtspr	SPRN_DBAT7L,r4
> +	mtspr	SPRN_IBAT4U,r4
> +	mtspr	SPRN_IBAT4L,r4
> +	mtspr	SPRN_IBAT5U,r4
> +	mtspr	SPRN_IBAT5L,r4
> +	mtspr	SPRN_IBAT6U,r4
> +	mtspr	SPRN_IBAT6L,r4
> +	mtspr	SPRN_IBAT7U,r4
> +	mtspr	SPRN_IBAT7L,r4
> +END_FTR_SECTION_IFSET(CPU_FTR_HAS_HIGH_BATS)
> +
> +	/* Flush all TLBs */
> +	lis	r4,0x1000
> +1:	addic.	r4,r4,-0x1000
> +	tlbie	r4
> +	blt	1b
> +	sync
> +
> +	/* restore the MSR and turn on the MMU */
> +	lwz	r3,SL_MSR(r11)
> +	bl	turn_on_mmu
> +	tovirt(r11,r11)
> +
> +	/* Restore TB */
> +	li	r3,0
> +	mttbl	r3
> +	lwz	r3,SL_TB(r11)
> +	lwz	r4,SL_TB+4(r11)
> +	mttbu	r3
> +	mttbl	r4
> +
> +	/* Kick decrementer */
> +	li	r0,1
> +	mtdec	r0
> +
> +	/* Restore the callee-saved registers and return */
> +	lwz	r0,SL_CR(r11)
> +	mtcr	r0
> +	lwz	r2,SL_R2(r11)
> +	lmw	r12,SL_R12(r11)
> +	lwz	r1,SL_SP(r11)
> +	lwz	r0,SL_LR(r11)
> +	mtlr	r0
> +
> +	// XXX Note: we don't really need to call swsusp_resume
> +
> +	li	r3,0
> +	blr
> +
> +/* FIXME:This construct is actually not useful since we don't shut
> + * down the instruction MMU, we could just flip back MSR-DR on.
> + */
> +turn_on_mmu:
> +	mflr	r4
> +	mtsrr0	r4
> +	mtsrr1	r3
> +	sync
> +	isync
> +	rfi
> +

This version will probably not work with the 2.6.9 kernel (if you have
a version that works with 2.6.9, it would be even better to merge that).
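
As a reading aid for the copy loop in the assembly above, here is my
understanding of it as a userspace C sketch. The layout is inferred from
the offsets the assembly uses (copy address at 0, original address at 4,
link to the next pagedir page at 12 on 32-bit); struct sk_pbe,
SK_PAGE_SIZE and restore_pages() are names invented for this sketch, and
the field widths differ on a 64-bit host.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SK_PAGE_SIZE 4096UL

/* Hypothetical mirror of struct pbe; one word per field. */
struct sk_pbe {
	uintptr_t address;	/* where the saved copy lives */
	uintptr_t orig_address;	/* where the page belongs after resume */
	uintptr_t swap_address;	/* unused by the copy loop */
	uintptr_t next;		/* in entry 0 of a page: next pagedir page */
};

#define SK_PBES_PER_PAGE (SK_PAGE_SIZE / sizeof(struct sk_pbe))

/*
 * Walk the chained pagedir and copy every saved page back to its
 * original location. An entry with orig_address == 0 ends the walk.
 * Returns the number of pages copied.
 */
static unsigned long restore_pages(struct sk_pbe *pgdir)
{
	unsigned long copied = 0;

	while (pgdir) {
		/* Fetch the link first, as the assembly does with r6. */
		struct sk_pbe *next = (struct sk_pbe *)pgdir->next;
		unsigned long i;

		for (i = 0; i < SK_PBES_PER_PAGE; i++) {
			struct sk_pbe *p = &pgdir[i];

			if (p->orig_address == 0)
				return copied;
			memcpy((void *)p->orig_address,
			       (void *)p->address, SK_PAGE_SIZE);
			copied++;
		}
		pgdir = next;
	}
	return copied;
}
```

The real code runs with data translation off and works on physical
addresses (the tophys() calls), which this sketch does not model.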

> --- 2.6.9-lzf/arch/ppc/kernel/vmlinux.lds.S	2004-11-26 12:32:57.000000000 +0800
> +++ 2.6.9/arch/ppc/kernel/vmlinux.lds.S	2004-11-28 23:16:58.000000000 +0800
> @@ -74,6 +74,12 @@ SECTIONS
>      CONSTRUCTORS
>    }
>  
> +  . = ALIGN(4096);
> +  __nosave_begin = .;
> +  .data_nosave : { *(.data.nosave) }
> +  . = ALIGN(4096);
> +  __nosave_end = .;
> +
>    . = ALIGN(32);
>    .data.cacheline_aligned : { *(.data.cacheline_aligned) }
>  
> --- 2.6.9-lzf/arch/ppc/platforms/pmac_feature.c	2004-11-27 17:33:17.000000000 +0800
> +++ 2.6.9/arch/ppc/platforms/pmac_feature.c	2004-11-28 23:16:59.000000000 +0800
> @@ -2146,7 +2146,7 @@ static struct pmac_mb_def pmac_mb_defs[]
>  	},
>  	{	"PowerBook6,1",			"PowerBook G4 12\"",
>  		PMAC_TYPE_UNKNOWN_INTREPID,	intrepid_features,
> -		PMAC_MB_HAS_FW_POWER | PMAC_MB_MOBILE,
> +		PMAC_MB_CAN_SLEEP | PMAC_MB_HAS_FW_POWER | PMAC_MB_MOBILE,
>  	},
>  	{	"PowerBook6,2",			"PowerBook G4",
>  		PMAC_TYPE_UNKNOWN_INTREPID,	intrepid_features,

Ok, merge with BenH.

> --- 2.6.9-lzf/arch/ppc/platforms/pmac_setup.c	2004-11-26 12:32:57.000000000 +0800
> +++ 2.6.9/arch/ppc/platforms/pmac_setup.c	2004-11-28 23:16:59.000000000 +0800
> @@ -51,6 +51,7 @@
>  #include <linux/irq.h>
>  #include <linux/seq_file.h>
>  #include <linux/root_dev.h>
> +#include <linux/suspend.h>
>  
>  #include <asm/reg.h>
>  #include <asm/sections.h>
> @@ -70,6 +71,8 @@
>  #include <asm/pmac_feature.h>
>  #include <asm/time.h>
>  #include <asm/of_device.h>
> +#include <asm/mmu_context.h>
> +
>  #include "pmac_pic.h"
>  #include "mem_pieces.h"
>  
> @@ -420,11 +423,67 @@ find_boot_device(void)
>  #endif
>  }
>  
> +/* TODO: Merge the suspend-to-ram with the common code !!!
> + * currently, this is a stub implementation for suspend-to-disk
> + * only
> + */
> +
> +#ifdef CONFIG_PM
> +
> +extern void enable_kernel_altivec(void);
> +
> +static int pmac_pm_prepare(suspend_state_t state)
> +{
> +	printk(KERN_DEBUG "pmac_pm_prepare(%d)\n", state);
> +
> +	return 0;
> +}
> +
> +static int pmac_pm_enter(suspend_state_t state)
> +{
> +	printk(KERN_DEBUG "pmac_pm_enter(%d)\n", state);
> +
> +	/* Give up the lazy FPU & vec so we don't have to back them
> +	 * up from the low level code
> +	 */
> +	enable_kernel_fp();
> +
> +#ifdef CONFIG_ALTIVEC
> +	if (cur_cpu_spec[0]->cpu_features & CPU_FTR_ALTIVEC)
> +		enable_kernel_altivec();
> +#endif /* CONFIG_ALTIVEC */
> +
> +	return 0;
> +}
> +
> +static int pmac_pm_finish(suspend_state_t state)
> +{
> +	printk(KERN_DEBUG "pmac_pm_finish(%d)\n", state);
> +
> +	/* Restore userland MMU context */
> +	set_context(current->active_mm->context, current->active_mm->pgd);
> +
> +	return 0;
> +}
> +
> +static struct pm_ops pmac_pm_ops = {
> +	.pm_disk_mode	= PM_DISK_SHUTDOWN,
> +	.prepare	= pmac_pm_prepare,
> +	.enter		= pmac_pm_enter,
> +	.finish		= pmac_pm_finish,
> +};
> +
> +#endif /* CONFIG_PM */
> +
>  static int initializing = 1;
>  
>  static int pmac_late_init(void)
>  {
>  	initializing = 0;
> +
> +#ifdef CONFIG_PM
> +	pm_set_ops(&pmac_pm_ops);
> +#endif /* CONFIG_PM */
>  	return 0;
>  }

Ok, merge with BenH.
								Pavel								

-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: software suspend patch [1/6]
  2004-11-28 16:25       ` software suspend patch [1/6] hugang
@ 2004-11-28 17:11         ` Pavel Machek
  2004-12-02 20:56           ` Guido Guenther
  0 siblings, 1 reply; 51+ messages in thread
From: Pavel Machek @ 2004-11-28 17:11 UTC (permalink / raw)
  To: hugang; +Cc: linux-kernel

Hi!

> >  device-tree.diff 
> >    based on suspend2, with small changes.
> > 
> >  core.diff
> >   1: redefine struct pbe so the pagedir can use non-contiguous pages.
> >   2: shrink memory as little as possible.
> >   3: use a bitmap to speed up the collision check when relocating pages.
> >   4: ready for pagecache saving.
> > 
> >  i386.diff
> >  ppc.diff
> >   i386 and powerpc suspend update.
> > 
> >  pagecachs_addon.diff
> >   required if pagecache saving is enabled; it makes saving the
> >   pagecache safe. The idea is from suspend2.
> > 
> >   ppcfix.diff
> >   fix compile error. 
> >   $ gcc -v
> >    .... 
> >    gcc version 2.95.4 20011002 (Debian prerelease)
> > 
> > I'm using 2.6.9-ck3. With the above patches, swsusp1 works perfectly on my
> > PowerPC and x86 PC with the Highmem and preempt options enabled.
> > 
> > I hope core.diff@1,@2,@3, i386.diff and ppc.diff will be merged into the
> > mainline kernel ASAP :). From my point of view, device-tree.diff is
> > very useful when using pagecache saving, and pagecachs_addon.diff
> > is really a hack for making pagecache saving safe.
> > 
> 
> --- 2.6.9-lzf/arch/ppc/syslib/open_pic.c	2004-11-26 12:32:58.000000000 +0800
> +++ 2.6.9/arch/ppc/syslib/open_pic.c	2004-11-28 23:16:58.000000000 +0800
> @@ -776,7 +776,8 @@ static void openpic_mapirq(u_int irq, cp
>  	if (ISR[irq] == 0)
>  		return;
>  	if (!cpus_empty(keepmask)) {
> -		cpumask_t irqdest = { .bits[0] = openpic_read(&ISR[irq]->Destination) };
> +		cpumask_t irqdest;
> +		irqdest.bits[0] = openpic_read(&ISR[irq]->Destination);
>  		cpus_and(irqdest, irqdest, keepmask);
>  		cpus_or(physmask, physmask, irqdest);
>  	}

ACK. Send this to Andrew Morton, Cc: Rusty trivial patch monkey
Russell <trivial@rustcorp.com.au>.
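
For context, a userspace sketch of the two forms in this hunk. struct
mask and its two-word size are hypothetical stand-ins for cpumask_t, and
the assumption (the thread only says "fix compile error" with gcc
2.95.4) is that the old compiler rejects the nested .bits[0] designator.
One difference worth keeping in mind: the initializer form zeroes the
remaining words, while a bare assignment leaves them indeterminate,
which matters only when the mask spans more than one word.

```c
#include <assert.h>
#include <string.h>

#define MASK_WORDS 2

/* Hypothetical stand-in for cpumask_t. */
struct mask {
	unsigned long bits[MASK_WORDS];
};

/* C99 form: nested designator; elements not named are zero-initialized. */
static struct mask mask_from_word_c99(unsigned long w)
{
	struct mask m = { .bits[0] = w };

	return m;
}

/* Form that avoids designators: clear everything, then assign word 0. */
static struct mask mask_from_word_portable(unsigned long w)
{
	struct mask m;

	memset(&m, 0, sizeof(m));
	m.bits[0] = w;
	return m;
}
```

The memset() is an addition of this sketch; the patch above assigns
bits[0] without clearing the rest, which is equivalent when the mask is
a single word.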
								Pavel

-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: software suspend patch [1/6]
  2004-11-28 16:58       ` Pavel Machek
@ 2004-11-29 15:43         ` hugang
  2005-01-09 22:43           ` Pavel Machek
  0 siblings, 1 reply; 51+ messages in thread
From: hugang @ 2004-11-29 15:43 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

On Sun, Nov 28, 2004 at 05:58:35PM +0100, Pavel Machek wrote:
> Hi!
> 
> I cannot merge anything before 2.6.10. As you have seen, I have quite
> a lot of patches in my tree, and I do not want to mix them with these...
> 
> >  device-tree.diff 
> >    based on suspend2, with small changes.
> 
> I do not want this one.
> 
> >  core.diff
> >   1: redefine struct pbe so the pagedir can use non-contiguous pages.
> 
> Can I get this one as a separate diff?

Here it is.
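
A small userspace sketch of the indexing idea behind the chained
pagedir: an entry index splits into "how many page links to follow" and
"which slot inside that page". The names (struct sk_entry, SK_PER_PAGE,
sk_is_page_end, sk_find) are invented for this sketch; only the
arithmetic is meant to mirror PBE_IS_PAGE_END and find_pbe_by_index in
the diff that follows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SIZE 4096UL

/* Hypothetical entry; next is only meaningful in slot 0 of each page. */
struct sk_entry {
	uintptr_t address;
	uintptr_t orig_address;
	uintptr_t swap_address;
	uintptr_t next;		/* link to the next pagedir page, or 0 */
};

#define SK_PER_PAGE (SK_PAGE_SIZE / sizeof(struct sk_entry))

/* True when x is the address of the last entry slot within its page. */
static int sk_is_page_end(uintptr_t x)
{
	return (x & (SK_PAGE_SIZE - 1)) ==
	       SK_PAGE_SIZE - sizeof(struct sk_entry);
}

/* Follow index / SK_PER_PAGE links, then step index % SK_PER_PAGE slots. */
static struct sk_entry *sk_find(struct sk_entry *head, size_t index)
{
	size_t page = index / SK_PER_PAGE;

	while (head && page--)
		head = (struct sk_entry *)head->next;
	return head ? head + index % SK_PER_PAGE : NULL;
}
```

Each lookup walks the chain from the head, so calling it once per page
costs time proportional to the chain length; that trade-off is inherent
in replacing one contiguous array with linked pages.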

=== kernel/power/swsusp.c
==================================================================
--- kernel/power/swsusp.c  (revision 24520)
+++ kernel/power/swsusp.c  (local)
@@ -74,9 +74,6 @@
 /* References to section boundaries */
 extern char __nosave_begin, __nosave_end;
 
-/* Variables to be preserved over suspend */
-static int pagedir_order_check;
-
 extern char resume_file[];
 static dev_t resume_device;
 /* Local variables that should not be affected by save */
@@ -97,7 +94,6 @@
  */
 suspend_pagedir_t *pagedir_nosave __nosavedata = NULL;
 static suspend_pagedir_t *pagedir_save;
-static int pagedir_order __nosavedata = 0;
 
 #define SWSUSP_SIG	"S1SUSPEND"
 
@@ -223,9 +219,63 @@
 	swap_list_unlock();
 }
 
+#define ONE_PAGE_PBE_NUM   (PAGE_SIZE/sizeof(struct pbe))
+#define PBE_IS_PAGE_END(x)  \
+	( PAGE_SIZE - sizeof(struct pbe) == ((x) - ((~(PAGE_SIZE - 1)) & (x))) )
 
+#define pgdir_for_each(pos, n, head) \
+	for(pos = head, n = pos ? (suspend_pagedir_t*)pos->dummy.val : NULL; \
+		pos != NULL; \
+		pos = n, n = pos ? (suspend_pagedir_t *)pos->dummy.val : NULL)
 
+#define pbe_for_each(pos, n, index, max, head) \
+	for(pos = head, index = 0, \
+			n = pos ? (struct pbe *)pos->dummy.val : NULL; \
+		(pos != NULL) && (index < max); \
+		pos = (PBE_IS_PAGE_END((unsigned long)pos)) ? n : \
+			((struct pbe *)((unsigned long)pos + sizeof(struct pbe))), \
+			index ++, \
+			n = pos ? (struct pbe*)pos->dummy.val : NULL)
 /**
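+ *  find_pbe_by_index - Look up a pbe by its position in the pagedir chain.
+ *  @pgdir: First page of the chained pagedir.
+ *  @index: Zero-based index of the wanted entry.
+ *
+ *  Returns the entry, or NULL if @index lies beyond the end of the chain.
+ */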
+static struct pbe *find_pbe_by_index(struct pbe *pgdir, int index)
+{
+	unsigned long p = 0;
+	struct pbe *pbe, *next;
+
+	pr_debug("find_pbe_by_index: %p, 0x%03x", pgdir, index); 
+	pgdir_for_each(pbe, next, pgdir) {
+		if (p == index / ONE_PAGE_PBE_NUM) {
+			pbe = (struct pbe *)((unsigned long)pbe + 
+					(index % ONE_PAGE_PBE_NUM) * sizeof(struct pbe));
+			pr_debug(" %p, o{%p} c{%p}\n",
+					pbe, (void*)pbe->orig_address, (void*)pbe->address);
+			return pbe;
+		}
+		p ++;
+	}
+	return (NULL);
+}
+
+/**
+ *  pagedir_free - Free every page of a chained pagedir.
+ *  @head: First page of the chain.
+ *
+ */
+static void pagedir_free(suspend_pagedir_t *head)
+{
+	suspend_pagedir_t *next, *cur;
+	pgdir_for_each(cur, next, head) 
+		free_page((unsigned long)cur);
+}
+
+
+/**
  *	write_swap_page - Write one page to a fresh swap location.
  *	@addr:	Address we're writing.
  *	@loc:	Place to store the entry we used.
@@ -269,19 +319,76 @@
 {
 	swp_entry_t entry;
 	int i;
+	struct pbe *next, *pos;
 
-	for (i = 0; i < nr_copy_pages; i++) {
-		entry = (pagedir_nosave + i)->swap_address;
+	pbe_for_each(pos, next, i, nr_copy_pages, pagedir_nosave) {
+		entry = pos->swap_address;
 		if (entry.val)
 			swap_free(entry);
 		else
 			break;
-		(pagedir_nosave + i)->swap_address = (swp_entry_t){0};
+		pos->swap_address = (swp_entry_t){0};
 	}
 }
 
+static int mod_progress = 1;
 
+static void inline mod_printk_progress(int i)
+{
+	if (mod_progress == 0) mod_progress = 1;
+	if (!(i%100))
+		printk( "\b\b\b\b%3d%%", i / mod_progress );
+}
+
 /**
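+ *  write_one_pbe - Write one page to swap and record where it went.
+ *  @p:    Entry whose swap_address receives the swap location.
+ *  @data: Address of the page to write.
+ *  @cur:  Index of this page, used only for the progress display.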
+ *
+ */
+static int write_one_pbe(struct pbe *p, void *data, int cur)
+{
+	int error = 0;
+
+	mod_printk_progress(cur);
+
+	pr_debug("write_one_pbe: %p, o{%p} c{%p} %d ",
+			p, (void *)p->orig_address, (void *)p->address, cur);
+	error = write_page((unsigned long)data, &p->swap_address);
+	if (error) return error;
+	pr_debug("%lu\n", swp_offset(p->swap_address));
+
+	return 0;
+}
+
+static int bio_read_page(pgoff_t page_off, void * page);
+
+/**
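+ *  read_one_pbe - Read one page back from swap.
+ *  @p:    Entry whose swap_address names the swap location.
+ *  @data: Buffer that receives the page.
+ *  @cur:  Index of this page, used only for the progress display.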
+ *
+ */
+static int read_one_pbe(struct pbe *p, void *data, int cur)
+{
+	int error = 0;
+
+	mod_printk_progress(cur);
+
+	pr_debug("read_one_pbe: %p, o{%p} c{%p} %lu\n",
+			p, (void *)p->orig_address, data, 
+			swp_offset(p->swap_address));
+
+	error = bio_read_page(swp_offset(p->swap_address), data);
+	if (error) return error;
+
+	return 0;
+}
+
+
+/**
  *	data_write - Write saved image to swap.
  *
  *	Walk the list of pages in the image and sync each one to swap.
@@ -291,17 +398,15 @@
 {
 	int error = 0;
 	int i;
-	unsigned int mod = nr_copy_pages / 100;
+	struct pbe *pos, *next;
 
-	if (!mod)
-		mod = 1;
+	mod_progress = nr_copy_pages / 100;
 
 	printk( "Writing data to swap (%d pages)...     ", nr_copy_pages );
-	for (i = 0; i < nr_copy_pages && !error; i++) {
-		if (!(i%mod))
-			printk( "\b\b\b\b%3d%%", i / mod );
-		error = write_page((pagedir_nosave+i)->address,
-					  &((pagedir_nosave+i)->swap_address));
+	pbe_for_each(pos, next, i, nr_copy_pages, pagedir_nosave) {
+		BUG_ON(pos->orig_address == 0);
+		error = write_one_pbe(pos, (void*)pos->address, i);
+		if (error) break;
 	}
 	printk("\b\b\b\bdone\n");
 	return error;
@@ -371,15 +476,17 @@
 
 static int write_pagedir(void)
 {
-	unsigned long addr = (unsigned long)pagedir_nosave;
 	int error = 0;
-	int n = SUSPEND_PD_PAGES(nr_copy_pages);
-	int i;
+	int n = 0;
+	suspend_pagedir_t *pgdir, *next;
 
+	pgdir_for_each(pgdir, next, pagedir_nosave) {
+		error = write_page((unsigned long)pgdir, &swsusp_info.pagedir[n]);
+		if (error) break;
+		n ++;
+	}
+	printk( "Writing pagedir (%d pages)\n", n);
 	swsusp_info.pagedir_pages = n;
-	printk( "Writing pagedir (%d pages)\n", n);
-	for (i = 0; i < n && !error; i++, addr += PAGE_SIZE)
-		error = write_page(addr, &swsusp_info.pagedir[i]);
 	return error;
 }
 
@@ -564,7 +671,7 @@
 {
 	struct zone *zone;
 	unsigned long zone_pfn;
-	struct pbe * pbe = pagedir_nosave;
+	struct pbe * pbe = NULL;
 	int pages_copied = 0;
 	
 	for_each_zone(zone) {
@@ -574,11 +681,14 @@
 		for (zone_pfn = 0; zone_pfn < zone->spanned_pages; ++zone_pfn) {
 			if (saveable(zone, &zone_pfn)) {
 				struct page * page;
+				pbe = find_pbe_by_index(pagedir_nosave, pages_copied);
+				BUG_ON(pbe == NULL);
 				page = pfn_to_page(zone_pfn + zone->zone_start_pfn);
 				pbe->orig_address = (long) page_address(page);
+				BUG_ON(pbe->orig_address == 0);
+				BUG_ON(pbe->address == 0);
 				/* copy_page is not usable for copying task structs. */
 				memcpy((void *)pbe->address, (void *)pbe->orig_address, PAGE_SIZE);
-				pbe++;
 				pages_copied++;
 			}
 		}
@@ -587,67 +697,160 @@
 	nr_copy_pages = pages_copied;
 }
 
+#define pointer2num(x)  ((x - PAGE_OFFSET) >> 12)
+#define num2pointer(x)  ((x << 12) + PAGE_OFFSET)
+static inline void collide_set_bit(unsigned char *bitmap, 
+		unsigned long bitnum)
+{
+	bitnum = pointer2num(bitnum); 
+	bitmap[bitnum / 8] |= (1 << (bitnum%8));
+}
+static inline int collide_is_bit_set(unsigned char *bitmap, 
+		unsigned long bitnum)
+{               
+	bitnum = pointer2num(bitnum); 
+	return !!(bitmap[bitnum / 8] & (1 << (bitnum%8)));
+}
+static void collide_bitmap_free(unsigned char *bitmap)
+{
+	free_pages((unsigned long)bitmap, 3);	/* same order as COLLIDE_BITMAP_ORDER */
+}
 
+/* highest address covered: (((1 << COLLIDE_BITMAP_ORDER) * PAGE_SIZE * 8) << 12) + PAGE_OFFSET */
+#define COLLIDE_BITMAP_ORDER 3
+
+static unsigned char *collide_bitmap_init(struct pbe *pgdir)
+{
+	unsigned char *bitmap = 
+		(unsigned char *)__get_free_pages(GFP_ATOMIC | __GFP_COLD, 
+										  COLLIDE_BITMAP_ORDER);
+	struct pbe *next;
+
+	if (bitmap == NULL) {
+		return NULL;
+	}
+	memset(bitmap, 0, (1 << COLLIDE_BITMAP_ORDER) * PAGE_SIZE);
+
+	/* do base check */
+	BUG_ON(collide_is_bit_set(bitmap, (unsigned long)bitmap) == 1);
+	collide_set_bit(bitmap, (unsigned long)bitmap);
+	BUG_ON(collide_is_bit_set(bitmap, (unsigned long)bitmap) == 0);
+	while (pgdir != NULL) {
+		unsigned long nums;
+		next = (struct pbe*)pgdir->dummy.val;
+		for (nums = 0; nums < ONE_PAGE_PBE_NUM; nums++) {
+			collide_set_bit(bitmap, (unsigned long)pgdir);
+			collide_set_bit(bitmap, (unsigned long)pgdir->orig_address);
+			pgdir ++;
+		}
+		pgdir = next;
+	}
+	return bitmap;
+}
+
+static void **eaten_memory = NULL;
+
+static void *swsusp_get_safe_free_page(unsigned char *collide)
+{
+	void *addr = NULL;
+	void **c = eaten_memory;
+
+	do {
+		if (addr) {
+			eaten_memory = (void**)addr;
+			*eaten_memory = c;
+			c = eaten_memory;
+		}
+		addr = (void*)__get_free_pages(GFP_ATOMIC | __GFP_COLD, 0);
+		if (!addr) 
+			return NULL;
+	} while (collide && collide_is_bit_set(collide, (unsigned long)addr));
+
+	return addr;
+}
+
 /**
- *	calc_order - Determine the order of allocation needed for pagedir_save.
+ *  alloc_one_pagedir - 
+ *  @prev:
+ *  @collide:
  *
- *	This looks tricky, but is just subtle. Please fix it some time.
- *	Since there are %nr_copy_pages worth of pages in the snapshot, we need
- *	to allocate enough contiguous space to hold 
- *		(%nr_copy_pages * sizeof(struct pbe)), 
- *	which has the saved/orig locations of the page.. 
- *
- *	SUSPEND_PD_PAGES() tells us how many pages we need to hold those 
- *	structures, then we call get_bitmask_order(), which will tell us the
- *	last bit set in the number, starting with 1. (If we need 30 pages, that
- *	is 0x0000001e in hex. The last bit is the 5th, which is the order we 
- *	would use to allocate 32 contiguous pages).
- *
- *	Since we also need to save those pages, we add the number of pages that
- *	we need to nr_copy_pages, and in case of an overflow, do the 
- *	calculation again to update the number of pages needed. 
- *
- *	With this model, we will tend to waste a lot of memory if we just cross
- *	an order boundary. Plus, the higher the order of allocation that we try
- *	to do, the more likely we are to fail in a low-memory situtation 
- *	(though	we're unlikely to get this far in such a case, since swsusp 
- *	requires half of memory to be free anyway).
  */
+static suspend_pagedir_t * alloc_one_pagedir(suspend_pagedir_t *prev, 
+		unsigned char *collide)
+{
+	suspend_pagedir_t *pgdir = NULL;
+	int i;
 
+	pgdir = (suspend_pagedir_t *)swsusp_get_safe_free_page(collide);
 
-static void calc_order(void)
+	/*pr_debug("pgdir: %p, %p, %d\n", 
+	  pgdir, prev, sizeof(suspend_pagedir_t)); */
+	for (i = 0; i < ONE_PAGE_PBE_NUM; i++) {
+		pgdir[i].dummy.val = 0;
+		pgdir[i].address = 0;
+		pgdir[i].orig_address = 0;
+		if (prev)
+			prev[i].dummy.val= (unsigned long)pgdir;
+	}
+
+	return (pgdir);
+}
+
+/* calc_nums - Determine the nums of allocation needed for pagedir_save. */
+static int calc_nums(int nr_copy)
 {
-	int diff = 0;
-	int order = 0;
-
+	int diff = 0, ret = 0;
 	do {
-		diff = get_bitmask_order(SUSPEND_PD_PAGES(nr_copy_pages)) - order;
+		diff = (nr_copy / ONE_PAGE_PBE_NUM) - ret + 1;
 		if (diff) {
-			order += diff;
-			nr_copy_pages += 1 << diff;
+			ret += diff;
+			nr_copy += diff;
 		}
-	} while(diff);
-	pagedir_order = order;
+	} while (diff);
+	return nr_copy;
 }
 
-
 /**
  *	alloc_pagedir - Allocate the page directory.
+ *	@pbe:
+ *	@pbe_nums:
+ *	@collide:
+ *	@page_nums:
  *
  *	First, determine exactly how many contiguous pages we need and
  *	allocate them.
  */
 
-static int alloc_pagedir(void)
+static int alloc_pagedir(struct pbe **pbe, int pbe_nums,
+		unsigned char *collide, int page_nums)
 {
-	calc_order();
-	pagedir_save = (suspend_pagedir_t *)__get_free_pages(GFP_ATOMIC | __GFP_COLD,
-							     pagedir_order);
-	if (!pagedir_save)
-		return -ENOMEM;
-	memset(pagedir_save, 0, (1 << pagedir_order) * PAGE_SIZE);
-	pagedir_nosave = pagedir_save;
-	return 0;
+	unsigned int nums = 0;
+	unsigned int after_alloc = pbe_nums;
+	suspend_pagedir_t *prev = NULL, *cur = NULL;
+
+	if (page_nums)
+		after_alloc = ONE_PAGE_PBE_NUM * page_nums;
+	else
+		after_alloc = calc_nums(after_alloc);
+	pr_debug("alloc_pagedir: %d, %d\n", pbe_nums, after_alloc);
+	for (nums = 0 ; nums < after_alloc ; nums += ONE_PAGE_PBE_NUM) {
+		cur = alloc_one_pagedir(prev, collide);
+		pr_debug("alloc_one_pagedir: %p\n", cur);
+		if (!cur) { /* get page failed */
+			goto no_mem;
+		}
+		if (nums == 0) { /* setup the head */
+			*pbe = cur;
+		}
+		prev = cur;
+	}
+	return after_alloc - pbe_nums;
+
+no_mem:
+	pagedir_free(*pbe);
+	*pbe = NULL;
+
+	return (-ENOMEM);
 }
 
 /**
@@ -656,11 +859,10 @@
 
 static void free_image_pages(void)
 {
-	struct pbe * p;
+	struct pbe * p, * n;
 	int i;
 
-	p = pagedir_save;
-	for (i = 0, p = pagedir_save; i < nr_copy_pages; i++, p++) {
+	pbe_for_each(p, n, i, nr_copy_pages, pagedir_save) {
 		if (p->address) {
 			ClearPageNosave(virt_to_page(p->address));
 			free_page(p->address);
@@ -676,10 +878,10 @@
 
 static int alloc_image_pages(void)
 {
-	struct pbe * p;
+	struct pbe * p, * n;
 	int i;
 
-	for (i = 0, p = pagedir_save; i < nr_copy_pages; i++, p++) {
+	pbe_for_each(p, n, i, nr_copy_pages, pagedir_save) {
 		p->address = get_zeroed_page(GFP_ATOMIC | __GFP_COLD);
 		if (!p->address)
 			return -ENOMEM;
@@ -693,7 +895,7 @@
 	BUG_ON(PageNosave(virt_to_page(pagedir_save)));
 	BUG_ON(PageNosaveFree(virt_to_page(pagedir_save)));
 	free_image_pages();
-	free_pages((unsigned long) pagedir_save, pagedir_order);
+	pagedir_free(pagedir_save);
 }
 
 
@@ -751,17 +953,20 @@
 	if (!enough_swap())
 		return -ENOSPC;
 
-	if ((error = alloc_pagedir())) {
+	error = alloc_pagedir(&pagedir_save, nr_copy_pages, NULL, 0);
+	if (error < 0) {
 		pr_debug("suspend: Allocating pagedir failed.\n");
 		return error;
 	}
+	pr_debug("alloc_pagedir: addon %d\n", error);
+	nr_copy_pages += error;
 	if ((error = alloc_image_pages())) {
 		pr_debug("suspend: Allocating image pages failed.\n");
 		swsusp_free();
 		return error;
 	}
+	pagedir_nosave = pagedir_save;
 
-	pagedir_order_check = pagedir_order;
 	return 0;
 }
 
@@ -854,8 +1059,6 @@
 
 asmlinkage int swsusp_restore(void)
 {
-	BUG_ON (pagedir_order_check != pagedir_order);
-	
 	/* Even mappings of "global" things (vmalloc) need to be fixed */
 	__flush_tlb_global();
 	wbinvd();	/* Nigel says wbinvd here is good idea... */
@@ -882,98 +1085,6 @@
 }
 
 
-
-/* More restore stuff */
-
-#define does_collide(addr) does_collide_order(pagedir_nosave, addr, 0)
-
-/*
- * Returns true if given address/order collides with any orig_address 
- */
-static int __init does_collide_order(suspend_pagedir_t *pagedir, unsigned long addr,
-		int order)
-{
-	int i;
-	unsigned long addre = addr + (PAGE_SIZE<<order);
-	
-	for (i=0; i < nr_copy_pages; i++)
-		if ((pagedir+i)->orig_address >= addr &&
-			(pagedir+i)->orig_address < addre)
-			return 1;
-
-	return 0;
-}
-
-/*
- * We check here that pagedir & pages it points to won't collide with pages
- * where we're going to restore from the loaded pages later
- */
-static int __init check_pagedir(void)
-{
-	int i;
-
-	for(i=0; i < nr_copy_pages; i++) {
-		unsigned long addr;
-
-		do {
-			addr = get_zeroed_page(GFP_ATOMIC);
-			if(!addr)
-				return -ENOMEM;
-		} while (does_collide(addr));
-
-		(pagedir_nosave+i)->address = addr;
-	}
-	return 0;
-}
-
-static int __init swsusp_pagedir_relocate(void)
-{
-	/*
-	 * We have to avoid recursion (not to overflow kernel stack),
-	 * and that's why code looks pretty cryptic 
-	 */
-	suspend_pagedir_t *old_pagedir = pagedir_nosave;
-	void **eaten_memory = NULL;
-	void **c = eaten_memory, *m, *f;
-	int ret = 0;
-
-	printk("Relocating pagedir ");
-
-	if (!does_collide_order(old_pagedir, (unsigned long)old_pagedir, pagedir_order)) {
-		printk("not necessary\n");
-		return check_pagedir();
-	}
-
-	while ((m = (void *) __get_free_pages(GFP_ATOMIC, pagedir_order)) != NULL) {
-		if (!does_collide_order(old_pagedir, (unsigned long)m, pagedir_order))
-			break;
-		eaten_memory = m;
-		printk( "." ); 
-		*eaten_memory = c;
-		c = eaten_memory;
-	}
-
-	if (!m) {
-		printk("out of memory\n");
-		ret = -ENOMEM;
-	} else {
-		pagedir_nosave =
-			memcpy(m, old_pagedir, PAGE_SIZE << pagedir_order);
-	}
-
-	c = eaten_memory;
-	while (c) {
-		printk(":");
-		f = c;
-		c = *c;
-		free_pages((unsigned long)f, pagedir_order);
-	}
-	if (ret)
-		return ret;
-	printk("|\n");
-	return check_pagedir();
-}
-
 /**
  *	Using bio to read from swap.
  *	This code requires a bit more work than just using buffer heads
@@ -1038,12 +1149,12 @@
 	return error;
 }
 
-int bio_read_page(pgoff_t page_off, void * page)
+static int bio_read_page(pgoff_t page_off, void * page)
 {
 	return submit(READ, page_off, page);
 }
 
-int bio_write_page(pgoff_t page_off, void * page)
+static int bio_write_page(pgoff_t page_off, void * page)
 {
 	return submit(WRITE, page_off, page);
 }
@@ -1088,7 +1199,6 @@
 		return -EPERM;
 	}
 	nr_copy_pages = swsusp_info.image_pages;
-	pagedir_order = get_bitmask_order(SUSPEND_PD_PAGES(nr_copy_pages));
 	return error;
 }
 
@@ -1115,7 +1225,96 @@
 	return error;
 }
 
+static void __init eat_progress(void)
+{
+	char *eaten_progess = "-\\|/";
+	static int eaten_i = 0;
+
+	printk("\b%c", eaten_progess[eaten_i]);
+	eaten_i ++;
+	if (eaten_i > 3) eaten_i = 0;
+}
+
+static int __init check_one_pbe(struct pbe *p, void *collide, int cur)
+{
+	unsigned long addr = 0;
+
+	pr_debug("check_one_pbe: %p %lu o{%p} ", 
+			p, p->swap_address.val, (void*)p->orig_address);
+	addr = (unsigned long)swsusp_get_safe_free_page(collide);
+	if(!addr)
+		return -ENOMEM;
+	pr_debug("c{%p} done\n", (void*)addr);
+	p->address = addr;
+
+	return 0;
+}
+
+static void __init swsusp_copy_pagedir(suspend_pagedir_t *d_pgdir, 
+		suspend_pagedir_t *s_pgdir)
+{
+	int i = 0;
+
+	while (s_pgdir != NULL) {
+		suspend_pagedir_t *s_next = (suspend_pagedir_t *)s_pgdir->dummy.val;
+		suspend_pagedir_t *d_next = (suspend_pagedir_t *)d_pgdir->dummy.val;
+		for (i = 0; i < ONE_PAGE_PBE_NUM; i++) {
+			d_pgdir->address = s_pgdir->address;
+			d_pgdir->orig_address = s_pgdir->orig_address;
+			d_pgdir->swap_address = s_pgdir->swap_address;
+			s_pgdir ++; d_pgdir ++;
+		}
+		d_pgdir = d_next;
+		s_pgdir = s_next;
+	};
+}
 /**
+ * We check here that pagedir & pages it points to won't collide with pages
+ * where we're going to restore from the loaded pages later
+ */
+static int __init check_pagedir(void)
+{
+	void **c, *f;
+	struct pbe *next, *pos;
+	int error, index;
+	suspend_pagedir_t *addr = NULL;
+	unsigned char *bitmap = collide_bitmap_init(pagedir_nosave);
+
+	BUG_ON(bitmap == NULL);
+
+	printk("Relocating pagedir ... ");
+	error = alloc_pagedir(&addr, nr_copy_pages, bitmap, 
+			swsusp_info.pagedir_pages);
+	if (error < 0) {
+		return error;
+	}
+	swsusp_copy_pagedir(addr, pagedir_nosave);
+	pagedir_free(pagedir_nosave);
+
+	/* check copy address */
+	pbe_for_each(pos, next, index, nr_copy_pages, addr) {
+		error = check_one_pbe(pos, bitmap, index);
+		BUG_ON(error);
+	}
+
+	/* free eaten memory */
+	c = eaten_memory;
+	while (c) {
+		eat_progress();
+		f = c;
+		c = *c;
+		free_pages((unsigned long)f, 0);
+	}
+	/* free unused memory */
+	collide_bitmap_free(bitmap);
+	printk("     done\n");
+
+	pagedir_nosave = addr;
+
+	return 0;
+}
+
+/**
  *	swsusp_read_data - Read image pages from swap.
  *
  *	You do not need to check for overlaps, check_pagedir()
@@ -1124,53 +1323,67 @@
 
 static int __init data_read(void)
 {
-	struct pbe * p;
+	struct pbe * p, * n;
 	int error;
 	int i;
-	int mod = nr_copy_pages / 100;
 
-	if (!mod)
-		mod = 1;
+	if ((error = check_pagedir())) {
+		return -ENOMEM;
+	}
 
-	if ((error = swsusp_pagedir_relocate()))
-		return error;
+	mod_progress = nr_copy_pages / 100;
 
 	printk( "Reading image data (%d pages):     ", nr_copy_pages );
-	for(i = 0, p = pagedir_nosave; i < nr_copy_pages && !error; i++, p++) {
-		if (!(i%mod))
-			printk( "\b\b\b\b%3d%%", i / mod );
-		error = bio_read_page(swp_offset(p->swap_address),
-				  (void *)p->address);
+	pbe_for_each(p, n, i, nr_copy_pages, pagedir_nosave) {
+		error = read_one_pbe(p, (void*)p->address, i);
+		if (error) break;
 	}
 	printk(" %d done.\n",i);
 	return error;
-
 }
 
 extern dev_t __init name_to_dev_t(const char *line);
 
+static int __init read_one_pagedir(suspend_pagedir_t *pgdir, int i)
+{
+	unsigned long offset = swp_offset(swsusp_info.pagedir[i]);
+	unsigned long next;
+	int error = 0;
+
+	next = pgdir->dummy.val;
+	pr_debug("read_one_pagedir: %p, %d, %lu, %p\n", 
+			pgdir, i, offset, (void*)next);
+	if ((error = bio_read_page(offset, (void *)pgdir))) {
+		return error;
+	}
+	pgdir->dummy.val = next;
+
+	return error;
+}
+
+/*
+ * reading pagedir from swap device
+ */
 static int __init read_pagedir(void)
 {
-	unsigned long addr;
-	int i, n = swsusp_info.pagedir_pages;
+	int i = 0, n = swsusp_info.pagedir_pages;
 	int error = 0;
+	suspend_pagedir_t *pgdir, *next;
 
-	addr = __get_free_pages(GFP_ATOMIC, pagedir_order);
-	if (!addr)
+	error = alloc_pagedir(&pagedir_nosave, nr_copy_pages, NULL, n);
+	if (error < 0)
 		return -ENOMEM;
-	pagedir_nosave = (struct pbe *)addr;
 
-	pr_debug("pmdisk: Reading pagedir (%d Pages)\n",n);
+	printk("pmdisk: Reading pagedir (%d Pages)\n",n);
 
-	for (i = 0; i < n && !error; i++, addr += PAGE_SIZE) {
-		unsigned long offset = swp_offset(swsusp_info.pagedir[i]);
-		if (offset)
-			error = bio_read_page(offset, (void *)addr);
-		else
-			error = -EFAULT;
+	pgdir_for_each(pgdir, next, pagedir_nosave) {
+		error = read_one_pagedir(pgdir, i);
+		if (error) break;
+		i++;
 	}
+	BUG_ON(i != n);
 	if (error)
-		free_pages((unsigned long)pagedir_nosave, pagedir_order);
+		pagedir_free(pagedir_nosave);
 	return error;
 }
 
@@ -1185,7 +1398,7 @@
 	if ((error = read_pagedir()))
 		return error;
 	if ((error = data_read()))
-		free_pages((unsigned long)pagedir_nosave, pagedir_order);
+		pagedir_free(pagedir_nosave);
 	return error;
 }
 
@@ -1217,3 +1430,50 @@
 		pr_debug("pmdisk: Error %d resuming\n", error);
 	return error;
 }
+
+/**
+ *  for_each_pbe_copy_back - C reference version of the copy-back loop
+ *
+ *  Useful as a starting point when writing the copy-back loop in assembly.
+ *
+ */
+/* #define CREATE_ASM_CODE */
+#ifdef CREATE_ASM_CODE
+#if 0 /* if your copy back code is running in real mode, enable it */
+#define GET_ADDRESS(x) __pa(x) 
+#else
+#define GET_ADDRESS(x) (x)
+#endif
+asmlinkage void for_each_pbe_copy_back(void)
+{
+	struct pbe *pgdir, *next;
+
+	pgdir = pagedir_nosave;
+	while (pgdir != NULL) {
+		unsigned long nums, i;
+		pgdir = (struct pbe *)GET_ADDRESS(pgdir);
+		next = (struct pbe*)pgdir->dummy.val;
+		for (nums = 0; nums < ONE_PAGE_PBE_NUM; nums++) {
+			register unsigned long *orig, *copy;
+			orig = (unsigned long *)pgdir->orig_address;
+			if (orig == 0) goto end;
+			orig = (unsigned long *)GET_ADDRESS(orig);
+			copy = (unsigned long *)GET_ADDRESS(pgdir->address);
+#if 0
+			memcpy(orig, copy, PAGE_SIZE);
+#else
+			for (i = 0; i < PAGE_SIZE / sizeof(unsigned long); i+=4) {
+				*(orig + i) = *(copy + i);
+				*(orig + i+1) = *(copy + i+1);
+				*(orig + i+2) = *(copy + i+2);
+				*(orig + i+3) = *(copy + i+3);
+			}
+#endif
+			pgdir ++;
+		}
+		pgdir = next;
+	}
+end:
+	panic("just asm code");
+}
+#endif
=== arch/i386/power/swsusp.S
==================================================================
--- arch/i386/power/swsusp.S  (revision 24520)
+++ arch/i386/power/swsusp.S  (local)
@@ -31,25 +31,34 @@
 	movl $swsusp_pg_dir-__PAGE_OFFSET,%ecx
 	movl %ecx,%cr3
 
-	movl	pagedir_nosave, %ebx
-	xorl	%eax, %eax
-	xorl	%edx, %edx
-	.p2align 4,,7
+	movl  pagedir_nosave, %eax
+	test %eax, %eax
+	je   copy_loop_end
+	movl  $1024, %edx
 
-copy_loop:
-	movl	4(%ebx,%edx),%edi
-	movl	(%ebx,%edx),%esi
+copy_loop_start:
+	movl  0xc(%eax), %ebp
+	xorl  %ebx, %ebx
+	leal  0x0(%esi),%esi
 
-	movl	$1024, %ecx
-	rep
-	movsl
+copy_one_pgdir:
+	movl  0x4(%eax),%edi
+	test %edi, %edi
+	je   copy_loop_end
 
-	incl	%eax
-	addl	$16, %edx
-	cmpl	nr_copy_pages,%eax
-	jb copy_loop
-	.p2align 4,,7
+	movl  (%eax), %esi
+	movl  %edx, %ecx
+	repz movsl %ds:(%esi),%es:(%edi)
 
+	incl  %ebx
+	addl  $0x10, %eax
+	cmpl  $0xff, %ebx
+	jbe  copy_one_pgdir
+	test %ebp, %ebp
+	movl  %ebp, %eax
+	jne  copy_loop_start
+copy_loop_end:
+
 	movl saved_context_esp, %esp
 	movl saved_context_ebp, %ebp
 	movl saved_context_ebx, %ebx
--
Hu Gang / Steve
Linux Registered User 204016
GPG Public Key: http://soulinfo.com/~hugang/hugang.asc

* Re: software suspend patch [1/6]
  2004-11-28 17:11         ` Pavel Machek
@ 2004-12-02 20:56           ` Guido Guenther
  0 siblings, 0 replies; 51+ messages in thread
From: Guido Guenther @ 2004-12-02 20:56 UTC (permalink / raw)
  To: Pavel Machek; +Cc: hugang, linux-kernel

On Sun, Nov 28, 2004 at 06:11:06PM +0100, Pavel Machek wrote:
> Hi!
> 
> > >  device-tree.diff 
> > >    based on suspend2, with small changes.
> > > 
> > >  core.diff
> > >   1: redefine struct pbe so that the pagedir no longer has to be contiguous.
> > >   2: shrink memory as little as possible.
> > >   3: use a bitmap to speed up the collision check during page relocation.
> > >   4: pagecache saving is ready.
> > > 
> > >  i386.diff
> > >  ppc.diff
> > >   i386 and powerpc suspend updates.
> > > 
> > >  pagecachs_addon.diff
> > >   must be used if pagecache saving is enabled; it makes saving the
> > >   pagecache safe. Idea from suspend2.
> > > 
> > >   ppcfix.diff
> > >   fixes a compile error.
> > >   $ gcc -v
> > >    .... 
> > >    gcc version 2.95.4 20011002 (Debian prerelease)
> > > 
> > > I'm using 2.6.9-ck3. With the above patches, swsusp1 works perfectly on my
> > > PowerPC and x86 PC with the Highmem and preempt options enabled.
> > > 
> > > I hope core.diff@1,@2,@3, i386.diff and ppc.diff will be merged into the
> > > mainline kernel ASAP :). From my point of view, device-tree.diff is
> > > very useful when using pagecache saving, and pagecachs_addon.diff
> > > is really a hack for making pagecache saving safe.
> > > 
> > 
> > --- 2.6.9-lzf/arch/ppc/syslib/open_pic.c	2004-11-26 12:32:58.000000000 +0800
> > +++ 2.6.9/arch/ppc/syslib/open_pic.c	2004-11-28 23:16:58.000000000 +0800
> > @@ -776,7 +776,8 @@ static void openpic_mapirq(u_int irq, cp
> >  	if (ISR[irq] == 0)
> >  		return;
> >  	if (!cpus_empty(keepmask)) {
> > -		cpumask_t irqdest = { .bits[0] = openpic_read(&ISR[irq]->Destination) };
> > +		cpumask_t irqdest;
> > +		irqdest.bits[0] = openpic_read(&ISR[irq]->Destination);
> >  		cpus_and(irqdest, irqdest, keepmask);
> >  		cpus_or(physmask, physmask, irqdest);
> >  	}
Doesn't the former initialize irqdest.bits as a whole to zero and set
.bits[0] to openpic_read(...), while the latter only sets
irqdest.bits[0] to openpic_read(...) and leaves the rest uninitialized?
I think this is wrong.
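
A small standalone sketch of the difference in question, assuming a simplified stand-in for cpumask_t (the mask_t type and all three helper names are made up for illustration and are not the kernel's definitions). In C99, a designated initializer zeroes every element it does not name, while a plain declaration followed by an assignment leaves the remaining words indeterminate unless they are cleared first:

```c
#include <assert.h>
#include <string.h>

#define MASK_WORDS 4	/* hypothetical; the real cpumask_t size depends on NR_CPUS */

typedef struct { unsigned long bits[MASK_WORDS]; } mask_t;

/* C99 form: bits[0] is set, every other element is implicitly zeroed. */
static mask_t mask_from_initializer(unsigned long dest)
{
	mask_t m = { .bits[0] = dest };
	return m;
}

/*
 * Replacement for compilers that reject the initializer above:
 * clear the whole mask, then set the first word. Without the
 * memset() the words bits[1..] would be indeterminate.
 */
static mask_t mask_from_assignment(unsigned long dest)
{
	mask_t m;

	memset(&m, 0, sizeof(m));
	m.bits[0] = dest;
	return m;
}

/* Returns 1 if both masks hold the same words, 0 otherwise. */
static int mask_equal(const mask_t *a, const mask_t *b)
{
	assert(a != NULL && b != NULL);
	return memcmp(a->bits, b->bits, sizeof(a->bits)) == 0;
}
```

In the patch above, the equivalent of the memset() would presumably be a cpus_clear(irqdest) before the assignment to irqdest.bits[0]; treat that as a suggestion rather than a tested fix.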
Cheers,
 -- Guido

* Re: software suspend patch [1/6]
  2004-11-29 15:43         ` hugang
@ 2005-01-09 22:43           ` Pavel Machek
  2005-01-11  2:01             ` hugang
  0 siblings, 1 reply; 51+ messages in thread
From: Pavel Machek @ 2005-01-09 22:43 UTC (permalink / raw)
  To: hugang; +Cc: linux-kernel

Hi!

> > I can not merge anything before 2.6.10. As you have seen, I have quite
> > a lot of patches in my tree, and I do not want mix them with these...
> > 
> > >  device-tree.diff 
> > >    based on suspend2, with small changes.
> > 
> > I do not want this one.
> > 
> > >  core.diff
> > >   1: redefine struct pbe so that the pagedir no longer has to be contiguous.
> > 
> > Can I get this one as a separate diff?
> 
> Here it is.

Do you have any updates? It would be nice to separate the non-continuous
pagedir from the check_pagedir speedup.

...plus check_pagedir should really use the PageNosaveFree flag instead of
allocating its own (big!) bitmaps. That should also make the code
simpler...
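
For reference, a userspace sketch of the bitmap scheme under discussion, with the i386 constants written out. The names below (PAGE_OFFSET_I386, collide_mark() and so on) are stand-ins chosen for this sketch, and the 3G/1G split is an assumption; the patch itself uses pointer2num() and an order-3 allocation. One bit per 4 KiB page means an order-3 bitmap (8 pages, 32 KiB) describes 262144 pages, i.e. 1 GiB of lowmem:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT_I386		12
#define PAGE_SIZE_I386		(1UL << PAGE_SHIFT_I386)
#define PAGE_OFFSET_I386	0xC0000000UL	/* assumed 3G/1G split */
#define COLLIDE_BITMAP_ORDER	3

/* Size of the bitmap in bytes: 2^order pages. */
static size_t collide_bitmap_bytes(void)
{
	return (1UL << COLLIDE_BITMAP_ORDER) * PAGE_SIZE_I386;
}

/* Number of pages (bits) the bitmap can describe. */
static unsigned long collide_bitmap_pages(void)
{
	return collide_bitmap_bytes() * 8;
}

/* Map a lowmem virtual address to its bit index. */
static unsigned long addr_to_bit(unsigned long addr)
{
	assert(addr >= PAGE_OFFSET_I386);
	return (addr - PAGE_OFFSET_I386) >> PAGE_SHIFT_I386;
}

/* Record that the page containing addr must not be reused. */
static void collide_mark(unsigned char *bitmap, unsigned long addr)
{
	unsigned long bit = addr_to_bit(addr);

	assert(bit < collide_bitmap_pages());
	bitmap[bit / 8] |= (unsigned char)(1u << (bit % 8));
}

/* Returns 1 if the page containing addr was marked, 0 otherwise. */
static int collide_test(const unsigned char *bitmap, unsigned long addr)
{
	unsigned long bit = addr_to_bit(addr);

	assert(bit < collide_bitmap_pages());
	return (bitmap[bit / 8] >> (bit % 8)) & 1;
}
```

The lookup is constant time per candidate page, where the old does_collide_order() scanned all nr_copy_pages entries; the price is the extra 32 KiB allocation at resume time.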
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

* Re: software suspend patch [1/6]
  2005-01-09 22:43           ` Pavel Machek
@ 2005-01-11  2:01             ` hugang
  2005-01-11  3:15               ` Pavel Machek
  0 siblings, 1 reply; 51+ messages in thread
From: hugang @ 2005-01-11  2:01 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

On Sun, Jan 09, 2005 at 11:43:25PM +0100, Pavel Machek wrote:
> Hi!
> 
> Do you have any updates? It would be nice to separate the non-continuous
> pagedir from the check_pagedir speedup.
> 
> ...plus check_pagedir should really use the PageNosaveFree flag instead of
> allocating its own (big!) bitmaps. That should also make the code
> simpler...
> 								Pavel

I'm very happy with the current swsusp; it is stable for me:
 2.6.10-mm1 + ppc patch from 
  http://honk.physik.uni-konstanz.de/~agx/linux-ppc/kernel/
 + your "free some memory" patch

I have been using it for a week; it has never failed and never oopsed. :)

The only problem is that relocating is a little slow.

I no longer think a non-continuous pagedir is really needed. Anyway, I'll
prepare a patch to make swsusp use a non-continuous pagedir.
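
To make the idea concrete, here is a rough userspace sketch of what a non-continuous pagedir looks like: single pages of entries chained through a link field, instead of one large contiguous array. The struct layout, ENTRIES_PER_PAGE and the helper names are assumptions for illustration and are not copied from the patch (which keeps the link in dummy.val of every entry; this sketch only uses the first entry of each chunk):

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096UL

/* Simplified stand-in for struct pbe; "next" plays the role of dummy.val. */
struct entry {
	unsigned long address;		/* address of the copy */
	unsigned long orig_address;	/* original address of the page */
	unsigned long swap_address;	/* location in swap */
	struct entry *next;		/* first entry of the next chunk, or NULL */
};

#define ENTRIES_PER_PAGE (SKETCH_PAGE_SIZE / sizeof(struct entry))

/* Allocate one zeroed chunk and link it behind prev (if any). */
static struct entry *chunk_alloc(struct entry *prev)
{
	struct entry *chunk;

	/* never overwrite an existing link */
	assert(prev == NULL || prev->next == NULL);
	chunk = calloc(ENTRIES_PER_PAGE, sizeof(*chunk));
	if (chunk && prev)
		prev->next = chunk;
	return chunk;
}

/* Return the entry with the given index, walking the chain; NULL if absent. */
static struct entry *entry_by_index(struct entry *head, unsigned long index)
{
	while (head && index >= ENTRIES_PER_PAGE) {
		head = head->next;
		index -= ENTRIES_PER_PAGE;
	}
	return head ? &head[index] : NULL;
}

/* Free every chunk in the chain. */
static void chain_free(struct entry *head)
{
	while (head) {
		struct entry *next = head->next;

		free(head);
		head = next;
	}
}
```

Every allocation here is a single page, so nothing depends on finding a large contiguous free block; the cost is that indexed access has to walk the chain.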

Any comments?

-- 
Hu Gang       .-.
              /v\
             // \\ 
Linux User  /(   )\  [204016]
GPG Key ID   ^^-^^   http://soulinfo.com/~hugang/hugang.asc

* Re: software suspend patch [1/6]
  2005-01-11  2:01             ` hugang
@ 2005-01-11  3:15               ` Pavel Machek
  0 siblings, 0 replies; 51+ messages in thread
From: Pavel Machek @ 2005-01-11  3:15 UTC (permalink / raw)
  To: hugang; +Cc: linux-kernel, xhejtman

Hi!

> > Do you have any updates? It would be nice to separate the non-continuous
> > pagedir from the check_pagedir speedup.
> > 
> > ...plus check_pagedir should really use the PageNosaveFree flag instead of
> > allocating its own (big!) bitmaps. That should also make the code
> > simpler...
> 
> I'm very happy with the current swsusp; it is stable for me:
>  2.6.10-mm1 + ppc patch from 
>   http://honk.physik.uni-konstanz.de/~agx/linux-ppc/kernel/
>  + your "free some memory" patch
> 
> I have been using it for a week; it has never failed and never oopsed. :)
> 
> The only problem is that relocating is a little slow.

I just got a very nice patch from Lukas Hejtmanek to relocate
faster... It would be great if you could test it.

> I no longer think a non-continuous pagedir is really needed. Anyway, I'll
> prepare a patch to make swsusp use a non-continuous pagedir.

Thanks.

								Pavel

-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

* Re: Suspend 2 merge
  2004-12-01 10:08                         ` Pavel Machek
@ 2004-12-01 20:39                           ` Nigel Cunningham
  0 siblings, 0 replies; 51+ messages in thread
From: Nigel Cunningham @ 2004-12-01 20:39 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Stefan Seyfried, Christoph Hellwig, Linux Kernel Mailing List,
	hugang, Andrew Morton

Hi.

On Wed, 2004-12-01 at 21:08, Pavel Machek wrote:
> Hi!
> 
> > > > > >>Putting only the absolutely necessary things into the kernel (the same
> > > > > >>is true for the interactive resume thing - if someone wants interactive
> > > > > >>startup at a failing resume, he has to use an initrd, i don't see a
> > > > > >>problem with that) will probably increase the acceptance a bit :-)
> > > > > > 
> > > > > > That's fine if your initrd is properly configured and you're willing to
> > > > > 
> > > > > This is something distributions have to take care of.
> > > > 
> > > > No; it's something the users will have to take care of. Distro makers
> > > > might make the process more automated, but in the end it's the user's
> > > > problem if it doesn't work.
> > > 
> > > Actually, no, it's not like that.
> > > 
> > > The user will click an icon in KDE, and if it does not suspend & resume
> > > properly, the distribution has a problem to fix. And yes, it works well in
> > > SUSE9.2.
> > 
> > I didn't know you had support for initramfs and initrd configurations
> > already. You are making progress.
> 
> Well, no, not that one.
> 
> OTOH for SUSE9.2 these things basically can not happen. (There's no
> wrong kernel you can click on ;-) -- either you boot normally, then
> there's just one kernel to boot, or you boot failsafe, and then you
> want to kill signatures etc.) 

And it's impossible for you to accidentally select the failsafe?

Anyway, even if it is impossible, you're still only talking about one
particular set of circumstances; that's not necessarily the case for
everyone else, nor should it necessarily be.

Regards,

Nigel
-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6


* Re: Suspend 2 merge
  2004-12-01  9:27                       ` Nigel Cunningham
@ 2004-12-01 10:08                         ` Pavel Machek
  2004-12-01 20:39                           ` Nigel Cunningham
  0 siblings, 1 reply; 51+ messages in thread
From: Pavel Machek @ 2004-12-01 10:08 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Stefan Seyfried, Christoph Hellwig, Linux Kernel Mailing List,
	hugang, Andrew Morton

Hi!

> > > > >>Putting only the absolutely necessary things into the kernel (the same
> > > > >>is true for the interactive resume thing - if someone wants interactive
> > > > >>startup at a failing resume, he has to use an initrd, i don't see a
> > > > >>problem with that) will probably increase the acceptance a bit :-)
> > > > > 
> > > > > That's fine if your initrd is properly configured and you're willing to
> > > > 
> > > > This is something distributions have to take care of.
> > > 
> > > No; it's something the users will have to take care of. Distro makers
> > > might make the process more automated, but in the end it's the user's
> > > problem if it doesn't work.
> > 
> > Actually, no, it's not like that.
> > 
> > The user will click an icon in KDE, and if it does not suspend & resume
> > properly, the distribution has a problem to fix. And yes, it works well in
> > SUSE9.2.
> 
> I didn't know you had support for initramfs and initrd configurations
> already. You are making progress.

Well, no, not that one.

OTOH for SUSE9.2 these things basically can not happen. (There's no
wrong kernel you can click on ;-) -- either you boot normally, then
there's just one kernel to boot, or you boot failsafe, and then you
want to kill signatures etc.) 
								Pavel

-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

* Re: Suspend 2 merge
  2004-11-30 22:20                     ` Pavel Machek
@ 2004-12-01  9:27                       ` Nigel Cunningham
  2004-12-01 10:08                         ` Pavel Machek
  0 siblings, 1 reply; 51+ messages in thread
From: Nigel Cunningham @ 2004-12-01  9:27 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Stefan Seyfried, Christoph Hellwig, Linux Kernel Mailing List,
	hugang, Andrew Morton

Hi.

On Wed, 2004-12-01 at 09:20, Pavel Machek wrote:
> Hi!
> 
> > > >>Putting only the absolutely necessary things into the kernel (the same
> > > >>is true for the interactive resume thing - if someone wants interactive
> > > >>startup at a failing resume, he has to use an initrd, i don't see a
> > > >>problem with that) will probably increase the acceptance a bit :-)
> > > > 
> > > > That's fine if your initrd is properly configured and you're willing to
> > > 
> > > This is something distributions have to take care of.
> > 
> > No; it's something the users will have to take care of. Distro makers
> > might make the process more automated, but in the end it's the user's
> > problem if it doesn't work.
> 
> Actually, no, it's not like that.
> 
> The user will click an icon in KDE, and if it does not suspend & resume
> properly, the distribution has a problem to fix. And yes, it works well in
> SUSE9.2.

I didn't know you had support for initramfs and initrd configurations
already. You are making progress.

Nigel
-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6


* Re: Suspend 2 merge
  2004-11-30 21:16                   ` Nigel Cunningham
@ 2004-11-30 22:20                     ` Pavel Machek
  2004-12-01  9:27                       ` Nigel Cunningham
  0 siblings, 1 reply; 51+ messages in thread
From: Pavel Machek @ 2004-11-30 22:20 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Stefan Seyfried, Christoph Hellwig, Linux Kernel Mailing List,
	hugang, Andrew Morton

Hi!

> > >>Putting only the absolutely necessary things into the kernel (the same
> > >>is true for the interactive resume thing - if someone wants interactive
> > >>startup at a failing resume, he has to use an initrd, i don't see a
> > >>problem with that) will probably increase the acceptance a bit :-)
> > > 
> > > That's fine if your initrd is properly configured and you're willing to
> > 
> > This is something distributions have to take care of.
> 
> No; it's something the users will have to take care of. Distro makers
> might make the process more automated, but in the end it's the user's
> problem if it doesn't work.

Actually, no, it's not like that.

The user will click an icon in KDE, and if it does not suspend & resume
properly, the distribution has a problem to fix. And yes, it works well in
SUSE9.2.
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

* Re: Suspend 2 merge
  2004-11-30 12:16                 ` Stefan Seyfried
@ 2004-11-30 21:16                   ` Nigel Cunningham
  2004-11-30 22:20                     ` Pavel Machek
  0 siblings, 1 reply; 51+ messages in thread
From: Nigel Cunningham @ 2004-11-30 21:16 UTC (permalink / raw)
  To: Stefan Seyfried
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang,
	Andrew Morton, Pavel Machek

Hi.

On Tue, 2004-11-30 at 23:16, Stefan Seyfried wrote:
> >>>change a parameter or forcing them to do an ls in /dev with obscure
> >>>parameters (to get the major and minor numbers) when they already know
> >>>they want /dev/sda1 isn't user friendly. Obviously user friendliness is 
> >>
> >>This can easily be done by a userspace helper. You do use the
> >>(userspace) X server to display your GUI, don't you?
> > 
> > No. Not at all. All of userspace is well and truly wedged in a block of
> > ice by then.
> 
> you are not changing the suspend device after freezing userspace, or i
> am getting something horribly wrong here.

No, it doesn't change once userspace is frozen; you're correct.

> so if you have 2 choices of an interface:
> 1) more complex kernel code, but you can do "echo /dev/name > /proc/foo"
> 2) less complex kernel code, now you have a userspace helper e.g.
>    "suspend_ctl foodev /dev/name" which then does the magic number
>    calculations in userspace and puts the magic number into the kernel.
> 
> I think that interface 2) would be preferred by most kernel developers.
> Especially since this is code only needed on a relatively small subset
> of all linux installations.
> 
> There is a "top" userspace program to parse kernel numbers, we don't
> have "/proc/top".

Forgive me for asking a stupid question, but why all this fuss when the
code is already in the kernel? And it isn't really that complex anyway.
Instead of whatever would be needed to parse a major and minor, we have

        resume_device = name_to_dev_t(commandline);

Is it really worth all this heat for that call and for making two
routines (name_to_dev_t and try_name, IIRC) not be __init? It seems to
me that it's far more complex to create some userspace program to do
this stuff.

> >>Putting only the absolutely necessary things into the kernel (the same
> >>is true for the interactive resume thing - if someone wants interactive
> >>startup at a failing resume, he has to use an initrd, i don't see a
> >>problem with that) will probably increase the acceptance a bit :-)
> > 
> > That's fine if your initrd is properly configured and you're willing to
> 
> This is something distributions have to take care of.

No; it's something the users will have to take care of. Distro makers
might make the process more automated, but in the end it's the user's
problem if it doesn't work.

> > add extra cruft to the kernel so userspace can get the info it needs,
> 
> not much extra cruft is needed. The "echo resume > /sys/power/state"
> just returns (which it wouldn't if the resume was successful), then you
> can decide what to do next.
> 
> > and report what the user wants to do. If, however, you don't use an
> > initrd, you're sunk.
> 
> yes. There are other prerequisites for suspend than using an initrd
> though (you need a computer :-). If you don't use an initrd, you cannot
> use the interactive features but have to decide at compile time which
> way to go if the resume fails. That's life.

Have you looked at the code for handling this? It's really very simple.

> > Regarding acceptance, there's no point in getting it accepted into the
> > kernel if we end up with something that's user-unfriendly. I think it
> > will help a lot if we agree that suspend does need to blur the lines
> > between kernel and userspace a little, in the interests of providing
> > software that is superior.
> 
> User-friendliness is a joint effort of kernel and userspace. The user
> does not care who does the work when he clicks on his "hibernation" icon
> in the taskbar. (The same is true for users of a hibernation script).
> Actually, the thing that makes suspend2 more reliable than swsusp is
> probably the very good hibernation script (userspace) that saves users
> the reading of documentation since it automatically unloads all critical
> modules etc. For me, Pavel's later versions as in SUSE 9.2 have worked
> out of the box on every non-SMP i386 notebook i have laid my hands on in
> the last 6 months (thanks to userspace taking care of bad modules etc).

Have those boxes had DRI enabled or serious USB usage? I'd be surprised
if you haven't run into the same problems we have.

Regards,

Nigel
-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6



* Re: Suspend 2 merge
  2004-11-29 22:20               ` Nigel Cunningham
  2004-11-29 22:34                 ` Pavel Machek
@ 2004-11-30 12:16                 ` Stefan Seyfried
  2004-11-30 21:16                   ` Nigel Cunningham
  1 sibling, 1 reply; 51+ messages in thread
From: Stefan Seyfried @ 2004-11-30 12:16 UTC (permalink / raw)
  To: ncunningham
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang,
	Andrew Morton, Pavel Machek

Hi,

Nigel Cunningham wrote:

> On Mon, 2004-11-29 at 20:34, Stefan Seyfried wrote:

>>and not everyone who downloads suspend2 uses it ;-)
> 
> Yes... I'd say the relative percentage would be much higher, though.

you are probably right here :-)

>>>change a parameter or forcing them to do an ls in /dev with obscure
>>>parameters (to get the major and minor numbers) when they already know
>>>they want /dev/sda1 isn't user friendly. Obviously user friendliness is 
>>
>>This can easily be done by a userspace helper. You do use the
>>(userspace) X server to display your GUI, don't you?
> 
> No. Not at all. All of userspace is well and truly wedged in a block of
> ice by then.

you are not changing the suspend device after freezing userspace, or i
am getting something horribly wrong here.

so if you have 2 choices of an interface:
1) more complex kernel code, but you can do "echo /dev/name > /proc/foo"
2) less complex kernel code, now you have a userspace helper e.g.
   "suspend_ctl foodev /dev/name" which then does the magic number
   calculations in userspace and puts the magic number into the kernel.

I think that interface 2) would be preferred by most kernel developers.
Especially since this is code only needed on a relatively small subset
of all linux installations.

There is a "top" userspace program to parse kernel numbers, we don't
have "/proc/top".
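
The helper in option 2) could be as small as the following Python
sketch. It is only an illustration under stated assumptions: the names
device_numbers and format_for_kernel are made up here, and the
"major:minor" text form is an assumed encoding of the magic number, not
something this thread specifies. Handing the value to the kernel is
left out on purpose, since "/proc/foo" above is a placeholder.

```python
import os
import stat


def device_numbers(path):
    """Return (major, minor) for the device node at `path`."""
    st = os.stat(path)
    # Only block and character device nodes carry a meaningful st_rdev.
    if not (stat.S_ISBLK(st.st_mode) or stat.S_ISCHR(st.st_mode)):
        raise ValueError("%s is not a device node" % path)
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def format_for_kernel(major, minor):
    """Assumed text encoding of the device number: "major:minor"."""
    return "%d:%d" % (major, minor)
```

A caller would run format_for_kernel(*device_numbers("/dev/sda1")) and
write the resulting string to whatever file the kernel ends up exposing.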

>>Putting only the absolutely necessary things into the kernel (the same
>>is true for the interactive resume thing - if someone wants interactive
>>startup at a failing resume, he has to use an initrd, i don't see a
>>problem with that) will probably increase the acceptance a bit :-)
> 
> That's fine if your initrd is properly configured and you're willing to

This is something distributions have to take care of.

> add extra cruft to the kernel so userspace can get the info it needs,

not much extra cruft is needed. The "echo resume > /sys/power/state"
just returns (which it wouldn't if the resume was successful), then you
can decide what to do next.
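
The control flow this relies on, where the write simply comes back if
no resume happened, might look like this in an initrd helper. A sketch
only: attempt_resume, write_state and on_failure are hypothetical
names, and the write is passed in as a callable so the example never
touches a real /sys file.

```python
def attempt_resume(write_state, on_failure):
    """Try to resume; run the fallback only if the attempt returns.

    `write_state` stands in for writing "resume" to the state file. On a
    successful resume the saved image replaces the running system, so
    control never comes back here.
    """
    try:
        write_state("resume")
    except OSError as err:
        # The write itself was rejected, e.g. no resume support.
        return on_failure("write failed: %s" % err)
    # Reaching this line means no image was restored.
    return on_failure("resume returned without restoring an image")
```

The fallback is where an interactive prompt, or a plain "continue
booting", would go.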

> and report what the user wants to do. If, however, you don't use an
> initrd, you're sunk.

yes. There are other prerequisites for suspend than using an initrd
though (you need a computer :-). If you don't use an initrd, you cannot
use the interactive features but have to decide at compile time which
way to go if the resume fails. That's life.

> Regarding acceptance, there's no point in getting it accepted into the
> kernel if we end up with something that's user-unfriendly. I think it
> will help a lot if we agree that suspend does need to blur the lines
> between kernel and userspace a little, in the interests of providing
> software that is superior.

User-friendliness is a joint effort of kernel and userspace. The user
does not care who does the work when he clicks on his "hibernation" icon
in the taskbar. (The same is true for users of a hibernation script).
Actually, the thing that makes suspend2 more reliable than swsusp is
probably the very good hibernation script (userspace) that saves users
the reading of documentation since it automatically unloads all critical
modules etc. For me, Pavel's later versions as in SUSE 9.2 have worked
out of the box on every non-SMP i386 notebook i have laid my hands on in
the last 6 months (thanks to userspace taking care of bad modules etc).

Regards,

   Stefan


* Re: Suspend 2 merge
  2004-11-30  0:24                           ` Nigel Cunningham
@ 2004-11-30 10:19                             ` Pavel Machek
  0 siblings, 0 replies; 51+ messages in thread
From: Pavel Machek @ 2004-11-30 10:19 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang, Andrew Morton

Hi!

> > I'm not sure if I want to do full page-cache saving (and without that,
> > half-of-memory limit does not bite too badly). "Everything is swapped
> > out" problem is actually not limited to swsusp, updatedb overnight
> > tends to have the same effect. Perhaps more generic solution is
> > needed...
> 
> Would increases in the amount of memory machines have make this bite
> more and more over time?

Actually, it should bite less and less, because amount of memory
actually used does not seem to grow as fast as amount of memory
available. On 4MB machine, I could imagine kernel using >2MB memory
and "half-memory-free" trick not working at all. On 1GB
machine... well kernel will never use >512MB of memory, so we are safe.
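
Spelled out as arithmetic, under a deliberately simplified model in
which every page still in use needs one free page for its atomic copy
(the helper names are made up for illustration, and bookkeeping
overhead is ignored):

```python
def fits_atomic_copy(total_mb, in_use_mb):
    """True if the in-use pages and their copy fit in RAM together."""
    return 2 * in_use_mb <= total_mb


def must_free_mb(total_mb, in_use_mb):
    """Memory to swap out or discard before the copy fits."""
    return max(0, in_use_mb - total_mb // 2)
```

Taking 1GB as 1024MB, a machine with 800MB in use would have to shed
288MB before suspending, while one with 250MB in use needs to free
nothing.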

> I guess the more generic solution would be to abandon using bio calls
> and have your own device driver that could write the whole image to disk
> without having to do the atomic copy. You'd have to write a lot of
> support for drivers, though. I'd find it hard to imagine it being worth
> the effort.

That would mean rewriting half of kernel.

> > cat `cat /proc/[0-9]*/maps | grep / | sed 's:.* /:/:' | sort -u` > /dev/null
> 
> What does this do?

Attempts to load all the binaries into memory. Poor man's "make
machine responsive after swsusp".
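
For anyone who finds the pipeline dense, roughly the same steps in
Python. This is a sketch, not a drop-in replacement: mapped_files,
prefetch and prefetch_all_processes are names invented here, and unlike
the sed expression this version takes everything after the inode column
as the path.

```python
import glob


def mapped_files(maps_text):
    """Collect the distinct file paths named in /proc/<pid>/maps text."""
    paths = set()
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5].startswith("/"):
            paths.add(fields[5].rstrip())
    return sorted(paths)


def prefetch(paths):
    """Read each file to the end so its pages land in the page cache."""
    total = 0
    for path in paths:
        try:
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    total += len(chunk)
        except OSError:
            # Deleted or unreadable mappings are skipped.
            continue
    return total


def prefetch_all_processes():
    """Same scope as the one-liner: every numeric /proc entry's maps."""
    text = []
    for maps in glob.glob("/proc/[0-9]*/maps"):
        try:
            with open(maps) as f:
                text.append(f.read())
        except OSError:
            continue  # process exited or access denied
    return prefetch(mapped_files("\n".join(text)))
```

Like the shell version, it only warms the cache for file-backed
mappings; anonymous memory that was swapped out is not touched.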
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: Suspend 2 merge
  2004-11-29 13:03                         ` Pavel Machek
@ 2004-11-30  0:24                           ` Nigel Cunningham
  2004-11-30 10:19                             ` Pavel Machek
  0 siblings, 1 reply; 51+ messages in thread
From: Nigel Cunningham @ 2004-11-30  0:24 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang, Andrew Morton

Hi.

On Tue, 2004-11-30 at 00:03, Pavel Machek wrote:
> > If we want to compare apples with apples, we're going to have to make
> > the only difference which code is run. A normal load on my computer is
> > evolution, cyrus imapd, opera, win4lin running Libronix and a kernel
> > tree in the cache (last image sizes were 1000, 1002, 995, 949 and
> > 910MB). I'm happy to run your sped-up code for some tests, if you'd
> > like. You know where to find mine if you want to make sure I'm not
> > cheating :>
> 
> Okay, I started galeon (no opera here :-(), evolution, xpdf,
> oowriter. Well, it is not going to be too much "apples-to-apples"
> since swsusp1 cheats and discards caches (etc). Machine has 1GB memory
> total, before suspend attempt 800MB were in use. Suspend took 20
> seconds, after resume (and some swap-in) 250MB was in use.

Are you able to time up to when the swap in is finished? Without that,
we're not really comparing apples with apples, it seems.

> > > > These discussions are getting really unreasonable. "I don't want that
> > > > feature, therefore it shouldn't be merged" isn't a valid argument.
> > > > Neither is "Well, I can suspend in seven seconds with hardly any memory
> > > > in use." If you just don't want suspend2 in the kernel, come out and say
> > > > it. 
> > > 
> > > Ok, "I do not want suspend2 in kernel". Not what you'd call suspend2,
> > > anyway. I thought that stripping down suspend2 then merging it is
> > > reasonable way to go, but now it seems to me that enhancing swsusp1 is
> > > easier way to go. At least I'll be able to do it incrementally.
> > 
> > You'll be able to do that within limits, but once you do seriously give
> > up on the max-half-of-memory limit, you'll need some major redesigning.
> > If that's the way you want to go, okay. Assuming nothing else changes,
> 
> I'm not sure if I want to do full page-cache saving (and without that,
> half-of-memory limit does not bite too badly). "Everything is swapped
> out" problem is actually not limited to swsusp, updatedb overnight
> tends to have the same effect. Perhaps more generic solution is
> needed...

Would increases in the amount of memory machines have make this bite
more and more over time?

I guess the more generic solution would be to abandon using bio calls
and have your own device driver that could write the whole image to disk
without having to do the atomic copy. You'd have to write a lot of
support for drivers, though. I'd find it hard to imagine it being worth
the effort.

> cat `cat /proc/[0-9]*/maps | grep / | sed 's:.* /:/:' | sort -u` > /dev/null

What does this do?

> does solve part of the problem. (Another problem is how to actually
> measure improvements in this area).

Yes; that's always an 'interesting' issue :>

Nigel
-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6



* Re: Suspend 2 merge
  2004-11-29 22:20               ` Nigel Cunningham
@ 2004-11-29 22:34                 ` Pavel Machek
  2004-11-30 12:16                 ` Stefan Seyfried
  1 sibling, 0 replies; 51+ messages in thread
From: Pavel Machek @ 2004-11-29 22:34 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Stefan Seyfried, Christoph Hellwig, Linux Kernel Mailing List,
	hugang, Andrew Morton

Hi!

> > >>>:> But not everyone who uses 2.6.9 uses swsusp. :>
> > 
> > and not everyone who downloads suspend2 uses it ;-)
> 
> Yes... I'd say the relative percentage would be much higher, though.

Agreed.

> > > change a parameter or forcing them to do an ls in /dev with obscure
> > > parameters (to get the major and minor numbers) when they already know
> > > they want /dev/sda1 isn't user friendly. Obviously user friendliness is 
> > 
> > This can easily be done by a userspace helper. You do use the
> > (userspace) X server to display your GUI, don't you?
> 
> No. Not at all. All of userspace is well and truly wedged in a block of
> ice by then.

I think that was not what Stefan wanted to say.

> Regarding acceptance, there's no point in getting it accepted into the
> kernel if we end up with something that's user-unfriendly. I think it
> will help a lot if we agree that suspend does need to blur the lines
> between kernel and userspace a little, in the interests of providing
> software that is superior.

I guess we'll have to agree to disagree here. I do not think suspend
is special enough to blur the lines...
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: Suspend 2 merge
  2004-11-29  9:34             ` Stefan Seyfried
@ 2004-11-29 22:20               ` Nigel Cunningham
  2004-11-29 22:34                 ` Pavel Machek
  2004-11-30 12:16                 ` Stefan Seyfried
  0 siblings, 2 replies; 51+ messages in thread
From: Nigel Cunningham @ 2004-11-29 22:20 UTC (permalink / raw)
  To: Stefan Seyfried
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang,
	Andrew Morton, Pavel Machek

Hi.

On Mon, 2004-11-29 at 20:34, Stefan Seyfried wrote:
> Nigel Cunningham wrote:
> 
> > The cryptoapi provides support for both compression and encryption. I'd
> > happily make use of that, but we still need a way for the user to choose
> > what compression/encryption they want and configure it. I'm not at all
> 
> And encryption is in fact much more needed than compression. Remember,
> you are writing everything in memory (including maybe ssh passphrases or
> gpg keys) to swap in clear text. Not nice. And i agree that compression
> is nice to have, too.
> 
> >>>:> But not everyone who uses 2.6.9 uses swsusp. :>
> 
> and not everyone who downloads suspend2 uses it ;-)

Yes... I'd say the relative percentage would be much higher, though.

> > change a parameter or forcing them to do an ls in /dev with obscure
> > parameters (to get the major and minor numbers) when they already know
> > they want /dev/sda1 isn't user friendly. Obviously user friendliness is 
> 
> This can easily be done by a userspace helper. You do use the
> (userspace) X server to display your GUI, don't you?

No. Not at all. All of userspace is well and truly wedged in a block of
ice by then.

> Putting only the absolutely necessary things into the kernel (the same
> is true for the interactive resume thing - if someone wants interactive
> startup at a failing resume, he has to use an initrd, i don't see a
> problem with that) will probably increase the acceptance a bit :-)

That's fine if your initrd is properly configured and you're willing to
add extra cruft to the kernel so userspace can get the info it needs,
and report what the user wants to do. If, however, you don't use an
initrd, you're sunk.

Regarding acceptance, there's no point in getting it accepted into the
kernel if we end up with something that's user-unfriendly. I think it
will help a lot if we agree that suspend does need to blur the lines
between kernel and userspace a little, in the interests of providing
software that is superior.

Regards,

Nigel
-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6



* Re: Suspend 2 merge
  2004-11-29  3:20                       ` Nigel Cunningham
@ 2004-11-29 13:03                         ` Pavel Machek
  2004-11-30  0:24                           ` Nigel Cunningham
  0 siblings, 1 reply; 51+ messages in thread
From: Pavel Machek @ 2004-11-29 13:03 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang, Andrew Morton

Hi!

> > > > My machine suspends in 7 seconds, and that's swsusp1. According to
> > > > your numbers, suspend2 should suspend it in 1 second and LZE
> > > > compressed should be .5 second.
> > > 
> > > Seven seconds? How much memory is in use when you start, and how much is
> > > actually written to disk? If you're starting with 1GB of RAM in use,
> > > I'll sit up and listen, but I suspect you're talking about something
> > > closer to 20MB and init S :>
> > 
> > It was on .5GB machine, with X running, IIRC. Specify how should I
> > load the system and I'll try it here. swsusp1 got some speedups with
> > O(n^2) killing (not yet merged).
> 
> So it wrote .5GB of memory in seven seconds, or started with .5GB of RAM
> in use?

Machine had .5GB total, not sure how much was really used.

> If we want to compare apples with apples, we're going to have to make
> the only difference which code is run. A normal load on my computer is
> evolution, cyrus imapd, opera, win4lin running Libronix and a kernel
> tree in the cache (last image sizes were 1000, 1002, 995, 949 and
> 910MB). I'm happy to run your sped-up code for some tests, if you'd
> like. You know where to find mine if you want to make sure I'm not
> cheating :>

Okay, I started galeon (no opera here :-(), evolution, xpdf,
oowriter. Well, it is not going to be too much "apples-to-apples"
since swsusp1 cheats and discards caches (etc). Machine has 1GB memory
total, before suspend attempt 800MB were in use. Suspend took 20
seconds, after resume (and some swap-in) 250MB was in use.

> > > These discussions are getting really unreasonable. "I don't want that
> > > feature, therefore it shouldn't be merged" isn't a valid argument.
> > > Neither is "Well, I can suspend in seven seconds with hardly any memory
> > > in use." If you just don't want suspend2 in the kernel, come out and say
> > > it. 
> > 
> > Ok, "I do not want suspend2 in kernel". Not what you'd call suspend2,
> > anyway. I thought that stripping down suspend2 then merging it is
> > reasonable way to go, but now it seems to me that enhancing swsusp1 is
> > easier way to go. At least I'll be able to do it incrementally.
> 
> You'll be able to do that within limits, but once you do seriously give
> up on the max-half-of-memory limit, you'll need some major redesigning.
> If that's the way you want to go, okay. Assuming nothing else changes,

I'm not sure if I want to do full page-cache saving (and without that,
half-of-memory limit does not bite too badly). "Everything is swapped
out" problem is actually not limited to swsusp, updatedb overnight
tends to have the same effect. Perhaps more generic solution is
needed...

cat `cat /proc/[0-9]*/maps | grep / | sed 's:.* /:/:' | sort -u` > /dev/null

does solve part of the problem. (Another problem is how to actually
measure improvements in this area).
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: Suspend 2 merge
  2004-11-25 23:46           ` Nigel Cunningham
  2004-11-26  0:39             ` Pavel Machek
@ 2004-11-29  9:34             ` Stefan Seyfried
  2004-11-29 22:20               ` Nigel Cunningham
  1 sibling, 1 reply; 51+ messages in thread
From: Stefan Seyfried @ 2004-11-29  9:34 UTC (permalink / raw)
  To: ncunningham
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang,
	Andrew Morton, Pavel Machek

Nigel Cunningham wrote:

> The cryptoapi provides support for both compression and encryption. I'd
> happily make use of that, but we still need a way for the user to choose
> what compression/encryption they want and configure it. I'm not at all

And encryption is in fact much more needed than compression. Remember,
you are writing everything in memory (including maybe ssh passphrases or
gpg keys) to swap in clear text. Not nice. And i agree that compression
is nice to have, too.

>>>:> But not everyone who uses 2.6.9 uses swsusp. :>

and not everyone who downloads suspend2 uses it ;-)

> change a parameter or forcing them to do an ls in /dev with obscure
> parameters (to get the major and minor numbers) when they already know
> they want /dev/sda1 isn't user friendly. Obviously user friendliness is 

This can easily be done by a userspace helper. You do use the
(userspace) X server to display your GUI, don't you?
Putting only the absolutely necessary things into the kernel (the same
is true for the interactive resume thing - if someone wants interactive
startup at a failing resume, he has to use an initrd, i don't see a
problem with that) will probably increase the acceptance a bit :-)

Best regards,

   Stefan



* Re: Suspend 2 merge
  2004-11-28 23:55                     ` Pavel Machek
@ 2004-11-29  3:20                       ` Nigel Cunningham
  2004-11-29 13:03                         ` Pavel Machek
  0 siblings, 1 reply; 51+ messages in thread
From: Nigel Cunningham @ 2004-11-29  3:20 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang, Andrew Morton

Hi.

On Mon, 2004-11-29 at 10:55, Pavel Machek wrote:
> Hi!
> 
> > > My machine suspends in 7 seconds, and that's swsusp1. According to
> > > your numbers, suspend2 should suspend it in 1 second and LZE
> > > compressed should be .5 second.
> > 
> > Seven seconds? How much memory is in use when you start, and how much is
> > actually written to disk? If you're starting with 1GB of RAM in use,
> > I'll sit up and listen, but I suspect you're talking about something
> > closer to 20MB and init S :>
> 
> It was on .5GB machine, with X running, IIRC. Specify how should I
> load the system and I'll try it here. swsusp1 got some speedups with
> O(n^2) killing (not yet merged).

So it wrote .5GB of memory in seven seconds, or started with .5GB of RAM
in use?

If we want to compare apples with apples, we're going to have to make
the only difference which code is run. A normal load on my computer is
evolution, cyrus imapd, opera, win4lin running Libronix and a kernel
tree in the cache (last image sizes were 1000, 1002, 995, 949 and
910MB). I'm happy to run your sped-up code for some tests, if you'd
like. You know where to find mine if you want to make sure I'm not
cheating :>

> > These discussions are getting really unreasonable. "I don't want that
> > feature, therefore it shouldn't be merged" isn't a valid argument.
> > Neither is "Well, I can suspend in seven seconds with hardly any memory
> > in use." If you just don't want suspend2 in the kernel, come out and say
> > it. 
> 
> Ok, "I do not want suspend2 in kernel". Not what you'd call suspend2,
> anyway. I thought that stripping down suspend2 then merging it is
> reasonable way to go, but now it seems to me that enhancing swsusp1 is
> easier way to go. At least I'll be able to do it incrementally.

You'll be able to do that within limits, but once you do seriously give
up on the max-half-of-memory limit, you'll need some major redesigning.
If that's the way you want to go, okay. Assuming nothing else changes,
I'll just keep suspend2 alive outside of the kernel tree until you get
sick of users asking, and continue to enhance it.

> I'm sorry about all the confusion, and you can still get that jpeg for
> "put pavel into doom3".

I'm not taking it personally at all. I did find some of the objections
pretty petty and some of the comparisons grossly unfair, but I'm not
taking it personally.
-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6



* Re: Suspend 2 merge
  2004-11-28 22:35                   ` Nigel Cunningham
@ 2004-11-28 23:55                     ` Pavel Machek
  2004-11-29  3:20                       ` Nigel Cunningham
  0 siblings, 1 reply; 51+ messages in thread
From: Pavel Machek @ 2004-11-28 23:55 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang, Andrew Morton

Hi!

> > My machine suspends in 7 seconds, and that's swsusp1. According to
> > your numbers, suspend2 should suspend it in 1 second and LZE
> > compressed should be .5 second.
> 
> Seven seconds? How much memory is in use when you start, and how much is
> actually written to disk? If you're starting with 1GB of RAM in use,
> I'll sit up and listen, but I suspect you're talking about something
> closer to 20MB and init S :>

It was on .5GB machine, with X running, IIRC. Specify how should I
load the system and I'll try it here. swsusp1 got some speedups with
O(n^2) killing (not yet merged).

> These discussions are getting really unreasonable. "I don't want that
> feature, therefore it shouldn't be merged" isn't a valid argument.
> Neither is "Well, I can suspend in seven seconds with hardly any memory
> in use." If you just don't want suspend2 in the kernel, come out and say
> it. 

Ok, "I do not want suspend2 in kernel". Not what you'd call suspend2,
anyway. I thought that stripping down suspend2 then merging it is
reasonable way to go, but now it seems to me that enhancing swsusp1 is
easier way to go. At least I'll be able to do it incrementally.

I'm sorry about all the confusion, and you can still get that jpeg for
"put pavel into doom3".
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: Suspend 2 merge
  2004-11-26 12:38                 ` Pavel Machek
  2004-11-26 15:54                   ` Christoph Hellwig
@ 2004-11-28 22:35                   ` Nigel Cunningham
  2004-11-28 23:55                     ` Pavel Machek
  1 sibling, 1 reply; 51+ messages in thread
From: Nigel Cunningham @ 2004-11-28 22:35 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang, Andrew Morton

Hi.

On Fri, 2004-11-26 at 23:38, Pavel Machek wrote:
> My machine suspends in 7 seconds, and that's swsusp1. According to
> your numbers, suspend2 should suspend it in 1 second and LZE
> compressed should be .5 second.

Seven seconds? How much memory is in use when you start, and how much is
actually written to disk? If you're starting with 1GB of RAM in use,
I'll sit up and listen, but I suspect you're talking about something
closer to 20MB and init S :>

These discussions are getting really unreasonable. "I don't want that
feature, therefore it shouldn't be merged" isn't a valid argument.
Neither is "Well, I can suspend in seven seconds with hardly any memory
in use." If you just don't want suspend2 in the kernel, come out and say
it. But please, stop giving me lame arguments (more below deleted rather
than replied to).

Nigel
-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6



* Re: Suspend 2 merge
  2004-11-26  0:39             ` Pavel Machek
                                 ` (2 preceding siblings ...)
       [not found]               ` <20041126043203.GA2713@hugang.soulinfo.com>
@ 2004-11-28 21:40               ` Nigel Cunningham
  3 siblings, 0 replies; 51+ messages in thread
From: Nigel Cunningham @ 2004-11-28 21:40 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang, Andrew Morton

Hi.

On Fri, 2004-11-26 at 11:39, Pavel Machek wrote:
> I'm not *that* concerned about speed. Getting rid of order-8 is
> for preventing "sorry, not enough RAM to suspend to disk".

That's fine, but you're only expressing your preference. I'm going to
ignore the other postings from the weekend that essentially say the same
thing; there's no point.

-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6



* Re: Suspend 2 merge
  2004-11-26 15:54                   ` Christoph Hellwig
@ 2004-11-26 22:36                     ` Pavel Machek
  0 siblings, 0 replies; 51+ messages in thread
From: Pavel Machek @ 2004-11-26 22:36 UTC (permalink / raw)
  To: Christoph Hellwig, Nigel Cunningham, Linux Kernel Mailing List,
	hugang, Andrew Morton

Hi!

> > > Again, when you're running on limited time, twice as fast is still twice
> > > as fast.
> > 
> > My machine suspends in 7 seconds, and that's swsusp1. According to
> > your numbers, suspend2 should suspend it in 1 second and LZE
> > compressed should be .5 second.
> > 
> > I'd say "who cares". 7 seconds seems like fast enough for me. And I'm
> > *not* going to add 2000 lines of code for 500msec speedup during
> > suspend.
> 
> Yupp.  Premature optimization is the root of all evil.  swsusp is
> 
>  a) an absolute slowpath compared to any normal kernel operation,
>     and called extremely seldom
>  b) only useful for a small subset of all linux instances
> 
> hacking core code (fastpaths) for speedups there is a really bad idea.
> If you can speed it up without being intrusive all power to you.

I have to agree here. Swsusp is not really performance critical,
almost every other part of kernel is more important.

								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


* Re: Suspend 2 merge
  2004-11-26 12:38                 ` Pavel Machek
@ 2004-11-26 15:54                   ` Christoph Hellwig
  2004-11-26 22:36                     ` Pavel Machek
  2004-11-28 22:35                   ` Nigel Cunningham
  1 sibling, 1 reply; 51+ messages in thread
From: Christoph Hellwig @ 2004-11-26 15:54 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Nigel Cunningham, Christoph Hellwig, Linux Kernel Mailing List,
	hugang, Andrew Morton

On Fri, Nov 26, 2004 at 01:38:48PM +0100, Pavel Machek wrote:
> > Again, when you're running on limited time, twice as fast is still twice
> > as fast.
> 
> My machine suspends in 7 seconds, and that's swsusp1. According to
> your numbers, suspend2 should suspend it in 1 second and LZF
> compressed should be .5 second.
> 
> I'd say "who cares". 7 seconds seems like fast enough for me. And I'm
> *not* going to add 2000 lines of code for 500msec speedup during
> suspend.

Yupp.  Premature optimization is the root of all evil.  swsusp is

 a) an absolute slowpath compared to any normal kernel operation,
    and called extremely seldom
 b) only useful for a small subset of all linux instances

hacking core code (fast paths) for speedups there is a really bad idea.
If you can speed it up without being intrusive, all power to you.

> > I'm trying not to make assumptions about how we're writing the image,
> > either. If you want to pipe your image over a network to some server,
> > you should be able to, and not have to implement compression again in
> > the writer for that.
> 
> Suspend-over-network is obscure-enough
> feature. Compressed-suspend-over-network is even worse.
> 
> BTW my feeling is that if you want to do suspend-over-network, you
> should just modify nbd to work with suspend2 and stop adding
> special-purpose code to suspend.

Honestly I think it's a feature so obscure that we wouldn't ever want to
merge it unless it just happens to work as a fallout of a more important
feature.


* Re: Suspend 2 merge
  2004-11-26  9:08                 ` Nigel Cunningham
@ 2004-11-26 13:37                   ` Pavel Machek
  0 siblings, 0 replies; 51+ messages in thread
From: Pavel Machek @ 2004-11-26 13:37 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: hugang, Christoph Hellwig, Linux Kernel Mailing List, Andrew Morton

Hi!

> I don't see the point to saving LRU pages separately when you're still
> eating all the memory you can. You'll have the same number of pages to
> save, just fewer to copy (and copying takes far less time than saving).
> 
> > Pagecaches still in, but disabled by default, activated using sysctl.
> > I'd like not to merge it right now. Hope the other changes can be merged. :)
> 
> Pavel's going to think you are trying to turn swsusp into suspend2!!

Pavel knows that already, but at least hugang is producing small
patches ;-).
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

* Re: Suspend 2 merge
       [not found]               ` <20041126043203.GA2713@hugang.soulinfo.com>
  2004-11-26  9:08                 ` Nigel Cunningham
@ 2004-11-26 13:31                 ` Pavel Machek
  1 sibling, 0 replies; 51+ messages in thread
From: Pavel Machek @ 2004-11-26 13:31 UTC (permalink / raw)
  To: hugang
  Cc: Nigel Cunningham, Christoph Hellwig, Linux Kernel Mailing List,
	Andrew Morton

Hi!

> Ok, I got it.  I think making the LRU safe means making sure that
>  1: LRU can't change after saved.
>  2: LRU memory can't change after saved.
> The first one is done; the second we can't be sure of in the current
> design. Can we use COW to do it?

Userspace processes should be stopped at that point, and you really
can't do COW to kernel users.

> > swsusp1 is more self-contained. As long as drivers stop the DMA and
> > NMI does nothing wrong, atomic snapshot will indeed be atomic.
> Here is my current patch, still relative to your bit diff; only the core
> part is here.
>  1: add a collide bitmap to speed up the collide check. I'm not sure
>     four pages is enough; Pavel, please check.
>  2: switch to list_for_xxx style
>  3: correct calc_nums.

Heh, can you try this after resume?

cat `cat [0-9]*/maps | grep / | sed 's:.* /:/:' | sort -u` > /dev/null

It should have very similar effect to saving LRU, just in one line of
code ;-).
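
As a rough illustration of what the grep/sed stage of that pipeline does,
here is a minimal C sketch (not part of any patch in this thread). It
assumes each line of /proc/<pid>/maps ends with the path of the mapped
file when there is one; maps_path() is a hypothetical helper name. Like
sed's greedy ".* /", it keeps everything from the last " /" onward, and
it skips lines that carry no path.

```c
#include <string.h>

/*
 * Return a pointer to the file path at the end of one /proc/<pid>/maps
 * line, or NULL if the line names no file (anonymous mapping).
 * Roughly mirrors:  grep / | sed 's:.* /:/:'
 * The returned pointer aliases the input; a trailing newline, if the
 * caller read the line with fgets(), is still attached.
 */
static const char *maps_path(const char *line)
{
	const char *hit = NULL;
	const char *p = line;

	/* Greedy match: remember the LAST " /" in the line. */
	while ((p = strstr(p, " /")) != NULL) {
		hit = p + 1;	/* skip the space, keep the slash */
		p++;
	}
	return hit;
}
```

Reading every file found this way (the cat in the one-liner) is what
pulls the pages back into the page cache after resume.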

								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

* Re: Suspend 2 merge
       [not found]               ` <20041126082109.GA842@hugang.soulinfo.com>
@ 2004-11-26 13:25                 ` Pavel Machek
  0 siblings, 0 replies; 51+ messages in thread
From: Pavel Machek @ 2004-11-26 13:25 UTC (permalink / raw)
  To: hugang
  Cc: Nigel Cunningham, Christoph Hellwig, Linux Kernel Mailing List,
	Andrew Morton

Hi!

> > For swsusp2, you need drivers to stop the DMA, NMI not interfering,
> > sync may not happen after you have saved LRU, memory may not be
> > allocated from slab after you have saved LRU. (something else? This
> > needs to be written down somewhere, and all kernel hackers will need
> > to be careful not to break these rules. Do you see why it worries me?)
> > 
> > swsusp1 is more self-contained. As long as drivers stop the DMA and
> > NMI does nothing wrong, atomic snapshot will indeed be atomic.
> 
> Here is a memory allocation patch grabbed from suspend2, useful for
> shrinking memory on systems using high memory.

Sorry, I do not understand. What problem is this solving?
									Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

* Re: Suspend 2 merge
  2004-11-26  9:08               ` Nigel Cunningham
@ 2004-11-26 12:38                 ` Pavel Machek
  2004-11-26 15:54                   ` Christoph Hellwig
  2004-11-28 22:35                   ` Nigel Cunningham
  0 siblings, 2 replies; 51+ messages in thread
From: Pavel Machek @ 2004-11-26 12:38 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang, Andrew Morton

Hi!

> > > Don't you need more than that? Unless things have changed, you still
> > > spend most of your time eating memory so that you only have a few megs
> > > (well, maybe a little more) to write to disk. If you do reduce the
> > > amount of memory you eat, then you need to work to make the I/O
> > > faster.
> > 
> > I'm not *that* concerned about speed. Getting rid of order-8 is
> > for preventing "sorry, not enough RAM to suspend to disk".
> 
> Priority wise, I agree. But given that the order 8 issue is dealt with,
> speed is important. Particularly when your power just went out and your
> UPS battery is running down.

....

> Again, when you're running on limited time, twice as fast is still twice
> as fast.

My machine suspends in 7 seconds, and that's swsusp1. According to
your numbers, suspend2 should suspend it in 1 second and LZF
compressed should be .5 second.

I'd say "who cares". 7 seconds seems like fast enough for me. And I'm
*not* going to add 2000 lines of code for 500msec speedup during
suspend.

> > Actually I'd like to see lzf done at LVM level; that way it is useful
> > for people not doing suspend, too, and we should not need plugin
> > infrastructure in suspend2 (LVM provides us with that service).
> 
> That ignores that the vast majority of people don't use LVM at the
> moment. Perhaps you could argue that they should. The other thing is,
> I'm trying not to make assumptions about how we're writing the image,
> either. If you want to pipe your image over a network to some server,
> you should be able to, and not have to implement compression again in
> the writer for that.

Suspend-over-network is obscure-enough
feature. Compressed-suspend-over-network is even worse.

BTW my feeling is that if you want to do suspend-over-network, you
should just modify nbd to work with suspend2 and stop adding
special-purpose code to suspend.

> > I believe you need to say "no" way more often. One user is not enough
> > to justify feature in mainline kernel, and any number of users should
> > not be enough to make GZIP compression supported by suspend2.
> 
> Okay. Let's say I drop GZIP. I've just asked on the suspend list for
> good reasons not to do it. I'll be surprised if I get any :> (And I'll
> ask for proof that they get a higher throughput with GZIP than with
> LZF!).
> 
> I still think the plugin system is useful. It made adding LZF
> compression and DM support really easy, and also means work can be done
> on a generic file writer without needing to pull out all of the
> swapwriter code. It also made making suspend modular far easier, which
> in turn means you don't have to have the memory in use all the time,
> when you only really want the functionality ready to go when you want
> to power down.


> > Yes, they are unlikely(); but still they are hooks into memory
> > management. They are at least ugly as hell. And no, swsusp1 does not have
> > this particular set of hooks, and does not need to patch sysrq-S.
> 
> How is ugly defined here? Can you give me an example that does the same
> thing, but which you consider less ugly?
> 
>         if (unlikely(test_suspend_state(SUSPEND_USE_MEMORY_POOL))) {
>                 suspend2_free_pool_pages(page, order);
>                 return;
>         }
> 
> I nearly launched into a flame war, but I'll try to be more gracious
> than that.

Memory management should have no knowledge of suspend2... Imagine every
subsystem sprinkling hooks such as this one...

We'd have 

	if (unlikely(scsi_is_recovering_from_problem()))
		scsi_free_pool_pages();
	else if (usb_is_unhealthy())
		usb_recover();
	else if (unlikely(test_suspend_state(SUSPEND_USE_MEMORY_POOL))) {
		suspend2_free_pool_pages(page, order);
		return;
	}

and everyone hacking memory management would know about scsi, usb, and
suspend2. That's not a reasonable way to go.

> > > things. While suspend is running, they serve a good and necessary
> > > purpose. Using high level routines, we can't guarantee that new slab
> > 
> > They are necessary because of two-stages LRU saving... I'm trying to
> > argue "two-stages LRU saving is wrong"...
> 
> I know you are. What I'm not sure about is whether you believe that the
> user should never have the option of saving a full image of their
> memory, or whether you think there's a better way to do it.

If LRU saving can be done in 300 lines of code with no impact on
generic code... that's okay. In the current form it is way too complex
to merge.

> > For swsusp2, you need drivers to stop the DMA, NMI not interfering,
> > sync may not happen after you have saved LRU, memory may not be
> > allocated from slab after you have saved LRU. (something else? This
> > needs to be written down somewhere, and all kernel hackers will need
> > to be careful not to break these rules. Do you see why it worries me?)
> > 
> > swsusp1 is more self-contained. As long as drivers stop the DMA and
> > NMI does nothing wrong, atomic snapshot will indeed be atomic.
> 
> Syncing may not happen after we've done the atomic copy, but since it is
> already done when freezing processes, there shouldn't be any dirty data
> to sync anyway... except for the syslog data from our printks.

??? syslogd is stopped, it can't write anything.

> The LRU pages can't change, but this shouldn't be a problem because all
> userspace threads and most of kernel space, including kswapd, kjournald
> and so on is stopped. The only guys who need to worry about this are the
> MM guys, and so long as the scanning continues to run via a process, the
> buddy allocator or a timer (interrupts aren't an option, are they?),
> that activity will be paused during suspend without them having to add a
> single line of code.

Actually swsusp1 does not need to stop timers, so you have few lines
of code added.

> If bio page I/O was changed to interact with the LRU, we might be in
> trouble.
> 
> Slab _can_ be allocated while we're saving the LRU, but the allocations
> need to come from pages that we know will be included in the atomic
> copy. This happens transparently (page allocator), so other kernel
> hackers don't need to worry about any of these issues. If we use the
> memory pool idea, everything else that needs to run can run just like
> normal, without any suspend specific changes. (You might be being

Why do you need to allocate from special pool? After LRU is saved, you
should write all used kernel pages. Slab are kernel pages, so I do not
see why you need to modify it.

> In short, there are no rules that "all kernel hackers" will need to be
> careful not to break. The main constraint added is that we need to
> be able to stop all changes to the LRU.

Ok, so the "all kernel hackers" rule is "do not change LRU while
suspend2 is going on".
								Pavel

-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

* Re: Suspend 2 merge
       [not found]               ` <20041126043203.GA2713@hugang.soulinfo.com>
@ 2004-11-26  9:08                 ` Nigel Cunningham
  2004-11-26 13:37                   ` Pavel Machek
  2004-11-26 13:31                 ` Pavel Machek
  1 sibling, 1 reply; 51+ messages in thread
From: Nigel Cunningham @ 2004-11-26  9:08 UTC (permalink / raw)
  To: hugang
  Cc: Pavel Machek, Christoph Hellwig, Linux Kernel Mailing List,
	Andrew Morton

Hi.

On Fri, 2004-11-26 at 15:32, hugang@soulinfo.com wrote:
> > For swsusp2, you need drivers to stop the DMA, NMI not interfering,
> > sync may not happen after you have saved LRU, memory may not be
> > allocated from slab after you have saved LRU. (something else? This
> > needs to be written down somewhere, and all kernel hackers will need
> > to be careful not to break these rules. Do you see why it worries me?)
> Ok, I got it.  I think making the LRU safe means making sure that
>  1: LRU can't change after saved.
>  2: LRU memory can't change after saved.
> The first one is done; the second we can't be sure of in the current
> design. Can we use COW to do it?

2 is simple: LRU doesn't change because everything that would change it
is frozen, and the memory pool hooks ensure that scanning of the list
doesn't happen while suspending either.

I don't see the point to saving LRU pages separately when you're still
eating all the memory you can. You'll have the same number of pages to
save, just fewer to copy (and copying takes far less time than saving).

> Pagecaches still in, but disabled by default, activated using sysctl.
> I'd like not to merge it right now. Hope the other changes can be merged. :)

Pavel's going to think you are trying to turn swsusp into suspend2!!

Nigel 
-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6


* Re: Suspend 2 merge
  2004-11-26  0:39             ` Pavel Machek
@ 2004-11-26  9:08               ` Nigel Cunningham
  2004-11-26 12:38                 ` Pavel Machek
       [not found]               ` <20041126082109.GA842@hugang.soulinfo.com>
                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 51+ messages in thread
From: Nigel Cunningham @ 2004-11-26  9:08 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang, Andrew Morton

Hi.

On Fri, 2004-11-26 at 11:39, Pavel Machek wrote:
> Hi!
> 
> > > I'd prefer not to get plugins and abstract storage. I'm not sure about
> > > extents, but as soon as I can get rid of order-8 allocations, things
> > > should be ok.
> > 
> > Don't you need more than that? Unless things have changed, you still
> > spend most of your time eating memory so that you only have a few megs
> > (well, maybe a little more) to write to disk. If you do reduce the
> > amount of memory you eat, then you need to work to make the I/O
> > faster.
> 
> I'm not *that* concerned about speed. Getting rid of order-8 is
> for preventing "sorry, not enough RAM to suspend to disk".

Priority wise, I agree. But given that the order 8 issue is dealt with,
speed is important. Particularly when your power just went out and your
UPS battery is running down.

> > > Okay, 58MB/sec is better than 1MB/sec. I do not think I want the
> > > complexity necessary to get me 70MB/sec.
> > 
> > Fair enough for you, but not everyone can say that. At the end of the
> > day, I'm not writing this code just for me to use, though. Many of the
> > features I've added have been added for the benefit of other people; I
> > assume you'd do the same. Most laptops can't do 58MB/s, so the
> > difference is much bigger. (My original laptop hard drive did 17MB/s; with
> > compression I think it achieved something close to double that,
> > depending on the data being compressed, of course).
> 
> I do have too fast machines around me. But notice that compression
> only does factor-2 speedup. If we wanted to make whole kernel uglier,
> we could probably achieve factor-2 speedup for any benchmark... just
> it would be bad idea.

Again, when you're running on limited time, twice as fast is still twice
as fast.
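
To put rough numbers on that, here is a back-of-the-envelope sketch in C.
It assumes the disk is the bottleneck (the CPU compresses faster than the
disk writes) and a hypothetical 340MB image; the 17MB/s figure is the
laptop drive mentioned above, while write_seconds() and the image size
are made up purely for illustration.

```c
/*
 * Seconds needed to write an image of image_mb megabytes to a disk that
 * sustains disk_mb_per_s, when compression shrinks the data by `ratio`
 * (1.0 = no compression, 2.0 = output is half the size of the input).
 * Assumption: compression itself costs no wall-clock time, i.e. the
 * disk stays the limiting factor.
 */
static double write_seconds(double image_mb, double disk_mb_per_s,
			    double ratio)
{
	return image_mb / ratio / disk_mb_per_s;
}
```

Under those assumptions write_seconds(340, 17, 1.0) comes to 20 seconds
and write_seconds(340, 17, 2.0) to 10 seconds, so on a slow drive the
factor of two is worth far more than the 500msec quoted for a fast one.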

> > > In some ways, suspend2 is two years ahead of rest of kernel:
> > > * you have interactive debugging
> > > * file compression
> > > * nice splash screen
> > > * plugin interface for transparent network support
> > > 
> > > Unfortunately, we do not want compression done like that. It would
> > > make sense to do compressed-LVM or something like that (that way
> > > everyone would get the benefit), but it does not make sense to have it
> > > just for suspend2. And we do not want the rest of features, too,
> > > unless they work for the rest of kernel.
> > 
> > The cryptoapi provides support for both compression and encryption. I'd
> > happily make use of that, but we still need a way for the user to choose
> > what compression/encryption they want and configure it. I'm not at all
> > averse to the idea of shifting the lzf compression support into being a
> > cryptoapi plugin. That shouldn't be hard to do precisely because I have
> > the plugin system :>.
> 
> Actually I'd like to see lzf done at LVM level; that way it is useful
> for people not doing suspend, too, and we should not need plugin
> infrastructure in suspend2 (LVM provides us with that service).

That ignores that the vast majority of people don't use LVM at the
moment. Perhaps you could argue that they should. The other thing is,
I'm trying not to make assumptions about how we're writing the image,
either. If you want to pipe your image over a network to some server,
you should be able to, and not have to implement compression again in
the writer for that.

> > > You did wonderful work -- you have shown what is possible with
> > > suspend2. Now we just need to scale it back to what is practical. It
> > > needs not only to work, it also needs to be nice, simple, and easy to
> > > maintain.
> > 
> > I think it is practical. Apart from the bootsplash support, I don't
> > think I have added any feature because I thought "Hey, this looks like a
> > fun thing to try.". Every feature has been added because it makes
> > suspend faster, more reliable, more user friendly, more versatile or the
> > like. If we want Linux to get adopted by desktop users, it needs to
> 
> I believe you need to say "no" way more often. One user is not enough
> to justify feature in mainline kernel, and any number of users should
> not be enough to make GZIP compression supported by suspend2.

Okay. Let's say I drop GZIP. I've just asked on the suspend list for
good reasons not to do it. I'll be surprised if I get any :> (And I'll
ask for proof that they get a higher throughput with GZIP than with
LZF!).

I still think the plugin system is useful. It made adding LZF
compression and DM support really easy, and also means work can be done
on a generic file writer without needing to pull out all of the
swapwriter code. It also made making suspend modular far easier, which
in turn means you don't have to have the memory in use all the time,
when you only really want the functionality ready to go when you want
to power down.

> > > I believe it has at least one pretty bad flaw: it has hooks all over
> > > the place and will be nightmare to maintain. Putting suspend hooks into
> > > memory allocation is not nice.
> > 
> > That's a big statement.
> > 
> > "Hooks all over the place" was a phrase first used to refer to the
> > attempts at making freezing more reliable. That's irrelevant now with
> > the simplified three stages to freezing. The hooks are the same as for
> > swsusp1 there.
> > 
> > The hooks you've seen in the rest of the kernel are generally only
> > further supplements to the freezing, so that swsusp should probably have
> > them too.
> > 
> > The hooks in the memory management are minimal and wrapped in
> > unlikely(), so they shouldn't really be a problem in the normal flow
> > of
> 
> Yes, they are unlikely(); but still they are hooks into memory
> management. They are at least ugly as hell. And no, swsusp1 does not have
> this particular set of hooks, and does not need to patch sysrq-S.

How is ugly defined here? Can you give me an example that does the same
thing, but which you consider less ugly?

        if (unlikely(test_suspend_state(SUSPEND_USE_MEMORY_POOL))) {
                suspend2_free_pool_pages(page, order);
                return;
        }

I nearly launched into a flame war, but I'll try to be more gracious
than that.
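
For readers unfamiliar with the idiom, the hook quoted above can be
modelled in userspace roughly as follows. The unlikely() definition is
the one the kernel uses (a GCC/Clang branch hint that does not change
the result of the test); everything else here, including
model_free_pages(), the flag and the counters, is a hypothetical
stand-in and not suspend2 code.

```c
/* Branch-prediction hint as defined in the kernel; needs GCC or Clang. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-ins for the real state and helpers. */
static int use_memory_pool;		/* models SUSPEND_USE_MEMORY_POOL */
static int pool_frees, normal_frees;	/* make the chosen path visible */

static void model_free_pages(void)
{
	if (unlikely(use_memory_pool)) {
		pool_frees++;	/* stands in for suspend2_free_pool_pages() */
		return;
	}
	normal_frees++;		/* stands in for the ordinary free path */
}
```

The cost on the normal path is one flag test with the branch laid out as
not taken; the disagreement in this thread is about the allocator having
to know the flag exists at all, not about that cost.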

> > things. While suspend is running, they serve a good and necessary
> > purpose. Using high level routines, we can't guarantee that new slab
> 
> They are necessary because of two-stages LRU saving... I'm trying to
> argue "two-stages LRU saving is wrong"...

I know you are. What I'm not sure about is whether you believe that the
user should never have the option of saving a full image of their
memory, or whether you think there's a better way to do it.

> > > swsusp1 is pretty self-contained. As long as drivers stop the DMA and
> > > NMI does nothing wrong, atomic snapshot will indeed be atomic.
> > > 
> > > Can you list conditions necessary for suspend2 to work? 
> > 
> > Not really sure what you mean. At the moment, the main hindrance to it
> > working properly is driver model support (USB, DRI, as said previously).
> > That forces us to have a userspace script to compile as modules and
> > stop/unload support around kernel call. Given that these things are
> > done, and that suspend is able to get enough memory to do its work
> > (almost never a problem), suspend should always work.
> 
> No, assume driver problems are solved.

Okay.

> For swsusp2, you need drivers to stop the DMA, NMI not interfering,
> sync may not happen after you have saved LRU, memory may not be
> allocated from slab after you have saved LRU. (something else? This
> needs to be written down somewhere, and all kernel hackers will need
> to be careful not to break these rules. Do you see why it worries me?)
> 
> swsusp1 is more self-contained. As long as drivers stop the DMA and
> NMI does nothing wrong, atomic snapshot will indeed be atomic.

Syncing may not happen after we've done the atomic copy, but since it is
already done when freezing processes, there shouldn't be any dirty data
to sync anyway... except for the syslog data from our printks.

The LRU pages can't change, but this shouldn't be a problem because all
userspace threads and most of kernel space, including kswapd, kjournald
and so on is stopped. The only guys who need to worry about this are the
MM guys, and so long as the scanning continues to run via a process, the
buddy allocator or a timer (interrupts aren't an option, are they?),
that activity will be paused during suspend without them having to add a
single line of code.

If bio page I/O was changed to interact with the LRU, we might be in
trouble.

Slab _can_ be allocated while we're saving the LRU, but the allocations
need to come from pages that we know will be included in the atomic
copy. This happens transparently (page allocator), so other kernel
hackers don't need to worry about any of these issues. If we use the
memory pool idea, everything else that needs to run can run just like
normal, without any suspend specific changes. (You might be being
confused here by those printks in the slab get-a-new-page code; I guess
I forgot to write at the top of that patch that the printks are only
there while I'm seeking to determine whether suspend is the cause of
some occasional slab corruption I've been seeing. I'm trying to
determine whether the pages I see the oops at are ones allocated while
writing the image, or not. Unfortunately, it happens so infrequently
that I am taking a long time to see.)

DMA only needs to be stopped when the drivers are quiesced, which is not
a suspend2 specific requirement.

The NMI watchdog turned out not to be a problem at all.

In short, there are no rules that "all kernel hackers" will need to be
careful not to break. The main constraint added is that we need to
be able to stop all changes to the LRU.

Regards,

Nigel
-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6


* Re: Suspend 2 merge
  2004-11-25 23:46           ` Nigel Cunningham
@ 2004-11-26  0:39             ` Pavel Machek
  2004-11-26  9:08               ` Nigel Cunningham
                                 ` (3 more replies)
  2004-11-29  9:34             ` Stefan Seyfried
  1 sibling, 4 replies; 51+ messages in thread
From: Pavel Machek @ 2004-11-26  0:39 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang, Andrew Morton

Hi!

> > I'd prefer not to get plugins and abstract storage. I'm not sure about
> > extents, but as soon as I can get rid of order-8 allocations, things
> > should be ok.
> 
> Don't you need more than that? Unless things have changed, you still
> spend most of your time eating memory so that you only have a few megs
> (well, maybe a little more) to write to disk. If you do reduce the
> amount of memory you eat, then you need to work to make the I/O
> faster.

I'm not *that* concerned about speed. Getting rid of order-8 is
for preventing "sorry, not enough RAM to suspend to disk".
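
For context, an order-n request asks the page allocator for 2^n
physically contiguous pages, so order-8 means 256 pages in one piece.
The sketch below just does that arithmetic; it assumes 4K pages (as on
32-bit x86), and the SKETCH_PAGE_SIZE macro and function names are
illustrative, not kernel symbols.

```c
/* Assumed page size: 4K, as on 32-bit x86. Other architectures differ. */
#define SKETCH_PAGE_SIZE 4096UL

/* Number of contiguous pages requested by an order-`order` allocation. */
static unsigned long pages_for_order(unsigned int order)
{
	return 1UL << order;
}

/* Size in bytes of the contiguous block, under the page size above. */
static unsigned long bytes_for_order(unsigned int order)
{
	return pages_for_order(order) * SKETCH_PAGE_SIZE;
}
```

With these assumptions bytes_for_order(8) is 1048576, i.e. a single
contiguous 1MB block, which becomes hard to find once memory is
fragmented even when plenty of individual pages are free.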

> > Okay, 58MB/sec is better than 1MB/sec. I do not think I want the
> > complexity necessary to get me 70MB/sec.
> 
> Fair enough for you, but not everyone can say that. At the end of the
> day, I'm not writing this code just for me to use, though. Many of the
> features I've added have been added for the benefit of other people; I
> assume you'd do the same. Most laptops can't do 58MB/s, so the
> difference is much bigger. (My original laptop hard drive did 17MB/s; with
> compression I think it achieved something close to double that,
> depending on the data being compressed, of course).

I do have too fast machines around me. But notice that compression
only does factor-2 speedup. If we wanted to make whole kernel uglier,
we could probably achieve factor-2 speedup for any benchmark... just
it would be bad idea.


> > In some ways, suspend2 is two years ahead of rest of kernel:
> > * you have interactive debugging
> > * file compression
> > * nice splash screen
> > * plugin interface for transparent network support
> > 
> > Unfortunately, we do not want compression done like that. It would
> > make sense to do compressed-LVM or something like that (that way
> > everyone would get the benefit), but it does not make sense to have it
> > just for suspend2. And we do not want the rest of features, too,
> > unless they work for the rest of kernel.
> 
> The cryptoapi provides support for both compression and encryption. I'd
> happily make use of that, but we still need a way for the user to choose
> what compression/encryption they want and configure it. I'm not at all
> averse to the idea of shifting the lzf compression support into being a
> cryptoapi plugin. That shouldn't be hard to do precisely because I have
> the plugin system :>.

Actually I'd like to see lzf done at LVM level; that way it is useful
for people not doing suspend, too, and we should not need plugin
infrastructure in suspend2 (LVM provides us with that service).

> > You did wonderful work -- you have shown what is possible with
> > suspend2. Now we just need to scale it back to what is practical. It
> > needs not only to work, it also needs to be nice, simple, and easy to
> > maintain.
> 
> I think it is practical. Apart from the bootsplash support, I don't
> think I have added any feature because I thought "Hey, this looks like a
> fun thing to try.". Every feature has been added because it makes
> suspend faster, more reliable, more user friendly, more versatile or the
> like. If we want Linux to get adopted by desktop users, it needs to

I believe you need to say "no" way more often. One user is not enough
to justify feature in mainline kernel, and any number of users should
not be enough to make GZIP compression supported by suspend2.

> have
> these features. Making it harder to use by forcing people to reboot to
> change a parameter or forcing them to do an ls in /dev with obscure
> parameters (to get the major and minor numbers) when they already know
> they want /dev/sda1 isn't user friendly. Obviously user friendliness is 
> more important to me than to you. That's fine, but let's agree to differ
> and let the software be more helpful rather than less.

Yes, I care about linux being developer-friendly. If it is not
user-friendly, distributions will solve it. If it is not
developer-friendly, it is dead.

> > > It does. But if there were fundamental flaws in the approach, we would
> > > have found them by now. Since you're using bio calls and not swap's own
> > > read/write functions, you shouldn't have any problems.
> > 
> > I believe it has at least one pretty bad flaw: it has hooks all over
> > the place and will be nightmare to maintain. Putting suspend hooks into
> > memory allocation is not nice.
> 
> That's a big statement.
> 
> "Hooks all over the place" was a phrase first used to refer to the
> attempts at making freezing more reliable. That's irrelevant now with
> the simplified three stages to freezing. The hooks are the same as for
> swsusp1 there.
> 
> The hooks you've seen in the rest of the kernel are generally only
> further supplements to the freezing, so that swsusp should probably have
> them too.
> 
> The hooks in the memory management are minimal and wrapped in
> unlikely(), so they shouldn't really be a problem in the normal flow
> of

Yes, they are unlikely(); but still they are hooks into memory
management. They are at least ugly as hell. And no, swsusp1 does not have
this particular set of hooks, and does not need to patch sysrq-S.

> things. While suspend is running, they serve a good and necessary
> purpose. Using high level routines, we can't guarantee that new slab

They are necessary because of two-stages LRU saving... I'm trying to
argue "two-stages LRU saving is wrong"...

> > swsusp1 is pretty self-contained. As long as drivers stop the DMA and
> > NMI does nothing wrong, atomic snapshot will indeed be atomic.
> > 
> > Can you list conditions necessary for suspend2 to work? 
> 
> Not really sure what you mean. At the moment, the main hindrance to it
> working properly is driver model support (USB, DRI, as said previously).
> That forces us to have a userspace script to compile as modules and
> stop/unload support around kernel call. Given that these things are
> done, and that suspend is able to get enough memory to do its work
> (almost never a problem), suspend should always work.

No, assume driver problems are solved.

For swsusp2, you need drivers to stop the DMA, NMI must not interfere,
sync must not happen after you have saved LRU, and memory must not be
allocated from slab after you have saved LRU. (Something else? This
needs to be written down somewhere, and all kernel hackers will need
to be careful not to break these rules. Do you see why it worries me?)

swsusp1 is more self-contained. As long as drivers stop the DMA and
NMI does nothing wrong, atomic snapshot will indeed be atomic.

								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Suspend 2 merge
  2004-11-25 23:22         ` Pavel Machek
@ 2004-11-25 23:46           ` Nigel Cunningham
  2004-11-26  0:39             ` Pavel Machek
  2004-11-29  9:34             ` Stefan Seyfried
  0 siblings, 2 replies; 51+ messages in thread
From: Nigel Cunningham @ 2004-11-25 23:46 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang, Andrew Morton

Hi.

On Fri, 2004-11-26 at 10:22, Pavel Machek wrote:
> I'd prefer not to get plugins and abstract storage. I'm not sure about
> extents, but as soon as I can get rid of order-8 allocations, things
> should be ok.

Don't you need more than that? Unless things have changed, you still
spend most of your time eating memory so that you only have a few megs
(well, maybe a little more) to write to disk. If you do reduce the
amount of memory you eat, then you need to work to make the I/O faster.

> > > [Okay, at this point I'll understand when you'll put my picture as a
> > > texture to some doom3 monster and shoot me thousand times... Lot of
> > > work went into suspend2, but in the meantime lot of work went into
> > > swsusp1, too...]
> > 
> > Not at all. Perhaps I'm overstating the case or not spending enough time
> > looking at your code, but I don't actually think swsusp has changed a
> > lot in the two years since I started working on this. (Want my picture
> > now? :>)
> 
> Well, it was rewriten by Patrick so it actually looks okay, and it
> started to work for users...

:> Okay.

> > > > - Speed: All I/O is asynchronous where possible and readahead used where
> > > > not. Routines everywhere optimised to get things done as fast as poss.
> > > > (Think low battery).
> > > 
> > > I fixed O(n^2) behaviour in swsusp1 (not yet in). I do not think that
> > > asynchronous I/O is does that much difference.
> > 
> > Oh, it makes a huge difference once you're not eating all the memory you
> > can. If I submit I/O one at a time, I do 1 or 2 MB/s. With asynchrounous
> > I/O, I can write 70MB/s and read 110MB/s with compression, 58|58 without
> > compression (that's the maximum throughput of the drive I'm using at the
> > moment). If I can streamline things a further, I should be able to lift
> > that write rate further, too.
> 
> Okay, 58MB/sec is better than 1MB/sec. I do not think I want the
> complexity neccessary to get me 70MB/sec.

Fair enough for you, but not everyone can say that. At the end of the
day, I'm not writing this code just for me to use, though. Many of the
features I've added have been added for the benefit of other people; I
assume you'd do the same. Most laptops can't do 58MB/s, so the
difference is much bigger. (My original laptop hard drive did 17MB/s; with
compression I think it achieved something close to double that,
depending on the data being compressed, of course).

> In some ways, suspend2 is two years ahead of rest of kernel:
> * you have interactive debugging
> * file compression
> * nice splash screen
> * plugin interface for transparent network support
> 
> Unfortunately, we do not want compression done like that. It would
> make sense to do compressed-LVM or something like that (that way
> everyone would get the benefit), but it does not make sense to have it
> just for suspend2. And we do not want the rest of features, too,
> unless they work for the rest of kernel.

The cryptoapi provides support for both compression and encryption. I'd
happily make use of that, but we still need a way for the user to choose
what compression/encryption they want and configure it. I'm not at all
averse to the idea of shifting the lzf compression support into being a
cryptoapi plugin. That shouldn't be hard to do precisely because I have
the plugin system :>.

> > > > - Test bed: Around 10,000 downloads of the 1.0 patch, 2730 to date of
> > > > the 2.1.5 version I released 2 weeks ago.
> > > 
> > > Hmm, look at number of downloads of 2.6.9 kernel, I think I win here
> > > ;-)))). SuSE9.2 is actually shipping swsusp1 and advertising it as a
> > > feature. And it seems to work for people...
> > 
> > :> But not everyone who uses 2.6.9 uses swsusp. :>
> 
> But they should ;-).

I'll beg to differ there :>

> > > > - Swap file support
> > > > - Support for LVM/dm-crypt and siblings
> > > > - Support for having device drivers as modules (resume from an
> > > > initrd/initramfs)
> > > 
> > > Okay, you win these.
> > 
> > I don't want to have a competition, really. I just want to convince you
> > that I've done some worthwhile work :>
> 
> You did wonderfull work -- you shown what is possible with
> suspend2. Now we just need to scale it back to what is practical. It
> needs not only to work, it also needs to be nice, simple, and easy to
> maintain.

I think it is practical. Apart from the bootsplash support, I don't
think I have added any feature because I thought "Hey, this looks like a
fun thing to try.". Every feature has been added because it makes
suspend faster, more reliable, more user friendly, more versatile or the
like. If we want Linux to get adopted by desktop users, it needs to have
these features. Making it harder to use by forcing people to reboot to
change a parameter or forcing them to do an ls in /dev with obscure
parameters (to get the major and minor numbers) when they already know
they want /dev/sda1 isn't user friendly. Obviously user friendliness is 
more important to me than to you. That's fine, but let's agree to differ
and let the software be more helpful rather than less.
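
The lookup being described, which is what ls -l on a /dev node prints,
can be sketched in userspace C. This is a minimal sketch assuming Linux
with glibc, where major() and minor() come from <sys/sysmacros.h>.
device_numbers() is a hypothetical helper, not part of suspend2.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Look up the device numbers behind a device node such as /dev/sda1.
 * Returns 0 and fills *maj and *min on success, or -1 if the path cannot
 * be examined or is not a block or character device.
 */
static int device_numbers(const char *path, unsigned int *maj,
			  unsigned int *min)
{
	struct stat st;

	assert(path && maj && min);
	if (stat(path, &st) != 0)
		return -1;
	if (!S_ISBLK(st.st_mode) && !S_ISCHR(st.st_mode))
		return -1;
	*maj = major(st.st_rdev);
	*min = minor(st.st_rdev);
	return 0;
}
```

Code that accepts a path and resolves the numbers itself spares the user
this step, which is the usability point being made.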

> > > > - Designed to save as much of memory as possible rather than as little
> > > > (making the system more responsive post-resume).
> > > 
> > > hugang already has a patch, but I'm not 100% sure if I want it
> > > in. Yes, people seem to like this feature, but it complicates
> > > *design*, quite a lot.
> > 
> > It does. But if there were fundamental flaws in the approach, we would
> > have found them by now. Since you're using bio calls and not swap's own
> > read/write functions, you shouldn't have any problems.
> 
> I believe it has at least one pretty bad flaw: it has hooks all over
> the place and will be nightmare to maintain. Puting suspend hooks into
> memory allocation is not nice.

That's a big statement.

"Hooks all over the place" was a phrase first used to refer to the
attempts at making freezing more reliable. That's irrelevant now with
the simplified three stages to freezing. The hooks are the same as for
swsusp1 there.

The hooks you've seen in the rest of the kernel are generally only
further supplements to the freezing, so that swsusp should probably have
them too.

The hooks in the memory management are minimal and wrapped in
unlikely(), so they shouldn't really be a problem in the normal flow of
things. While suspend is running, they serve a good and necessary
purpose. Using high level routines, we can't guarantee that new slab
pages, for example, aren't allocated while we're writing the LRU pages.
If you can't be sure of that and don't want to satisfy allocations from
a memory pool, you'll need to recheck that you're saving all the pages
after writing LRU. And if you do that, you might need to allocate more
memory for the metadata, which might mean freeing an LRU page or two,
which will make your image inconsistent. A memory pool is a simple and
effective solution to that issue: we know exactly what pages may be
allocated when we do our atomic copy, and we copy them without having to
figure out which ones were actually used.
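
The memory pool idea can be modelled in a few lines of userspace C. This
is a sketch under stated assumptions, not the suspend2 implementation:
struct page_pool, pool_alloc(), pool_snapshot() and the sizes POOL_PAGES
and PAGE_BYTES are all hypothetical names and values chosen for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_PAGES 8	/* hypothetical pool size */
#define PAGE_BYTES 4096	/* assumed page size */

/* Pages set aside before the first part of the image is written. */
struct page_pool {
	unsigned char pages[POOL_PAGES][PAGE_BYTES];
	size_t next_free;	/* index of the next unused pool page */
};

/* Satisfy a suspend-time allocation from the pool; NULL when exhausted. */
static void *pool_alloc(struct page_pool *pool)
{
	assert(pool != NULL);
	if (pool->next_free == POOL_PAGES)
		return NULL;
	return pool->pages[pool->next_free++];
}

/*
 * Atomic-copy step: copy every pool page, used or not, so there is no
 * need to work out which ones were handed out. Returns pages copied.
 */
static size_t pool_snapshot(const struct page_pool *pool, unsigned char *dst)
{
	assert(pool != NULL && dst != NULL);
	memcpy(dst, pool->pages, sizeof(pool->pages));
	return POOL_PAGES;
}
```

Because the set of pages that can change is fixed in advance, the copy
step needs no extra metadata allocation of its own.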

> swsusp1 is pretty self-contained. As long as drivers stop the DMA and
> NMI does nothing wrong, atomic snapshot will indeed be atomic.
> 
> Can you list conditions neccessary for suspend2 to work? 

Not really sure what you mean. At the moment, the main hindrance to it
working properly is driver model support (USB, DRI, as said previously).
That forces us to have a userspace script to compile as modules and
stop/unload support around the kernel call. Given that these things are
done, and that suspend is able to get enough memory to do its work
(almost never a problem), suspend should always work.

Regards,

Nigel
-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Suspend 2 merge
  2004-11-25 22:34       ` Nigel Cunningham
@ 2004-11-25 23:22         ` Pavel Machek
  2004-11-25 23:46           ` Nigel Cunningham
  0 siblings, 1 reply; 51+ messages in thread
From: Pavel Machek @ 2004-11-25 23:22 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang, Andrew Morton

Hi!

> > > I'm thus seeking to simply merge the existing code, let Pavel and others
> > > get to the point where they're ready to say "Okay, we're satisfied that
> > > suspend2 does everything swsusp does and more and better." Then we can
> > > remove swsusp. This is the plan that was discussed with Pavel and Andrew
> > > ages ago. I've just been slow to get there because I'm doing this
> > > part-time voluntary.
> > 
> > hugang seems to show that it indeed is possible to incrementally turn
> > swsusp into suspend2. I do not think Andrew really wanted it that way,
> > and I thought of that as of neccessary evil.
> 
> With some changes, yes. But when you come to using extents or
> abstracting the method of storage and implementing plugins, it will be
> ground-up redesign. Of course you might not want to go that far.

I'd prefer not to get plugins and abstract storage. I'm not sure about
extents, but as soon as I can get rid of order-8 allocations, things
should be ok.
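
For scale: a page allocation of order n asks for 2^n physically
contiguous pages. The arithmetic can be sketched as below, assuming
4096-byte pages (typical for x86, but an assumption here). order_bytes()
is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes of contiguous memory requested by one allocation of this order. */
static size_t order_bytes(unsigned int order, size_t page_size)
{
	assert(order < 20);	/* arbitrary bound for this sketch */
	return page_size << order;
}
```

Under these assumptions an order-8 request is 256 contiguous pages, or
1 MiB, which is much harder to satisfy on a fragmented system than an
order-0 request for a single page.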

> > [Okay, at this point I'll understand when you'll put my picture as a
> > texture to some doom3 monster and shoot me thousand times... Lot of
> > work went into suspend2, but in the meantime lot of work went into
> > swsusp1, too...]
> 
> Not at all. Perhaps I'm overstating the case or not spending enough time
> looking at your code, but I don't actually think swsusp has changed a
> lot in the two years since I started working on this. (Want my picture
> now? :>)

Well, it was rewritten by Patrick so it actually looks okay, and it
started to work for users...

> > > - Speed: All I/O is asynchronous where possible and readahead used where
> > > not. Routines everywhere optimised to get things done as fast as poss.
> > > (Think low battery).
> > 
> > I fixed O(n^2) behaviour in swsusp1 (not yet in). I do not think that
> > asynchronous I/O is does that much difference.
> 
> Oh, it makes a huge difference once you're not eating all the memory you
> can. If I submit I/O one at a time, I do 1 or 2 MB/s. With asynchrounous
> I/O, I can write 70MB/s and read 110MB/s with compression, 58|58 without
> compression (that's the maximum throughput of the drive I'm using at the
> moment). If I can streamline things a further, I should be able to lift
> that write rate further, too.

Okay, 58MB/sec is better than 1MB/sec. I do not think I want the
complexity necessary to get me 70MB/sec.

In some ways, suspend2 is two years ahead of the rest of the kernel:
* you have interactive debugging
* file compression
* nice splash screen
* plugin interface for transparent network support

Unfortunately, we do not want compression done like that. It would
make sense to do compressed-LVM or something like that (that way
everyone would get the benefit), but it does not make sense to have it
just for suspend2. And we do not want the rest of the features either,
unless they work for the rest of the kernel.

> > > - Test bed: Around 10,000 downloads of the 1.0 patch, 2730 to date of
> > > the 2.1.5 version I released 2 weeks ago.
> > 
> > Hmm, look at number of downloads of 2.6.9 kernel, I think I win here
> > ;-)))). SuSE9.2 is actually shipping swsusp1 and advertising it as a
> > feature. And it seems to work for people...
> 
> :> But not everyone who uses 2.6.9 uses swsusp. :>

But they should ;-).

> > > - Swap file support
> > > - Support for LVM/dm-crypt and siblings
> > > - Support for having device drivers as modules (resume from an
> > > initrd/initramfs)
> > 
> > Okay, you win these.
> 
> I don't want to have a competition, really. I just want to convince you
> that I've done some worthwhile work :>

You did wonderful work -- you have shown what is possible with
suspend2. Now we just need to scale it back to what is practical. It
needs not only to work, it also needs to be nice, simple, and easy to
maintain.

> > > - Designed to save as much of memory as possible rather than as little
> > > (making the system more responsive post-resume).
> > 
> > hugang already has a patch, but I'm not 100% sure if I want it
> > in. Yes, people seem to like this feature, but it complicates
> > *design*, quite a lot.
> 
> It does. But if there were fundamental flaws in the approach, we would
> have found them by now. Since you're using bio calls and not swap's own
> read/write functions, you shouldn't have any problems.

I believe it has at least one pretty bad flaw: it has hooks all over
the place and will be a nightmare to maintain. Putting suspend hooks into
memory allocation is not nice.

swsusp1 is pretty self-contained. As long as drivers stop the DMA and
NMI does nothing wrong, atomic snapshot will indeed be atomic.

Can you list the conditions necessary for suspend2 to work?

								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Suspend 2 merge
  2004-11-25 19:20     ` Pavel Machek
@ 2004-11-25 22:34       ` Nigel Cunningham
  2004-11-25 23:22         ` Pavel Machek
  0 siblings, 1 reply; 51+ messages in thread
From: Nigel Cunningham @ 2004-11-25 22:34 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang, Andrew Morton

Hi.

On Fri, 2004-11-26 at 06:20, Pavel Machek wrote:
> Hi!
> 
> > > Your way of merging looks rather wrong.  Please submit changes against the
> > > current swsusp code that introduce one feature after another to bring it
> > > at the level you want.  You'll surely have to rewrok it a lot until all
> > > reviewers are happy.
> > 
> > I realise that it needs further cleanup; that's why I'm submitting it
> > now for comment and not asking 'please apply'. As to patching against
> > swsusp, I'm purposely not doing that. The reason is that suspend2 isn't
> > a bunch of incremental changes to swsusp. It has been redesigned from
> > the ground up and I'd have to pull swsusp to pieces and put it back
> > together to do the same things.
> >
> > I'm thus seeking to simply merge the existing code, let Pavel and others
> > get to the point where they're ready to say "Okay, we're satisfied that
> > suspend2 does everything swsusp does and more and better." Then we can
> > remove swsusp. This is the plan that was discussed with Pavel and Andrew
> > ages ago. I've just been slow to get there because I'm doing this
> > part-time voluntary.
> 
> hugang seems to show that it indeed is possible to incrementally turn
> swsusp into suspend2. I do not think Andrew really wanted it that way,
> and I thought of that as of neccessary evil.

With some changes, yes. But when you come to using extents or
abstracting the method of storage and implementing plugins, it will be a
ground-up redesign. Of course you might not want to go that far.

> [Okay, at this point I'll understand when you'll put my picture as a
> texture to some doom3 monster and shoot me thousand times... Lot of
> work went into suspend2, but in the meantime lot of work went into
> swsusp1, too...]

Not at all. Perhaps I'm overstating the case or not spending enough time
looking at your code, but I don't actually think swsusp has changed a
lot in the two years since I started working on this. (Want my picture
now? :>)

> > > And most importantly for each patch explain exactly what feature it
> > > implements and why, etc..  "swsusp2" tells exactly nothing about the
> > > changed you do.
> > 
> > Okay. The changes include:
> > 
> > - Almost no BUG() statements. Wherever possible, if something goes
> > wrong, we back out and give the user a perfectly usable system back
> 
> Patrick did a lot of work in this area, and there are 10 BUGs() in
> swsusp just now. [And I do not think "no BUGs()" is a feature -- look
> at my comments, at one point you just ignored "can not happen
> condition". That's bad, it can hide real bugs.] I have no reports of
> swsusp1 going BUG() for users, and that's what counts.

Not sure what you're talking about with the 'can not happen condition'.
Regarding real reports, I agree.

> > - Speed: All I/O is asynchronous where possible and readahead used where
> > not. Routines everywhere optimised to get things done as fast as poss.
> > (Think low battery).
> 
> I fixed O(n^2) behaviour in swsusp1 (not yet in). I do not think that
> asynchronous I/O is does that much difference.

Oh, it makes a huge difference once you're not eating all the memory you
can. If I submit I/O one at a time, I do 1 or 2 MB/s. With asynchronous
I/O, I can write 70MB/s and read 110MB/s with compression, 58|58 without
compression (that's the maximum throughput of the drive I'm using at the
moment). If I can streamline things a bit more, I should be able to lift
that write rate further, too.
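
To put those rates in terms of wall-clock time, here is a small sketch of
the arithmetic. The image size used in the note below is hypothetical,
and the quoted figures are treated as effective rates on uncompressed
image data, which is an assumption. transfer_seconds() is an illustrative
helper, not part of either implementation.

```c
#include <assert.h>

/* Seconds needed to move image_mb megabytes at rate_mb_per_s MB/s. */
static double transfer_seconds(double image_mb, double rate_mb_per_s)
{
	assert(image_mb >= 0.0 && rate_mb_per_s > 0.0);
	return image_mb / rate_mb_per_s;
}
```

For a hypothetical 512MB image, 1MB/s works out to 512 seconds, 58MB/s to
just under 9 seconds, and 70MB/s to a little over 7 seconds.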

> > - Reliability. I haven't run the tests for a while, but Michael Frank
> > produced a suite that was used to stress test the software (under 2.4)
> > while running 100s (1000s at least once) of cycles. There have been some
> > significant changes since then, but the software is essentially the
> > same.
> 
> Well, swsusp1 is getting a lot of testing too. Is the test-suite
> somewhere easily available?

I believe Michael is preparing a new version. I assume he'll put it on
https://developer.berlios.de/projects/lstress/ when he's done.

> > - Test bed: Around 10,000 downloads of the 1.0 patch, 2730 to date of
> > the 2.1.5 version I released 2 weeks ago.
> 
> Hmm, look at number of downloads of 2.6.9 kernel, I think I win here
> ;-)))). SuSE9.2 is actually shipping swsusp1 and advertising it as a
> feature. And it seems to work for people...

:> But not everyone who uses 2.6.9 uses swsusp. :>

> > - Swap file support
> > - Support for LVM/dm-crypt and siblings
> > - Support for having device drivers as modules (resume from an
> > initrd/initramfs)
> 
> Okay, you win these.

I don't want to have a competition, really. I just want to convince you
that I've done some worthwhile work :>

> > - Almost all memory allocations are order 0, making suspend more
> > reliable under load.
> 
> I'll have to fix this. Fortunately hugang already has a patch.
> 
> > - Designed to save as much of memory as possible rather than as little
> > (making the system more responsive post-resume).
> 
> hugang already has a patch, but I'm not 100% sure if I want it
> in. Yes, people seem to like this feature, but it complicates
> *design*, quite a lot.

It does. But if there were fundamental flaws in the approach, we would
have found them by now. Since you're using bio calls and not swap's own
read/write functions, you shouldn't have any problems.

> > - Support for SMP
> > - Support for preempt
> > - Support for 4GB highmem (hope to do 64GB soonish)
> 
> This works in swsusp1, too. Parts of SMP support need to be rewritten
> to assembly, but same is probably true for suspend2.

See separate comments - I think it can all stay as C.

> I'm not sure if there are still problems with swsusp1 refrigerator, if
> so add
> 
> - Suspend2 actually works under load
> 

Hopefully, we'll merge the refrigerator changes soon; then you can say
that for swsusp1 too.

Regards,

Nigel
-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Suspend 2 merge
  2004-11-24 20:46   ` Nigel Cunningham
@ 2004-11-25 19:20     ` Pavel Machek
  2004-11-25 22:34       ` Nigel Cunningham
  0 siblings, 1 reply; 51+ messages in thread
From: Pavel Machek @ 2004-11-25 19:20 UTC (permalink / raw)
  To: Nigel Cunningham
  Cc: Christoph Hellwig, Linux Kernel Mailing List, hugang, Andrew Morton

Hi!

> > Your way of merging looks rather wrong.  Please submit changes against the
> > current swsusp code that introduce one feature after another to bring it
> > at the level you want.  You'll surely have to rewrok it a lot until all
> > reviewers are happy.
> 
> I realise that it needs further cleanup; that's why I'm submitting it
> now for comment and not asking 'please apply'. As to patching against
> swsusp, I'm purposely not doing that. The reason is that suspend2 isn't
> a bunch of incremental changes to swsusp. It has been redesigned from
> the ground up and I'd have to pull swsusp to pieces and put it back
> together to do the same things.
>
> I'm thus seeking to simply merge the existing code, let Pavel and others
> get to the point where they're ready to say "Okay, we're satisfied that
> suspend2 does everything swsusp does and more and better." Then we can
> remove swsusp. This is the plan that was discussed with Pavel and Andrew
> ages ago. I've just been slow to get there because I'm doing this
> part-time voluntary.

hugang seems to show that it indeed is possible to incrementally turn
swsusp into suspend2. I do not think Andrew really wanted it that way,
and I thought of that as a necessary evil.

[Okay, at this point I'll understand if you put my picture as a
texture on some doom3 monster and shoot me a thousand times... A lot of
work went into suspend2, but in the meantime a lot of work went into
swsusp1, too...]

> > And most importantly for each patch explain exactly what feature it
> > implements and why, etc..  "swsusp2" tells exactly nothing about the
> > changed you do.
> 
> Okay. The changes include:
> 
> - Almost no BUG() statements. Wherever possible, if something goes
> wrong, we back out and give the user a perfectly usable system back

Patrick did a lot of work in this area, and there are 10 BUGs() in
swsusp just now. [And I do not think "no BUGs()" is a feature -- look
at my comments, at one point you just ignored "can not happen
condition". That's bad, it can hide real bugs.] I have no reports of
swsusp1 going BUG() for users, and that's what counts.

> - Speed: All I/O is asynchronous where possible and readahead used where
> not. Routines everywhere optimised to get things done as fast as poss.
> (Think low battery).

I fixed O(n^2) behaviour in swsusp1 (not yet in). I do not think that
asynchronous I/O makes that much difference.

> - Reliability. I haven't run the tests for a while, but Michael Frank
> produced a suite that was used to stress test the software (under 2.4)
> while running 100s (1000s at least once) of cycles. There have been some
> significant changes since then, but the software is essentially the
> same.

Well, swsusp1 is getting a lot of testing too. Is the test-suite
somewhere easily available?

> - Test bed: Around 10,000 downloads of the 1.0 patch, 2730 to date of
> the 2.1.5 version I released 2 weeks ago.

Hmm, look at the number of downloads of the 2.6.9 kernel, I think I win here
;-)))). SuSE9.2 is actually shipping swsusp1 and advertising it as a
feature. And it seems to work for people...

> - Swap file support
> - Support for LVM/dm-crypt and siblings
> - Support for having device drivers as modules (resume from an
> initrd/initramfs)

Okay, you win these.

> - Almost all memory allocations are order 0, making suspend more
> reliable under load.

I'll have to fix this. Fortunately hugang already has a patch.

> - Designed to save as much of memory as possible rather than as little
> (making the system more responsive post-resume).

hugang already has a patch, but I'm not 100% sure if I want it
in. Yes, people seem to like this feature, but it complicates
*design*, quite a lot.

> - Support for SMP
> - Support for preempt
> - Support for 4GB highmem (hope to do 64GB soonish)

This works in swsusp1, too. Parts of SMP support need to be rewritten
in assembly, but the same is probably true for suspend2.

I'm not sure if there are still problems with swsusp1 refrigerator, if
so add

- Suspend2 actually works under load

to the list.
								Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Suspend 2 merge
  2004-11-24 13:28 ` Christoph Hellwig
@ 2004-11-24 20:46   ` Nigel Cunningham
  2004-11-25 19:20     ` Pavel Machek
  0 siblings, 1 reply; 51+ messages in thread
From: Nigel Cunningham @ 2004-11-24 20:46 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Linux Kernel Mailing List

Hi.

On Thu, 2004-11-25 at 00:28, Christoph Hellwig wrote:
> Your way of merging looks rather wrong.  Please submit changes against the
> current swsusp code that introduce one feature after another to bring it
> at the level you want.  You'll surely have to rewrok it a lot until all
> reviewers are happy.

I realise that it needs further cleanup; that's why I'm submitting it
now for comment and not asking 'please apply'. As to patching against
swsusp, I'm purposely not doing that. The reason is that suspend2 isn't
a bunch of incremental changes to swsusp. It has been redesigned from
the ground up and I'd have to pull swsusp to pieces and put it back
together to do the same things.

I'm thus seeking to simply merge the existing code, let Pavel and others
get to the point where they're ready to say "Okay, we're satisfied that
suspend2 does everything swsusp does and more and better." Then we can
remove swsusp. This is the plan that was discussed with Pavel and Andrew
ages ago. I've just been slow to get there because I'm doing this
part-time, as a volunteer.

> And most importantly for each patch explain exactly what feature it
> implements and why, etc..  "swsusp2" tells exactly nothing about the
> changed you do.

Okay. The changes include:

- Almost no BUG() statements. Wherever possible, if something goes
wrong, we back out and give the user a perfectly usable system back
- Speed: All I/O is asynchronous where possible and readahead used where
not. Routines everywhere optimised to get things done as fast as poss.
(Think low battery).
- Flexible: You can tune performance to your system in a number of ways.
You can use/not use bootsplash, text output, compression drivers as you
choose. You can change your swap configuration without having to reboot
just to change the resume2= parameter. You can cancel a suspend if you
want, or disable the possibility of doing so.
- Reliability. I haven't run the tests for a while, but Michael Frank
produced a suite that was used to stress test the software (under 2.4)
while running 100s (1000s at least once) of cycles. There have been some
significant changes since then, but the software is essentially the
same.
- Test bed: Around 10,000 downloads of the 1.0 patch, 2730 to date of
the 2.1.5 version I released 2 weeks ago.
- Swap file support
- Support for LVM/dm-crypt and siblings
- Support for having device drivers as modules (resume from an
initrd/initramfs)
- Almost all memory allocations are order 0, making suspend more
reliable under load.
- Designed to save as much of memory as possible rather than as little
(making the system more responsive post-resume).
- Support for SMP
- Support for preempt
- Support for 4GB highmem (hope to do 64GB soonish)
- Support for suspending/resuming over a network possible but not yet
implemented (hope to do so soon)

I realise it's only some, but I think it gives you the gist :>

Nigel
-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: Suspend 2 merge
  2004-11-24 12:56 Suspend 2 merge Nigel Cunningham
@ 2004-11-24 13:28 ` Christoph Hellwig
  2004-11-24 20:46   ` Nigel Cunningham
  0 siblings, 1 reply; 51+ messages in thread
From: Christoph Hellwig @ 2004-11-24 13:28 UTC (permalink / raw)
  To: Nigel Cunningham; +Cc: Linux Kernel Mailing List

On Wed, Nov 24, 2004 at 11:56:35PM +1100, Nigel Cunningham wrote:
> Hi everyone.
> 
> I know that I still have work to do on suspend2, but thought it was high
> time I got around to properly submitting the code for review, so here
> goes.
> 
> I have it split up into 51 patches, of which most are less than 20k,
> although there are three 50k patches. Changes to the rest of the kernel
> tree come first, then the core. The full tree can be found at

Your way of merging looks rather wrong.  Please submit changes against the
current swsusp code that introduce one feature after another to bring it
to the level you want.  You'll surely have to rework it a lot until all
reviewers are happy.

And most importantly for each patch explain exactly what feature it
implements and why, etc..  "swsusp2" tells exactly nothing about the
changes you make.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Suspend 2 merge
@ 2004-11-24 12:56 Nigel Cunningham
  2004-11-24 13:28 ` Christoph Hellwig
  0 siblings, 1 reply; 51+ messages in thread
From: Nigel Cunningham @ 2004-11-24 12:56 UTC (permalink / raw)
  To: Linux Kernel Mailing List

Hi everyone.

I know that I still have work to do on suspend2, but thought it was high
time I got around to properly submitting the code for review, so here
goes.

I have it split up into 51 patches, of which most are less than 20k,
although there are three 50k patches. Changes to the rest of the kernel
tree come first, then the core. The full tree can be found at

http://suspend2.bkbits.net:8080/merge-tree

Regards,

Nigel
-- 
Nigel Cunningham
Pastoral Worker
Christian Reformed Church of Tuggeranong
PO Box 1004, Tuggeranong, ACT 2901

You see, at just the right time, when we were still powerless, Christ
died for the ungodly.		-- Romans 5:6


^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2005-01-11  3:19 UTC | newest]

Thread overview: 51+ messages
2004-11-27 22:07 Suspend 2 merge linux
2004-11-27 22:25 ` Pavel Machek
2004-11-28  8:29 ` Wichert Akkerman
2004-11-28 11:37   ` Pavel Machek
2004-11-28 16:23     ` software suspend patch [1/6] hugang
2004-11-28 16:24       ` software suspend patch [2/6] hugang
2004-11-28 17:03         ` Pavel Machek
2004-11-28 16:24       ` software suspend patch [3/6] hugang
2004-11-28 16:25       ` software suspend patch [4/6] hugang
2004-11-28 17:08         ` Pavel Machek
2004-11-28 16:25       ` software suspend patch [5/6] hugang
2004-11-28 16:25       ` software suspend patch [1/6] hugang
2004-11-28 17:11         ` Pavel Machek
2004-12-02 20:56           ` Guido Guenther
2004-11-28 16:58       ` Pavel Machek
2004-11-29 15:43         ` hugang
2005-01-09 22:43           ` Pavel Machek
2005-01-11  2:01             ` hugang
2005-01-11  3:15               ` Pavel Machek
  -- strict thread matches above, loose matches on Subject: below --
2004-11-24 12:56 Suspend 2 merge Nigel Cunningham
2004-11-24 13:28 ` Christoph Hellwig
2004-11-24 20:46   ` Nigel Cunningham
2004-11-25 19:20     ` Pavel Machek
2004-11-25 22:34       ` Nigel Cunningham
2004-11-25 23:22         ` Pavel Machek
2004-11-25 23:46           ` Nigel Cunningham
2004-11-26  0:39             ` Pavel Machek
2004-11-26  9:08               ` Nigel Cunningham
2004-11-26 12:38                 ` Pavel Machek
2004-11-26 15:54                   ` Christoph Hellwig
2004-11-26 22:36                     ` Pavel Machek
2004-11-28 22:35                   ` Nigel Cunningham
2004-11-28 23:55                     ` Pavel Machek
2004-11-29  3:20                       ` Nigel Cunningham
2004-11-29 13:03                         ` Pavel Machek
2004-11-30  0:24                           ` Nigel Cunningham
2004-11-30 10:19                             ` Pavel Machek
     [not found]               ` <20041126082109.GA842@hugang.soulinfo.com>
2004-11-26 13:25                 ` Pavel Machek
     [not found]               ` <20041126043203.GA2713@hugang.soulinfo.com>
2004-11-26  9:08                 ` Nigel Cunningham
2004-11-26 13:37                   ` Pavel Machek
2004-11-26 13:31                 ` Pavel Machek
2004-11-28 21:40               ` Nigel Cunningham
2004-11-29  9:34             ` Stefan Seyfried
2004-11-29 22:20               ` Nigel Cunningham
2004-11-29 22:34                 ` Pavel Machek
2004-11-30 12:16                 ` Stefan Seyfried
2004-11-30 21:16                   ` Nigel Cunningham
2004-11-30 22:20                     ` Pavel Machek
2004-12-01  9:27                       ` Nigel Cunningham
2004-12-01 10:08                         ` Pavel Machek
2004-12-01 20:39                           ` Nigel Cunningham
