linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH -mm] swsusp: userland interface (rev 2)
@ 2006-01-24  8:29 Rafael J. Wysocki
  2006-01-24 11:14 ` Pavel Machek
  2006-01-24 21:13 ` Andrew Morton
  0 siblings, 2 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2006-01-24  8:29 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Pavel Machek, LKML

Hi,

This patch introduces a user space interface for swsusp.

The interface is based on a special character device, called the snapshot
device, that allows user space processes to perform suspend and
resume-related operations with the help of some ioctls and the read()/write()
functions.  Additionally it allows these processes to allocate free swap pages
from a selected swap partition, called the resume partition, so that they know
which sectors of the resume partition are available to them.

The interface uses the same low-level system memory snapshot-handling
functions that are used by the built-it swap-writing/reading code of swsusp.

The interface documentation is included in the patch.

The patch assumes that the major and minor numbers of the snapshot device
will be 10 (ie. misc device) and 231, the registration of which has already been
requested.

Please apply (Pavel, please ack if it looks good).

Greetings,
Rafael


Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>

 Documentation/power/userland-swsusp.txt |  149 +++++++++++++++
 init/do_mounts_initrd.c                 |    1 
 kernel/power/Makefile                   |    2 
 kernel/power/power.h                    |   14 +
 kernel/power/snapshot.c                 |    2 
 kernel/power/user.c                     |  302 ++++++++++++++++++++++++++++++++
 6 files changed, 469 insertions(+), 1 deletion(-)

Index: linux-2.6.16-rc1-mm2/Documentation/power/userland-swsusp.txt
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.16-rc1-mm2/Documentation/power/userland-swsusp.txt	2006-01-23 16:04:12.000000000 +0100
@@ -0,0 +1,149 @@
+Documentation for userland software suspend interface
+	(C) 2006 Rafael J. Wysocki <rjw@sisk.pl>
+
+First, the warnings at the beginning of swsusp.txt still apply.
+
+Second, you should read the FAQ in swsusp.txt _now_ if you have not
+done it already.
+
+Now, to use the userland interface for software suspend you need special
+utilities that will read/write the system memory snapshot from/to the
+kernel.  Such utilities are available, for example, from
+<http://www.sisk.pl/kernel/utilities/suspend>.  You may want to have
+a look at them if you are going to develop your own suspend/resume
+utilities.
+
+The interface consists of a character device providing the open(),
+release(), read(), and write() operations as well as several ioctl()
+commands defined in kernel/power/power.h.  The major and minor
+numbers of the device are, respectively, 10 and 231, and they can
+be read from /sys/class/misc/snapshot/dev.
+
+The device can be open either for reading or for writing.  If open for
+reading, it is considered to be in the suspend mode.  Otherwise it is
+assumed to be in the resume mode.  The device cannot be open for reading
+and writing.  It is also impossible to have the device open more than once
+at a time.
+
+The ioctl() commands recognized by the device are:
+
+SNAPSHOT_FREEZE - freeze user space processes (the current process is
+	not frozen); this is required for SNAPSHOT_ATOMIC_SNAPSHOT
+	and SNAPSHOT_ATOMIC_RESTORE to succeed
+
+SNAPSHOT_UNFREEZE - thaw user space processes frozen by SNAPSHOT_FREEZE
+
+SNAPSHOT_ATOMIC_SNAPSHOT - create a snapshot of the system memory; the
+	last argument of ioctl() should be a pointer to an int variable,
+	the value of which will indicate whether the call returned after
+	creating the snapshot (1) or after restoring the system memory state
+	from it (0) (after resume the system finds itself finishing the
+	SNAPSHOT_ATOMIC_SNAPSHOT ioctl() again); after the snapshot
+	has been created the read() operation can be used to transfer
+	it out of the kernel
+
+SNAPSHOT_ATOMIC_RESTORE - restore the system memory state from the
+	uploaded snapshot image; before calling it you should transfer
+	the system memory snapshot back to the kernel using the write()
+	operation; this call will not succeed if the snapshot
+	image is not available to the kernel
+
+SNAPSHOT_FREE - free memory allocated for the snapshot image
+
+SNAPSHOT_SET_IMAGE_SIZE - set the preferred maximum size of the image
+	(the kernel will do its best to ensure the image size will not exceed
+	this number, but if it turns out to be impossible, the kernel will
+	create the smallest image possible)
+
+SNAPSHOT_AVAIL_SWAP - return the amount of available swap in bytes (the last
+	argument should be a pointer to an unsigned int variable that will
+	contain the result if the call is successful).
+
+SNAPSHOT_GET_SWAP_PAGE - allocate a swap page from the resume partition
+	(the last argument should be a pointer to a loff_t variable that
+	will contain the swap page offset if the call is successful)
+
+SNAPSHOT_FREE_SWAP_PAGES - free all swap pages allocated with
+	SNAPSHOT_GET_SWAP_PAGE
+
+SNAPSHOT_SET_SWAP_FILE - set the resume partition (the last ioctl() argument
+	should specify the device's major and minor numbers in the old
+	two-byte format, as returned by the stat() function in the .st_rdev
+	member of the stat structure); it is recommended to always use this
+	call, because the code to set the resume partition could be removed from
+	future kernels
+
+The device's read() operation can be used to transfer the snapshot image from
+the kernel.  It has the following limitations:
+- you cannot read() more than one virtual memory page at a time
+- read()s accross page boundaries are impossible (ie. if ypu read() 1/2 of
+	a page in the previous call, you will only be able to read()
+	_at_ _most_ 1/2 of the page in the next call)
+
+The device's write() operation is used for uploading the system memory snapshot
+into the kernel.  It has the same limitations as the read() operation.
+
+The release() operation frees all memory allocated for the snapshot image
+and all swap pages allocated with SNAPSHOT_GET_SWAP_PAGE (if any).
+Thus it is not necessary to use either SNAPSHOT_FREE or
+SNAPSHOT_FREE_SWAP_PAGES before closing the device (in fact it will also
+unfreeze user space processes frozen by SNAPSHOT_UNFREEZE if they are
+still frozen when the device is being closed).
+
+Currently it is assumed that the userland utilities reading/writing the
+snapshot image from/to the kernel will use a swap parition, called the resume
+partition, as storage space.  However, this is not really required, as they
+can use, for example, a special (blank) suspend partition or a file on a partition
+that is unmounted before SNAPSHOT_ATOMIC_SNAPSHOT and mounted afterwards.
+
+These utilities SHOULD NOT make any assumptions regarding the ordering of
+data within the snapshot image, except for the image header that MAY be
+assumed to start with an swsusp_info structure, as specified in
+kernel/power/power.h.  This structure MAY be used by the userland utilities
+to obtain some information about the snapshot image, such as the size
+of the snapshot image, including the metadata and the header itself,
+contained in the .size member of swsusp_info.
+
+The snapshot image MUST be written to the kernel unaltered (ie. all of the image
+data, metadata and header MUST be written in _exactly_ the same amount, form
+and order in which they have been read).  Otherwise, the behavior of the
+resumed system may be totally unpredictable.
+
+While executing SNAPSHOT_ATOMIC_RESTORE the kernel checks if the
+structure of the snapshot image is consistent with the information stored
+in the image header.  If any inconsistencies are detected,
+SNAPSHOT_ATOMIC_RESTORE will not succeed.  Still, this is not a fool-proof
+mechanism and the userland utilities using the interface SHOULD use additional
+means, such as checksums, to ensure the integrity of the snapshot image.
+
+The suspending and resuming utilities MUST lock themselves in memory,
+preferrably using mlockall(), before calling SNAPSHOT_FREEZE.
+
+The suspending utility MUST check the value stored by SNAPSHOT_ATOMIC_SNAPSHOT
+in the memory location pointed to by the last argument of ioctl() and proceed
+in accordance with it:
+1. 	If the value is 1 (ie. the system memory snapshot has just been
+	created and the system is ready for saving it):
+	(a)	The suspending utility MUST NOT close the snapshot device
+		_unless_ the whole suspend procedure is to be cancelled, in
+		which case, if the snapshot image has already been saved, the
+		suspending utility SHOULD destroy it, preferrably by zapping
+		its header.  If the suspend is not to be cancelled, the
+		system MUST be powered off or rebooted after the snapshot
+		image has been saved.
+	(b)	The suspending utility SHOULD NOT attempt to perform any
+		file system operations (including reads) on the file systems
+		that were mounted before SNAPSHOT_ATOMIC_SNAPSHOT has been
+		called.  However, it MAY mount a file system that was not
+		mounted at that time and perform some operations on it (eg.
+		use it for saving the image).
+2.	If the value is 0 (ie. the system state has just been restored from
+	the snapshot image), the suspending utility MUST close the snapshot
+	device.  Afterwards it will be treated as a regular userland process,
+	so it need not exit.
+
+The resuming utility SHOULD NOT attempt to mount any file systems that could
+be mounted before suspend and SHOULD NOT attempt to perform any operations
+involving such file systems.
+
+For details, please refer to the source code.
Index: linux-2.6.16-rc1-mm2/init/do_mounts_initrd.c
===================================================================
--- linux-2.6.16-rc1-mm2.orig/init/do_mounts_initrd.c	2006-01-23 15:53:39.000000000 +0100
+++ linux-2.6.16-rc1-mm2/init/do_mounts_initrd.c	2006-01-23 16:00:28.000000000 +0100
@@ -56,6 +56,7 @@
 	sys_chroot(".");
 	mount_devfs_fs ();
 
+	current->flags |= PF_NOFREEZE;
 	pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
 	if (pid > 0) {
 		while (pid != sys_wait4(-1, NULL, 0, NULL))
Index: linux-2.6.16-rc1-mm2/kernel/power/Makefile
===================================================================
--- linux-2.6.16-rc1-mm2.orig/kernel/power/Makefile	2006-01-23 15:53:39.000000000 +0100
+++ linux-2.6.16-rc1-mm2/kernel/power/Makefile	2006-01-23 16:00:28.000000000 +0100
@@ -5,7 +5,7 @@
 
 obj-y				:= main.o process.o console.o
 obj-$(CONFIG_PM_LEGACY)		+= pm.o
-obj-$(CONFIG_SOFTWARE_SUSPEND)	+= swsusp.o disk.o snapshot.o swap.o
+obj-$(CONFIG_SOFTWARE_SUSPEND)	+= swsusp.o disk.o snapshot.o swap.o user.o
 
 obj-$(CONFIG_SUSPEND_SMP)	+= smp.o
 
Index: linux-2.6.16-rc1-mm2/kernel/power/power.h
===================================================================
--- linux-2.6.16-rc1-mm2.orig/kernel/power/power.h	2006-01-23 15:56:03.000000000 +0100
+++ linux-2.6.16-rc1-mm2/kernel/power/power.h	2006-01-23 16:04:12.000000000 +0100
@@ -16,6 +16,7 @@
 	int			cpus;
 	unsigned long		image_pages;
 	unsigned long		pages;
+	unsigned long		size;
 } __attribute__((aligned(PAGE_SIZE)));
 
 
@@ -76,6 +77,19 @@
 extern int snapshot_write_next(struct snapshot_handle *handle, size_t count);
 int snapshot_image_loaded(struct snapshot_handle *handle);
 
+#define SNAPSHOT_IOC_MAGIC	'3'
+#define SNAPSHOT_FREEZE			_IO(SNAPSHOT_IOC_MAGIC, 1)
+#define SNAPSHOT_UNFREEZE		_IO(SNAPSHOT_IOC_MAGIC, 2)
+#define SNAPSHOT_ATOMIC_SNAPSHOT	_IOW(SNAPSHOT_IOC_MAGIC, 3, void *)
+#define SNAPSHOT_ATOMIC_RESTORE		_IO(SNAPSHOT_IOC_MAGIC, 4)
+#define SNAPSHOT_FREE			_IO(SNAPSHOT_IOC_MAGIC, 5)
+#define SNAPSHOT_SET_IMAGE_SIZE		_IOW(SNAPSHOT_IOC_MAGIC, 6, unsigned long)
+#define SNAPSHOT_AVAIL_SWAP		_IOR(SNAPSHOT_IOC_MAGIC, 7, void *)
+#define SNAPSHOT_GET_SWAP_PAGE		_IOR(SNAPSHOT_IOC_MAGIC, 8, void *)
+#define SNAPSHOT_FREE_SWAP_PAGES	_IO(SNAPSHOT_IOC_MAGIC, 9)
+#define SNAPSHOT_SET_SWAP_FILE		_IOW(SNAPSHOT_IOC_MAGIC, 10, unsigned int)
+#define SNAPSHOT_IOC_MAXNR	10
+
 /**
  *	The bitmap is used for tracing allocated swap pages
  *
Index: linux-2.6.16-rc1-mm2/kernel/power/user.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.16-rc1-mm2/kernel/power/user.c	2006-01-23 16:04:12.000000000 +0100
@@ -0,0 +1,302 @@
+/*
+ * linux/kernel/power/user.c
+ *
+ * This file provides the user space interface for software suspend/resume.
+ *
+ * Copyright (C) 2006 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * This file is released under the GPLv2.
+ *
+ */
+
+#include <linux/suspend.h>
+#include <linux/syscalls.h>
+#include <linux/string.h>
+#include <linux/device.h>
+#include <linux/miscdevice.h>
+#include <linux/mm.h>
+#include <linux/swap.h>
+#include <linux/swapops.h>
+#include <linux/pm.h>
+#include <linux/uaccess.h>
+#include <linux/fs.h>
+
+#include "power.h"
+
+#define SNAPSHOT_MINOR	231
+
+static struct snapshot_data {
+	struct snapshot_handle handle;
+	int swap;
+	struct bitmap_page *bitmap;
+	int mode;
+	char frozen;
+	char ready;
+} snapshot_state;
+
+static atomic_t device_available = ATOMIC_INIT(1);
+
+static int snapshot_open(struct inode *inode, struct file *filp)
+{
+	struct snapshot_data *data;
+
+	if (!atomic_dec_and_test(&device_available)) {
+		atomic_inc(&device_available);
+		return -EBUSY;
+	}
+
+	if ((filp->f_flags & O_ACCMODE) == O_RDWR)
+		return -ENOSYS;
+
+	nonseekable_open(inode, filp);
+	data = &snapshot_state;
+	filp->private_data = data;
+	memset(&data->handle, 0, sizeof(struct snapshot_handle));
+	if ((filp->f_flags & O_ACCMODE) == O_RDONLY) {
+		data->swap = swap_type_of(swsusp_resume_device);
+		data->mode = O_RDONLY;
+	} else {
+		data->swap = -1;
+		data->mode = O_WRONLY;
+	}
+	data->bitmap = NULL;
+	data->frozen = 0;
+	data->ready = 0;
+
+	return 0;
+}
+
+static int snapshot_release(struct inode *inode, struct file *filp)
+{
+	struct snapshot_data *data;
+
+	swsusp_free();
+	data = filp->private_data;
+	free_all_swap_pages(data->swap, data->bitmap);
+	free_bitmap(data->bitmap);
+	if (data->frozen) {
+		down(&pm_sem);
+		thaw_processes();
+		enable_nonboot_cpus();
+		up(&pm_sem);
+	}
+	atomic_inc(&device_available);
+	return 0;
+}
+
+static ssize_t snapshot_read(struct file *filp, char __user *buf,
+                             size_t count, loff_t *offp)
+{
+	struct snapshot_data *data;
+	ssize_t res;
+
+	data = filp->private_data;
+	res = snapshot_read_next(&data->handle, count);
+	if (res > 0) {
+		if (copy_to_user(buf, data_of(data->handle), res))
+			res = -EFAULT;
+		else
+			*offp = data->handle.offset;
+	}
+	return res;
+}
+
+static ssize_t snapshot_write(struct file *filp, const char __user *buf,
+                              size_t count, loff_t *offp)
+{
+	struct snapshot_data *data;
+	ssize_t res;
+
+	data = filp->private_data;
+	res = snapshot_write_next(&data->handle, count);
+	if (res > 0) {
+		if (copy_from_user(data_of(data->handle), buf, res))
+			res = -EFAULT;
+		else
+			*offp = data->handle.offset;
+	}
+	return res;
+}
+
+static int snapshot_ioctl(struct inode *inode, struct file *filp,
+                          unsigned int cmd, unsigned long arg)
+{
+	int error = 0;
+	struct snapshot_data *data;
+	loff_t offset, avail;
+
+	if (_IOC_TYPE(cmd) != SNAPSHOT_IOC_MAGIC)
+		return -ENOTTY;
+	if (_IOC_NR(cmd) > SNAPSHOT_IOC_MAXNR)
+		return -ENOTTY;
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	data = filp->private_data;
+
+	switch (cmd) {
+
+	case SNAPSHOT_FREEZE:
+		if (data->frozen)
+			break;
+		sys_sync();
+		down(&pm_sem);
+		pm_prepare_console();
+		disable_nonboot_cpus();
+		if (freeze_processes()) {
+			thaw_processes();
+			enable_nonboot_cpus();
+			pm_restore_console();
+			error = -EBUSY;
+		}
+		up(&pm_sem);
+		if (!error)
+			data->frozen = 1;
+		break;
+
+	case SNAPSHOT_UNFREEZE:
+		if (!data->frozen)
+			break;
+		down(&pm_sem);
+		thaw_processes();
+		enable_nonboot_cpus();
+		pm_restore_console();
+		up(&pm_sem);
+		data->frozen = 0;
+		break;
+
+	case SNAPSHOT_ATOMIC_SNAPSHOT:
+		if (data->mode != O_RDONLY || !data->frozen  || data->ready) {
+			error = -EPERM;
+			break;
+		}
+		down(&pm_sem);
+		/* Free memory before shutting down devices. */
+		error = swsusp_shrink_memory();
+		if (!error) {
+			error = device_suspend(PMSG_FREEZE);
+			if (!error) {
+				in_suspend = 1;
+				error = swsusp_suspend();
+				device_resume();
+			}
+		}
+		up(&pm_sem);
+		if (!error)
+			error = put_user(in_suspend, (unsigned int __user *)arg);
+		if (!error)
+			data->ready = 1;
+		break;
+
+	case SNAPSHOT_ATOMIC_RESTORE:
+		if (data->mode != O_WRONLY || !data->frozen ||
+		    !snapshot_image_loaded(&data->handle)) {
+			error = -EPERM;
+			break;
+		}
+		down(&pm_sem);
+		pm_prepare_console();
+		error = device_suspend(PMSG_FREEZE);
+		if (!error) {
+			mb();
+			error = swsusp_resume();
+			device_resume();
+		}
+		pm_restore_console();
+		up(&pm_sem);
+		break;
+
+	case SNAPSHOT_FREE:
+		swsusp_free();
+		memset(&data->handle, 0, sizeof(struct snapshot_handle));
+		data->ready = 0;
+		break;
+
+	case SNAPSHOT_SET_IMAGE_SIZE:
+		image_size = arg;
+		break;
+
+	case SNAPSHOT_AVAIL_SWAP:
+		avail = count_swap_pages(data->swap, 1);
+		avail <<= PAGE_SHIFT;
+		error = put_user(avail, (loff_t __user *)arg);
+		break;
+
+	case SNAPSHOT_GET_SWAP_PAGE:
+		if (!access_ok(VERIFY_WRITE, (unsigned long __user *)arg, _IOC_SIZE(cmd))) {
+			error = -EINVAL;
+			break;
+		}
+		if (data->swap < 0 || data->swap >= MAX_SWAPFILES) {
+			error = -ENODEV;
+			break;
+		}
+		if (!data->bitmap) {
+			data->bitmap = alloc_bitmap(count_swap_pages(data->swap, 0));
+			if (!data->bitmap) {
+				error = -ENOMEM;
+				break;
+			}
+		}
+		offset = alloc_swap_page(data->swap, data->bitmap);
+		if (offset) {
+			offset <<= PAGE_SHIFT;
+			__put_user(offset, (loff_t __user *)arg);
+		} else {
+			error = -ENOSPC;
+		}
+		break;
+
+	case SNAPSHOT_FREE_SWAP_PAGES:
+		if (data->swap >= 0 && data->swap < MAX_SWAPFILES) {
+			error = -ENODEV;
+			break;
+		}
+		free_all_swap_pages(data->swap, data->bitmap);
+		free_bitmap(data->bitmap);
+		data->bitmap = NULL;
+		break;
+
+	case SNAPSHOT_SET_SWAP_FILE:
+		if (!data->bitmap) {
+			/*
+			 * User space encodes device types as two-byte values,
+			 * so we need to recode them
+			 */
+			data->swap = swap_type_of(old_decode_dev(arg));
+			if (data->swap < 0)
+				error = -ENODEV;
+		} else {
+			error = -EPERM;
+		}
+		break;
+
+	default:
+		error = -ENOTTY;
+
+	}
+
+	return error;
+}
+
+static struct file_operations snapshot_fops = {
+	.open = snapshot_open,
+	.release = snapshot_release,
+	.read = snapshot_read,
+	.write = snapshot_write,
+	.llseek = no_llseek,
+	.ioctl = snapshot_ioctl,
+};
+
+static struct miscdevice snapshot_device = {
+	.minor = SNAPSHOT_MINOR,
+	.name = "snapshot",
+	.fops = &snapshot_fops,
+};
+
+static int __init snapshot_device_init(void)
+{
+	return misc_register(&snapshot_device);
+};
+
+device_initcall(snapshot_device_init);
Index: linux-2.6.16-rc1-mm2/kernel/power/snapshot.c
===================================================================
--- linux-2.6.16-rc1-mm2.orig/kernel/power/snapshot.c	2006-01-23 15:45:38.000000000 +0100
+++ linux-2.6.16-rc1-mm2/kernel/power/snapshot.c	2006-01-23 16:04:12.000000000 +0100
@@ -525,6 +525,8 @@
 	info->cpus = num_online_cpus();
 	info->image_pages = nr_copy_pages;
 	info->pages = nr_copy_pages + nr_meta_pages + 1;
+	info->size = info->pages;
+	info->size <<= PAGE_SHIFT;
 }
 
 /**

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24  8:29 [PATCH -mm] swsusp: userland interface (rev 2) Rafael J. Wysocki
@ 2006-01-24 11:14 ` Pavel Machek
  2006-01-24 21:13 ` Andrew Morton
  1 sibling, 0 replies; 25+ messages in thread
From: Pavel Machek @ 2006-01-24 11:14 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Andrew Morton, LKML

On Út 24-01-06 09:29:37, Rafael J. Wysocki wrote:
> Hi,
> 
> This patch introduces a user space interface for swsusp.
> 
> The interface is based on a special character device, called the snapshot
> device, that allows user space processes to perform suspend and
> resume-related operations with the help of some ioctls and the read()/write()
> functions.  Additionally it allows these processes to allocate free swap pages
> from a selected swap partition, called the resume partition, so that they know
> which sectors of the resume partition are available to them.
> 
> The interface uses the same low-level system memory snapshot-handling
> functions that are used by the built-it swap-writing/reading code of swsusp.
> 
> The interface documentation is included in the patch.
> 
> The patch assumes that the major and minor numbers of the snapshot device
> will be 10 (ie. misc device) and 231, the registration of which has already been
> requested.
> 
> Please apply (Pavel, please ack if it looks good).

ACK.
						Pavel

-- 
Thanks, Sharp!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24  8:29 [PATCH -mm] swsusp: userland interface (rev 2) Rafael J. Wysocki
  2006-01-24 11:14 ` Pavel Machek
@ 2006-01-24 21:13 ` Andrew Morton
  2006-01-24 21:30   ` Pavel Machek
  2006-01-24 23:35   ` Rafael J. Wysocki
  1 sibling, 2 replies; 25+ messages in thread
From: Andrew Morton @ 2006-01-24 21:13 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: pavel, linux-kernel

"Rafael J. Wysocki" <rjw@sisk.pl> wrote:
>
> Hi,
> 
> This patch introduces a user space interface for swsusp.

How will we know if/when this feature is ready for mainline?  What criteria
can we use to judge that?

Will you be developing and long-term maintaining the userspace tools?  Is
it your expectation/hope that distros will migrate onto using them?  etc.

> +
> +static int snapshot_open(struct inode *inode, struct file *filp)
> +{
> +	struct snapshot_data *data;
> +
> +	if (!atomic_dec_and_test(&device_available)) {
> +		atomic_inc(&device_available);

You may find that atomic_add_unless(..., -1, ...) is neater here, and
closes the tiny race.

> +		return -EBUSY;
> +	}
> +
> +	if ((filp->f_flags & O_ACCMODE) == O_RDWR)
> +		return -ENOSYS;
> +
> +	nonseekable_open(inode, filp);
> +	data = &snapshot_state;
> +	filp->private_data = data;
> +	memset(&data->handle, 0, sizeof(struct snapshot_handle));

<goes off hunting elsewhere for the defn of data->handle.  grr>

> +static ssize_t snapshot_read(struct file *filp, char __user *buf,
> +                             size_t count, loff_t *offp)
> +{
> +	struct snapshot_data *data;
> +	ssize_t res;
> +
> +	data = filp->private_data;
> +	res = snapshot_read_next(&data->handle, count);
> +	if (res > 0) {
> +		if (copy_to_user(buf, data_of(data->handle), res))
> +			res = -EFAULT;
> +		else
> +			*offp = data->handle.offset;
> +	}
> +	return res;
> +}

It's more conventional for a read() to return less-than-was-asked-for when
it hits a fault.  Doesn't matter though - lots of drivers do it this way.

> +static ssize_t snapshot_write(struct file *filp, const char __user *buf,
> +                              size_t count, loff_t *offp)
> +{
> +	struct snapshot_data *data;
> +	ssize_t res;
> +
> +	data = filp->private_data;
> +	res = snapshot_write_next(&data->handle, count);
> +	if (res > 0) {
> +		if (copy_from_user(data_of(data->handle), buf, res))
> +			res = -EFAULT;
> +		else
> +			*offp = data->handle.offset;
> +	}
> +	return res;
> +}

Ditto.

> +static int snapshot_ioctl(struct inode *inode, struct file *filp,
> +                          unsigned int cmd, unsigned long arg)
> +{
>
> ...
>
> +	case SNAPSHOT_ATOMIC_RESTORE:
> +		if (data->mode != O_WRONLY || !data->frozen ||
> +		    !snapshot_image_loaded(&data->handle)) {
> +			error = -EPERM;
> +			break;
> +		}
> +		down(&pm_sem);
> +		pm_prepare_console();
> +		error = device_suspend(PMSG_FREEZE);
> +		if (!error) {
> +			mb();
> +			error = swsusp_resume();
> +			device_resume();
> +		}

whee, what does the mystery barrier do?  It'd be nice to comment this
(please always comment open-coded barriers).

> +	case SNAPSHOT_GET_SWAP_PAGE:
> +		if (!access_ok(VERIFY_WRITE, (unsigned long __user *)arg, _IOC_SIZE(cmd))) {
> +			error = -EINVAL;
> +			break;
> +		}

Why do we need an access_ok() here?

Should it return -EFAULT?



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 21:13 ` Andrew Morton
@ 2006-01-24 21:30   ` Pavel Machek
  2006-01-24 21:58     ` Andrew Morton
  2006-01-24 23:35   ` Rafael J. Wysocki
  1 sibling, 1 reply; 25+ messages in thread
From: Pavel Machek @ 2006-01-24 21:30 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Rafael J. Wysocki, linux-kernel

Hi!

> > This patch introduces a user space interface for swsusp.
> 
> How will we know if/when this feature is ready for mainline?  What criteria
> can we use to judge that?

It was stable for me last time I tested. I do not think it needs
longer -mm testing than usual patches.

> Will you be developing and long-term maintaining the userspace tools?  Is
> it your expectation/hope that distros will migrate onto using them?
> etc.

It looks like I'll do it, or Rafael can have it as an original
author. They are currently hosted at sf.net/projects/suspend. SuSE is
very likely to use them for 10.2 or so -- we want to provide nice
splashscreen so that users are not scared :-), we would like to do
encryption/compression too, etc.

> > +static int snapshot_ioctl(struct inode *inode, struct file *filp,
> > +                          unsigned int cmd, unsigned long arg)
> > +{
> >
> > ...
> >
> > +	case SNAPSHOT_ATOMIC_RESTORE:
> > +		if (data->mode != O_WRONLY || !data->frozen ||
> > +		    !snapshot_image_loaded(&data->handle)) {
> > +			error = -EPERM;
> > +			break;
> > +		}
> > +		down(&pm_sem);
> > +		pm_prepare_console();
> > +		error = device_suspend(PMSG_FREEZE);
> > +		if (!error) {
> > +			mb();
> > +			error = swsusp_resume();
> > +			device_resume();
> > +		}
> 
> whee, what does the mystery barrier do?  It'd be nice to comment this
> (please always comment open-coded barriers).

It is probably relic from very early code, should not be needed, but
everyone is scared of removing it.
								Pavel
-- 
Thanks, Sharp!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 21:30   ` Pavel Machek
@ 2006-01-24 21:58     ` Andrew Morton
  2006-01-24 22:14       ` Pavel Machek
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2006-01-24 21:58 UTC (permalink / raw)
  To: Pavel Machek; +Cc: rjw, linux-kernel

Pavel Machek <pavel@ucw.cz> wrote:
>
> Hi!
> 
> > > This patch introduces a user space interface for swsusp.
> > 
> > How will we know if/when this feature is ready for mainline?  What criteria
> > can we use to judge that?
> 
> It was stable for me last time I tested. I do not think it needs
> longer -mm testing than usual patches.

One we've shipped the interface we're kinda stuck with it for ever, so it
does want to be pretty mature.

> > 
> > whee, what does the mystery barrier do?  It'd be nice to comment this
> > (please always comment open-coded barriers).
> 
> It is probably relic from very early code, should not be needed, but
> everyone is scared of removing it.

rofl.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 21:58     ` Andrew Morton
@ 2006-01-24 22:14       ` Pavel Machek
  2006-01-24 22:20         ` Dave Jones
  2006-01-24 23:53         ` Rafael J. Wysocki
  0 siblings, 2 replies; 25+ messages in thread
From: Pavel Machek @ 2006-01-24 22:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: rjw, linux-kernel

Hi!

> > > > This patch introduces a user space interface for swsusp.
> > > 
> > > How will we know if/when this feature is ready for mainline?  What criteria
> > > can we use to judge that?
> > 
> > It was stable for me last time I tested. I do not think it needs
> > longer -mm testing than usual patches.
> 
> One we've shipped the interface we're kinda stuck with it for ever, so it
> does want to be pretty mature.

Well, I think we got the interface pretty much right -- and it is
really pretty simple. It survived pretty nasty stress testing at one
point.

Of course, bad things happen. Having it merged but disabled in
Makefile would certainly be preferred than not merged at
all. Plus... stable kernel or not, it is new feature, and userland
suspending programs are quite closely tied to the kernel. I think it
is reasonable to expect users to have matching version of kernel and
userland-swsusp tools, at least before dust settles.
								Pavel
-- 
Thanks, Sharp!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 22:14       ` Pavel Machek
@ 2006-01-24 22:20         ` Dave Jones
  2006-01-24 22:33           ` Pavel Machek
  2006-01-24 23:53         ` Rafael J. Wysocki
  1 sibling, 1 reply; 25+ messages in thread
From: Dave Jones @ 2006-01-24 22:20 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Andrew Morton, rjw, linux-kernel

On Tue, Jan 24, 2006 at 11:14:26PM +0100, Pavel Machek wrote:

 > suspending programs are quite closely tied to the kernel. I think it
 > is reasonable to expect users to have matching version of kernel and
 > userland-swsusp tools

    _        _    ____  ____   ____  ____ _   _ _   _ _   _ _   _
   / \      / \  |  _ \|  _ \ / ___|/ ___| | | | | | | | | | | | |
  / _ \    / _ \ | |_) | |_) | |  _| |  _| |_| | |_| | |_| | |_| |
 / ___ \  / ___ \|  _ <|  _ <| |_| | |_| |  _  |  _  |  _  |  _  |
/_/   \_\/_/   \_\_| \_\_| \_\\____|\____|_| |_|_| |_|_| |_|_| |_|

Someone please stop this madness..

		Dave


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 22:20         ` Dave Jones
@ 2006-01-24 22:33           ` Pavel Machek
  2006-01-24 22:38             ` Dave Jones
  0 siblings, 1 reply; 25+ messages in thread
From: Pavel Machek @ 2006-01-24 22:33 UTC (permalink / raw)
  To: Dave Jones, Andrew Morton, rjw, linux-kernel

On Út 24-01-06 17:20:44, Dave Jones wrote:
> On Tue, Jan 24, 2006 at 11:14:26PM +0100, Pavel Machek wrote:
> 
>  > suspending programs are quite closely tied to the kernel. I think it
>  > is reasonable to expect users to have matching version of kernel and
>  > userland-swsusp tools
> 
>     _        _    ____  ____   ____  ____ _   _ _   _ _   _ _   _
>    / \      / \  |  _ \|  _ \ / ___|/ ___| | | | | | | | | | | | |
>   / _ \    / _ \ | |_) | |_) | |  _| |  _| |_| | |_| | |_| | |_| |
>  / ___ \  / ___ \|  _ <|  _ <| |_| | |_| |  _  |  _  |  _  |  _  |
> /_/   \_\/_/   \_\_| \_\_| \_\\____|\____|_| |_|_| |_|_| |_|_| |_|
> 
> Someone please stop this madness..

Nice ascii art :-).

We'll of course try to get the interface right at the first
try. OTOH... if wrong interface is in kernel for a month, I do not
think it is reasonable to keep supporting that wrong interface for a
year before it can be removed. One month of warning should be fair in
such case...
								Pavel

-- 
Thanks, Sharp!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 22:33           ` Pavel Machek
@ 2006-01-24 22:38             ` Dave Jones
  2006-01-24 22:44               ` Pavel Machek
  2006-01-24 22:47               ` Lee Revell
  0 siblings, 2 replies; 25+ messages in thread
From: Dave Jones @ 2006-01-24 22:38 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Andrew Morton, rjw, linux-kernel

 > We'll of course try to get the interface right at the first
 > try. OTOH... if wrong interface is in kernel for a month, I do not
 > think it is reasonable to keep supporting that wrong interface for a
 > year before it can be removed. One month of warning should be fair in
 > such case...

Users want to be able to boot between different kernels.
Tying functionality to specific versions of userspace completely
screws them over.

		Dave


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 22:38             ` Dave Jones
@ 2006-01-24 22:44               ` Pavel Machek
  2006-01-26  2:09                 ` Jim Crilly
  2006-01-24 22:47               ` Lee Revell
  1 sibling, 1 reply; 25+ messages in thread
From: Pavel Machek @ 2006-01-24 22:44 UTC (permalink / raw)
  To: Dave Jones, Andrew Morton, rjw, linux-kernel

On Út 24-01-06 17:38:34, Dave Jones wrote:
>  > We'll of course try to get the interface right at the first
>  > try. OTOH... if wrong interface is in kernel for a month, I do not
>  > think it is reasonable to keep supporting that wrong interface for a
>  > year before it can be removed. One month of warning should be fair in
>  > such case...
> 
> Users want to be able to boot between different kernels.
> Tying functionality to specific versions of userspace completely
> screws them over.

Well, by the time we have any _users_ interface should be
stable. Actually I believe interface will be stable from day 0, but...

								Pavel
-- 
Thanks, Sharp!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 22:38             ` Dave Jones
  2006-01-24 22:44               ` Pavel Machek
@ 2006-01-24 22:47               ` Lee Revell
  1 sibling, 0 replies; 25+ messages in thread
From: Lee Revell @ 2006-01-24 22:47 UTC (permalink / raw)
  To: Dave Jones; +Cc: Pavel Machek, Andrew Morton, rjw, linux-kernel

On Tue, 2006-01-24 at 17:38 -0500, Dave Jones wrote:
>  > We'll of course try to get the interface right at the first
>  > try. OTOH... if wrong interface is in kernel for a month, I do not
>  > think it is reasonable to keep supporting that wrong interface for a
>  > year before it can be removed. One month of warning should be fair in
>  > such case...
> 
> Users want to be able to boot between different kernels.
> Tying functionality to specific versions of userspace completely
> screws them over.

Surely it's OK to experiment with different interfaces in -mm, as long
as it doesn't hit mainline?  Otherwise what's the point of an
experimental branch?

Lee


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 21:13 ` Andrew Morton
  2006-01-24 21:30   ` Pavel Machek
@ 2006-01-24 23:35   ` Rafael J. Wysocki
  2006-01-25  2:46     ` Benjamin LaHaise
                       ` (2 more replies)
  1 sibling, 3 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2006-01-24 23:35 UTC (permalink / raw)
  To: Andrew Morton; +Cc: pavel, linux-kernel

Hi,

On Tuesday, 24 January 2006 22:13, Andrew Morton wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> >
> > Hi,
> > 
> > This patch introduces a user space interface for swsusp.
> 
> How will we know if/when this feature is ready for mainline?  What criteria
> can we use to judge that?

I think when we are able to demonstrate that it allows us to do more than
the current built-in swsusp in terms of performance, security etc.  Of course
we'll need some userland utilities for this purpose.

> Will you be developing and long-term maintaining the userspace tools?

Yes.

> Is it your expectation/hope that distros will migrate onto using them?  etc.

I think they'll find the interface useful.  I've been using it for a couple of
weeks now and it really allowed me to do some tricks that are just impossible
with the current implementation.

> > +
> > +static int snapshot_open(struct inode *inode, struct file *filp)
> > +{
> > +	struct snapshot_data *data;
> > +
> > +	if (!atomic_dec_and_test(&device_available)) {
> > +		atomic_inc(&device_available);
> 
> You may find that atomic_add_unless(..., -1, ...) is neater here, and
> closes the tiny race.

Well, actually I've taken this stuff verbatim from LDD3.

> > +		return -EBUSY;
> > +	}
> > +
> > +	if ((filp->f_flags & O_ACCMODE) == O_RDWR)
> > +		return -ENOSYS;
> > +
> > +	nonseekable_open(inode, filp);
> > +	data = &snapshot_state;
> > +	filp->private_data = data;
> > +	memset(&data->handle, 0, sizeof(struct snapshot_handle));
> 
> <goes off hunting elsewhere for the defn of data->handle.  grr>
> 
> > +static ssize_t snapshot_read(struct file *filp, char __user *buf,
> > +                             size_t count, loff_t *offp)
> > +{
> > +	struct snapshot_data *data;
> > +	ssize_t res;
> > +
> > +	data = filp->private_data;
> > +	res = snapshot_read_next(&data->handle, count);
> > +	if (res > 0) {
> > +		if (copy_to_user(buf, data_of(data->handle), res))
> > +			res = -EFAULT;
> > +		else
> > +			*offp = data->handle.offset;
> > +	}
> > +	return res;
> > +}
> 
> It's more conventional for a read() to return less-than-was-asked-for when
> it hits a fault.  Doesn't matter though - lots of drivers do it this way.

I thought about it, but this would increase the complexity of
snapshot_read_next() by two orders of magnitude, and this function is also
called by the built-in code which doesn't use the read-less-than-one-page-at-a-time
functionality anyway, so I decided against it.

> > +static ssize_t snapshot_write(struct file *filp, const char __user *buf,
> > +                              size_t count, loff_t *offp)
> > +{
> > +	struct snapshot_data *data;
> > +	ssize_t res;
> > +
> > +	data = filp->private_data;
> > +	res = snapshot_write_next(&data->handle, count);
> > +	if (res > 0) {
> > +		if (copy_from_user(data_of(data->handle), buf, res))
> > +			res = -EFAULT;
> > +		else
> > +			*offp = data->handle.offset;
> > +	}
> > +	return res;
> > +}
> 
> Ditto.
> 
> > +static int snapshot_ioctl(struct inode *inode, struct file *filp,
> > +                          unsigned int cmd, unsigned long arg)
> > +{
> >
> > ...
> >
> > +	case SNAPSHOT_ATOMIC_RESTORE:
> > +		if (data->mode != O_WRONLY || !data->frozen ||
> > +		    !snapshot_image_loaded(&data->handle)) {
> > +			error = -EPERM;
> > +			break;
> > +		}
> > +		down(&pm_sem);
> > +		pm_prepare_console();
> > +		error = device_suspend(PMSG_FREEZE);
> > +		if (!error) {
> > +			mb();
> > +			error = swsusp_resume();
> > +			device_resume();
> > +		}
> 
> whee, what does the mystery barrier do?  It'd be nice to comment this
> (please always comment open-coded barriers).

Pavel should know. ;-)
 
> > +	case SNAPSHOT_GET_SWAP_PAGE:
> > +		if (!access_ok(VERIFY_WRITE, (unsigned long __user *)arg, _IOC_SIZE(cmd))) {
> > +			error = -EINVAL;
> > +			break;
> > +		}
> 
> Why do we need an access_ok() here?

Because we use __put_user() down the road?

The problem is if the address is wrong we should not try to call
alloc_swap_page() at all.  If we did, we wouldn't be able to return the result
and we would leak a swap page.

> Should it return -EFAULT?

Yes, it should.

I'll post a small fix on top of this patch if you don't mind.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 22:14       ` Pavel Machek
  2006-01-24 22:20         ` Dave Jones
@ 2006-01-24 23:53         ` Rafael J. Wysocki
  2006-01-25  0:17           ` Andrew Morton
  1 sibling, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2006-01-24 23:53 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Andrew Morton, linux-kernel

Hi,

On Tuesday, 24 January 2006 23:14, Pavel Machek wrote:
> > > > > This patch introduces a user space interface for swsusp.
> > > > 
> > > > How will we know if/when this feature is ready for mainline?  What criteria
> > > > can we use to judge that?
> > > 
> > > It was stable for me last time I tested. I do not think it needs
> > > longer -mm testing than usual patches.
> > 
> > One we've shipped the interface we're kinda stuck with it for ever, so it
> > does want to be pretty mature.
> 
> Well, I think we got the interface pretty much right -- and it is
> really pretty simple. It survived pretty nasty stress testing at one
> point.

I think the ioctls defined so far won't change.  However, it's possible we'll
need some more (for example the suspend console handling is not
optimal now, so to speak).

IMHO the current version is sufficient to start with, but now we should
start working on userland tools seriously.  Then it'll turn out what else
is necessary.

> Of course, bad things happen. Having it merged but disabled in
> Makefile would certainly be preferred than not merged at
> all. Plus... stable kernel or not, it is new feature, and userland
> suspending programs are quite closely tied to the kernel. I think it
> is reasonable to expect users to have matching version of kernel and
> userland-swsusp tools, at least before dust settles.

I'm afraid that would be a nightmare for distributors who decide to use it.

IMHO after it gets into mainline every next version of the interface should
be backwards compatible with the previous one.

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 23:53         ` Rafael J. Wysocki
@ 2006-01-25  0:17           ` Andrew Morton
  2006-01-25  0:31             ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Morton @ 2006-01-25  0:17 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: pavel, linux-kernel

"Rafael J. Wysocki" <rjw@sisk.pl> wrote:
>
> IMHO after it gets into mainline every next version of the interface should
> be backwards compatible with the previous one.

That's insufficient.  Every new version of the userspace tools also needs
to be back-compatible with (and tested against!) older kernels.

Otherwise, people who upgrade both their kernel and userspace will find
that their old kernels don't work right, so they have to downgrade their
userspace tools too.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-25  0:17           ` Andrew Morton
@ 2006-01-25  0:31             ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2006-01-25  0:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: pavel, linux-kernel

On Wednesday, 25 January 2006 01:17, Andrew Morton wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> >
> > IMHO after it gets into mainline every next version of the interface should
> > be backwards compatible with the previous one.
> 
> That's insufficient.  Every new version of the userspace tools also needs
> to be back-compatible with (and tested against!) older kernels.

Certainly.  I only referred to the "new kernel with old userland tools" case,
but it goes both ways.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 23:35   ` Rafael J. Wysocki
@ 2006-01-25  2:46     ` Benjamin LaHaise
  2006-01-25 10:50       ` Rafael J. Wysocki
  2006-01-25 12:18     ` Pavel Machek
  2006-01-25 12:20     ` Pavel Machek
  2 siblings, 1 reply; 25+ messages in thread
From: Benjamin LaHaise @ 2006-01-25  2:46 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Andrew Morton, pavel, linux-kernel

On Wed, Jan 25, 2006 at 12:35:38AM +0100, Rafael J. Wysocki wrote:
> > > +		if (!access_ok(VERIFY_WRITE, (unsigned long __user *)arg, _IOC_SIZE(cmd))) {
> > > +			error = -EINVAL;
> > > +			break;
> > > +		}
> > 
> > Why do we need an access_ok() here?
> 
> Because we use __put_user() down the road?
> 
> The problem is if the address is wrong we should not try to call
> alloc_swap_page() at all.  If we did, we wouldn't be able to return the result
> and we would leak a swap page.

Then access_ok() is not the droid you are looking for... since it won't 
catch several cases (out of memory being the most obvious).  Doing an 
early put_user() wouldn't hurt and reduces the chance of later failure 
even further.  __put_user() should never be used outside of a select few 
performance critical code paths.

		-ben
-- 
"Ladies and gentlemen, I'm sorry to interrupt, but the police are here 
and they've asked us to stop the party."  Don't Email: <dont@kvack.org>.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-25  2:46     ` Benjamin LaHaise
@ 2006-01-25 10:50       ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2006-01-25 10:50 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Andrew Morton, pavel, linux-kernel

On Wednesday, 25 January 2006 03:46, Benjamin LaHaise wrote:
> On Wed, Jan 25, 2006 at 12:35:38AM +0100, Rafael J. Wysocki wrote:
> > > > +		if (!access_ok(VERIFY_WRITE, (unsigned long __user *)arg, _IOC_SIZE(cmd))) {
> > > > +			error = -EINVAL;
> > > > +			break;
> > > > +		}
> > > 
> > > Why do we need an access_ok() here?
> > 
> > Because we use __put_user() down the road?
> > 
> > The problem is if the address is wrong we should not try to call
> > alloc_swap_page() at all.  If we did, we wouldn't be able to return the result
> > and we would leak a swap page.
> 
> Then access_ok() is not the droid you are looking for... since it won't 
> catch several cases (out of memory being the most obvious).

Thanks, I haven't thought about it.

> Doing an early put_user() wouldn't hurt and reduces the chance of later failure 
> even further.  __put_user() should never be used outside of a select few 
> performance critical code paths.

Do you mean to use a fake put_user() instead of access_ok()?  And then
put_user() once again or is it reasonable to call __put_user() with the same
arg?

Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 23:35   ` Rafael J. Wysocki
  2006-01-25  2:46     ` Benjamin LaHaise
@ 2006-01-25 12:18     ` Pavel Machek
  2006-01-25 12:29       ` Rafael J. Wysocki
  2006-01-25 12:20     ` Pavel Machek
  2 siblings, 1 reply; 25+ messages in thread
From: Pavel Machek @ 2006-01-25 12:18 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Andrew Morton, linux-kernel

Hi!

> > > +	case SNAPSHOT_ATOMIC_RESTORE:
> > > +		if (data->mode != O_WRONLY || !data->frozen ||
> > > +		    !snapshot_image_loaded(&data->handle)) {
> > > +			error = -EPERM;
> > > +			break;
> > > +		}
> > > +		down(&pm_sem);
> > > +		pm_prepare_console();
> > > +		error = device_suspend(PMSG_FREEZE);
> > > +		if (!error) {
> > > +			mb();
> > > +			error = swsusp_resume();
> > > +			device_resume();
> > > +		}
> > 
> > whee, what does the mystery barrier do?  It'd be nice to comment this
> > (please always comment open-coded barriers).
> 
> Pavel should know. ;-)

Pavel does not known. That memory barrier should be part of assembly
parts, anyway, and AFAIK it is. Should be safe to kill.

> > > +	case SNAPSHOT_GET_SWAP_PAGE:
> > > +		if (!access_ok(VERIFY_WRITE, (unsigned long __user *)arg, _IOC_SIZE(cmd))) {
> > > +			error = -EINVAL;
> > > +			break;
> > > +		}
> > 
> > Why do we need an access_ok() here?
> 
> Because we use __put_user() down the road?
> 
> The problem is if the address is wrong we should not try to call
> alloc_swap_page() at all.  If we did, we wouldn't be able to return the result
> and we would leak a swap page.

I think you need to watch for failing put_user and free the page at
that point. Anything else is racy as __put_user() may fail.

							Pavel
-- 
Thanks, Sharp!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 23:35   ` Rafael J. Wysocki
  2006-01-25  2:46     ` Benjamin LaHaise
  2006-01-25 12:18     ` Pavel Machek
@ 2006-01-25 12:20     ` Pavel Machek
  2 siblings, 0 replies; 25+ messages in thread
From: Pavel Machek @ 2006-01-25 12:20 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Andrew Morton, linux-kernel

On St 25-01-06 00:35:38, Rafael J. Wysocki wrote:
> Hi,
> 
> On Tuesday, 24 January 2006 22:13, Andrew Morton wrote:
> > "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > >
> > > Hi,
> > > 
> > > This patch introduces a user space interface for swsusp.
> > 
> > How will we know if/when this feature is ready for mainline?  What criteria
> > can we use to judge that?
> 
> I think when we are able to demonstrate that it allows us to do more than
> the current built-in swsusp in terms of performance, security etc.  Of course
> we'll need some userland utilities for this purpose.
> 
> > Will you be developing and long-term maintaining the userspace tools?
> 
> Yes.
> 
> > Is it your expectation/hope that distros will migrate onto using them?  etc.
> 
> I think they'll find the interface useful.  I've been using it for a couple of
> weeks now and it really allowed me to do some tricks that are just impossible
> with the current implementation.

Interesting... what tricks? Where is the latest code? [On a related
note, I should probably give you suspend.sf.net account. Do you
already have login on sourceforge?]
								Pavel
-- 
Thanks, Sharp!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-25 12:18     ` Pavel Machek
@ 2006-01-25 12:29       ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2006-01-25 12:29 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Andrew Morton, linux-kernel

Hi,

On Wednesday, 25 January 2006 13:18, Pavel Machek wrote:
> > > > +	case SNAPSHOT_ATOMIC_RESTORE:
> > > > +		if (data->mode != O_WRONLY || !data->frozen ||
> > > > +		    !snapshot_image_loaded(&data->handle)) {
> > > > +			error = -EPERM;
> > > > +			break;
> > > > +		}
> > > > +		down(&pm_sem);
> > > > +		pm_prepare_console();
> > > > +		error = device_suspend(PMSG_FREEZE);
> > > > +		if (!error) {
> > > > +			mb();
> > > > +			error = swsusp_resume();
> > > > +			device_resume();
> > > > +		}
> > > 
> > > whee, what does the mystery barrier do?  It'd be nice to comment this
> > > (please always comment open-coded barriers).
> > 
> > Pavel should know. ;-)
> 
> Pavel does not known. That memory barrier should be part of assembly
> parts, anyway, and AFAIK it is. Should be safe to kill.

OK

> > > > +	case SNAPSHOT_GET_SWAP_PAGE:
> > > > +		if (!access_ok(VERIFY_WRITE, (unsigned long __user *)arg, _IOC_SIZE(cmd))) {
> > > > +			error = -EINVAL;
> > > > +			break;
> > > > +		}
> > > 
> > > Why do we need an access_ok() here?
> > 
> > Because we use __put_user() down the road?
> > 
> > The problem is if the address is wrong we should not try to call
> > alloc_swap_page() at all.  If we did, we wouldn't be able to return the result
> > and we would leak a swap page.
> 
> I think you need to watch for failing put_user and free the page at
> that point. Anything else is racy as __put_user() may fail.

The page will be freed anyway when the device is closed (I was wrong
saying it would be "leaked"), so I think I'll just use put_user().

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-24 22:44               ` Pavel Machek
@ 2006-01-26  2:09                 ` Jim Crilly
  2006-01-26  7:58                   ` Pavel Machek
                                     ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Jim Crilly @ 2006-01-26  2:09 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Dave Jones, Andrew Morton, rjw, linux-kernel

On 01/24/06 11:44:37PM +0100, Pavel Machek wrote:
> On Út 24-01-06 17:38:34, Dave Jones wrote:
> >  > We'll of course try to get the interface right at the first
> >  > try. OTOH... if wrong interface is in kernel for a month, I do not
> >  > think it is reasonable to keep supporting that wrong interface for a
> >  > year before it can be removed. One month of warning should be fair in
> >  > such case...
> > 
> > Users want to be able to boot between different kernels.
> > Tying functionality to specific versions of userspace completely
> > screws them over.
> 
> Well, by the time we have any _users_ interface should be
> stable. Actually I believe interface will be stable from day 0, but...
>

I'm sure gregkh thought the same thing with about sysfs and udev and we've
seen how well that's worked out...

> 								Pavel

Jim.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-26  2:09                 ` Jim Crilly
@ 2006-01-26  7:58                   ` Pavel Machek
  2006-01-27  1:11                   ` Greg KH
  2006-01-27  3:42                   ` Kay Sievers
  2 siblings, 0 replies; 25+ messages in thread
From: Pavel Machek @ 2006-01-26  7:58 UTC (permalink / raw)
  To: Dave Jones, Andrew Morton, rjw, linux-kernel

On St 25-01-06 21:09:27, Jim Crilly wrote:
> On 01/24/06 11:44:37PM +0100, Pavel Machek wrote:
> > On Út 24-01-06 17:38:34, Dave Jones wrote:
> > >  > We'll of course try to get the interface right at the first
> > >  > try. OTOH... if wrong interface is in kernel for a month, I do not
> > >  > think it is reasonable to keep supporting that wrong interface for a
> > >  > year before it can be removed. One month of warning should be fair in
> > >  > such case...
> > > 
> > > Users want to be able to boot between different kernels.
> > > Tying functionality to specific versions of userspace completely
> > > screws them over.
> > 
> > Well, by the time we have any _users_ interface should be
> > stable. Actually I believe interface will be stable from day 0, but...
> >
> 
> I'm sure gregkh thought the same thing with about sysfs and udev and we've
> seen how well that's worked out...

Quite well, I'd say. But whats your alternative, just stop adding
interfaces to the kernel? Oops...
							Pavel
-- 
Thanks, Sharp!

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-26  2:09                 ` Jim Crilly
  2006-01-26  7:58                   ` Pavel Machek
@ 2006-01-27  1:11                   ` Greg KH
  2006-01-27  3:42                   ` Kay Sievers
  2 siblings, 0 replies; 25+ messages in thread
From: Greg KH @ 2006-01-27  1:11 UTC (permalink / raw)
  To: Pavel Machek, Dave Jones, Andrew Morton, rjw, linux-kernel

On Wed, Jan 25, 2006 at 09:09:27PM -0500, Jim Crilly wrote:
> On 01/24/06 11:44:37PM +0100, Pavel Machek wrote:
> > On ?t 24-01-06 17:38:34, Dave Jones wrote:
> > >  > We'll of course try to get the interface right at the first
> > >  > try. OTOH... if wrong interface is in kernel for a month, I do not
> > >  > think it is reasonable to keep supporting that wrong interface for a
> > >  > year before it can be removed. One month of warning should be fair in
> > >  > such case...
> > > 
> > > Users want to be able to boot between different kernels.
> > > Tying functionality to specific versions of userspace completely
> > > screws them over.
> > 
> > Well, by the time we have any _users_ interface should be
> > stable. Actually I believe interface will be stable from day 0, but...
> >
> 
> I'm sure gregkh thought the same thing with about sysfs and udev and we've
> seen how well that's worked out...

Yes, I think it's worked out quite well, except for the bugs in udev
that did not anticipate kernel changes properly.  But again, those were
udev bugs, not kernel bugs.

If you have specific udev issues, that are not distro related, please
bring them up on the linux-hotplug-devel mailing list so that people can
help you out.

Or don't use udev at all, no one is forcing you :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-26  2:09                 ` Jim Crilly
  2006-01-26  7:58                   ` Pavel Machek
  2006-01-27  1:11                   ` Greg KH
@ 2006-01-27  3:42                   ` Kay Sievers
  2006-01-27 11:24                     ` Rafael J. Wysocki
  2 siblings, 1 reply; 25+ messages in thread
From: Kay Sievers @ 2006-01-27  3:42 UTC (permalink / raw)
  To: Pavel Machek, Dave Jones, Andrew Morton, rjw, linux-kernel

On Wed, Jan 25, 2006 at 09:09:27PM -0500, Jim Crilly wrote:
> On 01/24/06 11:44:37PM +0100, Pavel Machek wrote:
> > On Út 24-01-06 17:38:34, Dave Jones wrote:
> > >  > We'll of course try to get the interface right at the first
> > >  > try. OTOH... if wrong interface is in kernel for a month, I do not
> > >  > think it is reasonable to keep supporting that wrong interface for a
> > >  > year before it can be removed. One month of warning should be fair in
> > >  > such case...
> > > 
> > > Users want to be able to boot between different kernels.
> > > Tying functionality to specific versions of userspace completely
> > > screws them over.
> > 
> > Well, by the time we have any _users_ interface should be
> > stable. Actually I believe interface will be stable from day 0, but...
> >
> I'm sure gregkh thought the same thing with about sysfs and udev and we've
> seen how well that's worked out...

Well, that was just an unfortunate "bug".

Declaring interfaces "stable" makes as much sense as all the other
tries to define crazy enterprise "standards" nobody follows in real
world.

In a developing environment, interfaces _become_ stable and don't get
_declared_ by anybody as such. We are not talking about syscall
interfaces or things that are simple enough to be kept stable, if you
cross a certain level of complexity, you just can't apply these rules
anymore.

Interfaces mature over the time they get used. Only the _use_ of it
collects the needed information to form the model behind it. They get
improved up to the point that changing the interface causes more
pain than it's worth this change. Then an interface has _become_ "stable"
cause it makes sense at that point.

"by the time we have any _users_ interface should be stable", that's
such a nonsense. If you don't have any user, you don't know if this
interface works at all and only if it gets used you get the needed
feedback to improve it.

Kay

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH -mm] swsusp: userland interface (rev 2)
  2006-01-27  3:42                   ` Kay Sievers
@ 2006-01-27 11:24                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2006-01-27 11:24 UTC (permalink / raw)
  To: Kay Sievers; +Cc: Pavel Machek, Dave Jones, Andrew Morton, linux-kernel

Hi,

On Friday, 27 January 2006 04:42, Kay Sievers wrote:
> On Wed, Jan 25, 2006 at 09:09:27PM -0500, Jim Crilly wrote:
> > On 01/24/06 11:44:37PM +0100, Pavel Machek wrote:
> > > On Út 24-01-06 17:38:34, Dave Jones wrote:
> > > >  > We'll of course try to get the interface right at the first
> > > >  > try. OTOH... if wrong interface is in kernel for a month, I do not
> > > >  > think it is reasonable to keep supporting that wrong interface for a
> > > >  > year before it can be removed. One month of warning should be fair in
> > > >  > such case...
> > > > 
> > > > Users want to be able to boot between different kernels.
> > > > Tying functionality to specific versions of userspace completely
> > > > screws them over.
> > > 
> > > Well, by the time we have any _users_ interface should be
> > > stable. Actually I believe interface will be stable from day 0, but...
> > >
> > I'm sure gregkh thought the same thing with about sysfs and udev and we've
> > seen how well that's worked out...
> 
> Well, that was just an unfortunate "bug".
> 
> Declaring interfaces "stable" makes as much sense as all the other
> tries to define crazy enterprise "standards" nobody follows in real
> world.
> 
> In a developing environment, interfaces _become_ stable and don't get
> _declared_ by anybody as such. We are not talking about syscall
> interfaces or things that are simple enough to be kept stable, if you
> cross a certain level of complexity, you just can't apply these rules
> anymore.
> 
> Interfaces mature over the time they get used. Only the _use_ of it
> collects the needed information to form the model behind it. They get
> improved up to the point that changing the interface causes more
> pain than it's worth this change. Then an interface has _become_ "stable"
> cause it makes sense at that point.

Agreed.

> "by the time we have any _users_ interface should be stable", that's
> such a nonsense. If you don't have any user, you don't know if this
> interface works at all and only if it gets used you get the needed
> feedback to improve it.

I think Pavel meant "users who don't work on it themselves".

Greetings,
Rafael

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2006-01-27 11:22 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-01-24  8:29 [PATCH -mm] swsusp: userland interface (rev 2) Rafael J. Wysocki
2006-01-24 11:14 ` Pavel Machek
2006-01-24 21:13 ` Andrew Morton
2006-01-24 21:30   ` Pavel Machek
2006-01-24 21:58     ` Andrew Morton
2006-01-24 22:14       ` Pavel Machek
2006-01-24 22:20         ` Dave Jones
2006-01-24 22:33           ` Pavel Machek
2006-01-24 22:38             ` Dave Jones
2006-01-24 22:44               ` Pavel Machek
2006-01-26  2:09                 ` Jim Crilly
2006-01-26  7:58                   ` Pavel Machek
2006-01-27  1:11                   ` Greg KH
2006-01-27  3:42                   ` Kay Sievers
2006-01-27 11:24                     ` Rafael J. Wysocki
2006-01-24 22:47               ` Lee Revell
2006-01-24 23:53         ` Rafael J. Wysocki
2006-01-25  0:17           ` Andrew Morton
2006-01-25  0:31             ` Rafael J. Wysocki
2006-01-24 23:35   ` Rafael J. Wysocki
2006-01-25  2:46     ` Benjamin LaHaise
2006-01-25 10:50       ` Rafael J. Wysocki
2006-01-25 12:18     ` Pavel Machek
2006-01-25 12:29       ` Rafael J. Wysocki
2006-01-25 12:20     ` Pavel Machek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).