All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oren Laadan <orenl@cs.columbia.edu>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@osdl.org>,
	containers@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-api@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>,
	Serge Hallyn <serue@us.ibm.com>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@elte.hu>, "H. Peter Anvin" <hpa@zytor.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Oren Laadan <orenl@cs.columbia.edu>
Subject: [RFC v13][PATCH 04/14] General infrastructure for checkpoint restart
Date: Tue, 27 Jan 2009 12:08:02 -0500	[thread overview]
Message-ID: <1233076092-8660-5-git-send-email-orenl@cs.columbia.edu> (raw)
In-Reply-To: <1233076092-8660-1-git-send-email-orenl@cs.columbia.edu>

Add those interfaces, as well as helpers needed to easily manage the
file format. The code is roughly broken out as follows:

checkpoint/sys.c - user/kernel data transfer, as well as setup of the
  CR context (a per-checkpoint data structure for housekeeping)
checkpoint/checkpoint.c - output wrappers and basic checkpoint handling
checkpoint/restart.c - input wrappers and basic restart handling

For now, we can only checkpoint the 'current' task ("self" checkpoint),
and the 'pid' argument to to the syscall is ignored.

Patches to add the per-architecture support as well as the actual
work to do the memory checkpoint follow in subsequent patches.

Changelog[v12]:
  - cr_kwrite/cr_kread() again use vfs_read(), vfs_write() (safer)
  - Split cr_write/cr_read() to two parts: _cr_write/read() helper
  - Befriend with sparse : explicit conversion to 'void __user *'
  - Redfine 'pr_fmt' instead of using special cr_debug()

Changelog[v10]:
  - add cr_write_buffer(), cr_read_buffer() and cr_read_buf_type()
  - force end-of-string in cr_read_string() (fix possible DoS)

Changelog[v9]:
  - cr_kwrite/cr_kread() use file->f_op->write() directly
  - Drop cr_uwrite/cr_uread() since they aren't used anywhere

Changelog[v6]:
  - Balance all calls to cr_hbuf_get() with matching cr_hbuf_put()
    (although it's not really needed)

Changelog[v5]:
  - Rename headers files s/ckpt/checkpoint/

Changelog[v2]:
  - Added utsname->{release,version,machine} to checkpoint header
  - Pad header structures to 64 bits to ensure compatibility

Signed-off-by: Oren Laadan <orenl@cs.columbia.edu>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
---
 Makefile                       |    2 +-
 checkpoint/Makefile            |    2 +-
 checkpoint/checkpoint.c        |  188 +++++++++++++++++++++++++++++++
 checkpoint/restart.c           |  239 ++++++++++++++++++++++++++++++++++++++++
 checkpoint/sys.c               |  216 +++++++++++++++++++++++++++++++++++-
 include/linux/checkpoint.h     |   61 ++++++++++
 include/linux/checkpoint_hdr.h |   85 ++++++++++++++
 include/linux/magic.h          |    3 +
 8 files changed, 790 insertions(+), 6 deletions(-)
 create mode 100644 checkpoint/checkpoint.c
 create mode 100644 checkpoint/restart.c
 create mode 100644 include/linux/checkpoint.h
 create mode 100644 include/linux/checkpoint_hdr.h

diff --git a/Makefile b/Makefile
index 71e98e9..c1744a1 100644
--- a/Makefile
+++ b/Makefile
@@ -619,7 +619,7 @@ export mod_strip_cmd
 
 
 ifeq ($(KBUILD_EXTMOD),)
-core-y		+= kernel/ mm/ fs/ ipc/ security/ crypto/ block/
+core-y		+= kernel/ mm/ fs/ ipc/ security/ crypto/ block/ checkpoint/
 
 vmlinux-dirs	:= $(patsubst %/,%,$(filter %/, $(init-y) $(init-m) \
 		     $(core-y) $(core-m) $(drivers-y) $(drivers-m) \
diff --git a/checkpoint/Makefile b/checkpoint/Makefile
index 07d018b..d2df68c 100644
--- a/checkpoint/Makefile
+++ b/checkpoint/Makefile
@@ -2,4 +2,4 @@
 # Makefile for linux checkpoint/restart.
 #
 
-obj-$(CONFIG_CHECKPOINT_RESTART) += sys.o
+obj-$(CONFIG_CHECKPOINT_RESTART) += sys.o checkpoint.o restart.o
diff --git a/checkpoint/checkpoint.c b/checkpoint/checkpoint.c
new file mode 100644
index 0000000..35bf99b
--- /dev/null
+++ b/checkpoint/checkpoint.c
@@ -0,0 +1,188 @@
+/*
+ *  Checkpoint logic and helpers
+ *
+ *  Copyright (C) 2008 Oren Laadan
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#include <linux/version.h>
+#include <linux/sched.h>
+#include <linux/time.h>
+#include <linux/fs.h>
+#include <linux/file.h>
+#include <linux/dcache.h>
+#include <linux/mount.h>
+#include <linux/utsname.h>
+#include <linux/magic.h>
+#include <linux/checkpoint.h>
+#include <linux/checkpoint_hdr.h>
+
+/* unique checkpoint identifier (FIXME: should be per-container ?) */
+static atomic_t cr_ctx_count = ATOMIC_INIT(0);
+
+/**
+ * cr_write_obj - write a record described by a cr_hdr
+ * @ctx: checkpoint context
+ * @h: record descriptor
+ * @buf: record buffer
+ */
+int cr_write_obj(struct cr_ctx *ctx, struct cr_hdr *h, void *buf)
+{
+	int ret;
+
+	ret = cr_kwrite(ctx, h, sizeof(*h));
+	if (ret < 0)
+		return ret;
+	return cr_kwrite(ctx, buf, h->len);
+}
+
+/**
+ * cr_write_buffer - write a buffer
+ * @ctx: checkpoint context
+ * @str: buffer pointer
+ * @len: buffer size
+ */
+int cr_write_buffer(struct cr_ctx *ctx, void *buf, int len)
+{
+	struct cr_hdr h;
+
+	h.type = CR_HDR_BUFFER;
+	h.len = len;
+	h.parent = 0;
+
+	return cr_write_obj(ctx, &h, buf);
+}
+
+/**
+ * cr_write_string - write a string
+ * @ctx: checkpoint context
+ * @str: string pointer
+ * @len: string length
+ */
+int cr_write_string(struct cr_ctx *ctx, char *str, int len)
+{
+	struct cr_hdr h;
+
+	h.type = CR_HDR_STRING;
+	h.len = len;
+	h.parent = 0;
+
+	return cr_write_obj(ctx, &h, str);
+}
+
+/* write the checkpoint header */
+static int cr_write_head(struct cr_ctx *ctx)
+{
+	struct cr_hdr h;
+	struct cr_hdr_head *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	struct new_utsname *uts;
+	struct timeval ktv;
+	int ret;
+
+	h.type = CR_HDR_HEAD;
+	h.len = sizeof(*hh);
+	h.parent = 0;
+
+	do_gettimeofday(&ktv);
+
+	hh->magic = CHECKPOINT_MAGIC_HEAD;
+	hh->major = (LINUX_VERSION_CODE >> 16) & 0xff;
+	hh->minor = (LINUX_VERSION_CODE >> 8) & 0xff;
+	hh->patch = (LINUX_VERSION_CODE) & 0xff;
+
+	hh->rev = CR_VERSION;
+
+	hh->flags = ctx->flags;
+	hh->time = ktv.tv_sec;
+
+	uts = utsname();
+	memcpy(hh->release, uts->release, __NEW_UTS_LEN);
+	memcpy(hh->version, uts->version, __NEW_UTS_LEN);
+	memcpy(hh->machine, uts->machine, __NEW_UTS_LEN);
+
+	ret = cr_write_obj(ctx, &h, hh);
+	cr_hbuf_put(ctx, sizeof(*hh));
+	return ret;
+}
+
+/* write the checkpoint trailer */
+static int cr_write_tail(struct cr_ctx *ctx)
+{
+	struct cr_hdr h;
+	struct cr_hdr_tail *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	int ret;
+
+	h.type = CR_HDR_TAIL;
+	h.len = sizeof(*hh);
+	h.parent = 0;
+
+	hh->magic = CHECKPOINT_MAGIC_TAIL;
+
+	ret = cr_write_obj(ctx, &h, hh);
+	cr_hbuf_put(ctx, sizeof(*hh));
+	return ret;
+}
+
+/* dump the task_struct of a given task */
+static int cr_write_task_struct(struct cr_ctx *ctx, struct task_struct *t)
+{
+	struct cr_hdr h;
+	struct cr_hdr_task *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	int ret;
+
+	h.type = CR_HDR_TASK;
+	h.len = sizeof(*hh);
+	h.parent = 0;
+
+	hh->state = t->state;
+	hh->exit_state = t->exit_state;
+	hh->exit_code = t->exit_code;
+	hh->exit_signal = t->exit_signal;
+
+	hh->task_comm_len = TASK_COMM_LEN;
+
+	/* FIXME: save remaining relevant task_struct fields */
+
+	ret = cr_write_obj(ctx, &h, hh);
+	cr_hbuf_put(ctx, sizeof(*hh));
+	if (ret < 0)
+		return ret;
+
+	return cr_write_string(ctx, t->comm, TASK_COMM_LEN);
+}
+
+/* dump the entire state of a given task */
+static int cr_write_task(struct cr_ctx *ctx, struct task_struct *t)
+{
+	int ret;
+
+	ret = cr_write_task_struct(ctx, t);
+	pr_debug("ret %d\n", ret);
+
+	return ret;
+}
+
+int do_checkpoint(struct cr_ctx *ctx, pid_t pid)
+{
+	int ret;
+
+	ret = cr_write_head(ctx);
+	if (ret < 0)
+		goto out;
+	ret = cr_write_task(ctx, current);
+	if (ret < 0)
+		goto out;
+	ret = cr_write_tail(ctx);
+	if (ret < 0)
+		goto out;
+
+	ctx->crid = atomic_inc_return(&cr_ctx_count);
+
+	/* on success, return (unique) checkpoint identifier */
+	ret = ctx->crid;
+ out:
+	return ret;
+}
diff --git a/checkpoint/restart.c b/checkpoint/restart.c
new file mode 100644
index 0000000..4741f4a
--- /dev/null
+++ b/checkpoint/restart.c
@@ -0,0 +1,239 @@
+/*
+ *  Restart logic and helpers
+ *
+ *  Copyright (C) 2008 Oren Laadan
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#include <linux/version.h>
+#include <linux/sched.h>
+#include <linux/file.h>
+#include <linux/magic.h>
+#include <linux/checkpoint.h>
+#include <linux/checkpoint_hdr.h>
+
+/**
+ * cr_read_obj - read a whole record (cr_hdr followed by payload)
+ * @ctx: checkpoint context
+ * @h: record descriptor
+ * @buf: record buffer
+ * @len: available buffer size
+ */
+int cr_read_obj(struct cr_ctx *ctx, struct cr_hdr *h, void *buf, int len)
+{
+	int ret;
+
+	ret = cr_kread(ctx, h, sizeof(*h));
+	if (ret < 0)
+		return ret;
+
+	pr_debug("type %d len %d parent %d\n", h->type, h->len, h->parent);
+
+	if (h->len < 0 || h->len > len)
+		return -EINVAL;
+
+	return cr_kread(ctx, buf, h->len);
+}
+
+/**
+ * cr_read_obj_type - read a whole record of expected type and size
+ * @ctx: checkpoint context
+ * @buf: record buffer
+ * @n: expected record size
+ * @type: expected record type
+ *
+ * Returns objref of the parent object
+ */
+int cr_read_obj_type(struct cr_ctx *ctx, void *buf, int len, int type)
+{
+	struct cr_hdr h;
+	int ret;
+
+	ret = cr_read_obj(ctx, &h, buf, len);
+	if (ret < 0)
+		return ret;
+
+	if (h.len != len || h.type != type)
+		return -EINVAL;
+
+	return h.parent;
+}
+
+/**
+ * cr_read_buf_type - read a whole record of expected type (unknown size)
+ * @ctx: checkpoint context
+ * @buf: record buffer
+ * @n: availabe buffer size (output: actual record size)
+ * @type: expected record type
+ *
+ * Returns objref of the parent object
+ */
+int cr_read_buf_type(struct cr_ctx *ctx, void *buf, int *len, int type)
+{
+	struct cr_hdr h;
+	int ret;
+
+	ret = cr_read_obj(ctx, &h, buf, *len);
+	if (ret < 0)
+		return ret;
+
+	if (h.type != type)
+		return -EINVAL;
+
+	*len = h.len;
+	return h.parent;
+}
+
+/**
+ * cr_read_buffer - read a buffer
+ * @ctx: checkpoint context
+ * @buf: buffer
+ * @len: buffer size (output actual record size)
+ */
+int cr_read_buffer(struct cr_ctx *ctx, void *buf, int *len)
+{
+	return cr_read_buf_type(ctx, buf, len, CR_HDR_BUFFER);
+}
+
+/**
+ * cr_read_string - read a string
+ * @ctx: checkpoint context
+ * @str: string buffer
+ * @len: string length
+ */
+int cr_read_string(struct cr_ctx *ctx, char *str, int len)
+{
+	int ret;
+
+	ret = cr_read_buf_type(ctx, str, &len, CR_HDR_STRING);
+	if (ret < 0)
+		return ret;
+
+	if (len > 0)
+		str[len - 1] = '\0';	/* always play it safe */
+
+	return ret;
+}
+
+/* read the checkpoint header */
+static int cr_read_head(struct cr_ctx *ctx)
+{
+	struct cr_hdr_head *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	int parent, ret = -EINVAL;
+
+	parent = cr_read_obj_type(ctx, hh, sizeof(*hh), CR_HDR_HEAD);
+	if (parent < 0) {
+		ret = parent;
+		goto out;
+	} else if (parent != 0)
+		goto out;
+
+	if (hh->magic != CHECKPOINT_MAGIC_HEAD || hh->rev != CR_VERSION ||
+	    hh->major != ((LINUX_VERSION_CODE >> 16) & 0xff) ||
+	    hh->minor != ((LINUX_VERSION_CODE >> 8) & 0xff) ||
+	    hh->patch != ((LINUX_VERSION_CODE) & 0xff))
+		goto out;
+
+	if (hh->flags & ~CR_CTX_CKPT)
+		goto out;
+
+	ctx->oflags = hh->flags;
+
+	/* FIX: verify compatibility of release, version and machine */
+
+	ret = 0;
+ out:
+	cr_hbuf_put(ctx, sizeof(*hh));
+	return ret;
+}
+
+/* read the checkpoint trailer */
+static int cr_read_tail(struct cr_ctx *ctx)
+{
+	struct cr_hdr_tail *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	int parent, ret = -EINVAL;
+
+	parent = cr_read_obj_type(ctx, hh, sizeof(*hh), CR_HDR_TAIL);
+	if (parent < 0) {
+		ret = parent;
+		goto out;
+	} else if (parent != 0)
+		goto out;
+
+	if (hh->magic != CHECKPOINT_MAGIC_TAIL)
+		goto out;
+
+	ret = 0;
+ out:
+	cr_hbuf_put(ctx, sizeof(*hh));
+	return ret;
+}
+
+/* read the task_struct into the current task */
+static int cr_read_task_struct(struct cr_ctx *ctx)
+{
+	struct cr_hdr_task *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	struct task_struct *t = current;
+	char *buf;
+	int parent, ret = -EINVAL;
+
+	parent = cr_read_obj_type(ctx, hh, sizeof(*hh), CR_HDR_TASK);
+	if (parent < 0) {
+		ret = parent;
+		goto out;
+	} else if (parent != 0)
+		goto out;
+
+	/* upper limit for task_comm_len to prevent DoS */
+	if (hh->task_comm_len < 0 || hh->task_comm_len > PAGE_SIZE)
+		goto out;
+
+	buf = kmalloc(hh->task_comm_len, GFP_KERNEL);
+	if (!buf)
+		goto out;
+	ret = cr_read_string(ctx, buf, hh->task_comm_len);
+	if (!ret) {
+		/* if t->comm is too long, silently truncate */
+		memset(t->comm, 0, TASK_COMM_LEN);
+		memcpy(t->comm, buf, min(hh->task_comm_len, TASK_COMM_LEN));
+	}
+	kfree(buf);
+
+	/* FIXME: restore remaining relevant task_struct fields */
+ out:
+	cr_hbuf_put(ctx, sizeof(*hh));
+	return ret;
+}
+
+/* read the entire state of the current task */
+static int cr_read_task(struct cr_ctx *ctx)
+{
+	int ret;
+
+	ret = cr_read_task_struct(ctx);
+	pr_debug("ret %d\n", ret);
+
+	return ret;
+}
+
+int do_restart(struct cr_ctx *ctx, pid_t pid)
+{
+	int ret;
+
+	ret = cr_read_head(ctx);
+	if (ret < 0)
+		goto out;
+	ret = cr_read_task(ctx);
+	if (ret < 0)
+		goto out;
+	ret = cr_read_tail(ctx);
+	if (ret < 0)
+		goto out;
+
+	/* on success, adjust the return value if needed [TODO] */
+ out:
+	return ret;
+}
diff --git a/checkpoint/sys.c b/checkpoint/sys.c
index 375129c..76e2553 100644
--- a/checkpoint/sys.c
+++ b/checkpoint/sys.c
@@ -10,6 +10,180 @@
 
 #include <linux/sched.h>
 #include <linux/kernel.h>
+#include <linux/fs.h>
+#include <linux/file.h>
+#include <linux/uaccess.h>
+#include <linux/capability.h>
+#include <linux/checkpoint.h>
+
+/*
+ * Helpers to write(read) from(to) kernel space to(from) the checkpoint
+ * image file descriptor (similar to how a core-dump is performed).
+ *
+ *   cr_kwrite() - write a kernel-space buffer to the checkpoint image
+ *   cr_kread() - read from the checkpoint image to a kernel-space buffer
+ */
+
+static inline int _cr_kwrite(struct file *file, void *addr, int count)
+{
+	void __user *uaddr = (__force void __user *) addr;
+	ssize_t nwrite;
+	int nleft;
+
+	for (nleft = count; nleft; nleft -= nwrite) {
+		loff_t pos = file_pos_read(file);
+		nwrite = vfs_write(file, uaddr, nleft, &pos);
+		file_pos_write(file, pos);
+		if (nwrite < 0) {
+			if (nwrite == -EAGAIN)
+				nwrite = 0;
+			else
+				return nwrite;
+		}
+		uaddr += nwrite;
+	}
+	return 0;
+}
+
+int cr_kwrite(struct cr_ctx *ctx, void *addr, int count)
+{
+	mm_segment_t fs;
+	int ret;
+
+	fs = get_fs();
+	set_fs(KERNEL_DS);
+	ret = _cr_kwrite(ctx->file, addr, count);
+	set_fs(fs);
+
+	ctx->total += count;
+	return ret;
+}
+
+static inline int _cr_kread(struct file *file, void *addr, int count)
+{
+	void __user *uaddr = (__force void __user *) addr;
+	ssize_t nread;
+	int nleft;
+
+	for (nleft = count; nleft; nleft -= nread) {
+		loff_t pos = file_pos_read(file);
+		nread = vfs_read(file, uaddr, nleft, &pos);
+		file_pos_write(file, pos);
+		if (nread <= 0) {
+			if (nread == -EAGAIN) {
+				nread = 0;
+				continue;
+			} else if (nread == 0)
+				nread = -EPIPE;		/* unexecpted EOF */
+			return nread;
+		}
+		uaddr += nread;
+	}
+	return 0;
+}
+
+int cr_kread(struct cr_ctx *ctx, void *addr, int count)
+{
+	mm_segment_t fs;
+	int ret;
+
+	fs = get_fs();
+	set_fs(KERNEL_DS);
+	ret = _cr_kread(ctx->file , addr, count);
+	set_fs(fs);
+
+	ctx->total += count;
+	return ret;
+}
+
+/*
+ * During checkpoint and restart the code writes outs/reads in data
+ * to/from the checkpoint image from/to a temporary buffer (ctx->hbuf).
+ * Because operations can be nested, use cr_hbuf_get() to reserve space
+ * in the buffer, then cr_hbuf_put() when you no longer need that space.
+ */
+
+/*
+ * ctx->hbuf is used to hold headers and data of known (or bound),
+ * static sizes. In some cases, multiple headers may be allocated in
+ * a nested manner. The size should accommodate all headers, nested
+ * or not, on all archs.
+ */
+#define CR_HBUF_TOTAL  (8 * 4096)
+
+/**
+ * cr_hbuf_get - reserve space on the hbuf
+ * @ctx: checkpoint context
+ * @n: number of bytes to reserve
+ *
+ * Returns pointer to reserved space
+ */
+void *cr_hbuf_get(struct cr_ctx *ctx, int n)
+{
+	void *ptr;
+
+	/*
+	 * Since requests depend on logic and static header sizes (not on
+	 * user data), space should always suffice, unless someone either
+	 * made a structure bigger or call path deeper than expected.
+	 */
+	BUG_ON(ctx->hpos + n > CR_HBUF_TOTAL);
+	ptr = ctx->hbuf + ctx->hpos;
+	ctx->hpos += n;
+	return ptr;
+}
+
+/**
+ * cr_hbuf_put - unreserve space on the hbuf
+ * @ctx: checkpoint context
+ * @n: number of bytes to reserve
+ */
+void cr_hbuf_put(struct cr_ctx *ctx, int n)
+{
+	BUG_ON(ctx->hpos < n);
+	ctx->hpos -= n;
+}
+
+/*
+ * helpers to manage C/R contexts: allocated for each checkpoint and/or
+ * restart operation, and persists until the operation is completed.
+ */
+
+static void cr_ctx_free(struct cr_ctx *ctx)
+{
+	if (ctx->file)
+		fput(ctx->file);
+	kfree(ctx->hbuf);
+	kfree(ctx);
+}
+
+static struct cr_ctx *cr_ctx_alloc(int fd, unsigned long flags)
+{
+	struct cr_ctx *ctx;
+	int err;
+
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return ERR_PTR(-ENOMEM);
+
+	ctx->flags = flags;
+
+	err = -EBADF;
+	ctx->file = fget(fd);
+	if (!ctx->file)
+		goto err;
+
+	err = -ENOMEM;
+	ctx->hbuf = kmalloc(CR_HBUF_TOTAL, GFP_KERNEL);
+	if (!ctx->hbuf)
+		goto err;
+
+	return ctx;
+
+ err:
+	cr_ctx_free(ctx);
+	return ERR_PTR(err);
+}
 
 /**
  * sys_checkpoint - checkpoint a container
@@ -22,9 +196,26 @@
  */
 asmlinkage long sys_checkpoint(pid_t pid, int fd, unsigned long flags)
 {
-	pr_debug("sys_checkpoint not implemented yet\n");
-	return -ENOSYS;
+	struct cr_ctx *ctx;
+	int ret;
+
+	/* no flags for now */
+	if (flags)
+		return -EINVAL;
+
+	ctx = cr_ctx_alloc(fd, flags | CR_CTX_CKPT);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	ret = do_checkpoint(ctx, pid);
+
+	if (!ret)
+		ret = ctx->crid;
+
+	cr_ctx_free(ctx);
+	return ret;
 }
+
 /**
  * sys_restart - restart a container
  * @crid: checkpoint image identifier
@@ -36,6 +227,23 @@ asmlinkage long sys_checkpoint(pid_t pid, int fd, unsigned long flags)
  */
 asmlinkage long sys_restart(int crid, int fd, unsigned long flags)
 {
-	pr_debug("sys_restart not implemented yet\n");
-	return -ENOSYS;
+	struct cr_ctx *ctx;
+	pid_t pid;
+	int ret;
+
+	/* no flags for now */
+	if (flags)
+		return -EINVAL;
+
+	ctx = cr_ctx_alloc(fd, flags | CR_CTX_RSTR);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	/* FIXME: for now, we use 'crid' as a pid */
+	pid = (pid_t) crid;
+
+	ret = do_restart(ctx, pid);
+
+	cr_ctx_free(ctx);
+	return ret;
 }
diff --git a/include/linux/checkpoint.h b/include/linux/checkpoint.h
new file mode 100644
index 0000000..65a2cbf
--- /dev/null
+++ b/include/linux/checkpoint.h
@@ -0,0 +1,61 @@
+#ifndef _CHECKPOINT_CKPT_H_
+#define _CHECKPOINT_CKPT_H_
+/*
+ *  Generic container checkpoint-restart
+ *
+ *  Copyright (C) 2008 Oren Laadan
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#define CR_VERSION  1
+
+struct cr_ctx {
+	int crid;		/* unique checkpoint id */
+
+	pid_t root_pid;		/* container identifier */
+
+	unsigned long flags;
+	unsigned long oflags;	/* restart: old flags */
+
+	struct file *file;
+	int total;		/* total read/written */
+
+	void *hbuf;		/* temporary buffer for headers */
+	int hpos;		/* position in headers buffer */
+};
+
+/* cr_ctx: flags */
+#define CR_CTX_CKPT	0x1
+#define CR_CTX_RSTR	0x2
+
+extern int cr_kwrite(struct cr_ctx *ctx, void *buf, int count);
+extern int cr_kread(struct cr_ctx *ctx, void *buf, int count);
+
+extern void *cr_hbuf_get(struct cr_ctx *ctx, int n);
+extern void cr_hbuf_put(struct cr_ctx *ctx, int n);
+
+struct cr_hdr;
+
+extern int cr_write_obj(struct cr_ctx *ctx, struct cr_hdr *h, void *buf);
+extern int cr_write_buffer(struct cr_ctx *ctx, void *buf, int len);
+extern int cr_write_string(struct cr_ctx *ctx, char *str, int len);
+
+extern int cr_read_obj(struct cr_ctx *ctx, struct cr_hdr *h, void *buf, int n);
+extern int cr_read_obj_type(struct cr_ctx *ctx, void *buf, int len, int type);
+extern int cr_read_buf_type(struct cr_ctx *ctx, void *buf, int *len, int type);
+extern int cr_read_buffer(struct cr_ctx *ctx, void *buf, int *len);
+extern int cr_read_string(struct cr_ctx *ctx, char *str, int len);
+
+extern int do_checkpoint(struct cr_ctx *ctx, pid_t pid);
+extern int do_restart(struct cr_ctx *ctx, pid_t pid);
+
+#ifdef pr_fmt
+#undef pr_fmt
+#endif
+
+#define pr_fmt(fmt) "[%d:c/r:%s] " fmt, task_pid_vnr(current), __func__
+
+#endif /* _CHECKPOINT_CKPT_H_ */
diff --git a/include/linux/checkpoint_hdr.h b/include/linux/checkpoint_hdr.h
new file mode 100644
index 0000000..fcc0125
--- /dev/null
+++ b/include/linux/checkpoint_hdr.h
@@ -0,0 +1,85 @@
+#ifndef _CHECKPOINT_CKPT_HDR_H_
+#define _CHECKPOINT_CKPT_HDR_H_
+/*
+ *  Generic container checkpoint-restart
+ *
+ *  Copyright (C) 2008 Oren Laadan
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#include <linux/types.h>
+#include <linux/utsname.h>
+
+/*
+ * To maintain compatibility between 32-bit and 64-bit architecture flavors,
+ * keep data 64-bit aligned: use padding for structure members, and use
+ * __attribute__ ((aligned (8))) for the entire structure.
+ *
+ * Quoting Arnd Bergmann:
+ *   "This structure has an odd multiple of 32-bit members, which means
+ *   that if you put it into a larger structure that also contains 64-bit
+ *   members, the larger structure may get different alignment on x86-32
+ *   and x86-64, which you might want to avoid. I can't tell if this is
+ *   an actual problem here. ... In this case, I'm pretty sure that
+ *   sizeof(cr_hdr_task) on x86-32 is different from x86-64, since it
+ *   will be 32-bit aligned on x86-32."
+ */
+
+/* records: generic header */
+
+struct cr_hdr {
+	__s16 type;
+	__s16 len;
+	__u32 parent;
+};
+
+/* header types */
+enum {
+	CR_HDR_HEAD = 1,
+	CR_HDR_BUFFER,
+	CR_HDR_STRING,
+
+	CR_HDR_TASK = 101,
+	CR_HDR_THREAD,
+	CR_HDR_CPU,
+
+	CR_HDR_MM = 201,
+	CR_HDR_VMA,
+	CR_HDR_MM_CONTEXT,
+
+	CR_HDR_TAIL = 5001
+};
+
+struct cr_hdr_head {
+	__u64 magic;
+
+	__u16 major;
+	__u16 minor;
+	__u16 patch;
+	__u16 rev;
+
+	__u64 time;	/* when checkpoint taken */
+	__u64 flags;	/* checkpoint options */
+
+	char release[__NEW_UTS_LEN];
+	char version[__NEW_UTS_LEN];
+	char machine[__NEW_UTS_LEN];
+} __attribute__((aligned(8)));
+
+struct cr_hdr_tail {
+	__u64 magic;
+} __attribute__((aligned(8)));
+
+struct cr_hdr_task {
+	__u32 state;
+	__u32 exit_state;
+	__u32 exit_code;
+	__u32 exit_signal;
+
+	__s32 task_comm_len;
+} __attribute__((aligned(8)));
+
+#endif /* _CHECKPOINT_CKPT_HDR_H_ */
diff --git a/include/linux/magic.h b/include/linux/magic.h
index f7f3fdd..5939bbe 100644
--- a/include/linux/magic.h
+++ b/include/linux/magic.h
@@ -46,4 +46,7 @@
 #define FUTEXFS_SUPER_MAGIC	0xBAD1DEA
 #define INOTIFYFS_SUPER_MAGIC	0x2BAD1DEA
 
+#define CHECKPOINT_MAGIC_HEAD  0x00feed0cc0a2d200LL
+#define CHECKPOINT_MAGIC_TAIL  0x002d2a0cc0deef00LL
+
 #endif /* __LINUX_MAGIC_H__ */
-- 
1.5.4.3


WARNING: multiple messages have this Message-ID (diff)
From: Oren Laadan <orenl@cs.columbia.edu>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@osdl.org>,
	containers@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-api@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>,
	Serge Hallyn <serue@us.ibm.com>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@elte.hu>, "H. Peter Anvin" <hpa@zytor.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Oren Laadan <orenl@cs.columbia.edu>
Subject: [RFC v13][PATCH 04/14] General infrastructure for checkpoint restart
Date: Tue, 27 Jan 2009 12:08:02 -0500	[thread overview]
Message-ID: <1233076092-8660-5-git-send-email-orenl@cs.columbia.edu> (raw)
In-Reply-To: <1233076092-8660-1-git-send-email-orenl@cs.columbia.edu>

Add those interfaces, as well as helpers needed to easily manage the
file format. The code is roughly broken out as follows:

checkpoint/sys.c - user/kernel data transfer, as well as setup of the
  CR context (a per-checkpoint data structure for housekeeping)
checkpoint/checkpoint.c - output wrappers and basic checkpoint handling
checkpoint/restart.c - input wrappers and basic restart handling

For now, we can only checkpoint the 'current' task ("self" checkpoint),
and the 'pid' argument to to the syscall is ignored.

Patches to add the per-architecture support as well as the actual
work to do the memory checkpoint follow in subsequent patches.

Changelog[v12]:
  - cr_kwrite/cr_kread() again use vfs_read(), vfs_write() (safer)
  - Split cr_write/cr_read() to two parts: _cr_write/read() helper
  - Befriend with sparse : explicit conversion to 'void __user *'
  - Redfine 'pr_fmt' instead of using special cr_debug()

Changelog[v10]:
  - add cr_write_buffer(), cr_read_buffer() and cr_read_buf_type()
  - force end-of-string in cr_read_string() (fix possible DoS)

Changelog[v9]:
  - cr_kwrite/cr_kread() use file->f_op->write() directly
  - Drop cr_uwrite/cr_uread() since they aren't used anywhere

Changelog[v6]:
  - Balance all calls to cr_hbuf_get() with matching cr_hbuf_put()
    (although it's not really needed)

Changelog[v5]:
  - Rename headers files s/ckpt/checkpoint/

Changelog[v2]:
  - Added utsname->{release,version,machine} to checkpoint header
  - Pad header structures to 64 bits to ensure compatibility

Signed-off-by: Oren Laadan <orenl@cs.columbia.edu>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
---
 Makefile                       |    2 +-
 checkpoint/Makefile            |    2 +-
 checkpoint/checkpoint.c        |  188 +++++++++++++++++++++++++++++++
 checkpoint/restart.c           |  239 ++++++++++++++++++++++++++++++++++++++++
 checkpoint/sys.c               |  216 +++++++++++++++++++++++++++++++++++-
 include/linux/checkpoint.h     |   61 ++++++++++
 include/linux/checkpoint_hdr.h |   85 ++++++++++++++
 include/linux/magic.h          |    3 +
 8 files changed, 790 insertions(+), 6 deletions(-)
 create mode 100644 checkpoint/checkpoint.c
 create mode 100644 checkpoint/restart.c
 create mode 100644 include/linux/checkpoint.h
 create mode 100644 include/linux/checkpoint_hdr.h

diff --git a/Makefile b/Makefile
index 71e98e9..c1744a1 100644
--- a/Makefile
+++ b/Makefile
@@ -619,7 +619,7 @@ export mod_strip_cmd
 
 
 ifeq ($(KBUILD_EXTMOD),)
-core-y		+= kernel/ mm/ fs/ ipc/ security/ crypto/ block/
+core-y		+= kernel/ mm/ fs/ ipc/ security/ crypto/ block/ checkpoint/
 
 vmlinux-dirs	:= $(patsubst %/,%,$(filter %/, $(init-y) $(init-m) \
 		     $(core-y) $(core-m) $(drivers-y) $(drivers-m) \
diff --git a/checkpoint/Makefile b/checkpoint/Makefile
index 07d018b..d2df68c 100644
--- a/checkpoint/Makefile
+++ b/checkpoint/Makefile
@@ -2,4 +2,4 @@
 # Makefile for linux checkpoint/restart.
 #
 
-obj-$(CONFIG_CHECKPOINT_RESTART) += sys.o
+obj-$(CONFIG_CHECKPOINT_RESTART) += sys.o checkpoint.o restart.o
diff --git a/checkpoint/checkpoint.c b/checkpoint/checkpoint.c
new file mode 100644
index 0000000..35bf99b
--- /dev/null
+++ b/checkpoint/checkpoint.c
@@ -0,0 +1,188 @@
+/*
+ *  Checkpoint logic and helpers
+ *
+ *  Copyright (C) 2008 Oren Laadan
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#include <linux/version.h>
+#include <linux/sched.h>
+#include <linux/time.h>
+#include <linux/fs.h>
+#include <linux/file.h>
+#include <linux/dcache.h>
+#include <linux/mount.h>
+#include <linux/utsname.h>
+#include <linux/magic.h>
+#include <linux/checkpoint.h>
+#include <linux/checkpoint_hdr.h>
+
+/* unique checkpoint identifier (FIXME: should be per-container ?) */
+static atomic_t cr_ctx_count = ATOMIC_INIT(0);
+
+/**
+ * cr_write_obj - write a record described by a cr_hdr
+ * @ctx: checkpoint context
+ * @h: record descriptor
+ * @buf: record buffer
+ */
+int cr_write_obj(struct cr_ctx *ctx, struct cr_hdr *h, void *buf)
+{
+	int ret;
+
+	ret = cr_kwrite(ctx, h, sizeof(*h));
+	if (ret < 0)
+		return ret;
+	return cr_kwrite(ctx, buf, h->len);
+}
+
+/**
+ * cr_write_buffer - write a buffer
+ * @ctx: checkpoint context
+ * @str: buffer pointer
+ * @len: buffer size
+ */
+int cr_write_buffer(struct cr_ctx *ctx, void *buf, int len)
+{
+	struct cr_hdr h;
+
+	h.type = CR_HDR_BUFFER;
+	h.len = len;
+	h.parent = 0;
+
+	return cr_write_obj(ctx, &h, buf);
+}
+
+/**
+ * cr_write_string - write a string
+ * @ctx: checkpoint context
+ * @str: string pointer
+ * @len: string length
+ */
+int cr_write_string(struct cr_ctx *ctx, char *str, int len)
+{
+	struct cr_hdr h;
+
+	h.type = CR_HDR_STRING;
+	h.len = len;
+	h.parent = 0;
+
+	return cr_write_obj(ctx, &h, str);
+}
+
+/* write the checkpoint header */
+static int cr_write_head(struct cr_ctx *ctx)
+{
+	struct cr_hdr h;
+	struct cr_hdr_head *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	struct new_utsname *uts;
+	struct timeval ktv;
+	int ret;
+
+	h.type = CR_HDR_HEAD;
+	h.len = sizeof(*hh);
+	h.parent = 0;
+
+	do_gettimeofday(&ktv);
+
+	hh->magic = CHECKPOINT_MAGIC_HEAD;
+	hh->major = (LINUX_VERSION_CODE >> 16) & 0xff;
+	hh->minor = (LINUX_VERSION_CODE >> 8) & 0xff;
+	hh->patch = (LINUX_VERSION_CODE) & 0xff;
+
+	hh->rev = CR_VERSION;
+
+	hh->flags = ctx->flags;
+	hh->time = ktv.tv_sec;
+
+	uts = utsname();
+	memcpy(hh->release, uts->release, __NEW_UTS_LEN);
+	memcpy(hh->version, uts->version, __NEW_UTS_LEN);
+	memcpy(hh->machine, uts->machine, __NEW_UTS_LEN);
+
+	ret = cr_write_obj(ctx, &h, hh);
+	cr_hbuf_put(ctx, sizeof(*hh));
+	return ret;
+}
+
+/* write the checkpoint trailer */
+static int cr_write_tail(struct cr_ctx *ctx)
+{
+	struct cr_hdr h;
+	struct cr_hdr_tail *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	int ret;
+
+	h.type = CR_HDR_TAIL;
+	h.len = sizeof(*hh);
+	h.parent = 0;
+
+	hh->magic = CHECKPOINT_MAGIC_TAIL;
+
+	ret = cr_write_obj(ctx, &h, hh);
+	cr_hbuf_put(ctx, sizeof(*hh));
+	return ret;
+}
+
+/* dump the task_struct of a given task */
+static int cr_write_task_struct(struct cr_ctx *ctx, struct task_struct *t)
+{
+	struct cr_hdr h;
+	struct cr_hdr_task *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	int ret;
+
+	h.type = CR_HDR_TASK;
+	h.len = sizeof(*hh);
+	h.parent = 0;
+
+	hh->state = t->state;
+	hh->exit_state = t->exit_state;
+	hh->exit_code = t->exit_code;
+	hh->exit_signal = t->exit_signal;
+
+	hh->task_comm_len = TASK_COMM_LEN;
+
+	/* FIXME: save remaining relevant task_struct fields */
+
+	ret = cr_write_obj(ctx, &h, hh);
+	cr_hbuf_put(ctx, sizeof(*hh));
+	if (ret < 0)
+		return ret;
+
+	return cr_write_string(ctx, t->comm, TASK_COMM_LEN);
+}
+
+/* dump the entire state of a given task */
+static int cr_write_task(struct cr_ctx *ctx, struct task_struct *t)
+{
+	int ret;
+
+	ret = cr_write_task_struct(ctx, t);
+	pr_debug("ret %d\n", ret);
+
+	return ret;
+}
+
+int do_checkpoint(struct cr_ctx *ctx, pid_t pid)
+{
+	int ret;
+
+	ret = cr_write_head(ctx);
+	if (ret < 0)
+		goto out;
+	ret = cr_write_task(ctx, current);
+	if (ret < 0)
+		goto out;
+	ret = cr_write_tail(ctx);
+	if (ret < 0)
+		goto out;
+
+	ctx->crid = atomic_inc_return(&cr_ctx_count);
+
+	/* on success, return (unique) checkpoint identifier */
+	ret = ctx->crid;
+ out:
+	return ret;
+}
diff --git a/checkpoint/restart.c b/checkpoint/restart.c
new file mode 100644
index 0000000..4741f4a
--- /dev/null
+++ b/checkpoint/restart.c
@@ -0,0 +1,239 @@
+/*
+ *  Restart logic and helpers
+ *
+ *  Copyright (C) 2008 Oren Laadan
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#include <linux/version.h>
+#include <linux/sched.h>
+#include <linux/file.h>
+#include <linux/magic.h>
+#include <linux/checkpoint.h>
+#include <linux/checkpoint_hdr.h>
+
+/**
+ * cr_read_obj - read a whole record (cr_hdr followed by payload)
+ * @ctx: checkpoint context
+ * @h: record descriptor
+ * @buf: record buffer
+ * @len: available buffer size
+ */
+int cr_read_obj(struct cr_ctx *ctx, struct cr_hdr *h, void *buf, int len)
+{
+	int ret;
+
+	ret = cr_kread(ctx, h, sizeof(*h));
+	if (ret < 0)
+		return ret;
+
+	pr_debug("type %d len %d parent %d\n", h->type, h->len, h->parent);
+
+	if (h->len < 0 || h->len > len)
+		return -EINVAL;
+
+	return cr_kread(ctx, buf, h->len);
+}
+
+/**
+ * cr_read_obj_type - read a whole record of expected type and size
+ * @ctx: checkpoint context
+ * @buf: record buffer
+ * @n: expected record size
+ * @type: expected record type
+ *
+ * Returns objref of the parent object
+ */
+int cr_read_obj_type(struct cr_ctx *ctx, void *buf, int len, int type)
+{
+	struct cr_hdr h;
+	int ret;
+
+	ret = cr_read_obj(ctx, &h, buf, len);
+	if (ret < 0)
+		return ret;
+
+	if (h.len != len || h.type != type)
+		return -EINVAL;
+
+	return h.parent;
+}
+
+/**
+ * cr_read_buf_type - read a whole record of expected type (unknown size)
+ * @ctx: checkpoint context
+ * @buf: record buffer
+ * @n: availabe buffer size (output: actual record size)
+ * @type: expected record type
+ *
+ * Returns objref of the parent object
+ */
+int cr_read_buf_type(struct cr_ctx *ctx, void *buf, int *len, int type)
+{
+	struct cr_hdr h;
+	int ret;
+
+	ret = cr_read_obj(ctx, &h, buf, *len);
+	if (ret < 0)
+		return ret;
+
+	if (h.type != type)
+		return -EINVAL;
+
+	*len = h.len;
+	return h.parent;
+}
+
+/**
+ * cr_read_buffer - read a buffer
+ * @ctx: checkpoint context
+ * @buf: buffer
+ * @len: buffer size (output actual record size)
+ */
+int cr_read_buffer(struct cr_ctx *ctx, void *buf, int *len)
+{
+	return cr_read_buf_type(ctx, buf, len, CR_HDR_BUFFER);
+}
+
+/**
+ * cr_read_string - read a string
+ * @ctx: checkpoint context
+ * @str: string buffer
+ * @len: string length
+ */
+int cr_read_string(struct cr_ctx *ctx, char *str, int len)
+{
+	int ret;
+
+	ret = cr_read_buf_type(ctx, str, &len, CR_HDR_STRING);
+	if (ret < 0)
+		return ret;
+
+	if (len > 0)
+		str[len - 1] = '\0';	/* always play it safe */
+
+	return ret;
+}
+
+/* read the checkpoint header */
+static int cr_read_head(struct cr_ctx *ctx)
+{
+	struct cr_hdr_head *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	int parent, ret = -EINVAL;
+
+	parent = cr_read_obj_type(ctx, hh, sizeof(*hh), CR_HDR_HEAD);
+	if (parent < 0) {
+		ret = parent;
+		goto out;
+	} else if (parent != 0)
+		goto out;
+
+	if (hh->magic != CHECKPOINT_MAGIC_HEAD || hh->rev != CR_VERSION ||
+	    hh->major != ((LINUX_VERSION_CODE >> 16) & 0xff) ||
+	    hh->minor != ((LINUX_VERSION_CODE >> 8) & 0xff) ||
+	    hh->patch != ((LINUX_VERSION_CODE) & 0xff))
+		goto out;
+
+	if (hh->flags & ~CR_CTX_CKPT)
+		goto out;
+
+	ctx->oflags = hh->flags;
+
+	/* FIX: verify compatibility of release, version and machine */
+
+	ret = 0;
+ out:
+	cr_hbuf_put(ctx, sizeof(*hh));
+	return ret;
+}
+
+/* read the checkpoint trailer */
+static int cr_read_tail(struct cr_ctx *ctx)
+{
+	struct cr_hdr_tail *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	int parent, ret = -EINVAL;
+
+	parent = cr_read_obj_type(ctx, hh, sizeof(*hh), CR_HDR_TAIL);
+	if (parent < 0) {
+		ret = parent;
+		goto out;
+	} else if (parent != 0)
+		goto out;
+
+	if (hh->magic != CHECKPOINT_MAGIC_TAIL)
+		goto out;
+
+	ret = 0;
+ out:
+	cr_hbuf_put(ctx, sizeof(*hh));
+	return ret;
+}
+
+/* read the task_struct into the current task */
+static int cr_read_task_struct(struct cr_ctx *ctx)
+{
+	struct cr_hdr_task *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	struct task_struct *t = current;
+	char *buf;
+	int parent, ret = -EINVAL;
+
+	parent = cr_read_obj_type(ctx, hh, sizeof(*hh), CR_HDR_TASK);
+	if (parent < 0) {
+		ret = parent;
+		goto out;
+	} else if (parent != 0)
+		goto out;
+
+	/* upper limit for task_comm_len to prevent DoS */
+	if (hh->task_comm_len < 0 || hh->task_comm_len > PAGE_SIZE)
+		goto out;
+
+	buf = kmalloc(hh->task_comm_len, GFP_KERNEL);
+	if (!buf)
+		goto out;
+	ret = cr_read_string(ctx, buf, hh->task_comm_len);
+	if (!ret) {
+		/* if t->comm is too long, silently truncate */
+		memset(t->comm, 0, TASK_COMM_LEN);
+		memcpy(t->comm, buf, min(hh->task_comm_len, TASK_COMM_LEN));
+	}
+	kfree(buf);
+
+	/* FIXME: restore remaining relevant task_struct fields */
+ out:
+	cr_hbuf_put(ctx, sizeof(*hh));
+	return ret;
+}
+
+/* read the entire state of the current task */
+static int cr_read_task(struct cr_ctx *ctx)
+{
+	int ret;
+
+	ret = cr_read_task_struct(ctx);
+	pr_debug("ret %d\n", ret);
+
+	return ret;
+}
+
+int do_restart(struct cr_ctx *ctx, pid_t pid)
+{
+	int ret;
+
+	ret = cr_read_head(ctx);
+	if (ret < 0)
+		goto out;
+	ret = cr_read_task(ctx);
+	if (ret < 0)
+		goto out;
+	ret = cr_read_tail(ctx);
+	if (ret < 0)
+		goto out;
+
+	/* on success, adjust the return value if needed [TODO] */
+ out:
+	return ret;
+}
diff --git a/checkpoint/sys.c b/checkpoint/sys.c
index 375129c..76e2553 100644
--- a/checkpoint/sys.c
+++ b/checkpoint/sys.c
@@ -10,6 +10,180 @@
 
 #include <linux/sched.h>
 #include <linux/kernel.h>
+#include <linux/fs.h>
+#include <linux/file.h>
+#include <linux/uaccess.h>
+#include <linux/capability.h>
+#include <linux/checkpoint.h>
+
+/*
+ * Helpers to write(read) from(to) kernel space to(from) the checkpoint
+ * image file descriptor (similar to how a core-dump is performed).
+ *
+ *   cr_kwrite() - write a kernel-space buffer to the checkpoint image
+ *   cr_kread() - read from the checkpoint image to a kernel-space buffer
+ */
+
+static inline int _cr_kwrite(struct file *file, void *addr, int count)
+{
+	void __user *uaddr = (__force void __user *) addr;
+	ssize_t nwrite;
+	int nleft;
+
+	for (nleft = count; nleft; nleft -= nwrite) {
+		loff_t pos = file_pos_read(file);
+		nwrite = vfs_write(file, uaddr, nleft, &pos);
+		file_pos_write(file, pos);
+		if (nwrite < 0) {
+			if (nwrite == -EAGAIN)
+				nwrite = 0;
+			else
+				return nwrite;
+		}
+		uaddr += nwrite;
+	}
+	return 0;
+}
+
+int cr_kwrite(struct cr_ctx *ctx, void *addr, int count)
+{
+	mm_segment_t fs;
+	int ret;
+
+	fs = get_fs();
+	set_fs(KERNEL_DS);
+	ret = _cr_kwrite(ctx->file, addr, count);
+	set_fs(fs);
+
+	ctx->total += count;
+	return ret;
+}
+
+static inline int _cr_kread(struct file *file, void *addr, int count)
+{
+	void __user *uaddr = (__force void __user *) addr;
+	ssize_t nread;
+	int nleft;
+
+	for (nleft = count; nleft; nleft -= nread) {
+		loff_t pos = file_pos_read(file);
+		nread = vfs_read(file, uaddr, nleft, &pos);
+		file_pos_write(file, pos);
+		if (nread <= 0) {
+			if (nread == -EAGAIN) {
+				nread = 0;
+				continue;
+			} else if (nread == 0)
+				nread = -EPIPE;		/* unexecpted EOF */
+			return nread;
+		}
+		uaddr += nread;
+	}
+	return 0;
+}
+
+int cr_kread(struct cr_ctx *ctx, void *addr, int count)
+{
+	mm_segment_t fs;
+	int ret;
+
+	fs = get_fs();
+	set_fs(KERNEL_DS);
+	ret = _cr_kread(ctx->file , addr, count);
+	set_fs(fs);
+
+	ctx->total += count;
+	return ret;
+}
+
+/*
+ * During checkpoint and restart the code writes outs/reads in data
+ * to/from the checkpoint image from/to a temporary buffer (ctx->hbuf).
+ * Because operations can be nested, use cr_hbuf_get() to reserve space
+ * in the buffer, then cr_hbuf_put() when you no longer need that space.
+ */
+
+/*
+ * ctx->hbuf is used to hold headers and data of known (or bound),
+ * static sizes. In some cases, multiple headers may be allocated in
+ * a nested manner. The size should accommodate all headers, nested
+ * or not, on all archs.
+ */
+#define CR_HBUF_TOTAL  (8 * 4096)
+
+/**
+ * cr_hbuf_get - reserve space on the hbuf
+ * @ctx: checkpoint context
+ * @n: number of bytes to reserve
+ *
+ * Returns pointer to reserved space
+ */
+void *cr_hbuf_get(struct cr_ctx *ctx, int n)
+{
+	void *ptr;
+
+	/*
+	 * Since requests depend on logic and static header sizes (not on
+	 * user data), space should always suffice, unless someone either
+	 * made a structure bigger or call path deeper than expected.
+	 */
+	BUG_ON(ctx->hpos + n > CR_HBUF_TOTAL);
+	ptr = ctx->hbuf + ctx->hpos;
+	ctx->hpos += n;
+	return ptr;
+}
+
+/**
+ * cr_hbuf_put - unreserve space on the hbuf
+ * @ctx: checkpoint context
+ * @n: number of bytes to reserve
+ */
+void cr_hbuf_put(struct cr_ctx *ctx, int n)
+{
+	BUG_ON(ctx->hpos < n);
+	ctx->hpos -= n;
+}
+
+/*
+ * helpers to manage C/R contexts: allocated for each checkpoint and/or
+ * restart operation, and persists until the operation is completed.
+ */
+
+static void cr_ctx_free(struct cr_ctx *ctx)
+{
+	if (ctx->file)
+		fput(ctx->file);
+	kfree(ctx->hbuf);
+	kfree(ctx);
+}
+
+static struct cr_ctx *cr_ctx_alloc(int fd, unsigned long flags)
+{
+	struct cr_ctx *ctx;
+	int err;
+
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return ERR_PTR(-ENOMEM);
+
+	ctx->flags = flags;
+
+	err = -EBADF;
+	ctx->file = fget(fd);
+	if (!ctx->file)
+		goto err;
+
+	err = -ENOMEM;
+	ctx->hbuf = kmalloc(CR_HBUF_TOTAL, GFP_KERNEL);
+	if (!ctx->hbuf)
+		goto err;
+
+	return ctx;
+
+ err:
+	cr_ctx_free(ctx);
+	return ERR_PTR(err);
+}
 
 /**
  * sys_checkpoint - checkpoint a container
@@ -22,9 +196,26 @@
  */
 asmlinkage long sys_checkpoint(pid_t pid, int fd, unsigned long flags)
 {
-	pr_debug("sys_checkpoint not implemented yet\n");
-	return -ENOSYS;
+	struct cr_ctx *ctx;
+	int ret;
+
+	/* no flags for now */
+	if (flags)
+		return -EINVAL;
+
+	ctx = cr_ctx_alloc(fd, flags | CR_CTX_CKPT);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	ret = do_checkpoint(ctx, pid);
+
+	if (!ret)
+		ret = ctx->crid;
+
+	cr_ctx_free(ctx);
+	return ret;
 }
+
 /**
  * sys_restart - restart a container
  * @crid: checkpoint image identifier
@@ -36,6 +227,23 @@ asmlinkage long sys_checkpoint(pid_t pid, int fd, unsigned long flags)
  */
 asmlinkage long sys_restart(int crid, int fd, unsigned long flags)
 {
-	pr_debug("sys_restart not implemented yet\n");
-	return -ENOSYS;
+	struct cr_ctx *ctx;
+	pid_t pid;
+	int ret;
+
+	/* no flags for now */
+	if (flags)
+		return -EINVAL;
+
+	ctx = cr_ctx_alloc(fd, flags | CR_CTX_RSTR);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	/* FIXME: for now, we use 'crid' as a pid */
+	pid = (pid_t) crid;
+
+	ret = do_restart(ctx, pid);
+
+	cr_ctx_free(ctx);
+	return ret;
 }
diff --git a/include/linux/checkpoint.h b/include/linux/checkpoint.h
new file mode 100644
index 0000000..65a2cbf
--- /dev/null
+++ b/include/linux/checkpoint.h
@@ -0,0 +1,61 @@
+#ifndef _CHECKPOINT_CKPT_H_
+#define _CHECKPOINT_CKPT_H_
+/*
+ *  Generic container checkpoint-restart
+ *
+ *  Copyright (C) 2008 Oren Laadan
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#define CR_VERSION  1
+
+struct cr_ctx {
+	int crid;		/* unique checkpoint id */
+
+	pid_t root_pid;		/* container identifier */
+
+	unsigned long flags;
+	unsigned long oflags;	/* restart: old flags */
+
+	struct file *file;
+	int total;		/* total read/written */
+
+	void *hbuf;		/* temporary buffer for headers */
+	int hpos;		/* position in headers buffer */
+};
+
+/* cr_ctx: flags */
+#define CR_CTX_CKPT	0x1
+#define CR_CTX_RSTR	0x2
+
+extern int cr_kwrite(struct cr_ctx *ctx, void *buf, int count);
+extern int cr_kread(struct cr_ctx *ctx, void *buf, int count);
+
+extern void *cr_hbuf_get(struct cr_ctx *ctx, int n);
+extern void cr_hbuf_put(struct cr_ctx *ctx, int n);
+
+struct cr_hdr;
+
+extern int cr_write_obj(struct cr_ctx *ctx, struct cr_hdr *h, void *buf);
+extern int cr_write_buffer(struct cr_ctx *ctx, void *buf, int len);
+extern int cr_write_string(struct cr_ctx *ctx, char *str, int len);
+
+extern int cr_read_obj(struct cr_ctx *ctx, struct cr_hdr *h, void *buf, int n);
+extern int cr_read_obj_type(struct cr_ctx *ctx, void *buf, int len, int type);
+extern int cr_read_buf_type(struct cr_ctx *ctx, void *buf, int *len, int type);
+extern int cr_read_buffer(struct cr_ctx *ctx, void *buf, int *len);
+extern int cr_read_string(struct cr_ctx *ctx, char *str, int len);
+
+extern int do_checkpoint(struct cr_ctx *ctx, pid_t pid);
+extern int do_restart(struct cr_ctx *ctx, pid_t pid);
+
+#ifdef pr_fmt
+#undef pr_fmt
+#endif
+
+#define pr_fmt(fmt) "[%d:c/r:%s] " fmt, task_pid_vnr(current), __func__
+
+#endif /* _CHECKPOINT_CKPT_H_ */
diff --git a/include/linux/checkpoint_hdr.h b/include/linux/checkpoint_hdr.h
new file mode 100644
index 0000000..fcc0125
--- /dev/null
+++ b/include/linux/checkpoint_hdr.h
@@ -0,0 +1,85 @@
+#ifndef _CHECKPOINT_CKPT_HDR_H_
+#define _CHECKPOINT_CKPT_HDR_H_
+/*
+ *  Generic container checkpoint-restart
+ *
+ *  Copyright (C) 2008 Oren Laadan
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#include <linux/types.h>
+#include <linux/utsname.h>
+
+/*
+ * To maintain compatibility between 32-bit and 64-bit architecture flavors,
+ * keep data 64-bit aligned: use padding for structure members, and use
+ * __attribute__ ((aligned (8))) for the entire structure.
+ *
+ * Quoting Arnd Bergmann:
+ *   "This structure has an odd multiple of 32-bit members, which means
+ *   that if you put it into a larger structure that also contains 64-bit
+ *   members, the larger structure may get different alignment on x86-32
+ *   and x86-64, which you might want to avoid. I can't tell if this is
+ *   an actual problem here. ... In this case, I'm pretty sure that
+ *   sizeof(cr_hdr_task) on x86-32 is different from x86-64, since it
+ *   will be 32-bit aligned on x86-32."
+ */
+
+/* records: generic header */
+
+struct cr_hdr {
+	__s16 type;
+	__s16 len;
+	__u32 parent;
+};
+
+/* header types */
+enum {
+	CR_HDR_HEAD = 1,
+	CR_HDR_BUFFER,
+	CR_HDR_STRING,
+
+	CR_HDR_TASK = 101,
+	CR_HDR_THREAD,
+	CR_HDR_CPU,
+
+	CR_HDR_MM = 201,
+	CR_HDR_VMA,
+	CR_HDR_MM_CONTEXT,
+
+	CR_HDR_TAIL = 5001
+};
+
+struct cr_hdr_head {
+	__u64 magic;
+
+	__u16 major;
+	__u16 minor;
+	__u16 patch;
+	__u16 rev;
+
+	__u64 time;	/* when checkpoint taken */
+	__u64 flags;	/* checkpoint options */
+
+	char release[__NEW_UTS_LEN];
+	char version[__NEW_UTS_LEN];
+	char machine[__NEW_UTS_LEN];
+} __attribute__((aligned(8)));
+
+struct cr_hdr_tail {
+	__u64 magic;
+} __attribute__((aligned(8)));
+
+struct cr_hdr_task {
+	__u32 state;
+	__u32 exit_state;
+	__u32 exit_code;
+	__u32 exit_signal;
+
+	__s32 task_comm_len;
+} __attribute__((aligned(8)));
+
+#endif /* _CHECKPOINT_CKPT_HDR_H_ */
diff --git a/include/linux/magic.h b/include/linux/magic.h
index f7f3fdd..5939bbe 100644
--- a/include/linux/magic.h
+++ b/include/linux/magic.h
@@ -46,4 +46,7 @@
 #define FUTEXFS_SUPER_MAGIC	0xBAD1DEA
 #define INOTIFYFS_SUPER_MAGIC	0x2BAD1DEA
 
+#define CHECKPOINT_MAGIC_HEAD  0x00feed0cc0a2d200LL
+#define CHECKPOINT_MAGIC_TAIL  0x002d2a0cc0deef00LL
+
 #endif /* __LINUX_MAGIC_H__ */
-- 
1.5.4.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2009-01-27 17:12 UTC|newest]

Thread overview: 394+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-01-27 17:07 [RFC v13][PATCH 00/14] Kernel based checkpoint/restart Oren Laadan
2009-01-27 17:07 ` Oren Laadan
2009-01-27 17:07 ` [RFC v13][PATCH 01/14] Create syscalls: sys_checkpoint, sys_restart Oren Laadan
2009-01-27 17:07   ` Oren Laadan
     [not found]   ` <1233076092-8660-2-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-01-27 17:20     ` Randy Dunlap
2009-01-27 17:20   ` Randy Dunlap
2009-01-27 17:20     ` Randy Dunlap
2009-01-27 17:08 ` [RFC v13][PATCH 02/14] Checkpoint/restart: initial documentation Oren Laadan
2009-01-27 17:08   ` Oren Laadan
2009-01-27 17:08 ` [RFC v13][PATCH 03/14] Make file_pos_read/write() public Oren Laadan
2009-01-27 17:08   ` Oren Laadan
2009-01-27 17:08 ` Oren Laadan [this message]
2009-01-27 17:08   ` [RFC v13][PATCH 04/14] General infrastructure for checkpoint restart Oren Laadan
2009-01-27 17:08 ` [RFC v13][PATCH 05/14] x86 support for checkpoint/restart Oren Laadan
2009-01-27 17:08   ` Oren Laadan
2009-02-24  7:47   ` Nathan Lynch
2009-02-24  7:47     ` Nathan Lynch
     [not found]     ` <20090224014739.1b82fc35-4v5LP+xe+1byhTdZtsIeww@public.gmane.org>
2009-02-24 16:06       ` Dave Hansen
2009-03-18  7:21       ` Oren Laadan
2009-02-24 16:06     ` Dave Hansen
2009-02-24 16:06       ` Dave Hansen
2009-02-24 16:06       ` Dave Hansen
2009-03-18  7:21     ` Oren Laadan
2009-03-18  7:21       ` Oren Laadan
     [not found]   ` <1233076092-8660-6-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-02-24  7:47     ` Nathan Lynch
2009-01-27 17:08 ` [RFC v13][PATCH 06/14] Dump memory address space Oren Laadan
2009-01-27 17:08   ` Oren Laadan
2009-01-27 17:08 ` [RFC v13][PATCH 07/14] Restore " Oren Laadan
2009-01-27 17:08   ` Oren Laadan
2009-01-27 17:08 ` [RFC v13][PATCH 08/14] Infrastructure for shared objects Oren Laadan
2009-01-27 17:08   ` Oren Laadan
2009-01-27 17:08 ` [RFC v13][PATCH 09/14] Dump open file descriptors Oren Laadan
2009-01-27 17:08   ` Oren Laadan
2009-01-27 17:08 ` [RFC v13][PATCH 10/14] Restore open file descriprtors Oren Laadan
2009-01-27 17:08   ` Oren Laadan
2009-01-27 17:08   ` Oren Laadan
2009-01-27 17:08 ` [RFC v13][PATCH 11/14] External checkpoint of a task other than ourself Oren Laadan
2009-01-27 17:08   ` Oren Laadan
2009-01-27 17:08 ` [RFC v13][PATCH 12/14] Track in-kernel when we expect checkpoint/restart to work Oren Laadan
2009-01-27 17:08   ` Oren Laadan
2009-01-27 17:08   ` Oren Laadan
     [not found] ` <1233076092-8660-1-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-01-27 17:07   ` [RFC v13][PATCH 01/14] Create syscalls: sys_checkpoint, sys_restart Oren Laadan
2009-01-27 17:08   ` [RFC v13][PATCH 02/14] Checkpoint/restart: initial documentation Oren Laadan
2009-01-27 17:08   ` [RFC v13][PATCH 03/14] Make file_pos_read/write() public Oren Laadan
2009-01-27 17:08   ` [RFC v13][PATCH 04/14] General infrastructure for checkpoint restart Oren Laadan
2009-01-27 17:08   ` [RFC v13][PATCH 05/14] x86 support for checkpoint/restart Oren Laadan
2009-01-27 17:08   ` [RFC v13][PATCH 06/14] Dump memory address space Oren Laadan
2009-01-27 17:08   ` [RFC v13][PATCH 07/14] Restore " Oren Laadan
2009-01-27 17:08   ` [RFC v13][PATCH 08/14] Infrastructure for shared objects Oren Laadan
2009-01-27 17:08   ` [RFC v13][PATCH 09/14] Dump open file descriptors Oren Laadan
2009-01-27 17:08   ` [RFC v13][PATCH 10/14] Restore open file descriprtors Oren Laadan
2009-01-27 17:08   ` [RFC v13][PATCH 11/14] External checkpoint of a task other than ourself Oren Laadan
2009-01-27 17:08   ` [RFC v13][PATCH 12/14] Track in-kernel when we expect checkpoint/restart to work Oren Laadan
2009-01-27 17:08   ` [RFC v13][PATCH 13/14] Checkpoint multiple processes Oren Laadan
2009-01-27 17:08   ` [RFC v13][PATCH 14/14] Restart " Oren Laadan
2009-02-10 17:05   ` [RFC v13][PATCH 00/14] Kernel based checkpoint/restart Dave Hansen
2009-01-27 17:08 ` [RFC v13][PATCH 13/14] Checkpoint multiple processes Oren Laadan
2009-01-27 17:08   ` Oren Laadan
2009-01-27 17:08 ` [RFC v13][PATCH 14/14] Restart " Oren Laadan
2009-01-27 17:08   ` Oren Laadan
2009-01-27 17:08   ` Oren Laadan
2009-02-10 17:05 ` [RFC v13][PATCH 00/14] Kernel based checkpoint/restart Dave Hansen
2009-02-10 17:05   ` Dave Hansen
2009-02-11 22:14   ` Andrew Morton
2009-02-11 22:14     ` Andrew Morton
2009-02-11 22:14     ` Andrew Morton
2009-02-12  9:17     ` Ingo Molnar
2009-02-12  9:17       ` Ingo Molnar
2009-02-12 18:11       ` Dave Hansen
2009-02-12 18:11         ` Dave Hansen
2009-02-12 18:11         ` Dave Hansen
2009-02-12 20:48         ` Serge E. Hallyn
2009-02-12 20:48         ` Serge E. Hallyn
2009-02-12 20:48           ` Serge E. Hallyn
2009-02-12 20:48           ` Serge E. Hallyn
2009-02-13 10:20         ` Ingo Molnar
2009-02-13 10:20           ` Ingo Molnar
2009-02-13 10:20         ` Ingo Molnar
     [not found]       ` <20090212091721.GB1888-X9Un+BFzKDI@public.gmane.org>
2009-02-12 18:11         ` Dave Hansen
     [not found]     ` <20090211141434.dfa1d079.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2009-02-12  9:17       ` Ingo Molnar
2009-02-12 18:11       ` Dave Hansen
2009-02-12 18:11     ` Dave Hansen
2009-02-12 18:11       ` Dave Hansen
2009-02-12 19:30       ` Matt Mackall
2009-02-12 19:30       ` Matt Mackall
2009-02-12 19:30         ` Matt Mackall
2009-02-12 19:42         ` Andrew Morton
2009-02-12 19:42           ` Andrew Morton
2009-02-12 21:51           ` What can OpenVZ do? Dave Hansen
2009-02-12 21:51             ` Dave Hansen
2009-02-12 22:10             ` Andrew Morton
2009-02-12 22:10             ` Andrew Morton
2009-02-12 22:10               ` Andrew Morton
2009-02-12 23:04               ` How much of a mess does OpenVZ make? ;) Was: " Dave Hansen
2009-02-12 23:04                 ` Dave Hansen
2009-02-26 15:57                 ` Alexey Dobriyan
2009-02-26 15:57                   ` Alexey Dobriyan
2009-02-26 15:57                   ` Alexey Dobriyan
     [not found]                   ` <20090226155755.GA1456-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-03-10 21:53                     ` Alexey Dobriyan
2009-03-10 21:53                   ` Alexey Dobriyan
2009-03-10 21:53                     ` Alexey Dobriyan
2009-03-10 23:28                     ` Serge E. Hallyn
2009-03-10 23:28                       ` Serge E. Hallyn
2009-03-11  8:26                     ` Cedric Le Goater
2009-03-11  8:26                       ` Cedric Le Goater
2009-03-12 14:53                       ` Serge E. Hallyn
2009-03-12 14:53                         ` Serge E. Hallyn
2009-03-12 21:01                         ` Greg Kurz
2009-03-12 21:01                           ` Greg Kurz
2009-03-12 21:21                           ` Serge E. Hallyn
2009-03-12 21:21                             ` Serge E. Hallyn
2009-03-12 21:21                             ` Serge E. Hallyn
     [not found]                             ` <20090312212124.GA25019-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-13  4:29                               ` Ying Han
2009-03-13  4:29                             ` Ying Han
2009-03-13  4:29                               ` Ying Han
2009-03-13  5:34                               ` Sukadev Bhattiprolu
2009-03-13  5:34                                 ` Sukadev Bhattiprolu
     [not found]                                 ` <20090313053458.GA28833-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-13  6:19                                   ` Ying Han
2009-03-13 17:27                                   ` Linus Torvalds
2009-03-13  6:19                                 ` Ying Han
2009-03-13  6:19                                   ` Ying Han
2009-03-13  6:19                                   ` Ying Han
2009-03-13 17:27                                 ` Linus Torvalds
2009-03-13 17:27                                   ` Linus Torvalds
2009-03-13 19:02                                   ` Serge E. Hallyn
2009-03-13 19:02                                     ` Serge E. Hallyn
     [not found]                                   ` <alpine.LFD.2.00.0903131018390.3940-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2009-03-13 19:02                                     ` Serge E. Hallyn
2009-03-13 19:35                                     ` Alexey Dobriyan
2009-03-13 20:48                                     ` Mike Waychison
2009-03-13 19:35                                   ` Alexey Dobriyan
2009-03-13 19:35                                     ` Alexey Dobriyan
2009-03-13 19:35                                     ` Alexey Dobriyan
     [not found]                                     ` <20090313193500.GA2285-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-03-13 21:01                                       ` Linus Torvalds
2009-03-13 21:01                                     ` Linus Torvalds
2009-03-13 21:01                                       ` Linus Torvalds
     [not found]                                       ` <alpine.LFD.2.00.0903131401070.3940-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2009-03-13 21:51                                         ` Dave Hansen
2009-03-14  0:20                                         ` Alexey Dobriyan
2009-03-13 21:51                                       ` Dave Hansen
2009-03-13 21:51                                         ` Dave Hansen
2009-03-13 22:15                                         ` Oren Laadan
2009-03-13 22:15                                           ` Oren Laadan
     [not found]                                           ` <49BADAE5.8070900-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-03-14  0:27                                             ` Eric W. Biederman
2009-03-14  0:27                                           ` Eric W. Biederman
2009-03-14  0:27                                             ` Eric W. Biederman
     [not found]                                             ` <m1hc1xrlt5.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-03-14  8:12                                               ` Ingo Molnar
2009-03-14  8:12                                             ` Ingo Molnar
2009-03-14  8:12                                               ` Ingo Molnar
2009-03-16 22:33                                               ` Kevin Fox
2009-03-16 22:33                                                 ` Kevin Fox
2009-03-19 21:19                                               ` Eric W. Biederman
2009-03-19 21:19                                                 ` Eric W. Biederman
     [not found]                                               ` <20090314081207.GA16436-X9Un+BFzKDI@public.gmane.org>
2009-03-16 22:33                                                 ` Kevin Fox
2009-03-19 21:19                                                 ` Eric W. Biederman
2009-03-13 22:15                                         ` Oren Laadan
2009-03-14  0:20                                       ` Alexey Dobriyan
2009-03-14  0:20                                         ` Alexey Dobriyan
2009-03-14  0:20                                         ` Alexey Dobriyan
2009-03-14  8:25                                         ` Ingo Molnar
2009-03-14  8:25                                           ` Ingo Molnar
     [not found]                                           ` <20090314082532.GB16436-X9Un+BFzKDI@public.gmane.org>
2009-03-14 17:11                                             ` Joseph Ruscio
2009-03-16  6:01                                             ` Oren Laadan
2009-03-16  6:01                                           ` Oren Laadan
2009-03-16  6:01                                             ` Oren Laadan
     [not found]                                         ` <20090314002059.GA4167-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-03-14  8:25                                           ` Ingo Molnar
2009-03-13 20:48                                   ` Mike Waychison
2009-03-13 20:48                                     ` Mike Waychison
     [not found]                                     ` <49BAC6AF.9090607-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2009-03-13 22:35                                       ` Oren Laadan
2009-03-13 22:35                                     ` Oren Laadan
2009-03-13 22:35                                       ` Oren Laadan
     [not found]                                       ` <49BADFCE.8020207-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-03-18 18:54                                         ` Mike Waychison
2009-03-18 18:54                                       ` Mike Waychison
2009-03-18 18:54                                         ` Mike Waychison
2009-03-18 19:04                                         ` Oren Laadan
2009-03-18 19:04                                           ` Oren Laadan
     [not found]                                         ` <49C1435B.1090809-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2009-03-18 19:04                                           ` Oren Laadan
2009-03-13 15:27                               ` Cedric Le Goater
2009-03-13 15:27                                 ` Cedric Le Goater
     [not found]                                 ` <49BA7B60.60607-GANU6spQydw@public.gmane.org>
2009-03-13 17:11                                   ` Greg Kurz
2009-03-13 17:11                                     ` Greg Kurz
2009-03-13 17:11                                     ` Greg Kurz
2009-03-13 17:37                               ` Serge E. Hallyn
2009-03-13 17:37                                 ` Serge E. Hallyn
     [not found]                               ` <604427e00903122129y37ad791aq5fe7ef2552415da9-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-03-13  5:34                                 ` Sukadev Bhattiprolu
2009-03-13 15:27                                 ` Cedric Le Goater
2009-03-13 17:37                                 ` Serge E. Hallyn
2009-03-12 21:21                           ` Serge E. Hallyn
2009-03-13 15:47                         ` Cedric Le Goater
2009-03-13 15:47                           ` Cedric Le Goater
     [not found]                           ` <49BA8013.3030103-GANU6spQydw@public.gmane.org>
2009-03-13 16:35                             ` Serge E. Hallyn
2009-03-13 16:35                           ` Serge E. Hallyn
2009-03-13 16:35                             ` Serge E. Hallyn
     [not found]                             ` <20090313163531.GA10685-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-13 16:53                               ` Cedric Le Goater
2009-03-13 16:53                             ` Cedric Le Goater
2009-03-13 16:53                               ` Cedric Le Goater
     [not found]                         ` <20090312145311.GC12390-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-12 21:01                           ` Greg Kurz
2009-03-13 15:47                           ` Cedric Le Goater
     [not found]                       ` <49B775B4.1040800-GANU6spQydw@public.gmane.org>
2009-03-12 14:53                         ` Serge E. Hallyn
     [not found]                     ` <20090310215305.GA2078-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-03-10 23:28                       ` Serge E. Hallyn
2009-03-11  8:26                       ` Cedric Le Goater
2009-02-26 15:57                 ` Alexey Dobriyan
2009-02-26 16:27                 ` Alexey Dobriyan
2009-02-26 16:27                   ` Alexey Dobriyan
     [not found]                   ` <20090226162755.GB1456-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-02-26 17:33                     ` Ingo Molnar
2009-02-26 17:33                   ` Ingo Molnar
2009-02-26 17:33                     ` Ingo Molnar
2009-02-26 18:30                     ` Greg Kurz
2009-02-26 18:30                       ` Greg Kurz
2009-02-26 18:30                       ` Greg Kurz
2009-02-26 22:17                       ` Alexey Dobriyan
2009-02-26 22:17                         ` Alexey Dobriyan
2009-02-26 22:17                         ` Alexey Dobriyan
2009-02-27  9:19                         ` Greg Kurz
2009-02-27  9:19                           ` Greg Kurz
2009-02-27  9:19                           ` Greg Kurz
2009-02-27 10:53                           ` Alexey Dobriyan
2009-02-27 10:53                             ` Alexey Dobriyan
2009-02-27 10:53                             ` Alexey Dobriyan
     [not found]                             ` <20090227105306.GB2939-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-02-27 14:33                               ` Cedric Le Goater
2009-02-27 14:33                             ` Cedric Le Goater
2009-02-27 14:33                               ` Cedric Le Goater
2009-02-27 10:53                           ` Alexey Dobriyan
     [not found]                         ` <20090226221709.GA2924-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-02-27  9:19                           ` Greg Kurz
2009-02-27  9:36                           ` Cedric Le Goater
2009-02-27  9:36                         ` Cedric Le Goater
2009-02-27  9:36                           ` Cedric Le Goater
2009-02-26 22:17                       ` Alexey Dobriyan
2009-02-26 22:31                     ` Alexey Dobriyan
2009-02-26 22:31                       ` Alexey Dobriyan
2009-02-26 22:31                       ` Alexey Dobriyan
2009-02-27  9:03                       ` Ingo Molnar
2009-02-27  9:03                         ` Ingo Molnar
2009-02-27  9:19                         ` Andrew Morton
2009-02-27  9:19                           ` Andrew Morton
     [not found]                           ` <20090227011901.8598d7f0.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2009-02-27 10:57                             ` Alexey Dobriyan
2009-02-27 10:57                           ` Alexey Dobriyan
2009-02-27 10:57                             ` Alexey Dobriyan
     [not found]                         ` <20090227090323.GC16211-X9Un+BFzKDI@public.gmane.org>
2009-02-27  9:19                           ` Andrew Morton
2009-02-27  9:22                           ` Andrew Morton
2009-02-27  9:22                             ` Andrew Morton
2009-02-27  9:22                             ` Andrew Morton
2009-02-27 10:59                             ` Alexey Dobriyan
2009-02-27 10:59                               ` Alexey Dobriyan
     [not found]                             ` <20090227012209.65401324.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2009-02-27 10:59                               ` Alexey Dobriyan
2009-02-27 16:14                       ` Dave Hansen
2009-02-27 16:14                         ` Dave Hansen
2009-02-27 21:57                         ` Alexey Dobriyan
2009-02-27 21:57                         ` Alexey Dobriyan
2009-02-27 21:57                           ` Alexey Dobriyan
     [not found]                           ` <20090227215749.GA3453-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-02-27 21:54                             ` Dave Hansen
2009-02-27 21:54                           ` Dave Hansen
2009-02-27 21:54                             ` Dave Hansen
2009-02-27 21:54                             ` Dave Hansen
     [not found]                       ` <20090226223112.GA2939-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-02-27  9:03                         ` Ingo Molnar
2009-02-27 16:14                         ` Dave Hansen
2009-03-01  1:33                         ` Alexey Dobriyan
2009-03-01  1:33                       ` Alexey Dobriyan
2009-03-01  1:33                         ` Alexey Dobriyan
2009-03-01  1:33                         ` Alexey Dobriyan
2009-03-01 20:02                         ` Serge E. Hallyn
2009-03-01 20:02                           ` Serge E. Hallyn
2009-03-01 20:02                           ` Serge E. Hallyn
     [not found]                           ` <20090301200231.GA25276-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-01 20:56                             ` Alexey Dobriyan
2009-03-01 20:56                           ` Alexey Dobriyan
2009-03-01 20:56                             ` Alexey Dobriyan
2009-03-01 20:56                             ` Alexey Dobriyan
2009-03-01 22:21                             ` Serge E. Hallyn
2009-03-01 22:21                               ` Serge E. Hallyn
2009-03-03 16:17                             ` Cedric Le Goater
2009-03-03 16:17                               ` Cedric Le Goater
2009-03-03 18:28                               ` Serge E. Hallyn
2009-03-03 18:28                                 ` Serge E. Hallyn
     [not found]                               ` <49AD581F.2090903-GANU6spQydw@public.gmane.org>
2009-03-03 18:28                                 ` Serge E. Hallyn
     [not found]                             ` <20090301205659.GA7276-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-03-01 22:21                               ` Serge E. Hallyn
2009-03-03 16:17                               ` Cedric Le Goater
     [not found]                         ` <20090301013304.GA2428-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-03-01 20:02                           ` Serge E. Hallyn
     [not found]                     ` <20090226173302.GB29439-X9Un+BFzKDI@public.gmane.org>
2009-02-26 18:30                       ` Greg Kurz
2009-02-26 22:31                       ` Alexey Dobriyan
2009-02-26 16:27                 ` Alexey Dobriyan
     [not found]               ` <20090212141014.2cd3d54d.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2009-02-12 23:04                 ` Dave Hansen
2009-02-13 10:53                 ` Ingo Molnar
2009-02-13 10:53               ` Ingo Molnar
2009-02-13 10:53                 ` Ingo Molnar
2009-02-16 20:51                 ` Dave Hansen
2009-02-16 20:51                   ` Dave Hansen
2009-02-16 20:51                   ` Dave Hansen
2009-02-17 22:23                   ` Ingo Molnar
2009-02-17 22:23                   ` Ingo Molnar
2009-02-17 22:23                     ` Ingo Molnar
     [not found]                     ` <20090217222319.GA10546-X9Un+BFzKDI@public.gmane.org>
2009-02-17 22:30                       ` Dave Hansen
2009-02-17 22:30                     ` Dave Hansen
2009-02-17 22:30                       ` Dave Hansen
2009-02-17 22:30                       ` Dave Hansen
2009-02-18  0:32                       ` Ingo Molnar
2009-02-18  0:32                       ` Ingo Molnar
2009-02-18  0:32                         ` Ingo Molnar
2009-02-18  0:40                         ` Dave Hansen
2009-02-18  0:40                           ` Dave Hansen
2009-02-18  5:11                           ` Alexey Dobriyan
2009-02-18  5:11                           ` Alexey Dobriyan
2009-02-18  5:11                             ` Alexey Dobriyan
2009-02-18 18:16                             ` Ingo Molnar
2009-02-18 18:16                               ` Ingo Molnar
     [not found]                               ` <20090218181644.GD19995-X9Un+BFzKDI@public.gmane.org>
2009-02-18 21:27                                 ` Dave Hansen
2009-02-18 21:27                               ` Dave Hansen
2009-02-18 21:27                                 ` Dave Hansen
2009-02-18 21:27                                 ` Dave Hansen
2009-02-18 23:15                                 ` Ingo Molnar
2009-02-18 23:15                                 ` Ingo Molnar
2009-02-18 23:15                                   ` Ingo Molnar
2009-02-19 19:06                                   ` Banning checkpoint (was: Re: What can OpenVZ do?) Alexey Dobriyan
2009-02-19 19:06                                     ` Alexey Dobriyan
2009-02-19 19:11                                     ` Dave Hansen
2009-02-19 19:11                                       ` Dave Hansen
2009-02-24  4:47                                       ` Alexey Dobriyan
2009-02-24  4:47                                         ` Alexey Dobriyan
2009-02-24  4:47                                         ` Alexey Dobriyan
2009-02-24  5:11                                         ` Dave Hansen
2009-02-24  5:11                                           ` Dave Hansen
2009-02-24  5:11                                           ` Dave Hansen
2009-02-24 15:43                                           ` Serge E. Hallyn
2009-02-24 15:43                                           ` Serge E. Hallyn
2009-02-24 15:43                                             ` Serge E. Hallyn
2009-02-24 20:09                                           ` Alexey Dobriyan
2009-02-24 20:09                                             ` Alexey Dobriyan
2009-02-24 20:09                                           ` Alexey Dobriyan
     [not found]                                         ` <20090224044752.GB3202-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-02-24  5:11                                           ` Dave Hansen
2009-02-24  4:47                                       ` Alexey Dobriyan
     [not found]                                     ` <20090219190637.GA4846-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-02-19 19:11                                       ` Dave Hansen
     [not found]                                   ` <20090218231545.GA17524-X9Un+BFzKDI@public.gmane.org>
2009-02-19 19:06                                     ` Alexey Dobriyan
     [not found]                             ` <20090218051123.GA9367-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-02-18 18:16                               ` What can OpenVZ do? Ingo Molnar
     [not found]                         ` <20090218003217.GB25856-X9Un+BFzKDI@public.gmane.org>
2009-02-18  0:40                           ` Dave Hansen
     [not found]                 ` <20090213105302.GC4608-X9Un+BFzKDI@public.gmane.org>
2009-02-16 20:51                   ` Dave Hansen
2009-02-12 22:17             ` Alexey Dobriyan
2009-02-12 22:17             ` Alexey Dobriyan
2009-02-12 22:17               ` Alexey Dobriyan
2009-02-13 10:27             ` Ingo Molnar
2009-02-13 10:27             ` Ingo Molnar
2009-02-13 10:27               ` Ingo Molnar
2009-02-13 11:32               ` Alexey Dobriyan
2009-02-13 11:32                 ` Alexey Dobriyan
2009-02-13 11:45                 ` Ingo Molnar
2009-02-13 11:45                   ` Ingo Molnar
2009-02-13 22:28                   ` Alexey Dobriyan
2009-02-13 22:28                     ` Alexey Dobriyan
     [not found]                     ` <20090213222818.GA17630-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-03-14  0:04                       ` Eric W. Biederman
2009-03-14  0:04                     ` Eric W. Biederman
2009-03-14  0:04                       ` Eric W. Biederman
2009-03-14  0:26                       ` Serge E. Hallyn
2009-03-14  0:26                         ` Serge E. Hallyn
     [not found]                       ` <m1wsatrmu0.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-03-14  0:26                         ` Serge E. Hallyn
     [not found]                   ` <20090213114503.GG15679-X9Un+BFzKDI@public.gmane.org>
2009-02-13 22:28                     ` Alexey Dobriyan
     [not found]                 ` <20090213113248.GA15275-2ev+ksY9ol182hYKe6nXyg@public.gmane.org>
2009-02-13 11:45                   ` Ingo Molnar
     [not found]               ` <20090213102732.GB4608-X9Un+BFzKDI@public.gmane.org>
2009-02-13 11:32                 ` Alexey Dobriyan
     [not found]           ` <20090212114207.e1c2de82.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2009-02-12 21:51             ` Dave Hansen
2009-02-12 19:42         ` [RFC v13][PATCH 00/14] Kernel based checkpoint/restart Andrew Morton
2009-02-12 22:57         ` Dave Hansen
2009-02-12 22:57           ` Dave Hansen
2009-02-12 23:05           ` Matt Mackall
2009-02-12 23:05           ` Matt Mackall
2009-02-12 23:05             ` Matt Mackall
2009-02-12 23:13             ` Dave Hansen
2009-02-12 23:13             ` Dave Hansen
2009-02-12 23:13               ` Dave Hansen
2009-02-12 23:13               ` Dave Hansen
2009-02-12 22:57         ` Dave Hansen
2009-02-13 23:28       ` Andrew Morton
2009-02-13 23:28       ` Andrew Morton
2009-02-13 23:28         ` Andrew Morton
2009-02-13 23:28         ` Andrew Morton
2009-02-14 23:08         ` Ingo Molnar
2009-02-14 23:08           ` Ingo Molnar
2009-02-14 23:31           ` Andrew Morton
2009-02-14 23:31             ` Andrew Morton
     [not found]             ` <20090214153124.73132bf9.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2009-02-14 23:50               ` Ingo Molnar
2009-02-14 23:50             ` Ingo Molnar
2009-02-14 23:50               ` Ingo Molnar
     [not found]           ` <20090214230802.GE20477-X9Un+BFzKDI@public.gmane.org>
2009-02-14 23:31             ` Andrew Morton
2009-02-16 17:37         ` Dave Hansen
2009-02-16 17:37           ` Dave Hansen
2009-02-16 17:37           ` Dave Hansen
2009-03-13  2:45         ` Oren Laadan
2009-03-13  2:45           ` Oren Laadan
2009-03-13  3:57           ` Oren Laadan
2009-03-13  3:57             ` Oren Laadan
     [not found]           ` <49B9C8E0.5080500-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-03-13  3:57             ` Oren Laadan
     [not found]         ` <20090213152836.0fbbfa7d.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2009-02-14 23:08           ` Ingo Molnar
2009-02-16 17:37           ` Dave Hansen
2009-03-13  2:45           ` Oren Laadan
2009-02-11 22:14   ` Andrew Morton
     [not found] ` <bb33bcf20903160526v56f16a82m9192770e228016b1@mail.gmail.com>
     [not found]   ` <1237365510.5381.34.camel@subratamodak.linux.ibm.com>
     [not found]     ` <20090318133943.GA22636@us.ibm.com>
     [not found]       ` <1237385013.5381.58.camel@subratamodak.linux.ibm.com>
2009-06-23 14:48         ` [LTP] " Subrata Modak
2009-06-23 15:02           ` Serge E. Hallyn
2009-06-25  9:10             ` Subrata Modak
2009-09-13 13:16               ` Subrata Modak
2009-09-13 20:06                 ` Serge E. Hallyn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1233076092-8660-5-git-send-email-orenl@cs.columbia.edu \
    --to=orenl@cs.columbia.edu \
    --cc=akpm@linux-foundation.org \
    --cc=containers@lists.linux-foundation.org \
    --cc=dave@linux.vnet.ibm.com \
    --cc=hpa@zytor.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@elte.hu \
    --cc=serue@us.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@osdl.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.