* [C/R ARM][PATCH 0/3] Linux Checkpoint-Restart - ARM port
@ 2010-03-22  1:06 ` Christoffer Dall
  0 siblings, 0 replies; 80+ messages in thread
From: Christoffer Dall @ 2010-03-22  1:06 UTC (permalink / raw)
  To: containers, linux-arm-kernel; +Cc: linux-kernel, Christoffer Dall

This series contains two preparatory patches for an ARM port of the
checkpoint-restart (c/r) code, followed by a third patch implementing the
architecture-specific parts of c/r.

The preparatory patches consist of a systrace implementation for ARM,
based on a previous patch from Roland McGrath, and an eclone implementation
for ARM. The systrace implementation is partial and provides only the
functionality needed for c/r.

There is a separate patch for the user space code, which supports
cross-compilation, extracting headers for ARM and an eclone implementation
for ARM.

The kernel patches presented here are based on the ckpt-v20 patch set.

Signed-off-by: Christoffer Dall <christofferdall@christofferdall.dk>
Acked-by: Oren Laadan <orenl@cs.columbia.edu>

* [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
       [not found] ` <1269219965-23923-1-git-send-email-christofferdall-77OGu6e99YhyO3AAkE1OcX9LOBIZ5rWg@public.gmane.org>
@ 2010-03-22  1:06   ` Christoffer Dall
  2010-03-22  1:06   ` [C/R ARM][PATCH 2/3] ARM: Add the eclone system call Christoffer Dall
  2010-03-22  1:06   ` [C/R ARM][PATCH 3/3] c/r: ARM implementation of checkpoint/restart Christoffer Dall
  2 siblings, 0 replies; 80+ messages in thread
From: Christoffer Dall @ 2010-03-22  1:06 UTC (permalink / raw)
  To: containers, linux-arm-kernel
  Cc: Christoffer Dall, linux-kernel, Roland McGrath

This small commit records the current system call number for ARM in
thread_info on every system call entry and resets it to -1 on return to
user space, making it possible for a debugger or the checkpoint code to
gain information about another process' state with respect to system calls.

The patch is based on this proposal from Roland McGrath:
https://patchwork.kernel.org/patch/32101/

Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Christoffer Dall <christofferdall@christofferdall.dk>
Acked-by: Oren Laadan <orenl@cs.columbia.edu>
---
 arch/arm/include/asm/syscall.h |   31 +++++++++++++++++++++++++++++++
 arch/arm/kernel/asm-offsets.c  |    1 +
 arch/arm/kernel/entry-common.S |    8 +++++++-
 arch/arm/kernel/ptrace.c       |    2 --
 arch/arm/kernel/signal.c       |   14 +++++++-------
 5 files changed, 46 insertions(+), 10 deletions(-)
 create mode 100644 arch/arm/include/asm/syscall.h
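
As a rough illustration of the state tracking this patch adds, the
following user-space C model mirrors the three places the diff touches:
the store in vector_swi, the reset to -1 on the return paths, and the
"!= -1" test that replaces the old 'syscall' argument in signal.c. The
names model_thread_info, MODEL_NO_SYSCALL and the model_* helpers are
hypothetical and exist only for this sketch; it is not kernel code.

```c
#include <stdbool.h>

/* Sentinel the patch stores on return to user space. */
#define MODEL_NO_SYSCALL (-1)

/* Hypothetical stand-in for the single thread_info field the patch uses. */
struct model_thread_info {
	int syscall;
};

/* Models the store added to vector_swi: record the number on entry. */
static void model_syscall_enter(struct model_thread_info *ti, int scno)
{
	ti->syscall = scno;
}

/* Models the stores added to ret_fast_syscall and ret_slow_syscall. */
static void model_syscall_exit(struct model_thread_info *ti)
{
	ti->syscall = MODEL_NO_SYSCALL;
}

/* Models the test that replaces the old 'syscall' argument in signal.c. */
static bool model_in_syscall(const struct model_thread_info *ti)
{
	return ti->syscall != MODEL_NO_SYSCALL;
}

/* Models syscall_get_nr(): what a debugger or checkpointer would read. */
static int model_syscall_get_nr(const struct model_thread_info *ti)
{
	return ti->syscall;
}
```

One reason to prefer -1 over 0 as the sentinel is that 0 can be a valid
system call number once __NR_SYSCALL_BASE is subtracted (restart_syscall),
so a plain truth test on the stored number would misreport that case.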

diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
new file mode 100644
index 0000000..3b3248f
--- /dev/null
+++ b/arch/arm/include/asm/syscall.h
@@ -0,0 +1,31 @@
+/*
+ * syscall.h - Linux syscall interfaces for ARM
+ *
+ * Copyright (c) 2010 Christoffer Dall
+ *
+ * This file is released under the GPLv2.
+ * See the file COPYING for more details.
+ */
+
+#ifndef _ASM_ARM_SYSCALLS_H
+#define _ASM_ARM_SYSCALLS_H
+
+static inline int syscall_get_nr(struct task_struct *task,
+				 struct pt_regs *regs)
+{
+	return (int)(task_thread_info(task)->syscall);
+}
+
+static inline long syscall_get_return_value(struct task_struct *task,
+					    struct pt_regs *regs)
+{
+	return regs->ARM_r0;
+}
+
+static inline long syscall_get_error(struct task_struct *task,
+				     struct pt_regs *regs)
+{
+	return regs->ARM_r0;
+}
+
+#endif /* _ASM_ARM_SYSCALLS_H */
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 4a88125..726a0ad 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -48,6 +48,7 @@ int main(void)
   DEFINE(TI_CPU,		offsetof(struct thread_info, cpu));
   DEFINE(TI_CPU_DOMAIN,		offsetof(struct thread_info, cpu_domain));
   DEFINE(TI_CPU_SAVE,		offsetof(struct thread_info, cpu_context));
+  DEFINE(TI_SYSCALL,		offsetof(struct thread_info, syscall));
   DEFINE(TI_USED_CP,		offsetof(struct thread_info, used_cp));
   DEFINE(TI_TP_VALUE,		offsetof(struct thread_info, tp_value));
   DEFINE(TI_FPSTATE,		offsetof(struct thread_info, fpstate));
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 2c1db77..f694f4d 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -30,6 +30,9 @@ ret_fast_syscall:
 	tst	r1, #_TIF_WORK_MASK
 	bne	fast_work_pending
 
+	mov	r2, #-1
+	str	r2, [tsk, #TI_SYSCALL]
+
 	/* perform architecture specific actions before user return */
 	arch_ret_to_user r1, lr
 
@@ -47,7 +50,6 @@ work_pending:
 	tst	r1, #_TIF_SIGPENDING|_TIF_NOTIFY_RESUME
 	beq	no_work_pending
 	mov	r0, sp				@ 'regs'
-	mov	r2, why				@ 'syscall'
 	bl	do_notify_resume
 	b	ret_slow_syscall		@ Check work again
 
@@ -62,6 +64,9 @@ ret_slow_syscall:
 	ldr	r1, [tsk, #TI_FLAGS]
 	tst	r1, #_TIF_WORK_MASK
 	bne	work_pending
+
+	mov	r2, #-1
+	str	r2, [tsk, #TI_SYSCALL]
 no_work_pending:
 	/* perform architecture specific actions before user return */
 	arch_ret_to_user r1, lr
@@ -274,6 +279,7 @@ ENTRY(vector_swi)
 	eor	scno, scno, #__NR_SYSCALL_BASE	@ check OS number
 #endif
 
+	str	scno, [tsk, #TI_SYSCALL]	@ store syscall nr. globally
 	stmdb	sp!, {r4, r5}			@ push fifth and sixth args
 	tst	ip, #_TIF_SYSCALL_TRACE		@ are we tracing syscalls?
 	bne	__sys_trace
diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index a2ea385..44ab437 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -863,8 +863,6 @@ asmlinkage int syscall_trace(int why, struct pt_regs *regs, int scno)
 	ip = regs->ARM_ip;
 	regs->ARM_ip = why;
 
-	current_thread_info()->syscall = scno;
-
 	/* the 0x80 provides a way for the tracing parent to distinguish
 	   between a syscall stop and SIGTRAP delivery */
 	ptrace_notify(SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD)
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index e7714f3..f695239 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -527,7 +527,7 @@ static inline void setup_syscall_restart(struct pt_regs *regs)
 static int
 handle_signal(unsigned long sig, struct k_sigaction *ka,
 	      siginfo_t *info, sigset_t *oldset,
-	      struct pt_regs * regs, int syscall)
+	      struct pt_regs *regs)
 {
 	struct thread_info *thread = current_thread_info();
 	struct task_struct *tsk = current;
@@ -537,7 +537,7 @@ handle_signal(unsigned long sig, struct k_sigaction *ka,
 	/*
 	 * If we were from a system call, check for system call restarting...
 	 */
-	if (syscall) {
+	if (thread->syscall != -1) {
 		switch (regs->ARM_r0) {
 		case -ERESTART_RESTARTBLOCK:
 		case -ERESTARTNOHAND:
@@ -601,7 +601,7 @@ handle_signal(unsigned long sig, struct k_sigaction *ka,
  * the kernel can handle, and then we build all the user-level signal handling
  * stack-frames in one go after that.
  */
-static void do_signal(struct pt_regs *regs, int syscall)
+static void do_signal(struct pt_regs *regs)
 {
 	struct k_sigaction ka;
 	siginfo_t info;
@@ -629,7 +629,7 @@ static void do_signal(struct pt_regs *regs, int syscall)
 			oldset = &current->saved_sigmask;
 		else
 			oldset = &current->blocked;
-		if (handle_signal(signr, &ka, &info, oldset, regs, syscall) == 0) {
+		if (handle_signal(signr, &ka, &info, oldset, regs) == 0) {
 			/*
 			 * A signal was successfully delivered; the saved
 			 * sigmask will have been stored in the signal frame,
@@ -647,7 +647,7 @@ static void do_signal(struct pt_regs *regs, int syscall)
 	/*
 	 * No signal to deliver to the process - restart the syscall.
 	 */
-	if (syscall) {
+	if (current_thread_info()->syscall != -1) {
 		if (regs->ARM_r0 == -ERESTART_RESTARTBLOCK) {
 			if (thumb_mode(regs)) {
 				regs->ARM_r7 = __NR_restart_syscall - __NR_SYSCALL_BASE;
@@ -689,10 +689,10 @@ static void do_signal(struct pt_regs *regs, int syscall)
 }
 
 asmlinkage void
-do_notify_resume(struct pt_regs *regs, unsigned int thread_flags, int syscall)
+do_notify_resume(struct pt_regs *regs, unsigned int thread_flags)
 {
 	if (thread_flags & _TIF_SIGPENDING)
-		do_signal(regs, syscall);
+		do_signal(regs);
 
 	if (thread_flags & _TIF_NOTIFY_RESUME) {
 		clear_thread_flag(TIF_NOTIFY_RESUME);
-- 
1.5.6.5

* [C/R ARM][PATCH 2/3] ARM: Add the eclone system call
       [not found] ` <1269219965-23923-1-git-send-email-christofferdall-77OGu6e99YhyO3AAkE1OcX9LOBIZ5rWg@public.gmane.org>
  2010-03-22  1:06   ` Christoffer Dall
@ 2010-03-22  1:06   ` Christoffer Dall
  2010-03-22  1:06   ` [C/R ARM][PATCH 3/3] c/r: ARM implementation of checkpoint/restart Christoffer Dall
  2 siblings, 0 replies; 80+ messages in thread
From: Christoffer Dall @ 2010-03-22  1:06 UTC (permalink / raw)
  To: containers, linux-arm-kernel
  Cc: Sukadev Bhattiprolu, Christoffer Dall, libc-ports, linux-kernel,
	rmk

In addition to doing everything that the clone() system call does, the
eclone() system call:

	- allows additional clone flags (31 of the 32 bits in the flags
	  parameter to clone() are already in use)

	- allows the user to specify a pid for the child process in its
	  active and ancestor pid namespaces.

Eclone is needed for restarting a process from a checkpoint. See more
in Documentation/eclone and refer to the original LKML posting:
http://lkml.org/lkml/2009/11/11/361

The new system call for ARM has number 366.

Cc: rmk@arm.linux.org.uk
Cc: libc-ports <libc-ports@sourceware.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Christoffer Dall <christofferdall@christofferdall.dk>
Acked-by: Oren Laadan <orenl@cs.columbia.edu>
---
 arch/arm/include/asm/unistd.h  |    1 +
 arch/arm/kernel/calls.S        |    1 +
 arch/arm/kernel/entry-common.S |    6 ++++++
 arch/arm/kernel/sys_arm.c      |   39 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 47 insertions(+), 0 deletions(-)
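
The argument handling in sys_eclone() below can be sketched in user
space as follows. This is a simplified model under stated assumptions,
not the kernel implementation: model_clone_args contains only the fields
this patch reads (the real struct clone_args comes from the eclone patch
set and may differ), model_fork_request stands in for the call to
do_fork_with_pids(), and parent_sp stands in for regs->ARM_sp. All
model_* names are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Simplified stand-in: only the clone_args fields read by this patch. */
struct model_clone_args {
	uint64_t clone_flags_high;
	uint64_t child_stack;
	uint64_t child_stack_size;
	uint32_t nr_pids;
};

/* What would be handed on to do_fork_with_pids() in the kernel. */
struct model_fork_request {
	unsigned long flags;
	unsigned long stack;
	unsigned int nr_pids;
};

/*
 * Returns 0 and fills *req on success, or a negative errno value.
 * parent_sp models regs->ARM_sp of the calling task.
 */
static int model_eclone_prepare(unsigned int flags_low,
				const struct model_clone_args *kca,
				unsigned long parent_sp,
				struct model_fork_request *req)
{
	/* The patch rejects any non-zero child stack size. */
	if (kca->child_stack_size)
		return -EINVAL;

	/* Only the low 32 flag bits are used for now (see the TODO). */
	req->flags = flags_low;

	/* Without an explicit child stack, reuse the parent's stack pointer. */
	req->stack = kca->child_stack ? (unsigned long)kca->child_stack
				      : parent_sp;
	req->nr_pids = kca->nr_pids;
	return 0;
}
```

The model leaves clone_flags_high unused, matching the patch, which
defers widening the flags to 64 bits.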

diff --git a/arch/arm/include/asm/unistd.h b/arch/arm/include/asm/unistd.h
index cf9cdaa..f295a6c 100644
--- a/arch/arm/include/asm/unistd.h
+++ b/arch/arm/include/asm/unistd.h
@@ -392,6 +392,7 @@
 #define __NR_rt_tgsigqueueinfo		(__NR_SYSCALL_BASE+363)
 #define __NR_perf_event_open		(__NR_SYSCALL_BASE+364)
 #define __NR_recvmmsg			(__NR_SYSCALL_BASE+365)
+#define __NR_eclone			(__NR_SYSCALL_BASE+366)
 
 /*
  * The following SWIs are ARM private.
diff --git a/arch/arm/kernel/calls.S b/arch/arm/kernel/calls.S
index 9314a2d..5ef0b03 100644
--- a/arch/arm/kernel/calls.S
+++ b/arch/arm/kernel/calls.S
@@ -375,6 +375,7 @@
 		CALL(sys_rt_tgsigqueueinfo)
 		CALL(sys_perf_event_open)
 /* 365 */	CALL(sys_recvmmsg)
+		CALL(sys_eclone_wrapper)
 #ifndef syscalls_counted
 .equ syscalls_padding, ((NR_syscalls + 3) & ~3) - NR_syscalls
 #define syscalls_counted
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index f694f4d..9ead15d 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -386,6 +386,12 @@ sys_clone_wrapper:
 		b	sys_clone
 ENDPROC(sys_clone_wrapper)
 
+sys_eclone_wrapper:
+		add	ip, sp, #S_OFF
+		str	ip, [sp, #0]
+		b	sys_eclone
+ENDPROC(sys_eclone_wrapper)
+
 sys_sigreturn_wrapper:
 		add	r0, sp, #S_OFF
 		b	sys_sigreturn
diff --git a/arch/arm/kernel/sys_arm.c b/arch/arm/kernel/sys_arm.c
index ae4027b..fd8199d 100644
--- a/arch/arm/kernel/sys_arm.c
+++ b/arch/arm/kernel/sys_arm.c
@@ -183,6 +183,45 @@ asmlinkage int sys_clone(unsigned long clone_flags, unsigned long newsp,
 	return do_fork(clone_flags, newsp, regs, 0, parent_tidptr, child_tidptr);
 }
 
+asmlinkage int sys_eclone(unsigned flags_low, struct clone_args __user *uca,
+			  int args_size, pid_t __user *pids,
+			  struct pt_regs *regs)
+{
+	int rc;
+	struct clone_args kca;
+	unsigned long flags;
+	int __user *parent_tidp;
+	int __user *child_tidp;
+	unsigned long stack;
+	unsigned long stack_size;
+
+	rc = fetch_clone_args_from_user(uca, args_size, &kca);
+	if (rc)
+		return rc;
+
+	/*
+	 * TODO: Convert 'clone-flags' to 64-bits on all architectures.
+	 * TODO: When ->clone_flags_high is non-zero, copy it in to the
+	 * 	 higher word(s) of 'flags':
+	 *
+	 * 		flags = (kca.clone_flags_high << 32) | flags_low;
+	 */
+	flags = flags_low;
+	parent_tidp = (int __user *)(unsigned long)kca.parent_tid_ptr;
+	child_tidp = (int __user *)(unsigned long)kca.child_tid_ptr;
+
+	stack_size = (unsigned long)kca.child_stack_size;
+	if (stack_size)
+		return -EINVAL;
+
+	stack = (unsigned long)kca.child_stack;
+	if (!stack)
+		stack = regs->ARM_sp;
+
+	return do_fork_with_pids(flags, stack, regs, stack_size, parent_tidp,
+				child_tidp, kca.nr_pids, pids);
+}
+
 asmlinkage int sys_vfork(struct pt_regs *regs)
 {
 	return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs->ARM_sp, regs, 0, NULL, NULL);
-- 
1.5.6.5

* [C/R ARM][PATCH 3/3] c/r: ARM implementation of checkpoint/restart
  2010-03-22  1:06 ` Christoffer Dall
@ 2010-03-22  1:06   ` Christoffer Dall
  -1 siblings, 0 replies; 80+ messages in thread
From: Christoffer Dall @ 2010-03-22  1:06 UTC (permalink / raw)
  To: containers, linux-arm-kernel; +Cc: linux-kernel, Christoffer Dall, rmk

Implements the architecture-specific requirements for checkpoint/restart
on ARM. The changes are confined almost entirely to c/r-related code.
Most of the work is done in arch/arm/kernel/checkpoint.c, which
implements checkpointing of the CPU state and of the necessary fields of
the thread_info struct.
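
As an illustration of the CPU-state side: load_cpu_regs() in checkpoint.c
masks the saved CPSR down to a set of user-writable bits. A standalone
sketch of that kind of masking follows. The bit values are those of
arch/arm/include/asm/ptrace.h; USER_PSR_MASK and merge_cpsr() are
hypothetical names. Unlike the patch, which ORs the masked bits into the
live value, this sketch clears those bits in the live value first.

```c
#include <stdint.h>

/* PSR bit values as defined in arch/arm/include/asm/ptrace.h. */
#define PSR_N_BIT	0x80000000u
#define PSR_Z_BIT	0x40000000u
#define PSR_C_BIT	0x20000000u
#define PSR_V_BIT	0x10000000u
#define PSR_Q_BIT	0x08000000u
#define PSR_GE_BITS	0x000f0000u	/* added to ptrace.h by this patch */
#define PSR_E_BIT	0x00000200u

/* Hypothetical name: the set of bits the patch takes from the image. */
#define USER_PSR_MASK	(PSR_N_BIT | PSR_Z_BIT | PSR_C_BIT | PSR_V_BIT | \
			 PSR_Q_BIT | PSR_E_BIT | PSR_GE_BITS)

/*
 * Hypothetical helper: take the user-writable bits from the saved CPSR
 * and every other bit (mode, interrupt masks, ...) from the live one.
 */
uint32_t merge_cpsr(uint32_t live, uint32_t saved)
{
	return (live & ~USER_PSR_MASK) | (saved & USER_PSR_MASK);
}
```

With this merge, mode and interrupt-mask bits in a checkpoint image never
reach the restored register value, whatever the image contains.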

The ISA version (given by __LINUX_ARM_ARCH__) is checkpointed and, on
restart, verified against the __LINUX_ARM_ARCH__ of the restoring kernel.
If they differ, an error is raised and the restart is aborted. It should
be possible to restart on newer architectures, but further investigation
is warranted.
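
The two policies mentioned above can be sketched as follows. This is a
userspace illustration only: check_arch_strict() mirrors the equality
test the patch performs, while check_arch_relaxed() is a hypothetical
policy that the patch does NOT implement.

```c
#include <errno.h>

/*
 * Strict policy, as described above: the image must come from a kernel
 * built for the same architecture version.
 */
int check_arch_strict(unsigned int image_arch, unsigned int kernel_arch)
{
	return image_arch == kernel_arch ? 0 : -EINVAL;
}

/*
 * Hypothetical relaxed policy (not in the patch): also accept images
 * taken on an older architecture version.
 */
int check_arch_relaxed(unsigned int image_arch, unsigned int kernel_arch)
{
	return image_arch <= kernel_arch ? 0 : -EINVAL;
}
```

Whether the relaxed form is actually safe is exactly the open question
raised above, so it is shown only to make the difference concrete.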

Regarding ThumbEE, the thumbee_state field of the thread_info is stored
in the checkpoint when CONFIG_ARM_THUMBEE is set; otherwise 0 is stored.
If a nonzero value was checkpointed and CONFIG_ARM_THUMBEE is not set on
the restoring system, the restore is aborted. Feedback on this
implementation is very welcome.
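
The ThumbEE rule above can be modelled in userspace roughly as follows.
The 'have_thumbee' parameter stands in for CONFIG_ARM_THUMBEE, and both
function names are hypothetical; this is a sketch of the described
behaviour, not the kernel code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Value written to the image: the live state, or 0 without ThumbEE. */
uint32_t thumbee_to_checkpoint(bool have_thumbee, uint32_t live_state)
{
	return have_thumbee ? live_state : 0;
}

/*
 * Restore side: with ThumbEE support the saved value is installed in
 * *live_state; without it, only a saved value of 0 is acceptable.
 */
int thumbee_from_checkpoint(bool have_thumbee, uint32_t saved,
			    uint32_t *live_state)
{
	if (have_thumbee) {
		*live_state = saved;
		return 0;
	}
	return saved != 0 ? -EINVAL : 0;
}
```

An image written without ThumbEE support always carries 0, so under this
rule it restores cleanly on either kind of kernel.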

We checkpoint whether the system is running with CONFIG_MMU or not and
require the same configuration on the system on which we restore the
process. It might be possible to allow something more fine-grained, if
it is worth the effort. Input on this item is also very welcome,
specifically from someone who knows the exact meaning of the end_brk
field.
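
A userspace sketch of the MMU compatibility test, assuming the two flags
are available as plain integers. check_mmu() is a hypothetical helper;
the message text follows the one used in restore_read_header_arch().

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical model of the MMU check: 'image_mmu' is the flag stored in
 * the checkpoint image, 'kernel_mmu' that of the restoring kernel. On a
 * mismatch a diagnostic is formatted into 'msg' and -EINVAL is returned.
 */
int check_mmu(int image_mmu, int kernel_mmu, char *msg, size_t len)
{
	if (len > 0)
		msg[0] = '\0';

	/* Compare truth values, not raw integers. */
	if (!image_mmu == !kernel_mmu)
		return 0;

	snprintf(msg, len, "checkpoint %s MMU, restore %s MMU",
		 image_mmu ? "with" : "without",
		 kernel_mmu ? "with" : "without");
	return -EINVAL;
}
```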

Adds the sys_checkpoint and sys_restart system calls for ARM:
__NR_checkpoint         367
__NR_restart            368


Cc: rmk@arm.linux.org.uk
Signed-off-by: Christoffer Dall <christofferdall@christofferdall.dk>
Acked-by: Oren Laadan <orenl@cs.columbia.edu>
---
 arch/arm/Kconfig                      |    4 +
 arch/arm/include/asm/checkpoint_hdr.h |   71 +++++++++
 arch/arm/include/asm/ptrace.h         |    1 +
 arch/arm/include/asm/unistd.h         |    2 +
 arch/arm/kernel/Makefile              |    1 +
 arch/arm/kernel/calls.S               |    2 +
 arch/arm/kernel/checkpoint.c          |  276 +++++++++++++++++++++++++++++++++
 arch/arm/kernel/signal.c              |    5 +
 arch/arm/kernel/sys_arm.c             |   13 ++
 include/linux/checkpoint_hdr.h        |    2 +
 10 files changed, 377 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm/include/asm/checkpoint_hdr.h
 create mode 100644 arch/arm/kernel/checkpoint.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 184a6bd..fe83129 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -94,6 +94,10 @@ config HAVE_LATENCYTOP_SUPPORT
 	depends on !SMP
 	default y
 
+config CHECKPOINT_SUPPORT
+	bool
+	default y
+
 config LOCKDEP_SUPPORT
 	bool
 	default y
diff --git a/arch/arm/include/asm/checkpoint_hdr.h b/arch/arm/include/asm/checkpoint_hdr.h
new file mode 100644
index 0000000..c08a4ae
--- /dev/null
+++ b/arch/arm/include/asm/checkpoint_hdr.h
@@ -0,0 +1,71 @@
+#ifndef __ASM_ARM_CKPT_HDR_H
+#define __ASM_ARM_CKPT_HDR_H
+/*
+ *  Checkpoint/restart - architecture specific headers ARM
+ *
+ *  Copyright (C) 2008-2010 Oren Laadan
+ *  Copyright	  2010	    Christoffer Dall
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#ifndef _CHECKPOINT_CKPT_HDR_H_
+#error asm/checkpoint_hdr.h included directly
+#endif
+
+#include <linux/types.h>
+
+/* ARM structure seen from kernel/userspace */
+#ifdef __KERNEL__
+#include <asm/processor.h>
+#endif
+
+#define CKPT_ARCH_ID	CKPT_ARCH_ARM
+
+/* arch dependent constants */
+#define CKPT_ARCH_NSIG  64
+#define CKPT_TTY_NCC  8
+
+#ifdef __KERNEL__
+
+#include <asm/signal.h>
+#if CKPT_ARCH_NSIG != _NSIG
+#error CKPT_ARCH_NSIG size is wrong per asm/signal.h and asm/checkpoint_hdr.h
+#endif
+
+#include <linux/tty.h>
+#if CKPT_TTY_NCC != NCC
+#error CKPT_TTY_NCC size is wrong per asm-generic/termios.h
+#endif
+
+#endif /* __KERNEL__ */
+
+
+struct ckpt_hdr_header_arch {
+	struct ckpt_hdr h;
+	__u32	linux_arm_arch;
+	__u8	mmu;		/* Checkpointed on mmu system */
+	__u8	oabi_compat;	/* Checkpointed on old ABI compat. system */
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_thread {
+	struct ckpt_hdr h;
+	__u32		syscall;
+	__u32		tp_value;
+	__u32		thumbee_state;
+} __attribute__((aligned(8)));
+
+
+struct ckpt_hdr_cpu {
+	struct ckpt_hdr h;
+	__u32		uregs[18];
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_mm_context {
+	struct ckpt_hdr h;
+	__u32		end_brk;
+} __attribute__((aligned(8)));
+
+#endif /* __ASM_ARM_CKPT_HDR_H */
diff --git a/arch/arm/include/asm/ptrace.h b/arch/arm/include/asm/ptrace.h
index eec6e89..624e5d1 100644
--- a/arch/arm/include/asm/ptrace.h
+++ b/arch/arm/include/asm/ptrace.h
@@ -57,6 +57,7 @@
 #define PSR_C_BIT	0x20000000
 #define PSR_Z_BIT	0x40000000
 #define PSR_N_BIT	0x80000000
+#define PSR_GE_BITS	0x000f0000
 
 /*
  * Groups of PSR bits
diff --git a/arch/arm/include/asm/unistd.h b/arch/arm/include/asm/unistd.h
index f295a6c..7ec526e 100644
--- a/arch/arm/include/asm/unistd.h
+++ b/arch/arm/include/asm/unistd.h
@@ -393,6 +393,8 @@
 #define __NR_perf_event_open		(__NR_SYSCALL_BASE+364)
 #define __NR_recvmmsg			(__NR_SYSCALL_BASE+365)
 #define __NR_eclone			(__NR_SYSCALL_BASE+366)
+#define __NR_checkpoint			(__NR_SYSCALL_BASE+367)
+#define __NR_restart			(__NR_SYSCALL_BASE+368)
 
 /*
  * The following SWIs are ARM private.
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index dd00f74..1669065 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -38,6 +38,7 @@ obj-$(CONFIG_ARM_THUMBEE)	+= thumbee.o
 obj-$(CONFIG_KGDB)		+= kgdb.o
 obj-$(CONFIG_ARM_UNWIND)	+= unwind.o
 obj-$(CONFIG_HAVE_TCM)		+= tcm.o
+obj-$(CONFIG_CHECKPOINT)	+= checkpoint.o
 
 obj-$(CONFIG_CRUNCH)		+= crunch.o crunch-bits.o
 AFLAGS_crunch-bits.o		:= -Wa,-mcpu=ep9312
diff --git a/arch/arm/kernel/calls.S b/arch/arm/kernel/calls.S
index 5ef0b03..aefb432 100644
--- a/arch/arm/kernel/calls.S
+++ b/arch/arm/kernel/calls.S
@@ -376,6 +376,8 @@
 		CALL(sys_perf_event_open)
 /* 365 */	CALL(sys_recvmmsg)
 		CALL(sys_eclone_wrapper)
+ 		CALL(sys_checkpoint)
+ 		CALL(sys_restart)
 #ifndef syscalls_counted
 .equ syscalls_padding, ((NR_syscalls + 3) & ~3) - NR_syscalls
 #define syscalls_counted
diff --git a/arch/arm/kernel/checkpoint.c b/arch/arm/kernel/checkpoint.c
new file mode 100644
index 0000000..1c9bb34
--- /dev/null
+++ b/arch/arm/kernel/checkpoint.c
@@ -0,0 +1,276 @@
+/*
+ *  Checkpoint/restart - architecture specific support for ARM
+ *
+ *  Copyright (C) 2008-2010 Oren Laadan
+ *  Copyright (C) 2010	    Christoffer Dall
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+#include <linux/checkpoint.h>
+#include <linux/checkpoint_hdr.h>
+
+#include <asm/processor.h>
+
+
+#ifdef CONFIG_MMU
+	const u8 ckpt_mmu = 1;
+#else
+	const u8 ckpt_mmu = 0;
+#endif
+
+#ifdef CONFIG_OABI_COMPAT
+	const u8 ckpt_oabi_compat = 1;
+#else
+	const u8 ckpt_oabi_compat = 0;
+#endif
+
+
+/**************************************************************************
+ * Checkpoint
+ */
+
+/* dump the thread_struct of a given task */
+int checkpoint_thread(struct ckpt_ctx *ctx, struct task_struct *t)
+{
+	int ret;
+	struct ckpt_hdr_thread *h;
+	struct thread_info *ti = task_thread_info(t);
+
+	h = ckpt_hdr_get_type(ctx, sizeof(*h), CKPT_HDR_THREAD);
+	if (!h)
+		return -ENOMEM;
+
+	/*
+	 * Store the syscall information about the checkpointed process
+	 * as we need to know if the process was doing a syscall (and which)
+	 * during restart.
+	 */
+	h->syscall = ti->syscall;
+
+	/*
+	 * Store remaining thread-specific info.
+	 */
+	h->tp_value = ti->tp_value;
+#ifdef CONFIG_ARM_THUMBEE
+	h->thumbee_state = ti->thumbee_state;
+#else
+	/*
+	 * If restoring on a system with ThumbEE support,
+	 * zero will set ThumbEE state to unused.
+	 */
+	h->thumbee_state = 0;
+#endif
+
+	ret = ckpt_write_obj(ctx, &h->h);
+	ckpt_hdr_put(ctx, h);
+	return ret;
+}
+
+static void save_cpu_regs(struct ckpt_hdr_cpu *h, struct task_struct *t)
+{
+	struct pt_regs *regs = task_pt_regs(t);
+
+	memcpy(&h->uregs, regs, sizeof(h->uregs));
+
+	/*
+	 * for checkpoint in process context (from within a container),
+	 * the actual syscall is taking place at this very moment; so
+	 * we (optimistically) substitute the future return value (0) of
+	 * this syscall into r0, so that upon restart it will
+	 * succeed (or it will endlessly retry checkpoint...)
+	 */
+	if (t == current)
+		h->ARM_r0 = 0;
+}
+
+/* dump the cpu state and registers of a given task */
+int checkpoint_cpu(struct ckpt_ctx *ctx, struct task_struct *t)
+{
+	struct ckpt_hdr_cpu *h;
+	int ret;
+
+	h = ckpt_hdr_get_type(ctx, sizeof(*h), CKPT_HDR_CPU);
+	if (!h)
+		return -ENOMEM;
+
+	save_cpu_regs(h, t);
+
+	ret = ckpt_write_obj(ctx, &h->h);
+	ckpt_hdr_put(ctx, h);
+	return ret;
+}
+
+int checkpoint_write_header_arch(struct ckpt_ctx *ctx)
+{
+	struct ckpt_hdr_header_arch *arch_hdr;
+	int ret;
+
+	arch_hdr = ckpt_hdr_get_type(ctx, sizeof(*arch_hdr),
+				     CKPT_HDR_HEADER_ARCH);
+	if (!arch_hdr)
+		return -ENOMEM;
+
+	arch_hdr->linux_arm_arch = __LINUX_ARM_ARCH__;
+	arch_hdr->mmu = ckpt_mmu;
+	arch_hdr->oabi_compat = ckpt_oabi_compat;
+
+	ret = ckpt_write_obj(ctx, &arch_hdr->h);
+	ckpt_hdr_put(ctx, arch_hdr);
+
+	return ret;
+}
+
+/* dump the mm->context state */
+int checkpoint_mm_context(struct ckpt_ctx *ctx, struct mm_struct *mm)
+{
+	struct ckpt_hdr_mm_context *h;
+	int ret = 0;
+
+	h = ckpt_hdr_get_type(ctx, sizeof(*h), CKPT_HDR_MM_CONTEXT);
+	if (!h)
+		return -ENOMEM;
+
+#ifdef CONFIG_MMU
+	/*
+	 * We do not checkpoint kvm_seq as we do not know of any generally
+	 * exported functionality which would associate an ioremapped VMA
+	 * with a task. A driver might use this functionality, but should
+	 * implement its own checkpoint functionality to deal with this.
+	 */
+#else
+	h->end_brk = mm->context.end_brk;
+#endif
+
+	ret = ckpt_write_obj(ctx, &h->h);
+	ckpt_hdr_put(ctx, h);
+	return ret;
+}
+
+/**************************************************************************
+ * Restart
+ */
+
+/* read the thread_struct into the current task */
+int restore_thread(struct ckpt_ctx *ctx)
+{
+	struct ckpt_hdr_thread *h;
+	int ret = 0;
+	struct thread_info *ti = task_thread_info(current);
+
+	h = ckpt_read_obj_type(ctx, sizeof(*h), CKPT_HDR_THREAD);
+	if (IS_ERR(h))
+		return PTR_ERR(h);
+
+	ti->syscall = h->syscall;
+	ti->tp_value = h->tp_value;
+
+#ifdef CONFIG_ARM_THUMBEE
+	/*
+	 * If the checkpoint system did not support ThumbEE, this field
+	 * will be zero, equivalent to unused ThumbEE state.
+	 */
+	ti->thumbee_state = h->thumbee_state;
+#else
+	if (h->thumbee_state != 0) {
+		ret = -EINVAL;
+		ckpt_err(ctx, ret, "Checkpoint had ThumbEE state but "
+				   "ARM_THUMBEE not configured.");
+	}
+#endif
+
+	ckpt_hdr_put(ctx, h);
+	return ret;
+}
+
+static int load_cpu_regs(struct ckpt_hdr_cpu *h, struct task_struct *t)
+{
+	int i;
+	struct pt_regs *regs = task_pt_regs(t);
+
+	memcpy(regs, &h->uregs, sizeof(struct pt_regs));
+
+	for (i = 0; i < 16; i++)
+		regs->uregs[i] = h->uregs[i];
+
+	/*
+	 * Restore only user-writable bits on the CPSR
+	 */
+	regs->ARM_cpsr = regs->ARM_cpsr |
+			 (h->ARM_cpsr & (PSR_N_BIT | PSR_Z_BIT |
+					 PSR_C_BIT | PSR_V_BIT |
+					 PSR_Q_BIT | PSR_E_BIT |
+					 PSR_GE_BITS));
+	regs->ARM_ORIG_r0 = h->ARM_ORIG_r0;
+
+	return 0;
+}
+
+/* read the cpu state and registers for the current task */
+int restore_cpu(struct ckpt_ctx *ctx)
+{
+	struct ckpt_hdr_cpu *h;
+	struct task_struct *t = current;
+	int ret;
+
+	h = ckpt_read_obj_type(ctx, sizeof(*h), CKPT_HDR_CPU);
+	if (IS_ERR(h))
+		return PTR_ERR(h);
+
+	ret = load_cpu_regs(h, t);
+	ckpt_hdr_put(ctx, h);
+	return ret;
+}
+
+int restore_read_header_arch(struct ckpt_ctx *ctx)
+{
+	struct ckpt_hdr_header_arch *arch_hdr;
+	int ret = -EINVAL;
+
+	arch_hdr = ckpt_read_obj_type(ctx, sizeof(*arch_hdr),
+				      CKPT_HDR_HEADER_ARCH);
+	if (IS_ERR(arch_hdr))
+		return PTR_ERR(arch_hdr);
+
+	if (arch_hdr->linux_arm_arch != __LINUX_ARM_ARCH__) {
+		ckpt_err(ctx, ret, "incompatible ARM architecture versions");
+		goto out;
+	}
+
+	/* TODO: Maybe compatibility can be more fine-grained */
+	if (arch_hdr->mmu != ckpt_mmu) {
+		ckpt_err(ctx, ret, "checkpoint %s MMU, restore %s MMU",
+			arch_hdr->mmu ? "with" : "without",
+			ckpt_mmu ? "with" : "without");
+		goto out;
+	}
+
+	ret = 0;
+
+	if (arch_hdr->oabi_compat && !ckpt_oabi_compat) {
+		ckpt_msg(ctx, "warning: process may have used old ABI. "
+			      "CONFIG_OABI_COMPAT not set.");
+	}
+
+out:
+	ckpt_hdr_put(ctx, arch_hdr);
+	return ret;
+}
+
+int restore_mm_context(struct ckpt_ctx *ctx, struct mm_struct *mm)
+{
+	struct ckpt_hdr_mm_context *h;
+	int ret = 0;
+
+	h = ckpt_read_obj_type(ctx, sizeof(*h), CKPT_HDR_MM_CONTEXT);
+	if (IS_ERR(h))
+		return PTR_ERR(h);
+
+#if !CONFIG_MMU
+	mm->context.end_brk = h->end_brk;
+#endif
+
+	ckpt_hdr_put(ctx, h);
+	return ret;
+}
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index f695239..b42c39a 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -688,6 +688,11 @@ static void do_signal(struct pt_regs *regs)
 	single_step_set(current);
 }
 
+int task_has_saved_sigmask(struct task_struct *task)
+{
+	return !!(task_thread_info(task)->flags & _TIF_RESTORE_SIGMASK);
+}
+
 asmlinkage void
 do_notify_resume(struct pt_regs *regs, unsigned int thread_flags)
 {
diff --git a/arch/arm/kernel/sys_arm.c b/arch/arm/kernel/sys_arm.c
index fd8199d..eb178ad 100644
--- a/arch/arm/kernel/sys_arm.c
+++ b/arch/arm/kernel/sys_arm.c
@@ -27,6 +27,7 @@
 #include <linux/file.h>
 #include <linux/ipc.h>
 #include <linux/uaccess.h>
+#include <linux/checkpoint.h>
 
 struct mmap_arg_struct {
 	unsigned long addr;
@@ -295,3 +296,15 @@ asmlinkage long sys_arm_fadvise64_64(int fd, int advice,
 {
 	return sys_fadvise64_64(fd, offset, len, advice);
 }
+
+asmlinkage long sys_checkpoint(unsigned long pid, unsigned long fd,
+			       unsigned long flags, unsigned long logfd)
+{
+	return do_sys_checkpoint(pid, fd, flags, logfd);
+}
+
+asmlinkage long sys_restart(unsigned long pid, unsigned long fd,
+			    unsigned long flags, unsigned long logfd)
+{
+	return do_sys_restart(pid, fd, flags, logfd);
+}
diff --git a/include/linux/checkpoint_hdr.h b/include/linux/checkpoint_hdr.h
index 41412d1..8309a3b 100644
--- a/include/linux/checkpoint_hdr.h
+++ b/include/linux/checkpoint_hdr.h
@@ -202,6 +202,8 @@ enum {
 #define CKPT_ARCH_PPC32 CKPT_ARCH_PPC32
 	CKPT_ARCH_PPC64,
 #define CKPT_ARCH_PPC64 CKPT_ARCH_PPC64
+	CKPT_ARCH_ARM,
+#define CKPT_ARCH_ARM CKPT_ARCH_ARM
 };
 
 /* shared objrects (objref) */
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 3/3] c/r: ARM implementation of checkpoint/restart
  2010-03-22  1:06   ` Christoffer Dall
@ 2010-03-23 16:09     ` Serge E. Hallyn
  -1 siblings, 0 replies; 80+ messages in thread
From: Serge E. Hallyn @ 2010-03-23 16:09 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: containers, linux-arm-kernel, linux-kernel, rmk

Quoting Christoffer Dall (christofferdall@christofferdall.dk):
> Implements architecture specific requirements for checkpoint/restart on
> ARM. The changes touch almost only c/r related code. Most of the work is
> done in arch/arm/checkpoint.c, which implements checkpointing of the CPU
> and necessary fields on the thread_info struct.
> 
> The ISA version (given by __LINUX_ARM_ARCH__) is checkpointed and verified
> against the machine architecture on restart. If they differ, an error is
> raised and restart aborted. It should be possible to restart on newer
> architectures, but further investigation is warranted.
> 
> Regarding ThumbEE, the thumbee_state field on the thread_info is stored
> in checkpoints when CONFIG_ARM_THUMBEE and 0 is stored otherwise. If
> a value different than 0 is checkpointed and CONFIG_ARM_THUMBEE is not
> set on the restore system, the restore is aborted. Feedback on this
> implementation is very welcome.
> 
> We checkpoint whether the system is running with CONFIG_MMU or not and
> require the same configuration for the system on which we restore the
> process. It might be possible to allow something more fine-grained,
> if it's worth the energy. Input on this item is also very welcome,
> specifically from someone who knows the exact meaning of the end_brk
> field.
> 
> Added support for syscall sys_checkpoint and sys_restart for ARM:
> __NR_checkpoint         367
> __NR_restart            368
> 
> 
> Cc: rmk@arm.linux.org.uk
> Signed-off-by: Christoffer Dall <christofferdall@christofferdall.dk>
> Acked-by: Oren Laadan <orenl@cs.columbia.edu>

In terms of the c/r API I don't see any problems. Two nits below,
but in any case

Acked-by: Serge Hallyn <serue@us.ibm.com>

thanks, this is really cool, especially how minimal it is :)
-serge

...

> +static int load_cpu_regs(struct ckpt_hdr_cpu *h, struct task_struct *t)
> +{
> +	int i;
> +	struct pt_regs *regs = task_pt_regs(t);
> +
> +	memcpy(regs, &h->uregs, sizeof(struct pt_regs));
> +
> +	for (i = 0; i < 16; i++)
> +		regs->uregs[i] = h->uregs[i];
> +
> +	/*
> +	 * Restore only user-writable bits on the CPSR
> +	 */
> +	regs->ARM_cpsr = regs->ARM_cpsr |
> +			 (h->ARM_cpsr & (PSR_N_BIT | PSR_Z_BIT |
> +					 PSR_C_BIT | PSR_V_BIT |
> +					 PSR_V_BIT | PSR_Q_BIT |
> +					 PSR_E_BIT | PSR_GE_BITS));
> +	regs->ARM_ORIG_r0 = h->ARM_ORIG_r0;
> +
> +	return 0;
> +}
> +
> +/* read the cpu state and registers for the current task */
> +int restore_cpu(struct ckpt_ctx *ctx)
> +{
> +	struct ckpt_hdr_cpu *h;
> +	struct task_struct *t = current;
> +	int ret;
> +
> +	h = ckpt_read_obj_type(ctx, sizeof(*h), CKPT_HDR_CPU);
> +	if (IS_ERR(h))
> +		return PTR_ERR(h);
> +
> +	ret = load_cpu_regs(h, t);

Will load_cpu_regs() ever be changed to return anything but 0? If
not, both functions can be simplified.
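
As a rough sketch of that simplification, here is a user-space model (not
kernel code) in which the register-loading helper returns void and the
caller returns 0 directly. The model_* types, the MODEL_* index names and
the MODEL_PSR_USER_BITS mask are stand-ins invented for this illustration;
the mask is built from the PSR bit values in arch/arm/include/asm/ptrace.h
plus the PSR_GE_BITS value added by this patch. One assumption to flag: the
sketch clears the user-writable CPSR bits before merging in the
checkpointed ones, which is one reading of the "restore only user-writable
bits" comment and is not what the quoted code does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the kernel types; only the register array is modelled. */
struct model_pt_regs { uint32_t uregs[18]; };
struct model_hdr_cpu { uint32_t uregs[18]; };

/* Slots 16 and 17 hold cpsr and orig_r0, as in ARM's struct pt_regs. */
enum { MODEL_CPSR = 16, MODEL_ORIG_R0 = 17 };

/* N, Z, C, V, Q, E and GE[3:0]: the bits the patch treats as user-writable. */
#define MODEL_PSR_USER_BITS \
	(0x80000000u | 0x40000000u | 0x20000000u | 0x10000000u | \
	 0x08000000u | 0x00000200u | 0x000f0000u)

/* Nothing in here can fail, so there is no status to return. */
static void model_load_cpu_regs(const struct model_hdr_cpu *h,
				struct model_pt_regs *regs)
{
	int i;

	assert(h && regs);

	for (i = 0; i < 16; i++)
		regs->uregs[i] = h->uregs[i];

	/* Assumption: replace only the user-writable bits, keep the rest. */
	regs->uregs[MODEL_CPSR] =
		(regs->uregs[MODEL_CPSR] & ~MODEL_PSR_USER_BITS) |
		(h->uregs[MODEL_CPSR] & MODEL_PSR_USER_BITS);
	regs->uregs[MODEL_ORIG_R0] = h->uregs[MODEL_ORIG_R0];
}

/* The caller no longer needs a 'ret' variable for the helper's result. */
static int model_restore_cpu(const struct model_hdr_cpu *h,
			     struct model_pt_regs *regs)
{
	if (!h)
		return -1;	/* stands in for the PTR_ERR(h) path */

	model_load_cpu_regs(h, regs);
	return 0;
}
```

With the helper returning void, the only error path left in the caller is
the failed header read.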

...

> +int restore_mm_context(struct ckpt_ctx *ctx, struct mm_struct *mm)
> +{
> +	struct ckpt_hdr_mm_context *h;
> +	int ret = 0;
> +
> +	h = ckpt_read_obj_type(ctx, sizeof(*h), CKPT_HDR_MM_CONTEXT);
> +	if (IS_ERR(h))
> +		return PTR_ERR(h);
> +
> +#if !CONFIG_MMU
> +	mm->context.end_brk = h->end_brk;
> +#endif
> +
> +	ckpt_hdr_put(ctx, h);
> +	return ret;

Again, ret doesn't seem to be needed here.

-serge

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
  2010-03-22  1:06   ` Christoffer Dall
@ 2010-03-23 20:53     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 80+ messages in thread
From: Russell King - ARM Linux @ 2010-03-23 20:53 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: containers, linux-arm-kernel, linux-kernel, Roland McGrath

On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
> This small commit introduces a global state of system calls for ARM
> making it possible for a debugger or checkpointing to gain information
> about another process' state with respect to system calls.

I don't particularly like the idea that we always store the syscall
number to memory for every system call, whether the stored version is
used or not.

Since ARM caches are generally not write-allocate, this means mostly
write-only variables can have a higher-than-expected cost.

Is there not some thread flag which can be checked to see if we need to
store the syscall number?
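
To make the question concrete, here is a small user-space model of a
flag-guarded store. It is only a sketch: MODEL_TIF_SYSCALL_RECORD is a
hypothetical flag name, struct model_thread_info is a stand-in for the
kernel's thread_info, and the real store would presumably sit in the
syscall entry path rather than in a C helper like this one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread flag: "someone wants the syscall number". */
#define MODEL_TIF_SYSCALL_RECORD (1u << 0)

/* Stand-in for the two thread_info fields involved. */
struct model_thread_info {
	uint32_t flags;
	int32_t  syscall;
};

/*
 * Models syscall entry: the flags word is read on this path anyway, so
 * the extra cost in the common case is a test and a branch instead of a
 * store to a mostly write-only field.
 */
static void model_syscall_entry(struct model_thread_info *ti, int32_t scno)
{
	assert(ti);

	if (ti->flags & MODEL_TIF_SYSCALL_RECORD)
		ti->syscall = scno;
}
```

The trade-off in this model is that the flag has to be set before the
thread enters the syscall whose number is wanted; whether that is workable
for checkpointing a task that is already blocked is left open here.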

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 2/3] ARM: Add the eclone system call
       [not found]   ` <1269219965-23923-3-git-send-email-christofferdall-77OGu6e99YhyO3AAkE1OcX9LOBIZ5rWg@public.gmane.org>
@ 2010-03-23 21:06     ` Russell King - ARM Linux
  0 siblings, 0 replies; 80+ messages in thread
From: Russell King - ARM Linux @ 2010-03-23 21:06 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: containers, Sukadev Bhattiprolu, libc-ports, linux-kernel,
	linux-arm-kernel

On Sun, Mar 21, 2010 at 09:06:04PM -0400, Christoffer Dall wrote:
> In addition to doing everything that clone() system call does, the
> eclone() system call:

Some comments...

> +sys_eclone_wrapper:
> +		add	ip, sp, #S_OFF
> +		str	ip, [sp, #0]
> +		b	sys_eclone
> +ENDPROC(sys_eclone_wrapper)

I'm curious why, if you want the entire set of registers, you don't just
do:
		add	r0, sp, #S_OFF
		b	sys_eclone

and load the syscall arguments out of regs->ARM_foo.  This avoids the need
for additional stores.
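
The C side of that suggestion might look roughly like the sketch below. It is a user-space illustration only: the `struct pt_regs` here is a stand-in that mirrors the `uregs[]` layout and `ARM_rN` accessor macros of the ARM port, while `struct eclone_args_sketch` and `eclone_decode_args()` are hypothetical names. The argument-to-register mapping assumes the four arguments arrive in r0-r3, as in the posted prototype.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in mirroring the layout of ARM's struct pt_regs. */
struct pt_regs {
	unsigned long uregs[18];
};
#define ARM_r0	uregs[0]
#define ARM_r1	uregs[1]
#define ARM_r2	uregs[2]
#define ARM_r3	uregs[3]
#define ARM_sp	uregs[13]

/* Hypothetical container for the decoded syscall arguments. */
struct eclone_args_sketch {
	unsigned long flags_low;
	unsigned long uca;	/* user address of struct clone_args */
	int args_size;
	unsigned long pids;	/* user address of the pid array */
};

/*
 * With regs as the only parameter, the arguments are read back from
 * the saved register frame rather than being passed separately.
 */
static void eclone_decode_args(const struct pt_regs *regs,
			       struct eclone_args_sketch *out)
{
	assert(regs != NULL && out != NULL);
	out->flags_low = regs->ARM_r0;
	out->uca = regs->ARM_r1;
	out->args_size = (int)regs->ARM_r2;
	out->pids = regs->ARM_r3;
}
```

The wrapper then needs no store to the stack, because the frame it points at already holds r0-r3 from syscall entry.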

> +
>  sys_sigreturn_wrapper:
>  		add	r0, sp, #S_OFF
>  		b	sys_sigreturn
> diff --git a/arch/arm/kernel/sys_arm.c b/arch/arm/kernel/sys_arm.c
> index ae4027b..fd8199d 100644
> --- a/arch/arm/kernel/sys_arm.c
> +++ b/arch/arm/kernel/sys_arm.c
> @@ -183,6 +183,45 @@ asmlinkage int sys_clone(unsigned long clone_flags, unsigned long newsp,
>  	return do_fork(clone_flags, newsp, regs, 0, parent_tidptr, child_tidptr);
>  }
>  
> +asmlinkage int sys_eclone(unsigned flags_low, struct clone_args __user *uca,
> +			  int args_size, pid_t __user *pids,
> +			  struct pt_regs *regs)
> +{
> +	int rc;
> +	struct clone_args kca;
> +	unsigned long flags;
> +	int __user *parent_tidp;
> +	int __user *child_tidp;
> +	unsigned long __user stack;

__user on an integer type doesn't make any sense; integer types do not
have address spaces.

> +	unsigned long stack_size;
> +
> +	rc = fetch_clone_args_from_user(uca, args_size, &kca);
> +	if (rc)
> +		return rc;
> +
> +	/*
> +	 * TODO: Convert 'clone-flags' to 64-bits on all architectures.
> +	 * TODO: When ->clone_flags_high is non-zero, copy it in to the
> +	 * 	 higher word(s) of 'flags':
> +	 *
> +	 * 		flags = (kca.clone_flags_high << 32) | flags_low;
> +	 */
> +	flags = flags_low;
> +	parent_tidp = (int *)(unsigned long)kca.parent_tid_ptr;
> +	child_tidp = (int *)(unsigned long)kca.child_tid_ptr;

This will produce sparse errors.  Is there a reason why 'clone_args'
tid pointers aren't already pointers marked with __user ?
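
For illustration, the two annotation points above can be sketched as follows. The `__user` definition mirrors the kernel's sparse-only annotation; `u64_to_user_int_ptr()` is a hypothetical helper, and the assumption that the `clone_args` pointer fields are 64-bit integers is inferred from the double cast in the patch rather than confirmed here.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the kernel's definition: an annotation seen only by sparse. */
#ifdef __CHECKER__
# define __user __attribute__((noderef, address_space(1)))
#else
# define __user
#endif

/*
 * Hypothetical helper: turn a 64-bit ABI field into a user pointer.
 * The cast names the address space explicitly, so the assignment to an
 * 'int __user *' variable does not mix address spaces. A plain integer
 * such as the stack address carries no __user annotation at all.
 */
static int __user *u64_to_user_int_ptr(uint64_t val)
{
	return (int __user *)(uintptr_t)val;
}
```

Casting to plain `(int *)` and then assigning to an `int __user *` variable is the pattern sparse is expected to flag, since the two pointer types live in different address spaces.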

> +
> +	stack_size = (unsigned long)kca.child_stack_size;

Shouldn't this already be of integer type?

> +	if (stack_size)
> +		return -EINVAL;

So the stack must have a zero size?  Is this missing a '!' ?

> +
> +	stack = (unsigned long)kca.child_stack;
> +	if (!stack)
> +		stack = regs->ARM_sp;
> +
> +	return do_fork_with_pids(flags, stack, regs, stack_size, parent_tidp,
> +				child_tidp, kca.nr_pids, pids);

Hmm, so let me get this syscall interface right.  We have some arguments
passed in registers and others via a (variable sized?) structure.  It seems
really weird to have, eg, a pointer to the pids and the number of pids
passed in two separate ways.

The grouping between what's passed in registers and via this clone_args
structure seems to be random.  Can it be sanitized?

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 3/3] c/r: ARM implementation of checkpoint/restart
       [not found]   ` <1269219965-23923-4-git-send-email-christofferdall-77OGu6e99YhyO3AAkE1OcX9LOBIZ5rWg@public.gmane.org>
  2010-03-23 16:09     ` Serge E. Hallyn
@ 2010-03-23 21:18     ` Russell King - ARM Linux
  1 sibling, 0 replies; 80+ messages in thread
From: Russell King - ARM Linux @ 2010-03-23 21:18 UTC (permalink / raw)
  To: Christoffer Dall; +Cc: containers, linux-kernel, linux-arm-kernel

On Sun, Mar 21, 2010 at 09:06:05PM -0400, Christoffer Dall wrote:
> The ISA version (given by __LINUX_ARM_ARCH__) is checkpointed and verified
> against the machine architecture on restart.

I think you misunderstand what __LINUX_ARM_ARCH__ signifies.  It is the
build architecture for the kernel, and it indicates the lowest
architecture version that the kernel will run on.

That doesn't indicate what ISA version the system is running on, or even
if the ABI is compatible (we have two ABIs - OABI and EABI).

There's also the matter of FP implementation - whether it is VFP or FPA,
and whether iwMMXt is available or not.  (iwMMXt precludes the use of
FPA.)
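
To make the distinction concrete, a restart-time check would have to compare properties detected at run time rather than a build-time constant. The sketch below shows one possible policy under stated assumptions: every name in it (`struct cpu_desc_sketch`, the `FEAT_*` bits, `restore_compatible()`) is hypothetical, the feature bits do not correspond to real HWCAP values, and the "target must be at least as new" rule is an illustrative choice, not something proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum abi_sketch { ABI_OABI, ABI_EABI };

/* Hypothetical feature bits; not real HWCAP values. */
#define FEAT_VFP	(1u << 0)
#define FEAT_FPA	(1u << 1)
#define FEAT_IWMMXT	(1u << 2)

/* What a checkpoint image would need to record about the CPU. */
struct cpu_desc_sketch {
	unsigned int arch_version;	/* detected at run time, e.g. 5, 6, 7 */
	enum abi_sketch abi;
	unsigned int features;
};

/*
 * Illustrative policy: accept the target if it is at least as new as
 * the checkpointed CPU, uses the same ABI, and offers every feature
 * that was available when the checkpoint was taken.
 */
static bool restore_compatible(const struct cpu_desc_sketch *saved,
			       const struct cpu_desc_sketch *target)
{
	assert(saved != NULL && target != NULL);
	if (target->arch_version < saved->arch_version)
		return false;
	if (target->abi != saved->abi)
		return false;
	/* Every feature recorded at checkpoint must exist on the target. */
	return (saved->features & ~target->features) == 0;
}
```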

> Regarding ThumbEE, the thumbee_state field on the thread_info is stored
> in checkpoints when CONFIG_ARM_THUMBEE and 0 is stored otherwise. If
> a value different than 0 is checkpointed and CONFIG_ARM_THUMBEE is not
> set on the restore system, the restore is aborted. Feedback on this
> implementation is very welcome.

I don't recognise this configuration symbol; it doesn't exist in mainline.

> We checkpoint whether the system is running with CONFIG_MMU or not and
> require the same configuration for the system on which we restore the
> process. It might be possible to allow something more fine-grained,
> if it's worth the energy. Input on this item is also very welcome,
> specifically from someone who knows the exact meaning of the end_brk
> field.

Processes which run on MMU and non-MMU CPUs are unlikely to be
interchangeable - the run time environments are quite different.  I
think this is a sane check.

> +/* dump the thread_struct of a given task */
> +int checkpoint_thread(struct ckpt_ctx *ctx, struct task_struct *t)
> +{
> +	int ret;
> +	struct ckpt_hdr_thread *h;
> +	struct thread_info *ti = task_thread_info(t);
> +
> +	h = ckpt_hdr_get_type(ctx, sizeof(*h), CKPT_HDR_THREAD);
> +	if (!h)
> +		return -ENOMEM;
> +
> +	/*
> +	 * Store the syscall information about the checkpointed process
> +	 * as we need to know if the process was doing a syscall (and which)
> +	 * during restart.
> +	 */
> +	h->syscall = ti->syscall;
> +
> +	/*
> +	 * Store remaining thread-specific info.
> +	 */
> +	h->tp_value = ti->tp_value;

How do you safely obtain consistent information from a thread?  Do you
temporarily stop it?

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 3/3] c/r: ARM implementation of checkpoint/restart
       [not found]     ` <20100323211843.GC19572-l+eeeJia6m9vn6HldHNs0ANdhmdF6hFW@public.gmane.org>
@ 2010-03-24  1:53       ` Matt Helsley
  2010-03-24 20:48       ` Christoffer Dall
  1 sibling, 0 replies; 80+ messages in thread
From: Matt Helsley @ 2010-03-24  1:53 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: containers, Christoffer Dall, linux-kernel, linux-arm-kernel

On Tue, Mar 23, 2010 at 09:18:43PM +0000, Russell King - ARM Linux wrote:

<snip> (sorry -- I'm not familiar with ARM so I can't respond to those)

> > +/* dump the thread_struct of a given task */
> > +int checkpoint_thread(struct ckpt_ctx *ctx, struct task_struct *t)
> > +{
> > +	int ret;
> > +	struct ckpt_hdr_thread *h;
> > +	struct thread_info *ti = task_thread_info(t);
> > +
> > +	h = ckpt_hdr_get_type(ctx, sizeof(*h), CKPT_HDR_THREAD);
> > +	if (!h)
> > +		return -ENOMEM;
> > +
> > +	/*
> > +	 * Store the syscall information about the checkpointed process
> > +	 * as we need to know if the process was doing a syscall (and which)
> > +	 * during restart.
> > +	 */
> > +	h->syscall = ti->syscall;
> > +
> > +	/*
> > +	 * Store remaining thread-specific info.
> > +	 */
> > +	h->tp_value = ti->tp_value;
> 
> How do you safely obtain consistent information from a thread?  Do you
> temporarily stop it?

It must be frozen with the cgroup freezer (which reuses the suspend freezer).
sys_checkpoint moves the cgroup freezer into the CHECKPOINTING state which
prevents tasks in that group from being thawed until just before checkpoint
returns.
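
The state handling described above can be pictured as a small state machine. This is a simplified user-space sketch, not the freezer code: the enum and function names are hypothetical, and returning -EBUSY for a refused transition is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified freezer states; CHECKPOINTING is the extra c/r state. */
enum freezer_state_sketch { THAWED, FREEZING, FROZEN, CHECKPOINTING };

/* A checkpoint may only start once every task in the group is frozen. */
static int begin_checkpoint(enum freezer_state_sketch *st)
{
	assert(st != NULL);
	if (*st != FROZEN)
		return -EBUSY;
	*st = CHECKPOINTING;
	return 0;
}

/* Drop back to FROZEN just before the checkpoint call returns. */
static void end_checkpoint(enum freezer_state_sketch *st)
{
	assert(st != NULL);
	if (*st == CHECKPOINTING)
		*st = FROZEN;
}

/* Thawing is refused while a checkpoint is in progress. */
static int thaw(enum freezer_state_sketch *st)
{
	assert(st != NULL);
	if (*st == CHECKPOINTING)
		return -EBUSY;
	*st = THAWED;
	return 0;
}
```

Because the tasks cannot run between `begin_checkpoint()` and `end_checkpoint()`, fields such as `ti->syscall` and `ti->tp_value` are stable while they are copied.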

Cheers,
	-Matt Helsley

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
       [not found]     ` <20100323205342.GA19572-l+eeeJia6m9vn6HldHNs0ANdhmdF6hFW@public.gmane.org>
@ 2010-03-24  2:03       ` Matt Helsley
  0 siblings, 0 replies; 80+ messages in thread
From: Matt Helsley @ 2010-03-24  2:03 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: linux-arm-kernel, containers, Christoffer Dall, linux-kernel,
	Roland McGrath

On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux wrote:
> On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
> > This small commit introduces a global state of system calls for ARM
> > making it possible for a debugger or checkpointing to gain information
> > about another process' state with respect to system calls.
> 
> I don't particularly like the idea that we always store the syscall
> number to memory for every system call, whether the stored version is
> used or not.
> 
> Since ARM caches are generally not write allocate, this means mostly
> write-only variables can have a higher than expected expense.
> 
> Is there not some thread flag which can be checked to see if we need to
> store the syscall number?

Perhaps before we freeze the task we can save the syscall number on ARM.
The patches suggest that the signal delivery path -- which the freezer
utilizes -- has the syscall number already.

Should work since the threads must be frozen first anyway.

Cheers,
	-Matt Helsley

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
  2010-03-24  2:03       ` Matt Helsley
  (?)
@ 2010-03-24  4:57           ` Oren Laadan
  -1 siblings, 0 replies; 80+ messages in thread
From: Oren Laadan @ 2010-03-24  4:57 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Russell King - ARM Linux, containers, linux-kernel,
	Christoffer Dall, linux-arm-kernel, Roland McGrath

On Tue, 23 Mar 2010, Matt Helsley wrote:

> On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux wrote:
> > On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
> > > This small commit introduces a global state of system calls for ARM
> > > making it possible for a debugger or checkpointing to gain information
> > > about another process' state with respect to system calls.
> > 
> > I don't particularly like the idea that we always store the syscall
> > number to memory for every system call, whether the stored version is
> > used or not.
> > 
> > Since ARM caches are generally not write allocate, this means mostly
> > write-only variables can have a higher than expected expense.
> > 
> > Is there not some thread flag which can be checked to see if we need to
> > store the syscall number?
> 
> Perhaps before we freeze the task we can save the syscall number on ARM.
> The patches suggest that the signal delivery path -- which the freezer
> utilizes -- has the syscall number already.
> 
> Should work since the threads must be frozen first anyway.

I like the idea.

However, would it also work for those cases when the freezing does not 
occur from the signal delivery path - e.g. for vfork and ptraced tasks ?

Oren.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
       [not found]           ` <Pine.LNX.4.64.1003240055050.5867-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org>
@ 2010-03-24 14:02             ` Matt Helsley
  0 siblings, 0 replies; 80+ messages in thread
From: Matt Helsley @ 2010-03-24 14:02 UTC (permalink / raw)
  To: Oren Laadan
  Cc: Russell King - ARM Linux, Roland McGrath, containers,
	linux-kernel, Christoffer Dall, linux-arm-kernel

On Wed, Mar 24, 2010 at 12:57:46AM -0400, Oren Laadan wrote:
> On Tue, 23 Mar 2010, Matt Helsley wrote:
> 
> > On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux wrote:
> > > On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
> > > > This small commit introduces a global state of system calls for ARM
> > > > making it possible for a debugger or checkpointing to gain information
> > > > about another process' state with respect to system calls.
> > > 
> > > I don't particularly like the idea that we always store the syscall
> > > number to memory for every system call, whether the stored version is
> > > used or not.
> > > 
> > > Since ARM caches are generally not write allocate, this means mostly
> > > write-only variables can have a higher than expected expense.
> > > 
> > > Is there not some thread flag which can be checked to see if we need to
> > > store the syscall number?
> > 
> > Perhaps before we freeze the task we can save the syscall number on ARM.
> > The patches suggest that the signal delivery path -- which the freezer
> > utilizes -- has the syscall number already.
> > 
> > Should work since the threads must be frozen first anyway.
> 
> I like the idea.
> 
> However, would it also work for those cases when the freezing does not 
> occur from the signal delivery path - e.g. for vfork and ptraced tasks ?

We could just as easily set it before the vfork uninterruptible completion.
I don't know about ptracing, though.

Cheers,
	-Matt Helsley

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
  2010-03-24  4:57           ` Oren Laadan
@ 2010-03-24 14:02             ` Matt Helsley
  -1 siblings, 0 replies; 80+ messages in thread
From: Matt Helsley @ 2010-03-24 14:02 UTC (permalink / raw)
  To: Oren Laadan
  Cc: Matt Helsley, Russell King - ARM Linux, linux-arm-kernel,
	containers, Christoffer Dall, linux-kernel, Roland McGrath

On Wed, Mar 24, 2010 at 12:57:46AM -0400, Oren Laadan wrote:
> On Tue, 23 Mar 2010, Matt Helsley wrote:
> 
> > On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux wrote:
> > > On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
> > > > This small commit introduces a global state of system calls for ARM
> > > > making it possible for a debugger or checkpointing to gain information
> > > > about another process' state with respect to system calls.
> > > 
> > > I don't particularly like the idea that we always store the syscall
> > > number to memory for every system call, whether the stored version is
> > > used or not.
> > > 
> > > Since ARM caches are generally not write allocate, this means mostly
> > > write-only variables can have a higher than expected expense.
> > > 
> > > Is there not some thread flag which can be checked to see if we need to
> > > store the syscall number?
> > 
> > Perhaps before we freeze the task we can save the syscall number on ARM.
> > The patches suggest that the signal delivery path -- which the freezer
> > utilizes -- has the syscall number already.
> > 
> > Should work since the threads must be frozen first anyway.
> 
> I like the idea.
> 
> However, would it also work for those cases when the freezing does not 
> occur from the signal delivery path - e.g. for vfork and ptraced tasks ?

We could just as easily set it before the vfork uninterruptible completion.
I don't know about ptracing, though.

Cheers,
	-Matt Helsley

^ permalink raw reply	[flat|nested] 80+ messages in thread

* [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
@ 2010-03-24 14:02             ` Matt Helsley
  0 siblings, 0 replies; 80+ messages in thread
From: Matt Helsley @ 2010-03-24 14:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Mar 24, 2010 at 12:57:46AM -0400, Oren Laadan wrote:
> On Tue, 23 Mar 2010, Matt Helsley wrote:
> 
> > On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux wrote:
> > > On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
> > > > This small commit introduces a global state of system calls for ARM
> > > > making it possible for a debugger or checkpointing to gain information
> > > > about another process' state with respect to system calls.
> > > 
> > > I don't particularly like the idea that we always store the syscall
> > > number to memory for every system call, whether the stored version is
> > > used or not.
> > > 
> > > Since ARM caches are generally not write allocate, this means mostly
> > > write-only variables can have a higher than expected expense.
> > > 
> > > Is there not some thread flag which can be checked to see if we need to
> > > store the syscall number?
> > 
> > Perhaps before we freeze the task we can save the syscall number on ARM.
> > The patches suggest that the signal delivery path -- which the freezer
> > utilizes -- has the syscall number already.
> > 
> > Should work since the threads must be frozen first anyway.
> 
> I like the idea.
> 
> However, would it also work for those cases when the freezing does not 
> occur from the signal delivery path - e.g. for vfork and ptraced tasks ?

We could just as easily set it before the vfork uninterruptible completion.
I don't know about ptracing, though.

Cheers,
	-Matt Helsley

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
       [not found]             ` <20100324140252.GC5704-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
@ 2010-03-24 15:53               ` Oren Laadan
  0 siblings, 0 replies; 80+ messages in thread
From: Oren Laadan @ 2010-03-24 15:53 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Russell King - ARM Linux, containers, linux-kernel,
	Christoffer Dall, linux-arm-kernel, Roland McGrath



Matt Helsley wrote:
> On Wed, Mar 24, 2010 at 12:57:46AM -0400, Oren Laadan wrote:
>> On Tue, 23 Mar 2010, Matt Helsley wrote:
>>
>>> On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux wrote:
>>>> On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
>>>>> This small commit introduces a global state of system calls for ARM
>>>>> making it possible for a debugger or checkpointing to gain information
>>>>> about another process' state with respect to system calls.
>>>> I don't particularly like the idea that we always store the syscall
>>>> number to memory for every system call, whether the stored version is
>>>> used or not.
>>>>
>>>> Since ARM caches are generally not write allocate, this means mostly
>>>> write-only variables can have a higher than expected expense.
>>>>
>>>> Is there not some thread flag which can be checked to see if we need to
>>>> store the syscall number?
>>> Perhaps before we freeze the task we can save the syscall number on ARM.
>>> The patches suggest that the signal delivery path -- which the freezer
>>> utilizes -- has the syscall number already.

Actually, the signal path doesn't have the syscall number, it has
a binary "in syscall" value.

>>>
>>> Should work since the threads must be frozen first anyway.
>> I like the idea.
>>
>> However, would it also work for those cases when the freezing does not 
>> occur from the signal delivery path - e.g. for vfork and ptraced tasks ?
> 
> We could just as easily set it before the vfork uninterruptible completion.
> ptracing I'd don't know about though.
> 

vfork() uses freezer_do_not_count() to tell the freezer that it's
effectively frozen. It's also used by drivers/char/apm-emulation.c

Looking at calls to ptrace_notify(), ptrace_stop() and ptrace_event(),
there are several places where a ptraced task can stop with TASK_TRACED
(which is good enough for the freezer), outside the signal handling
path.

This means that recording the syscall number for all these cases is
going to be tedious and intrusive.

I prefer to somehow figure out the syscall from the task's state or
pt_regs, or by (re)using the same assembly code that already does that.
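
A rough sketch of what deriving the number from saved state could look
like, assuming the caller has already fetched the 32-bit SWI instruction
word that the task trapped on and the saved r7. The helper name
decode_syscall_nr() and the SWI_IMM_MASK macro are hypothetical and not
part of any posted patch. It only covers ARM-mode code and leaves any
OABI base offset in the returned value:

```c
#include <stdint.h>

/* Low 24 bits of an ARM-mode SWI instruction hold its immediate. */
#define SWI_IMM_MASK 0x00ffffffu

/*
 * Hypothetical helper, for illustration only.
 * An immediate of zero means an EABI-style call, where the syscall
 * number was passed in r7. A non-zero immediate means the number is
 * encoded in the instruction itself (legacy ABI), and it is returned
 * as-is, including any ABI base offset.
 */
static long decode_syscall_nr(uint32_t swi_instr, uint32_t r7)
{
	uint32_t imm = swi_instr & SWI_IMM_MASK;

	if (imm == 0)
		return (long)r7;
	return (long)imm;
}
```

Thumb mode and big-endian instruction fetch would need extra handling
that this sketch does not attempt.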

Oren.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
  2010-03-24 14:02             ` Matt Helsley
@ 2010-03-24 15:53               ` Oren Laadan
  -1 siblings, 0 replies; 80+ messages in thread
From: Oren Laadan @ 2010-03-24 15:53 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Russell King - ARM Linux, linux-arm-kernel, containers,
	Christoffer Dall, linux-kernel, Roland McGrath



Matt Helsley wrote:
> On Wed, Mar 24, 2010 at 12:57:46AM -0400, Oren Laadan wrote:
>> On Tue, 23 Mar 2010, Matt Helsley wrote:
>>
>>> On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux wrote:
>>>> On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
>>>>> This small commit introduces a global state of system calls for ARM
>>>>> making it possible for a debugger or checkpointing to gain information
>>>>> about another process' state with respect to system calls.
>>>> I don't particularly like the idea that we always store the syscall
>>>> number to memory for every system call, whether the stored version is
>>>> used or not.
>>>>
>>>> Since ARM caches are generally not write allocate, this means mostly
>>>> write-only variables can have a higher than expected expense.
>>>>
>>>> Is there not some thread flag which can be checked to see if we need to
>>>> store the syscall number?
>>> Perhaps before we freeze the task we can save the syscall number on ARM.
>>> The patches suggest that the signal delivery path -- which the freezer
>>> utilizes -- has the syscall number already.

Actually, the signal path doesn't have the syscall number, it has
a binary "in syscall" value.

>>>
>>> Should work since the threads must be frozen first anyway.
>> I like the idea.
>>
>> However, would it also work for those cases when the freezing does not 
>> occur from the signal delivery path - e.g. for vfork and ptraced tasks ?
> 
> We could just as easily set it before the vfork uninterruptible completion.
> ptracing I'd don't know about though.
> 

vfork() uses freezer_do_not_count() to tell the freezer that it's
effectively frozen. It's also used by drivers/char/apm-emulation.c

Looking at calls to ptrace_notify(), ptrace_stop() and ptrace_event(),
there are several places where a ptraced task can stop with TASK_TRACED
(which is good enough for the freezer), outside the signal handling
path.

This means that recording the syscall number for all these cases is
going to be tedious and intrusive.

I prefer to somehow figure out the syscall from the task's state or
pt_regs, or by (re)using the same assembly code that already does that.

Oren.


^ permalink raw reply	[flat|nested] 80+ messages in thread

* [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
@ 2010-03-24 15:53               ` Oren Laadan
  0 siblings, 0 replies; 80+ messages in thread
From: Oren Laadan @ 2010-03-24 15:53 UTC (permalink / raw)
  To: linux-arm-kernel



Matt Helsley wrote:
> On Wed, Mar 24, 2010 at 12:57:46AM -0400, Oren Laadan wrote:
>> On Tue, 23 Mar 2010, Matt Helsley wrote:
>>
>>> On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux wrote:
>>>> On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
>>>>> This small commit introduces a global state of system calls for ARM
>>>>> making it possible for a debugger or checkpointing to gain information
>>>>> about another process' state with respect to system calls.
>>>> I don't particularly like the idea that we always store the syscall
>>>> number to memory for every system call, whether the stored version is
>>>> used or not.
>>>>
>>>> Since ARM caches are generally not write allocate, this means mostly
>>>> write-only variables can have a higher than expected expense.
>>>>
>>>> Is there not some thread flag which can be checked to see if we need to
>>>> store the syscall number?
>>> Perhaps before we freeze the task we can save the syscall number on ARM.
>>> The patches suggest that the signal delivery path -- which the freezer
>>> utilizes -- has the syscall number already.

Actually, the signal path doesn't have the syscall number, it has
a binary "in syscall" value.

>>>
>>> Should work since the threads must be frozen first anyway.
>> I like the idea.
>>
>> However, would it also work for those cases when the freezing does not 
>> occur from the signal delivery path - e.g. for vfork and ptraced tasks ?
> 
> We could just as easily set it before the vfork uninterruptible completion.
> ptracing I'd don't know about though.
> 

vfork() uses freezer_do_not_count() to tell the freezer that it's
effectively frozen. It's also used by drivers/char/apm-emulation.c

Looking at calls to ptrace_notify(), ptrace_stop() and ptrace_event(),
there are several places where a ptraced task can stop with TASK_TRACED
(which is good enough for the freezer), outside the signal handling
path.

This means that recording the syscall number for all these cases is
going to be tedious and intrusive.

I prefer to somehow figure out the syscall from the task's state or
pt_regs, or by (re)using the same assembly code that already does that.

Oren.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 2/3] ARM: Add the eclone system call
       [not found]     ` <20100323210616.GB19572-l+eeeJia6m9vn6HldHNs0ANdhmdF6hFW@public.gmane.org>
@ 2010-03-24 18:19       ` Sukadev Bhattiprolu
  2010-03-24 19:42       ` Christoffer Dall
  1 sibling, 0 replies; 80+ messages in thread
From: Sukadev Bhattiprolu @ 2010-03-24 18:19 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: containers, Christoffer Dall, libc-ports, linux-kernel, linux-arm-kernel

Russell King - ARM Linux [linux-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org] wrote:
| > +asmlinkage int sys_eclone(unsigned flags_low, struct clone_args __user *uca,
| > +			  int args_size, pid_t __user *pids,
| > +			  struct pt_regs *regs)
| > +{
| > +	int rc;
| > +	struct clone_args kca;
| > +	unsigned long flags;
| > +	int __user *parent_tidp;
| > +	int __user *child_tidp;
| > +	unsigned long __user stack;
| 
| __user on an integer type doesn't make any sense; integer types do not
| have address spaces.

Ah, will fix that for the x86 32/64-bit implementations.

| 
| > +	unsigned long stack_size;
| > +
| > +	rc = fetch_clone_args_from_user(uca, args_size, &kca);
| > +	if (rc)
| > +		return rc;
| > +
| > +	/*
| > +	 * TODO: Convert 'clone-flags' to 64-bits on all architectures.
| > +	 * TODO: When ->clone_flags_high is non-zero, copy it in to the
| > +	 * 	 higher word(s) of 'flags':
| > +	 *
| > +	 * 		flags = (kca.clone_flags_high << 32) | flags_low;
| > +	 */
| > +	flags = flags_low;
| > +	parent_tidp = (int *)(unsigned long)kca.parent_tid_ptr;
| > +	child_tidp = (int *)(unsigned long)kca.child_tid_ptr;
| 
| This will produce sparse errors.  Is there a reason why 'clone_args'
| tid pointers aren't already pointers marked with __user ?

Making them pointers would make them 32-bit on some and 64-bit on other
architectures. We wanted the fields to be the same size on all architectures
for easier portability/extensibility. Given that we are copying-in a structure
from user-space, copying in a few extra bytes on the 32-bit architectures
would not be significant.
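
To illustrate the point, here is a minimal user-space sketch, assuming
a structure whose pointer-carrying fields are all 64-bit integers. The
struct name clone_args_sketch, its exact layout, and the two helper
functions are hypothetical; only the field names come from the quoted
patch. Going through uintptr_t keeps the conversion well-defined on both
32-bit and 64-bit targets:

```c
#include <stdint.h>

/*
 * Hypothetical layout, for illustration only. Every field has a fixed
 * width, so the structure is the same size on every architecture.
 */
struct clone_args_sketch {
	uint64_t clone_flags_high;
	uint64_t child_stack;
	uint64_t child_stack_size;
	uint64_t parent_tid_ptr;
	uint64_t child_tid_ptr;
	uint32_t nr_pids;
	uint32_t reserved0;
};

/* Store a pointer in a fixed-width field; zero-extends on 32-bit. */
static uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Recover the pointer; narrowing via uintptr_t avoids a size-mismatch cast. */
static void *u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```

In the kernel the recovered pointer would additionally carry the __user
annotation so that sparse can check it.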

| 
| > +
| > +	stack_size = (unsigned long)kca.child_stack_size;
| 
| Shouldn't this already be of integer type?
| 
| > +	if (stack_size)
| > +		return -EINVAL;
| 
| So the stack must have a zero size?  Is this missing a '!' ?

Some architectures (IA64?) use the stack-size field. Those that don't
need it should error out. Again, it's because we are trying to keep the
interface common across architectures.

| 
| > +
| > +	stack = (unsigned long)kca.child_stack;
| > +	if (!stack)
| > +		stack = regs->ARM_sp;
| > +
| > +	return do_fork_with_pids(flags, stack, regs, stack_size, parent_tidp,
| > +				child_tidp, kca.nr_pids, pids);
| 
| Hmm, so let me get this syscall interface right.  We have some arguments
| passed in registers and others via a (variable sized?) structure.  It seems
| really weird to have, eg, a pointer to the pids and the number of pids
| passed in two separate ways.
| 
| The grouping between what's passed in registers and via this clone_args
| structure seems to be random.  Can it be sanitized?

:-) Well, we went through a lot of discussions on this. Here is one pointer
http://lkml.org/lkml/2009/9/11/92 (and that was version 6 of 13+ :-).

We wanted the first parameter of eclone(), flags, to remain a 32-bit value to
avoid confusing user-space (i.e., if they accidentally pass a 64-bit flags
value to the clone() system call, which takes 32-bit flags, the upper 32
bits would silently be dropped).

By sticking the higher clone-flags in 'struct clone_args', all architectures
would have to do the same extra work to set the higher flags so less
chance of error.
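
As a sketch of the combination described in the TODO comment of the
patch, assuming the upper word arrives as a separate 32-bit value
(the helper name combine_clone_flags() is hypothetical):

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: build a 64-bit flags
 * value from the 32-bit register argument and the upper word taken
 * from the structure.
 */
static uint64_t combine_clone_flags(uint32_t flags_low, uint32_t flags_high)
{
	/* Widen first: shifting a 32-bit value by 32 is undefined in C. */
	return ((uint64_t)flags_high << 32) | flags_low;
}
```

On a 32-bit architecture the result would not fit in an unsigned long,
which is why the patch still uses only flags_low for now.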

Re: passing the number of pids, nr_pids in the structure, I think it was
just that by passing it in the structure, we could avoid using an extra
register for the system call parameter.

'pid_t *' would be of different size on different architectures. Passing
it as a separate parameter would avoid the pointer conversion.  Note
that unlike the tid pointers above, which are a few individual fields,
the pid_t array could in theory be large.

Hope that helps. Thanks for the review comments.

Sukadev

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 2/3] ARM: Add the eclone system call
  2010-03-23 21:06     ` Russell King - ARM Linux
@ 2010-03-24 18:19       ` Sukadev Bhattiprolu
  -1 siblings, 0 replies; 80+ messages in thread
From: Sukadev Bhattiprolu @ 2010-03-24 18:19 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Christoffer Dall, containers, linux-arm-kernel, linux-kernel, libc-ports

Russell King - ARM Linux [linux@arm.linux.org.uk] wrote:
| > +asmlinkage int sys_eclone(unsigned flags_low, struct clone_args __user *uca,
| > +			  int args_size, pid_t __user *pids,
| > +			  struct pt_regs *regs)
| > +{
| > +	int rc;
| > +	struct clone_args kca;
| > +	unsigned long flags;
| > +	int __user *parent_tidp;
| > +	int __user *child_tidp;
| > +	unsigned long __user stack;
| 
| __user on an integer type doesn't make any sense; integer types do not
| have address spaces.

Ah, will fix that for the x86 32/64-bit implementations.

| 
| > +	unsigned long stack_size;
| > +
| > +	rc = fetch_clone_args_from_user(uca, args_size, &kca);
| > +	if (rc)
| > +		return rc;
| > +
| > +	/*
| > +	 * TODO: Convert 'clone-flags' to 64-bits on all architectures.
| > +	 * TODO: When ->clone_flags_high is non-zero, copy it in to the
| > +	 * 	 higher word(s) of 'flags':
| > +	 *
| > +	 * 		flags = (kca.clone_flags_high << 32) | flags_low;
| > +	 */
| > +	flags = flags_low;
| > +	parent_tidp = (int *)(unsigned long)kca.parent_tid_ptr;
| > +	child_tidp = (int *)(unsigned long)kca.child_tid_ptr;
| 
| This will produce sparse errors.  Is there a reason why 'clone_args'
| tid pointers aren't already pointers marked with __user ?

Making them pointers would make them 32-bit on some and 64-bit on other
architectures. We wanted the fields to be the same size on all architectures
for easier portability/extensibility. Given that we are copying-in a structure
from user-space, copying in a few extra bytes on the 32-bit architectures
would not be significant.

| 
| > +
| > +	stack_size = (unsigned long)kca.child_stack_size;
| 
| Shouldn't this already be of integer type?
| 
| > +	if (stack_size)
| > +		return -EINVAL;
| 
| So the stack must have a zero size?  Is this missing a '!' ?

Some architectures (IA64?) use the stack-size field. Those that don't
need it should error out. Again, it's because we are trying to keep the
interface common across architectures.

| 
| > +
| > +	stack = (unsigned long)kca.child_stack;
| > +	if (!stack)
| > +		stack = regs->ARM_sp;
| > +
| > +	return do_fork_with_pids(flags, stack, regs, stack_size, parent_tidp,
| > +				child_tidp, kca.nr_pids, pids);
| 
| Hmm, so let me get this syscall interface right.  We have some arguments
| passed in registers and others via a (variable sized?) structure.  It seems
| really weird to have, eg, a pointer to the pids and the number of pids
| passed in two separate ways.
| 
| The grouping between what's passed in registers and via this clone_args
| structure seems to be random.  Can it be sanitized?

:-) Well, we went through a lot of discussions on this. Here is one pointer
http://lkml.org/lkml/2009/9/11/92 (and that was version 6 of 13+ :-).

We wanted the first parameter of eclone(), flags, to remain a 32-bit value to
avoid confusing user-space (i.e., if they accidentally pass a 64-bit flags
value to the clone() system call, which takes 32-bit flags, the upper 32
bits would silently be dropped).

By sticking the higher clone-flags in 'struct clone_args', all architectures
would have to do the same extra work to set the higher flags so less
chance of error.

Re: passing the number of pids, nr_pids in the structure, I think it was
just that by passing it in the structure, we could avoid using an extra
register for the system call parameter.

'pid_t *' would be of different size on different architectures. Passing
it as a separate parameter would avoid the pointer conversion.  Note
that unlike the tid pointers above, which are a few individual fields,
the pid_t array could in theory be large.

Hope that helps. Thanks for the review comments.

Sukadev

^ permalink raw reply	[flat|nested] 80+ messages in thread

* [C/R ARM][PATCH 2/3] ARM: Add the eclone system call
@ 2010-03-24 18:19       ` Sukadev Bhattiprolu
  0 siblings, 0 replies; 80+ messages in thread
From: Sukadev Bhattiprolu @ 2010-03-24 18:19 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux [linux at arm.linux.org.uk] wrote:
| > +asmlinkage int sys_eclone(unsigned flags_low, struct clone_args __user *uca,
| > +			  int args_size, pid_t __user *pids,
| > +			  struct pt_regs *regs)
| > +{
| > +	int rc;
| > +	struct clone_args kca;
| > +	unsigned long flags;
| > +	int __user *parent_tidp;
| > +	int __user *child_tidp;
| > +	unsigned long __user stack;
| 
| __user on an integer type doesn't make any sense; integer types do not
| have address spaces.

Ah, will fix that for the x86 32/64-bit implementations.

| 
| > +	unsigned long stack_size;
| > +
| > +	rc = fetch_clone_args_from_user(uca, args_size, &kca);
| > +	if (rc)
| > +		return rc;
| > +
| > +	/*
| > +	 * TODO: Convert 'clone-flags' to 64-bits on all architectures.
| > +	 * TODO: When ->clone_flags_high is non-zero, copy it in to the
| > +	 * 	 higher word(s) of 'flags':
| > +	 *
| > +	 * 		flags = (kca.clone_flags_high << 32) | flags_low;
| > +	 */
| > +	flags = flags_low;
| > +	parent_tidp = (int *)(unsigned long)kca.parent_tid_ptr;
| > +	child_tidp = (int *)(unsigned long)kca.child_tid_ptr;
| 
| This will produce sparse errors.  Is there a reason why 'clone_args'
| tid pointers aren't already pointers marked with __user ?

Making them pointers would make them 32-bit on some and 64-bit on other
architectures. We wanted the fields to be the same size on all architectures
for easier portability/extensibility. Given that we are copying-in a structure
from user-space, copying in a few extra bytes on the 32-bit architectures
would not be significant.

| 
| > +
| > +	stack_size = (unsigned long)kca.child_stack_size;
| 
| Shouldn't this already be of integer type?
| 
| > +	if (stack_size)
| > +		return -EINVAL;
| 
| So the stack must have a zero size?  Is this missing a '!' ?

Some architectures (IA64?) use the stack-size field. Those that don't
need it should error out. Again, it's because we are trying to keep the
interface common across architectures.

| 
| > +
| > +	stack = (unsigned long)kca.child_stack;
| > +	if (!stack)
| > +		stack = regs->ARM_sp;
| > +
| > +	return do_fork_with_pids(flags, stack, regs, stack_size, parent_tidp,
| > +				child_tidp, kca.nr_pids, pids);
| 
| Hmm, so let me get this syscall interface right.  We have some arguments
| passed in registers and others via a (variable sized?) structure.  It seems
| really weird to have, eg, a pointer to the pids and the number of pids
| passed in two separate ways.
| 
| The grouping between what's passed in registers and via this clone_args
| structure seems to be random.  Can it be sanitized?

:-) Well, we went through a lot of discussions on this. Here is one pointer
http://lkml.org/lkml/2009/9/11/92 (and that was version 6 of 13+ :-).

We wanted the first parameter of eclone(), flags, to remain a 32-bit value to
avoid confusing user-space (i.e., if they accidentally pass a 64-bit flags
value to the clone() system call, which takes 32-bit flags, the upper 32
bits would silently be dropped).

By sticking the higher clone-flags in 'struct clone_args', all architectures
would have to do the same extra work to set the higher flags so less
chance of error.

Re: passing the number of pids, nr_pids in the structure, I think it was
just that by passing it in the structure, we could avoid using an extra
register for the system call parameter.

'pid_t *' would be of different size on different architectures. Passing
it as a separate parameter would avoid the pointer conversion.  Note
that unlike the tid pointers above, which are a few individual fields,
the pid_t array could in theory be large.

Hope that helps. Thanks for the review comments.

Sukadev

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
       [not found]               ` <4BAA3586.1020604-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
@ 2010-03-24 19:36                 ` Christoffer Dall
  0 siblings, 0 replies; 80+ messages in thread
From: Christoffer Dall @ 2010-03-24 19:36 UTC (permalink / raw)
  To: Oren Laadan
  Cc: linux-arm-kernel, Russell King - ARM Linux, containers,
	linux-kernel, Roland McGrath

On Wed, Mar 24, 2010 at 4:53 PM, Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org> wrote:
>
>
> Matt Helsley wrote:
>>
>> On Wed, Mar 24, 2010 at 12:57:46AM -0400, Oren Laadan wrote:
>>>
>>> On Tue, 23 Mar 2010, Matt Helsley wrote:
>>>
>>>> On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux
>>>> wrote:
>>>>>
>>>>> On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
>>>>>>
>>>>>> This small commit introduces a global state of system calls for ARM
>>>>>> making it possible for a debugger or checkpointing to gain information
>>>>>> about another process' state with respect to system calls.
>>>>>
>>>>> I don't particularly like the idea that we always store the syscall
>>>>> number to memory for every system call, whether the stored version is
>>>>> used or not.
>>>>>
>>>>> Since ARM caches are generally not write allocate, this means mostly
>>>>> write-only variables can have a higher than expected expense.
>>>>>
>>>>> Is there not some thread flag which can be checked to see if we need to
>>>>> store the syscall number?
>>>>
>>>> Perhaps before we freeze the task we can save the syscall number on ARM.
>>>> The patches suggest that the signal delivery path -- which the freezer
>>>> utilizes -- has the syscall number already.
>
> Actually, the signal path doesn't have the syscall number, it has
> a binary "in syscall" value.
>

Well, this could be changed to pass the syscall number through
registers along to try_to_freeze without any noticeable performance
hit.

>>>>
>>>> Should work since the threads must be frozen first anyway.
>>>
>>> I like the idea.
>>>
>>> However, would it also work for those cases when the freezing does not
>>> occur from the signal delivery path - e.g. for vfork and ptraced tasks ?
>>
>> We could just as easily set it before the vfork uninterruptible
>> completion.
>> ptracing I'd don't know about though.
>>
>
> vfork() uses freezer_do_not_count() to tell the freezer that it's
> effectively frozen. It's also used by drivers/char/apm-emulation.c
>
> Looking at calls to ptrace_notify(), ptrace_stop() and ptace_event(),
> there are several places where a ptraced task can stop with TASK_TRACED
> (which is good enough for the freezer), outside the signal handling
> path.
>
> This means that recording the syscall number for all these cases is
> going to be tedious and intrusive.
>
> I prefer to somehow figure out the syscall from the task's state or
> pt_regs, or by (re)using the same assembly code that already does that.

Re-using the assembly code or factoring it out so that it can be used
from multiple places doesn't seem very pleasing to me, as the assembly
code is in the critical path and written specifically for the context
of a process entering the kernel. Please correct me if I'm wrong.

I imagine simply a function in C, more or less re-implementing the
logic that's already in entry-common.S, might do the trick. I wouldn't
worry much about the performance in this case as it will not be used
often. The following _untested_ snippet illustrates my idea:

---
 arch/arm/include/asm/syscall.h |   93 +++++++++++++++++++++++++++++++++++++++-
 1 files changed, 92 insertions(+), 1 deletions(-)

diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
index 3b3248f..a7f2615 100644
--- a/arch/arm/include/asm/syscall.h
+++ b/arch/arm/include/asm/syscall.h
@@ -10,10 +10,101 @@
 #ifndef _ASM_ARM_SYSCALLS_H
 #define _ASM_ARM_SYSCALLS_H

+static inline int get_swi_instruction(struct task_struct *task,
+				      struct pt_regs *regs,
+				      unsigned long *instr)
+{
+	struct page *page = NULL;
+	unsigned long instr_addr;
+	unsigned long *ptr;
+	int ret;
+
+	instr_addr = regs->ARM_pc - 4;
+
+	down_read(&task->mm->mmap_sem);
+	ret = get_user_pages(task, task->mm, instr_addr,
+			     1, 0, 0, &page, NULL);
+	up_read(&task->mm->mmap_sem);
+
+	if (ret < 0)
+		return ret;
+
+	ptr = (unsigned long *)kmap_atomic(page, KM_USER1);
+	memcpy(instr,
+	       (char *)ptr + (instr_addr & ~PAGE_MASK),
+	       sizeof(unsigned long));
+	kunmap_atomic(ptr, KM_USER1);
+
+	page_cache_release(page);
+
+	return 0;
+}
+
+static inline int __syscall_get_nr(struct task_struct *task,
+				   struct pt_regs *regs)
+{
+	int ret;
+	int scno;
+	unsigned long instr;
+	bool config_oabi = false;
+	bool config_aeabi = false;
+	bool config_arm_thumb = false;
+	bool config_cpu_endian_be8 = false;
+
+#ifdef CONFIG_OABI_COMPAT
+	config_oabi = true;
+#endif
+#ifdef CONFIG_AEABI
+	config_aeabi = true;
+#endif
+#ifdef CONFIG_ARM_THUMB
+	config_arm_thumb = true;
+#endif
+#ifdef CONFIG_CPU_ENDIAN_BE8
+	config_cpu_endian_be8 = true;
+#endif
+#ifdef CONFIG_CPU_ARM710
+	return -1;
+#endif
+
+	if (config_aeabi && !config_oabi) {
+		/* Pure EABI */
+		return regs->ARM_r7;
+	} else if (config_oabi) {
+		if (config_arm_thumb && (regs->ARM_cpsr & PSR_T_BIT))
+			return -1;
+
+		ret = get_swi_instruction(task, regs, &instr);
+		if (ret < 0)
+			return -1;
+
+		if (config_cpu_endian_be8)
+			asm ("rev %[out], %[in]" : [out] "=r" (instr)
+						 : [in] "r" (instr));
+
+		if ((instr & 0x00ffffff) == 0)
+			return regs->ARM_r7; /* EABI call */
+		else
+			return (instr & 0x00ffffff) | __NR_OABI_SYSCALL_BASE;
+	} else {
+		 /* Legacy ABI only */
+		if (config_arm_thumb && (regs->ARM_cpsr & PSR_T_BIT)) {
+			/* Thumb mode ABI */
+			scno = regs->ARM_r7 + __NR_SYSCALL_BASE;
+		} else {
+			ret = get_swi_instruction(task, regs, &instr);
+			if (ret < 0)
+				return -1;
+			scno = instr;
+		}
+		return scno & 0x00ffffff;
+	}
+}
+
 static inline int syscall_get_nr(struct task_struct *task,
 				 struct pt_regs *regs)
 {
-	return (int)(task_thread_info(task)->syscall);
+	return __syscall_get_nr(task, regs);
 }

 static inline long syscall_get_return_value(struct task_struct *task,
-- 
1.5.6.5

^ permalink raw reply related	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
  2010-03-24 15:53               ` Oren Laadan
@ 2010-03-24 19:36                 ` Christoffer Dall
  -1 siblings, 0 replies; 80+ messages in thread
From: Christoffer Dall @ 2010-03-24 19:36 UTC (permalink / raw)
  To: Oren Laadan
  Cc: Matt Helsley, Russell King - ARM Linux, linux-arm-kernel,
	containers, linux-kernel, Roland McGrath

On Wed, Mar 24, 2010 at 4:53 PM, Oren Laadan <orenl@cs.columbia.edu> wrote:
>
>
> Matt Helsley wrote:
>>
>> On Wed, Mar 24, 2010 at 12:57:46AM -0400, Oren Laadan wrote:
>>>
>>> On Tue, 23 Mar 2010, Matt Helsley wrote:
>>>
>>>> On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux
>>>> wrote:
>>>>>
>>>>> On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
>>>>>>
>>>>>> This small commit introduces a global state of system calls for ARM
>>>>>> making it possible for a debugger or checkpointing to gain information
>>>>>> about another process' state with respect to system calls.
>>>>>
>>>>> I don't particularly like the idea that we always store the syscall
>>>>> number to memory for every system call, whether the stored version is
>>>>> used or not.
>>>>>
>>>>> Since ARM caches are generally not write allocate, this means mostly
>>>>> write-only variables can have a higher than expected expense.
>>>>>
>>>>> Is there not some thread flag which can be checked to see if we need to
>>>>> store the syscall number?
>>>>
>>>> Perhaps before we freeze the task we can save the syscall number on ARM.
>>>> The patches suggest that the signal delivery path -- which the freezer
>>>> utilizes -- has the syscall number already.
>
> Actually, the signal path doesn't have the syscall number, it has
> a binary "in syscall" value.
>

Well, this could be changed to pass the syscall number along in
registers to try_to_freeze without any noticeable performance
hit.

>>>>
>>>> Should work since the threads must be frozen first anyway.
>>>
>>> I like the idea.
>>>
>>> However, would it also work for those cases when the freezing does not
>>> occur from the signal delivery path - e.g. for vfork and ptraced tasks ?
>>
>> We could just as easily set it before the vfork uninterruptible
>> completion.
>> ptracing I don't know about though.
>>
>
> vfork() uses freezer_do_not_count() to tell the freezer that it's
> effectively frozen. It's also used by drivers/char/apm-emulation.c
>
> Looking at calls to ptrace_notify(), ptrace_stop() and ptrace_event(),
> there are several places where a ptraced task can stop with TASK_TRACED
> (which is good enough for the freezer), outside the signal handling
> path.
>
> This means that recording the syscall number for all these cases is
> going to be tedious and intrusive.
>
> I prefer to somehow figure out the syscall from the task's state or
> pt_regs, or by (re)using the same assembly code that already does that.

Re-using the assembly code or factoring it out so that it can be used
from multiple places doesn't seem very pleasing to me, as the assembly
code is in the critical path and written specifically for the context
of a process entering the kernel. Please correct me if I'm wrong.

I imagine simply a function in C, more or less re-implementing the
logic that's already in entry-common.S, might do the trick. I wouldn't
worry much about the performance in this case as it will not be used
often. The following _untested_ snippet illustrates my idea:

---
 arch/arm/include/asm/syscall.h |   93 +++++++++++++++++++++++++++++++++++++++-
 1 files changed, 92 insertions(+), 1 deletions(-)

diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
index 3b3248f..a7f2615 100644
--- a/arch/arm/include/asm/syscall.h
+++ b/arch/arm/include/asm/syscall.h
@@ -10,10 +10,101 @@
 #ifndef _ASM_ARM_SYSCALLS_H
 #define _ASM_ARM_SYSCALLS_H

+static inline int get_swi_instruction(struct task_struct *task,
+				      struct pt_regs *regs,
+				      unsigned long *instr)
+{
+	struct page *page = NULL;
+	unsigned long instr_addr;
+	unsigned long *ptr;
+	int ret;
+
+	instr_addr = regs->ARM_pc - 4;
+
+	down_read(&task->mm->mmap_sem);
+	ret = get_user_pages(task, task->mm, instr_addr,
+			     1, 0, 0, &page, NULL);
+	up_read(&task->mm->mmap_sem);
+
+	if (ret < 0)
+		return ret;
+
+	ptr = (unsigned long *)kmap_atomic(page, KM_USER1);
+	memcpy(instr,
+	       (char *)ptr + (instr_addr & ~PAGE_MASK),
+	       sizeof(unsigned long));
+	kunmap_atomic(ptr, KM_USER1);
+
+	page_cache_release(page);
+
+	return 0;
+}
+
+static inline int __syscall_get_nr(struct task_struct *task,
+				   struct pt_regs *regs)
+{
+	int ret;
+	int scno;
+	unsigned long instr;
+	bool config_oabi = false;
+	bool config_aeabi = false;
+	bool config_arm_thumb = false;
+	bool config_cpu_endian_be8 = false;
+
+#ifdef CONFIG_OABI_COMPAT
+	config_oabi = true;
+#endif
+#ifdef CONFIG_AEABI
+	config_aeabi = true;
+#endif
+#ifdef CONFIG_ARM_THUMB
+	config_arm_thumb = true;
+#endif
+#ifdef CONFIG_CPU_ENDIAN_BE8
+	config_cpu_endian_be8 = true;
+#endif
+#ifdef CONFIG_CPU_ARM710
+	return -1;
+#endif
+
+	if (config_aeabi && !config_oabi) {
+		/* Pure EABI */
+		return regs->ARM_r7;
+	} else if (config_oabi) {
+		if (config_arm_thumb && (regs->ARM_cpsr & PSR_T_BIT))
+			return -1;
+
+		ret = get_swi_instruction(task, regs, &instr);
+		if (ret < 0)
+			return -1;
+
+		if (config_cpu_endian_be8)
+			asm ("rev %[out], %[in]" : [out] "=r" (instr)
+						 : [in] "r" (instr));
+
+		if ((instr & 0x00ffffff) == 0)
+			return regs->ARM_r7; /* EABI call */
+		else
+			return (instr & 0x00ffffff) | __NR_OABI_SYSCALL_BASE;
+	} else {
+		 /* Legacy ABI only */
+		if (config_arm_thumb && (regs->ARM_cpsr & PSR_T_BIT)) {
+			/* Thumb mode ABI */
+			scno = regs->ARM_r7 + __NR_SYSCALL_BASE;
+		} else {
+			ret = get_swi_instruction(task, regs, &instr);
+			if (ret < 0)
+				return -1;
+			scno = instr;
+		}
+		return scno & 0x00ffffff;
+	}
+}
+
 static inline int syscall_get_nr(struct task_struct *task,
 				 struct pt_regs *regs)
 {
-	return (int)(task_thread_info(task)->syscall);
+	return __syscall_get_nr(task, regs);
 }

 static inline long syscall_get_return_value(struct task_struct *task,
-- 
1.5.6.5
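
For illustration, the core of the lookup in the snippet above can be
exercised in user space. This is only a sketch, not the kernel code:
read_insn_from_page() and decode_swi() are hypothetical helpers, a 4 KiB
page size is assumed, and OABI_SYSCALL_BASE is assumed to be 0x900000.
It shows two points: the byte offset of the instruction within its page
is the address masked with the page size minus one (a right shift by
PAGE_SHIFT would give the page number instead), and a zero 24-bit
immediate in the SWI instruction indicates an EABI call whose number is
in r7.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE   4096u        /* assumption: 4 KiB pages */
#define SWI_IMM_MASK       0x00ffffffu  /* low 24 bits of an ARM SWI/SVC */
#define OABI_SYSCALL_BASE  0x900000u    /* assumed __NR_OABI_SYSCALL_BASE */

/*
 * Hypothetical helper: copy the 32-bit instruction at virtual address
 * `addr` out of `page`, a buffer holding the page that contains `addr`.
 */
static uint32_t read_insn_from_page(const unsigned char *page, uint32_t addr)
{
	uint32_t insn;
	uint32_t offset = addr & (SKETCH_PAGE_SIZE - 1); /* byte offset in page */

	memcpy(&insn, page + offset, sizeof(insn));
	return insn;
}

/*
 * Hypothetical helper: derive the syscall number from the trapping
 * instruction and the saved r7, for a kernel that accepts both ABIs.
 */
static uint32_t decode_swi(uint32_t insn, uint32_t r7)
{
	uint32_t imm = insn & SWI_IMM_MASK;

	if (imm == 0)
		return r7;			/* EABI: number passed in r7 */
	return imm | OABI_SYSCALL_BASE;		/* OABI: number encoded in insn */
}
```

With these helpers, an instruction word of 0xef900004 decodes as an OABI
call and 0xef000000 with r7 = 4 decodes as an EABI call.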

^ permalink raw reply related	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 2/3] ARM: Add the eclone system call
       [not found]     ` <20100323210616.GB19572-l+eeeJia6m9vn6HldHNs0ANdhmdF6hFW@public.gmane.org>
  2010-03-24 18:19       ` Sukadev Bhattiprolu
@ 2010-03-24 19:42       ` Christoffer Dall
  1 sibling, 0 replies; 80+ messages in thread
From: Christoffer Dall @ 2010-03-24 19:42 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: containers, Sukadev Bhattiprolu, libc-ports, linux-kernel,
	linux-arm-kernel

On Tue, Mar 23, 2010 at 10:06 PM, Russell King - ARM Linux
<linux-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org> wrote:
> On Sun, Mar 21, 2010 at 09:06:04PM -0400, Christoffer Dall wrote:
>> In addition to doing everything that clone() system call does, the
>> eclone() system call:
>
> Some comments...
>
>> +sys_eclone_wrapper:
>> +             add     ip, sp, #S_OFF
>> +             str     ip, [sp, #0]
>> +             b       sys_eclone
>> +ENDPROC(sys_eclone_wrapper)
>
> I'm curious why, if you want the entire set of registers, you don't just
> do:
>                add     r0, sp, #S_OFF
>                b       sys_eclone
>
> and load the syscall arguments out of regs->ARM_foo.  This avoids the need
> for additional stores.
>

I simply copied the code from sys_clone. Do you prefer that I change
it in both places?

>> +
>>  sys_sigreturn_wrapper:
>>               add     r0, sp, #S_OFF
>>               b       sys_sigreturn
>> diff --git a/arch/arm/kernel/sys_arm.c b/arch/arm/kernel/sys_arm.c
>> index ae4027b..fd8199d 100644
>> --- a/arch/arm/kernel/sys_arm.c
>> +++ b/arch/arm/kernel/sys_arm.c
>> @@ -183,6 +183,45 @@ asmlinkage int sys_clone(unsigned long clone_flags, unsigned long newsp,
>>       return do_fork(clone_flags, newsp, regs, 0, parent_tidptr, child_tidptr);
>>  }
>>
>> +asmlinkage int sys_eclone(unsigned flags_low, struct clone_args __user *uca,
>> +                       int args_size, pid_t __user *pids,
>> +                       struct pt_regs *regs)
>> +{
>> +     int rc;
>> +     struct clone_args kca;
>> +     unsigned long flags;
>> +     int __user *parent_tidp;
>> +     int __user *child_tidp;
>> +     unsigned long __user stack;
>
> __user on an integer type doesn't make any sense; integer types do not
> have address spaces.
>

Thanks, I will follow Sukadev's changes...

>> +     unsigned long stack_size;
>> +
>> +     rc = fetch_clone_args_from_user(uca, args_size, &kca);
>> +     if (rc)
>> +             return rc;
>> +
>> +     /*
>> +      * TODO: Convert 'clone-flags' to 64-bits on all architectures.
>> +      * TODO: When ->clone_flags_high is non-zero, copy it in to the
>> +      *       higher word(s) of 'flags':
>> +      *
>> +      *              flags = (kca.clone_flags_high << 32) | flags_low;
>> +      */
>> +     flags = flags_low;
>> +     parent_tidp = (int *)(unsigned long)kca.parent_tid_ptr;
>> +     child_tidp = (int *)(unsigned long)kca.child_tid_ptr;
>
> This will produce sparse errors.  Is there a reason why 'clone_args'
> tid pointers aren't already pointers marked with __user ?
>
>> +
>> +     stack_size = (unsigned long)kca.child_stack_size;
>
> Shouldn't this already be of integer type?
>
>> +     if (stack_size)
>> +             return -EINVAL;
>
> So the stack must have a zero size?  Is this missing a '!' ?
>
>> +
>> +     stack = (unsigned long)kca.child_stack;
>> +     if (!stack)
>> +             stack = regs->ARM_sp;
>> +
>> +     return do_fork_with_pids(flags, stack, regs, stack_size, parent_tidp,
>> +                             child_tidp, kca.nr_pids, pids);
>
> Hmm, so let me get this syscall interface right.  We have some arguments
> passed in registers and others via a (variable sized?) structure.  It seems
> really weird to have, eg, a pointer to the pids and the number of pids
> passed in two separate ways.
>
> The grouping between what's passed in registers and via this clone_args
> structure seems to be random.  Can it be sanitized?
>

Thanks for your feedback. I will let the people behind eclone deal with
the eclone specifics.
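
On the TODO quoted above about widening the clone flags, here is a
minimal sketch. It assumes that clone_flags_high and flags_low are both
32-bit values (the field types of struct clone_args are not shown in
this thread), and combine_clone_flags() is a hypothetical helper name.
The high word has to be widened to 64 bits before the shift, because
shifting a 32-bit value by 32 is undefined in C.

```c
#include <stdint.h>

/* Hypothetical helper: build a 64-bit flags word from two 32-bit halves. */
static uint64_t combine_clone_flags(uint32_t flags_high, uint32_t flags_low)
{
	/* Widen first: (flags_high << 32) on a 32-bit type is undefined. */
	return ((uint64_t)flags_high << 32) | flags_low;
}
```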

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 3/3] c/r: ARM implementation of checkpoint/restart
       [not found]     ` <20100323160933.GA4465-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2010-03-24 19:46       ` Christoffer Dall
  0 siblings, 0 replies; 80+ messages in thread
From: Christoffer Dall @ 2010-03-24 19:46 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: rmk-lFZ/pmaqli7XmaaqVzeoHQ, containers, linux-kernel, linux-arm-kernel

On Tue, Mar 23, 2010 at 5:09 PM, Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
> Quoting Christoffer Dall (christofferdall-77OGu6e99YhyO3AAkE1OcX9LOBIZ5rWg@public.gmane.org):
>> Implements architecture specific requirements for checkpoint/restart on
>> ARM. The changes touch almost only c/r related code. Most of the work is
>> done in arch/arm/checkpoint.c, which implements checkpointing of the CPU
>> and necessary fields on the thread_info struct.
>>
>> The ISA version (given by __LINUX_ARM_ARCH__) is checkpointed and verified
>> against the machine architecture on restart. If they differ, an error is
>> raised and restart aborted. It should be possible to restart on newer
>> architectures, but further investigation is warranted.
>>
>> Regarding ThumbEE, the thumbee_state field on the thread_info is stored
>> in checkpoints when CONFIG_ARM_THUMBEE and 0 is stored otherwise. If
>> a value different than 0 is checkpointed and CONFIG_ARM_THUMBEE is not
>> set on the restore system, the restore is aborted. Feedback on this
>> implementation is very welcome.
>>
>> We checkpoint whether the system is running with CONFIG_MMU or not and
>> require the same configuration for the system on which we restore the
>> process. It might be possible to allow something more fine-grained,
>> if it's worth the energy. Input on this item is also very welcome,
>> specifically from someone who knows the exact meaning of the end_brk
>> field.
>>
>> Added support for syscall sys_checkpoint and sys_restart for ARM:
>> __NR_checkpoint         367
>> __NR_restart            368
>>
>>
>> Cc: rmk-lFZ/pmaqli7XmaaqVzeoHQ@public.gmane.org
>> Signed-off-by: Christoffer Dall <christofferdall-77OGu6e99YhyO3AAkE1OcX9LOBIZ5rWg@public.gmane.org>
>> Acked-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
>
> In terms of the cr api I don't see any problems.  Two nits below,
> but in any case
>
> Acked-by: Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
>
> thanks, this is really cool, especially how minimal it is :)
> -serge
thanks
>
> ...
>
>> +static int load_cpu_regs(struct ckpt_hdr_cpu *h, struct task_struct *t)
>> +{
>> +     int i;
>> +     struct pt_regs *regs = task_pt_regs(t);
>> +
>> +     memcpy(regs, &h->uregs, sizeof(struct pt_regs));
>> +
>> +     for (i = 0; i < 16; i++)
>> +             regs->uregs[i] = h->uregs[i];
>> +
>> +     /*
>> +      * Restore only user-writable bits on the CPSR
>> +      */
>> +     regs->ARM_cpsr = regs->ARM_cpsr |
>> +                      (h->ARM_cpsr & (PSR_N_BIT | PSR_Z_BIT |
>> +                                      PSR_C_BIT | PSR_V_BIT |
>> +                                      PSR_V_BIT | PSR_Q_BIT |
>> +                                      PSR_E_BIT | PSR_GE_BITS));
>> +     regs->ARM_ORIG_r0 = h->ARM_ORIG_r0;
>> +
>> +     return 0;
>> +}
>> +
>> +/* read the cpu state and registers for the current task */
>> +int restore_cpu(struct ckpt_ctx *ctx)
>> +{
>> +     struct ckpt_hdr_cpu *h;
>> +     struct task_struct *t = current;
>> +     int ret;
>> +
>> +     h = ckpt_read_obj_type(ctx, sizeof(*h), CKPT_HDR_CPU);
>> +     if (IS_ERR(h))
>> +             return PTR_ERR(h);
>> +
>> +     ret = load_cpu_regs(h, t);
>
> will load_cpu_regs() ever be changed to return anything but 0?  If
> not both fns can be simplified.
>

You're right. I will fold load_cpu_regs() into restore_cpu().

> ...
>
>> +int restore_mm_context(struct ckpt_ctx *ctx, struct mm_struct *mm)
>> +{
>> +     struct ckpt_hdr_mm_context *h;
>> +     int ret = 0;
>> +
>> +     h = ckpt_read_obj_type(ctx, sizeof(*h), CKPT_HDR_MM_CONTEXT);
>> +     if (IS_ERR(h))
>> +             return PTR_ERR(h);
>> +
>> +#if !CONFIG_MMU
>> +     mm->context.end_brk = h->end_brk;
>> +#endif
>> +
>> +     ckpt_hdr_put(ctx, h);
>> +     return ret;
>
> Again ret doesn't seem needed here.
Indeed it doesn't.

-Christoffer
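
For reference, one common way to restore only selected status bits is to
clear them in the current value before merging in the saved ones. The
sketch below is a user-space illustration, not the code under review:
merge_user_cpsr() is a hypothetical helper, and the bit values are
written out as assumptions following the usual ARM CPSR layout (N, Z, C,
V and Q flags, GE[3:0], and E).

```c
#include <stdint.h>

/* Assumed CPSR bit positions (usual ARM layout). */
#define SK_PSR_N   0x80000000u
#define SK_PSR_Z   0x40000000u
#define SK_PSR_C   0x20000000u
#define SK_PSR_V   0x10000000u
#define SK_PSR_Q   0x08000000u
#define SK_PSR_GE  0x000f0000u
#define SK_PSR_E   0x00000200u

#define SK_PSR_USER_MASK (SK_PSR_N | SK_PSR_Z | SK_PSR_C | SK_PSR_V | \
			  SK_PSR_Q | SK_PSR_GE | SK_PSR_E)

/*
 * Hypothetical helper: keep every bit of `current` outside the mask
 * (mode, interrupt masks, Thumb state) and take the masked bits from
 * the checkpointed value `saved`.
 */
static uint32_t merge_user_cpsr(uint32_t current, uint32_t saved)
{
	return (current & ~SK_PSR_USER_MASK) | (saved & SK_PSR_USER_MASK);
}
```

Clearing the masked bits first means a flag that is set in the current
value but clear in the checkpoint ends up clear after the merge.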

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 3/3] c/r: ARM implementation of checkpoint/restart
       [not found]     ` <20100323211843.GC19572-l+eeeJia6m9vn6HldHNs0ANdhmdF6hFW@public.gmane.org>
  2010-03-24  1:53       ` Matt Helsley
@ 2010-03-24 20:48       ` Christoffer Dall
  1 sibling, 0 replies; 80+ messages in thread
From: Christoffer Dall @ 2010-03-24 20:48 UTC (permalink / raw)
  To: Russell King - ARM Linux; +Cc: containers, linux-kernel, linux-arm-kernel

On Tue, Mar 23, 2010 at 10:18 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Sun, Mar 21, 2010 at 09:06:05PM -0400, Christoffer Dall wrote:
>> The ISA version (given by __LINUX_ARM_ARCH__) is checkpointed and verified
>> against the machine architecture on restart.
>
> I think you misunderstand what __LINUX_ARM_ARCH__ signifies.  It is the
> build architecture for the kernel, and it indicates the lowest
> architecture version that the kernel will run on.

Yes, clearly I didn't understand this fully. So is it in fact possible
to compile the kernel with __LINUX_ARM_ARCH__=6 and have
CONFIG_CPU_32v7? Or is it a matter of running a v6 kernel with
CONFIG_CPU_32v6 on a newer architecture?

What I would like to accomplish is the best way to make sure that the
restarted process will in fact be able to run. What is the best way to
ensure this with regards to the architecture version?
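
One possible shape for the architecture check, as a hedged sketch rather
than a tested patch. It assumes that the checkpoint image records the
architecture version detected at run time (in the kernel that would
presumably come from cpu_architecture(), not from the build-time
__LINUX_ARM_ARCH__), and that a process can be restarted on the same or a
newer version. Whether "newer is always fine" actually holds is exactly the
open question, so the comparison is the part to adjust.
struct ckpt_arch_info and check_arch_version() are hypothetical names.

```c
#include <errno.h>

/* Hypothetical header field: architecture version seen at checkpoint time. */
struct ckpt_arch_info {
    int arch_version;   /* e.g. 5, 6 or 7 */
};

/*
 * Return 0 if a process checkpointed on saved->arch_version may be
 * restarted on a CPU of running_version, -EINVAL otherwise.
 * Assumption: the same or a newer architecture version is acceptable.
 */
static int check_arch_version(const struct ckpt_arch_info *saved,
                              int running_version)
{
    if (running_version < saved->arch_version)
        return -EINVAL;
    return 0;
}
```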

>
> That doesn't indicate what ISA version the system is running on, or even
> if the ABI is compatible (we have two ABIs - OABI and EABI).

That's why I checkpointed CONFIG_OABI_COMPAT, but I realize that it's
not sufficient.

How about checkpointing CONFIG_AEABI and CONFIG_OABI_COMPAT and making
sure that we either restore to the same setting of the two or restore
to CONFIG_OABI_COMPAT=y?
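
The rule proposed above, written out as a small sketch so the cases are
explicit. It encodes the proposal literally: restore is allowed when both
settings match, or when the restoring kernel has CONFIG_OABI_COMPAT=y.
struct ckpt_abi_info and check_abi() are hypothetical names, and the two
booleans stand in for the kernel configuration symbols.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical record of the ABI-related kernel configuration. */
struct ckpt_abi_info {
    bool aeabi;         /* CONFIG_AEABI */
    bool oabi_compat;   /* CONFIG_OABI_COMPAT */
};

/*
 * Return 0 if an image taken under 'saved' may be restored under
 * 'target', -EINVAL otherwise.
 */
static int check_abi(const struct ckpt_abi_info *saved,
                     const struct ckpt_abi_info *target)
{
    /* Identical configuration on both sides. */
    if (saved->aeabi == target->aeabi &&
        saved->oabi_compat == target->oabi_compat)
        return 0;

    /* A target with OABI compatibility accepts both syscall ABIs. */
    if (target->oabi_compat)
        return 0;

    return -EINVAL;
}
```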

>
> There's also the matter of FP implementation - whether it is VFP or FPA,
> and whether iwMMXt is available or not.  (iwMMXt precludes the use of
> FPA.)

I had a feeling this would be an issue, but I never dove into the
workings of FP on ARM. Can you give me some concrete pointers as to what
to checkpoint, and what to check on restart, so that a process using FP
can be restarted?

>
>> Regarding ThumbEE, the thumbee_state field on the thread_info is stored
>> in checkpoints when CONFIG_ARM_THUMBEE and 0 is stored otherwise. If
>> a value different than 0 is checkpointed and CONFIG_ARM_THUMBEE is not
>> set on the restore system, the restore is aborted. Feedback on this
>> implementation is very welcome.
>
> I don't recognise this configuration symbol; it doesn't exist in mainline.
>

I encountered it when looking at struct thread_info in
arch/arm/include/asm/thread_info.h, and had not seen it before. After
looking into it a little more, it's included in 2.6.33 and defined in
arch/arm/mm/Kconfig.

>> We checkpoint whether the system is running with CONFIG_MMU or not and
>> require the same configuration for the system on which we restore the
>> process. It might be possible to allow something more fine-grained,
>> if it's worth the energy. Input on this item is also very welcome,
>> specifically from someone who knows the exact meaning of the end_brk
>> field.
>
> Processes which run on MMU and non-MMU CPUs are unlikely to be
> interchangeable - the run time environments are quite different.  I
> think this is a sane check.
>
thanks.

>> +/* dump the thread_struct of a given task */
>> +int checkpoint_thread(struct ckpt_ctx *ctx, struct task_struct *t)
>> +{
>> +     int ret;
>> +     struct ckpt_hdr_thread *h;
>> +     struct thread_info *ti = task_thread_info(t);
>> +
>> +     h = ckpt_hdr_get_type(ctx, sizeof(*h), CKPT_HDR_THREAD);
>> +     if (!h)
>> +             return -ENOMEM;
>> +
>> +     /*
>> +      * Store the syscall information about the checkpointed process
>> +      * as we need to know if the process was doing a syscall (and which)
>> +      * during restart.
>> +      */
>> +     h->syscall = ti->syscall;
>> +
>> +     /*
>> +      * Store remaining thread-specific info.
>> +      */
>> +     h->tp_value = ti->tp_value;
>
> How do you safely obtain consistent information from a thread?  Do you
> temporarily stop it?
>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
       [not found]                 ` <7d08b87d1003241236n2b45e6f4ife36da841351df9d-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-03-25  1:11                   ` Matt Helsley
  0 siblings, 0 replies; 80+ messages in thread
From: Matt Helsley @ 2010-03-25  1:11 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Russell King - ARM Linux, Roland McGrath, containers,
	linux-kernel, linux-arm-kernel

On Wed, Mar 24, 2010 at 08:36:39PM +0100, Christoffer Dall wrote:
> On Wed, Mar 24, 2010 at 4:53 PM, Oren Laadan <orenl@cs.columbia.edu> wrote:
> >
> >
> > Matt Helsley wrote:
> >>
> >> On Wed, Mar 24, 2010 at 12:57:46AM -0400, Oren Laadan wrote:
> >>>
> >>> On Tue, 23 Mar 2010, Matt Helsley wrote:
> >>>
> >>>> On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux
> >>>> wrote:
> >>>>>
> >>>>> On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
> >>>>>>
> >>>>>> This small commit introduces a global state of system calls for ARM
> >>>>>> making it possible for a debugger or checkpointing to gain information
> >>>>>> about another process' state with respect to system calls.
> >>>>>
> >>>>> I don't particularly like the idea that we always store the syscall
> >>>>> number to memory for every system call, whether the stored version is
> >>>>> used or not.
> >>>>>
> >>>>> Since ARM caches are generally not write allocate, this means mostly
> >>>>> write-only variables can have a higher than expected expense.
> >>>>>
> >>>>> Is there not some thread flag which can be checked to see if we need to
> >>>>> store the syscall number?
> >>>>
> >>>> Perhaps before we freeze the task we can save the syscall number on ARM.
> >>>> The patches suggest that the signal delivery path -- which the freezer
> >>>> utilizes -- has the syscall number already.
> >
> > Actually, the signal path doesn't have the syscall number, it has
> > a binary "in syscall" value.
> >

Argh. I read too much into the name :(.

> 
> Well, this could be changed to pass the syscall number through
> registers along to try_to_freeze without any mentionable performance
> hit.

Yes, that's possible. I was thinking we could still use your thread info
field but only store to it when we know it will be useful for c/r rather
than for each syscall. Personally, I'd rather avoid passing the extra
parameter into try_to_freeze(). Your idea below seems better to me.
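
To make the "only store when it will be useful" idea concrete, here is a
C sketch of the conditional store. The real entry path is assembly in
entry-common.S, so this only illustrates the logic. TIF_CKPT_SYSCALL is a
hypothetical flag, not an existing TIF_* bit, and struct thread_info_sketch
is a stand-in for the real thread_info.

```c
#include <stdint.h>

/* Hypothetical flag: set on a task whose syscall number should be kept. */
#define TIF_CKPT_SYSCALL (1u << 0)

/* Stand-in for the two thread_info fields the sketch needs. */
struct thread_info_sketch {
    uint32_t flags;
    int syscall;        /* last recorded syscall number, -1 if none */
};

/*
 * Record the syscall number only for tasks that asked for it, so the
 * common path does a flag test instead of an unconditional store.
 */
static inline void record_syscall(struct thread_info_sketch *ti, int nr)
{
    if (ti->flags & TIF_CKPT_SYSCALL)
        ti->syscall = nr;
}
```

The open design question with this variant is who sets the flag and when:
it has to be set before the task enters the syscall it is later frozen in.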

> Re-using the assembly code or factoring it out so that it can be used
> from multiple places doesn't seem very pleasing to me, as the assembly
> code is in the critical path and written specifically for the context
> of a process entering the kernel. Please correct me if I'm wrong.
> 
> I imagine simply a function in C, more or less re-implementing the
> logic that's already in entry-common.S, might do the trick. I wouldn't
> worry much about the performance in this case as it will not be used
> often. The following _untested_ snippet illustrates my idea:
> 
> ---
>  arch/arm/include/asm/syscall.h |   93 +++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 92 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
> index 3b3248f..a7f2615 100644
> --- a/arch/arm/include/asm/syscall.h
> +++ b/arch/arm/include/asm/syscall.h
> @@ -10,10 +10,101 @@
>  #ifndef _ASM_ARM_SYSCALLS_H
>  #define _ASM_ARM_SYSCALLS_H
> 
> +static inline int get_swi_instruction(struct task_struct *task,
> +				      struct pt_regs *regs,
> +				      unsigned long *instr)
> +{
> +	struct page *page = NULL;
> +	unsigned long instr_addr;
> +	unsigned long *ptr;
> +	int ret;
> +
> +	instr_addr = regs->ARM_pc - 4;
> +
> +	down_read(&task->mm->mmap_sem);
> +	ret = get_user_pages(task, task->mm, instr_addr,
> +			     1, 0, 0, &page, NULL);
> +	up_read(&task->mm->mmap_sem);
> +
> +	if (ret < 0)
> +		return ret;
> +
> +	ptr = (unsigned long *)kmap_atomic(page, KM_USER1);
> +	memcpy(instr,
> +	       ptr + (instr_addr >> PAGE_SHIFT),
			^shouldn't this be:
		      instr_addr & PAGE_MASK

> +	       sizeof(unsigned long));
> +	kunmap_atomic(ptr, KM_USER1);
> +
> +	page_cache_release(page);
> +
> +	return 0;
> +}

(again, not familiar with ARM so my understanding is:

I guess swi is "syscall word immediate".

The syscall nr is embedded in the instruction as an immediate
value and you're getting a copy of that instruction using the value of
the pc register just after the syscall instruction was executed.)

Perhaps I am missing or forgetting something. Why isn't this as simple
as calling get_user() or even copy_from_user() using instr_addr?
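
For reference, a userspace sketch of the decoding step that
get_swi_instruction() feeds into. SWI is the ARM "software interrupt"
mnemonic. With the old ABI the syscall number is carried in the 24-bit
immediate, offset by __NR_OABI_SYSCALL_BASE (0x900000); with EABI the
immediate is 0 and the number is in r7. The sketch handles ARM-mode
instructions only and ignores Thumb; swi_syscall_nr() and the macro names
are made up for illustration. On the get_user() question: get_user() and
copy_from_user() read from the address space of current, so they only work
when the task being inspected is current. For another task, something like
access_process_vm() or the get_user_pages() approach above is needed.

```c
#include <stdbool.h>
#include <stdint.h>

#define ARM_SWI_OPCODE_MASK 0x0f000000u
#define ARM_SWI_OPCODE      0x0f000000u
#define ARM_SWI_IMM_MASK    0x00ffffffu
#define OABI_SYSCALL_BASE   0x00900000u /* __NR_OABI_SYSCALL_BASE */

/* True if instr is an ARM-mode SWI (condition field 0xf is excluded). */
static bool is_arm_swi(uint32_t instr)
{
    return (instr >> 28) != 0xfu &&
           (instr & ARM_SWI_OPCODE_MASK) == ARM_SWI_OPCODE;
}

/*
 * Return the syscall number for the SWI instruction 'instr', using r7
 * for the EABI case, or -1 if instr is not a recognisable syscall.
 */
static long swi_syscall_nr(uint32_t instr, uint32_t r7)
{
    uint32_t imm;

    if (!is_arm_swi(instr))
        return -1;

    imm = instr & ARM_SWI_IMM_MASK;
    if (imm == 0)
        return (long)r7;                /* EABI: number is in r7 */
    if ((imm & 0x00f00000u) != OABI_SYSCALL_BASE)
        return -1;                      /* not an OABI syscall */
    return (long)(imm & 0x000fffffu);   /* OABI: strip the base */
}
```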

Cheers,
	-Matt Helsley

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
       [not found]                   ` <20100325011132.GE5704-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
@ 2010-03-25  1:17                     ` Matt Helsley
  2010-03-25  1:35                     ` Oren Laadan
  1 sibling, 0 replies; 80+ messages in thread
From: Matt Helsley @ 2010-03-25  1:17 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Russell King - ARM Linux, containers, linux-kernel,
	Christoffer Dall, Roland McGrath, linux-arm-kernel

On Wed, Mar 24, 2010 at 06:11:32PM -0700, Matt Helsley wrote:
> On Wed, Mar 24, 2010 at 08:36:39PM +0100, Christoffer Dall wrote:
<snip>
> > Re-using the assembly code or factoring it out so that it can be used
> > from multiple places doesn't seem very pleasing to me, as the assembly
> > code is in the critical path and written specifically for the context
> > of a process entering the kernel. Please correct me if I'm wrong.
> > 
> > I imagine simply a function in C, more or less re-implementing the
> > logic that's already in entry-common.S, might do the trick. I wouldn't
> > worry much about the performance in this case as it will not be used
> > often. The following _untested_ snippet illustrates my idea:
> > 
> > ---
> >  arch/arm/include/asm/syscall.h |   93 +++++++++++++++++++++++++++++++++++++++-
> >  1 files changed, 92 insertions(+), 1 deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
> > index 3b3248f..a7f2615 100644
> > --- a/arch/arm/include/asm/syscall.h
> > +++ b/arch/arm/include/asm/syscall.h
> > @@ -10,10 +10,101 @@
> >  #ifndef _ASM_ARM_SYSCALLS_H
> >  #define _ASM_ARM_SYSCALLS_H
> > 
> > +static inline int get_swi_instruction(struct task_struct *task,
> > +				      struct pt_regs *regs,
> > +				      unsigned long *instr)
> > +{
> > +	struct page *page = NULL;
> > +	unsigned long instr_addr;
> > +	unsigned long *ptr;
> > +	int ret;
> > +
> > +	instr_addr = regs->ARM_pc - 4;
> > +
> > +	down_read(&task->mm->mmap_sem);
> > +	ret = get_user_pages(task, task->mm, instr_addr,
> > +			     1, 0, 0, &page, NULL);
> > +	up_read(&task->mm->mmap_sem);
> > +
> > +	if (ret < 0)
> > +		return ret;
> > +
> > +	ptr = (unsigned long *)kmap_atomic(page, KM_USER1);
> > +	memcpy(instr,
> > +	       ptr + (instr_addr >> PAGE_SHIFT),
> 			^shouldn't this be:
> 		      instr_addr & PAGE_MASK

Oops, made my own mistake. I think the address of the kmap'd instruction
would be:

	ptr + (instr_addr & ~PAGE_MASK)
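
To make the three candidate expressions concrete, here is a small userspace
sketch of the arithmetic. The DEMO_* macros are stand-ins that mirror the
kernel's PAGE_SHIFT/PAGE_SIZE/PAGE_MASK under an assumed 4 KiB page size,
and both helper names are hypothetical. One thing the sketch also shows,
as far as I can tell: since ptr in the snippet is an unsigned long *,
adding a byte offset to it directly would be scaled by
sizeof(unsigned long), so the offset probably wants to be applied to a
byte pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins mirroring the kernel macros; 4 KiB pages are assumed. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))

/* Byte offset of addr within its page (hypothetical helper). */
static unsigned long page_offset_of(unsigned long addr)
{
	return addr & ~DEMO_PAGE_MASK;
}

/*
 * Copy the 32-bit word that lives at address addr out of a buffer that
 * maps the page containing addr. The offset is added to a byte pointer,
 * so it is not scaled by the size of a wider pointee type.
 */
static uint32_t read_word_in_page(const void *page_base, unsigned long addr)
{
	unsigned long off = page_offset_of(addr);
	uint32_t word;

	/* The word must not straddle the end of the mapped page. */
	assert(off + sizeof(word) <= DEMO_PAGE_SIZE);
	memcpy(&word, (const unsigned char *)page_base + off, sizeof(word));
	return word;
}
```

For an address such as 0x8124, addr >> DEMO_PAGE_SHIFT gives the page
number (8), addr & DEMO_PAGE_MASK gives the page base (0x8000), and
addr & ~DEMO_PAGE_MASK gives the offset within the page (0x124), which is
the value needed to index into the kmap'd page.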

Cheers,
	-Matt Helsley

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
       [not found]                   ` <20100325011132.GE5704-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
  2010-03-25  1:17                     ` Matt Helsley
@ 2010-03-25  1:35                     ` Oren Laadan
  1 sibling, 0 replies; 80+ messages in thread
From: Oren Laadan @ 2010-03-25  1:35 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Russell King - ARM Linux, containers, linux-kernel,
	Christoffer Dall, linux-arm-kernel, Roland McGrath



Matt Helsley wrote:
> On Wed, Mar 24, 2010 at 08:36:39PM +0100, Christoffer Dall wrote:
>> On Wed, Mar 24, 2010 at 4:53 PM, Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org> wrote:
>>>
>>> Matt Helsley wrote:
>>>> On Wed, Mar 24, 2010 at 12:57:46AM -0400, Oren Laadan wrote:
>>>>> On Tue, 23 Mar 2010, Matt Helsley wrote:
>>>>>
>>>>>> On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux
>>>>>> wrote:
>>>>>>> On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
>>>>>>>> This small commit introduces a global state of system calls for ARM
>>>>>>>> making it possible for a debugger or checkpointing to gain information
>>>>>>>> about another process' state with respect to system calls.
>>>>>>> I don't particularly like the idea that we always store the syscall
>>>>>>> number to memory for every system call, whether the stored version is
>>>>>>> used or not.
>>>>>>>
>>>>>>> Since ARM caches are generally not write allocate, this means mostly
>>>>>>> write-only variables can have a higher than expected expense.
>>>>>>>
>>>>>>> Is there not some thread flag which can be checked to see if we need to
>>>>>>> store the syscall number?
>>>>>> Perhaps before we freeze the task we can save the syscall number on ARM.
>>>>>> The patches suggest that the signal delivery path -- which the freezer
>>>>>> utilizes -- has the syscall number already.
>>> Actually, the signal path doesn't have the syscall number, it has
>>> a binary "in syscall" value.
>>>
> 
> Argh. I read too much into the name :(.
> 
>> Well, this could be changed to pass the syscall number through
>> registers along to try_to_freeze without any mentionable performance
>> hit.
> 
> Yes, that's possible. I was thinking we could still use your thread info
> field but only store to it when we know it will be useful for c/r rather
> than for each syscall. Personally, I'd rather avoid passing the extra
> parameter into try_to_freeze(). Your idea below seems better to me.
> 
>> Re-using the assembly code or factoring it out so that it can be used
>> from multiple places doesn't seem very pleasing to me, as the assembly
>> code is in the critical path and written specifically for the context
>> of a process entering the kernel. Please correct me if I'm wrong.
>>
>> I imagine simply a function in C, more or less re-implementing the
>> logic that's already in entry-common.S, might do the trick. I wouldn't
>> worry much about the performance in this case as it will not be used
>> often. The following _untested_ snippet illustrates my idea:
>>
>> ---
>>  arch/arm/include/asm/syscall.h |   93 +++++++++++++++++++++++++++++++++++++++-
>>  1 files changed, 92 insertions(+), 1 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
>> index 3b3248f..a7f2615 100644
>> --- a/arch/arm/include/asm/syscall.h
>> +++ b/arch/arm/include/asm/syscall.h
>> @@ -10,10 +10,101 @@
>>  #ifndef _ASM_ARM_SYSCALLS_H
>>  #define _ASM_ARM_SYSCALLS_H
>>
>> +static inline int get_swi_instruction(struct task_struct *task,
>> +				      struct pt_regs *regs,
>> +				      unsigned long *instr)
>> +{
>> +	struct page *page = NULL;
>> +	unsigned long instr_addr;
>> +	unsigned long *ptr;
>> +	int ret;
>> +
>> +	instr_addr = regs->ARM_pc - 4;
>> +
>> +	down_read(&task->mm->mmap_sem);
>> +	ret = get_user_pages(task, task->mm, instr_addr,
>> +			     1, 0, 0, &page, NULL);
>> +	up_read(&task->mm->mmap_sem);
>> +
>> +	if (ret < 0)
>> +		return ret;
>> +
>> +	ptr = (unsigned long *)kmap_atomic(page, KM_USER1);
>> +	memcpy(instr,
>> +	       ptr + (instr_addr >> PAGE_SHIFT),
> 			^shouldn't this be:
> 		      instr_addr & PAGE_MASK
> 
>> +	       sizeof(unsigned long));
>> +	kunmap_atomic(ptr, KM_USER1);
>> +
>> +	page_cache_release(page);
>> +
>> +	return 0;
>> +}
> 
> (again, not familiar with ARM so my understanding is:
> 
> I guess swi is "syscall word immediate".
> 
> The syscall nr is embedded in the instruction as an immediate
> value and you're getting a copy of that instruction using the value of
> the pc register just after the syscall instruction was executed.)
> 
> Perhaps I am missing or forgetting something. Why isn't this as simple
> as calling get_user() or even copy_from_user() using instr_addr?

In c/r, we only need it at restart, when a task calls it on itself.

However, the get_syscall_nr() interface itself can be called by any
task on another task, and get_user()/copy_from_user() only access the
calling task's own address space.

(In fact, I think that, for the most part, saving the syscall number
at checkpoint time may be better than figuring it out at restart time.)

Oren.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
       [not found]                     ` <20100325011753.GF5704-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
@ 2010-03-25 10:29                       ` Christoffer Dall
  0 siblings, 0 replies; 80+ messages in thread
From: Christoffer Dall @ 2010-03-25 10:29 UTC (permalink / raw)
  To: Matt Helsley
  Cc: Russell King - ARM Linux, containers, linux-kernel,
	linux-arm-kernel, Roland McGrath

On Thu, Mar 25, 2010 at 2:17 AM, Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
> On Wed, Mar 24, 2010 at 06:11:32PM -0700, Matt Helsley wrote:
>> On Wed, Mar 24, 2010 at 08:36:39PM +0100, Christoffer Dall wrote:
> <snip>
>> > Re-using the assembly code or factoring it out so that it can be used
>> > from multiple places doesn't seem very pleasing to me, as the assembly
>> > code is in the critical path and written specifically for the context
>> > of a process entering the kernel. Please correct me if I'm wrong.
>> >
>> > I imagine simply a function in C, more or less re-implementing the
>> > logic that's already in entry-common.S, might do the trick. I wouldn't
>> > worry much about the performance in this case as it will not be used
>> > often. The following _untested_ snippet illustrates my idea:
>> >
>> > ---
>> >  arch/arm/include/asm/syscall.h |   93 +++++++++++++++++++++++++++++++++++++++-
>> >  1 files changed, 92 insertions(+), 1 deletions(-)
>> >
>> > diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
>> > index 3b3248f..a7f2615 100644
>> > --- a/arch/arm/include/asm/syscall.h
>> > +++ b/arch/arm/include/asm/syscall.h
>> > @@ -10,10 +10,101 @@
>> >  #ifndef _ASM_ARM_SYSCALLS_H
>> >  #define _ASM_ARM_SYSCALLS_H
>> >
>> > +static inline int get_swi_instruction(struct task_struct *task,
>> > +                                 struct pt_regs *regs,
>> > +                                 unsigned long *instr)
>> > +{
>> > +   struct page *page = NULL;
>> > +   unsigned long instr_addr;
>> > +   unsigned long *ptr;
>> > +   int ret;
>> > +
>> > +   instr_addr = regs->ARM_pc - 4;
>> > +
>> > +   down_read(&task->mm->mmap_sem);
>> > +   ret = get_user_pages(task, task->mm, instr_addr,
>> > +                        1, 0, 0, &page, NULL);
>> > +   up_read(&task->mm->mmap_sem);
>> > +
>> > +   if (ret < 0)
>> > +           return ret;
>> > +
>> > +   ptr = (unsigned long *)kmap_atomic(page, KM_USER1);
>> > +   memcpy(instr,
>> > +          ptr + (instr_addr >> PAGE_SHIFT),
>>                       ^shouldn't this be:
>>                     instr_addr & PAGE_MASK
>
> Oops, made my own mistake. I think the address of the kmap'd instruction
> would be:
>
>        ptr + (instr_addr & ~PAGE_MASK)
>

Yes. Thanks for pointing it out.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 1/3] ARM: Rudimentary syscall interfaces
       [not found]                     ` <4BAABDF4.8070904-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
@ 2010-03-25 10:34                       ` Christoffer Dall
  0 siblings, 0 replies; 80+ messages in thread
From: Christoffer Dall @ 2010-03-25 10:34 UTC (permalink / raw)
  To: Oren Laadan
  Cc: linux-arm-kernel, Russell King - ARM Linux, containers,
	linux-kernel, Roland McGrath

On Thu, Mar 25, 2010 at 2:35 AM, Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org> wrote:
>
>
> Matt Helsley wrote:
>>
>> On Wed, Mar 24, 2010 at 08:36:39PM +0100, Christoffer Dall wrote:
>>>
>>> On Wed, Mar 24, 2010 at 4:53 PM, Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
>>> wrote:
>>>>
>>>> Matt Helsley wrote:
>>>>>
>>>>> On Wed, Mar 24, 2010 at 12:57:46AM -0400, Oren Laadan wrote:
>>>>>>
>>>>>> On Tue, 23 Mar 2010, Matt Helsley wrote:
>>>>>>
>>>>>>> On Tue, Mar 23, 2010 at 08:53:42PM +0000, Russell King - ARM Linux
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On Sun, Mar 21, 2010 at 09:06:03PM -0400, Christoffer Dall wrote:
>>>>>>>>>
>>>>>>>>> This small commit introduces a global state of system calls for ARM
>>>>>>>>> making it possible for a debugger or checkpointing to gain
>>>>>>>>> information
>>>>>>>>> about another process' state with respect to system calls.
>>>>>>>>
>>>>>>>> I don't particularly like the idea that we always store the syscall
>>>>>>>> number to memory for every system call, whether the stored version
>>>>>>>> is
>>>>>>>> used or not.
>>>>>>>>
>>>>>>>> Since ARM caches are generally not write allocate, this means mostly
>>>>>>>> write-only variables can have a higher than expected expense.
>>>>>>>>
>>>>>>>> Is there not some thread flag which can be checked to see if we need
>>>>>>>> to
>>>>>>>> store the syscall number?
>>>>>>>
>>>>>>> Perhaps before we freeze the task we can save the syscall number on
>>>>>>> ARM.
>>>>>>> The patches suggest that the signal delivery path -- which the
>>>>>>> freezer
>>>>>>> utilizes -- has the syscall number already.
>>>>
>>>> Actually, the signal path doesn't have the syscall number, it has
>>>> a binary "in syscall" value.
>>>>
>>
>> Argh. I read too much into the name :(.
>>
>>> Well, this could be changed to pass the syscall number through
>>> registers along to try_to_freeze without any mentionable performance
>>> hit.
>>
>> Yes, that's possible. I was thinking we could still use your thread info
>> field but only store to it when we know it will be useful for c/r rather
>> than for each syscall. Personally, I'd rather avoid passing the extra
>> parameter into try_to_freeze(). Your idea below seems better to me.
>>
>>> Re-using the assembly code or factoring it out so that it can be used
>>> from multiple places doesn't seem very pleasing to me, as the assembly
>>> code is in the critical path and written specifically for the context
>>> of a process entering the kernel. Please correct me if I'm wrong.
>>>
>>> I imagine simply a function in C, more or less re-implementing the
>>> logic that's already in entry-common.S, might do the trick. I wouldn't
>>> worry much about the performance in this case as it will not be used
>>> often. The following _untested_ snippet illustrates my idea:
>>>
>>> ---
>>>  arch/arm/include/asm/syscall.h |   93 +++++++++++++++++++++++++++++++++++++++-
>>>  1 files changed, 92 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
>>> index 3b3248f..a7f2615 100644
>>> --- a/arch/arm/include/asm/syscall.h
>>> +++ b/arch/arm/include/asm/syscall.h
>>> @@ -10,10 +10,101 @@
>>>  #ifndef _ASM_ARM_SYSCALLS_H
>>>  #define _ASM_ARM_SYSCALLS_H
>>>
>>> +static inline int get_swi_instruction(struct task_struct *task,
>>> +                                     struct pt_regs *regs,
>>> +                                     unsigned long *instr)
>>> +{
>>> +       struct page *page = NULL;
>>> +       unsigned long instr_addr;
>>> +       unsigned long *ptr;
>>> +       int ret;
>>> +
>>> +       instr_addr = regs->ARM_pc - 4;
>>> +
>>> +       down_read(&task->mm->mmap_sem);
>>> +       ret = get_user_pages(task, task->mm, instr_addr,
>>> +                            1, 0, 0, &page, NULL);
>>> +       up_read(&task->mm->mmap_sem);
>>> +
>>> +       if (ret < 0)
>>> +               return ret;
>>> +
>>> +       ptr = (unsigned long *)kmap_atomic(page, KM_USER1);
>>> +       memcpy(instr,
>>> +              ptr + (instr_addr >> PAGE_SHIFT),
>>
>>                        ^shouldn't this be:
>>                      instr_addr & PAGE_MASK
>>
>>> +              sizeof(unsigned long));
>>> +       kunmap_atomic(ptr, KM_USER1);
>>> +
>>> +       page_cache_release(page);
>>> +
>>> +       return 0;
>>> +}
>>
>> (again, not familiar with ARM so my understanding is:
>>
>> I guess swi is "syscall word immediate".
>>
>> The syscall nr is embedded in the instruction as an immediate
>> value and you're getting a copy of that instruction using the value of
>> the pc register just after the syscall instruction was executed.)
>>
>> Perhaps I am missing or forgetting something. Why isn't this as simple
>> as calling get_user() or even copy_from_user() using instr_addr?
>
> In c/r, we only need it at restart when a task calls it on itself.
>
> However the interface itself of get_syscall_nr() can be called by
> any task on another task.
>
> (In fact, I think that for the most part, saving the syscall number
> at checkpoint time may be better than figuring out at restart time).
>

So, as Oren is saying, the point was to make syscall_get_nr() work
according to the interface specified in
include/asm-generic/syscall.h.
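
For reference, the lookup that the quoted get_swi_instruction() is aiming
at can be sketched in plain user-space C. This is only an illustration
under stated assumptions: 4 KiB pages, ARM (not Thumb) mode, and the usual
conventions where OABI encodes 0x900000 + nr in the 24-bit SWI (software
interrupt) immediate while EABI issues "swi 0" and passes the number in r7.
All sketch_* names are made up for this example and are not part of the
patch.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_OABI_BASE  0x900000u	/* assumed OABI syscall base */

/*
 * Byte offset of addr inside its page. This is a mask with
 * PAGE_SIZE - 1; shifting right by PAGE_SHIFT would give the page
 * frame number instead.
 */
static unsigned long sketch_offset_in_page(unsigned long addr)
{
	return addr & (SKETCH_PAGE_SIZE - 1);
}

/* Address of the SWI that trapped, given the saved user pc (ARM mode). */
static unsigned long sketch_swi_addr(unsigned long pc)
{
	return pc - 4;
}

/* Syscall number from the 32-bit SWI instruction word and the saved r7. */
static long sketch_syscall_nr(uint32_t swi_instr, uint32_t r7)
{
	uint32_t imm = swi_instr & 0x00ffffffu;	/* drop cond + opcode bits */

	if (imm == 0)
		return (long)r7;		/* EABI: number lives in r7 */
	return (long)(imm ^ SKETCH_OABI_BASE);	/* OABI: strip the base */
}
```

In the kernel, PAGE_MASK is ~(PAGE_SIZE - 1), so the in-page offset would be
written instr_addr & ~PAGE_MASK. It is a byte offset, so it would have to be
added to a byte pointer rather than to an unsigned long pointer.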

Considering it's unknown how we will deal with checkpoint/restart
across CONFIG_ARM_THUMB, CONFIG_OABI_COMPAT etc., I also think it's a
better idea to save the syscall number at checkpoint time and, for the
restore, to place architecture-specific hooks that return the saved
number instead of calling syscall_get_nr() directly. This way we should
always be able to get the syscall number and restart correctly,
independently of whatever tricks we use to checkpoint/restart across
configuration settings - if any.
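
A minimal sketch of that split, assuming a per-task record in the
checkpoint image and an arch hook on the restart side. The struct layouts,
field names and function names below are hypothetical and only illustrate
the idea; they are not taken from the ckpt-v20 patch set.

```c
#include <stdint.h>

/* Hypothetical per-task record written into the checkpoint image. */
struct sketch_ckpt_cpu {
	int32_t syscall_nr;	/* -1 when the task was not in a syscall */
};

/* Hypothetical stand-in for the task state that matters here. */
struct sketch_task {
	int in_syscall;
	int32_t cur_syscall_nr;
	int32_t restart_syscall_nr;	/* filled in at restart */
};

/* Checkpoint side: record the number once, while the task is frozen. */
static void sketch_checkpoint_syscall(const struct sketch_task *t,
				      struct sketch_ckpt_cpu *h)
{
	h->syscall_nr = t->in_syscall ? t->cur_syscall_nr : -1;
}

/*
 * Restart side: the arch hook hands back the saved number instead of
 * re-deriving it from the instruction stream or registers.
 */
static int32_t sketch_arch_restart_syscall_nr(struct sketch_task *t,
					      const struct sketch_ckpt_cpu *h)
{
	t->restart_syscall_nr = h->syscall_nr;
	return t->restart_syscall_nr;
}
```

The restart path then never needs to know whether the number originally
came from an SWI immediate or from r7.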

Best,
Christoffer

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 3/3] c/r: ARM implementation of checkpoint/restart
       [not found]       ` <7d08b87d1003241348g347f092k1142318490e0bdcc-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-03-26  2:47         ` Jamie Lokier
  0 siblings, 0 replies; 80+ messages in thread
From: Jamie Lokier @ 2010-03-26  2:47 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: containers, Russell King - ARM Linux, linux-kernel, linux-arm-kernel

Christoffer Dall wrote:
> > That doesn't indicate what ISA version the system is running on, or even
> > if the ABI is compatible (we have two ABIs - OABI and EABI).
> 
> That's why I checkpointed CONFIG_OABI_COMPAT, but I realize that it's
> not sufficient.
> 
> How about checkpointing CONFIG_AEABI and CONFIG_OABI_COMPAT and making
> sure that we either restore to the same setting of the two or restore
> to CONFIG_OABI_COMPAT=y?

With CONFIG_OABI_COMPAT enabled, each process can be in either
personality: OABI or EABI.  Checkpointing will need to remember which
one.

With CONFIG_OABI_COMPAT disabled, it'll be fixed at one or the other,
but there's no reason why a process should not be moved between
kernels with different values of CONFIG_OABI_COMPAT, so long as the
OABI or EABI personality is supported by the destination kernel.

In other words, CONFIG_OABI_COMPAT shouldn't be in the checkpoint
state at all - only the per-process personalities should be.
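
One way to read this suggestion, as a rough sketch: the image records the
ABI of each task, and the restoring kernel only checks whether it can run
that ABI. The names below are hypothetical. The mapping from kernel
configuration to capabilities is also an assumption (CONFIG_AEABI=y alone
gives EABI only, CONFIG_AEABI=y with CONFIG_OABI_COMPAT=y gives both, and
CONFIG_AEABI=n gives OABI only).

```c
/* Hypothetical per-task ABI tag stored in the checkpoint image. */
enum sketch_abi {
	SKETCH_ABI_OABI = 1,
	SKETCH_ABI_EABI = 2,
};

struct sketch_ckpt_task_abi {
	unsigned char abi;	/* one of enum sketch_abi */
};

/* What the restoring kernel is able to run (derived from its config). */
struct sketch_kernel_caps {
	int has_oabi;
	int has_eabi;
};

/* Returns 1 if the task's ABI is supported by the destination kernel. */
static int sketch_can_restore(const struct sketch_ckpt_task_abi *h,
			      const struct sketch_kernel_caps *k)
{
	switch (h->abi) {
	case SKETCH_ABI_OABI:
		return k->has_oabi;
	case SKETCH_ABI_EABI:
		return k->has_eabi;
	default:
		return 0;	/* unknown tag: refuse rather than guess */
	}
}
```

With this shape, CONFIG_OABI_COMPAT never appears in the image; it only
influences how the destination kernel fills in its capabilities.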

> >> We checkpoint whether the system is running with CONFIG_MMU or not and
> >> require the same configuration for the system on which we restore the
> >> process. It might be possible to allow something more fine-grained,
> >> if it's worth the energy. Input on this item is also very welcome,
> >> specifically from someone who knows the exact meaning of the end_brk
> >> field.
> >
> > Processes which run on MMU and non-MMU CPUs are unlikely to be
> > interchangeable - the run time environments are quite different.  I
> > think this is a sane check.
> >
> thanks.

It's possible in principle to run many non-MMU binaries on MMU
kernels, but I've never heard of anyone doing it.

-- Jamie

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 3/3] c/r: ARM implementation of checkpoint/restart
       [not found]         ` <20100326024759.GN19308-yetKDKU6eevNLxjTenLetw@public.gmane.org>
@ 2010-03-26  3:02           ` Paul Mundt
  0 siblings, 0 replies; 80+ messages in thread
From: Paul Mundt @ 2010-03-26  3:02 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: containers, Christoffer Dall, Russell King - ARM Linux,
	linux-kernel, linux-arm-kernel

On Fri, Mar 26, 2010 at 02:47:59AM +0000, Jamie Lokier wrote:
> Christoffer Dall wrote:
> > >> We checkpoint whether the system is running with CONFIG_MMU or not and
> > >> require the same configuration for the system on which we restore the
> > >> process. It might be possible to allow something more fine-grained,
> > >> if it's worth the energy. Input on this item is also very welcome,
> > >> specifically from someone who knows the exact meaning of the end_brk
> > >> field.
> > >
> > > Processes which run on MMU and non-MMU CPUs are unlikely to be
> > > > interchangeable - the run time environments are quite different.  I
> > > think this is a sane check.
> > >
> > thanks.
> 
> It's possible in principle to run many non-MMU binaries on MMU
> kernels, but I've never heard of anyone doing it.
> 
FDPIC supports running the same binaries with or without MMU depending on
your ABI, it's not really that uncommon, even if it's mostly just used
for prototyping.

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 3/3] c/r: ARM implementation of checkpoint/restart
       [not found]           ` <20100326030208.GA5815-M7jkjyW5wf5g9hUCZPvPmw@public.gmane.org>
@ 2010-03-26  3:55             ` Jamie Lokier
  2010-03-28 22:55             ` Christoffer Dall
  1 sibling, 0 replies; 80+ messages in thread
From: Jamie Lokier @ 2010-03-26  3:55 UTC (permalink / raw)
  To: Paul Mundt
  Cc: containers, Christoffer Dall, Russell King - ARM Linux,
	linux-kernel, linux-arm-kernel

Paul Mundt wrote:
> FDPIC supports running the same binaries with or without MMU depending on
> your ABI, it's not really that uncommon, even if it's mostly just used
> for prototyping.

Thanks - I didn't know anyone actually did it :-)

But I can see the value for product rollouts of some binaries onto a
mixture of hardware, too.

bFLT (flat) binaries should be runnable with an MMU too.

-- Jamie

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [C/R ARM][PATCH 3/3] c/r: ARM implementation of checkpoint/restart
       [not found]           ` <20100326030208.GA5815-M7jkjyW5wf5g9hUCZPvPmw@public.gmane.org>
  2010-03-26  3:55             ` Jamie Lokier
@ 2010-03-28 22:55             ` Christoffer Dall
  1 sibling, 0 replies; 80+ messages in thread
From: Christoffer Dall @ 2010-03-28 22:55 UTC (permalink / raw)
  To: Paul Mundt
  Cc: Russell King - ARM Linux, containers, Jamie Lokier, linux-kernel,
	linux-arm-kernel

On Fri, Mar 26, 2010 at 5:02 AM, Paul Mundt <lethal@linux-sh.org> wrote:
> On Fri, Mar 26, 2010 at 02:47:59AM +0000, Jamie Lokier wrote:
>>
>> It's possible in principle to run many non-MMU binaries on MMU
>> kernels, but I've never heard of anyone doing it.
>>
> FDPIC supports running the same binaries with or without MMU depending on
> your ABI, it's not really that uncommon, even if it's mostly just used
> for prototyping.
>
I would imagine that a restart will fail anyway when restoring an MMU
process on a non-MMU kernel. However, as you suggest, the other way
around should be possible. Thanks for clearing that up.
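
As a sketch of what such an asymmetric check could look like (the names
are hypothetical, and the policy is simply the one described above rather
than anything the posted patch implements):

```c
/* Hypothetical verdicts for the MMU compatibility check at restart. */
enum sketch_mmu_verdict {
	SKETCH_MMU_OK,		/* same configuration on both sides */
	SKETCH_MMU_MAYBE,	/* non-MMU image on an MMU kernel */
	SKETCH_MMU_REFUSE,	/* MMU image on a non-MMU kernel */
};

/*
 * image_had_mmu: whether the checkpointing kernel had CONFIG_MMU.
 * kernel_has_mmu: whether the restoring kernel has CONFIG_MMU.
 */
static enum sketch_mmu_verdict sketch_mmu_check(int image_had_mmu,
						int kernel_has_mmu)
{
	if (!!image_had_mmu == !!kernel_has_mmu)
		return SKETCH_MMU_OK;
	if (image_had_mmu)
		return SKETCH_MMU_REFUSE;
	return SKETCH_MMU_MAYBE;	/* may need extra fix-ups at restart */
}
```

The "maybe" case is where per-context fields such as end_brk would need a
defined meaning before a restart could be attempted.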

Specifically, do you know the meaning of the end_brk field in the
mm_context_t struct, and whether I need to checkpoint it for non-MMU
systems (and potentially do something more clever during restart on
an MMU kernel)?

^ permalink raw reply	[flat|nested] 80+ messages in thread
