* [PATCH v3 0/3] Checkpoint Support for Syscall User Dispatch
@ 2023-01-20 14:43 Gregory Price
  2023-01-20 14:43 ` [PATCH v3 1/3] ptrace,syscall_user_dispatch: Implement Syscall User Dispatch Suspension Gregory Price
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Gregory Price @ 2023-01-20 14:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fsdevel, linux-doc, linux-kselftest, krisman, tglx, luto,
	oleg, peterz, ebiederm, akpm, adobriyan, corbet, shuah,
	Gregory Price

v3: Kernel test robot static function fix
    Whitespace nitpicks

v2: Implements the getter/setter interface in ptrace rather than prctl

Syscall user dispatch makes it possible to cleanly intercept system
calls from user-land.  However, most transparent checkpoint software
presently leverages some combination of ptrace and system call
injection to place software in a ready-to-checkpoint state.

If Syscall User Dispatch is enabled when the task is quiesced,
injected system calls will subsequently be intercepted and
dispatched to the task's signal handler.

This patch set implements 3 features to enable software such as CRIU
to cleanly interpose upon software leveraging syscall user dispatch.

- Implement PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH, akin to a similar
  feature for SECCOMP.  This allows a ptracer to temporarily disable
  syscall user dispatch, making syscall injection possible.

- Implement an fs/proc extension that reports whether Syscall User
  Dispatch is being used in proc/status.  A similar value is present
  for SECCOMP, and is used to determine whether special logic is
  needed during checkpoint/resume.

- Implement a getter interface for Syscall User Dispatch config info.
  To resume successfully, the checkpoint/resume software has to
  save and restore this information.  Presently this configuration
  is write-only, with no way for C/R software to save it.

  This was done in ptrace because syscall user dispatch is not part of
  uapi. The syscall_user_dispatch_config structure was added to the
  ptrace exports.


Gregory Price (3):
  ptrace,syscall_user_dispatch: Implement Syscall User Dispatch
    Suspension
  fs/proc/array: Add Syscall User Dispatch to proc status
  ptrace,syscall_user_dispatch: add a getter/setter for sud
    configuration

 .../admin-guide/syscall-user-dispatch.rst     |  5 +-
 fs/proc/array.c                               |  8 +++
 include/linux/ptrace.h                        |  2 +
 include/linux/syscall_user_dispatch.h         | 19 +++++++
 include/uapi/linux/ptrace.h                   | 16 +++++-
 kernel/entry/syscall_user_dispatch.c          | 54 +++++++++++++++++++
 kernel/ptrace.c                               | 13 +++++
 7 files changed, 115 insertions(+), 2 deletions(-)

-- 
2.39.0



* [PATCH v3 1/3] ptrace,syscall_user_dispatch: Implement Syscall User Dispatch Suspension
  2023-01-20 14:43 [PATCH v3 0/3] Checkpoint Support for Syscall User Dispatch Gregory Price
@ 2023-01-20 14:43 ` Gregory Price
  2023-01-20 15:22   ` Oleg Nesterov
  2023-01-20 14:43 ` [PATCH v3 2/3] fs/proc/array: Add Syscall User Dispatch to proc status Gregory Price
  2023-01-20 14:43 ` [PATCH v3 3/3] ptrace,syscall_user_dispatch: add a getter/setter for sud configuration Gregory Price
  2 siblings, 1 reply; 9+ messages in thread
From: Gregory Price @ 2023-01-20 14:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fsdevel, linux-doc, linux-kselftest, krisman, tglx, luto,
	oleg, peterz, ebiederm, akpm, adobriyan, corbet, shuah,
	Gregory Price

Add PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH to ptrace options, and
modify Syscall User Dispatch to suspend interception when the option
is set.

This is modeled after the SUSPEND_SECCOMP feature, which suspends
SECCOMP interposition.  Without it, system calls that software like
CRIU injects into a process are intercepted by Syscall User Dispatch,
either causing a crash (due to blocked signals) or delivering those
signals to a ptracer (not the intended behavior).

Since Syscall User Dispatch is not a privileged feature, a check
for permissions is not required.  However, attempting to set this
option when CONFIG_CHECKPOINT_RESTORE is not enabled should be
disallowed, as its intended use is checkpoint/resume.
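
As a rough illustration of the check described above, the following
userspace model mirrors the option validation this patch intends.  It
is a sketch only: the SKETCH_* names and sketch_check_options() are
hypothetical stand-ins (the bit values are taken from the uapi hunk
below), the checkpoint_restore flag stands in for
IS_ENABLED(CONFIG_CHECKPOINT_RESTORE), and the existing SECCOMP check
is left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Option bits as they appear in this patch's uapi hunk (names are hypothetical). */
#define SKETCH_O_EXITKILL                      (1UL << 20)
#define SKETCH_O_SUSPEND_SECCOMP               (1UL << 21)
#define SKETCH_O_SUSPEND_SYSCALL_USER_DISPATCH (1UL << 22)
#define SKETCH_O_MASK (0x000000ffUL | SKETCH_O_EXITKILL | \
		       SKETCH_O_SUSPEND_SECCOMP | \
		       SKETCH_O_SUSPEND_SYSCALL_USER_DISPATCH)

/*
 * Userspace model of the intended check: unknown bits are rejected, and
 * the new suspend bit is rejected when checkpoint/restore support is
 * compiled out.  checkpoint_restore is an assumption standing in for
 * IS_ENABLED(CONFIG_CHECKPOINT_RESTORE).
 */
int sketch_check_options(unsigned long data, bool checkpoint_restore)
{
	if (data & ~SKETCH_O_MASK)
		return -EINVAL;

	if ((data & SKETCH_O_SUSPEND_SYSCALL_USER_DISPATCH) &&
	    !checkpoint_restore)
		return -EINVAL;

	return 0;
}
```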

Signed-off-by: Gregory Price <gregory.price@memverge.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/ptrace.h               | 2 ++
 include/uapi/linux/ptrace.h          | 6 +++++-
 kernel/entry/syscall_user_dispatch.c | 5 +++++
 kernel/ptrace.c                      | 4 ++++
 4 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index eaaef3ffec22..461ae5c99d57 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -45,6 +45,8 @@ extern int ptrace_access_vm(struct task_struct *tsk, unsigned long addr,
 
 #define PT_EXITKILL		(PTRACE_O_EXITKILL << PT_OPT_FLAG_SHIFT)
 #define PT_SUSPEND_SECCOMP	(PTRACE_O_SUSPEND_SECCOMP << PT_OPT_FLAG_SHIFT)
+#define PT_SUSPEND_SYSCALL_USER_DISPATCH \
+	(PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH << PT_OPT_FLAG_SHIFT)
 
 extern long arch_ptrace(struct task_struct *child, long request,
 			unsigned long addr, unsigned long data);
diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index 195ae64a8c87..ba9e3f19a22c 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -146,9 +146,13 @@ struct ptrace_rseq_configuration {
 /* eventless options */
 #define PTRACE_O_EXITKILL		(1 << 20)
 #define PTRACE_O_SUSPEND_SECCOMP	(1 << 21)
+#define PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH	(1 << 22)
 
 #define PTRACE_O_MASK		(\
-	0x000000ff | PTRACE_O_EXITKILL | PTRACE_O_SUSPEND_SECCOMP)
+	0x000000ff | \
+	PTRACE_O_EXITKILL | \
+	PTRACE_O_SUSPEND_SECCOMP | \
+	PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH)
 
 #include <asm/ptrace.h>
 
diff --git a/kernel/entry/syscall_user_dispatch.c b/kernel/entry/syscall_user_dispatch.c
index 0b6379adff6b..b5ec75164805 100644
--- a/kernel/entry/syscall_user_dispatch.c
+++ b/kernel/entry/syscall_user_dispatch.c
@@ -8,6 +8,7 @@
 #include <linux/uaccess.h>
 #include <linux/signal.h>
 #include <linux/elf.h>
+#include <linux/ptrace.h>
 
 #include <linux/sched/signal.h>
 #include <linux/sched/task_stack.h>
@@ -36,6 +37,10 @@ bool syscall_user_dispatch(struct pt_regs *regs)
 	struct syscall_user_dispatch *sd = &current->syscall_dispatch;
 	char state;
 
+	if (IS_ENABLED(CONFIG_CHECKPOINT_RESTORE) &&
+	    unlikely(current->ptrace & PT_SUSPEND_SYSCALL_USER_DISPATCH))
+		return false;
+
 	if (likely(instruction_pointer(regs) - sd->offset < sd->len))
 		return false;
 
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 54482193e1ed..99467ba5f55b 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -370,6 +370,10 @@ static int check_ptrace_options(unsigned long data)
 	if (data & ~(unsigned long)PTRACE_O_MASK)
 		return -EINVAL;
 
+	if (unlikely(data & PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH) &&
+	    (!IS_ENABLED(CONFIG_CHECKPOINT_RESTART)))
+			return -EINVAL;
+
 	if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
 		if (!IS_ENABLED(CONFIG_CHECKPOINT_RESTORE) ||
 		    !IS_ENABLED(CONFIG_SECCOMP))
-- 
2.39.0



* [PATCH v3 2/3] fs/proc/array: Add Syscall User Dispatch to proc status
  2023-01-20 14:43 [PATCH v3 0/3] Checkpoint Support for Syscall User Dispatch Gregory Price
  2023-01-20 14:43 ` [PATCH v3 1/3] ptrace,syscall_user_dispatch: Implement Syscall User Dispatch Suspension Gregory Price
@ 2023-01-20 14:43 ` Gregory Price
  2023-01-20 14:43 ` [PATCH v3 3/3] ptrace,syscall_user_dispatch: add a getter/setter for sud configuration Gregory Price
  2 siblings, 0 replies; 9+ messages in thread
From: Gregory Price @ 2023-01-20 14:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fsdevel, linux-doc, linux-kselftest, krisman, tglx, luto,
	oleg, peterz, ebiederm, akpm, adobriyan, corbet, shuah,
	Gregory Price

If a dispatch selector has been configured for Syscall User Dispatch,
report Syscall User Dispatch as configured in proc/status.

This provides an indicator to userland checkpoint/restart software that
it must manage special signal conditions (similar to SECCOMP).
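
For illustration, checkpoint software could read the new field roughly
as follows.  This is a minimal sketch, not part of the patch: the
function name sketch_parse_sud_status() is hypothetical, the field
name is taken from the seq_put_decimal_ull() call below, and the
caller is assumed to have already read the status file into a
NUL-terminated buffer.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Scan the text of a /proc/<pid>/status file for the field this patch
 * adds.  Returns the reported value (0 or 1), or -1 if the field is
 * absent, for example on a kernel without this patch.
 */
int sketch_parse_sud_status(const char *status_text)
{
	static const char key[] = "Syscall_user_dispatch:";
	const char *line = status_text;

	while (line && *line) {
		if (strncmp(line, key, sizeof(key) - 1) == 0)
			return (int)strtol(line + sizeof(key) - 1, NULL, 10);

		line = strchr(line, '\n');
		if (line)
			line++;	/* step past the newline to the next line */
	}

	return -1;
}
```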

Signed-off-by: Gregory Price <gregory.price@memverge.com>
---
 fs/proc/array.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 49283b8103c7..c85cdb4c137c 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -428,6 +428,13 @@ static inline void task_thp_status(struct seq_file *m, struct mm_struct *mm)
 	seq_printf(m, "THP_enabled:\t%d\n", thp_enabled);
 }
 
+static inline void task_syscall_user_dispatch(struct seq_file *m,
+						struct task_struct *p)
+{
+	seq_put_decimal_ull(m, "\nSyscall_user_dispatch:\t",
+			    (p->syscall_dispatch.selector != NULL));
+}
+
 int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
 			struct pid *pid, struct task_struct *task)
 {
@@ -451,6 +458,7 @@ int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
 	task_cpus_allowed(m, task);
 	cpuset_task_status_allowed(m, task);
 	task_context_switch_counts(m, task);
+	task_syscall_user_dispatch(m, task);
 	return 0;
 }
 
-- 
2.39.0



* [PATCH v3 3/3] ptrace,syscall_user_dispatch: add a getter/setter for sud configuration
  2023-01-20 14:43 [PATCH v3 0/3] Checkpoint Support for Syscall User Dispatch Gregory Price
  2023-01-20 14:43 ` [PATCH v3 1/3] ptrace,syscall_user_dispatch: Implement Syscall User Dispatch Suspension Gregory Price
  2023-01-20 14:43 ` [PATCH v3 2/3] fs/proc/array: Add Syscall User Dispatch to proc status Gregory Price
@ 2023-01-20 14:43 ` Gregory Price
  2023-01-20 16:12   ` [lkp] [+309 bytes kernel size regression] [i386-tinyconfig] [9a08e0b054] " kernel test robot
  2023-01-21  3:18   ` [PATCH v3 3/3] " Andrei Vagin
  2 siblings, 2 replies; 9+ messages in thread
From: Gregory Price @ 2023-01-20 14:43 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fsdevel, linux-doc, linux-kselftest, krisman, tglx, luto,
	oleg, peterz, ebiederm, akpm, adobriyan, corbet, shuah,
	Gregory Price

Implement ptrace getter/setter interface for syscall user dispatch.

Presently, these settings are write-only via prctl, making it impossible
to implement transparent checkpoint (coordination with the software is
required).

This is modeled after a similar interface for SECCOMP, which can have
its configuration dumped by ptrace for software like CRIU.
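
A tracer would call the getter roughly as sketched below.  The request
number and structure layout are copied from this version of the patch
and are not taken from any released header; the SKETCH_/sketch_ names
are hypothetical.  As in the patch, the addr argument carries the
structure size and the data argument points at the buffer.

```c
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Request number as proposed in this patch (hypothetical local name). */
#define SKETCH_PTRACE_GET_SUD_CONFIG 0x4211

/* Local mirror of the structure proposed in this patch. */
struct sketch_sud_config {
	uint64_t mode;
	int8_t  *selector;
	uint64_t offset;
	uint64_t len;
	uint8_t  on_dispatch;
};

/*
 * Ask the kernel for a tracee's dispatch configuration.
 * Returns 0 on success, or -1 with errno set (for example when pid is
 * not a stopped tracee of the caller, or the kernel lacks this request).
 */
long sketch_sud_get_config(pid_t pid, struct sketch_sud_config *cfg)
{
	return ptrace(SKETCH_PTRACE_GET_SUD_CONFIG, pid,
		      (void *)sizeof(*cfg), cfg);
}
```

The setter would be called the same way with the 0x4210 request
number proposed in this patch.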

Signed-off-by: Gregory Price <gregory.price@memverge.com>
---
 .../admin-guide/syscall-user-dispatch.rst     |  5 +-
 include/linux/syscall_user_dispatch.h         | 19 +++++++
 include/uapi/linux/ptrace.h                   | 10 ++++
 kernel/entry/syscall_user_dispatch.c          | 49 +++++++++++++++++++
 kernel/ptrace.c                               |  9 ++++
 5 files changed, 91 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/syscall-user-dispatch.rst b/Documentation/admin-guide/syscall-user-dispatch.rst
index 60314953c728..a23ae21a1d5b 100644
--- a/Documentation/admin-guide/syscall-user-dispatch.rst
+++ b/Documentation/admin-guide/syscall-user-dispatch.rst
@@ -43,7 +43,10 @@ doesn't rely on any of the syscall ABI to make the filtering.  It uses
 only the syscall dispatcher address and the userspace key.
 
 As the ABI of these intercepted syscalls is unknown to Linux, these
-syscalls are not instrumentable via ptrace or the syscall tracepoints.
+syscalls are not instrumentable via ptrace or the syscall tracepoints,
+however an interface to suspend, checkpoint, and restore syscall user
+dispatch configuration has been added to ptrace to assist userland
+checkpoint/restart software.
 
 Interface
 ---------
diff --git a/include/linux/syscall_user_dispatch.h b/include/linux/syscall_user_dispatch.h
index a0ae443fb7df..9e1bd0d87c1e 100644
--- a/include/linux/syscall_user_dispatch.h
+++ b/include/linux/syscall_user_dispatch.h
@@ -22,6 +22,13 @@ int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
 #define clear_syscall_work_syscall_user_dispatch(tsk) \
 	clear_task_syscall_work(tsk, SYSCALL_USER_DISPATCH)
 
+int syscall_user_dispatch_get_config(struct task_struct *task, unsigned long size,
+	void __user *data);
+
+int syscall_user_dispatch_set_config(struct task_struct *task, unsigned long size,
+	void __user *data);
+
+
 #else
 struct syscall_user_dispatch {};
 
@@ -35,6 +42,18 @@ static inline void clear_syscall_work_syscall_user_dispatch(struct task_struct *
 {
 }
 
+static inline int syscall_user_dispatch_get_config(struct task_struct *task, unsigned long size,
+	void __user *data)
+{
+	return -EINVAL;
+}
+
+static inline int syscall_user_dispatch_set_config(struct task_struct *task, unsigned long size,
+	void __user *data)
+{
+	return -EINVAL;
+}
+
 #endif /* CONFIG_GENERIC_ENTRY */
 
 #endif /* _SYSCALL_USER_DISPATCH_H */
diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index ba9e3f19a22c..8b93c78189b5 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -112,6 +112,16 @@ struct ptrace_rseq_configuration {
 	__u32 pad;
 };
 
+#define PTRACE_SET_SYSCALL_USER_DISPATCH_CONFIG 0x4210
+#define PTRACE_GET_SYSCALL_USER_DISPATCH_CONFIG 0x4211
+struct syscall_user_dispatch_config {
+	__u64 mode;
+	__s8 *selector;
+	__u64 offset;
+	__u64 len;
+	__u8 on_dispatch;
+};
+
 /*
  * These values are stored in task->ptrace_message
  * by ptrace_stop to describe the current syscall-stop.
diff --git a/kernel/entry/syscall_user_dispatch.c b/kernel/entry/syscall_user_dispatch.c
index b5ec75164805..a3b24d498b39 100644
--- a/kernel/entry/syscall_user_dispatch.c
+++ b/kernel/entry/syscall_user_dispatch.c
@@ -111,3 +111,52 @@ int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
 
 	return 0;
 }
+
+int syscall_user_dispatch_get_config(struct task_struct *task, unsigned long size,
+		void __user *data)
+{
+	struct syscall_user_dispatch *sd = &task->syscall_dispatch;
+	struct syscall_user_dispatch_config config;
+
+	if (size != sizeof(struct syscall_user_dispatch_config))
+		return -EINVAL;
+
+	if (sd->selector) {
+		config.mode = PR_SYS_DISPATCH_ON;
+		config.offset = sd->offset;
+		config.len = sd->len;
+		config.selector = sd->selector;
+		config.on_dispatch = sd->on_dispatch;
+	} else {
+		config.mode = PR_SYS_DISPATCH_OFF;
+		config.offset = 0;
+		config.len = 0;
+		config.selector = NULL;
+		config.on_dispatch = false;
+	}
+	if (copy_to_user(data, &config, sizeof(config)))
+		return -EFAULT;
+
+	return 0;
+}
+
+int syscall_user_dispatch_set_config(struct task_struct *task, unsigned long size,
+		void __user *data)
+{
+	struct syscall_user_dispatch_config config;
+	int ret;
+
+	if (size != sizeof(struct syscall_user_dispatch_config))
+		return -EINVAL;
+
+	if (copy_from_user(&config, data, sizeof(config)))
+		return -EFAULT;
+
+	ret = set_syscall_user_dispatch(config.mode, config.offset, config.len,
+			config.selector);
+	if (ret)
+		return ret;
+
+	task->syscall_dispatch.on_dispatch = config.on_dispatch;
+	return 0;
+}
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 99467ba5f55b..d1e9c0808905 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -32,6 +32,7 @@
 #include <linux/compat.h>
 #include <linux/sched/signal.h>
 #include <linux/minmax.h>
+#include <linux/syscall_user_dispatch.h>
 
 #include <asm/syscall.h>	/* for syscall_get_* */
 
@@ -1263,6 +1264,14 @@ int ptrace_request(struct task_struct *child, long request,
 		break;
 #endif
 
+	case PTRACE_SET_SYSCALL_USER_DISPATCH_CONFIG:
+		ret = syscall_user_dispatch_set_config(child, addr, datavp);
+		break;
+
+	case PTRACE_GET_SYSCALL_USER_DISPATCH_CONFIG:
+		ret = syscall_user_dispatch_get_config(child, addr, datavp);
+		break;
+
 	default:
 		break;
 	}
-- 
2.39.0



* Re: [PATCH v3 1/3] ptrace,syscall_user_dispatch: Implement Syscall User Dispatch Suspension
  2023-01-20 14:43 ` [PATCH v3 1/3] ptrace,syscall_user_dispatch: Implement Syscall User Dispatch Suspension Gregory Price
@ 2023-01-20 15:22   ` Oleg Nesterov
  2023-01-20 15:49     ` Gregory Price
  0 siblings, 1 reply; 9+ messages in thread
From: Oleg Nesterov @ 2023-01-20 15:22 UTC (permalink / raw)
  To: Gregory Price
  Cc: linux-kernel, linux-fsdevel, linux-doc, linux-kselftest, krisman,
	tglx, luto, peterz, ebiederm, akpm, adobriyan, corbet, shuah,
	Gregory Price

Hi Gregory,

I'll try to read this series next Monday, I need to recall what does
syscall-user-dispatch actually do ;)

just one question for now,

On 01/20, Gregory Price wrote:
>
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -370,6 +370,10 @@ static int check_ptrace_options(unsigned long data)
>  	if (data & ~(unsigned long)PTRACE_O_MASK)
>  		return -EINVAL;
>  
> +	if (unlikely(data & PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH) &&
> +	    (!IS_ENABLED(CONFIG_CHECKPOINT_RESTART)))
> +			return -EINVAL;

Hmm? git grep CHECKPOINT_RESTART shows nothing.

Oleg.



* Re: [PATCH v3 1/3] ptrace,syscall_user_dispatch: Implement Syscall User Dispatch Suspension
  2023-01-20 15:22   ` Oleg Nesterov
@ 2023-01-20 15:49     ` Gregory Price
  0 siblings, 0 replies; 9+ messages in thread
From: Gregory Price @ 2023-01-20 15:49 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Gregory Price, linux-kernel, linux-fsdevel, linux-doc,
	linux-kselftest, krisman, tglx, luto, peterz, ebiederm, akpm,
	adobriyan, corbet, shuah

On Fri, Jan 20, 2023 at 04:22:51PM +0100, Oleg Nesterov wrote:
> Hi Gregory,
> 
> I'll try to read this series next Monday, I need to recall what does
> syscall-user-dispatch actually do ;)
> 
> just one question for now,
> 
> On 01/20, Gregory Price wrote:
> >
> > --- a/kernel/ptrace.c
> > +++ b/kernel/ptrace.c
> > @@ -370,6 +370,10 @@ static int check_ptrace_options(unsigned long data)
> >  	if (data & ~(unsigned long)PTRACE_O_MASK)
> >  		return -EINVAL;
> >  
> > +	if (unlikely(data & PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH) &&
> > +	    (!IS_ENABLED(CONFIG_CHECKPOINT_RESTART)))
> > +			return -EINVAL;
> 
> Hmm? git grep CHECKPOINT_RESTART shows nothing.
> 
> Oleg.
>

TIL the mailing lists don't like responses from proxy addresses.
Resending this response so it goes out to everyone.


Good catch, I always mix up RESTART/RESTORE.  This should be RESTORE.
I will send a v4 tomorrow so as not to spam the lists; an updated
patch is included below for the time being.



(brief syscall user dispatch overview)

syscall-user-dispatch is relatively simple: the goal is to implement
syscall interposition for foreign syscalls (Windows, non-POSIX,
whatever).  Since the ABI of these syscalls can't be trusted to be
anything like Linux, syscall dispatch produces a SIGSYS before anything
else can do things like check register values.

How to use

1) User registers a SIGSYS signal handler
2) User does
   prctl(PR_SET_SYSCALL_USER_DISPATCH, PR_SYS_DISPATCH_ON,
         <address>, <length>, char* selector)

3) All 'syscall' instructions *outside* the virtual address range
   [address, address+length) now produce a SIGSYS on the thread that
   executed the syscall.

   The byte that <selector> points to can be set to
   SYSCALL_DISPATCH_FILTER_ALLOW or SYSCALL_DISPATCH_FILTER_BLOCK to
   enable/disable this signal production from userland without having
   to make kernel calls.

docs: https://docs.kernel.org/admin-guide/syscall-user-dispatch.html
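
The decision described in step 3 can be modeled in userspace roughly
as follows.  This is a simplified sketch, not the kernel code: the
SKETCH_/sketch_ names are hypothetical, the range test copies the
unsigned-wraparound comparison used in syscall_user_dispatch(), and
the kernel's handling of selector values other than ALLOW/BLOCK is not
modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Selector byte values, mirroring SYSCALL_DISPATCH_FILTER_ALLOW/BLOCK. */
enum { SKETCH_FILTER_ALLOW = 0, SKETCH_FILTER_BLOCK = 1 };

/*
 * Returns true when a syscall issued at address ip would be turned
 * into SIGSYS.  offset/len describe the exempt region; selector may be
 * NULL, in which case every syscall outside the region is dispatched.
 */
bool sketch_would_dispatch(uintptr_t ip, uintptr_t offset, uintptr_t len,
			   const char *selector)
{
	/* Unsigned wraparound makes this a single-compare range check. */
	if (ip - offset < len)
		return false;

	/* Outside the region: the selector byte can still allow the call. */
	if (selector && *selector == SKETCH_FILTER_ALLOW)
		return false;

	return true;
}
```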


Updated patch


diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index eaaef3ffec22..461ae5c99d57 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -45,6 +45,8 @@ extern int ptrace_access_vm(struct task_struct *tsk, unsigned long addr,

 #define PT_EXITKILL            (PTRACE_O_EXITKILL << PT_OPT_FLAG_SHIFT)
 #define PT_SUSPEND_SECCOMP     (PTRACE_O_SUSPEND_SECCOMP << PT_OPT_FLAG_SHIFT)
+#define PT_SUSPEND_SYSCALL_USER_DISPATCH \
+       (PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH << PT_OPT_FLAG_SHIFT)

 extern long arch_ptrace(struct task_struct *child, long request,
                        unsigned long addr, unsigned long data);
diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index 195ae64a8c87..ba9e3f19a22c 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -146,9 +146,13 @@ struct ptrace_rseq_configuration {
 /* eventless options */
 #define PTRACE_O_EXITKILL              (1 << 20)
 #define PTRACE_O_SUSPEND_SECCOMP       (1 << 21)
+#define PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH (1 << 22)

 #define PTRACE_O_MASK          (\
-       0x000000ff | PTRACE_O_EXITKILL | PTRACE_O_SUSPEND_SECCOMP)
+       0x000000ff | \
+       PTRACE_O_EXITKILL | \
+       PTRACE_O_SUSPEND_SECCOMP | \
+       PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH)

 #include <asm/ptrace.h>

diff --git a/kernel/entry/syscall_user_dispatch.c b/kernel/entry/syscall_user_dispatch.c
index 0b6379adff6b..b5ec75164805 100644
--- a/kernel/entry/syscall_user_dispatch.c
+++ b/kernel/entry/syscall_user_dispatch.c
@@ -8,6 +8,7 @@
 #include <linux/uaccess.h>
 #include <linux/signal.h>
 #include <linux/elf.h>
+#include <linux/ptrace.h>

 #include <linux/sched/signal.h>
 #include <linux/sched/task_stack.h>
@@ -36,6 +37,10 @@ bool syscall_user_dispatch(struct pt_regs *regs)
        struct syscall_user_dispatch *sd = &current->syscall_dispatch;
        char state;

+       if (IS_ENABLED(CONFIG_CHECKPOINT_RESTORE) &&
+           unlikely(current->ptrace & PT_SUSPEND_SYSCALL_USER_DISPATCH))
+               return false;
+
        if (likely(instruction_pointer(regs) - sd->offset < sd->len))
                return false;

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 54482193e1ed..a348b68d07a2 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -370,6 +370,10 @@ static int check_ptrace_options(unsigned long data)
        if (data & ~(unsigned long)PTRACE_O_MASK)
                return -EINVAL;

+       if (unlikely(data & PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH) &&
+           (!IS_ENABLED(CONFIG_CHECKPOINT_RESTORE)))
+                       return -EINVAL;
+
        if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
                if (!IS_ENABLED(CONFIG_CHECKPOINT_RESTORE) ||
                    !IS_ENABLED(CONFIG_SECCOMP))


* [lkp] [+309 bytes kernel size regression] [i386-tinyconfig] [9a08e0b054] ptrace,syscall_user_dispatch: add a getter/setter for sud configuration
  2023-01-20 14:43 ` [PATCH v3 3/3] ptrace,syscall_user_dispatch: add a getter/setter for sud configuration Gregory Price
@ 2023-01-20 16:12   ` kernel test robot
  2023-01-21  3:18   ` [PATCH v3 3/3] " Andrei Vagin
  1 sibling, 0 replies; 9+ messages in thread
From: kernel test robot @ 2023-01-20 16:12 UTC (permalink / raw)
  To: Gregory Price; +Cc: oe-kbuild-all, lkp, Josh Triplett


FYI, we noticed a +309 bytes kernel size regression due to commit:

commit: 9a08e0b054230e2e0a4c6313e708c64c338d9f89 (ptrace,syscall_user_dispatch: add a getter/setter for sud configuration)
url: https://github.com/intel-lab-lkp/linux/commits/Gregory-Price/ptrace-syscall_user_dispatch-Implement-Syscall-User-Dispatch-Suspension/20230120-224726
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git d368967cb1039b5c4cccb62b5a4b9468c50cd143
patch link: https://lore.kernel.org/all/20230120144356.40717-4-gregory.price@memverge.com/
patch subject: [PATCH v3 3/3] ptrace,syscall_user_dispatch: add a getter/setter for sud configuration


Details as below (size data is obtained by `nm --size-sort vmlinux`):

12197c6c: fs/proc/array: Add Syscall User Dispatch to proc status
9a08e0b0: ptrace,syscall_user_dispatch: add a getter/setter for sud configuration

+---------------------------------------+----------+----------+-------+
|                symbol                 | 12197c6c | 9a08e0b0 | delta |
+---------------------------------------+----------+----------+-------+
| bzImage                               | 495840   | 496096   | 256   |
| nm.T.syscall_user_dispatch_get_config | 0        | 181      | 181   |
| nm.T.syscall_user_dispatch_set_config | 0        | 86       | 86    |
| nm.T.ptrace_request                   | 1186     | 1228     | 42    |
+---------------------------------------+----------+----------+-------+



Thanks




* Re: [PATCH v3 3/3] ptrace,syscall_user_dispatch: add a getter/setter for sud configuration
  2023-01-20 14:43 ` [PATCH v3 3/3] ptrace,syscall_user_dispatch: add a getter/setter for sud configuration Gregory Price
  2023-01-20 16:12   ` [lkp] [+309 bytes kernel size regression] [i386-tinyconfig] [9a08e0b054] " kernel test robot
@ 2023-01-21  3:18   ` Andrei Vagin
  2023-01-21  3:27     ` Gregory Price
  1 sibling, 1 reply; 9+ messages in thread
From: Andrei Vagin @ 2023-01-21  3:18 UTC (permalink / raw)
  To: Gregory Price
  Cc: linux-kernel, linux-fsdevel, linux-doc, linux-kselftest, krisman,
	tglx, luto, oleg, peterz, ebiederm, akpm, adobriyan, corbet,
	shuah, Gregory Price

On Fri, Jan 20, 2023 at 7:05 AM Gregory Price <gourry.memverge@gmail.com> wrote:
>
> Implement ptrace getter/setter interface for syscall user dispatch.
>
> Presently, these settings are write-only via prctl, making it impossible
> to implement transparent checkpoint (coordination with the software is
> required).
>
> This is modeled after a similar interface for SECCOMP, which can have
> its configuration dumped by ptrace for software like CRIU.
>
> Signed-off-by: Gregory Price <gregory.price@memverge.com>
> ---
>  .../admin-guide/syscall-user-dispatch.rst     |  5 +-
>  include/linux/syscall_user_dispatch.h         | 19 +++++++
>  include/uapi/linux/ptrace.h                   | 10 ++++
>  kernel/entry/syscall_user_dispatch.c          | 49 +++++++++++++++++++
>  kernel/ptrace.c                               |  9 ++++
>  5 files changed, 91 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/admin-guide/syscall-user-dispatch.rst b/Documentation/admin-guide/syscall-user-dispatch.rst
> index 60314953c728..a23ae21a1d5b 100644
> --- a/Documentation/admin-guide/syscall-user-dispatch.rst
> +++ b/Documentation/admin-guide/syscall-user-dispatch.rst

<snip>

> +
> +int syscall_user_dispatch_get_config(struct task_struct *task, unsigned long size,
> +               void __user *data)
> +{
> +       struct syscall_user_dispatch *sd = &task->syscall_dispatch;
> +       struct syscall_user_dispatch_config config;
> +
> +       if (size != sizeof(struct syscall_user_dispatch_config))
> +               return -EINVAL;
> +
> +       if (sd->selector) {
> +               config.mode = PR_SYS_DISPATCH_ON;
> +               config.offset = sd->offset;
> +               config.len = sd->len;
> +               config.selector = sd->selector;
> +               config.on_dispatch = sd->on_dispatch;
> +       } else {

This doesn't look right to me. selector is optional, and if it is 0,
it doesn't mean that mode is PR_SYS_DISPATCH_OFF, does it?

> +               config.mode = PR_SYS_DISPATCH_OFF;
> +               config.offset = 0;
> +               config.len = 0;
> +               config.selector = NULL;
> +               config.on_dispatch = false;
> +       }
> +       if (copy_to_user(data, &config, sizeof(config)))
> +               return -EFAULT;
> +
> +       return 0;
> +}
> +


* Re: [PATCH v3 3/3] ptrace,syscall_user_dispatch: add a getter/setter for sud configuration
  2023-01-21  3:18   ` [PATCH v3 3/3] " Andrei Vagin
@ 2023-01-21  3:27     ` Gregory Price
  0 siblings, 0 replies; 9+ messages in thread
From: Gregory Price @ 2023-01-21  3:27 UTC (permalink / raw)
  To: Andrei Vagin
  Cc: Gregory Price, linux-kernel, linux-fsdevel, linux-doc,
	linux-kselftest, krisman, tglx, luto, oleg, peterz, ebiederm,
	akpm, adobriyan, corbet, shuah

On Fri, Jan 20, 2023 at 07:18:49PM -0800, Andrei Vagin wrote:
> On Fri, Jan 20, 2023 at 7:05 AM Gregory Price <gourry.memverge@gmail.com> wrote:
> >
> > Implement ptrace getter/setter interface for syscall user dispatch.
> >
> > Presently, these settings are write-only via prctl, making it impossible
> > to implement transparent checkpoint (coordination with the software is
> > required).
> >
> > This is modeled after a similar interface for SECCOMP, which can have
> > its configuration dumped by ptrace for software like CRIU.
> >
> > Signed-off-by: Gregory Price <gregory.price@memverge.com>
> > ---
> >  .../admin-guide/syscall-user-dispatch.rst     |  5 +-
> >  include/linux/syscall_user_dispatch.h         | 19 +++++++
> >  include/uapi/linux/ptrace.h                   | 10 ++++
> >  kernel/entry/syscall_user_dispatch.c          | 49 +++++++++++++++++++
> >  kernel/ptrace.c                               |  9 ++++
> >  5 files changed, 91 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/admin-guide/syscall-user-dispatch.rst b/Documentation/admin-guide/syscall-user-dispatch.rst
> > index 60314953c728..a23ae21a1d5b 100644
> > --- a/Documentation/admin-guide/syscall-user-dispatch.rst
> > +++ b/Documentation/admin-guide/syscall-user-dispatch.rst
> 
> <snip>
> 
> > +
> > +int syscall_user_dispatch_get_config(struct task_struct *task, unsigned long size,
> > +               void __user *data)
> > +{
> > +       struct syscall_user_dispatch *sd = &task->syscall_dispatch;
> > +       struct syscall_user_dispatch_config config;
> > +
> > +       if (size != sizeof(struct syscall_user_dispatch_config))
> > +               return -EINVAL;
> > +
> > +       if (sd->selector) {
> > +               config.mode = PR_SYS_DISPATCH_ON;
> > +               config.offset = sd->offset;
> > +               config.len = sd->len;
> > +               config.selector = sd->selector;
> > +               config.on_dispatch = sd->on_dispatch;
> > +       } else {
> 
> This doesn't look right to me. selector is optional, and if it is 0,
> it doesn't mean that mode is PR_SYS_DISPATCH_OFF, does it?
> 
> > +               config.mode = PR_SYS_DISPATCH_OFF;
> > +               config.offset = 0;
> > +               config.len = 0;
> > +               config.selector = NULL;
> > +               config.on_dispatch = false;
> > +       }
> > +       if (copy_to_user(data, &config, sizeof(config)))
> > +               return -EFAULT;
> > +
> > +       return 0;
> > +}
> > +

Hm.  Right you are.  I suppose I should change this to check offset
instead.  I will need to validate that the fields are correctly cleared
on disable and on task allocation (I presume this is true).

Otherwise it might behoove us to actually add a state field.

Thank you, I'll push an update tomorrow.

I also need to change patch 2/3.
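
To make the problem concrete, here is a small userspace model of the
two ways to derive the reported mode.  It is purely illustrative: the
structure, the "enabled" field, and the function names are
hypothetical and do not correspond to anything in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirror PR_SYS_DISPATCH_OFF / PR_SYS_DISPATCH_ON. */
enum { SKETCH_DISPATCH_OFF = 0, SKETCH_DISPATCH_ON = 1 };

struct sketch_sud_state {
	bool enabled;		/* hypothetical explicit state field */
	uintptr_t offset;
	uintptr_t len;
	const char *selector;	/* optional, may be NULL while enabled */
};

/* The v3 approach: infer the mode from the selector pointer. */
int sketch_mode_from_selector(const struct sketch_sud_state *s)
{
	return s->selector ? SKETCH_DISPATCH_ON : SKETCH_DISPATCH_OFF;
}

/* The alternative floated above: report an explicit state field. */
int sketch_mode_from_state(const struct sketch_sud_state *s)
{
	return s->enabled ? SKETCH_DISPATCH_ON : SKETCH_DISPATCH_OFF;
}
```

With dispatch enabled and no selector, the first function reports OFF
while the second reports ON, which is the mismatch Andrei pointed out.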


