linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/1] system call notification with self_ptrace
@ 2008-09-08 12:02 Pierre Morel
  2008-09-09  0:04 ` Andrew Morton
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Pierre Morel @ 2008-09-08 12:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, Oleg Nesterov, Roland McGrath, Heiko Carstens,
	sameske, Martin Schwidefsky, Ingo Molnar, gregkh, uml-devel,
	Dave Hansen, Cedric Le Goater, Daniel Lezcano

Subject: [PATCH] system call notification with self_ptrace

From: Pierre Morel <pmorel@fr.ibm.com>


PTRACE SELF

This patch adds a new functionality to ptrace: system call notification to
the current process.
When a process requests self ptrace, with the new request PTRACE_SELF_ON:

 1.  the next system call performed by the process will not be executed
 2.  self ptrace will be disabled for the process
 3.  a SIGSYS signal will be sent to the process.

With an appropriate SIGSYS signal handler, the process can access its own
data structures to

 1.  get the system call number from the siginfo structure
 2.  get the system call arguments from the stack
 3.  instrument the system call with other system calls
 4.  emulate the system call with other system calls
 5.  change the arguments of the system call
 6.  perform the system call for good
 7.  change the return value of the system call
 8.  request self ptrace again before returning.

The new request PTRACE_SELF_OFF disables self ptrace.


Signed-off-by: Pierre Morel <pmorel@fr.ibm.com>
Signed-off-by: Volker Sameske <sameske@de.ibm.com>
---

 arch/s390/kernel/ptrace.c     |   16 ++++++++++++++++
 arch/s390/kernel/signal.c     |    5 +++++
 arch/x86/kernel/ptrace.c      |   29 +++++++++++++++++++++++++++++
 arch/x86/kernel/signal_32.c   |    5 +++++
 arch/x86/kernel/signal_64.c   |    5 +++++
 include/asm-generic/siginfo.h |    6 ++++++
 include/linux/ptrace.h        |   18 ++++++++++++++++++
 include/linux/sched.h         |    1 +
 kernel/ptrace.c               |   32 ++++++++++++++++++++++++++++++++
 9 files changed, 117 insertions(+)

Index: linux-2.6.26/arch/s390/kernel/ptrace.c
===================================================================
--- linux-2.6.26.orig/arch/s390/kernel/ptrace.c
+++ linux-2.6.26/arch/s390/kernel/ptrace.c
@@ -583,6 +583,22 @@ syscall_trace(struct pt_regs *regs, int 

 	if (!test_thread_flag(TIF_SYSCALL_TRACE))
 		goto out;
+
+	if (is_self_ptracing(regs->gprs[2])) {
+		if (!entryexit) {
+			struct siginfo info;
+
+			memset(&info, 0, sizeof(struct siginfo));
+			info.si_signo = SIGSYS;
+			info.si_code = SYS_SYSCALL;
+			info.si_errno = regs->gprs[2];
+			info.si_addr = (void *)regs->orig_gpr2;
+			send_sig_info(SIGSYS, &info, current);
+			regs->gprs[2] = -1;
+		}
+		return;
+	}
+
 	if (!(current->ptrace & PT_PTRACED))
 		goto out;
 	ptrace_notify(SIGTRAP | ((current->ptrace & PT_TRACESYSGOOD)
Index: linux-2.6.26/arch/s390/kernel/signal.c
===================================================================
--- linux-2.6.26.orig/arch/s390/kernel/signal.c
+++ linux-2.6.26/arch/s390/kernel/signal.c
@@ -409,6 +409,11 @@ handle_signal(unsigned long sig, struct 
 		spin_unlock_irq(&current->sighand->siglock);
 	}

+	if (current->instrumentation) {
+		clear_thread_flag(TIF_SYSCALL_TRACE);
+		current->instrumentation &= ~PTS_SELF;
+	}
+
 	return ret;
 }

Index: linux-2.6.26/arch/x86/kernel/ptrace.c
===================================================================
--- linux-2.6.26.orig/arch/x86/kernel/ptrace.c
+++ linux-2.6.26/arch/x86/kernel/ptrace.c
@@ -1394,6 +1394,19 @@ int do_syscall_trace(struct pt_regs *reg
 	if (!entryexit)
 		secure_computing(regs->orig_ax);

+	if (is_self_ptracing(regs->orig_ax)) {
+		if (!entryexit) {
+			struct siginfo info;
+
+			memset(&info, 0, sizeof(struct siginfo));
+			info.si_signo = SIGSYS;
+			info.si_code = SYS_SYSCALL;
+			info.si_addr = (void *) regs->orig_ax;
+			send_sig_info(SIGSYS, &info, current);
+		}
+		return 1; /* Skip system call, deliver signal. */
+	}
+
 	if (unlikely(current->audit_context)) {
 		if (entryexit)
 			audit_syscall_exit(AUDITSC_RESULT(regs->ax),
@@ -1486,6 +1499,18 @@ asmlinkage void syscall_trace_enter(stru
 	/* do the secure computing check first */
 	secure_computing(regs->orig_ax);

+	if (is_self_ptracing(regs->orig_ax)) {
+		struct siginfo info;
+
+		memset(&info, 0, sizeof(struct siginfo));
+		info.si_signo = SIGSYS;
+		info.si_code = SYS_SYSCALL;
+		info.si_addr = (void *) regs->orig_ax;
+		send_sig_info(SIGSYS, &info, current);
+		regs->ax = -1 ;
+		return; /* Skip system call, deliver signal. */
+	}
+
 	if (test_thread_flag(TIF_SYSCALL_TRACE)
 	    && (current->ptrace & PT_PTRACED))
 		syscall_trace(regs);
@@ -1507,6 +1532,10 @@ asmlinkage void syscall_trace_enter(stru

 asmlinkage void syscall_trace_leave(struct pt_regs *regs)
 {
+	if (is_self_ptracing(regs->orig_ax)) {
+		regs->ax = -1 ;
+		return; /* Skip system call. */
+	}
 	if (unlikely(current->audit_context))
 		audit_syscall_exit(AUDITSC_RESULT(regs->ax), regs->ax);

Index: linux-2.6.26/arch/x86/kernel/signal_32.c
===================================================================
--- linux-2.6.26.orig/arch/x86/kernel/signal_32.c
+++ linux-2.6.26/arch/x86/kernel/signal_32.c
@@ -568,6 +568,11 @@ handle_signal(unsigned long sig, siginfo
 	recalc_sigpending();
 	spin_unlock_irq(&current->sighand->siglock);

+	if (current->instrumentation & PTS_SELF) {
+		clear_thread_flag(TIF_SYSCALL_TRACE);
+		current->instrumentation &= ~PTS_SELF;
+	}
+
 	return 0;
 }

Index: linux-2.6.26/arch/x86/kernel/signal_64.c
===================================================================
--- linux-2.6.26.orig/arch/x86/kernel/signal_64.c
+++ linux-2.6.26/arch/x86/kernel/signal_64.c
@@ -402,6 +402,11 @@ handle_signal(unsigned long sig, siginfo
 		spin_unlock_irq(&current->sighand->siglock);
 	}

+	if (current->instrumentation & PTS_SELF) {
+		clear_thread_flag(TIF_SYSCALL_TRACE);
+		current->instrumentation &= ~PTS_SELF;
+	}
+
 	return ret;
 }

Index: linux-2.6.26/include/asm-generic/siginfo.h
===================================================================
--- linux-2.6.26.orig/include/asm-generic/siginfo.h
+++ linux-2.6.26/include/asm-generic/siginfo.h
@@ -224,6 +224,12 @@ typedef struct siginfo {
 #define NSIGPOLL	6

 /*
+ * SIGSYS si_codes
+ */
+#define SYS_SYSCALL    (__SI_FAULT|1)  /* system call notification */
+#define NSIGSYS        1
+
+/*
  * sigevent definitions
  * 
  * It seems likely that SIGEV_THREAD will have to be handled from 
Index: linux-2.6.26/include/linux/ptrace.h
===================================================================
--- linux-2.6.26.orig/include/linux/ptrace.h
+++ linux-2.6.26/include/linux/ptrace.h
@@ -27,6 +27,10 @@
 #define PTRACE_GETSIGINFO	0x4202
 #define PTRACE_SETSIGINFO	0x4203

+/* PTRACE SELF requests					*/
+#define PTRACE_SELF_ON		0x4281
+#define PTRACE_SELF_OFF		0x4282
+
 /* options set using PTRACE_SETOPTIONS */
 #define PTRACE_O_TRACESYSGOOD	0x00000001
 #define PTRACE_O_TRACEFORK	0x00000002
@@ -78,7 +82,21 @@

 #include <linux/compiler.h>		/* For unlikely.  */
 #include <linux/sched.h>		/* For struct task_struct.  */
+#include <linux/unistd.h>		/* For syscall definitions  */
+
+#define PTS_INSTRUMENTED	0x00000001
+#define PTS_SELF	0x00000002

+static inline int is_self_ptracing(unsigned long syscall)
+{
+	if (!(current->instrumentation & PTS_SELF))
+		return 0;
+	if (syscall == __NR_rt_sigreturn)
+		return 0;
+	if (syscall == __NR_ptrace)
+		return 0;
+	return 1;
+}

 extern long arch_ptrace(struct task_struct *child, long request, long addr, long data);
 extern struct task_struct *ptrace_get_task_struct(pid_t pid);
Index: linux-2.6.26/kernel/ptrace.c
===================================================================
--- linux-2.6.26.orig/kernel/ptrace.c
+++ linux-2.6.26/kernel/ptrace.c
@@ -543,6 +543,38 @@ asmlinkage long sys_ptrace(long request,
 	 * This lock_kernel fixes a subtle race with suid exec
 	 */
 	lock_kernel();
+	if (request == PTRACE_SELF_ON) {
+		task_lock(current);
+		if (current->ptrace) {
+			task_unlock(current);
+			ret = -EPERM;
+			goto out;
+		}
+		set_thread_flag(TIF_SYSCALL_TRACE);
+		current->instrumentation |= PTS_INSTRUMENTED|PTS_SELF;
+		task_unlock(current);
+		ret = 0;
+		goto out;
+	}
+	if (request == PTRACE_SELF_OFF) {
+		task_lock(current);
+		if (current->ptrace) {
+			task_unlock(current);
+			ret = -EPERM;
+			goto out;
+		}
+		clear_thread_flag(TIF_SYSCALL_TRACE);
+		current->instrumentation &= ~PTS_SELF;
+		task_unlock(current);
+		ret = 0;
+		goto out;
+	}
+
+	if (current->instrumentation) {
+		ret = -EPERM;
+		goto out;
+	}
+
 	if (request == PTRACE_TRACEME) {
 		ret = ptrace_traceme();
 		if (!ret)
Index: linux-2.6.26/include/linux/sched.h
===================================================================
--- linux-2.6.26.orig/include/linux/sched.h
+++ linux-2.6.26/include/linux/sched.h
@@ -1303,6 +1303,7 @@ struct task_struct {
 	int latency_record_count;
 	struct latency_record latency_record[LT_SAVECOUNT];
 #endif
+	u64	instrumentation;
 };

 /*



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] system call notification with self_ptrace
  2008-09-08 12:02 [PATCH 1/1] system call notification with self_ptrace Pierre Morel
@ 2008-09-09  0:04 ` Andrew Morton
  2008-09-10 14:17   ` Pierre Morel
  2008-09-09 12:43 ` Oleg Nesterov
  2008-09-10 16:19 ` Dave Hansen
  2 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2008-09-09  0:04 UTC (permalink / raw)
  To: Pierre Morel
  Cc: linux-kernel, oleg, roland, heicars2, sameske, schwidefsky,
	mingo, gregkh, user-mode-linux-devel, dave, clg, dlezcano,
	Michael Kerrisk, linux-arch

On Mon, 08 Sep 2008 14:02:01 +0200
Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:

> Subject: [PATCH] system call notification with self_ptrace
> 
> From: Pierre Morel <pmorel@fr.ibm.com>
> 
> 
> PTRACE SELF
> 
> This patch adds a new functionality to ptrace: system call notification to
> the current process.
> When a process requests self ptrace, with the new request PTRACE_SELF_ON:
> 
>  1.  the next system call performed by the process will not be executed
>  2.  self ptrace will be disabled for the process
>  3.  a SIGSYS signal will be sent to the process.
> 
> With an appropriate SIGSYS signal handler, the process can access its own
> data structures to
> 
>  1.  get the system call number from the siginfo structure
>  2.  get the system call arguments from the stack
>  3.  instrument the system call with other system calls
>  4.  emulate the system call with other system calls
>  5.  change the arguments of the system call
>  6.  perform the system call for good
>  7.  change the return value of the system call
>  8.  request self ptrace again before returning.
> 
> The new request PTRACE_SELF_OFF disables self ptrace.
> 

It sounds like it might be useful.

Are there any userspace tools available with which people can utilise
this new functionality?  Or plans to release them?

> arch/s390/kernel/ptrace.c     |   16 ++++++++++++++++
> arch/s390/kernel/signal.c     |    5 +++++
> arch/x86/kernel/ptrace.c      |   29 +++++++++++++++++++++++++++++
> arch/x86/kernel/signal_32.c   |    5 +++++
> arch/x86/kernel/signal_64.c   |    5 +++++

Maintainers of the other 30-odd architectures would appreciate a test
application which they can use to develop and test their ports, please.

Michael Kerrisk will no doubt be looking for manpage assistance. 
Please cc him on this material.

It would be good to get suitable testcases integrated into LTP (if LTP
has ptrace tests).

The patch title uses the term "self_ptrace", but the patch itself uses
the term "ptrace_self".  Let's get it consistent everywhere.

The patch adds a

+	u64	instrumentation;

to the task_struct but no explanation is provided as to why this was
added, why it is a 64-bit field, what its locking rules are, etc. 
Please fix this.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] system call notification with self_ptrace
  2008-09-08 12:02 [PATCH 1/1] system call notification with self_ptrace Pierre Morel
  2008-09-09  0:04 ` Andrew Morton
@ 2008-09-09 12:43 ` Oleg Nesterov
  2008-09-10 15:11   ` Pierre Morel
  2008-09-10 16:19 ` Dave Hansen
  2 siblings, 1 reply; 12+ messages in thread
From: Oleg Nesterov @ 2008-09-09 12:43 UTC (permalink / raw)
  To: Pierre Morel
  Cc: Andrew Morton, linux-kernel, Roland McGrath, Heiko Carstens,
	sameske, Martin Schwidefsky, Ingo Molnar, gregkh, uml-devel,
	Dave Hansen, Cedric Le Goater, Daniel Lezcano

On 09/08, Pierre Morel wrote:
>
> --- linux-2.6.26.orig/arch/s390/kernel/signal.c
> +++ linux-2.6.26/arch/s390/kernel/signal.c
> @@ -409,6 +409,11 @@ handle_signal(unsigned long sig, struct 
> 		spin_unlock_irq(&current->sighand->siglock);
> 	}
>
> +	if (current->instrumentation) {
> +		clear_thread_flag(TIF_SYSCALL_TRACE);
> +		current->instrumentation &= ~PTS_SELF;
> +	}
> +
> 	return ret;
> }

I still think this patch shouldn't change handle_signal().

Once again. The signal handler for SIGSYS can first do
sys_ptrace(PTRACE_SELF_OFF) (which is filtered out), and then use any
other syscall, so this change is not needed, afaics.

The overhead of the additional PTRACE_SELF_OFF syscall is very small,
especially compared to signal delivery. I don't think this functionality
will be widely used, but this change adds the unconditional overhead
to handle_signal().

Btw, the check above looks wrong, shouldn't it be

	if (current->instrumentation & PTS_SELF)

?

And. According to the prior discussion, this requires to hook every
signal handler in user space, otherwise we can miss syscall. But every
hook should start with PTRACE_SELF_ON, so I can't see any gain.

> +#define PTS_INSTRUMENTED	0x00000001
> +#define PTS_SELF	0x00000002

I don't really understand why do we need 2 flags, see also below,

> --- linux-2.6.26.orig/kernel/ptrace.c
> +++ linux-2.6.26/kernel/ptrace.c
> @@ -543,6 +543,38 @@ asmlinkage long sys_ptrace(long request,
> 	 * This lock_kernel fixes a subtle race with suid exec
> 	 */
> 	lock_kernel();
> +	if (request == PTRACE_SELF_ON) {
> +		task_lock(current);
> +		if (current->ptrace) {
> +			task_unlock(current);
> +			ret = -EPERM;
> +			goto out;
> +		}
> +		set_thread_flag(TIF_SYSCALL_TRACE);
> +		current->instrumentation |= PTS_INSTRUMENTED|PTS_SELF;
> +		task_unlock(current);
> +		ret = 0;
> +		goto out;

The code looks strange. How about

	if (request == PTRACE_SELF_ON) {
		ret = -EPERM;
		task_lock(current);
		if (!current->ptrace) {
			set_thread_flag(TIF_SYSCALL_TRACE);
			current->instrumentation |= PTS_INSTRUMENTED|PTS_SELF;
			ret = 0;
		}
		task_unlock(current);
		goto out;
	}

?

I don't understand how task_lock() can help. This code runs under
lock_kernel(), and without this lock the code is racy anyway.

> +	}
> +	if (request == PTRACE_SELF_OFF) {
> +		task_lock(current);
> +		if (current->ptrace) {
> +			task_unlock(current);
> +			ret = -EPERM;
> +			goto out;
> +		}
> +		clear_thread_flag(TIF_SYSCALL_TRACE);
> +		current->instrumentation &= ~PTS_SELF;

So. PTRACE_SELF_OFF doesn't clear PTS_INSTRUMENTED? How can the task
reset ->instrumentation ?

> +	if (current->instrumentation) {
> +		ret = -EPERM;
> +		goto out;
> +	}

So, PTRACE_SELF_XXX disables the "normal" ptrace. Not sure this is good.

Oleg.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] system call notification with self_ptrace
  2008-09-09  0:04 ` Andrew Morton
@ 2008-09-10 14:17   ` Pierre Morel
  0 siblings, 0 replies; 12+ messages in thread
From: Pierre Morel @ 2008-09-10 14:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, oleg, roland, heicars2, sameske, schwidefsky,
	mingo, gregkh, user-mode-linux-devel, dave, clg, dlezcano,
	Michael Kerrisk, linux-arch

Hi,

Andrew Morton wrote:
> On Mon, 08 Sep 2008 14:02:01 +0200
> Pierre Morel <pmorel@linux.vnet.ibm.com> wrote:
>
>   
>> Subject: [PATCH] system call notification with self_ptrace
>>
>> From: Pierre Morel <pmorel@fr.ibm.com>
>>
>>
>> PTRACE SELF
>>
>> This patch adds a new functionality to ptrace: system call notification to
>> the current process.
>> When a process requests self ptrace, with the new request PTRACE_SELF_ON:
>>
>>  1.  the next system call performed by the process will not be executed
>>  2.  self ptrace will be disabled for the process
>>  3.  a SIGSYS signal will be sent to the process.
>>
>> With an appropriate SIGSYS signal handler, the process can access its own
>> data structures to
>>
>>  1.  get the system call number from the siginfo structure
>>  2.  get the system call arguments from the stack
>>  3.  instrument the system call with other system calls
>>  4.  emulate the system call with other system calls
>>  5.  change the arguments of the system call
>>  6.  perform the system call for good
>>  7.  change the return value of the system call
>>  8.  request self ptrace again before returning.
>>
>> The new request PTRACE_SELF_OFF disables self ptrace.
>>
>>     
>
> It sounds like it might be useful.
>   
Thanks, yes I am sure it might.
> Are there any userspace tools available with which people can utilise
> this new functionality?  Or plans to release them?
>   
Yes, we plan to release a tool to trace an application soon.
>   
>> arch/s390/kernel/ptrace.c     |   16 ++++++++++++++++
>> arch/s390/kernel/signal.c     |    5 +++++
>> arch/x86/kernel/ptrace.c      |   29 +++++++++++++++++++++++++++++
>> arch/x86/kernel/signal_32.c   |    5 +++++
>> arch/x86/kernel/signal_64.c   |    5 +++++
>>     
>
> Maintainers of the other 30-odd architectures would appreciate a test
> application which they can use to develop and test their ports, please.
>   
Yes, of course I have one for x86 and one for s390.
I am cleaning them to make them available.
> Michael Kerrisk will no doubt be looking for manpage assistance. 
> Please cc him on this material.
>   
OK, I will prepare this.
> It would be good to get suitable testcases integrated into LTP (if LTP
> has ptrace tests).
>   
Yes, I will prepare this too.
> The patch title uses the term "self_ptrace", but the patch itself uses
> the term "ptrace_self".  Let's get it consistent everywhere.
>   
Right.  It should be self_ptrace.
> The patch adds a
>
> +	u64	instrumentation;
>
> to the task_struct but no explanation is provided as to why this was
> added, why it is a 64-bit field, what its locking rules are, etc. 
> Please fix this.
>   

I used to steal one bit in the ptrace bit-field of the task_struct but 
Oleg pointed out that the ptrace bit-field is used in a lot of places 
without any bit mask, so I chose another way to remember that I (the 
thread) am instrumenting myself.

Alternatively, I could also use the ptrace bit-field and modify every 
reference to use a mask for any test, set or reset of the bit-field.

I provision a 64 bit wide bit-field for future extensions of the 
instrumentation.  I could of course use a smaller bit-field as only 1 
bit is really useful for now. I used 64 bit to be memory aligned with 
most of the architectures.

There is no lock for the instrumentation bit-field because it is used 
for self tracing only, and only current ever accesses the flag.


-- 
=============
Pierre Morel
RTOS and Embedded Linux


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] system call notification with self_ptrace
  2008-09-09 12:43 ` Oleg Nesterov
@ 2008-09-10 15:11   ` Pierre Morel
  2008-09-10 16:20     ` Oleg Nesterov
  0 siblings, 1 reply; 12+ messages in thread
From: Pierre Morel @ 2008-09-10 15:11 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, linux-kernel, Roland McGrath, Heiko Carstens,
	sameske, Martin Schwidefsky, Ingo Molnar, gregkh, uml-devel,
	Dave Hansen, Cedric Le Goater, Daniel Lezcano

Hello,

Oleg Nesterov wrote:
> On 09/08, Pierre Morel wrote:
>   
>> --- linux-2.6.26.orig/arch/s390/kernel/signal.c
>> +++ linux-2.6.26/arch/s390/kernel/signal.c
>> @@ -409,6 +409,11 @@ handle_signal(unsigned long sig, struct 
>> 		spin_unlock_irq(&current->sighand->siglock);
>> 	}
>>
>> +	if (current->instrumentation) {
>> +		clear_thread_flag(TIF_SYSCALL_TRACE);
>> +		current->instrumentation &= ~PTS_SELF;
>> +	}
>> +
>> 	return ret;
>> }
>>     
>
> I still think this patch shouldn't change handle_signal().
>
> Once again. The signal handler for SIGSYS can first do
> sys_ptrace(PTRACE_SELF_OFF) (which is filtered out), and then use any
> other syscall, so this change is not needed, afaics.
>   
Yes it can but what if the application forget to do it?
It is a security so that the application do not bounce for ever.
> The overhead of the additional PTRACE_SELF_OFF syscall is very small,
> especially compared to signal delivery. I don't think this functionality
> will be widely used, but this change adds the unconditional overhead
> to handle_signal().
>
> Btw, the check above looks wrong, shouldn't it be
>
> 	if (current->instrumentation & PTS_SELF)
>
> ?
>   
Yes you are right, in fact I do not need two flags, I will remove
the PTS_INSTRUMENTED flag.
> And. According to the prior discussion, this requires to hook every
> signal handler in user space, otherwise we can miss syscall. But every
> hook should start with PTRACE_SELF_ON, so I can't see any gain.
>
>   
>> +#define PTS_INSTRUMENTED	0x00000001
>> +#define PTS_SELF	0x00000002
>>     
>
> I don't really understand why do we need 2 flags, see also below,
>   
Yes, I remove PTS_INSTRUMENTED, a bad idea.
>   
>> --- linux-2.6.26.orig/kernel/ptrace.c
>> +++ linux-2.6.26/kernel/ptrace.c
>> @@ -543,6 +543,38 @@ asmlinkage long sys_ptrace(long request,
>> 	 * This lock_kernel fixes a subtle race with suid exec
>> 	 */
>> 	lock_kernel();
>> +	if (request == PTRACE_SELF_ON) {
>> +		task_lock(current);
>> +		if (current->ptrace) {
>> +			task_unlock(current);
>> +			ret = -EPERM;
>> +			goto out;
>> +		}
>> +		set_thread_flag(TIF_SYSCALL_TRACE);
>> +		current->instrumentation |= PTS_INSTRUMENTED|PTS_SELF;
>> +		task_unlock(current);
>> +		ret = 0;
>> +		goto out;
>>     
>
> The code looks strange. How about
>
> 	if (request == PTRACE_SELF_ON) {
> 		ret = -EPERM;
> 		task_lock(current);
> 		if (!current->ptrace) {
> 			set_thread_flag(TIF_SYSCALL_TRACE);
> 			current->instrumentation |= PTS_INSTRUMENTED|PTS_SELF;
> 			ret = 0;
> 		}
> 		task_unlock(current);
> 		goto out;
> 	}
>
> ?
>
> I don't understand how task_lock() can help. This code runs under
> lock_kernel(), and without this lock the code is racy anyway.
>   

I use task_lock to protect the current->ptrace bit-field which can be 
accessed by another thread, like the one you pointed out previously.
I agree it is not necessary with lock_kernel().
I will put the code before the lock_kernel() to be more efficient.
>   
>> +	}
>> +	if (request == PTRACE_SELF_OFF) {
>> +		task_lock(current);
>> +		if (current->ptrace) {
>> +			task_unlock(current);
>> +			ret = -EPERM;
>> +			goto out;
>> +		}
>> +		clear_thread_flag(TIF_SYSCALL_TRACE);
>> +		current->instrumentation &= ~PTS_SELF;
>>     
>
> So. PTRACE_SELF_OFF doesn't clear PTS_INSTRUMENTED? How can the task
> reset ->instrumentation ?
>   
You are right again, I will remove the PTS_INSTRUMENTED flag.
>   
>> +	if (current->instrumentation) {
>> +		ret = -EPERM;
>> +		goto out;
>> +	}
>>     
>
> So, PTRACE_SELF_XXX disables the "normal" ptrace. Not sure this is good.
>   
I think that having two tracing system one over the other may be
quite difficult to handle.

Pierre
> Oleg.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>   


-- 
=============
Pierre Morel
RTOS and Embedded Linux


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] system call notification with self_ptrace
  2008-09-08 12:02 [PATCH 1/1] system call notification with self_ptrace Pierre Morel
  2008-09-09  0:04 ` Andrew Morton
  2008-09-09 12:43 ` Oleg Nesterov
@ 2008-09-10 16:19 ` Dave Hansen
  2008-09-12 12:30   ` Pierre Morel
  2 siblings, 1 reply; 12+ messages in thread
From: Dave Hansen @ 2008-09-10 16:19 UTC (permalink / raw)
  To: Pierre Morel
  Cc: Andrew Morton, linux-kernel, Oleg Nesterov, Roland McGrath,
	Heiko Carstens, sameske, Martin Schwidefsky, Ingo Molnar, gregkh,
	uml-devel, Cedric Le Goater, Daniel Lezcano

On Mon, 2008-09-08 at 14:02 +0200, Pierre Morel wrote:
> 
> +       if (is_self_ptracing(regs->gprs[2])) {
> +               if (!entryexit) {
> +                       struct siginfo info;
> +
> +                       memset(&info, 0, sizeof(struct siginfo));
> +                       info.si_signo = SIGSYS;
> +                       info.si_code = SYS_SYSCALL;
> +                       info.si_errno = regs->gprs[2];
> +                       info.si_addr = (void *)regs->orig_gpr2;
> +                       send_sig_info(SIGSYS, &info, current);
> +                       regs->gprs[2] = -1;
> +               }
> +               return;
> +       }

I see you didn't like my suggestions for consolidating some of these
repetitive code bits across all the architectures.  Did you give that a
a shot?  Would you like me to produce a patch on top of what you have
here before this gets merged into mm?

-- Dave


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] system call notification with self_ptrace
  2008-09-10 15:11   ` Pierre Morel
@ 2008-09-10 16:20     ` Oleg Nesterov
  2008-09-10 16:23       ` Dave Hansen
  2008-09-12 12:19       ` Pierre Morel
  0 siblings, 2 replies; 12+ messages in thread
From: Oleg Nesterov @ 2008-09-10 16:20 UTC (permalink / raw)
  To: Pierre Morel
  Cc: Andrew Morton, linux-kernel, Roland McGrath, Heiko Carstens,
	sameske, Martin Schwidefsky, Ingo Molnar, gregkh, uml-devel,
	Dave Hansen, Cedric Le Goater, Daniel Lezcano

On 09/10, Pierre Morel wrote:
>
> Oleg Nesterov wrote:
> >
> >I still think this patch shouldn't change handle_signal().
> >
> >Once again. The signal handler for SIGSYS can first do
> >sys_ptrace(PTRACE_SELF_OFF) (which is filtered out), and then use any
> >other syscall, so this change is not needed, afaics.
> >
> Yes it can but what if the application forget to do it?
> It is a security so that the application do not bounce for ever.

The (buggy) task can be killed, this has nothing to do with security.

>From the security pov, this case doesn't differ from, say,

	void sigh(int sig)
	{
		kill(getpid(), sig);
	}

	void main(void)
	{
		signal(SIGSYS, sigh);
		kill(getpid(), SIGSYS);
	}

Or I missed something?

> >So, PTRACE_SELF_XXX disables the "normal" ptrace. Not sure this is good.
> >
> I think that having two tracing system one over the other may be
> quite difficult to handle.

Yes I see.

But... well, I think we need Roland's opinion. I must admit, I am a bit
sceptical about this patch ;) I mean, I don't really understand why it
is useful. We can do the same with fork() + ptrace(). Yes, in that
case we need an "extra" context switch for any traced syscall. But,
do you have any "real life" example to demonstrate that the user-space
solution sucks? We can even use CLONE_MM to speedup the context switch.

Pierre, don't get me wrong. I never used debuggers for myself, I will
be happy to know I am wrong. I just don't understand.


As for ->instrumentation. If you are going to remove PTS_INSTRUMENTED,
we need only one bit. We could use PF_PTS_SELF, but ->flags is already
"contended". Perhaps you can do something like

	--- include/linux/sched.h
	+++ include/linux/sched.h
	@@ -1088,6 +1088,7 @@ struct task_struct {
		/* ??? */
		unsigned int personality;
		unsigned did_exec:1;
	+	unsigned pts_self:1;
		pid_t pid;
		pid_t tgid;
	 

Both did_exec and pts_self can only be changed by current, so it is
safe to share the same word. This way we don't enlarge task_struct.

Oleg.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] system call notification with self_ptrace
  2008-09-10 16:20     ` Oleg Nesterov
@ 2008-09-10 16:23       ` Dave Hansen
  2008-09-12 12:22         ` Pierre Morel
  2008-09-12 12:19       ` Pierre Morel
  1 sibling, 1 reply; 12+ messages in thread
From: Dave Hansen @ 2008-09-10 16:23 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Pierre Morel, Andrew Morton, linux-kernel, Roland McGrath,
	Heiko Carstens, sameske, Martin Schwidefsky, Ingo Molnar, gregkh,
	uml-devel, Cedric Le Goater, Daniel Lezcano

On Wed, 2008-09-10 at 20:20 +0400, Oleg Nesterov wrote:
> But... well, I think we need Roland's opinion. I must admit, I am a bit
> sceptical about this patch ;) I mean, I don't really understand why it
> is useful

Yeah, I don't think they're completely coming clean on it, yet.  

-- Dave


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] system call notification with self_ptrace
  2008-09-10 16:20     ` Oleg Nesterov
  2008-09-10 16:23       ` Dave Hansen
@ 2008-09-12 12:19       ` Pierre Morel
  2008-09-12 14:32         ` Oleg Nesterov
  1 sibling, 1 reply; 12+ messages in thread
From: Pierre Morel @ 2008-09-12 12:19 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, linux-kernel, Roland McGrath, Heiko Carstens,
	sameske, Martin Schwidefsky, Ingo Molnar, gregkh, uml-devel,
	Dave Hansen, Cedric Le Goater, Daniel Lezcano

Hello Oleg,

You are right, the functionality can be implemented with the system call.
But it means we have the overhead of a system call just to clear two bits,
the TIF_SYSCALL_TRACE and the PTS_SELF.

On the other hand we have an overhead of one single "if" inside
the handle_signal() function.

We can do the same with fork and ptrace, yes, but with a very big 
overhead on each system call and this is why this patch is so usefull: 
because with this patch you sit inside the thread when analysing it and 
have a direct access to all data without the need of IPC, ptrace or any 
task switch.

I will provide a test program and plan to release a tracing tool based 
on it.
I think I can reduce the task struct modification by using just a bit 
like you suggest if nobody seen any problem with this.

best regards,

Pierre

Oleg Nesterov wrote:
> On 09/10, Pierre Morel wrote:
>   
>> Oleg Nesterov wrote:
>>     
>>> I still think this patch shouldn't change handle_signal().
>>>
>>> Once again. The signal handler for SIGSYS can first do
>>> sys_ptrace(PTRACE_SELF_OFF) (which is filtered out), and then use any
>>> other syscall, so this change is not needed, afaics.
>>>
>>>       
>> Yes it can but what if the application forget to do it?
>> It is a security so that the application do not bounce for ever.
>>     
>
> The (buggy) task can be killed, this has nothing to do with security.
>
> From the security pov, this case doesn't differ from, say,
>
> 	void sigh(int sig)
> 	{
> 		kill(getpid(), sig);
> 	}
>
> 	void main(void)
> 	{
> 		signal(SIGSYS, sigh);
> 		kill(getpid(), SIGSYS);
> 	}
>
> Or I missed something?
>
>   
>>> So, PTRACE_SELF_XXX disables the "normal" ptrace. Not sure this is good.
>>>
>>>       
>> I think that having two tracing system one over the other may be
>> quite difficult to handle.
>>     
>
> Yes I see.
>
> But... well, I think we need Roland's opinion. I must admit, I am a bit
> sceptical about this patch ;) I mean, I don't really understand why it
> is useful. We can do the same with fork() + ptrace(). Yes, in that
> case we need an "extra" context switch for any traced syscall. But,
> do you have any "real life" example to demonstrate that the user-space
> solution sucks? We can even use CLONE_MM to speedup the context switch.
>
> Pierre, don't get me wrong. I never used debuggers for myself, I will
> be happy to know I am wrong. I just don't understand.
>
>
> As for ->instrumentation. If you are going to remove PTS_INSTRUMENTED,
> we need only one bit. We could use PF_PTS_SELF, but ->flags is already
> "contended". Perhaps you can do something like
>
> 	--- include/linux/sched.h
> 	+++ include/linux/sched.h
> 	@@ -1088,6 +1088,7 @@ struct task_struct {
> 		/* ??? */
> 		unsigned int personality;
> 		unsigned did_exec:1;
> 	+	unsigned pts_self:1;
> 		pid_t pid;
> 		pid_t tgid;
> 	 
>
> Both did_exec and pts_self can only be changed by current, so it is
> safe to share the same word. This way we don't enlarge task_struct.
>
> Oleg.
>
>   


-- 
=============
Pierre Morel
RTOS and Embedded Linux


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] system call notification with self_ptrace
  2008-09-10 16:23       ` Dave Hansen
@ 2008-09-12 12:22         ` Pierre Morel
  0 siblings, 0 replies; 12+ messages in thread
From: Pierre Morel @ 2008-09-12 12:22 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Oleg Nesterov, Andrew Morton, linux-kernel, Roland McGrath,
	Heiko Carstens, sameske, Martin Schwidefsky, Ingo Molnar, gregkh,
	uml-devel, Cedric Le Goater, Daniel Lezcano

Hi Dave,

be optimistic, why not?
I bet we will ;)

Pierre

Dave Hansen wrote:
> On Wed, 2008-09-10 at 20:20 +0400, Oleg Nesterov wrote:
>   
>> But... well, I think we need Roland's opinion. I must admit, I am a bit
>> sceptical about this patch ;) I mean, I don't really understand why it
>> is useful
>>     
>
> Yeah, I don't think they're completely coming clean on it, yet.  
>
> -- Dave
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>   


-- 
=============
Pierre Morel
RTOS and Embedded Linux


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] system call notification with self_ptrace
  2008-09-10 16:19 ` Dave Hansen
@ 2008-09-12 12:30   ` Pierre Morel
  0 siblings, 0 replies; 12+ messages in thread
From: Pierre Morel @ 2008-09-12 12:30 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Andrew Morton, linux-kernel, Oleg Nesterov, Roland McGrath,
	Heiko Carstens, sameske, Martin Schwidefsky, Ingo Molnar, gregkh,
	uml-devel, Cedric Le Goater, Daniel Lezcano

Hi Dave,

Dave Hansen wrote:
> On Mon, 2008-09-08 at 14:02 +0200, Pierre Morel wrote:
>   
>> +       if (is_self_ptracing(regs->gprs[2])) {
>> +               if (!entryexit) {
>> +                       struct siginfo info;
>> +
>> +                       memset(&info, 0, sizeof(struct siginfo));
>> +                       info.si_signo = SIGSYS;
>> +                       info.si_code = SYS_SYSCALL;
>> +                       info.si_errno = regs->gprs[2];
>> +                       info.si_addr = (void *)regs->orig_gpr2;
>> +                       send_sig_info(SIGSYS, &info, current);
>> +                       regs->gprs[2] = -1;
>> +               }
>> +               return;
>> +       }
>>     
>
> I see you didn't like my suggestions for consolidating some of these
> repetitive code bits across all the architectures.  Did you give that a
>   
I have read your suggestion, but the actual ptrace() implementation uses 
explicit reference to the different architecture dependent registers and 
I think that this portion of code is more readable if the patch keeps 
the actual conventions used by Roland.
> a shot?  Would you like me to produce a patch on top of what you have
> here before this gets merged into mm?
>   
No Dave, thank you.

regards,

Pierre

> -- Dave
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>   


-- 
=============
Pierre Morel
RTOS and Embedded Linux


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/1] system call notification with self_ptrace
  2008-09-12 12:19       ` Pierre Morel
@ 2008-09-12 14:32         ` Oleg Nesterov
  0 siblings, 0 replies; 12+ messages in thread
From: Oleg Nesterov @ 2008-09-12 14:32 UTC (permalink / raw)
  To: Pierre Morel
  Cc: Andrew Morton, linux-kernel, Roland McGrath, Heiko Carstens,
	sameske, Martin Schwidefsky, Ingo Molnar, gregkh, uml-devel,
	Dave Hansen, Cedric Le Goater, Daniel Lezcano

Hello Pierre,

On 09/12, Pierre Morel wrote:
>
> You are right, the functionality can be implemented with the system call.
> But it means we have the overhead of a system call just to clear two bits,
> the TIF_SYSCALL_TRACE and the PTS_SELF.

Yes.

So you want to optimize the code for the (imho very exotic) functionality.
And again, the overhead of a system call is nothing compared to the signal
delivery. I bet this overhead won't be visible with any benchmark.

> On the other hand we have an overhead of one single "if" inside
> the handle_signal() function.

What if everyone who wants to add the new functionality will add one
single "if + code" to the core kernel just because he wants to add
a very minor optimization for his needs?

And you forgot about the maintaince overhead. You forgot that this extra
"if" uglifies/complicates the code.

This all is imho of course, and I'm not maintainer. But I promise I
will argue against this change forever ;)

> We can do the same with fork and ptrace, yes, but with a very big 
> overhead on each system call and this is why this patch is so usefull: 
> because with this patch you sit inside the thread when analysing it and 
> have a direct access to all data without the need of IPC, ptrace or any 
> task switch.
>
> I will provide a test program and plan to release a tracing tool based 
> on it.

Yes please, this would be very nice. Please do not count me, but I'm
afraid I am not alone who needs to really understand why this patch
is useful.

Oleg.


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2008-09-12 14:27 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-09-08 12:02 [PATCH 1/1] system call notification with self_ptrace Pierre Morel
2008-09-09  0:04 ` Andrew Morton
2008-09-10 14:17   ` Pierre Morel
2008-09-09 12:43 ` Oleg Nesterov
2008-09-10 15:11   ` Pierre Morel
2008-09-10 16:20     ` Oleg Nesterov
2008-09-10 16:23       ` Dave Hansen
2008-09-12 12:22         ` Pierre Morel
2008-09-12 12:19       ` Pierre Morel
2008-09-12 14:32         ` Oleg Nesterov
2008-09-10 16:19 ` Dave Hansen
2008-09-12 12:30   ` Pierre Morel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).