linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
       [not found] ` <1257887498.2061171261478252049.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
@ 2010-01-04 15:52   ` Oleg Nesterov
  2010-01-04 16:16     ` Martin Schwidefsky
  0 siblings, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-04 15:52 UTC (permalink / raw)
  To: caiqian, Heiko Carstens, Martin Schwidefsky
  Cc: Jan Kratochvil, Roland McGrath, linux-kernel, linux-s390, utrace-devel

Hi!

We have some strange problems with utrace on s390, and so far this _looks_
like a s390 problem.

Looks like, on any CPU user_enable_single_step() does not "work" until at
least one thread with per_info.single_step = 1 does the context switch.

This doesn't matter with the old ptrace implementation, but with utrace
the tracee itself does user_enable_single_step(current) and returns to
user-mode. Until it does at least one context switch the single-stepping
doesn't work, after that everything works fine till the next reboot.

To rule out the possible problems with ptrace or utrace, I did the trivial
patch:

--- K/kernel/sys.c~	2009-12-29 10:45:25.787198223 -0500
+++ K/kernel/sys.c	2010-01-03 13:04:00.485591316 -0500
@@ -1444,6 +1444,17 @@ SYSCALL_DEFINE5(prctl, int, option, unsi
 
 	error = 0;
 	switch (option) {
+		case 666:
+			user_enable_single_step(current);
+			break;
+
+		case 777:
+			/* same as 666, but force the context switch
+			 * after user_enable_single_step() */
+			user_enable_single_step(current);
+			schedule_timeout_interruptible(HZ/10);
+			break;
+
 		case PR_SET_PDEATHSIG:
 			if (!valid_signal(arg2)) {
 				error = -EINVAL;
--- K/arch/s390/kernel/traps.c~	2009-12-22 10:41:52.909174198 -0500
+++ K/arch/s390/kernel/traps.c	2009-12-30 10:31:12.985266686 -0500
@@ -378,11 +378,14 @@ static inline void __user *get_check_add
 
 void __kprobes do_single_step(struct pt_regs *regs)
 {
+	printk("SS enter\n");
+
 	if (notify_die(DIE_SSTEP, "sstep", regs, 0, 0,
 					SIGTRAP) == NOTIFY_STOP){
+		printk(KERN_INFO "SS cancelled ???\n");
 		return;
 	}
-	if (tracehook_consider_fatal_signal(current, SIGTRAP))
+//	if (tracehook_consider_fatal_signal(current, SIGTRAP))
 		force_sig(SIGTRAP, current);
 }
 
-------------------------------------------------------------------------------

The change in do_single_step() just removes "is it traced" check
and adds a couple of printk's.


With this patch I assume that the task which does prctl(666) should
be killed by SIGTRAP, but this doesn't happen:

	# taskset -c 0 perl -le 'syscall 172,666 and die $!'
	# taskset -c 0 perl -le 'syscall 172,666 and die $!'
	# taskset -c 0 perl -le 'syscall 172,666 and die $!'

	(syscall 172,666 == prctl(666))

the task exits normally, there is nothing in dmesg.

However,

	# taskset -c 0 perl -le 'syscall 172,777 and die $!'
	Trace/breakpoint trap

Now prctl(777)->user_enable_single_step() does work, the task is
killed by do_single_step()->force_sig(SIGTRAP).

Now prctl(666) works too on CPU 0

	# taskset -c 0 perl -le 'syscall 172,666 and die $!'
	Trace/breakpoint trap
	# taskset -c 0 perl -le 'syscall 172,666 and die $!'
	Trace/breakpoint trap
	# taskset -c 0 perl -le 'syscall 172,666 and die $!'
	Trace/breakpoint trap



And please note "# taskset -c 0", we can repeat the same on another
CPU:

	# taskset -c 1 perl -le 'syscall 172,666 and die $!'
	# taskset -c 1 perl -le 'syscall 172,666 and die $!'

doesn't work, but

	# taskset -c 1 perl -le 'syscall 172,777 and die $!'
	Trace/breakpoint trap

magically "fixes" user_enable_single_step(), now we can use prctl(666)
on CPU 1.


The kernel is 2.6.32.2 plus ca633fd006486ed2c2d3b542283067aab61e6dc8,
could you help?

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-04 15:52   ` s390 && user_enable_single_step() (Was: odd utrace testing results on s390x) Oleg Nesterov
@ 2010-01-04 16:16     ` Martin Schwidefsky
  2010-01-04 18:14       ` Oleg Nesterov
  2010-01-04 20:46       ` Roland McGrath
  0 siblings, 2 replies; 41+ messages in thread
From: Martin Schwidefsky @ 2010-01-04 16:16 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: caiqian, Heiko Carstens, Jan Kratochvil, Roland McGrath,
	linux-kernel, linux-s390, utrace-devel

On Mon, 4 Jan 2010 16:52:25 +0100
Oleg Nesterov <oleg@redhat.com> wrote:

> Hi!
> 
> We have some strange problems with utrace on s390, and so far this _looks_
> like a s390 problem.
> 
> Looks like, on any CPU user_enable_single_step() does not "work" until at
> least one thread with per_info.single_step = 1 does the context switch.
> 
> This doesn't matter with the old ptrace implementation, but with utrace
> the tracee itself does user_enable_single_step(current) and returns to
> user-mode. Until it does at least one context switch the single-stepping
> doesn't work, after that everything works fine till the next reboot.

The PER control registers only get reloaded on task switch. Can you test
if this patch fixes your problem?

--
Subject: [PATCH] fix loading of PER control registers for utrace.

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

If the current task enables / disables PER tracing for itself the
PER control registers need to be loaded in FixPerRegisters.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---
 arch/s390/kernel/ptrace.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/arch/s390/kernel/ptrace.c
+++ b/arch/s390/kernel/ptrace.c
@@ -98,6 +98,9 @@ FixPerRegisters(struct task_struct *task
 		per_info->control_regs.bits.storage_alt_space_ctl = 1;
 	else
 		per_info->control_regs.bits.storage_alt_space_ctl = 0;
+
+	if (task == current)
+		__ctl_load(per_info->control_regs.words, 9, 11);
 }
 
 void user_enable_single_step(struct task_struct *task)

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-04 16:16     ` Martin Schwidefsky
@ 2010-01-04 18:14       ` Oleg Nesterov
  2010-01-04 19:30         ` Oleg Nesterov
                           ` (2 more replies)
  2010-01-04 20:46       ` Roland McGrath
  1 sibling, 3 replies; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-04 18:14 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: caiqian, Heiko Carstens, Jan Kratochvil, Roland McGrath,
	linux-kernel, linux-s390, utrace-devel

On 01/04, Martin Schwidefsky wrote:
>
> On Mon, 4 Jan 2010 16:52:25 +0100
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > We have some strange problems with utrace on s390, and so far this _looks_
> > like a s390 problem.
> >
> > Looks like, on any CPU user_enable_single_step() does not "work" until at
> > least one thread with per_info.single_step = 1 does the context switch.
>
> The PER control registers only get reloaded on task switch. Can you test
> if this patch fixes your problem?
>
> --
> Subject: [PATCH] fix loading of PER control registers for utrace.
>
> From: Martin Schwidefsky <schwidefsky@de.ibm.com>
>
> If the current task enables / disables PER tracing for itself the
> PER control registers need to be loaded in FixPerRegisters.
>
> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> ---
>  arch/s390/kernel/ptrace.c |    3 +++
>  1 file changed, 3 insertions(+)
>
> --- a/arch/s390/kernel/ptrace.c
> +++ b/arch/s390/kernel/ptrace.c
> @@ -98,6 +98,9 @@ FixPerRegisters(struct task_struct *task
>  		per_info->control_regs.bits.storage_alt_space_ctl = 1;
>  	else
>  		per_info->control_regs.bits.storage_alt_space_ctl = 0;
> +
> +	if (task == current)
> +		__ctl_load(per_info->control_regs.words, 9, 11);
>  }

Yes it does fix the problem! Thanks a lot Martin.


However. Could you please look at 6580807da14c423f0d0a708108e6df6ebc8bc83d ?
I am worried, perhaps this commit is not enough for s390. OK, do_single_step()
tracehook_consider_fatal_signal(), this means the forked thread will not
be killed by SIGTRAP if it is not auto-attached, but still this may be
wrong.

IOW. I think this problem is minor and probably can be ignored, but if
I remove tracehook_consider_fatal_signal() check from do_single_step(),

--- a/arch/s390/kernel/traps.c
+++ b/arch/s390/kernel/traps.c
@@ -382,8 +382,7 @@ void __kprobes do_single_step(struct pt_
 					SIGTRAP) == NOTIFY_STOP){
 		return;
 	}
-	if (tracehook_consider_fatal_signal(current, SIGTRAP))
-		force_sig(SIGTRAP, current);
+	force_sig(SIGTRAP, current);
 }
 
 static void default_trap_handler(struct pt_regs * regs, long interruption_code)
-------------------------------------------------------------------

then the test-case from 6580807da14c423f0d0a708108e6df6ebc8bc83d
fails. This probably means that copy_process()->user_disable_single_step()
is not enough to clear the "this task wants single-stepping" copied
from parent.

Thanks!

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-04 18:14       ` Oleg Nesterov
@ 2010-01-04 19:30         ` Oleg Nesterov
  2010-01-04 21:11         ` Roland McGrath
  2010-01-05  9:26         ` Martin Schwidefsky
  2 siblings, 0 replies; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-04 19:30 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: caiqian, Heiko Carstens, Jan Kratochvil, Roland McGrath,
	linux-kernel, linux-s390, utrace-devel

On 01/04, Oleg Nesterov wrote:
>
> IOW. I think this problem is minor and probably can be ignored,

Or may be not...

Even if the child is not killed by SIGTRAP, it can get a lot of
unnecessary traps.

To verify, I did another trivial patch (below), and the test
case from 6580807da14c423f0d0a708108e6df6ebc8bc83d does trigger
a lot of "false step" printks.

Hmm. And sometimes there is nothing in dmesg, but the test-case
needs a lot of time to complete. "taskset -c" seems to always
trigger printk's. Magic.

Oleg.

--- arch/s390/kernel/traps.c~	2009-12-22 10:41:52.909174198 -0500
+++ arch/s390/kernel/traps.c	2010-01-04 13:19:51.038187586 -0500
@@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
 	}
 	if (tracehook_consider_fatal_signal(current, SIGTRAP))
 		force_sig(SIGTRAP, current);
+	else
+		printk("false step\n");
 }
 
 static void default_trap_handler(struct pt_regs * regs, long interruption_code)


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-04 16:16     ` Martin Schwidefsky
  2010-01-04 18:14       ` Oleg Nesterov
@ 2010-01-04 20:46       ` Roland McGrath
  1 sibling, 0 replies; 41+ messages in thread
From: Roland McGrath @ 2010-01-04 20:46 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Oleg Nesterov, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

> The PER control registers only get reloaded on task switch. Can you test
> if this patch fixes your problem?

Long ago when I first worked with David Wilder on s390 arch code,
I remember we made this change.  It seems to have been forgotten
in the later rounds of reworking and merging.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-04 18:14       ` Oleg Nesterov
  2010-01-04 19:30         ` Oleg Nesterov
@ 2010-01-04 21:11         ` Roland McGrath
  2010-01-05  9:50           ` Martin Schwidefsky
  2010-01-05  9:26         ` Martin Schwidefsky
  2 siblings, 1 reply; 41+ messages in thread
From: Roland McGrath @ 2010-01-04 21:11 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Martin Schwidefsky, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

> This probably means that copy_process()->user_disable_single_step()
> is not enough to clear the "this task wants single-stepping" copied
> from parent.

I would suspect s390's TIF_SINGLE_STEP flag here.  That flag means "a
single-step trap occurred".  This is what causes do_single_step to be
called before returning to user mode, rather than the machine trap doing it
directly as is done in the other arch implementations.

If I'm right, then "this task wants single-stepping" is not the problem,
and that really is fully cleared.  In fact, looking at s390's copy_thread
(arch/s390/kernel/process.c) it clears out all the state that is actually
touched by user_enable_single_step and user_disable_single_step.  So for
s390 the new fork.c call is actually superfluous AFAICT.

The problem is that the copied parent state includes the "this task has a
pending single-step to report" flag.  IMHO it clearly makes sense for
s390's copy_thread to clear this flag in a new task, which it does not do now.

An alternative to that would be just to make its user_disable_single_step
clear the flag.  That could in theory also have an effect on e.g. the
(authentic) pending step report of a tracee that was stopped with
TIF_SINGLE_STEP set when its tracer detached.  This might be considered a
good thing, but since every other arch posts the SIGTRAP immediately they
all have the equivalent issue and s390 doesn't need to be any "better" than
they are before we have a generic resolution to the whole subject of
tracer-induced signals (which we won't get into now).  I'm not even sure
from my insufficient reading of the s390 assembly code whether this path is
even possible, i.e. do_signal called before do_single_step.

Martin, I suggest having copy_thread clear TIF_SINGLE_STEP.
That bit is always task-private state that should not be copied.

Btw, given the complexity of FixPerRegisters (and its new additional cost
on task==current), you might want to make user_*_single_step bail out if
per_info.single_step is already set/clear on entry.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-04 18:14       ` Oleg Nesterov
  2010-01-04 19:30         ` Oleg Nesterov
  2010-01-04 21:11         ` Roland McGrath
@ 2010-01-05  9:26         ` Martin Schwidefsky
  2010-01-06 21:15           ` Roland McGrath
  2 siblings, 1 reply; 41+ messages in thread
From: Martin Schwidefsky @ 2010-01-05  9:26 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: caiqian, Heiko Carstens, Jan Kratochvil, Roland McGrath,
	linux-kernel, linux-s390, utrace-devel

On Mon, 4 Jan 2010 19:14:12 +0100
Oleg Nesterov <oleg@redhat.com> wrote:

> On 01/04, Martin Schwidefsky wrote:
> > Subject: [PATCH] fix loading of PER control registers for utrace.
> >
> > From: Martin Schwidefsky <schwidefsky@de.ibm.com>
> >
> > If the current task enables / disables PER tracing for itself the
> > PER control registers need to be loaded in FixPerRegisters.
> >
> > Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > ---
> >  arch/s390/kernel/ptrace.c |    3 +++
> >  1 file changed, 3 insertions(+)
> >
> > --- a/arch/s390/kernel/ptrace.c
> > +++ b/arch/s390/kernel/ptrace.c
> > @@ -98,6 +98,9 @@ FixPerRegisters(struct task_struct *task
> >  		per_info->control_regs.bits.storage_alt_space_ctl = 1;
> >  	else
> >  		per_info->control_regs.bits.storage_alt_space_ctl = 0;
> > +
> > +	if (task == current)
> > +		__ctl_load(per_info->control_regs.words, 9, 11);
> >  }
> 
> Yes it does fix the problem! Thanks a lot Martin.

Ok, I will add that patch to the git390 queue.

> However. Could you please look at 6580807da14c423f0d0a708108e6df6ebc8bc83d ?
> I am worried, perhaps this commit is not enough for s390. OK, do_single_step()
> tracehook_consider_fatal_signal(), this means the forked thread will not
> be killed by SIGTRAP if it is not auto-attached, but still this may be
> wrong.
> 
> IOW. I think this problem is minor and probably can be ignored, but if
> I remove tracehook_consider_fatal_signal() check from do_single_step(),
> 
> --- a/arch/s390/kernel/traps.c
> +++ b/arch/s390/kernel/traps.c
> @@ -382,8 +382,7 @@ void __kprobes do_single_step(struct pt_
>  					SIGTRAP) == NOTIFY_STOP){
>  		return;
>  	}
> -	if (tracehook_consider_fatal_signal(current, SIGTRAP))
> -		force_sig(SIGTRAP, current);
> +	force_sig(SIGTRAP, current);
>  }
> 
>  static void default_trap_handler(struct pt_regs * regs, long interruption_code)
> -------------------------------------------------------------------
> 
> then the test-case from 6580807da14c423f0d0a708108e6df6ebc8bc83d
> fails. This probably means that copy_process()->user_disable_single_step()
> is not enough to clear the "this task wants single-stepping" copied
> from parent.

user_disable_single_step() does not remove the TIF_SINGLE_STEP bit from the
forked task. Perhaps we should just clear the bit in the function.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-04 21:11         ` Roland McGrath
@ 2010-01-05  9:50           ` Martin Schwidefsky
  2010-01-05 15:36             ` Oleg Nesterov
                               ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Martin Schwidefsky @ 2010-01-05  9:50 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Oleg Nesterov, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On Mon,  4 Jan 2010 13:11:47 -0800 (PST)
Roland McGrath <roland@redhat.com> wrote:

> > This probably means that copy_process()->user_disable_single_step()
> > is not enough to clear the "this task wants single-stepping" copied
> > from parent.
> 
> I would suspect s390's TIF_SINGLE_STEP flag here.  That flag means "a
> single-step trap occurred".  This is what causes do_single_step to be
> called before returning to user mode, rather than the machine trap doing it
> directly as is done in the other arch implementations.

Just my thinking as well.

> If I'm right, then "this task wants single-stepping" is not the problem,
> and that really is fully cleared.  In fact, looking at s390's copy_thread
> (arch/s390/kernel/process.c) it clears out all the state that is actually
> touched by user_enable_single_step and user_disable_single_step.  So for
> s390 the new fork.c call is actually superfluous AFAICT.

        /* Don't copy debug registers */
        memset(&p->thread.per_info, 0, sizeof(p->thread.per_info));

Yep, the call from fork.c is indeed superfluous.

> The problem is that the copied parent state includes the "this task has a
> pending single-step to report" flag.  IMHO it clearly makes sense for
> s390's copy_thread to clear this flag in a new task, which it does not do now.
> 
> An alternative to that would be just to make its user_disable_single_step
> clear the flag.  That could in theory also have an effect on e.g. the
> (authentic) pending step report of a tracee that was stopped with
> TIF_SINGLE_STEP set when its tracer detached.  This might be considered a
> good thing, but since every other arch posts the SIGTRAP immediately they
> all have the equivalent issue and s390 doesn't need to be any "better" than
> they are before we have a generic resolution to the whole subject of
> tracer-induced signals (which we won't get into now).  I'm not even sure
> from my insufficient reading of the s390 assembly code whether this path is
> even possible, i.e. do_signal called before do_single_step.

do_signal is called before do_single_step. The order of checks of the
TIF_ bits is 1) machine checks, 2) need resched, 3) signal pending, 4)
notify resume, 5) restarting system call, 6) single step.
But why is that important ? If the TIF_SINGLE_STEP bit is set the order
of do_signal vs. do_single_step does not seem to be important to me.
There will be a SIGTRAP if TIF_SINGLE_STEP is set, no ?

But I agree, it is probably better to make all arches look the same in
regard to that pending step report. 

> Martin, I suggest having copy_thread clear TIF_SINGLE_STEP.
> That bit is always task-private state that should not be copied.

Then let us do this.
 
> Btw, given the complexity of FixPerRegisters (and its new additional cost
> on task==current), you might want to make user_*_single_step bail out if
> per_info.single_step is already set/clear on entry.

The LCTLG of multiple control registers is rather expensive. Does it
happen often that user_*_single_step is called without need? For gdb is
doesn't matter, the cost to switch between tracer and tracee is already
large, the cycles added to FixPerRegisters won't matter much. For
utrace things might be different.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-05  9:50           ` Martin Schwidefsky
@ 2010-01-05 15:36             ` Oleg Nesterov
  2010-01-05 15:46               ` Martin Schwidefsky
                                 ` (2 more replies)
  2010-01-06 20:23             ` Oleg Nesterov
  2010-01-06 20:56             ` Roland McGrath
  2 siblings, 3 replies; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-05 15:36 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On 01/05, Martin Schwidefsky wrote:
>
> On Mon,  4 Jan 2010 13:11:47 -0800 (PST)
> Roland McGrath <roland@redhat.com> wrote:
>
> > > This probably means that copy_process()->user_disable_single_step()
> > > is not enough to clear the "this task wants single-stepping" copied
> > > from parent.
> >
> > I would suspect s390's TIF_SINGLE_STEP flag here.  That flag means "a
> > single-step trap occurred".  This is what causes do_single_step to be
> > called before returning to user mode, rather than the machine trap doing it
> > directly as is done in the other arch implementations.
>
> Just my thinking as well.

Oh, I am not sure. But I don't understand TIF_SINGLE_STEP on s390,
absolutely.

For example, why do_signal() sets TIF_SINGLE_STEP? Why can't we do

	--- a/arch/s390/kernel/signal.c
	+++ b/arch/s390/kernel/signal.c
	@@ -500,18 +500,10 @@ void do_signal(struct pt_regs *regs)
					clear_thread_flag(TIF_RESTORE_SIGMASK);
	 
				/*
	-			 * If we would have taken a single-step trap
	-			 * for a normal instruction, act like we took
	-			 * one for the handler setup.
	-			 */
	-			if (current->thread.per_info.single_step)
	-				set_thread_flag(TIF_SINGLE_STEP);
	-
	-			/*
				 * Let tracing know that we've done the handler setup.
				 */
				tracehook_signal_handler(signr, &info, &ka, regs,
	-					 test_thread_flag(TIF_SINGLE_STEP));
	+					current->thread.per_info.single_step);
			}
			return;
		}

?

Apart from arch/s390/signal.c, TIF_SINGLE_STEP is used by entry.S
but I don't understand this asm at all.

Anyway. I modified the debugging patch a bit:

--- K/arch/s390/kernel/traps.c~	2009-12-22 10:41:52.909174198 -0500
+++ K/arch/s390/kernel/traps.c	2010-01-05 09:49:19.541792379 -0500
@@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
 	}
 	if (tracehook_consider_fatal_signal(current, SIGTRAP))
 		force_sig(SIGTRAP, current);
+	else
+		printk("XXX: %d %d\n", current->pid, test_thread_flag(TIF_SINGLE_STEP));
 }
 
 static void default_trap_handler(struct pt_regs * regs, long interruption_code)
-------------------------------------------------------------------------------

Now, when I run this test-case

	#include <stdio.h>
	#include <unistd.h>
	#include <signal.h>
	#include <sys/ptrace.h>
	#include <sys/wait.h>
	#include <assert.h>

	int main(void)
	{
		int pid, status;

		if (!(pid = fork())) {
			assert(ptrace(PTRACE_TRACEME) == 0);
			kill(getpid(), SIGSTOP);

			if (!fork())
				return 43;

			wait(&status);
			return WEXITSTATUS(status);
		}


		for (;;) {
			assert(pid == wait(&status));
			if (WIFEXITED(status))
				break;
			assert(ptrace(PTRACE_SINGLESTEP, pid, 0,0) == 0);
		}

		assert(WEXITSTATUS(status) == 43);
		return 0;
	}

dmesg shows 799 lines of

	XXX: 2389 0


The kernel is 2.6.32.2 + utrace, but CONFIG_UTRACE is not set.

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-05 15:36             ` Oleg Nesterov
@ 2010-01-05 15:46               ` Martin Schwidefsky
  2010-01-05 15:59                 ` Oleg Nesterov
  2010-01-05 15:47               ` Oleg Nesterov
  2010-01-06 21:08               ` Roland McGrath
  2 siblings, 1 reply; 41+ messages in thread
From: Martin Schwidefsky @ 2010-01-05 15:46 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On Tue, 5 Jan 2010 16:36:33 +0100
Oleg Nesterov <oleg@redhat.com> wrote:

> On 01/05, Martin Schwidefsky wrote:
> >
> > On Mon,  4 Jan 2010 13:11:47 -0800 (PST)
> > Roland McGrath <roland@redhat.com> wrote:
> >
> > > > This probably means that copy_process()->user_disable_single_step()
> > > > is not enough to clear the "this task wants single-stepping" copied
> > > > from parent.
> > >
> > > I would suspect s390's TIF_SINGLE_STEP flag here.  That flag means "a
> > > single-step trap occurred".  This is what causes do_single_step to be
> > > called before returning to user mode, rather than the machine trap doing it
> > > directly as is done in the other arch implementations.
> >
> > Just my thinking as well.
> 
> Oh, I am not sure. But I don't understand TIF_SINGLE_STEP on s390,
> absolutely.
> 
> For example, why do_signal() sets TIF_SINGLE_STEP? Why can't we do
> 
> 	--- a/arch/s390/kernel/signal.c
> 	+++ b/arch/s390/kernel/signal.c
> 	@@ -500,18 +500,10 @@ void do_signal(struct pt_regs *regs)
> 					clear_thread_flag(TIF_RESTORE_SIGMASK);
> 	 
> 				/*
> 	-			 * If we would have taken a single-step trap
> 	-			 * for a normal instruction, act like we took
> 	-			 * one for the handler setup.
> 	-			 */
> 	-			if (current->thread.per_info.single_step)
> 	-				set_thread_flag(TIF_SINGLE_STEP);
> 	-
> 	-			/*
> 				 * Let tracing know that we've done the handler setup.
> 				 */
> 				tracehook_signal_handler(signr, &info, &ka, regs,
> 	-					 test_thread_flag(TIF_SINGLE_STEP));
> 	+					current->thread.per_info.single_step);
> 			}
> 			return;
> 		}
> 
> ?

The reason why we set the TIF_SINGLE_STEP bit in do_signal is that we
want to be able to stop the debugged program before the first
instruction of the signal handler has been executed. The PER single
step causes a trap after an instruction has been executed. That first
instruction can do bad things to the arguments of the signal handler..

> Apart from arch/s390/signal.c, TIF_SINGLE_STEP is used by entry.S
> but I don't understand this asm at all.
> 
> Anyway. I modified the debugging patch a bit:
> 
> --- K/arch/s390/kernel/traps.c~	2009-12-22 10:41:52.909174198 -0500
> +++ K/arch/s390/kernel/traps.c	2010-01-05 09:49:19.541792379 -0500
> @@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
>  	}
>  	if (tracehook_consider_fatal_signal(current, SIGTRAP))
>  		force_sig(SIGTRAP, current);
> +	else
> +		printk("XXX: %d %d\n", current->pid, test_thread_flag(TIF_SINGLE_STEP));
>  }
> 
>  static void default_trap_handler(struct pt_regs * regs, long interruption_code)
> -------------------------------------------------------------------------------
> 
> Now, when I run this test-case
> 
> 	#include <stdio.h>
> 	#include <unistd.h>
> 	#include <signal.h>
> 	#include <sys/ptrace.h>
> 	#include <sys/wait.h>
> 	#include <assert.h>
> 
> 	int main(void)
> 	{
> 		int pid, status;
> 
> 		if (!(pid = fork())) {
> 			assert(ptrace(PTRACE_TRACEME) == 0);
> 			kill(getpid(), SIGSTOP);
> 
> 			if (!fork())
> 				return 43;
> 
> 			wait(&status);
> 			return WEXITSTATUS(status);
> 		}
> 
> 
> 		for (;;) {
> 			assert(pid == wait(&status));
> 			if (WIFEXITED(status))
> 				break;
> 			assert(ptrace(PTRACE_SINGLESTEP, pid, 0,0) == 0);
> 		}
> 
> 		assert(WEXITSTATUS(status) == 43);
> 		return 0;
> 	}
> 
> dmesg shows 799 lines of
> 
> 	XXX: 2389 0
> 
> 
> The kernel is 2.6.32.2 + utrace, but CONFIG_UTRACE is not set.

With or without my bug fix ?

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-05 15:36             ` Oleg Nesterov
  2010-01-05 15:46               ` Martin Schwidefsky
@ 2010-01-05 15:47               ` Oleg Nesterov
  2010-01-05 15:50                 ` Martin Schwidefsky
  2010-01-06 21:08               ` Roland McGrath
  2 siblings, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-05 15:47 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On 01/05, Oleg Nesterov wrote:
>
> Anyway. I modified the debugging patch a bit:
>
> --- K/arch/s390/kernel/traps.c~	2009-12-22 10:41:52.909174198 -0500
> +++ K/arch/s390/kernel/traps.c	2010-01-05 09:49:19.541792379 -0500
> @@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
>  	}
>  	if (tracehook_consider_fatal_signal(current, SIGTRAP))
>  		force_sig(SIGTRAP, current);
> +	else
> +		printk("XXX: %d %d\n", current->pid, test_thread_flag(TIF_SINGLE_STEP));
>  }
>
>  static void default_trap_handler(struct pt_regs * regs, long interruption_code)
> -------------------------------------------------------------------------------

Ah, please ignore. I guess TIF_SINGLE_STEP was already cleared by the caller
in entry.S

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-05 15:47               ` Oleg Nesterov
@ 2010-01-05 15:50                 ` Martin Schwidefsky
  0 siblings, 0 replies; 41+ messages in thread
From: Martin Schwidefsky @ 2010-01-05 15:50 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On Tue, 5 Jan 2010 16:47:25 +0100
Oleg Nesterov <oleg@redhat.com> wrote:

> On 01/05, Oleg Nesterov wrote:
> >
> > Anyway. I modified the debugging patch a bit:
> >
> > --- K/arch/s390/kernel/traps.c~	2009-12-22 10:41:52.909174198 -0500
> > +++ K/arch/s390/kernel/traps.c	2010-01-05 09:49:19.541792379 -0500
> > @@ -384,6 +384,8 @@ void __kprobes do_single_step(struct pt_
> >  	}
> >  	if (tracehook_consider_fatal_signal(current, SIGTRAP))
> >  		force_sig(SIGTRAP, current);
> > +	else
> > +		printk("XXX: %d %d\n", current->pid, test_thread_flag(TIF_SINGLE_STEP));
> >  }
> >
> >  static void default_trap_handler(struct pt_regs * regs, long interruption_code)
> > -------------------------------------------------------------------------------
> 
> Ah, please ignore. I guess TIF_SINGLE_STEP was already cleared by the caller
> in entry.S

Yes, TIF_SINGLE_STEP is checked in entry.S and cleared before do_signal
is called. That is the "ni" instruction at sysc_singlestep and
sysc_sigpending.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-05 15:46               ` Martin Schwidefsky
@ 2010-01-05 15:59                 ` Oleg Nesterov
  2010-01-05 17:03                   ` Oleg Nesterov
  0 siblings, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-05 15:59 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On 01/05, Martin Schwidefsky wrote:
>
> On Tue, 5 Jan 2010 16:36:33 +0100
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > For example, why do_signal() sets TIF_SINGLE_STEP? Why can't we do
> > 
> > 	--- a/arch/s390/kernel/signal.c
> > 	+++ b/arch/s390/kernel/signal.c
> > 	@@ -500,18 +500,10 @@ void do_signal(struct pt_regs *regs)
> > 					clear_thread_flag(TIF_RESTORE_SIGMASK);
> > 	 
> > 				/*
> > 	-			 * If we would have taken a single-step trap
> > 	-			 * for a normal instruction, act like we took
> > 	-			 * one for the handler setup.
> > 	-			 */
> > 	-			if (current->thread.per_info.single_step)
> > 	-				set_thread_flag(TIF_SINGLE_STEP);
> > 	-
> > 	-			/*
> > 				 * Let tracing know that we've done the handler setup.
> > 				 */
> > 				tracehook_signal_handler(signr, &info, &ka, regs,
> > 	-					 test_thread_flag(TIF_SINGLE_STEP));
> > 	+					current->thread.per_info.single_step);
> > 			}
> > 			return;
> > 		}
> > 
> > ?
>
> The reason why we set the TIF_SINGLE_STEP bit in do_signal is that we
> want to be able to stop the debugged program before the first
> instruction of the signal handler has been executed. The PER single
> step causes a trap after an instruction has been executed. That first
> instruction can do bad things to the arguments of the signal handler..

Yes, but afaics all we need is to pass the correct "int stepping" arg
to tracehook_signal_handler(). If it is true, the tracee does
ptrace_notify() before it returns to user-mode.

> > dmesg shows 799 lines of
> >
> > 	XXX: 2389 0
> >
> >
> > The kernel is 2.6.32.2 + utrace, but CONFIG_UTRACE is not set.
>
> With or without my bug fix ?

With, but please see another email.


I'll add clear_bit(TIF_SINGLE_STEP) into do_fork() path and re-test.

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-05 15:59                 ` Oleg Nesterov
@ 2010-01-05 17:03                   ` Oleg Nesterov
  2010-01-05 19:58                     ` Oleg Nesterov
  0 siblings, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-05 17:03 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On 01/05, Oleg Nesterov wrote:
>
> I'll add clear_bit(TIF_SINGLE_STEP) into do_fork() path and re-test.

Hmm. This patch

	--- kernel/fork.c~	2009-12-22 10:41:53.188084961 -0500
	+++ kernel/fork.c	2010-01-05 11:42:58.370636323 -0500
	@@ -1206,6 +1206,8 @@ static struct task_struct *copy_process(
		 * of CLONE_PTRACE.
		 */
		clear_tsk_thread_flag(p, TIF_SYSCALL_TRACE);
	+	clear_tsk_thread_flag(p, TIF_SINGLE_STEP);
	+	user_disable_single_step(p);
	 #ifdef TIF_SYSCALL_EMU
		clear_tsk_thread_flag(p, TIF_SYSCALL_EMU);
	 #endif

doesn't help, I still see the same XXX's in dmesg...

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-05 17:03                   ` Oleg Nesterov
@ 2010-01-05 19:58                     ` Oleg Nesterov
  2010-01-06 14:59                       ` Heiko Carstens
  2010-01-06 20:17                       ` Oleg Nesterov
  0 siblings, 2 replies; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-05 19:58 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On 01/05, Oleg Nesterov wrote:
>
> On 01/05, Oleg Nesterov wrote:
> >
> > I'll add clear_bit(TIF_SINGLE_STEP) into do_fork() path and re-test.
>
> Hmm. This patch
>
> 	--- kernel/fork.c~	2009-12-22 10:41:53.188084961 -0500
> 	+++ kernel/fork.c	2010-01-05 11:42:58.370636323 -0500
> 	@@ -1206,6 +1206,8 @@ static struct task_struct *copy_process(
> 		 * of CLONE_PTRACE.
> 		 */
> 		clear_tsk_thread_flag(p, TIF_SYSCALL_TRACE);
> 	+	clear_tsk_thread_flag(p, TIF_SINGLE_STEP);
> 	+	user_disable_single_step(p);
> 	 #ifdef TIF_SYSCALL_EMU
> 		clear_tsk_thread_flag(p, TIF_SYSCALL_EMU);
> 	 #endif
>
> doesn't help, I still see the same XXX's in dmesg...

Oh, now I am totally confused.

I reverted your fix from
https://www.redhat.com/archives/utrace-devel/2010-January/msg00006.html
and now there is nothing in dmesg.


I decided to re-test this all with vanilla 2.6.33-rc2. It is really
amazing how long it takes to recompile/install the kernel! I spent
the rest of this day fighting with this rhts machine. Result - it
doesn't boot:

	00: zIPL v1.8.2-5.el6 interactive boot menu
	00:
	00:  0. default (2.6.33-rc2)
	00:
	00:  1. 2.6.33-rc2
	00:  2. 2.6.32.2-14.s390x.el6.s390x
	00:  3. 2.6.32.2-14.el6.s390x
	00:  4. linux
	00:
	00: Note: VM users please use '#cp vi vmsg <input>'
	00:
	00: Please choose (default will boot in 15 seconds):
	00: Booting default (2.6.33-rc2)...
	 Ý<000000000011c4fc>¨ sysc_return+0x0/0x8
	 Ý<00000000003cc0c6>¨ selinux_sb_copy_data+0x17e/0x238
	(Ý<00000000003cbf94>¨ selinux_sb_copy_data+0x4c/0x238)
	 Ý<00000000003b62a6>¨ security_sb_copy_data+0x4e/0x60
	 Ý<0000000000280338>¨ vfs_kern_mount+0x19c/0x1f4
	 Ý<00000000002803de>¨ kern_mount_data+0x4e/0x5c
	 Ý<0000000000ae1908>¨ devtmpfs_init+0x60/0xbc
	 Ý<0000000000ae1818>¨ driver_init+0x28/0x6c
	 Ý<0000000000abe582>¨ kernel_init+0x32a/0x3d8
	 Ý<000000000010b8c2>¨ kernel_thread_starter+0x6/0xc
	 Ý<000000000010b8bc>¨ kernel_thread_starter+0x0/0xc
	INFO: lockdep is turned off.
	00: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 00114BD4
	00:
	00:
	00:
	00:
	00:
	00:

Cai, any chance you can help? Using /usr/bin/console I can't choose another
kernel at the "00: Please choose (default will boot in 15 seconds):" stage,
how can I do this???

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-05 19:58                     ` Oleg Nesterov
@ 2010-01-06 14:59                       ` Heiko Carstens
  2010-01-06 20:17                       ` Oleg Nesterov
  1 sibling, 0 replies; 41+ messages in thread
From: Heiko Carstens @ 2010-01-06 14:59 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Martin Schwidefsky, Roland McGrath, caiqian, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On Tue, Jan 05, 2010 at 08:58:18PM +0100, Oleg Nesterov wrote:
> I decided to re-test this all with vanilla 2.6.33-rc2. It is really
> amazing how long it takes to recompile/install the kernel!

Then either you have a lot of steal time or an old machine (pre z10)?
You could also try to define more cpus if you have a virtual machine:
#cp define cpu <from>-<to>
...that is if your admin gave you permissions to do that.

> I spent
> the rest of this day fighting with this rhts machine. Result - it
> doesn't boot:
> 
> 	00: zIPL v1.8.2-5.el6 interactive boot menu
> 	00:
> 	00:  0. default (2.6.33-rc2)
> 	00:
> 	00:  1. 2.6.33-rc2
> 	00:  2. 2.6.32.2-14.s390x.el6.s390x
> 	00:  3. 2.6.32.2-14.el6.s390x
> 	00:  4. linux
> 	00:
> 	00: Note: VM users please use '#cp vi vmsg <input>'
> 	00:
> 	00: Please choose (default will boot in 15 seconds):
> 	00: Booting default (2.6.33-rc2)...
> 	 Ý<000000000011c4fc>¨ sysc_return+0x0/0x8
> 	 Ý<00000000003cc0c6>¨ selinux_sb_copy_data+0x17e/0x238
> 	(Ý<00000000003cbf94>¨ selinux_sb_copy_data+0x4c/0x238)
> 	 Ý<00000000003b62a6>¨ security_sb_copy_data+0x4e/0x60
> 	 Ý<0000000000280338>¨ vfs_kern_mount+0x19c/0x1f4
> 	 Ý<00000000002803de>¨ kern_mount_data+0x4e/0x5c
> 	 Ý<0000000000ae1908>¨ devtmpfs_init+0x60/0xbc
> 	 Ý<0000000000ae1818>¨ driver_init+0x28/0x6c
> 	 Ý<0000000000abe582>¨ kernel_init+0x32a/0x3d8
> 	 Ý<000000000010b8c2>¨ kernel_thread_starter+0x6/0xc
> 	 Ý<000000000010b8bc>¨ kernel_thread_starter+0x0/0xc
> 	INFO: lockdep is turned off.
> 	00: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 00114BD4
> 
> Cai, any chance you can help? Using /usr/bin/console I can't choose another
> kernel at the "00: Please choose (default will boot in 15 seconds):" stage,
> how can I do this???

You did enter something like
#cp vi vmsg 2
and not just '2'?

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-05 19:58                     ` Oleg Nesterov
  2010-01-06 14:59                       ` Heiko Carstens
@ 2010-01-06 20:17                       ` Oleg Nesterov
  2010-01-06 21:13                         ` Roland McGrath
  1 sibling, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-06 20:17 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On 01/05, Oleg Nesterov wrote:
>
> On 01/05, Oleg Nesterov wrote:
> >
> > On 01/05, Oleg Nesterov wrote:
> > >
> > > I'll add clear_bit(TIF_SINGLE_STEP) into do_fork() path and re-test.
> >
> > Hmm. This patch
> >
> > 	--- kernel/fork.c~	2009-12-22 10:41:53.188084961 -0500
> > 	+++ kernel/fork.c	2010-01-05 11:42:58.370636323 -0500
> > 	@@ -1206,6 +1206,8 @@ static struct task_struct *copy_process(
> > 		 * of CLONE_PTRACE.
> > 		 */
> > 		clear_tsk_thread_flag(p, TIF_SYSCALL_TRACE);
> > 	+	clear_tsk_thread_flag(p, TIF_SINGLE_STEP);
> > 	+	user_disable_single_step(p);
> > 	 #ifdef TIF_SYSCALL_EMU
> > 		clear_tsk_thread_flag(p, TIF_SYSCALL_EMU);
> > 	 #endif
> >
> > doesn't help, I still see the same XXX's in dmesg...
>
> Oh, now I am totally confused.
>
> I reverted your fix from
> https://www.redhat.com/archives/utrace-devel/2010-January/msg00006.html
> and now there is nothing in dmesg.

I take this back. I re-tested this all under 2.6.32.2 + utrace, and I
see nothing in dmesg. I don't know what I did wrong, most probably I
forgot to do zipl or something like this...

I'll try to summarize. Martin's patch
from https://www.redhat.com/archives/utrace-devel/2010-January/msg00006.html
fixes the problems with utrace.


However, with or without CONFIG_UTRACE, 6580807da14c423f0d0a708108e6df6ebc8bc83d
is needed on s390 too, otherwise the child gets unnecessary traps.

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-05  9:50           ` Martin Schwidefsky
  2010-01-05 15:36             ` Oleg Nesterov
@ 2010-01-06 20:23             ` Oleg Nesterov
  2010-01-06 20:56             ` Roland McGrath
  2 siblings, 0 replies; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-06 20:23 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On 01/05, Martin Schwidefsky wrote:
>
> On Mon,  4 Jan 2010 13:11:47 -0800 (PST)
> Roland McGrath <roland@redhat.com> wrote:
>
> > > This probably means that copy_process()->user_disable_single_step()
> > > is not enough to clear the "this task wants single-stepping" copied
> > > from parent.
> >
> > I would suspect s390's TIF_SINGLE_STEP flag here.  That flag means "a
> > single-step trap occurred".  This is what causes do_single_step to be
> > called before returning to user mode, rather than the machine trap doing it
> > directly as is done in the other arch implementations.
>
> Just my thinking as well.
>
> > If I'm right, then "this task wants single-stepping" is not the problem,
> > and that really is fully cleared.  In fact, looking at s390's copy_thread
> > (arch/s390/kernel/process.c) it clears out all the state that is actually
> > touched by user_enable_single_step and user_disable_single_step.  So for
> > s390 the new fork.c call is actually superfluous AFAICT.
>
>         /* Don't copy debug registers */
>         memset(&p->thread.per_info, 0, sizeof(p->thread.per_info));
>
> Yep, the call from fork.c is indeed superfluous.

I can't explain this, but if I remove copy_process()->user_disable_single_step()
the test-case below triggers "XXX" printk's from do_single_step() with or
without CONFIG_UTRACE. the patch is

	--- arch/s390/kernel/traps.c~	2009-12-22 10:41:52.909174198 -0500
	+++ arch/s390/kernel/traps.c	2010-01-05 11:03:55.006487697 -0500
	@@ -384,6 +384,9 @@ void __kprobes do_single_step(struct pt_
		}
		if (tracehook_consider_fatal_signal(current, SIGTRAP))
			force_sig(SIGTRAP, current);
	+	else
	+		printk("XXX: %s/%d %d\n", current->comm, current->pid,
	+				test_thread_flag(TIF_SINGLE_STEP));
	 }
	 
	 static void default_trap_handler(struct pt_regs * regs, long interruption_code)

Oleg.

#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <assert.h>

int main(void)
{
	int pid, status;

	if (!(pid = fork())) {
		assert(ptrace(PTRACE_TRACEME) == 0);
		kill(getpid(), SIGSTOP);

		if (!fork())
			return 43;

		wait(&status);
		return WEXITSTATUS(status);
	}


	for (;;) {
		assert(pid == wait(&status));
		if (WIFEXITED(status))
			break;
		assert(ptrace(PTRACE_SINGLESTEP, pid, 0,0) == 0);
	}

	assert(WEXITSTATUS(status) == 43);
	return 0;
}



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-05  9:50           ` Martin Schwidefsky
  2010-01-05 15:36             ` Oleg Nesterov
  2010-01-06 20:23             ` Oleg Nesterov
@ 2010-01-06 20:56             ` Roland McGrath
  2010-01-07  9:00               ` Martin Schwidefsky
  2 siblings, 1 reply; 41+ messages in thread
From: Roland McGrath @ 2010-01-06 20:56 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Oleg Nesterov, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

> do_signal is called before do_single_step. The order of checks of the
> TIF_ bits is 1) machine checks, 2) need resched, 3) signal pending, 4)
> notify resume, 5) restarting system call, 6) single step.

I see.  Then the potential issue I raised would exist.

> But why is that important ? If the TIF_SINGLE_STEP bit is set the order
> of do_signal vs. do_single_step does not seem to be important to me.
> There will be a SIGTRAP if TIF_SINGLE_STEP is set, no ?

Right.  It only becomes relevant if something else clears TIF_SINGLE_STEP,
which does not happen now.  I was discussing the scenario of having
user_disable_single_step clear it.  That might happen inside a stop,
i.e. inside do_signal (get_signal_to_deliver).  So given that order of
checks, it becomes possible for user_disable_single_step to affect the
pending-step-should-SIGTRAP situation.

> But I agree, it is probably better to make all arches look the same in
> regard to that pending step report. 

Right.  That means we should leave the status quo of not clearing
TIF_SINGLE_STEP in user_disable_single_step.

> > Martin, I suggest having copy_thread clear TIF_SINGLE_STEP.
> > That bit is always task-private state that should not be copied.
> 
> Then let us do this.

Yes, good.

> The LCTLG of multiple control registers is rather expensive. Does it
> happen often that user_*_single_step is called without need? For gdb is
> doesn't matter, the cost to switch between tracer and tracee is already
> large, the cycles added to FixPerRegisters won't matter much. For
> utrace things might be different.

In old (current) ptrace, user_*_single_step is never called on current.
In utrace, it's always called on current, so utrace-based ptrace alone
adds this second reload overhead beyond the same context-switch overhead
old ptrace has.  Indeed that addition may be neglible.

In other circumstances with utrace, it is very possible to wind up with
user_disable_single_step being called superfluously when there was no
stop (and so not necessarily any context switch or other high overhead).
On other machines, user_disable_single_step is pretty cheap even where
user_enable_single_step is quite costly.  Given how simple and cheap it
is to short-circuit the excess work on s390, I think it is worthwhile.

It looks like the context-switch code already checks some magic bits in
per_info to decide whether to do the costly reload.  So it seems vaguely
consistent with that to conditionalize FixPerRegisters similarly.  The
single_step cases are a special case with an easy one-bit check to
short-circuit, so skipping all of FixPerRegisters seems worthwhile IMHO.

To be really optimization-happy about it, you'd also hack FixPerRegisters
itself to do the reload on current only if PSW_MASK_PER is or was set (if
I'm following the code correctly).  Or perhaps it should check PER_EM_MASK
instead to match what __switch_to does.  (I don't understand the
distinction between those two tests, if there is one.)  Extra frobbity
would be to leave the old state too when clearing PSW_MASK_PER, and then
just have the trap handler lazily ignore a hit when current doesn't have
it set.  Then in a case where there is no hit before context switch,
you haven't done anything.  But that is probably both excessive and
not even a win if the PER use is single-step and so there will really
very likely be a hit before context switch.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-05 15:36             ` Oleg Nesterov
  2010-01-05 15:46               ` Martin Schwidefsky
  2010-01-05 15:47               ` Oleg Nesterov
@ 2010-01-06 21:08               ` Roland McGrath
  2010-01-07  9:16                 ` Martin Schwidefsky
  2010-01-07 18:11                 ` Oleg Nesterov
  2 siblings, 2 replies; 41+ messages in thread
From: Roland McGrath @ 2010-01-06 21:08 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Martin Schwidefsky, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

> Oh, I am not sure. But I don't understand TIF_SINGLE_STEP on s390,
> absolutely.
> 
> For example, why do_signal() sets TIF_SINGLE_STEP? Why can't we do

I think we could.  That would be more consistent with other machines.  On
s390, once we set TIF_SINGLE_STEP, we are going to post a SIGTRAP
eventually before going to user mode.  But then tracehook_signal_handler()
also gets stepping=1 and the expected meaning of this is that the arch code
is not itself simulating a single-step for the handler setup.  So the
tracehook (i.e. ptrace/utrace) code does what it does for "need a fake
single-step".  

In ptrace (including utrace-based ptrace), this winds up with sending a
SIGTRAP.  So when we finally do get out of do_signal and TIF_SINGLE_STEP
causes a second SIGTRAP, it's already pending and the second one makes no
difference.

But for the general case of utrace, we'll have the UTRACE_SIGNAL_HANDLER
report, followed by a SIGTRAP that appears to be an authentic single-step
trap, but takes place on the same instruction.  If the resumption after the
UTRACE_SIGNAL_HANDLER report didn't use stepping, then this is an entirely
unexpected extra SIGTRAP.  If we do continue stepping, then we are
expecting the SIGTRAP, but this gets us a spurious and errnoeous report
that looks like the instruction right before the handler's entry point in
memory was just executed.

[Martin:]
> The reason why we set the TIF_SINGLE_STEP bit in do_signal is that we
> want to be able to stop the debugged program before the first
> instruction of the signal handler has been executed. The PER single
> step causes a trap after an instruction has been executed. That first
> instruction can do bad things to the arguments of the signal handler..

That's what tracehook_signal_handler is for.  You're both doing it yourself
in the arch code (by setting TIF_SINGLE_STEP), and then telling the generic
code to do it (by passing stepping=1 to tracehook_signal_handler).


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-06 20:17                       ` Oleg Nesterov
@ 2010-01-06 21:13                         ` Roland McGrath
  2010-01-07  9:18                           ` Martin Schwidefsky
  0 siblings, 1 reply; 41+ messages in thread
From: Roland McGrath @ 2010-01-06 21:13 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Martin Schwidefsky, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

> However, with or without CONFIG_UTRACE, 6580807da14c423f0d0a708108e6df6ebc8bc83d
> is needed on s390 too, otherwise the child gets unnecessary traps.

This confuses me.  user_disable_single_step on non-current doesn't do
anything not already done by the memset in copy_thread.  Ooh, except
perhaps it does not clear PSW_MASK_PER.  Maybe that matters.  That's
the only thing I can think of.  Maybe Martin can make sense of it.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-05  9:26         ` Martin Schwidefsky
@ 2010-01-06 21:15           ` Roland McGrath
  0 siblings, 0 replies; 41+ messages in thread
From: Roland McGrath @ 2010-01-06 21:15 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Oleg Nesterov, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

> > then the test-case from 6580807da14c423f0d0a708108e6df6ebc8bc83d
> > fails. This probably means that copy_process()->user_disable_single_step()
> > is not enough to clear the "this task wants single-stepping" copied
> > from parent.
> 
> user_disable_single_step() does not remove the TIF_SINGLE_STEP bit from the
> forked task. Perhaps we should just clear the bit in the function.

If that were to fix this test case, I think it would be incidental rather
than meaning the right thing.  The "this task wants single-stepping" state
should not have anything to do with TIF_SINGLE_STEP.  It means "this task
recently had single-stepping", which is a separate moving part.

Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-06 20:56             ` Roland McGrath
@ 2010-01-07  9:00               ` Martin Schwidefsky
  2010-01-07 21:32                 ` Roland McGrath
  2010-01-21 20:32                 ` Oleg Nesterov
  0 siblings, 2 replies; 41+ messages in thread
From: Martin Schwidefsky @ 2010-01-07  9:00 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Oleg Nesterov, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On Wed,  6 Jan 2010 12:56:33 -0800 (PST)
Roland McGrath <roland@redhat.com> wrote:

> > do_signal is called before do_single_step. The order of checks of the
> > TIF_ bits is 1) machine checks, 2) need resched, 3) signal pending, 4)
> > notify resume, 5) restarting system call, 6) single step.
> 
> I see.  Then the potential issue I raised would exist.
> 
> > But why is that important ? If the TIF_SINGLE_STEP bit is set the order
> > of do_signal vs. do_single_step does not seem to be important to me.
> > There will be a SIGTRAP if TIF_SINGLE_STEP is set, no ?
> 
> Right.  It only becomes relevant if something else clears TIF_SINGLE_STEP,
> which does not happen now.  I was discussing the scenario of having
> user_disable_single_step clear it.  That might happen inside a stop,
> i.e. inside do_signal (get_signal_to_deliver).  So given that order of
> checks, it becomes possible for user_disable_single_step to affect the
> pending-step-should-SIGTRAP situation.

That was the idea about the TIF_SINGLE_STEP bit. I can be set and later
unset if we don't want to deliver the SIGTRAP after all.

> > But I agree, it is probably better to make all arches look the same in
> > regard to that pending step report. 
> 
> Right.  That means we should leave the status quo of not clearing
> TIF_SINGLE_STEP in user_disable_single_step.

Ok, although it seems a bit strange not to do it. Perhaps I should add a
comment about it.

> > The LCTLG of multiple control registers is rather expensive. Does it
> > happen often that user_*_single_step is called without need? For gdb is
> > doesn't matter, the cost to switch between tracer and tracee is already
> > large, the cycles added to FixPerRegisters won't matter much. For
> > utrace things might be different.
> 
> In old (current) ptrace, user_*_single_step is never called on current.
> In utrace, it's always called on current, so utrace-based ptrace alone
> adds this second reload overhead beyond the same context-switch overhead
> old ptrace has.  Indeed that addition may be neglible.

So after everthing has been converted to utrace we always will load the
control registers in FixPerRegisters.
 
> In other circumstances with utrace, it is very possible to wind up with
> user_disable_single_step being called superfluously when there was no
> stop (and so not necessarily any context switch or other high overhead).
> On other machines, user_disable_single_step is pretty cheap even where
> user_enable_single_step is quite costly.  Given how simple and cheap it
> is to short-circuit the excess work on s390, I think it is worthwhile.

We could use the same compare of the control registers as the code in
__switch_to. See below.

> It looks like the context-switch code already checks some magic bits in
> per_info to decide whether to do the costly reload.  So it seems vaguely
> consistent with that to conditionalize FixPerRegisters similarly.  The
> single_step cases are a special case with an easy one-bit check to
> short-circuit, so skipping all of FixPerRegisters seems worthwhile IMHO.

What the magic code in __switch_to does is to check if the next process
wants to use per and do the load of the control registers only if the
current set of control registers 9 to 11 differ from the set for the
next process.
The check if the next process wants to use per is done with a
test-under-mask (TM) instruction and a mask of 0xe8. This checks for 4
bits: em_branching, em_instruction_fetch, em_storage_alteration and
em_store_real_address. If one of the bits is set then the current set
of control registers is stored, compared with the set for the next
process and only if they differ is the lctlg done.
The store of control registers is cheap (n cycles for n registers), the
load is expensive for specific control registers. For 9 to 11 it costs
more than 100 cycles.

> To be really optimization-happy about it, you'd also hack FixPerRegisters
> itself to do the reload on current only if PSW_MASK_PER is or was set (if
> I'm following the code correctly).  Or perhaps it should check PER_EM_MASK
> instead to match what __switch_to does.  (I don't understand the
> distinction between those two tests, if there is one.)  Extra frobbity
> would be to leave the old state too when clearing PSW_MASK_PER, and then
> just have the trap handler lazily ignore a hit when current doesn't have
> it set.  Then in a case where there is no hit before context switch,
> you haven't done anything.  But that is probably both excessive and
> not even a win if the PER use is single-step and so there will really
> very likely be a hit before context switch.

The PSW_MASK_PER is the "global" PER enablement switch, the PER_EM_MASK
bits enable the different PER events. We check for the PER_EM_MASK bits
because it is easier to access in __switch_to. The return PSW is stored
in the pt_regs structure, we would have to get a pointer to it (what
"regs = task_pt_regs(taks)" does in FixPerRegisters). In
FixPerRegisters we can as well use the PSW_MASK_PER bit.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-06 21:08               ` Roland McGrath
@ 2010-01-07  9:16                 ` Martin Schwidefsky
  2010-01-07 18:16                   ` Oleg Nesterov
  2010-01-07 21:41                   ` Roland McGrath
  2010-01-07 18:11                 ` Oleg Nesterov
  1 sibling, 2 replies; 41+ messages in thread
From: Martin Schwidefsky @ 2010-01-07  9:16 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Oleg Nesterov, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On Wed,  6 Jan 2010 13:08:12 -0800 (PST)
Roland McGrath <roland@redhat.com> wrote:

> > Oh, I am not sure. But I don't understand TIF_SINGLE_STEP on s390,
> > absolutely.
> > 
> > For example, why do_signal() sets TIF_SINGLE_STEP? Why can't we do
> 
> I think we could.  That would be more consistent with other machines.  On
> s390, once we set TIF_SINGLE_STEP, we are going to post a SIGTRAP
> eventually before going to user mode.  But then tracehook_signal_handler()
> also gets stepping=1 and the expected meaning of this is that the arch code
> is not itself simulating a single-step for the handler setup.  So the
> tracehook (i.e. ptrace/utrace) code does what it does for "need a fake
> single-step".  

Hmm, command for tracehook_signal_handler say this for stepping:
@stepping:           nonzero if debugger single-step or block-step in
use

> In ptrace (including utrace-based ptrace), this winds up with sending a
> SIGTRAP.  So when we finally do get out of do_signal and TIF_SINGLE_STEP
> causes a second SIGTRAP, it's already pending and the second one makes no
> difference.

So we have been lucky so far.
 
> But for the general case of utrace, we'll have the UTRACE_SIGNAL_HANDLER
> report, followed by a SIGTRAP that appears to be an authentic single-step
> trap, but takes place on the same instruction.  If the resumption after the
> UTRACE_SIGNAL_HANDLER report didn't use stepping, then this is an entirely
> unexpected extra SIGTRAP.  If we do continue stepping, then we are
> expecting the SIGTRAP, but this gets us a spurious and errnoeous report
> that looks like the instruction right before the handler's entry point in
> memory was just executed.
>
> [Martin:]
> > The reason why we set the TIF_SINGLE_STEP bit in do_signal is that we
> > want to be able to stop the debugged program before the first
> > instruction of the signal handler has been executed. The PER single
> > step causes a trap after an instruction has been executed. That first
> > instruction can do bad things to the arguments of the signal handler..
> 
> That's what tracehook_signal_handler is for.  You're both doing it yourself
> in the arch code (by setting TIF_SINGLE_STEP), and then telling the generic
> code to do it (by passing stepping=1 to tracehook_signal_handler).

Ok, so with the full utrace the semantics of tracehook_signal_handler
is more than just causing a SIGTRAP. It is an indication for a signal
AND a SIGTRAP if single-stepping is active. To make both cases work we
should stop setting TIF_SINGLE_STEP in do_signal and pass
current->thread.per_info.single_step to tracehook_signal_handler
instead of test_thread_flag(TIF_SINGLE_STEP).

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-06 21:13                         ` Roland McGrath
@ 2010-01-07  9:18                           ` Martin Schwidefsky
  2010-01-07 17:54                             ` Oleg Nesterov
  2010-01-07 21:46                             ` Roland McGrath
  0 siblings, 2 replies; 41+ messages in thread
From: Martin Schwidefsky @ 2010-01-07  9:18 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Oleg Nesterov, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On Wed,  6 Jan 2010 13:13:29 -0800 (PST)
Roland McGrath <roland@redhat.com> wrote:

> > However, with or without CONFIG_UTRACE, 6580807da14c423f0d0a708108e6df6ebc8bc83d
> > is needed on s390 too, otherwise the child gets unnecessary traps.
> 
> This confuses me.  user_disable_single_step on non-current doesn't do
> anything not already done by the memset in copy_thread.  Ooh, except
> perhaps it does not clear PSW_MASK_PER.  Maybe that matters.  That's
> the only thing I can think of.  Maybe Martin can make sense of it.

The additional traps should not happen anymore with this patch:
--
Subject: [PATCH] clear TIF_SINGLE_STEP for new process.

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

Clear the TIF_SINGLE_STEP bit in copy_thread. If the new process is
not auto-attached by the tracer it is wrong to delivere SIGTRAP to
the new process.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---

 arch/s390/kernel/process.c |    1 +
 1 file changed, 1 insertion(+)

diff -urpN linux-2.6/arch/s390/kernel/process.c linux-2.6-patched/arch/s390/kernel/process.c
--- linux-2.6/arch/s390/kernel/process.c	2009-12-03 04:51:21.000000000 +0100
+++ linux-2.6-patched/arch/s390/kernel/process.c	2010-01-07 09:25:53.000000000 +0100
@@ -217,6 +217,7 @@ int copy_thread(unsigned long clone_flag
 	p->thread.mm_segment = get_fs();
 	/* Don't copy debug registers */
 	memset(&p->thread.per_info, 0, sizeof(p->thread.per_info));
+	clear_tsk_thread_flag(p, TIF_SINGLE_STEP);
 	/* Initialize per thread user and system timer values */
 	ti = task_thread_info(p);
 	ti->user_timer = 0;
-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-07  9:18                           ` Martin Schwidefsky
@ 2010-01-07 17:54                             ` Oleg Nesterov
  2010-01-07 21:48                               ` Roland McGrath
  2010-01-07 21:46                             ` Roland McGrath
  1 sibling, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-07 17:54 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

Martin, sorry for delay,

On 01/07, Martin Schwidefsky wrote:
>
> On Wed,  6 Jan 2010 13:13:29 -0800 (PST)
> Roland McGrath <roland@redhat.com> wrote:
>
> > > However, with or without CONFIG_UTRACE, 6580807da14c423f0d0a708108e6df6ebc8bc83d
> > > is needed on s390 too, otherwise the child gets unnecessary traps.
> >
> > This confuses me.  user_disable_single_step on non-current doesn't do
> > anything not already done by the memset in copy_thread.  Ooh, except
> > perhaps it does not clear PSW_MASK_PER.  Maybe that matters.  That's
> > the only thing I can think of.  Maybe Martin can make sense of it.

I am confused as well. Yes, I thought about regs->psw.mask change too,
but I don't understand why it helps..

> The additional traps should not happen anymore with this patch:
> --
> Subject: [PATCH] clear TIF_SINGLE_STEP for new process.
>
> From: Martin Schwidefsky <schwidefsky@de.ibm.com>
>
> Clear the TIF_SINGLE_STEP bit in copy_thread. If the new process is
> not auto-attached by the tracer it is wrong to delivere SIGTRAP to
> the new process.
>
> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> ---
>
>  arch/s390/kernel/process.c |    1 +
>  1 file changed, 1 insertion(+)
>
> diff -urpN linux-2.6/arch/s390/kernel/process.c linux-2.6-patched/arch/s390/kernel/process.c
> --- linux-2.6/arch/s390/kernel/process.c	2009-12-03 04:51:21.000000000 +0100
> +++ linux-2.6-patched/arch/s390/kernel/process.c	2010-01-07 09:25:53.000000000 +0100
> @@ -217,6 +217,7 @@ int copy_thread(unsigned long clone_flag
>  	p->thread.mm_segment = get_fs();
>  	/* Don't copy debug registers */
>  	memset(&p->thread.per_info, 0, sizeof(p->thread.per_info));
> +	clear_tsk_thread_flag(p, TIF_SINGLE_STEP);

Even if I don't understand s390, I think this patch makes sense
anyway. Or, user_disable_single_step() can clear this bit.


But. Acoording to the testing I did (unless I did something wrong
again) this patch doesn't make any difference in this particular
case. 6580807da14c423f0d0a708108e6df6ebc8bc83d does.

And. Please note that the test-case triggers 799 "false step", but
TIF_SINGLE_STEP is surely cleared (by the caller) after the first
invocation of do_single_step().

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-06 21:08               ` Roland McGrath
  2010-01-07  9:16                 ` Martin Schwidefsky
@ 2010-01-07 18:11                 ` Oleg Nesterov
  1 sibling, 0 replies; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-07 18:11 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Martin Schwidefsky, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On 01/06, Roland McGrath wrote:
>
> > Oh, I am not sure. But I don't understand TIF_SINGLE_STEP on s390,
> > absolutely.
> >
> > For example, why do_signal() sets TIF_SINGLE_STEP? Why can't we do
>
> I think we could.  That would be more consistent with other machines.  On
> s390, once we set TIF_SINGLE_STEP, we are going to post a SIGTRAP
> eventually before going to user mode.  But then tracehook_signal_handler()
> also gets stepping=1 and the expected meaning of this is that the arch code
> is not itself simulating a single-step for the handler setup.  So the
> tracehook (i.e. ptrace/utrace) code does what it does for "need a fake
> single-step".
>
> In ptrace (including utrace-based ptrace), this winds up with sending a
> SIGTRAP.  So when we finally do get out of do_signal and TIF_SINGLE_STEP
> causes a second SIGTRAP, it's already pending and the second one makes no
> difference.

Confused again, perhaps I just misunderstood what you mean...

Without utrace, tracehook_signal_handler() doesn't send SIGTRAP, it
merely does ptrace_notify(SIGTRAP), this means that

> But for the general case of utrace, we'll have the UTRACE_SIGNAL_HANDLER
> report, followed by a SIGTRAP that appears to be an authentic single-step
> trap, but takes place on the same instruction.  If the resumption after the
> UTRACE_SIGNAL_HANDLER report didn't use stepping, then this is an entirely
> unexpected extra SIGTRAP.

even without utrace we can have unexpected SIGTRAP.

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-07  9:16                 ` Martin Schwidefsky
@ 2010-01-07 18:16                   ` Oleg Nesterov
  2010-01-07 21:44                     ` Roland McGrath
  2010-01-08  8:34                     ` Martin Schwidefsky
  2010-01-07 21:41                   ` Roland McGrath
  1 sibling, 2 replies; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-07 18:16 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On 01/07, Martin Schwidefsky wrote:
>
> On Wed,  6 Jan 2010 13:08:12 -0800 (PST)
> Roland McGrath <roland@redhat.com> wrote:
>
> > That's what tracehook_signal_handler is for.  You're both doing it yourself
> > in the arch code (by setting TIF_SINGLE_STEP), and then telling the generic
> > code to do it (by passing stepping=1 to tracehook_signal_handler).
>
> Ok, so with the full utrace the semantics of tracehook_signal_handler
> is more than just causing a SIGTRAP. It is an indication for a signal
> AND a SIGTRAP if single-stepping is active. To make both cases work we
> should stop setting TIF_SINGLE_STEP in do_signal and pass
> current->thread.per_info.single_step to tracehook_signal_handler
> instead of test_thread_flag(TIF_SINGLE_STEP).

Can't understand why do we need TIF_SINGLE_STEP at all.

Just pass current->thread.per_info.single_step to
tracehook_signal_handler() ?

Oleg.

--- a/arch/s390/kernel/signal.c
+++ b/arch/s390/kernel/signal.c
@@ -504,14 +504,8 @@ void do_signal(struct pt_regs *regs)
 			 * for a normal instruction, act like we took
 			 * one for the handler setup.
 			 */
-			if (current->thread.per_info.single_step)
-				set_thread_flag(TIF_SINGLE_STEP);
-
-			/*
-			 * Let tracing know that we've done the handler setup.
-			 */
 			tracehook_signal_handler(signr, &info, &ka, regs,
-					 test_thread_flag(TIF_SINGLE_STEP));
+					 current->thread.per_info.single_step);
 		}
 		return;
 	}


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-07  9:00               ` Martin Schwidefsky
@ 2010-01-07 21:32                 ` Roland McGrath
  2010-01-21 20:32                 ` Oleg Nesterov
  1 sibling, 0 replies; 41+ messages in thread
From: Roland McGrath @ 2010-01-07 21:32 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Oleg Nesterov, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

> > Right.  That means we should leave the status quo of not clearing
> > TIF_SINGLE_STEP in user_disable_single_step.
> 
> Ok, although it seems a bit strange not to do it. Perhaps I should add a
> comment about it.

It doesn't seem strange to me, but then I've just been through all this.
user_*_step is about what the task will do next.  TIF_SINGLE_STEP is about
what the task has done recently.  Of course more good comments always help.
I might be inclined to change the name of TIF_SINGLE_STEP so that its true
purpose is more obvious.  

AFAICT, in fact it is not even about single-step per se.  It means some PER
trap happened and should produce SIGTRAP.  Don't you get the same thing if
you haven't used single-step, but instead used PTRACE_POKEUSR to set up
per_struct with bits that say to trigger for some other reason?

How about calling it TIF_PER_PENDING?

> So after everthing has been converted to utrace we always will load the
> control registers in FixPerRegisters.

Right.  (This could well still change in the future.  But that's how it is
in utrace now.  And regardless of possible future implementation changes it
will always be the case that sometimes it will be called on current.)

> We could use the same compare of the control registers as the code in
> __switch_to. See below.

Yes, sounds good.

> The PSW_MASK_PER is the "global" PER enablement switch, the PER_EM_MASK
> bits enable the different PER events. We check for the PER_EM_MASK bits
> because it is easier to access in __switch_to. The return PSW is stored
> in the pt_regs structure, we would have to get a pointer to it (what
> "regs = task_pt_regs(taks)" does in FixPerRegisters). In
> FixPerRegisters we can as well use the PSW_MASK_PER bit.

I see.  Thanks for the explanation.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-07  9:16                 ` Martin Schwidefsky
  2010-01-07 18:16                   ` Oleg Nesterov
@ 2010-01-07 21:41                   ` Roland McGrath
  1 sibling, 0 replies; 41+ messages in thread
From: Roland McGrath @ 2010-01-07 21:41 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Oleg Nesterov, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

> Hmm, command for tracehook_signal_handler say this for stepping:
> @stepping:           nonzero if debugger single-step or block-step in
> use

Are you saying you would like me to clarify that wording somehow?  It's
meant to be implicit that the arch code is not doing any special fakery
about single-step for signal handlers, only processing real single-step
traps (and faking them for a syscall instruction if the arch requires
that).  No other arch does it, so it didn't occur to me that s390 would.
Before tracehook some had ptrace_notify calls there, and the call to
tracehook_signal_handler replaced that call.

> > In ptrace (including utrace-based ptrace), this winds up with sending a
> > SIGTRAP.  So when we finally do get out of do_signal and TIF_SINGLE_STEP
> > causes a second SIGTRAP, it's already pending and the second one makes no
> > difference.
> 
> So we have been lucky so far.

Actually, Oleg rightly points out:

> Confused again, perhaps I just misunderstood what you mean...
> 
> Without utrace, tracehook_signal_handler() doesn't send SIGTRAP, it
> merely does ptrace_notify(SIGTRAP), this means that
[...]
> even without utrace we can have unexpected SIGTRAP.

That is quite true, and I just misremembered when writing that paragraph.

So indeed we have been lucky, but it's not the luck of the problem not
happening on s390, but the luck of nobody ever caring. :-)

> Ok, so with the full utrace the semantics of tracehook_signal_handler
> is more than just causing a SIGTRAP. It is an indication for a signal
> AND a SIGTRAP if single-stepping is active. 

In short, it is the indication of a signal handler having been set up, just
like its kerneldoc description says.  Whatever that should mean to tracing
(SIGTRAP or otherwise) is in the purview of the generic tracing layer, not
the arch layer.

> To make both cases work we
> should stop setting TIF_SINGLE_STEP in do_signal and pass
> current->thread.per_info.single_step to tracehook_signal_handler
> instead of test_thread_flag(TIF_SINGLE_STEP).

Correct.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-07 18:16                   ` Oleg Nesterov
@ 2010-01-07 21:44                     ` Roland McGrath
  2010-01-08  8:34                     ` Martin Schwidefsky
  1 sibling, 0 replies; 41+ messages in thread
From: Roland McGrath @ 2010-01-07 21:44 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Martin Schwidefsky, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

> Can't understand why do we need TIF_SINGLE_STEP at all.

I think you mean "why we need to set it in do_signal" here,
not "why do we need it to exist at all".

> Just pass current->thread.per_info.single_step to
> tracehook_signal_handler() ?

Yes.  I believe this is what Martin meant, and it's what I meant to endorse.

do_signal should not do anything with TIF_SINGLE_STEP at all.
Its only purpose should be to communicate from the low-level
trap assembly code up to the return-to-user assembly code so
it calls do_single_step.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-07  9:18                           ` Martin Schwidefsky
  2010-01-07 17:54                             ` Oleg Nesterov
@ 2010-01-07 21:46                             ` Roland McGrath
  2010-01-08  8:30                               ` Martin Schwidefsky
  1 sibling, 1 reply; 41+ messages in thread
From: Roland McGrath @ 2010-01-07 21:46 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Oleg Nesterov, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

> Clear the TIF_SINGLE_STEP bit in copy_thread. If the new process is
> not auto-attached by the tracer it is wrong to delivere SIGTRAP to
> the new process.

The change is right, but this log entry is confusing.  "auto-attached" has
nothing to do with it, nor does anything about tracing the new process or
not.  The new process has not experienced a PER trap of its own, so it is
wrong to deliver a SIGTRAP that is meant for its creator.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-07 17:54                             ` Oleg Nesterov
@ 2010-01-07 21:48                               ` Roland McGrath
  2010-01-21 20:51                                 ` Oleg Nesterov
  0 siblings, 1 reply; 41+ messages in thread
From: Roland McGrath @ 2010-01-07 21:48 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Martin Schwidefsky, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

> I am confused as well. Yes, I thought about regs->psw.mask change too,
> but I don't understand why it helps..
[...]
> But. Acoording to the testing I did (unless I did something wrong
> again) this patch doesn't make any difference in this particular
> case. 6580807da14c423f0d0a708108e6df6ebc8bc83d does.

Those results are quite mysterious to me.  
I think we'll have to get Martin to sort it out definitively.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-07 21:46                             ` Roland McGrath
@ 2010-01-08  8:30                               ` Martin Schwidefsky
  2010-01-08 10:25                                 ` Roland McGrath
  0 siblings, 1 reply; 41+ messages in thread
From: Martin Schwidefsky @ 2010-01-08  8:30 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Oleg Nesterov, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On Thu,  7 Jan 2010 13:46:42 -0800 (PST)
Roland McGrath <roland@redhat.com> wrote:

> > Clear the TIF_SINGLE_STEP bit in copy_thread. If the new process is
> > not auto-attached by the tracer it is wrong to delivere SIGTRAP to
> > the new process.
> 
> The change is right, but this log entry is confusing.  "auto-attached" has
> nothing to do with it, nor does anything about tracing the new process or
> not.  The new process has not experienced a PER trap of its own, so it is
> wrong to deliver a SIGTRAP that is meant for its creator.

Ok, I changed the wording slightly:

Clear the TIF_SINGLE_STEP bit in copy_thread. The new process did not get
a PER event of its own. It is wrong deliver a SIGTRAP that was meant for
the parent process.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-07 18:16                   ` Oleg Nesterov
  2010-01-07 21:44                     ` Roland McGrath
@ 2010-01-08  8:34                     ` Martin Schwidefsky
  1 sibling, 0 replies; 41+ messages in thread
From: Martin Schwidefsky @ 2010-01-08  8:34 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On Thu, 7 Jan 2010 19:16:32 +0100
Oleg Nesterov <oleg@redhat.com> wrote:

> On 01/07, Martin Schwidefsky wrote:
> >
> > On Wed,  6 Jan 2010 13:08:12 -0800 (PST)
> > Roland McGrath <roland@redhat.com> wrote:
> >
> > > That's what tracehook_signal_handler is for.  You're both doing it yourself
> > > in the arch code (by setting TIF_SINGLE_STEP), and then telling the generic
> > > code to do it (by passing stepping=1 to tracehook_signal_handler).
> >
> > Ok, so with the full utrace the semantics of tracehook_signal_handler
> > is more than just causing a SIGTRAP. It is an indication for a signal
> > AND a SIGTRAP if single-stepping is active. To make both cases work we
> > should stop setting TIF_SINGLE_STEP in do_signal and pass
> > current->thread.per_info.single_step to tracehook_signal_handler
> > instead of test_thread_flag(TIF_SINGLE_STEP).
> 
> Can't understand why do we need TIF_SINGLE_STEP at all.
> 
> Just pass current->thread.per_info.single_step to
> tracehook_signal_handler() ?
> 
> Oleg.
> 
> --- a/arch/s390/kernel/signal.c
> +++ b/arch/s390/kernel/signal.c
> @@ -504,14 +504,8 @@ void do_signal(struct pt_regs *regs)
>  			 * for a normal instruction, act like we took
>  			 * one for the handler setup.
>  			 */
> -			if (current->thread.per_info.single_step)
> -				set_thread_flag(TIF_SINGLE_STEP);
> -
> -			/*
> -			 * Let tracing know that we've done the handler setup.
> -			 */
>  			tracehook_signal_handler(signr, &info, &ka, regs,
> -					 test_thread_flag(TIF_SINGLE_STEP));
> +					 current->thread.per_info.single_step);
>  		}
>  		return;
>  	}
> 

That is what I meant in the other mail. The patch on my local disk
looks almost the same but it removes the comment prior to the
TIF_SINGLE_STEP if statement as well.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-08  8:30                               ` Martin Schwidefsky
@ 2010-01-08 10:25                                 ` Roland McGrath
  0 siblings, 0 replies; 41+ messages in thread
From: Roland McGrath @ 2010-01-08 10:25 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Oleg Nesterov, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

> Ok, I changed the wording slightly:
> 
> Clear the TIF_SINGLE_STEP bit in copy_thread. The new process did not get
> a PER event of its own. It is wrong deliver a SIGTRAP that was meant for
> the parent process.

Very good!

Thanks,
Roland

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-07  9:00               ` Martin Schwidefsky
  2010-01-07 21:32                 ` Roland McGrath
@ 2010-01-21 20:32                 ` Oleg Nesterov
  1 sibling, 0 replies; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-21 20:32 UTC (permalink / raw)
  To: Martin Schwidefsky
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On 01/07, Martin Schwidefsky wrote:
>
> On Wed,  6 Jan 2010 12:56:33 -0800 (PST)
> Roland McGrath <roland@redhat.com> wrote:
>
> > In other circumstances with utrace, it is very possible to wind up with
> > user_disable_single_step being called superfluously when there was no
> > stop (and so not necessarily any context switch or other high overhead).
> > On other machines, user_disable_single_step is pretty cheap even where
> > user_enable_single_step is quite costly.  Given how simple and cheap it
> > is to short-circuit the excess work on s390, I think it is worthwhile.
>
> We could use the same compare of the control registers as the code in
> __switch_to. See below.

FYI, I tested your c3311c13adc1021e986fef12609ceb395ffc5014 commit which
does this optimization (compared to the patch you sent previously), it
works fine.

But please see another email I am going to send...

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-07 21:48                               ` Roland McGrath
@ 2010-01-21 20:51                                 ` Oleg Nesterov
  2010-01-26 13:13                                   ` Martin Schwidefsky
  0 siblings, 1 reply; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-21 20:51 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Martin Schwidefsky, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel

On 01/07, Roland McGrath wrote:
>
> > I am confused as well. Yes, I thought about regs->psw.mask change too,
> > but I don't understand why it helps..
> [...]
> > But. Acoording to the testing I did (unless I did something wrong
> > again) this patch doesn't make any difference in this particular
> > case. 6580807da14c423f0d0a708108e6df6ebc8bc83d does.
>
> Those results are quite mysterious to me.
> I think we'll have to get Martin to sort it out definitively.

I did the testing again with 2.6.32-5.el6 + Martin's
	c3311c13adc1021e986fef12609ceb395ffc5014
	f8d5faf718c9ff2c04eb8484585d4963c4111cd7
patches.

the same test-case:

	#include <stdio.h>
	#include <unistd.h>
	#include <signal.h>
	#include <sys/ptrace.h>
	#include <sys/wait.h>
	#include <assert.h>

	int main(void)
	{
		int pid, status;

		if (!(pid = fork())) {
			assert(ptrace(PTRACE_TRACEME) == 0);
			kill(getpid(), SIGSTOP);

			if (!fork())
				return 43;

			wait(&status);
			return WEXITSTATUS(status);
		}


		for (;;) {
			assert(pid == wait(&status));
			if (WIFEXITED(status))
				break;
			assert(ptrace(PTRACE_SINGLESTEP, pid, 0,0) == 0);
		}

		assert(WEXITSTATUS(status) == 43);
		return 0;
	}

with the simple debugging patch below I did

	# perl -e 'syscall 172, 666, 0,0'; ./xxx
	# perl -e 'syscall 172, 666, 0,1'; ./xxx
	# perl -e 'syscall 172, 666, 1,0'; ./xxx
	# perl -e 'syscall 172, 666, 1,1'; ./xxx

and dmesg reports:

	XXX disable_step=0, clear_flag=0
	XXX: xxx/1868 0
	[... 799 times ...]
	XXX disable_step=0, clear_flag=1
	XXX: xxx/1905 0
	[... 799 times ...]
	XXX disable_step=1, clear_flag=0
	XXX disable_step=1, clear_flag=1

Just in case, I did the testing with and without CONFIG_UTRACE,
result is the same.

IOW, copy_thread()->clear_tsk_thread_flag(TIF_SINGLE_STEP) doesn't
make any difference, copy_process()->user_disable_single_step() does.

Although I need to re-read Martin's explanations about psw magic,
perhaps this was already explained...

Oleg.

--- K/kernel/sys.c~	2010-01-21 14:16:15.366639654 -0500
+++ K/kernel/sys.c	2010-01-21 14:30:35.131591879 -0500
@@ -1453,6 +1453,8 @@ SYSCALL_DEFINE1(umask, int, mask)
 	return mask;
 }
 
+int xxx_disable_step, xxx_clear_flag;
+
 SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 		unsigned long, arg4, unsigned long, arg5)
 {
@@ -1466,6 +1468,13 @@ SYSCALL_DEFINE5(prctl, int, option, unsi
 
 	error = 0;
 	switch (option) {
+		case 666:
+			xxx_disable_step = arg2;
+			xxx_clear_flag = arg3;
+			printk(KERN_INFO "XXX disable_step=%d, clear_flag=%d\n",
+				xxx_disable_step, xxx_clear_flag);
+			break;
+
 		case PR_SET_PDEATHSIG:
 			if (!valid_signal(arg2)) {
 				error = -EINVAL;
--- K/kernel/fork.c~	2010-01-18 09:35:16.823811008 -0500
+++ K/kernel/fork.c	2010-01-21 14:29:39.131624971 -0500
@@ -964,6 +964,8 @@ static void posix_cpu_timers_init(struct
 	INIT_LIST_HEAD(&tsk->cpu_timers[2]);
 }
 
+extern int xxx_disable_step;
+
 /*
  * This creates a new process as a copy of the old one,
  * but does not actually start it yet.
@@ -1207,7 +1209,8 @@ static struct task_struct *copy_process(
 	 * Syscall tracing and stepping should be turned off in the
 	 * child regardless of CLONE_PTRACE.
 	 */
-	user_disable_single_step(p);
+	if (xxx_disable_step)
+		user_disable_single_step(p);
 	clear_tsk_thread_flag(p, TIF_SYSCALL_TRACE);
 #ifdef TIF_SYSCALL_EMU
 	clear_tsk_thread_flag(p, TIF_SYSCALL_EMU);
--- K/arch/s390/kernel/process.c~	2010-01-21 14:32:38.541609793 -0500
+++ K/arch/s390/kernel/process.c	2010-01-21 14:34:10.461584130 -0500
@@ -161,6 +161,8 @@ void release_thread(struct task_struct *
 {
 }
 
+extern int xxx_clear_flag;
+
 int copy_thread(unsigned long clone_flags, unsigned long new_stackp,
 		unsigned long unused,
 		struct task_struct *p, struct pt_regs *regs)
@@ -217,7 +219,8 @@ int copy_thread(unsigned long clone_flag
 	p->thread.mm_segment = get_fs();
 	/* Don't copy debug registers */
 	memset(&p->thread.per_info, 0, sizeof(p->thread.per_info));
-	clear_tsk_thread_flag(p, TIF_SINGLE_STEP);
+	if (xxx_clear_flag)
+		clear_tsk_thread_flag(p, TIF_SINGLE_STEP);
 	/* Initialize per thread user and system timer values */
 	ti = task_thread_info(p);
 	ti->user_timer = 0;


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-21 20:51                                 ` Oleg Nesterov
@ 2010-01-26 13:13                                   ` Martin Schwidefsky
  0 siblings, 0 replies; 41+ messages in thread
From: Martin Schwidefsky @ 2010-01-26 13:13 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel, stable

On Thu, 21 Jan 2010 21:51:13 +0100
Oleg Nesterov <oleg@redhat.com> wrote:

> On 01/07, Roland McGrath wrote:
> >
> > > I am confused as well. Yes, I thought about regs->psw.mask change too,
> > > but I don't understand why it helps..
> > [...]
> > > But. Acoording to the testing I did (unless I did something wrong
> > > again) this patch doesn't make any difference in this particular
> > > case. 6580807da14c423f0d0a708108e6df6ebc8bc83d does.
> >
> > Those results are quite mysterious to me.
> > I think we'll have to get Martin to sort it out definitively.

Finally nailed that one. Grrmpf.. the special case in the program check
handler for single stepped svcs clobbers the argument registers. With our
test case this affects the clone() system call. Funny things happen when
the clone_flags argument is more or less random ..
The following patch fixes the problem for me.

--
Subject: [PATCH] fix single stepped svcs with TRACE_IRQFLAGS=y

From: Martin Schwidefsky <schwidefsky@de.ibm.com>

If irq flags tracing is enabled the TRACE_IRQS_ON macros expands to
a function call which clobbers registers %r0-%r5. The macro is used
in the code path for single stepped system calls. The argument
registers %r2-%r6 need to be restored from the stack before the system
call function is called.

Cc: stable@kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---

 arch/s390/kernel/entry.S   |    1 +
 arch/s390/kernel/entry64.S |    1 +
 2 files changed, 2 insertions(+)

diff -urpN linux-2.6/arch/s390/kernel/entry64.S linux-2.6-patched/arch/s390/kernel/entry64.S
--- linux-2.6/arch/s390/kernel/entry64.S	2009-12-03 04:51:21.000000000 +0100
+++ linux-2.6-patched/arch/s390/kernel/entry64.S	2010-01-26 14:04:58.000000000 +0100
@@ -549,6 +549,7 @@ pgm_svcper:
 	mvc	__THREAD_per+__PER_access_id(1,%r8),__LC_PER_ACCESS_ID
 	oi	__TI_flags+7(%r9),_TIF_SINGLE_STEP # set TIF_SINGLE_STEP
 	TRACE_IRQS_ON
+	lmg	%r2,%r6,SP_R2(%r15)	# load svc arguments
 	stosm	__SF_EMPTY(%r15),0x03	# reenable interrupts
 	j	sysc_do_svc
 
diff -urpN linux-2.6/arch/s390/kernel/entry.S linux-2.6-patched/arch/s390/kernel/entry.S
--- linux-2.6/arch/s390/kernel/entry.S	2009-12-03 04:51:21.000000000 +0100
+++ linux-2.6-patched/arch/s390/kernel/entry.S	2010-01-26 14:04:58.000000000 +0100
@@ -571,6 +571,7 @@ pgm_svcper:
 	mvc	__THREAD_per+__PER_access_id(1,%r8),__LC_PER_ACCESS_ID
 	oi	__TI_flags+3(%r9),_TIF_SINGLE_STEP # set TIF_SINGLE_STEP
 	TRACE_IRQS_ON
+	lm	%r2,%r6,SP_R2(%r15)	# load svc arguments
 	stosm	__SF_EMPTY(%r15),0x03	# reenable interrupts
 	b	BASED(sysc_do_svc)
 
-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
  2010-01-06 15:33 ` caiqian
@ 2010-01-06 20:09   ` Oleg Nesterov
  0 siblings, 0 replies; 41+ messages in thread
From: Oleg Nesterov @ 2010-01-06 20:09 UTC (permalink / raw)
  To: caiqian
  Cc: Roland McGrath, Heiko Carstens, Jan Kratochvil, linux-kernel,
	linux-s390, utrace-devel, Martin Schwidefsky

On 01/06, caiqian@redhat.com wrote:
>
> > Cai, any chance you can help? Using /usr/bin/console I can't choose
> > anotherkernel at the "00: Please choose (default will boot in 15 seconds):"
> > stage, how can I do this???
>
> As Heiko mentioned, I did manage to enter,
>
> #cp vi vmsg 2

if only I new about this magic ;)

Thanks Cai and Heiko!

Oleg.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: s390 && user_enable_single_step() (Was: odd utrace testing results on s390x)
       [not found] <1158952983.251101262791902387.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
@ 2010-01-06 15:33 ` caiqian
  2010-01-06 20:09   ` Oleg Nesterov
  0 siblings, 1 reply; 41+ messages in thread
From: caiqian @ 2010-01-06 15:33 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, caiqian, Heiko Carstens, Jan Kratochvil,
	linux-kernel, linux-s390, utrace-devel, Martin Schwidefsky


> Cai, any chance you can help? Using /usr/bin/console I can't choose
> anotherkernel at the "00: Please choose (default will boot in 15 seconds):"
> stage, how can I do this???

As Heiko mentioned, I did manage to enter,

#cp vi vmsg 2

during the prompt to get to the second kernel to boot...

Thanks,
CAI Qian

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2010-01-26 13:13 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1503844142.2061111261478093776.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
     [not found] ` <1257887498.2061171261478252049.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
2010-01-04 15:52   ` s390 && user_enable_single_step() (Was: odd utrace testing results on s390x) Oleg Nesterov
2010-01-04 16:16     ` Martin Schwidefsky
2010-01-04 18:14       ` Oleg Nesterov
2010-01-04 19:30         ` Oleg Nesterov
2010-01-04 21:11         ` Roland McGrath
2010-01-05  9:50           ` Martin Schwidefsky
2010-01-05 15:36             ` Oleg Nesterov
2010-01-05 15:46               ` Martin Schwidefsky
2010-01-05 15:59                 ` Oleg Nesterov
2010-01-05 17:03                   ` Oleg Nesterov
2010-01-05 19:58                     ` Oleg Nesterov
2010-01-06 14:59                       ` Heiko Carstens
2010-01-06 20:17                       ` Oleg Nesterov
2010-01-06 21:13                         ` Roland McGrath
2010-01-07  9:18                           ` Martin Schwidefsky
2010-01-07 17:54                             ` Oleg Nesterov
2010-01-07 21:48                               ` Roland McGrath
2010-01-21 20:51                                 ` Oleg Nesterov
2010-01-26 13:13                                   ` Martin Schwidefsky
2010-01-07 21:46                             ` Roland McGrath
2010-01-08  8:30                               ` Martin Schwidefsky
2010-01-08 10:25                                 ` Roland McGrath
2010-01-05 15:47               ` Oleg Nesterov
2010-01-05 15:50                 ` Martin Schwidefsky
2010-01-06 21:08               ` Roland McGrath
2010-01-07  9:16                 ` Martin Schwidefsky
2010-01-07 18:16                   ` Oleg Nesterov
2010-01-07 21:44                     ` Roland McGrath
2010-01-08  8:34                     ` Martin Schwidefsky
2010-01-07 21:41                   ` Roland McGrath
2010-01-07 18:11                 ` Oleg Nesterov
2010-01-06 20:23             ` Oleg Nesterov
2010-01-06 20:56             ` Roland McGrath
2010-01-07  9:00               ` Martin Schwidefsky
2010-01-07 21:32                 ` Roland McGrath
2010-01-21 20:32                 ` Oleg Nesterov
2010-01-05  9:26         ` Martin Schwidefsky
2010-01-06 21:15           ` Roland McGrath
2010-01-04 20:46       ` Roland McGrath
     [not found] <1158952983.251101262791902387.JavaMail.root@zmail06.collab.prod.int.phx2.redhat.com>
2010-01-06 15:33 ` caiqian
2010-01-06 20:09   ` Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).