* gdb test failure issue on xenomai 3.2
@ 2021-03-02 11:20 Chen, Hongzhan
  2021-03-05  1:15 ` Chen, Hongzhan
  0 siblings, 1 reply; 3+ messages in thread
From: Chen, Hongzhan @ 2021-03-02 11:20 UTC (permalink / raw)
  To: Philippe Gerum, xenomai

Hi Philippe,

The gdb test fails on xenomai 3.2 because the low-priority smokey process still gets a chance to be
scheduled and execute its user-space task. According to my debugging, this seems related to
handle_ptrace_resume() in dovetail/kevents.c: in the failing case it tries to resume hi-thread
(stack [1]), and hi-thread then calls xnthread_harden() (stack [2]).
On an IPIPE-xenomai image, by contrast, the smokey thread is resumed first, and xnthread_harden()
is subsequently called in the smokey thread.
The problem is that I do not know why the failing case calls xnthread_resume() on hi-thread first
(log [3]), while the IPIPE-xenomai image always calls xnthread_resume() on the smokey non-rt thread
(log [4]), so I do not yet know how to fix it. I still need more time to debug this. Please comment.

In addition, I found a comment in handle_ptstep_event() of kernel/evl/thread.c saying that
"the ptracer might have switched focus", and in the gdb test case the breakpoint does exactly that:
it switches focus from the smokey process to hi-thread. Does that mean we need some special
handling for it?
That said, EVL is quite different from Xenomai in how it handles INBAND_TASK_PTSTEP in handle_inband_event().

[1]:
0xffffffff811c4020 xnthread_resume(): kernel/xenomai/thread.c, line 1039	
0xffffffff811b51e0 handle_ptrace_resume(): ...ernel/xenomai/pipeline/kevents.c, line 474	
0xffffffff81078bec inband_ptstep_notify(): ./include/linux/dovetail.h, line 121	
0xffffffff81078bec ptrace_resume(): kernel/ptrace.c, line 830	
0xffffffff81079f32 ptrace_request(): kernel/ptrace.c, line 1194	
0xffffffff8102eaab arch_ptrace(): arch/x86/kernel/ptrace.c, line 818	
0xffffffff8107949b __do_sys_ptrace(): kernel/ptrace.c, line 1279	
0xffffffff8107949b __se_sys_ptrace(): kernel/ptrace.c, line 1244	
0xffffffff8107949b __x64_sys_ptrace(): kernel/ptrace.c, line 1244	
0xffffffff81b93549 do_syscall_64(): arch/x86/entry/common.c, line 53	
0xffffffff81c0007c entry_SYSCALL_64_after_hwframe(): arch/x86/entry/entry_64.S, line 118	

[2]:
0xffffffff811c6340 xnthread_harden(): kernel/xenomai/thread.c, line 1864	
0xffffffff811b56fa handle_user_return(): ...ernel/xenomai/pipeline/kevents.c, line 339	
0xffffffff811b56fa handle_inband_event(): ...ernel/xenomai/pipeline/kevents.c, line 514	
0xffffffff8110791e do_retuser(): kernel/entry/common.c, line 238	
0xffffffff8110791e exit_to_user_mode_prepare(): kernel/entry/common.c, line 262	
0xffffffff81b97795 irqentry_exit_to_user_mode(): kernel/entry/common.c, line 356	
0xffffffff81b941dc exc_int3(): arch/x86/kernel/traps.c, line 710	
0xffffffff81c00aa1 asm_exc_int3(): .../arch/x86/include/asm/idtentry.h, line 592	

[3]:
gdb-1153    [003] d..3 17451.761941: sched_wait_task: comm=hi-thread pid=1160 prio=97
gdb-1153    [003] d..3 17451.761944: sched_wait_task: comm=hi-thread pid=1160 prio=97
gdb-1153    [003] d..3 17451.761947: sched_wait_task: comm=hi-thread pid=1160 prio=97
gdb-1153    [003] d..3 17451.761953: sched_wait_task: comm=hi-thread pid=1160 prio=97
gdb-1153    [003] d..3 17451.761958: sched_wait_task: comm=hi-thread pid=1160 prio=97
gdb-1153    [003] d..3 17451.761973: sched_wait_task: comm=hi-thread pid=1160 prio=97
gdb-1153    [003] D..1 17451.761976: cobalt_thread_resume: name=hi-thread pid=1160 mask=0x400000
gdb-1153    [003] D..1 17451.761977: cobalt_trace_pid: pid=1160, prio=2
gdb-1153    [003] d..3 17451.761978: sched_waking: comm=hi-thread pid=1160 prio=97 target_cpu=000

[4]:
             gdb-839     [000] d..1  2420.981050: signal_deliver: sig=17 errno=0 code=4 sa_handler=55bb958074d0 sa_flags=14000000
             gdb-839     [000] ....  2420.981069: sched_process_wait: comm=gdb pid=0 prio=120
             gdb-839     [000] d..2  2420.981075: sched_wait_task: comm=hi-thread pid=846 prio=97
             gdb-839     [000] d..2  2420.981080: sched_wait_task: comm=hi-thread pid=846 prio=97
             gdb-839     [000] d..2  2420.981082: sched_wait_task: comm=hi-thread pid=846 prio=97
             gdb-839     [000] d..2  2420.981084: sched_wait_task: comm=hi-thread pid=846 prio=97
             gdb-839     [000] ....  2420.981087: sched_process_wait: comm=gdb pid=0 prio=120
             gdb-839     [000] d..2  2420.981147: sched_wait_task: comm=hi-thread pid=846 prio=97
             gdb-839     [000] d..2  2420.981150: sched_wait_task: comm=hi-thread pid=846 prio=97
             gdb-839     [000] d..1  2420.981303: cobalt_thread_resume: name=smokey pid=841 mask=0x400000
             gdb-839     [000] d..1  2420.981304: cobalt_thread_unblock: pid=841 state=0x48848 info=0x0
             gdb-839     [000] d..1  2420.981305: cobalt_schedule_remote: status=0x10000000
             gdb-839     [000] d..1  2420.981305: cobalt_schedule: status=0x10000000

Regards

Hongzhan Chen




* RE: gdb test failure issue on xenomai 3.2
  2021-03-02 11:20 gdb test failure issue on xenomai 3.2 Chen, Hongzhan
@ 2021-03-05  1:15 ` Chen, Hongzhan
  2021-03-05 10:56   ` Philippe Gerum
  0 siblings, 1 reply; 3+ messages in thread
From: Chen, Hongzhan @ 2021-03-05  1:15 UTC (permalink / raw)
  To: Philippe Gerum, xenomai

Hi Philippe,

I think I have found the root cause and have almost fixed it, but I still need time to validate the fix.

Temporary patch as follows:

diff --git a/kernel/cobalt/dovetail/sched.c b/kernel/cobalt/dovetail/sched.c
index de7c43b70..846b571b6 100644
--- a/kernel/cobalt/dovetail/sched.c
+++ b/kernel/cobalt/dovetail/sched.c
@@ -56,9 +56,22 @@ int pipeline_leave_inband(void)

 int pipeline_leave_oob_prepare(void)
 {
-       dovetail_leave_oob();
+       int suspmask = XNRELAX;
+       struct xnthread *curr = xnthread_current();

-       return XNRELAX;
+       dovetail_leave_oob();
+       /*
+        * If current is being debugged, record that it should migrate
+        * back in case it resumes in userspace. If it resumes in
+        * kernel space, i.e.  over a restarting syscall, the
+        * associated hardening will both clear XNCONTHI and disable
+        * the user return notifier again.
+        */
+       if (xnthread_test_state(curr, XNSSTEP)) {
+               xnthread_set_info(curr, XNCONTHI);
+               suspmask |= XNDBGSTOP;
+       }
+       return suspmask;
 }

 void pipeline_leave_oob_finish(void)
diff --git a/kernel/cobalt/dovetail/kevents.c b/kernel/cobalt/dovetail/kevents.c
index 966a63ce0..9bceffaac 100644
--- a/kernel/cobalt/dovetail/kevents.c
+++ b/kernel/cobalt/dovetail/kevents.c
        if (xnthread_test_info(thread, XNCONTHI)) {
                xnlock_get_irqsave(&nklock, s);
                xnthread_clear_info(thread, XNCONTHI);
@@ -492,6 +494,8 @@ static void handle_ptrace_cont(void)
                        unregister_debugged_thread(curr);

                xnthread_set_localinfo(curr, XNHICCUP);
+
+               dovetail_request_ucall(current);
        }

After validation is done, I will submit a clean patch for review.

Regards

Hongzhan Chen





* Re: gdb test failure issue on xenomai 3.2
  2021-03-05  1:15 ` Chen, Hongzhan
@ 2021-03-05 10:56   ` Philippe Gerum
  0 siblings, 0 replies; 3+ messages in thread
From: Philippe Gerum @ 2021-03-05 10:56 UTC (permalink / raw)
  To: Chen, Hongzhan; +Cc: xenomai


Hi Hongzhan,

Sorry for the lag in replying.

Chen, Hongzhan <hongzhan.chen@intel.com> writes:

> Hi Philippe,
>
> I think I have found the root cause and have almost fixed it, but I still need time to validate the fix.
>
> Temporary patch as follows:
>
> diff --git a/kernel/cobalt/dovetail/sched.c b/kernel/cobalt/dovetail/sched.c
> index de7c43b70..846b571b6 100644
> --- a/kernel/cobalt/dovetail/sched.c
> +++ b/kernel/cobalt/dovetail/sched.c
> @@ -56,9 +56,22 @@ int pipeline_leave_inband(void)
>
>  int pipeline_leave_oob_prepare(void)
>  {
> -       dovetail_leave_oob();
> +       int suspmask = XNRELAX;
> +       struct xnthread *curr = xnthread_current();
>
> -       return XNRELAX;
> +       dovetail_leave_oob();
> +       /*
> +        * If current is being debugged, record that it should migrate
> +        * back in case it resumes in userspace. If it resumes in
> +        * kernel space, i.e.  over a restarting syscall, the
> +        * associated hardening will both clear XNCONTHI and disable
> +        * the user return notifier again.
> +        */
> +       if (xnthread_test_state(curr, XNSSTEP)) {
> +               xnthread_set_info(curr, XNCONTHI);
> +               suspmask |= XNDBGSTOP;
> +       }
> +       return suspmask;
>  }

Yes, that part was definitely missing. I guess I dropped it mistakenly
due to the condition on the obsolete IPIPE_KEVT_USERINTRET check.

>
>  void pipeline_leave_oob_finish(void)
> intel@intel-Z97X-UD5H:~/iotg/dovetail/xenomaioverdovetail/cobalt-dovetail/upstream/xenomai-rpm$ git diff kernel/cobalt/dovetail/kevents.c
> diff --git a/kernel/cobalt/dovetail/kevents.c b/kernel/cobalt/dovetail/kevents.c
> index 966a63ce0..9bceffaac 100644
> --- a/kernel/cobalt/dovetail/kevents.c
> +++ b/kernel/cobalt/dovetail/kevents.c
>         if (xnthread_test_info(thread, XNCONTHI)) {
>                 xnlock_get_irqsave(&nklock, s);
>                 xnthread_clear_info(thread, XNCONTHI);
> @@ -492,6 +494,8 @@ static void handle_ptrace_cont(void)
>                         unregister_debugged_thread(curr);
>
>                 xnthread_set_localinfo(curr, XNHICCUP);
> +
> +               dovetail_request_ucall(current);
>         }
>

Looks good to me too. Once the ptrace core tells us that current is
resuming from a stopped state, this is the right place to ask for
switching back to oob mode.

> After validation is done, I will submit a clean patch for review.

Thanks for looking at this one. This is quite tricky stuff.

-- 
Philippe.


