From: Philippe Gerum <rpm@xenomai.org>
To: "Chen, Hongzhan" <hongzhan.chen@intel.com>
Cc: xenomai@xenomai.org
Subject: Re: gdb  test failure debug status update
Date: Wed, 28 Apr 2021 16:18:42 +0200
Message-ID: <87mtti3325.fsf@xenomai.org>
In-Reply-To: <DM5PR11MB18525F69E34389DE7A8EE49BF2409@DM5PR11MB1852.namprd11.prod.outlook.com>


Chen, Hongzhan via Xenomai <xenomai@xenomai.org> writes:

> According to my validation, the gdb test fails on the dovetail 5.10 branch but passes on the v5.9-evl4 tag with the same
> for-upstream/dovetail xenomai code base.
>
> After further debugging, the issue is clearer to me. The gdb test fails because the low-priority smokey thread still
> executes in user space after "cobalt_shadow_relaxed: state=0x4488c0 info=0x200", as shown in log [1] on the dovetail-5.10 branch.
> The weird thing is that the first ftrace entry following cobalt_shadow_relaxed in log [1] is at 62235.848583,
> almost 3ms after cobalt_shadow_relaxed. The low-priority smokey thread runs in user space during this
> 3ms period, so the test fails.
>
> But in the success case with v5.9-evl4, as in log [2], the interval between cobalt_shadow_relaxed and the first ftrace entry following it
> is only about 1us. It seems that the low-priority smokey thread has no chance to run in user space during this 1us, since the gdb test succeeds.
>
> My question is: why does no interrupt happen at all during that roughly 3ms period in the failure case? The tick seems to behave abnormally.
> Please comment if you have any ideas for debugging this further.
>
> PS: All my tests run on the same up Xtream board.

<snip>

Let's put aside the tick issue for now, there may be a valid reason for
this delay with dynticks enabled.

The issue at stake may be related to the way a return to kernel space is
forced on a @user task (Dovetail has an integrated service for
triggering this called dovetail_request_ucall()).

The logic for doing so is as follows:

1. @user hits a breakpoint, which is an exception Dovetail-wise.

2. @user gets XNDBGSTOP set into its flags because Cobalt notices it is
being debugged via a breakpoint trap, then it is relaxed as a result of
taking an exception in general, so that we may traverse the common trap
handling code safely.

3. since XNDBGSTOP is a blocking bit Cobalt-wise, it should prevent
@user from being picked for scheduling by the real-time core, the next
time Cobalt considers rescheduling, that is. However, since @user is
currently relaxed, it can still run under the supervision of the common
Linux scheduler. This is what log [1] shows.

4. the common/in-band kernel code stops @user due to the ptrace stop
condition caused by the breakpoint, waiting for a continuation event to
happen.
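
Steps 1-4 can be pictured with a toy user-space model. This is only a
sketch under stated assumptions: struct toy_thread, the TOY_* bit values
and the toy_*() helpers are all made up for illustration and are not
the Cobalt or Dovetail definitions. TOY_DBGSTOP merely stands in for
XNDBGSTOP. The point it shows is that a relaxed thread is out of reach
of the real-time core, yet still eligible for the in-band scheduler as
soon as nothing stops it on the Linux side.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for this toy model, NOT the real Cobalt ones. */
#define TOY_RELAXED 0x1u /* thread currently runs in-band (relaxed) */
#define TOY_DBGSTOP 0x2u /* stands in for XNDBGSTOP, a blocking bit */

struct toy_thread {
	unsigned int state;
	bool ptrace_stopped; /* in-band ptrace stop condition (step 4) */
};

/* Steps 1, 2 and 4: the breakpoint trap marks, relaxes and stops @user. */
void toy_breakpoint_trap(struct toy_thread *t)
{
	t->state |= TOY_DBGSTOP; /* step 2: debugged via a breakpoint */
	t->state |= TOY_RELAXED; /* step 2: relaxed to handle the trap */
	t->ptrace_stopped = true; /* step 4: in-band ptrace stop */
	assert(t->state & TOY_DBGSTOP);
}

/* Step 3: the real-time core skips relaxed or blocked threads. */
bool toy_rt_core_may_pick(const struct toy_thread *t)
{
	return !(t->state & (TOY_RELAXED | TOY_DBGSTOP));
}

/* Step 3: the in-band scheduler only looks at the in-band conditions. */
bool toy_inband_may_run(const struct toy_thread *t)
{
	return (t->state & TOY_RELAXED) && !t->ptrace_stopped;
}
```

In this model, once the ptrace stop is lifted, toy_inband_may_run()
turns true again while toy_rt_core_may_pick() stays false: the blocking
bit alone does not keep a relaxed thread off the CPU.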

Therefore, upon PTRACE_CONT (i.e. gdb continue), we need to force @user
to call back into kernel context (handle_ptrace_cont ->
dovetail_request_ucall), then ask for a switch to primary mode from
there, which should eventually happen when @user is about to leave the
kernel (on x86, this now happens from the generic kernel entry/exit code
in kernel/entry/*). As a result, handle_user_return() runs, figures
out that @user is pending a switch to primary mode. As it switches to
primary mode, @user would be blocked by Cobalt from running further,
because XNDBGSTOP is set into its internal state.
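
The continuation path can be sketched the same way, again as a toy
user-space model and not as the actual kernel code. Everything named
toy_* or TOY_* is hypothetical, including the bit values.
toy_ptrace_cont() stands in for handle_ptrace_cont ->
dovetail_request_ucall, and toy_user_return() for the handler run on
the way back to user space. Where exactly the continuation flag
(XNCONTHI in Cobalt, TOY_CONTHI here) gets set is an assumption of the
model. It shows why a lost ucall request, or a missing flag, leaves
@user relaxed and free to run user code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for this toy model, NOT the real Cobalt ones. */
#define TOY_RELAXED 0x1u /* state: thread runs in-band */
#define TOY_DBGSTOP 0x2u /* state: stands in for XNDBGSTOP */
#define TOY_CONTHI  0x4u /* flags: stands in for XNCONTHI */

struct toy_thread {
	unsigned int state;
	unsigned int flags;
	bool ucall_pending; /* a call back into the kernel was requested */
};

/* PTRACE_CONT side: request the ucall, note the pending mode switch. */
void toy_ptrace_cont(struct toy_thread *t)
{
	assert(t->state & TOY_RELAXED); /* only relaxed threads get here */
	t->flags |= TOY_CONTHI;
	t->ucall_pending = true;
}

/*
 * Return-to-user side. Returns true if the thread switched back to
 * primary mode, false if nothing was pending for it.
 */
bool toy_user_return(struct toy_thread *t)
{
	if (!t->ucall_pending)
		return false; /* the ucall request never reached us */
	t->ucall_pending = false;
	if (!(t->flags & TOY_CONTHI))
		return false; /* no continuation was recorded */
	t->flags &= ~TOY_CONTHI;
	t->state &= ~TOY_RELAXED; /* switch to primary mode */
	return true;
}

/* May @user execute user-space code right now? */
bool toy_may_run_user_code(const struct toy_thread *t)
{
	if (t->state & TOY_RELAXED)
		return true; /* up to the in-band scheduler */
	return !(t->state & TOY_DBGSTOP); /* blocking bit holds it */
}
```

Calling toy_ptrace_cont() then toy_user_return() on a relaxed thread
with TOY_DBGSTOP set makes toy_may_run_user_code() return false.
Skipping toy_ptrace_cont(), or dropping the flag before
toy_user_return() runs, leaves it returning true, which is the kind of
behavior the failing log would be consistent with.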

So, I would check a few things for starters:

- is dovetail_request_ucall() working properly?
- is XNCONTHI properly set into the local Cobalt flags of @user when
  handle_user_return() is entered?
- is this path taken as expected once dovetail_request_ucall() has run
  for @user:
  exit_to_user_mode_prepare -> do_retuser -> inband_retuser_notify
  (kernel/entry/common.c)?
  
It may be a good idea to enable all Cobalt tracepoints, and to add one
to handle_user_return() too.

-- 
Philippe.

