All of lore.kernel.org
 help / color / mirror / Atom feed
From: Philippe Gerum <rpm@xenomai.org>
To: Jan Kiszka <jan.kiszka@domain.hid>
Cc: Xenomai core <Xenomai-core@domain.hid>
Subject: Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and	thread deletion
Date: Tue, 12 Jul 2011 17:48:02 +0200	[thread overview]
Message-ID: <1310485682.2154.212.camel@domain.hid> (raw)
In-Reply-To: <4E1C44B4.50106@domain.hid>

On Tue, 2011-07-12 at 14:57 +0200, Jan Kiszka wrote:
> On 2011-07-12 14:13, Jan Kiszka wrote:
> > On 2011-07-12 14:06, Gilles Chanteperdrix wrote:
> >> On 07/12/2011 01:58 PM, Jan Kiszka wrote:
> >>> On 2011-07-12 13:56, Jan Kiszka wrote:
> >>>> However, this parallel unsynchronized execution of the gatekeeper and
> >>>> its target thread leaves an increasingly bad feeling on my side. Did we
> >>>> really catch all corner cases now? I wouldn't guarantee that yet.
> >>>> Specifically as I still have an obscure crash of a Xenomai thread on
> >>>> Linux schedule() on my table.
> >>>>
> >>>> What if the target thread woke up due to a signal, continued much
> >>>> further on a different CPU, blocked in TASK_INTERRUPTIBLE, and then the
> >>>> gatekeeper continued? I wish we could already eliminate this complexity
> >>>> and do the migration directly inside schedule()...
> >>>
> >>> BTW, we do we mask out TASK_ATOMICSWITCH when checking the task state in
> >>> the gatekeeper? What would happen if we included it (state ==
> >>> (TASK_ATOMICSWITCH | TASK_INTERRUPTIBLE))?
> >>
> >> I would tend to think that what we should check is
> >> xnthread_test_info(XNATOMIC). Or maybe check both, the interruptible
> >> state and the XNATOMIC info bit.
> > 
> > Actually, neither the info bits nor the task state is sufficiently
> > synchronized against the gatekeeper yet. We need to hold a shared lock
> > when testing and resetting the state. I'm not sure yet if that is
> > fixable given the gatekeeper architecture.
> > 
> 
> This may work (on top of the exit-race fix):
> 
> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> index 50dcf43..90feb16 100644
> --- a/ksrc/nucleus/shadow.c
> +++ b/ksrc/nucleus/shadow.c
> @@ -913,20 +913,27 @@ static int gatekeeper_thread(void *data)
>  		if ((xnthread_user_task(target)->state & ~TASK_ATOMICSWITCH) == TASK_INTERRUPTIBLE) {
>  			rpi_pop(target);
>  			xnlock_get_irqsave(&nklock, s);
> -#ifdef CONFIG_SMP
> +
>  			/*
> -			 * If the task changed its CPU while in
> -			 * secondary mode, change the CPU of the
> -			 * underlying Xenomai shadow too. We do not
> -			 * migrate the thread timers here, it would
> -			 * not work. For a "full" migration comprising
> -			 * timers, using xnpod_migrate_thread is
> -			 * required.
> +			 * Recheck XNATOMIC to avoid waking the shadow if the
> +			 * Linux task received a signal meanwhile.
>  			 */
> -			if (target->sched != sched)
> -				xnsched_migrate_passive(target, sched);
> +			if (xnthread_test_info(target, XNATOMIC)) {
> +#ifdef CONFIG_SMP
> +				/*
> +				 * If the task changed its CPU while in
> +				 * secondary mode, change the CPU of the
> +				 * underlying Xenomai shadow too. We do not
> +				 * migrate the thread timers here, it would
> +				 * not work. For a "full" migration comprising
> +				 * timers, using xnpod_migrate_thread is
> +				 * required.
> +				 */
> +				if (target->sched != sched)
> +					xnsched_migrate_passive(target, sched);
>  #endif /* CONFIG_SMP */
> -			xnpod_resume_thread(target, XNRELAX);
> +				xnpod_resume_thread(target, XNRELAX);
> +			}
>  			xnlock_put_irqrestore(&nklock, s);
>  			xnpod_schedule();
>  		}
> @@ -1036,6 +1043,7 @@ redo:
>  	 * to process this signal anyway.
>  	 */
>  	if (rthal_current_domain == rthal_root_domain) {
> +		XENO_BUGON(NUCLEUS, xnthread_test_info(thread, XNATOMIC));
>  		if (XENO_DEBUG(NUCLEUS) && (!signal_pending(this_task)
>  		    || this_task->state != TASK_RUNNING))
>  			xnpod_fatal
> @@ -1044,6 +1052,8 @@ redo:
>  		return -ERESTARTSYS;
>  	}
>  
> +	xnthread_clear_info(thread, XNATOMIC);
> +
>  	/* "current" is now running into the Xenomai domain. */
>  	thread->gksched = NULL;
>  	sched = xnsched_finish_unlocked_switch(thread->sched);
> @@ -2650,6 +2660,8 @@ static inline void do_sigwake_event(struct task_struct *p)
>  
>  	xnlock_get_irqsave(&nklock, s);
>  
> +	xnthread_clear_info(thread, XNATOMIC);
> +
>  	if ((p->ptrace & PT_PTRACED) && !xnthread_test_state(thread, XNDEBUG)) {
>  		sigset_t pending;
>  
> 
> It totally ignores RPI and PREEMPT_RT for now. RPI is broken anyway,

I want to drop RPI in v3 for sure because it is misleading people. I'm
still pondering whether we should do that earlier during the 2.6
timeframe.

> ripping it out would allow to use solely XNATOMIC as condition in the
> gatekeeper.
> 
> /me is now looking to get this applied to a testbox that still crashes
> suspiciously.
> 
> Jan
> 
> _______________________________________________
> Xenomai-core mailing list
> Xenomai-core@domain.hid
> https://mail.gna.org/listinfo/xenomai-core

-- 
Philippe.




  reply	other threads:[~2011-07-12 15:48 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <E1QfDvt-0003TN-G7@domain.hid>
2011-07-11 18:53 ` [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion Gilles Chanteperdrix
2011-07-11 19:10   ` Jan Kiszka
2011-07-11 19:16     ` Jan Kiszka
2011-07-11 19:51       ` Gilles Chanteperdrix
2011-07-11 19:59         ` Jan Kiszka
2011-07-11 20:02           ` Gilles Chanteperdrix
2011-07-11 20:06             ` Jan Kiszka
2011-07-11 20:09               ` Gilles Chanteperdrix
2011-07-11 20:12                 ` Jan Kiszka
2011-07-12  6:41                   ` Gilles Chanteperdrix
2011-07-12  7:22                     ` Jan Kiszka
2011-07-12  7:49                       ` Gilles Chanteperdrix
2011-07-12 10:59                       ` Gilles Chanteperdrix
2011-07-12 11:00                         ` Jan Kiszka
2011-07-12 11:04                           ` Gilles Chanteperdrix
2011-07-12 11:06                             ` Jan Kiszka
2011-07-12 11:08                               ` Gilles Chanteperdrix
2011-07-12 11:10                                 ` Jan Kiszka
2011-07-12 11:26                                   ` Gilles Chanteperdrix
2011-07-12 11:29                                     ` Jan Kiszka
2011-07-12 11:41                                       ` Gilles Chanteperdrix
2011-07-12 11:56                                         ` Jan Kiszka
2011-07-12 11:58                                           ` Jan Kiszka
2011-07-12 12:06                                             ` Gilles Chanteperdrix
2011-07-12 12:13                                               ` Jan Kiszka
2011-07-12 12:57                                                 ` Jan Kiszka
2011-07-12 15:48                                                   ` Philippe Gerum [this message]
2011-07-12 16:18                                                     ` Jan Kiszka
2011-07-12 17:31                                                   ` Gilles Chanteperdrix
2011-07-12 17:34                                                     ` Jan Kiszka
2011-07-12 17:38                                                       ` Gilles Chanteperdrix
2011-07-12 17:43                                                         ` Jan Kiszka
2011-07-13 18:39                                                           ` Gilles Chanteperdrix
2011-07-13 19:04                                                             ` Jan Kiszka
2011-07-13 19:12                                                               ` Gilles Chanteperdrix
2011-07-14 20:57                                                                 ` Jan Kiszka
2011-07-15 12:30                                                                   ` Gilles Chanteperdrix
2011-07-15 13:10                                                                     ` Jan Kiszka
2011-07-16  8:13                                                                       ` Jan Kiszka
2011-07-16  8:52                                                                         ` Philippe Gerum
2011-07-16  9:15                                                                           ` Jan Kiszka
2011-07-16  9:56                                                                             ` Philippe Gerum
2011-07-16 17:16                                                                               ` Jan Kiszka
2011-07-13 19:35                                                             ` Philippe Gerum

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1310485682.2154.212.camel@domain.hid \
    --to=rpm@xenomai.org \
    --cc=Xenomai-core@domain.hid \
    --cc=jan.kiszka@domain.hid \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.