From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <4E1B5860.1000309@domain.hid>
Date: Mon, 11 Jul 2011 22:09:04 +0200
From: Gilles Chanteperdrix
MIME-Version: 1.0
References: <4E1B469A.8000703@domain.hid> <4E1B4AC0.80506@domain.hid>
 <4E1B4C19.2070205@domain.hid> <4E1B542B.2010906@domain.hid>
 <4E1B5638.1050005@domain.hid> <4E1B56E0.20109@domain.hid>
 <4E1B57D1.1070401@domain.hid>
In-Reply-To: <4E1B57D1.1070401@domain.hid>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai-core] [Xenomai-git] Jan Kiszka : nucleus: Fix race between gatekeeper and thread deletion
List-Id: Xenomai life and development
To: Jan Kiszka
Cc: Xenomai core

On 07/11/2011 10:06 PM, Jan Kiszka wrote:
> On 2011-07-11 22:02, Gilles Chanteperdrix wrote:
>> On 07/11/2011 09:59 PM, Jan Kiszka wrote:
>>> On 2011-07-11 21:51, Gilles Chanteperdrix wrote:
>>>> On 07/11/2011 09:16 PM, Jan Kiszka wrote:
>>>>> On 2011-07-11 21:10, Jan Kiszka wrote:
>>>>>> On 2011-07-11 20:53, Gilles Chanteperdrix wrote:
>>>>>>> On 07/08/2011 06:29 PM, GIT version control wrote:
>>>>>>>> @@ -2528,6 +2534,22 @@ static inline void do_taskexit_event(struct task_struct *p)
>>>>>>>>  	magic = xnthread_get_magic(thread);
>>>>>>>>
>>>>>>>>  	xnlock_get_irqsave(&nklock, s);
>>>>>>>> +
>>>>>>>> +	gksched = thread->gksched;
>>>>>>>> +	if (gksched) {
>>>>>>>> +		xnlock_put_irqrestore(&nklock, s);
>>>>>>>
>>>>>>> Are we sure irqs are on here? Are you sure that what is needed is not an
>>>>>>> xnlock_clear_irqon?
>>>>>>
>>>>>> We are in the context of do_exit. Not only IRQs are on, also preemption.
>>>>>> And surely no nklock is held.
>>>>>>
>>>>>>> Furthermore, I do not understand how we
>>>>>>> "synchronize" with the gatekeeper, how is the gatekeeper guaranteed to
>>>>>>> wait for this assignment?
>>>>>>
>>>>>> The gatekeeper holds the gksync token while it's active. We request it,
>>>>>> thus we wait for the gatekeeper to become idle again. While it is idle,
>>>>>> we reset the queued reference - but I just realized that this may trample
>>>>>> on other tasks' values. I need to add a check that the value to be
>>>>>> null'ified is actually still ours.
>>>>>
>>>>> Thinking again, that's actually not a problem: gktarget is only needed
>>>>> while gksync is zero - but then we won't get hold of it anyway and,
>>>>> thus, can't cause any damage.
>>>>
>>>> Well, you make it look like it does not work. From what I understand,
>>>> what you want is to set gktarget to null if a task being hardened is
>>>> destroyed. But by waiting for the semaphore, you actually wait for the
>>>> harden to be complete, so setting to NULL is useless. Or am I missing
>>>> something else?
>>>
>>> Setting to NULL is probably unneeded but still better than relying on the
>>> gatekeeper never waking up spuriously and then dereferencing a stale
>>> pointer.
>>>
>>> The key element of this fix is waiting on gksync, thus on the completion
>>> of the non-RT part of the hardening. Actually, this part usually fails
>>> as the target task received a termination signal at this point.
>>
>> Yes, but since you wait on the completion of the hardening, the test
>> if (target &&...) in the gatekeeper code will always be true, because at
>> this point the cleanup code will still be waiting for the semaphore.
>
> Yes, except we will ever wake up the gatekeeper later on without an
> updated gktarget, ie. spuriously. Better safe than sorry, this is hairy
> code anyway (hopefully obsolete one day).

The gatekeeper is not woken up by posting the semaphore; it is woken up
by the thread which is going to be hardened (and this thread is the one
which waits for the semaphore).

>
> Jan
>

-- 
Gilles.
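
For readers following the handshake discussed above, here is a minimal
user-space sketch of it, assuming POSIX threads and unnamed semaphores.
It is not the Xenomai nucleus code: only the names gksync and gktarget
come from the thread, while struct gk, struct task, the kick flag,
request_harden(), task_exit() and demo() are hypothetical names made up
for illustration. The gatekeeper releases the token only when idle, the
hardening side takes the token and then wakes the gatekeeper, and the
exit path takes the token before clearing a reference that is still its
own.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the Xenomai data structures. */
struct task { int hardened; };

struct gk {
	sem_t gksync;            /* token, available only while the gatekeeper is idle */
	pthread_mutex_t lock;    /* protects gktarget, kick and stop */
	pthread_cond_t waitq;    /* the gatekeeper sleeps here */
	struct task *gktarget;   /* queued hardening request */
	bool kick;               /* explicit wakeup request */
	bool stop;
};

static void *gatekeeper(void *arg)
{
	struct gk *gk = arg;

	for (;;) {
		struct task *target;

		sem_post(&gk->gksync);          /* idle again: release the token */
		pthread_mutex_lock(&gk->lock);
		while (!gk->kick && !gk->stop)  /* woken by a kick, not by the semaphore */
			pthread_cond_wait(&gk->waitq, &gk->lock);
		if (gk->stop) {
			pthread_mutex_unlock(&gk->lock);
			return NULL;
		}
		gk->kick = false;
		target = gk->gktarget;
		pthread_mutex_unlock(&gk->lock);

		if (target)                     /* NULL check guards against a stale request */
			target->hardened = 1;
	}
}

/* Hardening side: take the token, queue the request, wake the gatekeeper. */
static void request_harden(struct gk *gk, struct task *t)
{
	sem_wait(&gk->gksync);
	pthread_mutex_lock(&gk->lock);
	gk->gktarget = t;
	gk->kick = true;
	pthread_cond_signal(&gk->waitq);
	pthread_mutex_unlock(&gk->lock);
}

/* Exit side: wait for the gatekeeper to go idle, then drop our reference. */
static void task_exit(struct gk *gk, struct task *t)
{
	sem_wait(&gk->gksync);
	pthread_mutex_lock(&gk->lock);
	if (gk->gktarget == t)          /* only clear a reference that is still ours */
		gk->gktarget = NULL;
	pthread_mutex_unlock(&gk->lock);
	sem_post(&gk->gksync);
}

/* Returns 1 if hardening completed before exit and gktarget was cleared. */
int demo(void)
{
	struct gk gk = { .gktarget = NULL, .kick = false, .stop = false };
	struct task t = { 0 };
	pthread_t tid;
	int rc, ok;

	rc = sem_init(&gk.gksync, 0, 0);
	assert(rc == 0);
	pthread_mutex_init(&gk.lock, NULL);
	pthread_cond_init(&gk.waitq, NULL);
	rc = pthread_create(&tid, NULL, gatekeeper, &gk);
	assert(rc == 0);
	(void)rc;

	request_harden(&gk, &t);
	task_exit(&gk, &t);             /* blocks until the gatekeeper is idle again */
	ok = (t.hardened == 1 && gk.gktarget == NULL);

	pthread_mutex_lock(&gk.lock);
	gk.stop = true;
	pthread_cond_signal(&gk.waitq);
	pthread_mutex_unlock(&gk.lock);
	pthread_join(tid, NULL);

	pthread_cond_destroy(&gk.waitq);
	pthread_mutex_destroy(&gk.lock);
	sem_destroy(&gk.gksync);
	return ok;
}
```

In this model, task_exit() cannot return before the gatekeeper has
finished with the queued task, which is the point made in the thread
about waiting on gksync. Clearing gktarget only matters if something
kicks the gatekeeper again without queuing a new target. Build with
-pthread and call demo() from a main() of your own.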