All of lore.kernel.org
 help / color / mirror / Atom feed
* [Xenomai-core] Race in rpi_clear_remote?
@ 2010-04-26 13:43 Jan Kiszka
  2010-04-26 13:51 ` Jan Kiszka
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kiszka @ 2010-04-26 13:43 UTC (permalink / raw)
  To: xenomai-core

Hi,

I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
/seems/ like thread->rpi is invalid. Looking at the code, I wonder if
this could explain the bug:


static void rpi_clear_remote(struct xnthread *thread)
{
...
        rpi = thread->rpi;
        if (unlikely(rpi == NULL))
                return;

        xnlock_get_irqsave(&rpi->rpilock, s);

        /*
         * The RPI slot - if present - is always valid, and won't
         * change since the thread is resuming on this CPU and cannot
         * migrate under our feet. We may grab the remote slot lock
         * now.
         */
        xnsched_pop_rpi(thread);
        thread->rpi = NULL;

...

So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
we check for it without any protection? Sounds racy. I think 'thread' is
not only pointing to the current thread but could refer to a foreign one
as well, right? Don't we need this:

diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
index 872c37f..1f995d6 100644
--- a/ksrc/nucleus/shadow.c
+++ b/ksrc/nucleus/shadow.c
@@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
 
 	xnlock_get_irqsave(&rpi->rpilock, s);
 
+	/* Re-check under lock, someone may have cleared rpi by now. */
+	if (unlikely(thread->rpi == NULL)) {
+		xnlock_put_irqrestore(&rpi->rpilock, s);
+		return;
+	}
+
 	/*
 	 * The RPI slot - if present - is always valid, and won't
 	 * change since the thread is resuming on this CPU and cannot


Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] Race in rpi_clear_remote?
  2010-04-26 13:43 [Xenomai-core] Race in rpi_clear_remote? Jan Kiszka
@ 2010-04-26 13:51 ` Jan Kiszka
  2010-04-26 16:06   ` [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next Jan Kiszka
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kiszka @ 2010-04-26 13:51 UTC (permalink / raw)
  To: xenomai-core

Jan Kiszka wrote:
> Hi,
> 
> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
> this could explain the bug:
> 
> 
> static void rpi_clear_remote(struct xnthread *thread)
> {
> ...
>         rpi = thread->rpi;
>         if (unlikely(rpi == NULL))
>                 return;
> 
>         xnlock_get_irqsave(&rpi->rpilock, s);
> 
>         /*
>          * The RPI slot - if present - is always valid, and won't
>          * change since the thread is resuming on this CPU and cannot
>          * migrate under our feet. We may grab the remote slot lock
>          * now.
>          */
>         xnsched_pop_rpi(thread);
>         thread->rpi = NULL;
> 
> ...
> 
> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
> we check for it without any protection? Sounds racy. I think 'thread' is
> not only pointing to the current thread but could refer to a foreign one
> as well, right? Don't we need this:
> 
> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> index 872c37f..1f995d6 100644
> --- a/ksrc/nucleus/shadow.c
> +++ b/ksrc/nucleus/shadow.c
> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>  
>  	xnlock_get_irqsave(&rpi->rpilock, s);
>  
> +	/* Re-check under lock, someone may have cleared rpi by now. */
> +	if (unlikely(thread->rpi == NULL)) {
> +		xnlock_put_irqrestore(&rpi->rpilock, s);
> +		return;
> +	}
> +
>  	/*
>  	 * The RPI slot - if present - is always valid, and won't
>  	 * change since the thread is resuming on this CPU and cannot

Another worry: Can thread->rpi become != rpi without being NULL? Or can
we really only race for clearance here?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-04-26 13:51 ` Jan Kiszka
@ 2010-04-26 16:06   ` Jan Kiszka
  2010-04-26 16:11     ` Jan Kiszka
  2010-04-27  1:19     ` Philippe Gerum
  0 siblings, 2 replies; 18+ messages in thread
From: Jan Kiszka @ 2010-04-26 16:06 UTC (permalink / raw)
  To: Philippe Gerum, Gilles Chanteperdrix; +Cc: xenomai-core

Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Hi,
>>
>> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
>> this could explain the bug:
>>
>>
>> static void rpi_clear_remote(struct xnthread *thread)
>> {
>> ...
>>         rpi = thread->rpi;
>>         if (unlikely(rpi == NULL))
>>                 return;
>>
>>         xnlock_get_irqsave(&rpi->rpilock, s);
>>
>>         /*
>>          * The RPI slot - if present - is always valid, and won't
>>          * change since the thread is resuming on this CPU and cannot
>>          * migrate under our feet. We may grab the remote slot lock
>>          * now.
>>          */
>>         xnsched_pop_rpi(thread);
>>         thread->rpi = NULL;
>>
>> ...
>>
>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
>> we check for it without any protection? Sounds racy. I think 'thread' is
>> not only pointing to the current thread but could refer to a foreign one
>> as well, right? Don't we need this:
>>
>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>> index 872c37f..1f995d6 100644
>> --- a/ksrc/nucleus/shadow.c
>> +++ b/ksrc/nucleus/shadow.c
>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>>  
>>  	xnlock_get_irqsave(&rpi->rpilock, s);
>>  
>> +	/* Re-check under lock, someone may have cleared rpi by now. */
>> +	if (unlikely(thread->rpi == NULL)) {
>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
>> +		return;
>> +	}
>> +
>>  	/*
>>  	 * The RPI slot - if present - is always valid, and won't
>>  	 * change since the thread is resuming on this CPU and cannot
> 
> Another worry: Can thread->rpi become != rpi without being NULL? Or can
> we really only race for clearance here?
> 

I think so now, therefore I'm proposing this:

----------->

Most RPI services work on the current task or the one to be scheduled in
next, thus are naturally serialized. But rpi_next is not as it can walk
the chain of RPI requests for a CPU independently. In that case,
clearing RPI via rpi_clear_remote can race with rpi_next, and if the
former loses after checking thread->rpi for NULL, we will dereference a
NULL pointer in xnsched_pop_rpi().

Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
---
 ksrc/nucleus/shadow.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
index 872c37f..cf7c08f 100644
--- a/ksrc/nucleus/shadow.c
+++ b/ksrc/nucleus/shadow.c
@@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
 	xnlock_get_irqsave(&rpi->rpilock, s);
 
 	/*
+	 * Re-check under lock. Someone may have invoked rpi_next and cleared
+	 * rpi by now.
+	 */
+	if (unlikely(!rpi_p(thread))) {
+		xnlock_put_irqrestore(&rpi->rpilock, s);
+		return;
+	}
+
+	/*
 	 * The RPI slot - if present - is always valid, and won't
 	 * change since the thread is resuming on this CPU and cannot
 	 * migrate under our feet. We may grab the remote slot lock



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-04-26 16:06   ` [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next Jan Kiszka
@ 2010-04-26 16:11     ` Jan Kiszka
  2010-04-27  1:19     ` Philippe Gerum
  1 sibling, 0 replies; 18+ messages in thread
From: Jan Kiszka @ 2010-04-26 16:11 UTC (permalink / raw)
  To: Philippe Gerum, Gilles Chanteperdrix; +Cc: xenomai-core

Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Hi,
>>>
>>> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
>>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
>>> this could explain the bug:
>>>
>>>
>>> static void rpi_clear_remote(struct xnthread *thread)
>>> {
>>> ...
>>>         rpi = thread->rpi;
>>>         if (unlikely(rpi == NULL))
>>>                 return;
>>>
>>>         xnlock_get_irqsave(&rpi->rpilock, s);
>>>
>>>         /*
>>>          * The RPI slot - if present - is always valid, and won't
>>>          * change since the thread is resuming on this CPU and cannot
>>>          * migrate under our feet. We may grab the remote slot lock
>>>          * now.
>>>          */
>>>         xnsched_pop_rpi(thread);
>>>         thread->rpi = NULL;
>>>
>>> ...
>>>
>>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
>>> we check for it without any protection? Sounds racy. I think 'thread' is
>>> not only pointing to the current thread but could refer to a foreign one
>>> as well, right? Don't we need this:
>>>
>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>> index 872c37f..1f995d6 100644
>>> --- a/ksrc/nucleus/shadow.c
>>> +++ b/ksrc/nucleus/shadow.c
>>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>  
>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
>>>  
>>> +	/* Re-check under lock, someone may have cleared rpi by now. */
>>> +	if (unlikely(thread->rpi == NULL)) {
>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
>>> +		return;
>>> +	}
>>> +
>>>  	/*
>>>  	 * The RPI slot - if present - is always valid, and won't
>>>  	 * change since the thread is resuming on this CPU and cannot
>> Another worry: Can thread->rpi become != rpi without being NULL? Or can
>> we really only race for clearance here?
>>
> 
> I think so now, therefore I'm proposing this:
> 
> ----------->
> 
> Most RPI services work on the current task or the one to be scheduled in
> next, thus are naturally serialized. But rpi_next is not as it can walk

...serialized /wrt changes of thread->rpi.

> the chain of RPI requests for a CPU independently. In that case,
> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
> former loses after checking thread->rpi for NULL, we will dereference a
> NULL pointer in xnsched_pop_rpi().
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
> ---
>  ksrc/nucleus/shadow.c |    9 +++++++++
>  1 files changed, 9 insertions(+), 0 deletions(-)
> 
> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> index 872c37f..cf7c08f 100644
> --- a/ksrc/nucleus/shadow.c
> +++ b/ksrc/nucleus/shadow.c
> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
>  	xnlock_get_irqsave(&rpi->rpilock, s);
>  
>  	/*
> +	 * Re-check under lock. Someone may have invoked rpi_next and cleared
> +	 * rpi by now.
> +	 */
> +	if (unlikely(!rpi_p(thread))) {
> +		xnlock_put_irqrestore(&rpi->rpilock, s);
> +		return;
> +	}
> +
> +	/*
>  	 * The RPI slot - if present - is always valid, and won't
>  	 * change since the thread is resuming on this CPU and cannot
>  	 * migrate under our feet. We may grab the remote slot lock
> 
> 

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-04-26 16:06   ` [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next Jan Kiszka
  2010-04-26 16:11     ` Jan Kiszka
@ 2010-04-27  1:19     ` Philippe Gerum
  2010-04-27  6:46       ` Jan Kiszka
  1 sibling, 1 reply; 18+ messages in thread
From: Philippe Gerum @ 2010-04-27  1:19 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

On Mon, 2010-04-26 at 18:06 +0200, Jan Kiszka wrote:
> Jan Kiszka wrote:
> > Jan Kiszka wrote:
> >> Hi,
> >>
> >> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
> >> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
> >> this could explain the bug:
> >>
> >>
> >> static void rpi_clear_remote(struct xnthread *thread)
> >> {
> >> ...
> >>         rpi = thread->rpi;
> >>         if (unlikely(rpi == NULL))
> >>                 return;
> >>
> >>         xnlock_get_irqsave(&rpi->rpilock, s);
> >>
> >>         /*
> >>          * The RPI slot - if present - is always valid, and won't
> >>          * change since the thread is resuming on this CPU and cannot
> >>          * migrate under our feet. We may grab the remote slot lock
> >>          * now.
> >>          */
> >>         xnsched_pop_rpi(thread);
> >>         thread->rpi = NULL;
> >>
> >> ...
> >>
> >> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
> >> we check for it without any protection? Sounds racy. I think 'thread' is
> >> not only pointing to the current thread but could refer to a foreign one
> >> as well, right? Don't we need this:
> >>
> >> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >> index 872c37f..1f995d6 100644
> >> --- a/ksrc/nucleus/shadow.c
> >> +++ b/ksrc/nucleus/shadow.c
> >> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>  
> >>  	xnlock_get_irqsave(&rpi->rpilock, s);
> >>  
> >> +	/* Re-check under lock, someone may have cleared rpi by now. */
> >> +	if (unlikely(thread->rpi == NULL)) {
> >> +		xnlock_put_irqrestore(&rpi->rpilock, s);
> >> +		return;
> >> +	}
> >> +
> >>  	/*
> >>  	 * The RPI slot - if present - is always valid, and won't
> >>  	 * change since the thread is resuming on this CPU and cannot
> > 
> > Another worry: Can thread->rpi become != rpi without being NULL? Or can
> > we really only race for clearance here?
> > 
> 
> I think so now, therefore I'm proposing this:
> 
> ----------->
> 
> Most RPI services work on the current task or the one to be scheduled in
> next, thus are naturally serialized. But rpi_next is not as it can walk
> the chain of RPI requests for a CPU independently. In that case,
> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
> former loses after checking thread->rpi for NULL, we will dereference a
> NULL pointer in xnsched_pop_rpi().
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
> ---
>  ksrc/nucleus/shadow.c |    9 +++++++++
>  1 files changed, 9 insertions(+), 0 deletions(-)
> 
> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> index 872c37f..cf7c08f 100644
> --- a/ksrc/nucleus/shadow.c
> +++ b/ksrc/nucleus/shadow.c
> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
>  	xnlock_get_irqsave(&rpi->rpilock, s);
>  
>  	/*
> +	 * Re-check under lock. Someone may have invoked rpi_next and cleared
> +	 * rpi by now.
> +	 */
> +	if (unlikely(!rpi_p(thread))) {
> +		xnlock_put_irqrestore(&rpi->rpilock, s);
> +		return;
> +	}
> +
> +	/*
>  	 * The RPI slot - if present - is always valid, and won't
>  	 * change since the thread is resuming on this CPU and cannot
>  	 * migrate under our feet. We may grab the remote slot lock
> 

The suggested patch papers over the actual issue, which is that
rpi_clear_remote() may not invoke rpi_next(), because it may only affect
the RPI state of the argument thread which must be a local one, and not
that of any random thread that happens to be linked to the remote RPI
queue.

By calling rpi_next(), rpi_clear_remote() shoots itself in the foot,
allowing a concurrent invocation of itself on a remote CPU, to fiddle
with the rpi backlink of a thread which is not active on the
local/per-cpu Xenomai scheduler instance, which is the point where
things start to hit the crapper.

Now, unless I can't even synchronize the couple of neurons I have left
at this hour, the following patch should better fix the issue, because
it restores the two basic rules that apply to the whole RPI machinery,
namely:

- rpi_* calls may only alter the contents of the local scheduler's RPI
queue, with the notable exception of rpi_clear_remote() which may remove
the given _local_ thread only, from a remote RPI slot.

- rpi_* calls may only change the RPI state of threads which are
controlled by the local Xenomai scheduler instance, except rpi_push()
when called for setting up the RPI state of an emerging thread, in which
case this is a no-conflict zone.

That breakage was introduced in the early 2.5.1 timeframe, so 2.4.x
should be immune from this particular bug.

diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
index 872c37f..1397ed1 100644
--- a/ksrc/nucleus/shadow.c
+++ b/ksrc/nucleus/shadow.c
@@ -340,7 +340,12 @@ static void rpi_clear_remote(struct xnthread *thread)
 	xnsched_pop_rpi(thread);
 	thread->rpi = NULL;
 
-	if (rpi_next(rpi, s) == NULL)
+	/*
+	 * If the remote RPI queue was emptied, prepare for kicking
+	 * xnshadow_rpi_check() on the relevant CPU to demote the root
+	 * thread priority there.
+	 */
+	if (xnsched_peek_rpi(rpi) == NULL) /* No rpi_next() here. */
 		rcpu = xnsched_cpu(rpi);
 
 	xnlock_put_irqrestore(&rpi->rpilock, s);


-- 
Philippe.




^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-04-27  1:19     ` Philippe Gerum
@ 2010-04-27  6:46       ` Jan Kiszka
  2010-04-27  8:13         ` Philippe Gerum
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kiszka @ 2010-04-27  6:46 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai-core

[-- Attachment #1: Type: text/plain, Size: 6293 bytes --]

Philippe Gerum wrote:
> On Mon, 2010-04-26 at 18:06 +0200, Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
>>>> Hi,
>>>>
>>>> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
>>>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
>>>> this could explain the bug:
>>>>
>>>>
>>>> static void rpi_clear_remote(struct xnthread *thread)
>>>> {
>>>> ...
>>>>         rpi = thread->rpi;
>>>>         if (unlikely(rpi == NULL))
>>>>                 return;
>>>>
>>>>         xnlock_get_irqsave(&rpi->rpilock, s);
>>>>
>>>>         /*
>>>>          * The RPI slot - if present - is always valid, and won't
>>>>          * change since the thread is resuming on this CPU and cannot
>>>>          * migrate under our feet. We may grab the remote slot lock
>>>>          * now.
>>>>          */
>>>>         xnsched_pop_rpi(thread);
>>>>         thread->rpi = NULL;
>>>>
>>>> ...
>>>>
>>>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
>>>> we check for it without any protection? Sounds racy. I think 'thread' is
>>>> not only pointing to the current thread but could refer to a foreign one
>>>> as well, right? Don't we need this:
>>>>
>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>> index 872c37f..1f995d6 100644
>>>> --- a/ksrc/nucleus/shadow.c
>>>> +++ b/ksrc/nucleus/shadow.c
>>>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>  
>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
>>>>  
>>>> +	/* Re-check under lock, someone may have cleared rpi by now. */
>>>> +	if (unlikely(thread->rpi == NULL)) {
>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
>>>> +		return;
>>>> +	}
>>>> +
>>>>  	/*
>>>>  	 * The RPI slot - if present - is always valid, and won't
>>>>  	 * change since the thread is resuming on this CPU and cannot
>>> Another worry: Can thread->rpi become != rpi without being NULL? Or can
>>> we really only race for clearance here?
>>>
>> I think so now, therefore I'm proposing this:
>>
>> ----------->
>>
>> Most RPI services work on the current task or the one to be scheduled in
>> next, thus are naturally serialized. But rpi_next is not as it can walk
>> the chain of RPI requests for a CPU independently. In that case,
>> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
>> former loses after checking thread->rpi for NULL, we will dereference a
>> NULL pointer in xnsched_pop_rpi().
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
>> ---
>>  ksrc/nucleus/shadow.c |    9 +++++++++
>>  1 files changed, 9 insertions(+), 0 deletions(-)
>>
>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>> index 872c37f..cf7c08f 100644
>> --- a/ksrc/nucleus/shadow.c
>> +++ b/ksrc/nucleus/shadow.c
>> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
>>  	xnlock_get_irqsave(&rpi->rpilock, s);
>>  
>>  	/*
>> +	 * Re-check under lock. Someone may have invoked rpi_next and cleared
>> +	 * rpi by now.
>> +	 */
>> +	if (unlikely(!rpi_p(thread))) {
>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
>> +		return;
>> +	}
>> +
>> +	/*
>>  	 * The RPI slot - if present - is always valid, and won't
>>  	 * change since the thread is resuming on this CPU and cannot
>>  	 * migrate under our feet. We may grab the remote slot lock
>>
> 
> The suggested patch papers over the actual issue, which is that
> rpi_clear_remote() may not invoke rpi_next(), because it may only affect

I don't think that in our case rpi_clear_remote called rpi_next and
therefore crashed. It should rather have been the scenario of both
running in parallel on different CPUs, the former on behalf of a
migrated shadow that wants to clear its remainders on the remote CPU,
the latter on that CPU, picking a new top RPI after some other shadow
was removed from the queue. Is this a possible scenario, and would your
patch cover it?

> the RPI state of the argument thread which must be a local one, and not
> that of any random thread that happens to be linked to the remote RPI
> queue.
> 
> By calling rpi_next(), rpi_clear_remote() shoots itself in the foot,
> allowing a concurrent invocation of itself on a remote CPU, to fiddle
> with the rpi backlink of a thread which is not active on the
> local/per-cpu Xenomai scheduler instance, which is the point where
> things start to hit the crapper.
> 
> Now, unless I can't even synchronize the couple of neurons I have left
> at this hour, the following patch should better fix the issue, because
> it restores the two basic rules that apply to the whole RPI machinery,
> namely:
> 
> - rpi_* calls may only alter the contents of the local scheduler's RPI
> queue, with the notable exception of rpi_clear_remote() which may remove
> the given _local_ thread only, from a remote RPI slot.
> 
> - rpi_* calls may only change the RPI state of threads which are
> controlled by the local Xenomai scheduler instance, except rpi_push()
> when called for setting up the RPI state of an emerging thread, in which
> case this is a no-conflict zone.
> 
> That breakage was introduced in the early 2.5.1 timeframe, so 2.4.x
> should be immune from this particular bug.
> 
> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> index 872c37f..1397ed1 100644
> --- a/ksrc/nucleus/shadow.c
> +++ b/ksrc/nucleus/shadow.c
> @@ -340,7 +340,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>  	xnsched_pop_rpi(thread);
>  	thread->rpi = NULL;
>  
> -	if (rpi_next(rpi, s) == NULL)
> +	/*
> +	 * If the remote RPI queue was emptied, prepare for kicking
> +	 * xnshadow_rpi_check() on the relevant CPU to demote the root
> +	 * thread priority there.
> +	 */
> +	if (xnsched_peek_rpi(rpi) == NULL) /* No rpi_next() here. */
>  		rcpu = xnsched_cpu(rpi);
>  
>  	xnlock_put_irqrestore(&rpi->rpilock, s);
> 
> 

I have to confess, I do not understand how the patch may relate to our
crash. But that's because I still only have a semi-understanding of this
frightening complex RPI code. However, the fact that thread->rpi is now
again only checked outside its protecting lock leaves me with a very bad
feeling...

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-04-27  6:46       ` Jan Kiszka
@ 2010-04-27  8:13         ` Philippe Gerum
  2010-04-27  8:25           ` Jan Kiszka
  0 siblings, 1 reply; 18+ messages in thread
From: Philippe Gerum @ 2010-04-27  8:13 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

On Tue, 2010-04-27 at 08:46 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Mon, 2010-04-26 at 18:06 +0200, Jan Kiszka wrote:
> >> Jan Kiszka wrote:
> >>> Jan Kiszka wrote:
> >>>> Hi,
> >>>>
> >>>> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
> >>>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
> >>>> this could explain the bug:
> >>>>
> >>>>
> >>>> static void rpi_clear_remote(struct xnthread *thread)
> >>>> {
> >>>> ...
> >>>>         rpi = thread->rpi;
> >>>>         if (unlikely(rpi == NULL))
> >>>>                 return;
> >>>>
> >>>>         xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>
> >>>>         /*
> >>>>          * The RPI slot - if present - is always valid, and won't
> >>>>          * change since the thread is resuming on this CPU and cannot
> >>>>          * migrate under our feet. We may grab the remote slot lock
> >>>>          * now.
> >>>>          */
> >>>>         xnsched_pop_rpi(thread);
> >>>>         thread->rpi = NULL;
> >>>>
> >>>> ...
> >>>>
> >>>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
> >>>> we check for it without any protection? Sounds racy. I think 'thread' is
> >>>> not only pointing to the current thread but could refer to a foreign one
> >>>> as well, right? Don't we need this:
> >>>>
> >>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>> index 872c37f..1f995d6 100644
> >>>> --- a/ksrc/nucleus/shadow.c
> >>>> +++ b/ksrc/nucleus/shadow.c
> >>>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>>>  
> >>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>  
> >>>> +	/* Re-check under lock, someone may have cleared rpi by now. */
> >>>> +	if (unlikely(thread->rpi == NULL)) {
> >>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>> +		return;
> >>>> +	}
> >>>> +
> >>>>  	/*
> >>>>  	 * The RPI slot - if present - is always valid, and won't
> >>>>  	 * change since the thread is resuming on this CPU and cannot
> >>> Another worry: Can thread->rpi become != rpi without being NULL? Or can
> >>> we really only race for clearance here?
> >>>
> >> I think so now, therefore I'm proposing this:
> >>
> >> ----------->
> >>
> >> Most RPI services work on the current task or the one to be scheduled in
> >> next, thus are naturally serialized. But rpi_next is not as it can walk
> >> the chain of RPI requests for a CPU independently. In that case,
> >> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
> >> former loses after checking thread->rpi for NULL, we will dereference a
> >> NULL pointer in xnsched_pop_rpi().
> >>
> >> Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
> >> ---
> >>  ksrc/nucleus/shadow.c |    9 +++++++++
> >>  1 files changed, 9 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >> index 872c37f..cf7c08f 100644
> >> --- a/ksrc/nucleus/shadow.c
> >> +++ b/ksrc/nucleus/shadow.c
> >> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>  	xnlock_get_irqsave(&rpi->rpilock, s);
> >>  
> >>  	/*
> >> +	 * Re-check under lock. Someone may have invoked rpi_next and cleared
> >> +	 * rpi by now.
> >> +	 */
> >> +	if (unlikely(!rpi_p(thread))) {
> >> +		xnlock_put_irqrestore(&rpi->rpilock, s);
> >> +		return;
> >> +	}
> >> +
> >> +	/*
> >>  	 * The RPI slot - if present - is always valid, and won't
> >>  	 * change since the thread is resuming on this CPU and cannot
> >>  	 * migrate under our feet. We may grab the remote slot lock
> >>
> > 
> > The suggested patch papers over the actual issue, which is that
> > rpi_clear_remote() may not invoke rpi_next(), because it may only affect
> 
> I don't think that in our case rpi_clear_remote called rpi_next and
> therefore crashed. It should rather have been the scenario of both
> running in parallel on different CPUs, the former on behalf of a
> migrated shadow that wants to clear its remainders on the remote CPU,
> the latter on that CPU, picking a new top RPI after some other shadow
> was removed from the queue. Is this a possible scenario, and would your
> patch cover it?
> 
> > the RPI state of the argument thread which must be a local one, and not
> > that of any random thread that happens to be linked to the remote RPI
> > queue.
> > 
> > By calling rpi_next(), rpi_clear_remote() shoots itself in the foot,
> > allowing a concurrent invocation of itself on a remote CPU, to fiddle
> > with the rpi backlink of a thread which is not active on the
> > local/per-cpu Xenomai scheduler instance, which is the point where
> > things start to hit the crapper.
> > 
> > Now, unless I can't even synchronize the couple of neurons I have left
> > at this hour, the following patch should better fix the issue, because
> > it restores the two basic rules that apply to the whole RPI machinery,
> > namely:
> > 
> > - rpi_* calls may only alter the contents of the local scheduler's RPI
> > queue, with the notable exception of rpi_clear_remote() which may remove
> > the given _local_ thread only, from a remote RPI slot.
> > 
> > - rpi_* calls may only change the RPI state of threads which are
> > controlled by the local Xenomai scheduler instance, except rpi_push()
> > when called for setting up the RPI state of an emerging thread, in which
> > case this is a no-conflict zone.
> > 
> > That breakage was introduced in the early 2.5.1 timeframe, so 2.4.x
> > should be immune from this particular bug.
> > 
> > diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> > index 872c37f..1397ed1 100644
> > --- a/ksrc/nucleus/shadow.c
> > +++ b/ksrc/nucleus/shadow.c
> > @@ -340,7 +340,12 @@ static void rpi_clear_remote(struct xnthread *thread)
> >  	xnsched_pop_rpi(thread);
> >  	thread->rpi = NULL;
> >  
> > -	if (rpi_next(rpi, s) == NULL)
> > +	/*
> > +	 * If the remote RPI queue was emptied, prepare for kicking
> > +	 * xnshadow_rpi_check() on the relevant CPU to demote the root
> > +	 * thread priority there.
> > +	 */
> > +	if (xnsched_peek_rpi(rpi) == NULL) /* No rpi_next() here. */
> >  		rcpu = xnsched_cpu(rpi);
> >  
> >  	xnlock_put_irqrestore(&rpi->rpilock, s);
> > 
> > 
> 
> I have to confess, I do not understand how the patch may relate to our
> crash. But that's because I still only have a semi-understanding of this
> frightening complex RPI code. However, the fact that thread->rpi is now
> again only checked outside its protecting lock leaves me with a very bad
> feeling...
> 

My point is that we should not have to protect a section of code which
may never conflict in any case, by design; we will likely agree that
sprinkling locks everywhere to get a warm and fuzzy feeling is no
solution, it's actually a significant source of regression.

The idea, behind keeping most rpi_* operations applicable to locally
scheduled threads, is to introduce such a design, even when remote RPI
slots are involved. thread->sched == xnpod_current_sched() for each
rpi_*(sched, ...) calls is what is important in this logic. Another
original assumption was that no RPI updates could be done in interrupt
context, which is now wrong due to the change in xnshadow_rpi_check().

In short: we have to make sure that rpi_next() does not break the basic
assumptions of the RPI core, first.

> Jan
> 


-- 
Philippe.




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-04-27  8:13         ` Philippe Gerum
@ 2010-04-27  8:25           ` Jan Kiszka
  2010-04-27  9:12             ` Philippe Gerum
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kiszka @ 2010-04-27  8:25 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai-core

Philippe Gerum wrote:
> On Tue, 2010-04-27 at 08:46 +0200, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Mon, 2010-04-26 at 18:06 +0200, Jan Kiszka wrote:
>>>> Jan Kiszka wrote:
>>>>> Jan Kiszka wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
>>>>>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
>>>>>> this could explain the bug:
>>>>>>
>>>>>>
>>>>>> static void rpi_clear_remote(struct xnthread *thread)
>>>>>> {
>>>>>> ...
>>>>>>         rpi = thread->rpi;
>>>>>>         if (unlikely(rpi == NULL))
>>>>>>                 return;
>>>>>>
>>>>>>         xnlock_get_irqsave(&rpi->rpilock, s);
>>>>>>
>>>>>>         /*
>>>>>>          * The RPI slot - if present - is always valid, and won't
>>>>>>          * change since the thread is resuming on this CPU and cannot
>>>>>>          * migrate under our feet. We may grab the remote slot lock
>>>>>>          * now.
>>>>>>          */
>>>>>>         xnsched_pop_rpi(thread);
>>>>>>         thread->rpi = NULL;
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
>>>>>> we check for it without any protection? Sounds racy. I think 'thread' is
>>>>>> not only pointing to the current thread but could refer to a foreign one
>>>>>> as well, right? Don't we need this:
>>>>>>
>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>>>> index 872c37f..1f995d6 100644
>>>>>> --- a/ksrc/nucleus/shadow.c
>>>>>> +++ b/ksrc/nucleus/shadow.c
>>>>>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>>>  
>>>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
>>>>>>  
>>>>>> +	/* Re-check under lock, someone may have cleared rpi by now. */
>>>>>> +	if (unlikely(thread->rpi == NULL)) {
>>>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
>>>>>> +		return;
>>>>>> +	}
>>>>>> +
>>>>>>  	/*
>>>>>>  	 * The RPI slot - if present - is always valid, and won't
>>>>>>  	 * change since the thread is resuming on this CPU and cannot
>>>>> Another worry: Can thread->rpi become != rpi without being NULL? Or can
>>>>> we really only race for clearance here?
>>>>>
>>>> I think so now, therefore I'm proposing this:
>>>>
>>>> ----------->
>>>>
>>>> Most RPI services work on the current task or the one to be scheduled in
>>>> next, thus are naturally serialized. But rpi_next is not as it can walk
>>>> the chain of RPI requests for a CPU independently. In that case,
>>>> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
>>>> former loses after checking thread->rpi for NULL, we will dereference a
>>>> NULL pointer in xnsched_pop_rpi().
>>>>
>>>> Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
>>>> ---
>>>>  ksrc/nucleus/shadow.c |    9 +++++++++
>>>>  1 files changed, 9 insertions(+), 0 deletions(-)
>>>>
>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>> index 872c37f..cf7c08f 100644
>>>> --- a/ksrc/nucleus/shadow.c
>>>> +++ b/ksrc/nucleus/shadow.c
>>>> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
>>>>  
>>>>  	/*
>>>> +	 * Re-check under lock. Someone may have invoked rpi_next and cleared
>>>> +	 * rpi by now.
>>>> +	 */
>>>> +	if (unlikely(!rpi_p(thread))) {
>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
>>>> +		return;
>>>> +	}
>>>> +
>>>> +	/*
>>>>  	 * The RPI slot - if present - is always valid, and won't
>>>>  	 * change since the thread is resuming on this CPU and cannot
>>>>  	 * migrate under our feet. We may grab the remote slot lock
>>>>
>>> The suggested patch papers over the actual issue, which is that
>>> rpi_clear_remote() may not invoke rpi_next(), because it may only affect
>> I don't think that in our case rpi_clear_remote called rpi_next and
>> therefore crashed. It should rather have been the scenario of both
>> running in parallel on different CPUs, the former on behalf of a
>> migrated shadow that wants to clear its remainders on the remote CPU,
>> the latter on that CPU, picking a new top RPI after some other shadow
>> was removed from the queue. Is this a possible scenario, and would your
>> patch cover it?
>>
>>> the RPI state of the argument thread which must be a local one, and not
>>> that of any random thread that happens to be linked to the remote RPI
>>> queue.
>>>
>>> By calling rpi_next(), rpi_clear_remote() shoots itself in the foot,
>>> allowing a concurrent invocation of itself on a remote CPU, to fiddle
>>> with the rpi backlink of a thread which is not active on the
>>> local/per-cpu Xenomai scheduler instance, which is the point where
>>> things start to hit the crapper.
>>>
>>> Now, unless I can't even synchronize the couple of neurons I have left
>>> at this hour, the following patch should better fix the issue, because
>>> it restores the two basic rules that apply to the whole RPI machinery,
>>> namely:
>>>
>>> - rpi_* calls may only alter the contents of the local scheduler's RPI
>>> queue, with the notable exception of rpi_clear_remote() which may remove
>>> the given _local_ thread only, from a remote RPI slot.
>>>
>>> - rpi_* calls may only change the RPI state of threads which are
>>> controlled by the local Xenomai scheduler instance, except rpi_push()
>>> when called for setting up the RPI state of an emerging thread, in which
>>> case this is a no-conflict zone.
>>>
>>> That breakage was introduced in the early 2.5.1 timeframe, so 2.4.x
>>> should be immune from this particular bug.
>>>
>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>> index 872c37f..1397ed1 100644
>>> --- a/ksrc/nucleus/shadow.c
>>> +++ b/ksrc/nucleus/shadow.c
>>> @@ -340,7 +340,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>  	xnsched_pop_rpi(thread);
>>>  	thread->rpi = NULL;
>>>  
>>> -	if (rpi_next(rpi, s) == NULL)
>>> +	/*
>>> +	 * If the remote RPI queue was emptied, prepare for kicking
>>> +	 * xnshadow_rpi_check() on the relevant CPU to demote the root
>>> +	 * thread priority there.
>>> +	 */
>>> +	if (xnsched_peek_rpi(rpi) == NULL) /* No rpi_next() here. */
>>>  		rcpu = xnsched_cpu(rpi);
>>>  
>>>  	xnlock_put_irqrestore(&rpi->rpilock, s);
>>>
>>>
>> I have to confess, I do not understand how the patch may relate to our
>> crash. But that's because I still only have a semi-understanding of this
>> frightening complex RPI code. However, the fact that thread->rpi is now
>> again only checked outside its protecting lock leaves me with a very bad
>> feeling...
>>
> 
> My point is that we should not have to protect a section of code which
> may never conflict in any case, by design; we will likely agree that
> sprinkling locks everywhere to get a warm and fuzzy feeling is no
> solution, it's actually a significant source of regression.
> 
> The idea, behind keeping most rpi_* operations applicable to locally
> scheduled threads, is to introduce such a design, even when remote RPI
> slots are involved. thread->sched == xnpod_current_sched() for each
> rpi_*(sched, ...) calls is what is important in this logic. Another
> original assumption was that no RPI updates could be done in interrupt
> context, which is now wrong due to the change in xnshadow_rpi_check().
> 
> In short: we have to make sure that rpi_next() does not break the basic
> assumptions of the RPI core, first.

Please check my scenario again: My concern is that a thread can be
queued for a short while on a remote sched, and while that is the case,
it can be manipulated (/wrt ->rpi) _concurrently_ (as we do not hold the
remote rpilock all the time). I'm quite sure now that your patch does
not change this.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-04-27  8:25           ` Jan Kiszka
@ 2010-04-27  9:12             ` Philippe Gerum
  2010-04-27  9:27               ` Jan Kiszka
  0 siblings, 1 reply; 18+ messages in thread
From: Philippe Gerum @ 2010-04-27  9:12 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

On Tue, 2010-04-27 at 10:25 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Tue, 2010-04-27 at 08:46 +0200, Jan Kiszka wrote:
> >> Philippe Gerum wrote:
> >>> On Mon, 2010-04-26 at 18:06 +0200, Jan Kiszka wrote:
> >>>> Jan Kiszka wrote:
> >>>>> Jan Kiszka wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
> >>>>>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
> >>>>>> this could explain the bug:
> >>>>>>
> >>>>>>
> >>>>>> static void rpi_clear_remote(struct xnthread *thread)
> >>>>>> {
> >>>>>> ...
> >>>>>>         rpi = thread->rpi;
> >>>>>>         if (unlikely(rpi == NULL))
> >>>>>>                 return;
> >>>>>>
> >>>>>>         xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>>>
> >>>>>>         /*
> >>>>>>          * The RPI slot - if present - is always valid, and won't
> >>>>>>          * change since the thread is resuming on this CPU and cannot
> >>>>>>          * migrate under our feet. We may grab the remote slot lock
> >>>>>>          * now.
> >>>>>>          */
> >>>>>>         xnsched_pop_rpi(thread);
> >>>>>>         thread->rpi = NULL;
> >>>>>>
> >>>>>> ...
> >>>>>>
> >>>>>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
> >>>>>> we check for it without any protection? Sounds racy. I think 'thread' is
> >>>>>> not only pointing to the current thread but could refer to a foreign one
> >>>>>> as well, right? Don't we need this:
> >>>>>>
> >>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>>>> index 872c37f..1f995d6 100644
> >>>>>> --- a/ksrc/nucleus/shadow.c
> >>>>>> +++ b/ksrc/nucleus/shadow.c
> >>>>>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>>>>>  
> >>>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>>>  
> >>>>>> +	/* Re-check under lock, someone may have cleared rpi by now. */
> >>>>>> +	if (unlikely(thread->rpi == NULL)) {
> >>>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>>>> +		return;
> >>>>>> +	}
> >>>>>> +
> >>>>>>  	/*
> >>>>>>  	 * The RPI slot - if present - is always valid, and won't
> >>>>>>  	 * change since the thread is resuming on this CPU and cannot
> >>>>> Another worry: Can thread->rpi become != rpi without being NULL? Or can
> >>>>> we really only race for clearance here?
> >>>>>
> >>>> I think so now, therefore I'm proposing this:
> >>>>
> >>>> ----------->
> >>>>
> >>>> Most RPI services work on the current task or the one to be scheduled in
> >>>> next, thus are naturally serialized. But rpi_next is not as it can walk
> >>>> the chain of RPI requests for a CPU independently. In that case,
> >>>> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
> >>>> former loses after checking thread->rpi for NULL, we will dereference a
> >>>> NULL pointer in xnsched_pop_rpi().
> >>>>
> >>>> Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
> >>>> ---
> >>>>  ksrc/nucleus/shadow.c |    9 +++++++++
> >>>>  1 files changed, 9 insertions(+), 0 deletions(-)
> >>>>
> >>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>> index 872c37f..cf7c08f 100644
> >>>> --- a/ksrc/nucleus/shadow.c
> >>>> +++ b/ksrc/nucleus/shadow.c
> >>>> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>  
> >>>>  	/*
> >>>> +	 * Re-check under lock. Someone may have invoked rpi_next and cleared
> >>>> +	 * rpi by now.
> >>>> +	 */
> >>>> +	if (unlikely(!rpi_p(thread))) {
> >>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>> +		return;
> >>>> +	}
> >>>> +
> >>>> +	/*
> >>>>  	 * The RPI slot - if present - is always valid, and won't
> >>>>  	 * change since the thread is resuming on this CPU and cannot
> >>>>  	 * migrate under our feet. We may grab the remote slot lock
> >>>>
> >>> The suggested patch papers over the actual issue, which is that
> >>> rpi_clear_remote() may not invoke rpi_next(), because it may only affect
> >> I don't think that in our case rpi_clear_remote called rpi_next and
> >> therefore crashed. It should rather have been the scenario of both
> >> running in parallel on different CPUs, the former on behalf of a
> >> migrated shadow that wants to clear its remainders on the remote CPU,
> >> the latter on that CPU, picking a new top RPI after some other shadow
> >> was removed from the queue. Is this a possible scenario, and would your
> >> patch cover it?
> >>
> >>> the RPI state of the argument thread which must be a local one, and not
> >>> that of any random thread that happens to be linked to the remote RPI
> >>> queue.
> >>>
> >>> By calling rpi_next(), rpi_clear_remote() shoots itself in the foot,
> >>> allowing a concurrent invocation of itself on a remote CPU, to fiddle
> >>> with the rpi backlink of a thread which is not active on the
> >>> local/per-cpu Xenomai scheduler instance, which is the point where
> >>> things start to hit the crapper.
> >>>
> >>> Now, unless I can't even synchronize the couple of neurons I have left
> >>> at this hour, the following patch should better fix the issue, because
> >>> it restores the two basic rules that apply to the whole RPI machinery,
> >>> namely:
> >>>
> >>> - rpi_* calls may only alter the contents of the local scheduler's RPI
> >>> queue, with the notable exception of rpi_clear_remote() which may remove
> >>> the given _local_ thread only, from a remote RPI slot.
> >>>
> >>> - rpi_* calls may only change the RPI state of threads which are
> >>> controlled by the local Xenomai scheduler instance, except rpi_push()
> >>> when called for setting up the RPI state of an emerging thread, in which
> >>> case this is a no-conflict zone.
> >>>
> >>> That breakage was introduced in the early 2.5.1 timeframe, so 2.4.x
> >>> should be immune from this particular bug.
> >>>
> >>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>> index 872c37f..1397ed1 100644
> >>> --- a/ksrc/nucleus/shadow.c
> >>> +++ b/ksrc/nucleus/shadow.c
> >>> @@ -340,7 +340,12 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>>  	xnsched_pop_rpi(thread);
> >>>  	thread->rpi = NULL;
> >>>  
> >>> -	if (rpi_next(rpi, s) == NULL)
> >>> +	/*
> >>> +	 * If the remote RPI queue was emptied, prepare for kicking
> >>> +	 * xnshadow_rpi_check() on the relevant CPU to demote the root
> >>> +	 * thread priority there.
> >>> +	 */
> >>> +	if (xnsched_peek_rpi(rpi) == NULL) /* No rpi_next() here. */
> >>>  		rcpu = xnsched_cpu(rpi);
> >>>  
> >>>  	xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>
> >>>
> >> I have to confess, I do not understand how the patch may relate to our
> >> crash. But that's because I still only have a semi-understanding of this
> >> frightening complex RPI code. However, the fact that thread->rpi is now
> >> again only checked outside its protecting lock leaves me with a very bad
> >> feeling...
> >>
> > 
> > My point is that we should not have to protect a section of code which
> > may never conflict in any case, by design; we will likely agree that
> > sprinkling locks everywhere to get a warm and fuzzy feeling is no
> > solution, it's actually a significant source of regression.
> > 
> > The idea, behind keeping most rpi_* operations applicable to locally
> > scheduled threads, is to introduce such a design, even when remote RPI
> > slots are involved. thread->sched == xnpod_current_sched() for each
> > rpi_*(sched, ...) calls is what is important in this logic. Another
> > original assumption was that no RPI updates could be done in interrupt
> > context, which is now wrong due to the change in xnshadow_rpi_check().
> > 
> > In short: we have to make sure that rpi_next() does not break the basic
> > assumptions of the RPI core, first.
> 
> Please check my scenario again: My concern is that a thread can be
> queued for a short while on a remote sched,

No, it can't, that is the crux of the matter, well, at least, this
should not be possible if the basic assumptions are preserved (have a
look at the rpi_clear_remote() callers, the target thread may not
migrate or be scheduled in linux-wise, or exit RPI via rpi_pop() during
the call -- all places where the rpi backlink may be cleared). Only a
caller operating from the local CPU should be allowed to alter the RPI
state of threads linked to the RPI slot of that same CPU.

rpi_clear_remote() is not even an exception to this, since it alters a
remote RPI slot, but for a thread which does run on the local CPU.

>  and while that is the case,
> it can be manipulated (/wrt ->rpi) _concurrently_ (as we do not hold the
> remote rpilock all the time). I'm quite sure now that your patch does
> not change this.

My patch attempts fixing the core issue, not just plugging one of its
bad outcomes.

Again, the point is not to pretend that your patch is wrong, and it
surely "plugs" one issue we have due to rpi_next(). The point is to make
sure that all issues are covered, by fixing the usage of rpi_next(), or
find another way to fix what rpi_next() was supposed to fix in the first
place.

> 
> Jan
> 


-- 
Philippe.




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-04-27  9:12             ` Philippe Gerum
@ 2010-04-27  9:27               ` Jan Kiszka
  2010-04-27  9:32                 ` Philippe Gerum
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kiszka @ 2010-04-27  9:27 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai-core

Philippe Gerum wrote:
> On Tue, 2010-04-27 at 10:25 +0200, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Tue, 2010-04-27 at 08:46 +0200, Jan Kiszka wrote:
>>>> Philippe Gerum wrote:
>>>>> On Mon, 2010-04-26 at 18:06 +0200, Jan Kiszka wrote:
>>>>>> Jan Kiszka wrote:
>>>>>>> Jan Kiszka wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
>>>>>>>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
>>>>>>>> this could explain the bug:
>>>>>>>>
>>>>>>>>
>>>>>>>> static void rpi_clear_remote(struct xnthread *thread)
>>>>>>>> {
>>>>>>>> ...
>>>>>>>>         rpi = thread->rpi;
>>>>>>>>         if (unlikely(rpi == NULL))
>>>>>>>>                 return;
>>>>>>>>
>>>>>>>>         xnlock_get_irqsave(&rpi->rpilock, s);
>>>>>>>>
>>>>>>>>         /*
>>>>>>>>          * The RPI slot - if present - is always valid, and won't
>>>>>>>>          * change since the thread is resuming on this CPU and cannot
>>>>>>>>          * migrate under our feet. We may grab the remote slot lock
>>>>>>>>          * now.
>>>>>>>>          */
>>>>>>>>         xnsched_pop_rpi(thread);
>>>>>>>>         thread->rpi = NULL;
>>>>>>>>
>>>>>>>> ...
>>>>>>>>
>>>>>>>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
>>>>>>>> we check for it without any protection? Sounds racy. I think 'thread' is
>>>>>>>> not only pointing to the current thread but could refer to a foreign one
>>>>>>>> as well, right? Don't we need this:
>>>>>>>>
>>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>>>>>> index 872c37f..1f995d6 100644
>>>>>>>> --- a/ksrc/nucleus/shadow.c
>>>>>>>> +++ b/ksrc/nucleus/shadow.c
>>>>>>>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>>>>>  
>>>>>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
>>>>>>>>  
>>>>>>>> +	/* Re-check under lock, someone may have cleared rpi by now. */
>>>>>>>> +	if (unlikely(thread->rpi == NULL)) {
>>>>>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
>>>>>>>> +		return;
>>>>>>>> +	}
>>>>>>>> +
>>>>>>>>  	/*
>>>>>>>>  	 * The RPI slot - if present - is always valid, and won't
>>>>>>>>  	 * change since the thread is resuming on this CPU and cannot
>>>>>>> Another worry: Can thread->rpi become != rpi without being NULL? Or can
>>>>>>> we really only race for clearance here?
>>>>>>>
>>>>>> I think so now, therefore I'm proposing this:
>>>>>>
>>>>>> ----------->
>>>>>>
>>>>>> Most RPI services work on the current task or the one to be scheduled in
>>>>>> next, thus are naturally serialized. But rpi_next is not as it can walk
>>>>>> the chain of RPI requests for a CPU independently. In that case,
>>>>>> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
>>>>>> former loses after checking thread->rpi for NULL, we will dereference a
>>>>>> NULL pointer in xnsched_pop_rpi().
>>>>>>
>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
>>>>>> ---
>>>>>>  ksrc/nucleus/shadow.c |    9 +++++++++
>>>>>>  1 files changed, 9 insertions(+), 0 deletions(-)
>>>>>>
>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>>>> index 872c37f..cf7c08f 100644
>>>>>> --- a/ksrc/nucleus/shadow.c
>>>>>> +++ b/ksrc/nucleus/shadow.c
>>>>>> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
>>>>>>  
>>>>>>  	/*
>>>>>> +	 * Re-check under lock. Someone may have invoked rpi_next and cleared
>>>>>> +	 * rpi by now.
>>>>>> +	 */
>>>>>> +	if (unlikely(!rpi_p(thread))) {
>>>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
>>>>>> +		return;
>>>>>> +	}
>>>>>> +
>>>>>> +	/*
>>>>>>  	 * The RPI slot - if present - is always valid, and won't
>>>>>>  	 * change since the thread is resuming on this CPU and cannot
>>>>>>  	 * migrate under our feet. We may grab the remote slot lock
>>>>>>
>>>>> The suggested patch papers over the actual issue, which is that
>>>>> rpi_clear_remote() may not invoke rpi_next(), because it may only affect
>>>> I don't think that in our case rpi_clear_remote called rpi_next and
>>>> therefore crashed. It should rather have been the scenario of both
>>>> running in parallel on different CPUs, the former on behalf of a
>>>> migrated shadow that wants to clear its remainders on the remote CPU,
>>>> the latter on that CPU, picking a new top RPI after some other shadow
>>>> was removed from the queue. Is this a possible scenario, and would your
>>>> patch cover it?
>>>>
>>>>> the RPI state of the argument thread which must be a local one, and not
>>>>> that of any random thread that happens to be linked to the remote RPI
>>>>> queue.
>>>>>
>>>>> By calling rpi_next(), rpi_clear_remote() shoots itself in the foot,
>>>>> allowing a concurrent invocation of itself on a remote CPU, to fiddle
>>>>> with the rpi backlink of a thread which is not active on the
>>>>> local/per-cpu Xenomai scheduler instance, which is the point where
>>>>> things start to hit the crapper.
>>>>>
>>>>> Now, unless I can't even synchronize the couple of neurons I have left
>>>>> at this hour, the following patch should better fix the issue, because
>>>>> it restores the two basic rules that apply to the whole RPI machinery,
>>>>> namely:
>>>>>
>>>>> - rpi_* calls may only alter the contents of the local scheduler's RPI
>>>>> queue, with the notable exception of rpi_clear_remote() which may remove
>>>>> the given _local_ thread only, from a remote RPI slot.
>>>>>
>>>>> - rpi_* calls may only change the RPI state of threads which are
>>>>> controlled by the local Xenomai scheduler instance, except rpi_push()
>>>>> when called for setting up the RPI state of an emerging thread, in which
>>>>> case this is a no-conflict zone.
>>>>>
>>>>> That breakage was introduced in the early 2.5.1 timeframe, so 2.4.x
>>>>> should be immune from this particular bug.
>>>>>
>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>>> index 872c37f..1397ed1 100644
>>>>> --- a/ksrc/nucleus/shadow.c
>>>>> +++ b/ksrc/nucleus/shadow.c
>>>>> @@ -340,7 +340,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>>  	xnsched_pop_rpi(thread);
>>>>>  	thread->rpi = NULL;
>>>>>  
>>>>> -	if (rpi_next(rpi, s) == NULL)
>>>>> +	/*
>>>>> +	 * If the remote RPI queue was emptied, prepare for kicking
>>>>> +	 * xnshadow_rpi_check() on the relevant CPU to demote the root
>>>>> +	 * thread priority there.
>>>>> +	 */
>>>>> +	if (xnsched_peek_rpi(rpi) == NULL) /* No rpi_next() here. */
>>>>>  		rcpu = xnsched_cpu(rpi);
>>>>>  
>>>>>  	xnlock_put_irqrestore(&rpi->rpilock, s);
>>>>>
>>>>>
>>>> I have to confess, I do not understand how the patch may relate to our
>>>> crash. But that's because I still only have a semi-understanding of this
>>>> frightening complex RPI code. However, the fact that thread->rpi is now
>>>> again only checked outside its protecting lock leaves me with a very bad
>>>> feeling...
>>>>
>>> My point is that we should not have to protect a section of code which
>>> may never conflict in any case, by design; we will likely agree that
>>> sprinkling locks everywhere to get a warm and fuzzy feeling is no
>>> solution, it's actually a significant source of regression.
>>>
>>> The idea, behind keeping most rpi_* operations applicable to locally
>>> scheduled threads, is to introduce such a design, even when remote RPI
>>> slots are involved. thread->sched == xnpod_current_sched() for each
>>> rpi_*(sched, ...) calls is what is important in this logic. Another
>>> original assumption was that no RPI updates could be done in interrupt
>>> context, which is now wrong due to the change in xnshadow_rpi_check().
>>>
>>> In short: we have to make sure that rpi_next() does not break the basic
>>> assumptions of the RPI core, first.
>> Please check my scenario again: My concern is that a thread can be
>> queued for a short while on a remote sched,
> 
> No, it can't, that is the crux of the matter, well, at least, this
> should not be possible if the basic assumptions are preserved (have a
> look at the rpi_clear_remote() callers, the target thread may not
> migrate or be scheduled in linux-wise, or exit RPI via rpi_pop() during
> the call -- all places where the rpi backlink may be cleared). Only a
> caller operating from the local CPU should be allowed to alter the RPI
> state of threads linked to the RPI slot of that same CPU.
> 
> rpi_clear_remote() is not even an exception to this, since it alters a
> remote RPI slot, but for a thread which does run on the local CPU.
> 
>>  and while that is the case,
>> it can be manipulated (/wrt ->rpi) _concurrently_ (as we do not hold the
>> remote rpilock all the time). I'm quite sure now that your patch does
>> not change this.
> 
> My patch attempts fixing the core issue, not just plugging one of its
> bad outcomes.
> 
> Again, the point is not to pretend that your patch is wrong, and it
> surely "plugs" one issue we have due to rpi_next(). The point is to make
> sure that all issues are covered, by fixing the usage of rpi_next(), or
> find another way to fix what rpi_next() was supposed to fix in the first
> place.

So, if you are right, we could (in theory) replace rpilock with local
IRQ disabling? That would be the proof from me that it doesn't matter to
test thread->rpi outside the lock.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-04-27  9:27               ` Jan Kiszka
@ 2010-04-27  9:32                 ` Philippe Gerum
  2010-04-27  9:34                   ` Jan Kiszka
  0 siblings, 1 reply; 18+ messages in thread
From: Philippe Gerum @ 2010-04-27  9:32 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

On Tue, 2010-04-27 at 11:27 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Tue, 2010-04-27 at 10:25 +0200, Jan Kiszka wrote:
> >> Philippe Gerum wrote:
> >>> On Tue, 2010-04-27 at 08:46 +0200, Jan Kiszka wrote:
> >>>> Philippe Gerum wrote:
> >>>>> On Mon, 2010-04-26 at 18:06 +0200, Jan Kiszka wrote:
> >>>>>> Jan Kiszka wrote:
> >>>>>>> Jan Kiszka wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
> >>>>>>>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
> >>>>>>>> this could explain the bug:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> static void rpi_clear_remote(struct xnthread *thread)
> >>>>>>>> {
> >>>>>>>> ...
> >>>>>>>>         rpi = thread->rpi;
> >>>>>>>>         if (unlikely(rpi == NULL))
> >>>>>>>>                 return;
> >>>>>>>>
> >>>>>>>>         xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>>>>>
> >>>>>>>>         /*
> >>>>>>>>          * The RPI slot - if present - is always valid, and won't
> >>>>>>>>          * change since the thread is resuming on this CPU and cannot
> >>>>>>>>          * migrate under our feet. We may grab the remote slot lock
> >>>>>>>>          * now.
> >>>>>>>>          */
> >>>>>>>>         xnsched_pop_rpi(thread);
> >>>>>>>>         thread->rpi = NULL;
> >>>>>>>>
> >>>>>>>> ...
> >>>>>>>>
> >>>>>>>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
> >>>>>>>> we check for it without any protection? Sounds racy. I think 'thread' is
> >>>>>>>> not only pointing to the current thread but could refer to a foreign one
> >>>>>>>> as well, right? Don't we need this:
> >>>>>>>>
> >>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>>>>>> index 872c37f..1f995d6 100644
> >>>>>>>> --- a/ksrc/nucleus/shadow.c
> >>>>>>>> +++ b/ksrc/nucleus/shadow.c
> >>>>>>>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>>>>>>>  
> >>>>>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>>>>>  
> >>>>>>>> +	/* Re-check under lock, someone may have cleared rpi by now. */
> >>>>>>>> +	if (unlikely(thread->rpi == NULL)) {
> >>>>>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>>>>>> +		return;
> >>>>>>>> +	}
> >>>>>>>> +
> >>>>>>>>  	/*
> >>>>>>>>  	 * The RPI slot - if present - is always valid, and won't
> >>>>>>>>  	 * change since the thread is resuming on this CPU and cannot
> >>>>>>> Another worry: Can thread->rpi become != rpi without being NULL? Or can
> >>>>>>> we really only race for clearance here?
> >>>>>>>
> >>>>>> I think so now, therefore I'm proposing this:
> >>>>>>
> >>>>>> ----------->
> >>>>>>
> >>>>>> Most RPI services work on the current task or the one to be scheduled in
> >>>>>> next, thus are naturally serialized. But rpi_next is not as it can walk
> >>>>>> the chain of RPI requests for a CPU independently. In that case,
> >>>>>> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
> >>>>>> former loses after checking thread->rpi for NULL, we will dereference a
> >>>>>> NULL pointer in xnsched_pop_rpi().
> >>>>>>
> >>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
> >>>>>> ---
> >>>>>>  ksrc/nucleus/shadow.c |    9 +++++++++
> >>>>>>  1 files changed, 9 insertions(+), 0 deletions(-)
> >>>>>>
> >>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>>>> index 872c37f..cf7c08f 100644
> >>>>>> --- a/ksrc/nucleus/shadow.c
> >>>>>> +++ b/ksrc/nucleus/shadow.c
> >>>>>> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>>>  
> >>>>>>  	/*
> >>>>>> +	 * Re-check under lock. Someone may have invoked rpi_next and cleared
> >>>>>> +	 * rpi by now.
> >>>>>> +	 */
> >>>>>> +	if (unlikely(!rpi_p(thread))) {
> >>>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>>>> +		return;
> >>>>>> +	}
> >>>>>> +
> >>>>>> +	/*
> >>>>>>  	 * The RPI slot - if present - is always valid, and won't
> >>>>>>  	 * change since the thread is resuming on this CPU and cannot
> >>>>>>  	 * migrate under our feet. We may grab the remote slot lock
> >>>>>>
> >>>>> The suggested patch papers over the actual issue, which is that
> >>>>> rpi_clear_remote() may not invoke rpi_next(), because it may only affect
> >>>> I don't think that in our case rpi_clear_remote called rpi_next and
> >>>> therefore crashed. It should rather have been the scenario of both
> >>>> running in parallel on different CPUs, the former on behalf of a
> >>>> migrated shadow that wants to clear its remainders on the remote CPU,
> >>>> the latter on that CPU, picking a new top RPI after some other shadow
> >>>> was removed from the queue. Is this a possible scenario, and would your
> >>>> patch cover it?
> >>>>
> >>>>> the RPI state of the argument thread which must be a local one, and not
> >>>>> that of any random thread that happens to be linked to the remote RPI
> >>>>> queue.
> >>>>>
> >>>>> By calling rpi_next(), rpi_clear_remote() shoots itself in the foot,
> >>>>> allowing a concurrent invocation of itself on a remote CPU, to fiddle
> >>>>> with the rpi backlink of a thread which is not active on the
> >>>>> local/per-cpu Xenomai scheduler instance, which is the point where
> >>>>> things start to hit the crapper.
> >>>>>
> >>>>> Now, unless I can't even synchronize the couple of neurons I have left
> >>>>> at this hour, the following patch should better fix the issue, because
> >>>>> it restores the two basic rules that apply to the whole RPI machinery,
> >>>>> namely:
> >>>>>
> >>>>> - rpi_* calls may only alter the contents of the local scheduler's RPI
> >>>>> queue, with the notable exception of rpi_clear_remote() which may remove
> >>>>> the given _local_ thread only, from a remote RPI slot.
> >>>>>
> >>>>> - rpi_* calls may only change the RPI state of threads which are
> >>>>> controlled by the local Xenomai scheduler instance, except rpi_push()
> >>>>> when called for setting up the RPI state of an emerging thread, in which
> >>>>> case this is a no-conflict zone.
> >>>>>
> >>>>> That breakage was introduced in the early 2.5.1 timeframe, so 2.4.x
> >>>>> should be immune from this particular bug.
> >>>>>
> >>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>>> index 872c37f..1397ed1 100644
> >>>>> --- a/ksrc/nucleus/shadow.c
> >>>>> +++ b/ksrc/nucleus/shadow.c
> >>>>> @@ -340,7 +340,12 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>>>>  	xnsched_pop_rpi(thread);
> >>>>>  	thread->rpi = NULL;
> >>>>>  
> >>>>> -	if (rpi_next(rpi, s) == NULL)
> >>>>> +	/*
> >>>>> +	 * If the remote RPI queue was emptied, prepare for kicking
> >>>>> +	 * xnshadow_rpi_check() on the relevant CPU to demote the root
> >>>>> +	 * thread priority there.
> >>>>> +	 */
> >>>>> +	if (xnsched_peek_rpi(rpi) == NULL) /* No rpi_next() here. */
> >>>>>  		rcpu = xnsched_cpu(rpi);
> >>>>>  
> >>>>>  	xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>>>
> >>>>>
> >>>> I have to confess, I do not understand how the patch may relate to our
> >>>> crash. But that's because I still only have a semi-understanding of this
> >>>> frightening complex RPI code. However, the fact that thread->rpi is now
> >>>> again only checked outside its protecting lock leaves me with a very bad
> >>>> feeling...
> >>>>
> >>> My point is that we should not have to protect a section of code which
> >>> may never conflict in any case, by design; we will likely agree that
> >>> sprinkling locks everywhere to get a warm and fuzzy feeling is no
> >>> solution, it's actually a significant source of regression.
> >>>
> >>> The idea, behind keeping most rpi_* operations applicable to locally
> >>> scheduled threads, is to introduce such a design, even when remote RPI
> >>> slots are involved. thread->sched == xnpod_current_sched() for each
> >>> rpi_*(sched, ...) calls is what is important in this logic. Another
> >>> original assumption was that no RPI updates could be done in interrupt
> >>> context, which is now wrong due to the change in xnshadow_rpi_check().
> >>>
> >>> In short: we have to make sure that rpi_next() does not break the basic
> >>> assumptions of the RPI core, first.
> >> Please check my scenario again: My concern is that a thread can be
> >> queued for a short while on a remote sched,
> > 
> > No, it can't, that is the crux of the matter, well, at least, this
> > should not be possible if the basic assumptions are preserved (have a
> > look at the rpi_clear_remote() callers, the target thread may not
> > migrate or be scheduled in linux-wise, or exit RPI via rpi_pop() during
> > the call -- all places where the rpi backlink may be cleared). Only a
> > caller operating from the local CPU should be allowed to alter the RPI
> > state of threads linked to the RPI slot of that same CPU.
> > 
> > rpi_clear_remote() is not even an exception to this, since it alters a
> > remote RPI slot, but for a thread which does run on the local CPU.
> > 
> >>  and while that is the case,
> >> it can be manipulated (/wrt ->rpi) _concurrently_ (as we do not hold the
> >> remote rpilock all the time). I'm quite sure now that your patch does
> >> not change this.
> > 
> > My patch attempts fixing the core issue, not just plugging one of its
> > bad outcomes.
> > 
> > Again, the point is not to pretend that your patch is wrong, and it
> > surely "plugs" one issue we have due to rpi_next(). The point is to make
> > sure that all issues are covered, by fixing the usage of rpi_next(), or
> > find another way to fix what rpi_next() was supposed to fix in the first
> > place.
> 
> So, if you are right, we could (in theory) replace rpilock with local
> IRQ disabling? That would be the proof from me that it doesn't matter to
> test thread->rpi outside the lock.

Noo, you did not get it yet. The rpilock is protecting the per-cpu RPI
queues, NOT thread->rpi. 

> 
> Jan
> 


-- 
Philippe.




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-04-27  9:32                 ` Philippe Gerum
@ 2010-04-27  9:34                   ` Jan Kiszka
  2010-04-27  9:51                     ` Philippe Gerum
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kiszka @ 2010-04-27  9:34 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai-core

Philippe Gerum wrote:
> On Tue, 2010-04-27 at 11:27 +0200, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Tue, 2010-04-27 at 10:25 +0200, Jan Kiszka wrote:
>>>> Philippe Gerum wrote:
>>>>> On Tue, 2010-04-27 at 08:46 +0200, Jan Kiszka wrote:
>>>>>> Philippe Gerum wrote:
>>>>>>> On Mon, 2010-04-26 at 18:06 +0200, Jan Kiszka wrote:
>>>>>>>> Jan Kiszka wrote:
>>>>>>>>> Jan Kiszka wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
>>>>>>>>>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
>>>>>>>>>> this could explain the bug:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> static void rpi_clear_remote(struct xnthread *thread)
>>>>>>>>>> {
>>>>>>>>>> ...
>>>>>>>>>>         rpi = thread->rpi;
>>>>>>>>>>         if (unlikely(rpi == NULL))
>>>>>>>>>>                 return;
>>>>>>>>>>
>>>>>>>>>>         xnlock_get_irqsave(&rpi->rpilock, s);
>>>>>>>>>>
>>>>>>>>>>         /*
>>>>>>>>>>          * The RPI slot - if present - is always valid, and won't
>>>>>>>>>>          * change since the thread is resuming on this CPU and cannot
>>>>>>>>>>          * migrate under our feet. We may grab the remote slot lock
>>>>>>>>>>          * now.
>>>>>>>>>>          */
>>>>>>>>>>         xnsched_pop_rpi(thread);
>>>>>>>>>>         thread->rpi = NULL;
>>>>>>>>>>
>>>>>>>>>> ...
>>>>>>>>>>
>>>>>>>>>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
>>>>>>>>>> we check for it without any protection? Sounds racy. I think 'thread' is
>>>>>>>>>> not only pointing to the current thread but could refer to a foreign one
>>>>>>>>>> as well, right? Don't we need this:
>>>>>>>>>>
>>>>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>>>>>>>> index 872c37f..1f995d6 100644
>>>>>>>>>> --- a/ksrc/nucleus/shadow.c
>>>>>>>>>> +++ b/ksrc/nucleus/shadow.c
>>>>>>>>>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>>>>>>>  
>>>>>>>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
>>>>>>>>>>  
>>>>>>>>>> +	/* Re-check under lock, someone may have cleared rpi by now. */
>>>>>>>>>> +	if (unlikely(thread->rpi == NULL)) {
>>>>>>>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
>>>>>>>>>> +		return;
>>>>>>>>>> +	}
>>>>>>>>>> +
>>>>>>>>>>  	/*
>>>>>>>>>>  	 * The RPI slot - if present - is always valid, and won't
>>>>>>>>>>  	 * change since the thread is resuming on this CPU and cannot
>>>>>>>>> Another worry: Can thread->rpi become != rpi without being NULL? Or can
>>>>>>>>> we really only race for clearance here?
>>>>>>>>>
>>>>>>>> I think so now, therefore I'm proposing this:
>>>>>>>>
>>>>>>>> ----------->
>>>>>>>>
>>>>>>>> Most RPI services work on the current task or the one to be scheduled in
>>>>>>>> next, thus are naturally serialized. But rpi_next is not as it can walk
>>>>>>>> the chain of RPI requests for a CPU independently. In that case,
>>>>>>>> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
>>>>>>>> former loses after checking thread->rpi for NULL, we will dereference a
>>>>>>>> NULL pointer in xnsched_pop_rpi().
>>>>>>>>
>>>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
>>>>>>>> ---
>>>>>>>>  ksrc/nucleus/shadow.c |    9 +++++++++
>>>>>>>>  1 files changed, 9 insertions(+), 0 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>>>>>> index 872c37f..cf7c08f 100644
>>>>>>>> --- a/ksrc/nucleus/shadow.c
>>>>>>>> +++ b/ksrc/nucleus/shadow.c
>>>>>>>> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
>>>>>>>>  
>>>>>>>>  	/*
>>>>>>>> +	 * Re-check under lock. Someone may have invoked rpi_next and cleared
>>>>>>>> +	 * rpi by now.
>>>>>>>> +	 */
>>>>>>>> +	if (unlikely(!rpi_p(thread))) {
>>>>>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
>>>>>>>> +		return;
>>>>>>>> +	}
>>>>>>>> +
>>>>>>>> +	/*
>>>>>>>>  	 * The RPI slot - if present - is always valid, and won't
>>>>>>>>  	 * change since the thread is resuming on this CPU and cannot
>>>>>>>>  	 * migrate under our feet. We may grab the remote slot lock
>>>>>>>>
>>>>>>> The suggested patch papers over the actual issue, which is that
>>>>>>> rpi_clear_remote() may not invoke rpi_next(), because it may only affect
>>>>>> I don't think that in our case rpi_clear_remote called rpi_next and
>>>>>> therefore crashed. It should rather have been the scenario of both
>>>>>> running in parallel on different CPUs, the former on behalf of a
>>>>>> migrated shadow that wants to clear its remainders on the remote CPU,
>>>>>> the latter on that CPU, picking a new top RPI after some other shadow
>>>>>> was removed from the queue. Is this a possible scenario, and would your
>>>>>> patch cover it?
>>>>>>
>>>>>>> the RPI state of the argument thread which must be a local one, and not
>>>>>>> that of any random thread that happens to be linked to the remote RPI
>>>>>>> queue.
>>>>>>>
>>>>>>> By calling rpi_next(), rpi_clear_remote() shoots itself in the foot,
>>>>>>> allowing a concurrent invocation of itself on a remote CPU, to fiddle
>>>>>>> with the rpi backlink of a thread which is not active on the
>>>>>>> local/per-cpu Xenomai scheduler instance, which is the point where
>>>>>>> things start to hit the crapper.
>>>>>>>
>>>>>>> Now, unless I can't even synchronize the couple of neurons I have left
>>>>>>> at this hour, the following patch should better fix the issue, because
>>>>>>> it restores the two basic rules that apply to the whole RPI machinery,
>>>>>>> namely:
>>>>>>>
>>>>>>> - rpi_* calls may only alter the contents of the local scheduler's RPI
>>>>>>> queue, with the notable exception of rpi_clear_remote() which may remove
>>>>>>> the given _local_ thread only, from a remote RPI slot.
>>>>>>>
>>>>>>> - rpi_* calls may only change the RPI state of threads which are
>>>>>>> controlled by the local Xenomai scheduler instance, except rpi_push()
>>>>>>> when called for setting up the RPI state of an emerging thread, in which
>>>>>>> case this is a no-conflict zone.
>>>>>>>
>>>>>>> That breakage was introduced in the early 2.5.1 timeframe, so 2.4.x
>>>>>>> should be immune from this particular bug.
>>>>>>>
>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>>>>> index 872c37f..1397ed1 100644
>>>>>>> --- a/ksrc/nucleus/shadow.c
>>>>>>> +++ b/ksrc/nucleus/shadow.c
>>>>>>> @@ -340,7 +340,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>>>>  	xnsched_pop_rpi(thread);
>>>>>>>  	thread->rpi = NULL;
>>>>>>>  
>>>>>>> -	if (rpi_next(rpi, s) == NULL)
>>>>>>> +	/*
>>>>>>> +	 * If the remote RPI queue was emptied, prepare for kicking
>>>>>>> +	 * xnshadow_rpi_check() on the relevant CPU to demote the root
>>>>>>> +	 * thread priority there.
>>>>>>> +	 */
>>>>>>> +	if (xnsched_peek_rpi(rpi) == NULL) /* No rpi_next() here. */
>>>>>>>  		rcpu = xnsched_cpu(rpi);
>>>>>>>  
>>>>>>>  	xnlock_put_irqrestore(&rpi->rpilock, s);
>>>>>>>
>>>>>>>
>>>>>> I have to confess, I do not understand how the patch may relate to our
>>>>>> crash. But that's because I still only have a semi-understanding of this
>>>>>> frightening complex RPI code. However, the fact that thread->rpi is now
>>>>>> again only checked outside its protecting lock leaves me with a very bad
>>>>>> feeling...
>>>>>>
>>>>> My point is that we should not have to protect a section of code which
>>>>> may never conflict in any case, by design; we will likely agree that
>>>>> sprinkling locks everywhere to get a warm and fuzzy feeling is no
>>>>> solution, it's actually a significant source of regression.
>>>>>
>>>>> The idea, behind keeping most rpi_* operations applicable to locally
>>>>> scheduled threads, is to introduce such a design, even when remote RPI
>>>>> slots are involved. thread->sched == xnpod_current_sched() for each
>>>>> rpi_*(sched, ...) calls is what is important in this logic. Another
>>>>> original assumption was that no RPI updates could be done in interrupt
>>>>> context, which is now wrong due to the change in xnshadow_rpi_check().
>>>>>
>>>>> In short: we have to make sure that rpi_next() does not break the basic
>>>>> assumptions of the RPI core, first.
>>>> Please check my scenario again: My concern is that a thread can be
>>>> queued for a short while on a remote sched,
>>> No, it can't, that is the crux of the matter, well, at least, this
>>> should not be possible if the basic assumptions are preserved (have a
>>> look at the rpi_clear_remote() callers, the target thread may not
>>> migrate or be scheduled in linux-wise, or exit RPI via rpi_pop() during
>>> the call -- all places where the rpi backlink may be cleared). Only a
>>> caller operating from the local CPU should be allowed to alter the RPI
>>> state of threads linked to the RPI slot of that same CPU.
>>>
>>> rpi_clear_remote() is not even an exception to this, since it alters a
>>> remote RPI slot, but for a thread which does run on the local CPU.
>>>
>>>>  and while that is the case,
>>>> it can be manipulated (/wrt ->rpi) _concurrently_ (as we do not hold the
>>>> remote rpilock all the time). I'm quite sure now that your patch does
>>>> not change this.
>>> My patch attempts fixing the core issue, not just plugging one of its
>>> bad outcomes.
>>>
>>> Again, the point is not to pretend that your patch is wrong, and it
>>> surely "plugs" one issue we have due to rpi_next(). The point is to make
>>> sure that all issues are covered, by fixing the usage of rpi_next(), or
>>> find another way to fix what rpi_next() was supposed to fix in the first
>>> place.
>> So, if you are right, we could (in theory) replace rpilock with local
>> IRQ disabling? That would be the proof from me that it doesn't matter to
>> test thread->rpi outside the lock.
> 
> Noo, you did not get it yet. The rpilock is protecting the per-cpu RPI
> queues, NOT thread->rpi. 

So linking a thread to an RPI queue and setting its ->rpi are not always
related?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-04-27  9:34                   ` Jan Kiszka
@ 2010-04-27  9:51                     ` Philippe Gerum
  2010-04-27 10:40                       ` Jan Kiszka
  0 siblings, 1 reply; 18+ messages in thread
From: Philippe Gerum @ 2010-04-27  9:51 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

On Tue, 2010-04-27 at 11:34 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Tue, 2010-04-27 at 11:27 +0200, Jan Kiszka wrote:
> >> Philippe Gerum wrote:
> >>> On Tue, 2010-04-27 at 10:25 +0200, Jan Kiszka wrote:
> >>>> Philippe Gerum wrote:
> >>>>> On Tue, 2010-04-27 at 08:46 +0200, Jan Kiszka wrote:
> >>>>>> Philippe Gerum wrote:
> >>>>>>> On Mon, 2010-04-26 at 18:06 +0200, Jan Kiszka wrote:
> >>>>>>>> Jan Kiszka wrote:
> >>>>>>>>> Jan Kiszka wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
> >>>>>>>>>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
> >>>>>>>>>> this could explain the bug:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> static void rpi_clear_remote(struct xnthread *thread)
> >>>>>>>>>> {
> >>>>>>>>>> ...
> >>>>>>>>>>         rpi = thread->rpi;
> >>>>>>>>>>         if (unlikely(rpi == NULL))
> >>>>>>>>>>                 return;
> >>>>>>>>>>
> >>>>>>>>>>         xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>>>>>>>
> >>>>>>>>>>         /*
> >>>>>>>>>>          * The RPI slot - if present - is always valid, and won't
> >>>>>>>>>>          * change since the thread is resuming on this CPU and cannot
> >>>>>>>>>>          * migrate under our feet. We may grab the remote slot lock
> >>>>>>>>>>          * now.
> >>>>>>>>>>          */
> >>>>>>>>>>         xnsched_pop_rpi(thread);
> >>>>>>>>>>         thread->rpi = NULL;
> >>>>>>>>>>
> >>>>>>>>>> ...
> >>>>>>>>>>
> >>>>>>>>>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
> >>>>>>>>>> we check for it without any protection? Sounds racy. I think 'thread' is
> >>>>>>>>>> not only pointing to the current thread but could refer to a foreign one
> >>>>>>>>>> as well, right? Don't we need this:
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>>>>>>>> index 872c37f..1f995d6 100644
> >>>>>>>>>> --- a/ksrc/nucleus/shadow.c
> >>>>>>>>>> +++ b/ksrc/nucleus/shadow.c
> >>>>>>>>>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>>>>>>>>>  
> >>>>>>>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>>>>>>>  
> >>>>>>>>>> +	/* Re-check under lock, someone may have cleared rpi by now. */
> >>>>>>>>>> +	if (unlikely(thread->rpi == NULL)) {
> >>>>>>>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>>>>>>>> +		return;
> >>>>>>>>>> +	}
> >>>>>>>>>> +
> >>>>>>>>>>  	/*
> >>>>>>>>>>  	 * The RPI slot - if present - is always valid, and won't
> >>>>>>>>>>  	 * change since the thread is resuming on this CPU and cannot
> >>>>>>>>> Another worry: Can thread->rpi become != rpi without being NULL? Or can
> >>>>>>>>> we really only race for clearance here?
> >>>>>>>>>
> >>>>>>>> I think so now, therefore I'm proposing this:
> >>>>>>>>
> >>>>>>>> ----------->
> >>>>>>>>
> >>>>>>>> Most RPI services work on the current task or the one to be scheduled in
> >>>>>>>> next, thus are naturally serialized. But rpi_next is not as it can walk
> >>>>>>>> the chain of RPI requests for a CPU independently. In that case,
> >>>>>>>> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
> >>>>>>>> former loses after checking thread->rpi for NULL, we will dereference a
> >>>>>>>> NULL pointer in xnsched_pop_rpi().
> >>>>>>>>
> >>>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
> >>>>>>>> ---
> >>>>>>>>  ksrc/nucleus/shadow.c |    9 +++++++++
> >>>>>>>>  1 files changed, 9 insertions(+), 0 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>>>>>> index 872c37f..cf7c08f 100644
> >>>>>>>> --- a/ksrc/nucleus/shadow.c
> >>>>>>>> +++ b/ksrc/nucleus/shadow.c
> >>>>>>>> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>>>>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>>>>>  
> >>>>>>>>  	/*
> >>>>>>>> +	 * Re-check under lock. Someone may have invoked rpi_next and cleared
> >>>>>>>> +	 * rpi by now.
> >>>>>>>> +	 */
> >>>>>>>> +	if (unlikely(!rpi_p(thread))) {
> >>>>>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>>>>>> +		return;
> >>>>>>>> +	}
> >>>>>>>> +
> >>>>>>>> +	/*
> >>>>>>>>  	 * The RPI slot - if present - is always valid, and won't
> >>>>>>>>  	 * change since the thread is resuming on this CPU and cannot
> >>>>>>>>  	 * migrate under our feet. We may grab the remote slot lock
> >>>>>>>>
> >>>>>>> The suggested patch papers over the actual issue, which is that
> >>>>>>> rpi_clear_remote() may not invoke rpi_next(), because it may only affect
> >>>>>> I don't think that in our case rpi_clear_remote called rpi_next and
> >>>>>> therefore crashed. It should rather have been the scenario of both
> >>>>>> running in parallel on different CPUs, the former on behalf of a
> >>>>>> migrated shadow that wants to clear its remainders on the remote CPU,
> >>>>>> the latter on that CPU, picking a new top RPI after some other shadow
> >>>>>> was removed from the queue. Is this a possible scenario, and would your
> >>>>>> patch cover it?
> >>>>>>
> >>>>>>> the RPI state of the argument thread which must be a local one, and not
> >>>>>>> that of any random thread that happens to be linked to the remote RPI
> >>>>>>> queue.
> >>>>>>>
> >>>>>>> By calling rpi_next(), rpi_clear_remote() shoots itself in the foot,
> >>>>>>> allowing a concurrent invocation of itself on a remote CPU, to fiddle
> >>>>>>> with the rpi backlink of a thread which is not active on the
> >>>>>>> local/per-cpu Xenomai scheduler instance, which is the point where
> >>>>>>> things start to hit the crapper.
> >>>>>>>
> >>>>>>> Now, unless I can't even synchronize the couple of neurons I have left
> >>>>>>> at this hour, the following patch should better fix the issue, because
> >>>>>>> it restores the two basic rules that apply to the whole RPI machinery,
> >>>>>>> namely:
> >>>>>>>
> >>>>>>> - rpi_* calls may only alter the contents of the local scheduler's RPI
> >>>>>>> queue, with the notable exception of rpi_clear_remote() which may remove
> >>>>>>> the given _local_ thread only, from a remote RPI slot.
> >>>>>>>
> >>>>>>> - rpi_* calls may only change the RPI state of threads which are
> >>>>>>> controlled by the local Xenomai scheduler instance, except rpi_push()
> >>>>>>> when called for setting up the RPI state of an emerging thread, in which
> >>>>>>> case this is a no-conflict zone.
> >>>>>>>
> >>>>>>> That breakage was introduced in the early 2.5.1 timeframe, so 2.4.x
> >>>>>>> should be immune from this particular bug.
> >>>>>>>
> >>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>>>>> index 872c37f..1397ed1 100644
> >>>>>>> --- a/ksrc/nucleus/shadow.c
> >>>>>>> +++ b/ksrc/nucleus/shadow.c
> >>>>>>> @@ -340,7 +340,12 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>>>>>>  	xnsched_pop_rpi(thread);
> >>>>>>>  	thread->rpi = NULL;
> >>>>>>>  
> >>>>>>> -	if (rpi_next(rpi, s) == NULL)
> >>>>>>> +	/*
> >>>>>>> +	 * If the remote RPI queue was emptied, prepare for kicking
> >>>>>>> +	 * xnshadow_rpi_check() on the relevant CPU to demote the root
> >>>>>>> +	 * thread priority there.
> >>>>>>> +	 */
> >>>>>>> +	if (xnsched_peek_rpi(rpi) == NULL) /* No rpi_next() here. */
> >>>>>>>  		rcpu = xnsched_cpu(rpi);
> >>>>>>>  
> >>>>>>>  	xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>>>>>
> >>>>>>>
> >>>>>> I have to confess, I do not understand how the patch may relate to our
> >>>>>> crash. But that's because I still only have a semi-understanding of this
> >>>>>> frightening complex RPI code. However, the fact that thread->rpi is now
> >>>>>> again only checked outside its protecting lock leaves me with a very bad
> >>>>>> feeling...
> >>>>>>
> >>>>> My point is that we should not have to protect a section of code which
> >>>>> may never conflict in any case, by design; we will likely agree that
> >>>>> sprinkling locks everywhere to get a warm and fuzzy feeling is no
> >>>>> solution, it's actually a significant source of regression.
> >>>>>
> >>>>> The idea, behind keeping most rpi_* operations applicable to locally
> >>>>> scheduled threads, is to introduce such a design, even when remote RPI
> >>>>> slots are involved. thread->sched == xnpod_current_sched() for each
> >>>>> rpi_*(sched, ...) calls is what is important in this logic. Another
> >>>>> original assumption was that no RPI updates could be done in interrupt
> >>>>> context, which is now wrong due to the change in xnshadow_rpi_check().
> >>>>>
> >>>>> In short: we have to make sure that rpi_next() does not break the basic
> >>>>> assumptions of the RPI core, first.
> >>>> Please check my scenario again: My concern is that a thread can be
> >>>> queued for a short while on a remote sched,
> >>> No, it can't, that is the crux of the matter, well, at least, this
> >>> should not be possible if the basic assumptions are preserved (have a
> >>> look at the rpi_clear_remote() callers, the target thread may not
> >>> migrate or be scheduled in linux-wise, or exit RPI via rpi_pop() during
> >>> the call -- all places where the rpi backlink may be cleared). Only a
> >>> caller operating from the local CPU should be allowed to alter the RPI
> >>> state of threads linked to the RPI slot of that same CPU.
> >>>
> >>> rpi_clear_remote() is not even an exception to this, since it alters a
> >>> remote RPI slot, but for a thread which does run on the local CPU.
> >>>
> >>>>  and while that is the case,
> >>>> it can be manipulated (/wrt ->rpi) _concurrently_ (as we do not hold the
> >>>> remote rpilock all the time). I'm quite sure now that your patch does
> >>>> not change this.
> >>> My patch attempts fixing the core issue, not just plugging one of its
> >>> bad outcomes.
> >>>
> >>> Again, the point is not to pretend that your patch is wrong, and it
> >>> surely "plugs" one issue we have due to rpi_next(). The point is to make
> >>> sure that all issues are covered, by fixing the usage of rpi_next(), or
> >>> find another way to fix what rpi_next() was supposed to fix in the first
> >>> place.
> >> So, if you are right, we could (in theory) replace rpilock with local
> >> IRQ disabling? That would be the proof from me that it doesn't matter to
> >> test thread->rpi outside the lock.
> > 
> > Noo, you did not get it yet. The rpilock is protecting the per-cpu RPI
> > queues, NOT thread->rpi. 
> 
> So linking a thread to an RPI queue and setting its ->rpi are not always
> related?

Exactly. Normally, a remote CPU should not be allowed to compete with
another one for updating the rpi backlink of a thread it does NOT own,
meaning not linked to the local scheduler. 

My understanding is that rpi_next() is deliberately breaking that rule
when called from rpi_clear_remote() for a non-local scheduler, therefore
altering rpi backlinks of threads which do NOT belong to the local
scheduler.

Who is wrong then? Well, I tend to think that rpi_next() should not be
called in the only cross-CPU context we have, namely rpi_clear_remote().
In that very particular case, where we own the target thread, and we
want to complete its recent migration to our local CPU.

This is why I was insisting on the notion of calling context for all
rpi* calls, with respect to what threads may be touched by a given CPU. 

But, I am now worried by xnshadow_rpi_check(), which - by calling
rpi_next() instead of simply peeking at the RPI queue head - introduces
a potential race wrt the rpi backlink, when we deal with
rpi_clear_remote() on the _same_ CPU.

> 
> Jan
> 


-- 
Philippe.




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-04-27  9:51                     ` Philippe Gerum
@ 2010-04-27 10:40                       ` Jan Kiszka
  2010-05-01 17:26                         ` Philippe Gerum
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kiszka @ 2010-04-27 10:40 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai-core

Philippe Gerum wrote:
> On Tue, 2010-04-27 at 11:34 +0200, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Tue, 2010-04-27 at 11:27 +0200, Jan Kiszka wrote:
>>>> Philippe Gerum wrote:
>>>>> On Tue, 2010-04-27 at 10:25 +0200, Jan Kiszka wrote:
>>>>>> Philippe Gerum wrote:
>>>>>>> On Tue, 2010-04-27 at 08:46 +0200, Jan Kiszka wrote:
>>>>>>>> Philippe Gerum wrote:
>>>>>>>>> On Mon, 2010-04-26 at 18:06 +0200, Jan Kiszka wrote:
>>>>>>>>>> Jan Kiszka wrote:
>>>>>>>>>>> Jan Kiszka wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
>>>>>>>>>>>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
>>>>>>>>>>>> this could explain the bug:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> static void rpi_clear_remote(struct xnthread *thread)
>>>>>>>>>>>> {
>>>>>>>>>>>> ...
>>>>>>>>>>>>         rpi = thread->rpi;
>>>>>>>>>>>>         if (unlikely(rpi == NULL))
>>>>>>>>>>>>                 return;
>>>>>>>>>>>>
>>>>>>>>>>>>         xnlock_get_irqsave(&rpi->rpilock, s);
>>>>>>>>>>>>
>>>>>>>>>>>>         /*
>>>>>>>>>>>>          * The RPI slot - if present - is always valid, and won't
>>>>>>>>>>>>          * change since the thread is resuming on this CPU and cannot
>>>>>>>>>>>>          * migrate under our feet. We may grab the remote slot lock
>>>>>>>>>>>>          * now.
>>>>>>>>>>>>          */
>>>>>>>>>>>>         xnsched_pop_rpi(thread);
>>>>>>>>>>>>         thread->rpi = NULL;
>>>>>>>>>>>>
>>>>>>>>>>>> ...
>>>>>>>>>>>>
>>>>>>>>>>>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
>>>>>>>>>>>> we check for it without any protection? Sounds racy. I think 'thread' is
>>>>>>>>>>>> not only pointing to the current thread but could refer to a foreign one
>>>>>>>>>>>> as well, right? Don't we need this:
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>>>>>>>>>> index 872c37f..1f995d6 100644
>>>>>>>>>>>> --- a/ksrc/nucleus/shadow.c
>>>>>>>>>>>> +++ b/ksrc/nucleus/shadow.c
>>>>>>>>>>>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>>>>>>>>>  
>>>>>>>>>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
>>>>>>>>>>>>  
>>>>>>>>>>>> +	/* Re-check under lock, someone may have cleared rpi by now. */
>>>>>>>>>>>> +	if (unlikely(thread->rpi == NULL)) {
>>>>>>>>>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
>>>>>>>>>>>> +		return;
>>>>>>>>>>>> +	}
>>>>>>>>>>>> +
>>>>>>>>>>>>  	/*
>>>>>>>>>>>>  	 * The RPI slot - if present - is always valid, and won't
>>>>>>>>>>>>  	 * change since the thread is resuming on this CPU and cannot
>>>>>>>>>>> Another worry: Can thread->rpi become != rpi without being NULL? Or can
>>>>>>>>>>> we really only race for clearance here?
>>>>>>>>>>>
>>>>>>>>>> I think so now, therefore I'm proposing this:
>>>>>>>>>>
>>>>>>>>>> ----------->
>>>>>>>>>>
>>>>>>>>>> Most RPI services work on the current task or the one to be scheduled in
>>>>>>>>>> next, thus are naturally serialized. But rpi_next is not as it can walk
>>>>>>>>>> the chain of RPI requests for a CPU independently. In that case,
>>>>>>>>>> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
>>>>>>>>>> former loses after checking thread->rpi for NULL, we will dereference a
>>>>>>>>>> NULL pointer in xnsched_pop_rpi().
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
>>>>>>>>>> ---
>>>>>>>>>>  ksrc/nucleus/shadow.c |    9 +++++++++
>>>>>>>>>>  1 files changed, 9 insertions(+), 0 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>>>>>>>> index 872c37f..cf7c08f 100644
>>>>>>>>>> --- a/ksrc/nucleus/shadow.c
>>>>>>>>>> +++ b/ksrc/nucleus/shadow.c
>>>>>>>>>> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>>>>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
>>>>>>>>>>  
>>>>>>>>>>  	/*
>>>>>>>>>> +	 * Re-check under lock. Someone may have invoked rpi_next and cleared
>>>>>>>>>> +	 * rpi by now.
>>>>>>>>>> +	 */
>>>>>>>>>> +	if (unlikely(!rpi_p(thread))) {
>>>>>>>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
>>>>>>>>>> +		return;
>>>>>>>>>> +	}
>>>>>>>>>> +
>>>>>>>>>> +	/*
>>>>>>>>>>  	 * The RPI slot - if present - is always valid, and won't
>>>>>>>>>>  	 * change since the thread is resuming on this CPU and cannot
>>>>>>>>>>  	 * migrate under our feet. We may grab the remote slot lock
>>>>>>>>>>
>>>>>>>>> The suggested patch papers over the actual issue, which is that
>>>>>>>>> rpi_clear_remote() may not invoke rpi_next(), because it may only affect
>>>>>>>> I don't think that in our case rpi_clear_remote called rpi_next and
>>>>>>>> therefore crashed. It should rather have been the scenario of both
>>>>>>>> running in parallel on different CPUs, the former on behalf of a
>>>>>>>> migrated shadow that wants to clear its remainders on the remote CPU,
>>>>>>>> the latter on that CPU, picking a new top RPI after some other shadow
>>>>>>>> was removed from the queue. Is this a possible scenario, and would your
>>>>>>>> patch cover it?
>>>>>>>>
>>>>>>>>> the RPI state of the argument thread which must be a local one, and not
>>>>>>>>> that of any random thread that happens to be linked to the remote RPI
>>>>>>>>> queue.
>>>>>>>>>
>>>>>>>>> By calling rpi_next(), rpi_clear_remote() shoots itself in the foot,
>>>>>>>>> allowing a concurrent invocation of itself on a remote CPU, to fiddle
>>>>>>>>> with the rpi backlink of a thread which is not active on the
>>>>>>>>> local/per-cpu Xenomai scheduler instance, which is the point where
>>>>>>>>> things start to hit the crapper.
>>>>>>>>>
>>>>>>>>> Now, unless I can't even synchronize the couple of neurons I have left
>>>>>>>>> at this hour, the following patch should better fix the issue, because
>>>>>>>>> it restores the two basic rules that apply to the whole RPI machinery,
>>>>>>>>> namely:
>>>>>>>>>
>>>>>>>>> - rpi_* calls may only alter the contents of the local scheduler's RPI
>>>>>>>>> queue, with the notable exception of rpi_clear_remote() which may remove
>>>>>>>>> the given _local_ thread only, from a remote RPI slot.
>>>>>>>>>
>>>>>>>>> - rpi_* calls may only change the RPI state of threads which are
>>>>>>>>> controlled by the local Xenomai scheduler instance, except rpi_push()
>>>>>>>>> when called for setting up the RPI state of an emerging thread, in which
>>>>>>>>> case this is a no-conflict zone.
>>>>>>>>>
>>>>>>>>> That breakage was introduced in the early 2.5.1 timeframe, so 2.4.x
>>>>>>>>> should be immune from this particular bug.
>>>>>>>>>
>>>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>>>>>>> index 872c37f..1397ed1 100644
>>>>>>>>> --- a/ksrc/nucleus/shadow.c
>>>>>>>>> +++ b/ksrc/nucleus/shadow.c
>>>>>>>>> @@ -340,7 +340,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>>>>>>  	xnsched_pop_rpi(thread);
>>>>>>>>>  	thread->rpi = NULL;
>>>>>>>>>  
>>>>>>>>> -	if (rpi_next(rpi, s) == NULL)
>>>>>>>>> +	/*
>>>>>>>>> +	 * If the remote RPI queue was emptied, prepare for kicking
>>>>>>>>> +	 * xnshadow_rpi_check() on the relevant CPU to demote the root
>>>>>>>>> +	 * thread priority there.
>>>>>>>>> +	 */
>>>>>>>>> +	if (xnsched_peek_rpi(rpi) == NULL) /* No rpi_next() here. */
>>>>>>>>>  		rcpu = xnsched_cpu(rpi);
>>>>>>>>>  
>>>>>>>>>  	xnlock_put_irqrestore(&rpi->rpilock, s);
>>>>>>>>>
>>>>>>>>>
>>>>>>>> I have to confess, I do not understand how the patch may relate to our
>>>>>>>> crash. But that's because I still only have a semi-understanding of this
>>>>>>>> frightening complex RPI code. However, the fact that thread->rpi is now
>>>>>>>> again only checked outside its protecting lock leaves me with a very bad
>>>>>>>> feeling...
>>>>>>>>
>>>>>>> My point is that we should not have to protect a section of code which
>>>>>>> may never conflict in any case, by design; we will likely agree that
>>>>>>> sprinkling locks everywhere to get a warm and fuzzy feeling is no
>>>>>>> solution, it's actually a significant source of regression.
>>>>>>>
>>>>>>> The idea, behind keeping most rpi_* operations applicable to locally
>>>>>>> scheduled threads, is to introduce such a design, even when remote RPI
>>>>>>> slots are involved. thread->sched == xnpod_current_sched() for each
>>>>>>> rpi_*(sched, ...) calls is what is important in this logic. Another
>>>>>>> original assumption was that no RPI updates could be done in interrupt
>>>>>>> context, which is now wrong due to the change in xnshadow_rpi_check().
>>>>>>>
>>>>>>> In short: we have to make sure that rpi_next() does not break the basic
>>>>>>> assumptions of the RPI core, first.
>>>>>> Please check my scenario again: My concern is that a thread can be
>>>>>> queued for a short while on a remote sched,
>>>>> No, it can't, that is the crux of the matter, well, at least, this
>>>>> should not be possible if the basic assumptions are preserved (have a
>>>>> look at the rpi_clear_remote() callers, the target thread may not
>>>>> migrate or be scheduled in linux-wise, or exit RPI via rpi_pop() during
>>>>> the call -- all places where the rpi backlink may be cleared). Only a
>>>>> caller operating from the local CPU should be allowed to alter the RPI
>>>>> state of threads linked to the RPI slot of that same CPU.
>>>>>
>>>>> rpi_clear_remote() is not even an exception to this, since it alters a
>>>>> remote RPI slot, but for a thread which does run on the local CPU.
>>>>>
>>>>>>  and while that is the case,
>>>>>> it can be manipulated (/wrt ->rpi) _concurrently_ (as we do not hold the
>>>>>> remote rpilock all the time). I'm quite sure now that your patch does
>>>>>> not change this.
>>>>> My patch attempts fixing the core issue, not just plugging one of its
>>>>> bad outcomes.
>>>>>
>>>>> Again, the point is not to pretend that your patch is wrong, and it
>>>>> surely "plugs" one issue we have due to rpi_next(). The point is to make
>>>>> sure that all issues are covered, by fixing the usage of rpi_next(), or
>>>>> find another way to fix what rpi_next() was supposed to fix in the first
>>>>> place.
>>>> So, if you are right, we could (in theory) replace rpilock with local
>>>> IRQ disabling? That would be the proof from me that it doesn't matter to
>>>> test thread->rpi outside the lock.
>>> Noo, you did not get it yet. The rpilock is protecting the per-cpu RPI
>>> queues, NOT thread->rpi. 
>> So linking a thread to an RPI queue and setting its ->rpi are not always
>> related?
> 
> Exactly. Normally, a remote CPU should not be allowed to compete with
> another one for updating the rpi backlink of a thread it does NOT own,
> meaning not linked to the local scheduler. 
> 
> My understanding is that rpi_next() is deliberately breaking that rule
> when called from rpi_clear_remote() for a non-local scheduler, therefore
> altering rpi backlinks of threads which do NOT belong to the local
> scheduler.

Just dug out the backtrace again: Where we saw the crash was (likely -
lots of inline functions) in the chain schedule_event -> rpi_switch ->
rpi_migrate -> rpi_clear_remote. To my understanding, this clear takes
place on the CPU that is the target of the just migrated Linux task. At
that point, isn't the task's shadow no longer linked to the original
CPU's RPI queue? And if that remote CPU now calls rpi_next e.g. via
rpi_pop, doesn't that step cause a clearance as well, thus races with
the one we started via rpi_migrate?

Even if I'm wrong on the above, the conditions for manipulating
thread->rpi should be documented or even checked (i.e. when setting
thread->rpi, thread->sched must equal the local sched - right?).

> 
> Who is wrong then? Well, I tend to think that rpi_next() should not be
> called in the only cross-CPU context we have, namely rpi_clear_remote().
> In that very particular case, where we own the target thread, and we
> want to complete its recent migration to our local CPU.
> 
> This is why I was insisting on the notion of calling context for all
> rpi* calls, with respect to what threads may be touched by a given CPU. 
> 
> But, I am now worried by xnshadow_rpi_check(), which - by calling
> rpi_next() instead of simply peeking at the RPI queue head - introduces
> a potential race wrt the rpi backlink, when we deal with
> rpi_clear_remote() on the _same_ CPU.

OK, then we have at least one reason to not trust thread->rpi unless we
hold rpilock. That's my whole point.

BTW, the pattern I suggested is a fairly common one in the kernel due to
all its complex (because fine-grained) locking. And that I thought I had
to introduce raised my worries about RPI locking noticeably. :-/

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-04-27 10:40                       ` Jan Kiszka
@ 2010-05-01 17:26                         ` Philippe Gerum
  2010-05-01 17:47                           ` Jan Kiszka
  0 siblings, 1 reply; 18+ messages in thread
From: Philippe Gerum @ 2010-05-01 17:26 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

On Tue, 2010-04-27 at 12:40 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Tue, 2010-04-27 at 11:34 +0200, Jan Kiszka wrote:
> >> Philippe Gerum wrote:
> >>> On Tue, 2010-04-27 at 11:27 +0200, Jan Kiszka wrote:
> >>>> Philippe Gerum wrote:
> >>>>> On Tue, 2010-04-27 at 10:25 +0200, Jan Kiszka wrote:
> >>>>>> Philippe Gerum wrote:
> >>>>>>> On Tue, 2010-04-27 at 08:46 +0200, Jan Kiszka wrote:
> >>>>>>>> Philippe Gerum wrote:
> >>>>>>>>> On Mon, 2010-04-26 at 18:06 +0200, Jan Kiszka wrote:
> >>>>>>>>>> Jan Kiszka wrote:
> >>>>>>>>>>> Jan Kiszka wrote:
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
> >>>>>>>>>>>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
> >>>>>>>>>>>> this could explain the bug:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> static void rpi_clear_remote(struct xnthread *thread)
> >>>>>>>>>>>> {
> >>>>>>>>>>>> ...
> >>>>>>>>>>>>         rpi = thread->rpi;
> >>>>>>>>>>>>         if (unlikely(rpi == NULL))
> >>>>>>>>>>>>                 return;
> >>>>>>>>>>>>
> >>>>>>>>>>>>         xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>>>>>>>>>
> >>>>>>>>>>>>         /*
> >>>>>>>>>>>>          * The RPI slot - if present - is always valid, and won't
> >>>>>>>>>>>>          * change since the thread is resuming on this CPU and cannot
> >>>>>>>>>>>>          * migrate under our feet. We may grab the remote slot lock
> >>>>>>>>>>>>          * now.
> >>>>>>>>>>>>          */
> >>>>>>>>>>>>         xnsched_pop_rpi(thread);
> >>>>>>>>>>>>         thread->rpi = NULL;
> >>>>>>>>>>>>
> >>>>>>>>>>>> ...
> >>>>>>>>>>>>
> >>>>>>>>>>>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
> >>>>>>>>>>>> we check for it without any protection? Sounds racy. I think 'thread' is
> >>>>>>>>>>>> not only pointing to the current thread but could refer to a foreign one
> >>>>>>>>>>>> as well, right? Don't we need this:
> >>>>>>>>>>>>
> >>>>>>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>>>>>>>>>> index 872c37f..1f995d6 100644
> >>>>>>>>>>>> --- a/ksrc/nucleus/shadow.c
> >>>>>>>>>>>> +++ b/ksrc/nucleus/shadow.c
> >>>>>>>>>>>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>>>>>>>>>>>  
> >>>>>>>>>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>>>>>>>>>  
> >>>>>>>>>>>> +	/* Re-check under lock, someone may have cleared rpi by now. */
> >>>>>>>>>>>> +	if (unlikely(thread->rpi == NULL)) {
> >>>>>>>>>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>>>>>>>>>> +		return;
> >>>>>>>>>>>> +	}
> >>>>>>>>>>>> +
> >>>>>>>>>>>>  	/*
> >>>>>>>>>>>>  	 * The RPI slot - if present - is always valid, and won't
> >>>>>>>>>>>>  	 * change since the thread is resuming on this CPU and cannot
> >>>>>>>>>>> Another worry: Can thread->rpi become != rpi without being NULL? Or can
> >>>>>>>>>>> we really only race for clearance here?
> >>>>>>>>>>>
> >>>>>>>>>> I think so now, therefore I'm proposing this:
> >>>>>>>>>>
> >>>>>>>>>> ----------->
> >>>>>>>>>>
> >>>>>>>>>> Most RPI services work on the current task or the one to be scheduled in
> >>>>>>>>>> next, thus are naturally serialized. But rpi_next is not as it can walk
> >>>>>>>>>> the chain of RPI requests for a CPU independently. In that case,
> >>>>>>>>>> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
> >>>>>>>>>> former loses after checking thread->rpi for NULL, we will dereference a
> >>>>>>>>>> NULL pointer in xnsched_pop_rpi().
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
> >>>>>>>>>> ---
> >>>>>>>>>>  ksrc/nucleus/shadow.c |    9 +++++++++
> >>>>>>>>>>  1 files changed, 9 insertions(+), 0 deletions(-)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>>>>>>>> index 872c37f..cf7c08f 100644
> >>>>>>>>>> --- a/ksrc/nucleus/shadow.c
> >>>>>>>>>> +++ b/ksrc/nucleus/shadow.c
> >>>>>>>>>> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>>>>>>>>>  	xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>>>>>>>  
> >>>>>>>>>>  	/*
> >>>>>>>>>> +	 * Re-check under lock. Someone may have invoked rpi_next and cleared
> >>>>>>>>>> +	 * rpi by now.
> >>>>>>>>>> +	 */
> >>>>>>>>>> +	if (unlikely(!rpi_p(thread))) {
> >>>>>>>>>> +		xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>>>>>>>> +		return;
> >>>>>>>>>> +	}
> >>>>>>>>>> +
> >>>>>>>>>> +	/*
> >>>>>>>>>>  	 * The RPI slot - if present - is always valid, and won't
> >>>>>>>>>>  	 * change since the thread is resuming on this CPU and cannot
> >>>>>>>>>>  	 * migrate under our feet. We may grab the remote slot lock
> >>>>>>>>>>
> >>>>>>>>> The suggested patch papers over the actual issue, which is that
> >>>>>>>>> rpi_clear_remote() may not invoke rpi_next(), because it may only affect
> >>>>>>>> I don't think that in our case rpi_clear_remote called rpi_next and
> >>>>>>>> therefore crashed. It should rather have been the scenario of both
> >>>>>>>> running in parallel on different CPUs, the former on behalf of a
> >>>>>>>> migrated shadow that wants to clear its remainders on the remote CPU,
> >>>>>>>> the latter on that CPU, picking a new top RPI after some other shadow
> >>>>>>>> was removed from the queue. Is this a possible scenario, and would your
> >>>>>>>> patch cover it?
> >>>>>>>>
> >>>>>>>>> the RPI state of the argument thread which must be a local one, and not
> >>>>>>>>> that of any random thread that happens to be linked to the remote RPI
> >>>>>>>>> queue.
> >>>>>>>>>
> >>>>>>>>> By calling rpi_next(), rpi_clear_remote() shoots itself in the foot,
> >>>>>>>>> allowing a concurrent invocation of itself on a remote CPU, to fiddle
> >>>>>>>>> with the rpi backlink of a thread which is not active on the
> >>>>>>>>> local/per-cpu Xenomai scheduler instance, which is the point where
> >>>>>>>>> things start to hit the crapper.
> >>>>>>>>>
> >>>>>>>>> Now, unless I can't even synchronize the couple of neurons I have left
> >>>>>>>>> at this hour, the following patch should better fix the issue, because
> >>>>>>>>> it restores the two basic rules that apply to the whole RPI machinery,
> >>>>>>>>> namely:
> >>>>>>>>>
> >>>>>>>>> - rpi_* calls may only alter the contents of the local scheduler's RPI
> >>>>>>>>> queue, with the notable exception of rpi_clear_remote() which may remove
> >>>>>>>>> the given _local_ thread only, from a remote RPI slot.
> >>>>>>>>>
> >>>>>>>>> - rpi_* calls may only change the RPI state of threads which are
> >>>>>>>>> controlled by the local Xenomai scheduler instance, except rpi_push()
> >>>>>>>>> when called for setting up the RPI state of an emerging thread, in which
> >>>>>>>>> case this is a no-conflict zone.
> >>>>>>>>>
> >>>>>>>>> That breakage was introduced in the early 2.5.1 timeframe, so 2.4.x
> >>>>>>>>> should be immune from this particular bug.
> >>>>>>>>>
> >>>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>>>>>>> index 872c37f..1397ed1 100644
> >>>>>>>>> --- a/ksrc/nucleus/shadow.c
> >>>>>>>>> +++ b/ksrc/nucleus/shadow.c
> >>>>>>>>> @@ -340,7 +340,12 @@ static void rpi_clear_remote(struct xnthread *thread)
> >>>>>>>>>  	xnsched_pop_rpi(thread);
> >>>>>>>>>  	thread->rpi = NULL;
> >>>>>>>>>  
> >>>>>>>>> -	if (rpi_next(rpi, s) == NULL)
> >>>>>>>>> +	/*
> >>>>>>>>> +	 * If the remote RPI queue was emptied, prepare for kicking
> >>>>>>>>> +	 * xnshadow_rpi_check() on the relevant CPU to demote the root
> >>>>>>>>> +	 * thread priority there.
> >>>>>>>>> +	 */
> >>>>>>>>> +	if (xnsched_peek_rpi(rpi) == NULL) /* No rpi_next() here. */
> >>>>>>>>>  		rcpu = xnsched_cpu(rpi);
> >>>>>>>>>  
> >>>>>>>>>  	xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>> I have to confess, I do not understand how the patch may relate to our
> >>>>>>>> crash. But that's because I still only have a semi-understanding of this
> >>>>>>>> frightening complex RPI code. However, the fact that thread->rpi is now
> >>>>>>>> again only checked outside its protecting lock leaves me with a very bad
> >>>>>>>> feeling...
> >>>>>>>>
> >>>>>>> My point is that we should not have to protect a section of code which
> >>>>>>> may never conflict in any case, by design; we will likely agree that
> >>>>>>> sprinkling locks everywhere to get a warm and fuzzy feeling is no
> >>>>>>> solution, it's actually a significant source of regression.
> >>>>>>>
> >>>>>>> The idea, behind keeping most rpi_* operations applicable to locally
> >>>>>>> scheduled threads, is to introduce such a design, even when remote RPI
> >>>>>>> slots are involved. thread->sched == xnpod_current_sched() for each
> >>>>>>> rpi_*(sched, ...) calls is what is important in this logic. Another
> >>>>>>> original assumption was that no RPI updates could be done in interrupt
> >>>>>>> context, which is now wrong due to the change in xnshadow_rpi_check().
> >>>>>>>
> >>>>>>> In short: we have to make sure that rpi_next() does not break the basic
> >>>>>>> assumptions of the RPI core, first.
> >>>>>> Please check my scenario again: My concern is that a thread can be
> >>>>>> queued for a short while on a remote sched,
> >>>>> No, it can't, that is the crux of the matter, well, at least, this
> >>>>> should not be possible if the basic assumptions are preserved (have a
> >>>>> look at the rpi_clear_remote() callers, the target thread may not
> >>>>> migrate or be scheduled in linux-wise, or exit RPI via rpi_pop() during
> >>>>> the call -- all places where the rpi backlink may be cleared). Only a
> >>>>> caller operating from the local CPU should be allowed to alter the RPI
> >>>>> state of threads linked to the RPI slot of that same CPU.
> >>>>>
> >>>>> rpi_clear_remote() is not even an exception to this, since it alters a
> >>>>> remote RPI slot, but for a thread which does run on the local CPU.
> >>>>>
> >>>>>>  and while that is the case,
> >>>>>> it can be manipulated (/wrt ->rpi) _concurrently_ (as we do not hold the
> >>>>>> remote rpilock all the time). I'm quite sure now that your patch does
> >>>>>> not change this.
> >>>>> My patch attempts fixing the core issue, not just plugging one of its
> >>>>> bad outcomes.
> >>>>>
> >>>>> Again, the point is not to pretend that your patch is wrong, and it
> >>>>> surely "plugs" one issue we have due to rpi_next(). The point is to make
> >>>>> sure that all issues are covered, by fixing the usage of rpi_next(), or
> >>>>> find another way to fix what rpi_next() was supposed to fix in the first
> >>>>> place.
> >>>> So, if you are right, we could (in theory) replace rpilock with local
> >>>> IRQ disabling? That would be the proof from me that it doesn't matter to
> >>>> test thread->rpi outside the lock.
> >>> Noo, you did not get it yet. The rpilock is protecting the per-cpu RPI
> >>> queues, NOT thread->rpi. 
> >> So linking a thread to an RPI queue and setting its ->rpi are not always
> >> related?
> > 
> > Exactly. Normally, a remote CPU should not be allowed to compete with
> > another one for updating the rpi backlink of a thread it does NOT own,
> > meaning not linked to the local scheduler. 
> > 
> > My understanding is that rpi_next() is deliberately breaking that rule
> > when called from rpi_clear_remote() for a non-local scheduler, therefore
> > altering rpi backlinks of threads which do NOT belong to the local
> > scheduler.
> 
> Just dug out the backtrace again: Where we saw the crash was (likely -
> lots of inline functions) in the chain schedule_event -> rpi_switch ->
> rpi_migrate -> rpi_clear_remote. To my understanding, this clear takes
> place on the CPU that is the target of the just migrated Linux task. At
> that point, isn't the task's shadow no longer linked to the original
> CPU's RPI queue? And if that remote CPU now calls rpi_next e.g. via
> rpi_pop, doesn't that step cause a clearance as well, thus races with
> the one we started via rpi_migrate?
> 
> Even if I'm wrong on the above, the conditions for manipulating
> thread->rpi should be documented or even checked (i.e. when setting
> thread->rpi, thread->sched must equal the local sched - right?).
> 
> > 
> > Who is wrong then? Well, I tend to think that rpi_next() should not be
> > called in the only cross-CPU context we have, namely rpi_clear_remote().
> > In that very particular case, where we own the target thread, and we
> > want to complete its recent migration to our local CPU.
> > 
> > This is why I was insisting on the notion of calling context for all
> > rpi* calls, with respect to what threads may be touched by a given CPU. 
> > 
> > But, I am now worried by xnshadow_rpi_check(), which - by calling
> > rpi_next() instead of simply peeking at the RPI queue head - introduces
> > a potential race wrt the rpi backlink, when we deal with
> > rpi_clear_remote() on the _same_ CPU.
> 
> OK, then we have at least one reason to not trust thread->rpi unless we
> hold rpilock. That's my whole point.
> 
> BTW, the pattern I suggested is a fairly common one in the kernel due to
> all its complex (because fine-grained) locking. And that I thought I had
> to introduce raised my worries about RPI locking noticeably. :-/

Don't you think that, quoting yourself:

"I have to confess, I do not understand how the patch may relate to our
crash. But that's because I still only have a semi-understanding of this
frightening complex RPI code"

does not fit that well with any kind of strong assertion made later on
the same topic, not backed by facts?

So, I'm suggesting that we move on, and end this discussion in a
positive manner, i.e. by fixing this code. I hear your concerns, like I
always do, and I'm trying to find a solution that does not paper over
another issue. I guess we should be able to settle on this.

I pushed two patches in my for-upstream queue, with lengthy comments to
explain what they do and why this is needed:

http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=ac5c739dabcb14334c2e390a9e3064f11f97283c
http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=d3242401b8e1cf2c28822f801b681194243b4394

- the first patch is a plain revert of the commit introducing
rpi_next(), which caused the bug you observed in SMP mode, and that your
patch plugged successfully.

- the second patch is the proper fix for the issue rpi_next() tried to
address.

Could you please try them, and report whether this also fixes the issue
for you? In the meantime, I will be analyzing the RPI code once again,
to check whether your patch still covers a possible case, or not.

TIA,

-- 
Philippe.




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-05-01 17:26                         ` Philippe Gerum
@ 2010-05-01 17:47                           ` Jan Kiszka
  2010-05-01 18:59                             ` Philippe Gerum
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kiszka @ 2010-05-01 17:47 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai-core

[-- Attachment #1: Type: text/plain, Size: 2137 bytes --]

Philippe Gerum wrote:
> Don't you think that, quoting yourself:
> 
> "I have to confess, I do not understand how the patch may relate to our
> crash. But that's because I still only have a semi-understanding of this
> frightening complex RPI code"
> 
> does not fit that well with any kind of strong assertion made later on
> the same topic, not backed by facts?

See, as long as you do not directly explain why my concerns on this
naturally racy construct were not valid (I brought up quite a few
concrete questions), I couldn't help raising them over and over again.
This has, in fact, nothing to do with understanding the RPI code in all
its details. It's about reviewing basic patterns of it. But now that the
critical bit is gone, I'm surely no longer concerned. :)

> 
> So, I'm suggesting that we move on, and end this discussion in a
> positive manner, i.e. by fixing this code. I hear your concerns, like I
> always do, and I'm trying to find a solution that does not paper over
> another issue. I guess we should be able to settle on this.
> 
> I pushed two patches in my for-upstream queue, with lengthy comments to
> explain what they do and why this is needed:
> 
> http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=ac5c739dabcb14334c2e390a9e3064f11f97283c
> http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=d3242401b8e1cf2c28822f801b681194243b4394
> 
> - the first patch is a plain revert of the commit introducing
> rpi_next(), which caused the bug you observed in SMP mode, and that your
> patch plugged successfully.
> 
> - the second patch is the proper fix for the issue rpi_next() tried to
> address.
> 
> Could you please try them, and report whether this also fixes the issue
> for you? In the meantime, I will be analyzing the RPI code once again,
> to check whether your patch still covers a possible case, or not.

I will try my best, but the issue showed up only once in countless
application runs. We will role out the patches at the next chance (we
already had to push my workaround as we need to deliver the fastsynch
fix, so it may take a while).

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-05-01 17:47                           ` Jan Kiszka
@ 2010-05-01 18:59                             ` Philippe Gerum
  2010-05-02  9:08                               ` Jan Kiszka
  0 siblings, 1 reply; 18+ messages in thread
From: Philippe Gerum @ 2010-05-01 18:59 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

On Sat, 2010-05-01 at 19:47 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > Don't you think that, quoting yourself:
> > 
> > "I have to confess, I do not understand how the patch may relate to our
> > crash. But that's because I still only have a semi-understanding of this
> > frightening complex RPI code"
> > 
> > does not fit that well with any kind of strong assertion made later on
> > the same topic, not backed by facts?
> 
> See, as long as you do not directly explain why my concerns on this
> naturally racy construct were not valid (I brought up quite a few
> concrete questions),

Which were answered. And you seem to consider that a construct is racy
by design without taking care of understanding why they could be
perfectly valid in the proper context. This is what I was pointing out,
all time long.

>  I couldn't help raising them over and over again.
> This has, in fact, nothing to do with understanding the RPI code in all
> its details. It's about reviewing basic patterns of it. But now that the
> critical bit is gone, I'm surely no longer concerned. :)
> 
> > 
> > So, I'm suggesting that we move on, and end this discussion in a
> > positive manner, i.e. by fixing this code. I hear your concerns, like I
> > always do, and I'm trying to find a solution that does not paper over
> > another issue. I guess we should be able to settle on this.
> > 
> > I pushed two patches in my for-upstream queue, with lengthy comments to
> > explain what they do and why this is needed:
> > 
> > http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=ac5c739dabcb14334c2e390a9e3064f11f97283c
> > http://git.xenomai.org/?p=xenomai-rpm.git;a=commit;h=d3242401b8e1cf2c28822f801b681194243b4394
> > 
> > - the first patch is a plain revert of the commit introducing
> > rpi_next(), which caused the bug you observed in SMP mode, and that your
> > patch plugged successfully.
> > 
> > - the second patch is the proper fix for the issue rpi_next() tried to
> > address.
> > 
> > Could you please try them, and report whether this also fixes the issue
> > for you? In the meantime, I will be analyzing the RPI code once again,
> > to check whether your patch still covers a possible case, or not.
> 
> I will try my best, but the issue showed up only once in countless
> application runs. We will role out the patches at the next chance (we
> already had to push my workaround as we need to deliver the fastsynch
> fix, so it may take a while).
> 

No problem with that, you were the one hit by the issue so far.

> Jan
> 


-- 
Philippe.




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next
  2010-05-01 18:59                             ` Philippe Gerum
@ 2010-05-02  9:08                               ` Jan Kiszka
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Kiszka @ 2010-05-02  9:08 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai-core

[-- Attachment #1: Type: text/plain, Size: 965 bytes --]

Philippe Gerum wrote:
> On Sat, 2010-05-01 at 19:47 +0200, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Don't you think that, quoting yourself:
>>>
>>> "I have to confess, I do not understand how the patch may relate to our
>>> crash. But that's because I still only have a semi-understanding of this
>>> frightening complex RPI code"
>>>
>>> does not fit that well with any kind of strong assertion made later on
>>> the same topic, not backed by facts?
>> See, as long as you do not directly explain why my concerns on this
>> naturally racy construct were not valid (I brought up quite a few
>> concrete questions),
> 
> Which were answered. And you seem to consider that a construct is racy
> by design without taking care of understanding why they could be
> perfectly valid in the proper context. This is what I was pointing out,
> all time long.

Depends on the POV. But what counts is the outcome of this, and that's a
fix.

Jan


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 257 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2010-05-02  9:08 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-04-26 13:43 [Xenomai-core] Race in rpi_clear_remote? Jan Kiszka
2010-04-26 13:51 ` Jan Kiszka
2010-04-26 16:06   ` [Xenomai-core] [PATCH] nucleus: Plug race between rpi_clear_remote and rpi_next Jan Kiszka
2010-04-26 16:11     ` Jan Kiszka
2010-04-27  1:19     ` Philippe Gerum
2010-04-27  6:46       ` Jan Kiszka
2010-04-27  8:13         ` Philippe Gerum
2010-04-27  8:25           ` Jan Kiszka
2010-04-27  9:12             ` Philippe Gerum
2010-04-27  9:27               ` Jan Kiszka
2010-04-27  9:32                 ` Philippe Gerum
2010-04-27  9:34                   ` Jan Kiszka
2010-04-27  9:51                     ` Philippe Gerum
2010-04-27 10:40                       ` Jan Kiszka
2010-05-01 17:26                         ` Philippe Gerum
2010-05-01 17:47                           ` Jan Kiszka
2010-05-01 18:59                             ` Philippe Gerum
2010-05-02  9:08                               ` Jan Kiszka

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.