* [Xenomai-help] mlockall error after calling mlockall()
@ 2010-02-25 21:12 Charlton, John
  2010-02-25 21:25 ` Gilles Chanteperdrix
  2010-02-25 21:38 ` Jan Kiszka
  0 siblings, 2 replies; 31+ messages in thread
From: Charlton, John @ 2010-02-25 21:12 UTC (permalink / raw)
  To: xenomai

I have a Xenomai application that runs without problems with xenomai-2.4.6.1/linux-2.6.27.7. When run under xenomai-2.5.1/linux-2.6.32.7 it fails with the warning: Xenomai: process memory not locked (missing mlockall?).  I verified that mlockall() is being called, without error, before any Xenomai calls are made, as follows:

  // lock process in memory for xenomai
  // (mlockall() returns -1 and sets errno on failure)
  int err = mlockall(MCL_CURRENT | MCL_FUTURE);
  if (err)
  {
    fprintf(stderr, "TimerInit: mlockall failed to lock process into memory: %s\n", strerror(errno));
    exit(1);
  }
  fprintf(stdout, "TimerInit: Locked process memory\n");

The relevant message in dmesg output is:
Xenomai: watchdog triggered -- signaling runaway thread 'timerloop-2166'

This occurs when a native Xenomai task is created (by the CanFestival library) with the following code, which itself reports no errors:

void StartTimerLoop(TimerCallback_t _init_callback)
{
  int ret = 0;
  stop_timer = 0;
  init_callback = _init_callback;

  char taskname[32];
  snprintf(taskname, sizeof(taskname), "timerloop-%d", getpid());

  printf("Starting timerloop task %s\n", taskname);
  /* create timerloop_task */
  ret = rt_task_create(&timerloop_task, taskname, 0, 50, T_JOINABLE);
  if (ret) {
    /* native skin calls return a negative error code directly */
    printf("Failed to create timerloop_task, error %d\n", ret);
    return;
  }

  /* start timerloop_task */
  ret = rt_task_start(&timerloop_task, &timerloop_task_proc, NULL);
  if (ret) {
    printf("Failed to start timerloop_task, error %d\n", ret);
    goto error;
  }

  return;

error:
  cleanup_all();
}

Debug output is displayed from the timerloop_task indicating that the task starts, but the mlockall warning is displayed before those outputs and the application already begins to shut down.

xeno-test runs without errors.  The trivial_periodic application also compiles and runs without error.


--John




* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-02-25 21:12 [Xenomai-help] mlockall error after calling mlockall() Charlton, John
@ 2010-02-25 21:25 ` Gilles Chanteperdrix
  2010-02-25 21:38 ` Jan Kiszka
  1 sibling, 0 replies; 31+ messages in thread
From: Gilles Chanteperdrix @ 2010-02-25 21:25 UTC (permalink / raw)
  To: Charlton, John; +Cc: xenomai

Charlton, John wrote:
> I have a Xenomai application that runs without problems with
> xenomai-2.4.6.1/linux-2.6.27.7. When run under
> xenomai-2.5.1/linux-2.6.32.7 it fails with the warning: Xenomai:
> process memory not locked (missing mlockall?).  I verified that
> mlockall() is being called, without error, before any Xenomai calls
> are made, as follows:
> 
> // lock process in memory for xenomai
> // (mlockall() returns -1 and sets errno on failure)
> int err = mlockall(MCL_CURRENT | MCL_FUTURE);
> if (err)
> {
>   fprintf(stderr, "TimerInit: mlockall failed to lock process into memory: %s\n", strerror(errno));
>   exit(1);
> }
> fprintf(stdout, "TimerInit: Locked process memory\n");
> 
> The relevant message in dmesg output is:
> Xenomai: watchdog triggered -- signaling runaway thread 'timerloop-2166'
> 
> This occurs when a native Xenomai task is created (by the CanFestival
> library) with the following code, which itself reports no errors:
> 
> void StartTimerLoop(TimerCallback_t _init_callback)
> {
>   int ret = 0;
>   stop_timer = 0;
>   init_callback = _init_callback;
>
>   char taskname[32];
>   snprintf(taskname, sizeof(taskname), "timerloop-%d", getpid());
>
>   printf("Starting timerloop task %s\n", taskname);
>   /* create timerloop_task */
>   ret = rt_task_create(&timerloop_task, taskname, 0, 50, T_JOINABLE);
>   if (ret) {
>     /* native skin calls return a negative error code directly */
>     printf("Failed to create timerloop_task, error %d\n", ret);
>     return;
>   }
>
>   /* start timerloop_task */
>   ret = rt_task_start(&timerloop_task, &timerloop_task_proc, NULL);
>   if (ret) {
>     printf("Failed to start timerloop_task, error %d\n", ret);
>     goto error;
>   }
>
>   return;
>
> error:
>   cleanup_all();
> }
> 
> Debug output is displayed from the timerloop_task indicating that the
> task starts, but the mlockall warning is displayed before those
> outputs and the application already begins to shut down.
> 
> xeno-test runs without errors.  The trivial_periodic application also
> compiles and runs without error.

Please post a simple, self-contained test case that we can compile and
run to reproduce the issue you are seeing.

The flags you pass to the Xenomai configure script and the command lines
you use to compile the program also matter.

At first sight, your real problem is that your program has an infinite
loop in primary mode, which causes the watchdog to trigger; the printed
message then gets the cause wrong. You can try reverting the commit
803ff2e093a4260054e1e9e59114bc4e656c84bd
to see whether you get a more meaningful message.
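
A minimal sketch of such a runaway loop, for illustration only
(hypothetical code, not taken from your application):

void timerloop_task_proc(void *arg)
{
	for (;;) {
		/* busy-waits in primary mode forever: the task never
		   calls a blocking service such as rt_task_sleep() or
		   rt_cond_wait(), so Linux never runs again on this CPU
		   and the nucleus watchdog eventually fires */
	}
}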

-- 
					    Gilles.



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-02-25 21:12 [Xenomai-help] mlockall error after calling mlockall() Charlton, John
  2010-02-25 21:25 ` Gilles Chanteperdrix
@ 2010-02-25 21:38 ` Jan Kiszka
  2010-02-25 22:42   ` Jan Kiszka
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Kiszka @ 2010-02-25 21:38 UTC (permalink / raw)
  To: Charlton, John; +Cc: xenomai

Charlton, John wrote:
> I have a Xenomai application that runs without problems with xenomai-2.4.6.1/linux-2.6.27.7. When run under xenomai-2.5.1/linux-2.6.32.7 it fails with the warning: Xenomai: process memory not locked (missing mlockall?).  I verified that mlockall() is being called, without error, before any Xenomai calls are made, as follows:
> 

I'm afraid that warning is misleading.

>   // lock process in memory for xenomai
>   // (mlockall() returns -1 and sets errno on failure)
>   int err = mlockall(MCL_CURRENT | MCL_FUTURE);
>   if (err)
>   {
>     fprintf(stderr, "TimerInit: mlockall failed to lock process into memory: %s\n", strerror(errno));
>     exit(1);
>   }
>   fprintf(stdout, "TimerInit: Locked process memory\n");
> 
> The relevant message in dmesg output is:
> Xenomai: watchdog triggered -- signaling runaway thread 'timerloop-2166'

That is the actual error message, and it likely points to an application
bug. Try attaching gdb to your application; it will catch the signal and
let you dump a stack trace that should give some hint about what loops
endlessly here.
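
For instance (assuming the binary is called 'myapp'):

  $ gdb -p $(pidof myapp)
  (gdb) continue               <- wait here until SIGXCPU arrives
  (gdb) thread apply all bt    <- then dump all stack traces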

> 
> This occurs when a native Xenomai task is created (by the CanFestival library) with the following code, which itself reports no errors:
> 
> void StartTimerLoop(TimerCallback_t _init_callback)
> {
>   int ret = 0;
>   stop_timer = 0;
>   init_callback = _init_callback;
> 
>   char taskname[32];
>   snprintf(taskname, sizeof(taskname), "timerloop-%d", getpid());
> 
>   printf("Starting timerloop task %s\n", taskname);
>   /* create timerloop_task */
>   ret = rt_task_create(&timerloop_task, taskname, 0, 50, T_JOINABLE);
>   if (ret) {
>     /* native skin calls return a negative error code directly */
>     printf("Failed to create timerloop_task, error %d\n", ret);
>     return;
>   }
> 
>   /* start timerloop_task */
>   ret = rt_task_start(&timerloop_task, &timerloop_task_proc, NULL);
>   if (ret) {
>     printf("Failed to start timerloop_task, error %d\n", ret);
>     goto error;
>   }
> 
>   return;
> 
> error:
>   cleanup_all();
> }
> 
> Debug output is displayed from the timerloop_task indicating that the task starts, but the mlockall warning is displayed before those outputs and the application already begins to shut down.

Far more interesting is what timerloop_task_proc looks like. Can you post
a complete test case? Besides allowing us to help you with the runaway
timerloop, it would let me understand why this wrong warning is dumped in
user space.

Jan



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-02-25 21:38 ` Jan Kiszka
@ 2010-02-25 22:42   ` Jan Kiszka
  2010-02-25 22:54     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Kiszka @ 2010-02-25 22:42 UTC (permalink / raw)
  To: Charlton, John; +Cc: xenomai

Jan Kiszka wrote:
> Charlton, John wrote:
>> [...]
>>
> 
> I'm afraid that warning is misleading.

And this should fix the error reporting (without any signal handler
installed, your application will still terminate, though):

------>

From: Jan Kiszka <jan.kiszka@domain.hid>

Avoid false error reports of xeno_handle_mlock_alert

We already propagate the SIGDEBUG reason to user space. Use it to tell
SIGDEBUG_NOMLOCK apart from other triggers of this signal, e.g. the
watchdog. This also allows us to drop xeno_sigxcpu_no_mlock.

Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
---
 include/asm-generic/bits/bind.h        |    6 +++---
 include/asm-generic/bits/mlock_alert.h |    7 ++-----
 src/skins/native/task.c                |   15 ---------------
 src/skins/posix/thread.c               |    9 ---------
 4 files changed, 5 insertions(+), 32 deletions(-)

diff --git a/include/asm-generic/bits/bind.h b/include/asm-generic/bits/bind.h
index 7267e0d..1aeffb5 100644
--- a/include/asm-generic/bits/bind.h
+++ b/include/asm-generic/bits/bind.h
@@ -9,7 +9,7 @@ union xnsiginfo;
 
 typedef void xnsighandler(union xnsiginfo *si);
 
-void xeno_handle_mlock_alert(int sig);
+void xeno_handle_mlock_alert(int sig, siginfo_t *si, void *context);
 
 int 
 xeno_bind_skin_opt(unsigned skin_magic, const char *skin, 
@@ -29,9 +29,9 @@ xeno_bind_skin(unsigned skin_magic, const char *skin,
 		exit(EXIT_FAILURE);
 	}
 
-	sa.sa_handler = &xeno_handle_mlock_alert;
+	sa.sa_sigaction = xeno_handle_mlock_alert;
 	sigemptyset(&sa.sa_mask);
-	sa.sa_flags = 0;
+	sa.sa_flags = SA_SIGINFO;
 	sigaction(SIGXCPU, &sa, NULL);
 
 	return muxid;
diff --git a/include/asm-generic/bits/mlock_alert.h b/include/asm-generic/bits/mlock_alert.h
index 6c7217d..eded3c1 100644
--- a/include/asm-generic/bits/mlock_alert.h
+++ b/include/asm-generic/bits/mlock_alert.h
@@ -6,15 +6,12 @@
 #include <signal.h>
 #include <pthread.h>
 
-__attribute__ ((weak))
-int xeno_sigxcpu_no_mlock = 1;
-
 __attribute__ ((weak, visibility ("internal")))
-void xeno_handle_mlock_alert(int sig)
+void xeno_handle_mlock_alert(int sig, siginfo_t *si, void *context)
 {
 	struct sigaction sa;
 
-	if (xeno_sigxcpu_no_mlock) {
+	if (si->si_value.sival_int == SIGDEBUG_NOMLOCK) {
 		fprintf(stderr, "Xenomai: process memory not locked "
 			"(missing mlockall?)\n");
 		fflush(stderr);
diff --git a/src/skins/native/task.c b/src/skins/native/task.c
index ba04a27..6312f2f 100644
--- a/src/skins/native/task.c
+++ b/src/skins/native/task.c
@@ -41,7 +41,6 @@ extern pthread_key_t __native_tskey;
 #endif /* !HAVE___THREAD */
 
 extern int __native_muxid;
-extern int xeno_sigxcpu_no_mlock;
 
 /* Public Xenomai interface. */
 
@@ -97,9 +96,6 @@ static void *rt_task_trampoline(void *cookie)
 
 	xeno_set_current();
 
-	if (iargs->mode & T_WARNSW)
-		xeno_sigxcpu_no_mlock = 0;
-
 	/* Wait on the barrier for the task to be started. The barrier
 	   could be released in order to process Linux signals while the
 	   Xenomai shadow is still dormant; in such a case, resume wait. */
@@ -231,9 +227,6 @@ int rt_task_shadow(RT_TASK *task, const char *name, int prio, int mode)
 
 	xeno_set_current();
 
-	if (mode & T_WARNSW)
-		xeno_sigxcpu_no_mlock = 0;
-
 	return 0;
 
   fail:
@@ -347,14 +340,6 @@ int rt_task_set_mode(int clrmask, int setmask, int *oldmode)
 				__native_task_set_mode, clrmask, setmask,
 				oldmode);
 
-	/* Silently deactivate our internal handler for SIGXCPU. At that
-	   point, we know that the process memory has been properly
-	   locked, otherwise we would have caught the latter signal upon
-	   thread creation. */
-
-	if (!err && xeno_sigxcpu_no_mlock)
-		xeno_sigxcpu_no_mlock = !(setmask & T_WARNSW);
-
 	return err;
 }
 
diff --git a/src/skins/posix/thread.c b/src/skins/posix/thread.c
index d565d52..8084a20 100644
--- a/src/skins/posix/thread.c
+++ b/src/skins/posix/thread.c
@@ -330,20 +330,11 @@ int pthread_wait_np(unsigned long *overruns_r)
 
 int pthread_set_mode_np(int clrmask, int setmask)
 {
-	extern int xeno_sigxcpu_no_mlock;
 	int err;
 
 	err = -XENOMAI_SKINCALL2(__pse51_muxid,
 				 __pse51_thread_set_mode, clrmask, setmask);
 
-	/* Silently deactivate our internal handler for SIGXCPU. At that
-	   point, we know that the process memory has been properly
-	   locked, otherwise we would have caught the latter signal upon
-	   thread creation. */
-
-	if (!err && xeno_sigxcpu_no_mlock)
-		xeno_sigxcpu_no_mlock = !(setmask & PTHREAD_WARNSW);
-
 	return err;
 }
 
-- 
1.6.0.2
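
With a change along these lines, an application that wants to catch
SIGXCPU itself could install an SA_SIGINFO handler and decode the
reason on its own. A minimal sketch (assuming the SIGDEBUG_* constants
used in the patch above are visible to the application):

#include <signal.h>
#include <stdio.h>

static void sigxcpu_handler(int sig, siginfo_t *si, void *context)
{
	/* si_value carries the SIGDEBUG reason, as propagated above */
	if (si->si_value.sival_int == SIGDEBUG_NOMLOCK)
		fprintf(stderr, "Xenomai: process memory not locked\n");
	else
		fprintf(stderr, "Xenomai: SIGXCPU for another reason "
			"(e.g. the watchdog)\n");
}

static void install_sigxcpu_handler(void)
{
	struct sigaction sa;

	sa.sa_sigaction = sigxcpu_handler;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_SIGINFO;
	sigaction(SIGXCPU, &sa, NULL);
}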



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-02-25 22:42   ` Jan Kiszka
@ 2010-02-25 22:54     ` Gilles Chanteperdrix
  2010-02-25 23:01       ` Jan Kiszka
  0 siblings, 1 reply; 31+ messages in thread
From: Gilles Chanteperdrix @ 2010-02-25 22:54 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai

Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Charlton, John wrote:
>>> [...]
>>>
>> I'm afraid that warning is misleading.
> 
> And this should fix the error reporting (without any signal handler
> installed, your application will still terminate, though):

Arg, it will conflict with the libxenomai thing. I am punished for not
having pushed my work earlier.

Did you test it? Or should I wait for John to test it before merging it?

-- 
					    Gilles.



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-02-25 22:54     ` Gilles Chanteperdrix
@ 2010-02-25 23:01       ` Jan Kiszka
  2010-02-26 16:47         ` Charlton, John
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Kiszka @ 2010-02-25 23:01 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Charlton, John wrote:
>>>> [...]
>>>>
>>> I'm afraid that warning is misleading.
>> And this should fix the error reporting (without any signal handler
>> installed, your application will still terminate, though):
> 
> Arg, it will conflict with the libxenomai thing. I am punished for not
> having pushed my work earlier.
> 
> Did you test it? Or should I wait for John to test it before merging it?
> 

Of course it's untested. :)

Yes, I forgot to state explicitly that it would be nice of John to check
this with his application.

I've no problem rebasing this later on over your work, the changes are
trivial enough.

Jan



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-02-25 23:01       ` Jan Kiszka
@ 2010-02-26 16:47         ` Charlton, John
  2010-02-26 16:53           ` Gilles Chanteperdrix
  0 siblings, 1 reply; 31+ messages in thread
From: Charlton, John @ 2010-02-26 16:47 UTC (permalink / raw)
  To: 'jan.kiszka@domain.hid', Gilles Chanteperdrix; +Cc: xenomai

I am attaching a simpler portion of the timer loop that demonstrates the problem.  As I said before, the problem does not occur with xenomai-2.4.6.1/linux-2.6.27.7 but does occur with xenomai-2.5.1/linux-2.6.32.7.

I had some problems applying your patch, so I will let you know if I am able to do that.

Thanks,
--John 

-----Original Message-----
From: jan.kiszka@domain.hid [mailto:jan.kiszka@domain.hid]
Sent: Thursday, February 25, 2010 6:01 PM
To: Gilles Chanteperdrix
Cc: Charlton, John; xenomai@xenomai.org
Subject: Re: [Xenomai-help] mlockall error after calling mlockall()

[...]


[-- Attachment #2: timers_xeno.tgz --]
[-- Type: application/x-compressed, Size: 12849 bytes --]


* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-02-26 16:47         ` Charlton, John
@ 2010-02-26 16:53           ` Gilles Chanteperdrix
  2010-02-26 17:15             ` Charlton, John
  2010-03-01 15:52             ` Charlton, John
  0 siblings, 2 replies; 31+ messages in thread
From: Gilles Chanteperdrix @ 2010-02-26 16:53 UTC (permalink / raw)
  To: Charlton, John; +Cc: xenomai, 'jan.kiszka@domain.hid'

Charlton, John wrote:
> I am attaching a simpler portion of the timer loop that demonstrates
> the problem.  As I said before the problem does not occur with
> xenomai-2.4.6.1/linux-2.6.27.7 but does occur with
> xenomai-2.5.1/linux-2.6.32.7.
> 
> I had some problems applying your patch so I will let you know if I
> am able to do that.

But did you have the watchdog enabled with 2.4.6.1?

-- 
					    Gilles.



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-02-26 16:53           ` Gilles Chanteperdrix
@ 2010-02-26 17:15             ` Charlton, John
  2010-03-01 15:52             ` Charlton, John
  1 sibling, 0 replies; 31+ messages in thread
From: Charlton, John @ 2010-02-26 17:15 UTC (permalink / raw)
  To: 'Gilles Chanteperdrix'; +Cc: xenomai, 'jan.kiszka@domain.hid'

Yes, the watchdog is enabled with a 4 second timeout in both the 2.4.6.1 and 2.5.1 builds.
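
That is, in the kernel configuration (assuming the usual option names
of a 2.5.x tree):

  CONFIG_XENO_OPT_WATCHDOG=y
  CONFIG_XENO_OPT_WATCHDOG_TIMEOUT=4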

--

John 

-----Original Message-----
From: Gilles Chanteperdrix [mailto:gilles.chanteperdrix@xenomai.org]
Sent: Friday, February 26, 2010 11:53 AM
To: Charlton, John
Cc: 'jan.kiszka@domain.hid'; xenomai@xenomai.org
Subject: Re: [Xenomai-help] mlockall error after calling mlockall()

[...]

But did you have the watchdog enabled with 2.4.6.1?

-- 
					    Gilles.



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-02-26 16:53           ` Gilles Chanteperdrix
  2010-02-26 17:15             ` Charlton, John
@ 2010-03-01 15:52             ` Charlton, John
  2010-03-01 20:30               ` Charlton, John
  2010-03-01 20:30               ` Jan Kiszka
  1 sibling, 2 replies; 31+ messages in thread
From: Charlton, John @ 2010-03-01 15:52 UTC (permalink / raw)
  To: 'Gilles Chanteperdrix'; +Cc: xenomai, 'jan.kiszka@domain.hid'

I am attaching a much simpler tarball that demonstrates the difference between the xenomai-2.4.6.1 and xenomai-2.5.1 rt_cond_wait() function.  The difference is that in the new version, xenomai-2.5.1, rt_cond_wait() does wait on the condition for the specified timeout interval, but when it times out it returns 0, and I have not been able to get it to return -ETIMEDOUT.  The CanFestival timers_xeno task depends on the value of -ETIMEDOUT being returned when a timeout occurs.
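
The core of the check is just this (a minimal hypothetical sketch, not
the attached test case): wait on a condition variable that nobody
signals, then look at the return code.

#include <stdio.h>
#include <native/mutex.h>
#include <native/cond.h>

RT_MUTEX mtx;
RT_COND cond;

void check_cond_timeout(void)
{
	int err;

	rt_mutex_create(&mtx, "M1");
	rt_cond_create(&cond, "C1");

	rt_mutex_acquire(&mtx, TM_INFINITE);
	err = rt_cond_wait(&cond, &mtx, 100000000ULL); /* 100 ms in ns */
	rt_mutex_release(&mtx);

	/* expected: err == -ETIMEDOUT; observed with 2.5.1: err == 0 */
	printf("rt_cond_wait returned %d\n", err);
}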

--

John

-----Original Message-----
From: Gilles Chanteperdrix [mailto:gilles.chanteperdrix@xenomai.org]
Sent: Friday, February 26, 2010 11:53 AM
To: Charlton, John
Cc: 'jan.kiszka@domain.hid'; xenomai@xenomai.org
Subject: Re: [Xenomai-help] mlockall error after calling mlockall()

[...]

[-- Attachment #2: timers_xeno.tgz --]
[-- Type: application/x-compressed, Size: 12849 bytes --]


* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 15:52             ` Charlton, John
@ 2010-03-01 20:30               ` Charlton, John
  2010-03-01 20:33                 ` Jan Kiszka
  2010-03-01 20:30               ` Jan Kiszka
  1 sibling, 1 reply; 31+ messages in thread
From: Charlton, John @ 2010-03-01 20:30 UTC (permalink / raw)
  To: Charlton, John, 'Gilles Chanteperdrix'
  Cc: xenomai, 'jan.kiszka@domain.hid'

I added some printk output to the cond.c file in xenomai-2.5.1, and it shows that xnsynch_sleep_on returned -ETIMEDOUT as it should:
Xenomai: registered exported object M1-1908 (mutexes)
Xenomai: registered exported object C1-1908 (condvars)
xnsynch_sleep_on returned: -110
xnsynch_sleep_on returned: -110
xnsynch_sleep_on returned: -110
xnsynch_sleep_on returned: -110
xnsynch_sleep_on returned: -110
xnsynch_sleep_on returned: -110
xnsynch_sleep_on returned: -4
Xenomai: native: cleaning up cond "C1-1908" (ret=0).
Xenomai: unregistered exported object C1-1908 (condvars)
Xenomai: native: cleaning up mutex "M1-1908" (ret=0).
Xenomai: unregistered exported object M1-1908 (mutexes)
Xenomai: POSIX: destroyed thread df5a0800

I put the printk on line cond.c:433 after err is set in rt_cond_wait_prologue():

	info = xnsynch_sleep_on(&cond->synch_base,
				timeout, timeout_mode);
	if (info & XNRMID)
		err = -EIDRM;	/* Condvar deleted while pending. */
	else if (info & XNTIMEO) {
		err = -ETIMEDOUT;	/* Timeout. */
        }
	else if (info & XNBREAK) {
		err = -EINTR;	/* Unblocked. */
	}
        printk(KERN_DEBUG "xnsynch_sleep_on returned: %d\n", err);

I put a printk in rt_cond_wait_inner(), which is called directly by rt_cond_wait() from user mode, but did not see any output in dmesg for that one.

--

John


-----Original Message-----
From: xenomai-help-bounces@domain.hid [mailto:xenomai-help-bounces@domain.hid] On Behalf Of Charlton, John
Sent: Monday, March 01, 2010 10:52 AM
To: 'Gilles Chanteperdrix'
Cc: xenomai@xenomai.org; 'jan.kiszka@domain.hid'
Subject: Re: [Xenomai-help] mlockall error after calling mlockall()

[...]



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 15:52             ` Charlton, John
  2010-03-01 20:30               ` Charlton, John
@ 2010-03-01 20:30               ` Jan Kiszka
  1 sibling, 0 replies; 31+ messages in thread
From: Jan Kiszka @ 2010-03-01 20:30 UTC (permalink / raw)
  To: Charlton, John; +Cc: xenomai

Charlton, John wrote:
> I am attaching a much simpler tar ball that demonstrates the difference between xenomai-2.4.6.1 and xenomai-2.5.1 rt_cond_wait() function.  The difference is that in the new version of xenomai-2.5.1 the rt_cond_wait() function does wait for the condition for the specified timeout interval but when it timesout it returns 0 and I have not been able to get it to return -ETIMEDOUT.  The CanFestival timers_xeno task depends on the value of -ETIMEDOUT being returned when a timeout occurs.
> 

Yes, there are unfortunately breakages in rt_cond_wait[_until] - fairly
obvious ones when looking from the right angle... :-/. I'm looking into
this.

Thanks,
Jan



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 20:30               ` Charlton, John
@ 2010-03-01 20:33                 ` Jan Kiszka
  2010-03-01 20:38                   ` Gilles Chanteperdrix
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Kiszka @ 2010-03-01 20:33 UTC (permalink / raw)
  To: Charlton, John; +Cc: xenomai

Charlton, John wrote:
> I added some printk output to the cond.c file in xenomai-2.5.1 and it shows that  xnsync_sleep_on returned -ETIMEDOUT as it should:
> Xenomai: registered exported object M1-1908 (mutexes)
> Xenomai: registered exported object C1-1908 (condvars)
> xnsynch_sleep_on returned: -110
> xnsynch_sleep_on returned: -110
> xnsynch_sleep_on returned: -110
> xnsynch_sleep_on returned: -110
> xnsynch_sleep_on returned: -110
> xnsynch_sleep_on returned: -110
> xnsynch_sleep_on returned: -4
> Xenomai: native: cleaning up cond "C1-1908" (ret=0).
> Xenomai: unregistered exported object C1-1908 (condvars)
> Xenomai: native: cleaning up mutex "M1-1908" (ret=0).
> Xenomai: unregistered exported object M1-1908 (mutexes)
> Xenomai: POSIX: destroyed thread df5a0800
> 
> I put the printk on line cond.c:433 after err is set in rt_cond_wait_prologue():
> 
> 	info = xnsynch_sleep_on(&cond->synch_base,
> 				timeout, timeout_mode);
> 	if (info & XNRMID)
> 		err = -EIDRM;	/* Condvar deleted while pending. */
> 	else if (info & XNTIMEO) {
> 		err = -ETIMEDOUT;	/* Timeout. */
>         }
> 	else if (info & XNBREAK) {
> 		err = -EINTR;	/* Unblocked. */
> 	}
>         printk(KERN_DEBUG "xnsynch_sleep_on returned: %d\n", err);
> 
> I put a printk in rt_cond_wait_inner() which is called directly by rt_cond_wait() from user mode but did not see the output in the dmesg output for that one.

One of our problems is the prologue/epilogue split-up (both in kernel
and user space): the epilogue can eat the error code of the prologue,
including the -ETIMEDOUT.
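
Schematically (the pattern in question, not the literal source):

	err = rt_cond_wait_prologue(cond, mutex, &lockcnt,
				    timeout_mode, timeout);
	/* err may be -ETIMEDOUT here... */
	if (err == 0 || err == -ETIMEDOUT)
		err = rt_cond_wait_epilogue(mutex, lockcnt);
	/* ...but a successful epilogue returns 0 and overwrites it */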

Jan



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 20:33                 ` Jan Kiszka
@ 2010-03-01 20:38                   ` Gilles Chanteperdrix
  2010-03-01 20:43                     ` Jan Kiszka
  0 siblings, 1 reply; 31+ messages in thread
From: Gilles Chanteperdrix @ 2010-03-01 20:38 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai

Jan Kiszka wrote:
> Charlton, John wrote:
>> I added some printk output to the cond.c file in xenomai-2.5.1 and it shows that  xnsync_sleep_on returned -ETIMEDOUT as it should:
>> Xenomai: registered exported object M1-1908 (mutexes)
>> Xenomai: registered exported object C1-1908 (condvars)
>> xnsynch_sleep_on returned: -110
>> xnsynch_sleep_on returned: -110
>> xnsynch_sleep_on returned: -110
>> xnsynch_sleep_on returned: -110
>> xnsynch_sleep_on returned: -110
>> xnsynch_sleep_on returned: -110
>> xnsynch_sleep_on returned: -4
>> Xenomai: native: cleaning up cond "C1-1908" (ret=0).
>> Xenomai: unregistered exported object C1-1908 (condvars)
>> Xenomai: native: cleaning up mutex "M1-1908" (ret=0).
>> Xenomai: unregistered exported object M1-1908 (mutexes)
>> Xenomai: POSIX: destroyed thread df5a0800
>>
>> I put the printk on line cond.c:433 after err is set in rt_cond_wait_prologue():
>>
>> 	info = xnsynch_sleep_on(&cond->synch_base,
>> 				timeout, timeout_mode);
>> 	if (info & XNRMID)
>> 		err = -EIDRM;	/* Condvar deleted while pending. */
>> 	else if (info & XNTIMEO) {
>> 		err = -ETIMEDOUT;	/* Timeout. */
>>         }
>> 	else if (info & XNBREAK) {
>> 		err = -EINTR;	/* Unblocked. */
>> 	}
>>         printk(KERN_DEBUG "xnsynch_sleep_on returned: %d\n", err);
>>
>> I put a printk in rt_cond_wait_inner() which is called directly by rt_cond_wait() from user mode but did not see the output in the dmesg output for that one.
> 
> One of our problems is the prologue/epilogue split up (both in kernel
> and user space): the epilogue can eat the error code or the prologue,
> including the -ETIMEDOUT.

Ah. My fault. Looks like user-space is Ok, though. Only kernel-space has
a problem.

> 
> Jan
> 


-- 
					    Gilles.



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 20:38                   ` Gilles Chanteperdrix
@ 2010-03-01 20:43                     ` Jan Kiszka
  2010-03-01 20:46                       ` Jan Kiszka
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Kiszka @ 2010-03-01 20:43 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Charlton, John wrote:
>>> [...]
>> One of our problems is the prologue/epilogue split up (both in kernel
>> and user space): the epilogue can eat the error code or the prologue,
>> including the -ETIMEDOUT.
> 
> Ah. My fault. Looks user-space is Ok though. Only kernel-space has a
> problem.
> 

Both are affected.

Could you help me with what issues
97323b3287b5ee8cad99a7fa67cd050bc51f76c4 should fix?

Jan



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 20:43                     ` Jan Kiszka
@ 2010-03-01 20:46                       ` Jan Kiszka
  2010-03-01 20:56                         ` Gilles Chanteperdrix
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Kiszka @ 2010-03-01 20:46 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Charlton, John wrote:
>>>> [...]
>>> One of our problems is the prologue/epilogue split up (both in kernel
>>> and user space): the epilogue can eat the error code or the prologue,
>>> including the -ETIMEDOUT.
>> Ah. My fault. Looks user-space is Ok though. Only kernel-space has a
>> problem.
>>
> 
> Both are affected.
> 
> Could you help me with what issues
> 97323b3287b5ee8cad99a7fa67cd050bc51f76c4 should fix?

Ah, restart after RT-signals! So far we blocked on the concluding
rt_mutex_lock without breaking out to user space; now we have to loop
over -EINTR to allow signals, right?

Jan



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 20:46                       ` Jan Kiszka
@ 2010-03-01 20:56                         ` Gilles Chanteperdrix
  2010-03-01 21:00                           ` Gilles Chanteperdrix
  2010-03-01 21:01                           ` Jan Kiszka
  0 siblings, 2 replies; 31+ messages in thread
From: Gilles Chanteperdrix @ 2010-03-01 20:56 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai

Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Charlton, John wrote:
>>>>> [...]
>>>> One of our problems is the prologue/epilogue split up (both in kernel
>>>> and user space): the epilogue can eat the error code or the prologue,
>>>> including the -ETIMEDOUT.
>>> Ah. My fault. Looks user-space is Ok though. Only kernel-space has a
>>> problem.
>>>
>> Both are affected.
>>
>> Could you help me with what issues
>> 97323b3287b5ee8cad99a7fa67cd050bc51f76c4 should fix?
> 
> Ah, restart after RT-signals! So far we blocked on the concluding
> rt_mutex_lock without breaking out to user space, now we have to loop
> over -EINTR to allow signals, right?

Yes, it is needed even for handling Linux signals (getting gdb working,
for instance). However, since we do not want rt_cond_wait to result in
two syscalls all the time, we handle everything in the first syscall if
possible.

Try this:

diff --git a/ksrc/skins/native/syscall.c b/ksrc/skins/native/syscall.c
index 40e5cfd..d4e885c 100644
--- a/ksrc/skins/native/syscall.c
+++ b/ksrc/skins/native/syscall.c
@@ -1869,9 +1869,12 @@ static int __rt_cond_wait_prologue(struct pt_regs *regs)

        err = rt_cond_wait_prologue(cond, mutex, &lockcnt, timeout_mode, timeout);

-       if (err == 0 || err == -ETIMEDOUT)
-               err = rt_cond_wait_epilogue(mutex, lockcnt);
-
+       if (err == 0 || err == -ETIMEDOUT) {
+               int loc_err = rt_cond_wait_epilogue(mutex, lockcnt);
+               if (loc_err < 0)
+                       err = loc_err;
+       }
+
        if (err == -EINTR && __xn_reg_arg3(regs)
            && __xn_safe_copy_to_user((void __user *)__xn_reg_arg3(regs),
                                      &lockcnt, sizeof(lockcnt)))

> 
> Jan
> 


-- 
					    Gilles.



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 20:56                         ` Gilles Chanteperdrix
@ 2010-03-01 21:00                           ` Gilles Chanteperdrix
  2010-03-01 21:02                             ` Jan Kiszka
  2010-03-01 21:01                           ` Jan Kiszka
  1 sibling, 1 reply; 31+ messages in thread
From: Gilles Chanteperdrix @ 2010-03-01 21:00 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
>>>> Jan Kiszka wrote:
>>>>> Charlton, John wrote:
>>>>>> [...]
>>>>> One of our problems is the prologue/epilogue split up (both in kernel
>>>>> and user space): the epilogue can eat the error code or the prologue,
>>>>> including the -ETIMEDOUT.
>>>> Ah. My fault. Looks user-space is Ok though. Only kernel-space has a
>>>> problem.
>>>>
>>> Both are affected.
>>>
>>> Could you help me with what issues
>>> 97323b3287b5ee8cad99a7fa67cd050bc51f76c4 should fix?
>> Ah, restart after RT-signals! So far we blocked on the concluding
>> rt_mutex_lock without breaking out to user space, now we have to loop
>> over -EINTR to allow signals, right?
> 
> Yes, it is needed even for handling linux signals (getting gdb working 
> for instance). However, since we do not want rt_cond_wait to result into
> two syscalls all the time, we handle everything in the first syscall if 
> possible.
> 
> Try this:
> 
> diff --git a/ksrc/skins/native/syscall.c b/ksrc/skins/native/syscall.c
> index 40e5cfd..d4e885c 100644
> --- a/ksrc/skins/native/syscall.c
> +++ b/ksrc/skins/native/syscall.c
> @@ -1869,9 +1869,12 @@ static int __rt_cond_wait_prologue(struct pt_regs *regs)
> 
>         err = rt_cond_wait_prologue(cond, mutex, &lockcnt, timeout_mode, timeout);
> 
> -       if (err == 0 || err == -ETIMEDOUT)
> -               err = rt_cond_wait_epilogue(mutex, lockcnt);
> -
> +       if (err == 0 || err == -ETIMEDOUT) {
> +               int loc_err = rt_cond_wait_epilogue(mutex, lockcnt);
> +               if (loc_err < 0)
> +                       err = loc_err;
> +       }
> +
>         if (err == -EINTR && __xn_reg_arg3(regs)
>             && __xn_safe_copy_to_user((void __user *)__xn_reg_arg3(regs),
>                                       &lockcnt, sizeof(lockcnt)))

No. That is not ok either. We need to store the first status somewhere
for returning it to user-space after looping several times in the
epilogue function.
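
Something like this, schematically (a sketch of the idea within a
single syscall; the cross-syscall case is precisely what makes it
harder):

	status = rt_cond_wait_prologue(cond, mutex, &lockcnt,
				       timeout_mode, timeout);
	err = status;
	if (err == 0 || err == -ETIMEDOUT)
		err = rt_cond_wait_epilogue(mutex, lockcnt);

	/* report an epilogue failure, else the saved wait status */
	return err < 0 ? err : status;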

-- 
					    Gilles.



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 20:56                         ` Gilles Chanteperdrix
  2010-03-01 21:00                           ` Gilles Chanteperdrix
@ 2010-03-01 21:01                           ` Jan Kiszka
  2010-03-01 21:05                             ` Gilles Chanteperdrix
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Kiszka @ 2010-03-01 21:01 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
>>>> Jan Kiszka wrote:
>>>>> Charlton, John wrote:
>>>>>> [...]
>>>>> One of our problems is the prologue/epilogue split up (both in kernel
>>>>> and user space): the epilogue can eat the error code or the prologue,
>>>>> including the -ETIMEDOUT.
>>>> Ah. My fault. Looks user-space is Ok though. Only kernel-space has a
>>>> problem.
>>>>
>>> Both are affected.
>>>
>>> Could you help me with what issues
>>> 97323b3287b5ee8cad99a7fa67cd050bc51f76c4 should fix?
>> Ah, restart after RT-signals! So far we blocked on the concluding
>> rt_mutex_lock without breaking out to user space, now we have to loop
>> over -EINTR to allow signals, right?
> 
> Yes, it is needed even for handling linux signals (getting gdb working 
> for instance). However, since we do not want rt_cond_wait to result into
> two syscalls all the time, we handle everything in the first syscall if 
> possible.
> 
> Try this:
> 
> diff --git a/ksrc/skins/native/syscall.c b/ksrc/skins/native/syscall.c
> index 40e5cfd..d4e885c 100644
> --- a/ksrc/skins/native/syscall.c
> +++ b/ksrc/skins/native/syscall.c
> @@ -1869,9 +1869,12 @@ static int __rt_cond_wait_prologue(struct pt_regs *regs)
> 
>         err = rt_cond_wait_prologue(cond, mutex, &lockcnt, timeout_mode, timeout);
> 
> -       if (err == 0 || err == -ETIMEDOUT)
> -               err = rt_cond_wait_epilogue(mutex, lockcnt);
> -
> +       if (err == 0 || err == -ETIMEDOUT) {
> +               int loc_err = rt_cond_wait_epilogue(mutex, lockcnt);
> +               if (loc_err < 0)
> +                       err = loc_err;
> +       }
> +
>         if (err == -EINTR && __xn_reg_arg3(regs)
>             && __xn_safe_copy_to_user((void __user *)__xn_reg_arg3(regs),
>                                       &lockcnt, sizeof(lockcnt)))
> 

Same issue exists in user space. And rt_cond_wait_inner needs fixing.
And then there is a spurious sign inversion:

diff --git a/src/skins/native/cond.c b/src/skins/native/cond.c
index 478321d..f874678 100644
--- a/src/skins/native/cond.c
+++ b/src/skins/native/cond.c
@@ -96,9 +96,9 @@ int rt_cond_wait_until(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
 				&saved_lockcnt, XN_REALTIME, &timeout);
 
 	while (err == -EINTR)
-		err = -XENOMAI_SKINCALL2(__native_muxid,
-					 __native_cond_wait_epilogue, mutex,
-					 saved_lockcnt);
+		err = XENOMAI_SKINCALL2(__native_muxid,
+					__native_cond_wait_epilogue, mutex,
+					saved_lockcnt);
 #endif /* !CONFIG_XENO_FASTSYNCH */
 
 	return err;

Jan



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 21:00                           ` Gilles Chanteperdrix
@ 2010-03-01 21:02                             ` Jan Kiszka
  2010-03-01 21:06                               ` Jan Kiszka
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Kiszka @ 2010-03-01 21:02 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Gilles Chanteperdrix wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> Jan Kiszka wrote:
>>>>>> Charlton, John wrote:
>>>>>>> [...]
>>>>>> One of our problems is the prologue/epilogue split up (both in kernel
>>>>>> and user space): the epilogue can eat the error code or the prologue,
>>>>>> including the -ETIMEDOUT.
>>>>> Ah. My fault. Looks user-space is Ok though. Only kernel-space has a
>>>>> problem.
>>>>>
>>>> Both are affected.
>>>>
>>>> Could you help me with what issues
>>>> 97323b3287b5ee8cad99a7fa67cd050bc51f76c4 should fix?
>>> Ah, restart after RT-signals! So far we blocked on the concluding
>>> rt_mutex_lock without breaking out to user space, now we have to loop
>>> over -EINTR to allow signals, right?
>> Yes, it is needed even for handling linux signals (getting gdb working 
>> for instance). However, since we do not want rt_cond_wait to result into
>> two syscalls all the time, we handle everything in the first syscall if 
>> possible.
>>
>> Try this:
>>
>> diff --git a/ksrc/skins/native/syscall.c b/ksrc/skins/native/syscall.c
>> index 40e5cfd..d4e885c 100644
>> --- a/ksrc/skins/native/syscall.c
>> +++ b/ksrc/skins/native/syscall.c
>> @@ -1869,9 +1869,12 @@ static int __rt_cond_wait_prologue(struct pt_regs *regs)
>>
>>         err = rt_cond_wait_prologue(cond, mutex, &lockcnt, timeout_mode, timeout);
>>
>> -       if (err == 0 || err == -ETIMEDOUT)
>> -               err = rt_cond_wait_epilogue(mutex, lockcnt);
>> -
>> +       if (err == 0 || err == -ETIMEDOUT) {
>> +               int loc_err = rt_cond_wait_epilogue(mutex, lockcnt);
>> +               if (loc_err < 0)
>> +                       err = loc_err;
>> +       }
>> +
>>         if (err == -EINTR && __xn_reg_arg3(regs)
>>             && __xn_safe_copy_to_user((void __user *)__xn_reg_arg3(regs),
>>                                       &lockcnt, sizeof(lockcnt)))
> 
> No. That is not ok either. We need to store the first status somewhere
> for returning it to user-space after looping several times in the
> epilogue function.

Precisely (also in user space).

Jan



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 21:01                           ` Jan Kiszka
@ 2010-03-01 21:05                             ` Gilles Chanteperdrix
  2010-03-01 21:21                               ` Jan Kiszka
  0 siblings, 1 reply; 31+ messages in thread
From: Gilles Chanteperdrix @ 2010-03-01 21:05 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai

Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> Jan Kiszka wrote:
>>>>>> Charlton, John wrote:
>>>>>>> [...]
>>>>>> One of our problems is the prologue/epilogue split up (both in kernel
>>>>>> and user space): the epilogue can eat the error code or the prologue,
>>>>>> including the -ETIMEDOUT.
>>>>> Ah. My fault. Looks user-space is Ok though. Only kernel-space has a
>>>>> problem.
>>>>>
>>>> Both are affected.
>>>>
>>>> Could you help me with what issues
>>>> 97323b3287b5ee8cad99a7fa67cd050bc51f76c4 should fix?
>>> Ah, restart after RT-signals! So far we blocked on the concluding
>>> rt_mutex_lock without breaking out to user space, now we have to loop
>>> over -EINTR to allow signals, right?
>> Yes, it is needed even for handling linux signals (getting gdb working 
>> for instance). However, since we do not want rt_cond_wait to result into
>> two syscalls all the time, we handle everything in the first syscall if 
>> possible.
>>
>> Try this:
>>
>> diff --git a/ksrc/skins/native/syscall.c b/ksrc/skins/native/syscall.c
>> index 40e5cfd..d4e885c 100644
>> --- a/ksrc/skins/native/syscall.c
>> +++ b/ksrc/skins/native/syscall.c
>> @@ -1869,9 +1869,12 @@ static int __rt_cond_wait_prologue(struct pt_regs *regs)
>>
>>         err = rt_cond_wait_prologue(cond, mutex, &lockcnt, timeout_mode, timeout);
>>
>> -       if (err == 0 || err == -ETIMEDOUT)
>> -               err = rt_cond_wait_epilogue(mutex, lockcnt);
>> -
>> +       if (err == 0 || err == -ETIMEDOUT) {
>> +               int loc_err = rt_cond_wait_epilogue(mutex, lockcnt);
>> +               if (loc_err < 0)
>> +                       err = loc_err;
>> +       }
>> +
>>         if (err == -EINTR && __xn_reg_arg3(regs)
>>             && __xn_safe_copy_to_user((void __user *)__xn_reg_arg3(regs),
>>                                       &lockcnt, sizeof(lockcnt)))
>>
> 
> Same issue exists in user space. And rt_cond_wait_inner needs fixing.
> And then there is a spurious sign inversion:
> 
> diff --git a/src/skins/native/cond.c b/src/skins/native/cond.c
> index 478321d..f874678 100644
> --- a/src/skins/native/cond.c
> +++ b/src/skins/native/cond.c
> @@ -96,9 +96,9 @@ int rt_cond_wait_until(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
>  				&saved_lockcnt, XN_REALTIME, &timeout);
>  
>  	while (err == -EINTR)
> -		err = -XENOMAI_SKINCALL2(__native_muxid,
> -					 __native_cond_wait_epilogue, mutex,
> -					 saved_lockcnt);
> +		err = XENOMAI_SKINCALL2(__native_muxid,
> +					__native_cond_wait_epilogue, mutex,
> +					saved_lockcnt);
>  #endif /* !CONFIG_XENO_FASTSYNCH */

OK for the sign inversion, but in this case the status is OK. We call
cond_wait_epilogue only if the prologue returned -EINTR, and update the
status; this is what we want.

There is only one way out which will not break the ABI: do not call
cond_wait_epilogue in the kernel-space cond_wait_prologue.
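
For illustration, here is one way that split could look from user
space, as a minimal sketch modelled on the !CONFIG_XENO_FASTSYNCH path
quoted above (the wrapper itself is hypothetical, not the actual
library code):

/* Hypothetical wrapper: the kernel prologue returns its status
 * untouched, and user space always drives the epilogue, so the
 * prologue status cannot be eaten by the mutex re-lock. */
int cond_wait_sketch(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
{
	int saved_lockcnt, status, err;

	status = XENOMAI_SKINCALL5(__native_muxid,
				   __native_cond_wait_prologue, cond, mutex,
				   &saved_lockcnt, XN_RELATIVE, &timeout);

	do
		err = XENOMAI_SKINCALL2(__native_muxid,
					__native_cond_wait_epilogue, mutex,
					saved_lockcnt);
	while (err == -EINTR);	/* loop over signals */

	/* -ETIMEDOUT and friends survive; only a broken wait
	 * reports the epilogue result. */
	return status == -EINTR ? err : status;
}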

-- 
					    Gilles.



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 21:02                             ` Jan Kiszka
@ 2010-03-01 21:06                               ` Jan Kiszka
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Kiszka @ 2010-03-01 21:06 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Gilles Chanteperdrix wrote:
>>> Try this:
>>>
>>> diff --git a/ksrc/skins/native/syscall.c b/ksrc/skins/native/syscall.c
>>> index 40e5cfd..d4e885c 100644
>>> --- a/ksrc/skins/native/syscall.c
>>> +++ b/ksrc/skins/native/syscall.c
>>> @@ -1869,9 +1869,12 @@ static int __rt_cond_wait_prologue(struct pt_regs *regs)
>>>
>>>         err = rt_cond_wait_prologue(cond, mutex, &lockcnt, timeout_mode, timeout);
>>>
>>> -       if (err == 0 || err == -ETIMEDOUT)
>>> -               err = rt_cond_wait_epilogue(mutex, lockcnt);
>>> -
>>> +       if (err == 0 || err == -ETIMEDOUT) {
>>> +               int loc_err = rt_cond_wait_epilogue(mutex, lockcnt);
>>> +               if (loc_err < 0)
>>> +                       err = loc_err;
>>> +       }
>>> +
>>>         if (err == -EINTR && __xn_reg_arg3(regs)
>>>             && __xn_safe_copy_to_user((void __user *)__xn_reg_arg3(regs),
>>>                                       &lockcnt, sizeof(lockcnt)))
>> No. That is not ok either. We need to store the first status somewhere
>> for returning it to user-space after looping several times in the
>> epilogue function.
> 
> Precisely (also in user space).
> 

As we do not specify whether rt_cond_wait behaves like pthread_cond_wait
on signals or on errors of the cond variable, I checked the 2.4 behavior
again: we used to ignore all errors of the final rt_mutex_lock. I guess
we should restore that for all contexts (except for -EINTR).
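
As a sketch, that 2.4-style policy applied to the in-kernel
prologue/epilogue pair discussed above (the helper name is made up;
only the error handling matters here):

int cond_wait_24_policy(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
{
	unsigned lockcnt;
	int status, err;

	status = rt_cond_wait_prologue(cond, mutex, &lockcnt,
				       XN_RELATIVE, timeout);

	/* Re-lock the mutex; only -EINTR is acted upon (by retrying),
	 * all other re-lock errors are dropped, as in 2.4. */
	do
		err = rt_cond_wait_epilogue(mutex, lockcnt);
	while (err == -EINTR);

	if (status == -EINTR)	/* broken wait: report the re-lock result */
		status = err;

	return status;		/* 0, -ETIMEDOUT, -EIDRM, ... */
}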

Jan



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 21:05                             ` Gilles Chanteperdrix
@ 2010-03-01 21:21                               ` Jan Kiszka
  2010-03-01 21:25                                 ` Jan Kiszka
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Kiszka @ 2010-03-01 21:21 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Jan Kiszka wrote:
>>>>> Gilles Chanteperdrix wrote:
>>>>>> Jan Kiszka wrote:
>>>>>>> Charlton, John wrote:
>>>>>>>> I added some printk output to the cond.c file in xenomai-2.5.1 and it shows that xnsynch_sleep_on returned -ETIMEDOUT as it should:
>>>>>>>> Xenomai: registered exported object M1-1908 (mutexes)
>>>>>>>> Xenomai: registered exported object C1-1908 (condvars)
>>>>>>>> xnsynch_sleep_on returned: -110
>>>>>>>> xnsynch_sleep_on returned: -110
>>>>>>>> xnsynch_sleep_on returned: -110
>>>>>>>> xnsynch_sleep_on returned: -110
>>>>>>>> xnsynch_sleep_on returned: -110
>>>>>>>> xnsynch_sleep_on returned: -110
>>>>>>>> xnsynch_sleep_on returned: -4
>>>>>>>> Xenomai: native: cleaning up cond "C1-1908" (ret=0).
>>>>>>>> Xenomai: unregistered exported object C1-1908 (condvars)
>>>>>>>> Xenomai: native: cleaning up mutex "M1-1908" (ret=0).
>>>>>>>> Xenomai: unregistered exported object M1-1908 (mutexes)
>>>>>>>> Xenomai: POSIX: destroyed thread df5a0800
>>>>>>>>
>>>>>>>> I put the printk on line cond.c:433 after err is set in rt_cond_wait_prologue():
>>>>>>>>
>>>>>>>> 	info = xnsynch_sleep_on(&cond->synch_base,
>>>>>>>> 				timeout, timeout_mode);
>>>>>>>> 	if (info & XNRMID)
>>>>>>>> 		err = -EIDRM;	/* Condvar deleted while pending. */
>>>>>>>> 	else if (info & XNTIMEO) {
>>>>>>>> 		err = -ETIMEDOUT;	/* Timeout. */
>>>>>>>>         }
>>>>>>>> 	else if (info & XNBREAK) {
>>>>>>>> 		err = -EINTR;	/* Unblocked. */
>>>>>>>> 	}
>>>>>>>>         printk(KERN_DEBUG "xnsynch_sleep_on returned: %d\n", err);
>>>>>>>>
>>>>>>>> I put a printk in rt_cond_wait_inner(), which is called directly by rt_cond_wait() from user mode, but I did not see its output in dmesg.
>>>>>>> One of our problems is the prologue/epilogue split-up (both in kernel
>>>>>>> and user space): the epilogue can eat the error code of the prologue,
>>>>>>> including the -ETIMEDOUT.
>>>>>> Ah. My fault. Looks like user-space is OK, though. Only kernel-space
>>>>>> has a problem.
>>>>>>
>>>>> Both are affected.
>>>>>
>>>>> Could you help me understand which issues
>>>>> 97323b3287b5ee8cad99a7fa67cd050bc51f76c4 should fix?
>>>> Ah, restart after RT-signals! So far we blocked on the concluding
>>>> rt_mutex_lock without breaking out to user space; now we have to loop
>>>> over -EINTR to allow signals, right?
>>> Yes, it is needed even for handling Linux signals (getting gdb working,
>>> for instance). However, since we do not want rt_cond_wait to result in
>>> two syscalls all the time, we handle everything in the first syscall if
>>> possible.
>>>
>>> Try this:
>>>
>>> diff --git a/ksrc/skins/native/syscall.c b/ksrc/skins/native/syscall.c
>>> index 40e5cfd..d4e885c 100644
>>> --- a/ksrc/skins/native/syscall.c
>>> +++ b/ksrc/skins/native/syscall.c
>>> @@ -1869,9 +1869,12 @@ static int __rt_cond_wait_prologue(struct pt_regs *regs)
>>>
>>>         err = rt_cond_wait_prologue(cond, mutex, &lockcnt, timeout_mode, timeout);
>>>
>>> -       if (err == 0 || err == -ETIMEDOUT)
>>> -               err = rt_cond_wait_epilogue(mutex, lockcnt);
>>> -
>>> +       if (err == 0 || err == -ETIMEDOUT) {
>>> +               int loc_err = rt_cond_wait_epilogue(mutex, lockcnt);
>>> +               if (loc_err < 0)
>>> +                       err = loc_err;
>>> +       }
>>> +
>>>         if (err == -EINTR && __xn_reg_arg3(regs)
>>>             && __xn_safe_copy_to_user((void __user *)__xn_reg_arg3(regs),
>>>                                       &lockcnt, sizeof(lockcnt)))
>>>
>> Same issue exists in user space. And rt_cond_wait_inner needs fixing.
>> And then there is a spurious sign inversion:
>>
>> diff --git a/src/skins/native/cond.c b/src/skins/native/cond.c
>> index 478321d..f874678 100644
>> --- a/src/skins/native/cond.c
>> +++ b/src/skins/native/cond.c
>> @@ -96,9 +96,9 @@ int rt_cond_wait_until(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
>>  				&saved_lockcnt, XN_REALTIME, &timeout);
>>  
>>  	while (err == -EINTR)
>> -		err = -XENOMAI_SKINCALL2(__native_muxid,
>> -					 __native_cond_wait_epilogue, mutex,
>> -					 saved_lockcnt);
>> +		err = XENOMAI_SKINCALL2(__native_muxid,
>> +					__native_cond_wait_epilogue, mutex,
>> +					saved_lockcnt);
>>  #endif /* !CONFIG_XENO_FASTSYNCH */
> 
> Ok for the sign inversion, but in this case the status is Ok. We call
> cond_wait_epilogue only if prologue returned -EINTR, and update the
> status, this is what we want.
> 
> There is only one way out which will not break the ABI: do not call
> cond_wait_epilogue in the kernel-space cond_wait_prologue.
> 

Unfortunately, that ABI is broken: if we get -ETIMEDOUT on the cond wait
and -EINTR on the mutex lock, we lose the former, don't we?

If we can't find a workaround, I'm for breaking the ABI for this
particular service - but let's search for an alternative first.

Jan



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 21:21                               ` Jan Kiszka
@ 2010-03-01 21:25                                 ` Jan Kiszka
  2010-03-01 21:39                                   ` Jan Kiszka
  2010-03-01 21:53                                   ` Charlton, John
  0 siblings, 2 replies; 31+ messages in thread
From: Jan Kiszka @ 2010-03-01 21:25 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Same issue exists in user space. And rt_cond_wait_inner needs fixing.
>>> And then there is a spurious sign inversion:
>>>
>>> diff --git a/src/skins/native/cond.c b/src/skins/native/cond.c
>>> index 478321d..f874678 100644
>>> --- a/src/skins/native/cond.c
>>> +++ b/src/skins/native/cond.c
>>> @@ -96,9 +96,9 @@ int rt_cond_wait_until(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
>>>  				&saved_lockcnt, XN_REALTIME, &timeout);
>>>  
>>>  	while (err == -EINTR)
>>> -		err = -XENOMAI_SKINCALL2(__native_muxid,
>>> -					 __native_cond_wait_epilogue, mutex,
>>> -					 saved_lockcnt);
>>> +		err = XENOMAI_SKINCALL2(__native_muxid,
>>> +					__native_cond_wait_epilogue, mutex,
>>> +					saved_lockcnt);
>>>  #endif /* !CONFIG_XENO_FASTSYNCH */
>> Ok for the sign inversion, but in this case the status is Ok. We call
>> cond_wait_epilogue only if prologue returned -EINTR, and update the
>> status, this is what we want.
>>
>> There is only one way out which will not break the ABI: do not call
>> cond_wait_epilogue in the kernel-space cond_wait_prologue.
>>
> 
> Unfortunately, that ABI is broken: if we get -ETIMEDOUT on the cond wait
> and -EINTR on the mutex lock, we lose the former, don't we?
> 
> If we can't find a workaround, I'm for breaking the ABI for this
> particular service - but let's search for an alternative first.
> 

The only way around ABI changes: save the return code of cond_wait
/somewhere/ across prologue and epilogue syscalls - but where???

Jan



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 21:25                                 ` Jan Kiszka
@ 2010-03-01 21:39                                   ` Jan Kiszka
  2010-03-01 21:45                                     ` Gilles Chanteperdrix
  2010-03-01 21:53                                   ` Charlton, John
  1 sibling, 1 reply; 31+ messages in thread
From: Jan Kiszka @ 2010-03-01 21:39 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Same issue exists in user space. And rt_cond_wait_inner needs fixing.
>>>> And then there is a spurious sign inversion:
>>>>
>>>> diff --git a/src/skins/native/cond.c b/src/skins/native/cond.c
>>>> index 478321d..f874678 100644
>>>> --- a/src/skins/native/cond.c
>>>> +++ b/src/skins/native/cond.c
>>>> @@ -96,9 +96,9 @@ int rt_cond_wait_until(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
>>>>  				&saved_lockcnt, XN_REALTIME, &timeout);
>>>>  
>>>>  	while (err == -EINTR)
>>>> -		err = -XENOMAI_SKINCALL2(__native_muxid,
>>>> -					 __native_cond_wait_epilogue, mutex,
>>>> -					 saved_lockcnt);
>>>> +		err = XENOMAI_SKINCALL2(__native_muxid,
>>>> +					__native_cond_wait_epilogue, mutex,
>>>> +					saved_lockcnt);
>>>>  #endif /* !CONFIG_XENO_FASTSYNCH */
>>> Ok for the sign inversion, but in this case the status is Ok. We call
>>> cond_wait_epilogue only if prologue returned -EINTR, and update the
>>> status, this is what we want.
>>>
>>> There is only one way out which will not break the ABI: do not call
>>> cond_wait_epilogue in the kernel-space cond_wait_prologue.
>>>
>> Unfortunately, that ABI is broken: if we get -ETIMEDOUT on the cond wait
>> and -EINTR on the mutex lock, we lose the former, don't we?
>>
>> If we can't find a workaround, I'm for breaking the ABI for this
>> particular service - but let's search for an alternative first.
>>
> 
> The only way around ABI changes: save the return code of cond_wait
> /somewhere/ across prologue and epilogue syscalls - but where???
> 

In-kernel RT_TASK, that's probably the only place. Once
__rt_cond_wait_epilogue has acquired the mutex (or failed without
receiving a -EINTR), it can then pick up the original error that this
task received from rt_cond_wait_prologue and return it. What do you think?
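
A rough sketch of that idea; the cond_wait_err member and the helper
names below are hypothetical, not existing Xenomai code:

#include <errno.h>

/* Hypothetical per-task slot added to the in-kernel task state. */
struct rt_task_sketch {
	/* ... existing RT_TASK members ... */
	int cond_wait_err;	/* status saved by the prologue */
};

/* Prologue side: stash the wait status before returning. */
static void prologue_save(struct rt_task_sketch *task, int wait_status)
{
	task->cond_wait_err = wait_status;
}

/* Epilogue side: once the mutex is re-acquired, or the re-lock
 * failed without -EINTR, hand back the original wait status. */
static int epilogue_result(struct rt_task_sketch *task, int relock_status)
{
	if (relock_status == -EINTR)
		return -EINTR;	/* user space retries the epilogue */

	return task->cond_wait_err;
}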

Jan



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 21:39                                   ` Jan Kiszka
@ 2010-03-01 21:45                                     ` Gilles Chanteperdrix
  2010-03-02  8:29                                       ` Jan Kiszka
  0 siblings, 1 reply; 31+ messages in thread
From: Gilles Chanteperdrix @ 2010-03-01 21:45 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai

Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
>>>> Jan Kiszka wrote:
>>>>> Same issue exists in user space. And rt_cond_wait_inner needs fixing.
>>>>> And then there is a spurious sign inversion:
>>>>>
>>>>> diff --git a/src/skins/native/cond.c b/src/skins/native/cond.c
>>>>> index 478321d..f874678 100644
>>>>> --- a/src/skins/native/cond.c
>>>>> +++ b/src/skins/native/cond.c
>>>>> @@ -96,9 +96,9 @@ int rt_cond_wait_until(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
>>>>>  				&saved_lockcnt, XN_REALTIME, &timeout);
>>>>>  
>>>>>  	while (err == -EINTR)
>>>>> -		err = -XENOMAI_SKINCALL2(__native_muxid,
>>>>> -					 __native_cond_wait_epilogue, mutex,
>>>>> -					 saved_lockcnt);
>>>>> +		err = XENOMAI_SKINCALL2(__native_muxid,
>>>>> +					__native_cond_wait_epilogue, mutex,
>>>>> +					saved_lockcnt);
>>>>>  #endif /* !CONFIG_XENO_FASTSYNCH */
>>>> Ok for the sign inversion, but in this case the status is Ok. We call
>>>> cond_wait_epilogue only if prologue returned -EINTR, and update the
>>>> status, this is what we want.
>>>>
>>>> There is only one way out which will not break the ABI: do not call
>>>> cond_wait_epilogue in the kernel-space cond_wait_prologue.
>>>>
>>> Unfortunately, that ABI is broken: if we get -ETIMEDOUT on the cond wait
>>> and -EINTR on the mutex lock, we lose the former, don't we?
>>>
>>> If we can't find a workaround, I'm for breaking the ABI for this
>>> particular service - but let's search for an alternative first.
>>>
>> The only way around ABI changes: save the return code of cond_wait
>> /somewhere/ across prologue and epilogue syscalls - but where???
>>
> 
> In-kernel RT_TASK, that's probably the only place. Once
> __rt_cond_wait_epilogue has acquired the mutex (or failed without
> receiving a -EINTR), it can then pick up the original error that this
> task received from rt_cond_wait_prologue and return it. What do you think?

I was thinking about the same idea, using the xnthread errno. However,
it is not reentrant, in the sense that if a signal handler issues a
syscall which overwrites this value, we are toast.
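
To illustrate the hazard with a toy model (every name here is a
stand-in; -110 is -ETIMEDOUT and -4 is -EINTR, matching the dmesg
output earlier in the thread):

#include <stdio.h>

/* A single saved-status slot per thread, standing in for the
 * xnthread errno. Every syscall writes it, which is the problem. */
static int saved_status;

static void skin_syscall(int status)
{
	saved_status = status;
}

int main(void)
{
	skin_syscall(-110);	/* prologue saves -ETIMEDOUT */
	skin_syscall(-4);	/* nested syscall from a signal handler */
	printf("epilogue would report %d; -ETIMEDOUT is lost\n",
	       saved_status);
	return 0;
}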

-- 
					    Gilles.



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 21:25                                 ` Jan Kiszka
  2010-03-01 21:39                                   ` Jan Kiszka
@ 2010-03-01 21:53                                   ` Charlton, John
  2010-03-02 13:26                                     ` Charlton, John
  1 sibling, 1 reply; 31+ messages in thread
From: Charlton, John @ 2010-03-01 21:53 UTC (permalink / raw)
  To: 'Jan Kiszka', Gilles Chanteperdrix; +Cc: xenomai

 

-----Original Message-----
From: xenomai-help-bounces@domain.hid [mailto:xenomai-help-bounces@domain.hid] On Behalf Of Jan Kiszka
Sent: Monday, March 01, 2010 4:26 PM
To: Gilles Chanteperdrix
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai-help] mlockall error after calling mlockall()

Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Same issue exists in user space. And rt_cond_wait_inner needs fixing.
>>> And then there is a spurious sign inversion:
>>>
>>> diff --git a/src/skins/native/cond.c b/src/skins/native/cond.c
>>> index 478321d..f874678 100644
>>> --- a/src/skins/native/cond.c
>>> +++ b/src/skins/native/cond.c
>>> @@ -96,9 +96,9 @@ int rt_cond_wait_until(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
>>>  				&saved_lockcnt, XN_REALTIME, &timeout);
>>>  
>>>  	while (err == -EINTR)
>>> -		err = -XENOMAI_SKINCALL2(__native_muxid,
>>> -					 __native_cond_wait_epilogue, mutex,
>>> -					 saved_lockcnt);
>>> +		err = XENOMAI_SKINCALL2(__native_muxid,
>>> +					__native_cond_wait_epilogue, mutex,
>>> +					saved_lockcnt);
>>>  #endif /* !CONFIG_XENO_FASTSYNCH */
>> Ok for the sign inversion, but in this case the status is Ok. We call 
>> cond_wait_epilogue only if prologue returned -EINTR, and update the 
>> status, this is what we want.
>>
>> There is only one way out which will not break the ABI: do not call 
>> cond_wait_epilogue in the kernel-space cond_wait_prologue.
>>
> 
> Unfortunately, that ABI is broken: if we get -ETIMEDOUT on the cond wait
> and -EINTR on the mutex lock, we lose the former, don't we?
> 
> If we can't find a workaround, I'm for breaking the ABI for this 
> particular service - but let's search for an alternative first.
> 

The only way around ABI changes: save the return code of cond_wait /somewhere/ across prologue and epilogue syscalls - but where???

Jan


I will make the changes above and those to syscall.c, since it looks like they will fix the immediate problem I am having.  My .config for the Linux kernel always has CONFIG_XENO_FASTSYNCH='y'; even if I set it manually, it is reset to 'y', so the #else (!CONFIG_XENO_FASTSYNCH) branch is not a factor for my kernel build.  I also added a saved_err to the user-mode rt_cond_wait/rt_cond_wait_until to return the prologue error code to the application.  I will keep monitoring until you determine a better solution.

Thanks for the help,
--

John



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 21:45                                     ` Gilles Chanteperdrix
@ 2010-03-02  8:29                                       ` Jan Kiszka
  2010-03-02  8:36                                         ` Jan Kiszka
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Kiszka @ 2010-03-02  8:29 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
>>>> Gilles Chanteperdrix wrote:
>>>>> Jan Kiszka wrote:
>>>>>> Same issue exists in user space. And rt_cond_wait_inner needs fixing.
>>>>>> And then there is a spurious sign inversion:
>>>>>>
>>>>>> diff --git a/src/skins/native/cond.c b/src/skins/native/cond.c
>>>>>> index 478321d..f874678 100644
>>>>>> --- a/src/skins/native/cond.c
>>>>>> +++ b/src/skins/native/cond.c
>>>>>> @@ -96,9 +96,9 @@ int rt_cond_wait_until(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
>>>>>>  				&saved_lockcnt, XN_REALTIME, &timeout);
>>>>>>  
>>>>>>  	while (err == -EINTR)
>>>>>> -		err = -XENOMAI_SKINCALL2(__native_muxid,
>>>>>> -					 __native_cond_wait_epilogue, mutex,
>>>>>> -					 saved_lockcnt);
>>>>>> +		err = XENOMAI_SKINCALL2(__native_muxid,
>>>>>> +					__native_cond_wait_epilogue, mutex,
>>>>>> +					saved_lockcnt);
>>>>>>  #endif /* !CONFIG_XENO_FASTSYNCH */
>>>>> Ok for the sign inversion, but in this case the status is Ok. We call
>>>>> cond_wait_epilogue only if prologue returned -EINTR, and update the
>>>>> status, this is what we want.
>>>>>
>>>>> There is only one way out which will not break the ABI: do not call
>>>>> cond_wait_epilogue in the kernel-space cond_wait_prologue.
>>>>>
>>>> Unfortunately, that ABI is broken: if we get -ETIMEDOUT on the cond wait
>>>> and -EINTR on the mutex lock, we lose the former, don't we?
>>>>
>>>> If we can't find a workaround, I'm for breaking the ABI for this
>>>> particular service - but let's search for an alternative first.
>>>>
>>> The only way around ABI changes: save the return code of cond_wait
>>> /somewhere/ across prologue and epilogue syscalls - but where???
>>>
>> In-kernel RT_TASK, that's probably the only place. Once
>> __rt_cond_wait_epilogue has acquired the mutex (or failed without
>> receiving a -EINTR), it can then pick up the original error that this
>> task received from rt_cond_wait_prologue and return it. What do you think?
> 
> I was thinking about the same idea, by using the xnthread errno.
> However, it is not reentrant, in the sense that if a signal handler
> emits a syscall which overrides this value, we are toast.

Right. But that's the price someone unable or unwilling to update their
user-space libs along with the next kernel update will have to pay. I
don't expect there are many such users, though, and the sketched
scenario is fairly uncommon as well.

So let's head this way: install some workaround for the existing
__rt_cond_wait_prologue syscall, but also introduce
__rt_cond_wait_prologue2, which writes the cond-wait error code back to
user space, as already happens for lockcnt. That syscall is used by
updated user-space libs if they find it; otherwise we fall back to the
old pattern.
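
A sketch of the user-space side of that scheme; the syscall names
follow the proposal above, but the write-back layout and the -ENOSYS
probe are assumptions:

#include <errno.h>

/* Hypothetical write-back area for the new prologue2: the cond-wait
 * status comes back alongside lockcnt instead of being eaten. */
struct cond_wait_result {
	int lockcnt;
	int status;
};

int cond_wait_probe_sketch(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
{
	struct cond_wait_result res;
	int err;

	err = XENOMAI_SKINCALL5(__native_muxid,
				__native_cond_wait_prologue2, cond, mutex,
				&res, XN_RELATIVE, &timeout);
	if (err == -ENOSYS) {
		/* Old kernel: fall back to the old pattern, where
		 * the status may still be eaten. */
		err = XENOMAI_SKINCALL5(__native_muxid,
					__native_cond_wait_prologue, cond,
					mutex, &res.lockcnt, XN_RELATIVE,
					&timeout);
		res.status = err;
	}

	while (err == -EINTR)
		err = XENOMAI_SKINCALL2(__native_muxid,
					__native_cond_wait_epilogue, mutex,
					res.lockcnt);

	return res.status == -EINTR ? err : res.status;
}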

Jan



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-02  8:29                                       ` Jan Kiszka
@ 2010-03-02  8:36                                         ` Jan Kiszka
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Kiszka @ 2010-03-02  8:36 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai

Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
>>>> Jan Kiszka wrote:
>>>>> Gilles Chanteperdrix wrote:
>>>>>> Jan Kiszka wrote:
>>>>>>> Same issue exists in user space. And rt_cond_wait_inner needs fixing.
>>>>>>> And then there is a spurious sign inversion:
>>>>>>>
>>>>>>> diff --git a/src/skins/native/cond.c b/src/skins/native/cond.c
>>>>>>> index 478321d..f874678 100644
>>>>>>> --- a/src/skins/native/cond.c
>>>>>>> +++ b/src/skins/native/cond.c
>>>>>>> @@ -96,9 +96,9 @@ int rt_cond_wait_until(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
>>>>>>>  				&saved_lockcnt, XN_REALTIME, &timeout);
>>>>>>>  
>>>>>>>  	while (err == -EINTR)
>>>>>>> -		err = -XENOMAI_SKINCALL2(__native_muxid,
>>>>>>> -					 __native_cond_wait_epilogue, mutex,
>>>>>>> -					 saved_lockcnt);
>>>>>>> +		err = XENOMAI_SKINCALL2(__native_muxid,
>>>>>>> +					__native_cond_wait_epilogue, mutex,
>>>>>>> +					saved_lockcnt);
>>>>>>>  #endif /* !CONFIG_XENO_FASTSYNCH */
>>>>>> Ok for the sign inversion, but in this case the status is Ok. We call
>>>>>> cond_wait_epilogue only if prologue returned -EINTR, and update the
>>>>>> status, this is what we want.
>>>>>>
>>>>>> There is only one way out which will not break the ABI: do not call
>>>>>> cond_wait_epilogue in the kernel-space cond_wait_prologue.
>>>>>>
>>>>> Unfortunately, that ABI is broken: if we get -ETIMEDOUT on the cond wait
>>>>> and -EINTR on the mutex lock, we lose the former, don't we?
>>>>>
>>>>> If we can't find a workaround, I'm for breaking the ABI for this
>>>>> particular service - but let's search for an alternative first.
>>>>>
>>>> The only way around ABI changes: save the return code of cond_wait
>>>> /somewhere/ across prologue and epilogue syscalls - but where???
>>>>
>>> In-kernel RT_TASK, that's probably the only place. Once
>>> __rt_cond_wait_epilogue has acquired the mutex (or failed without
>>> receiving a -EINTR), it can then pick up the original error that this
>>> task received from rt_cond_wait_prologue and return it. What do you think?
>> I was thinking about the same idea, by using the xnthread errno.
>> However, it is not reentrant, in the sense that if a signal handler
>> emits a syscall which overrides this value, we are toast.
> 
> Right. But that's the price someone unable/unwilling to update their
> user space libs along with the next kernel update will have to pay. I
> don't expect there are many, though, and the sketched scenario is fairly
> uncommon as well.
> 
> So let's head this way: Install some workaround for the existing
> __rt_cond_wait_prologue syscall, but also introduce
> __rt_cond_wait_prologue2 which writes the cond-wait error code back like
> it already happens for lockcnt. That syscall is used by updated user
> space libs if they find it, otherwise we fall back to the old pattern.
> 

Ah, and it looks like we also need a __pthread_cond_wait_prologue2.
Hope that's all then.

Jan



* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-01 21:53                                   ` Charlton, John
@ 2010-03-02 13:26                                     ` Charlton, John
  2010-03-02 14:05                                       ` Jan Kiszka
  0 siblings, 1 reply; 31+ messages in thread
From: Charlton, John @ 2010-03-02 13:26 UTC (permalink / raw)
  To: Charlton, John, 'Jan Kiszka', Gilles Chanteperdrix; +Cc: xenomai

While I was debugging this problem, it looked like I could change ksrc/skins/native/cond.c by modifying the rt_cond_wait_inner function as follows:
--- xenomai-2.5.1_bak/ksrc/skins/native/cond.c  2010-01-15 19:09:31.000000000 -0500
+++ xenomai-2.5.1/ksrc/skins/native/cond.c      2010-03-01 17:17:22.000000000 -0500
@@ -468,15 +468,22 @@
 {
        unsigned lockcnt;
        int err;
+        int err_save;

        err = rt_cond_wait_prologue(cond, mutex, &lockcnt,
                                    timeout_mode, timeout);

+        err_save = err;
        if(!err || err == -ETIMEDOUT || err == -EINTR)
                do {
                        err = rt_cond_wait_epilogue(mutex, lockcnt);
                } while (err == -EINTR);

+        if (err_save == -ETIMEDOUT)
+        {
+          err = err_save;
+        }
+
        return err;
 }
 /**

When I made these changes and also put a printk in rt_cond_wait_inner, the printk did not get printed, and the change above did not seem to make any difference.  So should I revert these changes, or should they be made in addition to the syscall.c kernel and cond.c user-space changes as follows:

--- xenomai-2.5.1_bak/ksrc/skins/native/syscall.c       2010-02-01 20:01:09.000000000 -0500
+++ xenomai-2.5.1/ksrc/skins/native/syscall.c   2010-03-01 16:13:20.000000000 -0500
@@ -1869,8 +1869,12 @@

        err = rt_cond_wait_prologue(cond, mutex, &lockcnt, timeout_mode, timeout);

-       if (err == 0 || err == -ETIMEDOUT)
-               err = rt_cond_wait_epilogue(mutex, lockcnt);
+        if (err == 0 || err == -ETIMEDOUT) {
+                int loc_err = rt_cond_wait_epilogue(mutex, lockcnt);
+                if (loc_err < 0)
+                       err = loc_err;
+        }
+

        if (err == -EINTR && __xn_reg_arg3(regs)
            && __xn_safe_copy_to_user((void __user *)__xn_reg_arg3(regs),

--- xenomai-2.5.1_bak/src/skins/native/cond.c   2010-01-15 19:09:32.000000000 -0500
+++ xenomai-2.5.1/src/skins/native/cond.c       2010-03-01 16:44:48.000000000 -0500
@@ -41,7 +41,7 @@

 int rt_cond_wait(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
 {
-       int saved_lockcnt, err;
+  int saved_lockcnt, err, saved_err;

 #ifdef CONFIG_XENO_FASTSYNCH
        saved_lockcnt = mutex->lockcnt;
@@ -49,7 +49,7 @@
        err = XENOMAI_SKINCALL5(__native_muxid,
                                __native_cond_wait_prologue, cond, mutex,
                                NULL, XN_RELATIVE, &timeout);
-
+        saved_err = err;
        while (err == -EINTR)
                err = XENOMAI_SKINCALL2(__native_muxid,
                                        __native_cond_wait_epilogue, mutex,
@@ -62,6 +62,7 @@
                                 __native_cond_wait_prologue, cond, mutex,
                                 &saved_lockcnt, XN_RELATIVE, &timeout);

+        saved_err = err;
        while (err == -EINTR)
                err = XENOMAI_SKINCALL2(__native_muxid,
                                        __native_cond_wait_epilogue, mutex,
@@ -69,12 +70,12 @@

 #endif /* !CONFIG_XENO_FASTSYNCH */

-       return err;
+       return saved_err;
 }

 int rt_cond_wait_until(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
 {
-       int saved_lockcnt, err;
+  int saved_lockcnt, err, saved_err;

 #ifdef CONFIG_XENO_FASTSYNCH
        saved_lockcnt = mutex->lockcnt;
@@ -82,7 +83,7 @@
        err = XENOMAI_SKINCALL5(__native_muxid,
                                __native_cond_wait_prologue, cond, mutex,
                                NULL, XN_REALTIME, &timeout);
-
+        saved_err = err;
        while (err == -EINTR)
                err = XENOMAI_SKINCALL2(__native_muxid,
                                        __native_cond_wait_epilogue, mutex,
@@ -94,14 +95,14 @@
        err = XENOMAI_SKINCALL5(__native_muxid,
                                __native_cond_wait_prologue, cond, mutex,
                                &saved_lockcnt, XN_REALTIME, &timeout);
-
+        saved_err = err;
        while (err == -EINTR)
-               err = -XENOMAI_SKINCALL2(__native_muxid,
+               err = XENOMAI_SKINCALL2(__native_muxid,
                                         __native_cond_wait_epilogue, mutex,
                                         saved_lockcnt);
 #endif /* !CONFIG_XENO_FASTSYNCH */

-       return err;
+       return saved_err;
 }

 int rt_cond_signal(RT_COND *cond)


-----Original Message-----
From: xenomai-help-bounces@domain.hid [mailto:xenomai-help-bounces@domain.hid] On Behalf Of Jan Kiszka
Sent: Monday, March 01, 2010 4:26 PM
To: Gilles Chanteperdrix
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai-help] mlockall error after calling mlockall()

Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Same issue exists in user space. And rt_cond_wait_inner needs fixing.
>>> And then there is a spurious sign inversion:
>>>
>>> diff --git a/src/skins/native/cond.c b/src/skins/native/cond.c
>>> index 478321d..f874678 100644
>>> --- a/src/skins/native/cond.c
>>> +++ b/src/skins/native/cond.c
>>> @@ -96,9 +96,9 @@ int rt_cond_wait_until(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
>>>  				&saved_lockcnt, XN_REALTIME, &timeout);
>>>  
>>>  	while (err == -EINTR)
>>> -		err = -XENOMAI_SKINCALL2(__native_muxid,
>>> -					 __native_cond_wait_epilogue, mutex,
>>> -					 saved_lockcnt);
>>> +		err = XENOMAI_SKINCALL2(__native_muxid,
>>> +					__native_cond_wait_epilogue, mutex,
>>> +					saved_lockcnt);
>>>  #endif /* !CONFIG_XENO_FASTSYNCH */
>> Ok for the sign inversion, but in this case the status is Ok. We call 
>> cond_wait_epilogue only if prologue returned -EINTR, and update the 
>> status, this is what we want.
>>
>> There is only one way out which will not break the ABI: do not call 
>> cond_wait_epilogue in the kernel-space cond_wait_prologue.
>>
> 
> Unfortunately, that ABI is broken: if we get -ETIMEDOUT on the cond wait
> and -EINTR on the mutex lock, we lose the former, don't we?
> 
> If we can't find a workaround, I'm for breaking the ABI for this 
> particular service - but let's search for an alternative first.
> 

The only way around ABI changes: save the return code of cond_wait /somewhere/ across prologue and epilogue syscalls - but where???

Jan


I will make the changes above and those to syscall.c, since it looks like they will fix the immediate problem I am having.  My .config for the Linux kernel always has CONFIG_XENO_FASTSYNCH='y'; even if I set it manually, it is reset to 'y', so the #else (!CONFIG_XENO_FASTSYNCH) branch is not a factor for my kernel build.  I also added a saved_err to the user-mode rt_cond_wait/rt_cond_wait_until to return the prologue error code to the application.  I will keep monitoring until you determine a better solution.

Thanks for the help,
--

John




* Re: [Xenomai-help] mlockall error after calling mlockall()
  2010-03-02 13:26                                     ` Charlton, John
@ 2010-03-02 14:05                                       ` Jan Kiszka
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Kiszka @ 2010-03-02 14:05 UTC (permalink / raw)
  To: Charlton, John; +Cc: xenomai

Charlton, John wrote:
> While I was debugging this problem, it looked like I could change ksrc/skins/native/cond.c by modifying the rt_cond_wait_inner function as follows:
> --- xenomai-2.5.1_bak/ksrc/skins/native/cond.c  2010-01-15 19:09:31.000000000 -0500
> +++ xenomai-2.5.1/ksrc/skins/native/cond.c      2010-03-01 17:17:22.000000000 -0500
> @@ -468,15 +468,22 @@
>  {
>         unsigned lockcnt;
>         int err;
> +        int err_save;
> 
>         err = rt_cond_wait_prologue(cond, mutex, &lockcnt,
>                                     timeout_mode, timeout);
> 
> +        err_save = err;
>         if(!err || err == -ETIMEDOUT || err == -EINTR)
>                 do {
>                         err = rt_cond_wait_epilogue(mutex, lockcnt);
>                 } while (err == -EINTR);
> 
> +        if (err_save == -ETIMEDOUT)
> +        {
> +          err = err_save;
> +        }
> +
>         return err;
>  }
>  /**

That is for in-kernel cond_wait only.

> 
> When I made these changes and also put a printk in rt_cond_wait_inner, the printk did not get printed, and the change above did not seem to make any difference.  So should I revert these changes, or should they be made in addition to the syscall.c kernel and cond.c user-space changes as follows:
> 
> --- xenomai-2.5.1_bak/ksrc/skins/native/syscall.c       2010-02-01 20:01:09.000000000 -0500
> +++ xenomai-2.5.1/ksrc/skins/native/syscall.c   2010-03-01 16:13:20.000000000 -0500
> @@ -1869,8 +1869,12 @@
> 
>         err = rt_cond_wait_prologue(cond, mutex, &lockcnt, timeout_mode, timeout);
> 
> -       if (err == 0 || err == -ETIMEDOUT)
> -               err = rt_cond_wait_epilogue(mutex, lockcnt);
> +        if (err == 0 || err == -ETIMEDOUT) {
> +                int loc_err = rt_cond_wait_epilogue(mutex, lockcnt);
> +                if (loc_err < 0)
> +                       err = loc_err;
> +        }
> +
> 
>         if (err == -EINTR && __xn_reg_arg3(regs)
>             && __xn_safe_copy_to_user((void __user *)__xn_reg_arg3(regs),
> 
> --- xenomai-2.5.1_bak/src/skins/native/cond.c   2010-01-15 19:09:32.000000000 -0500
> +++ xenomai-2.5.1/src/skins/native/cond.c       2010-03-01 16:44:48.000000000 -0500
> @@ -41,7 +41,7 @@
>
>  int rt_cond_wait(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
>  {
> -       int saved_lockcnt, err;
> +  int saved_lockcnt, err, saved_err;
>
>  #ifdef CONFIG_XENO_FASTSYNCH
>         saved_lockcnt = mutex->lockcnt;
> @@ -49,7 +49,7 @@
>         err = XENOMAI_SKINCALL5(__native_muxid,
>                                 __native_cond_wait_prologue, cond, mutex,
>                                 NULL, XN_RELATIVE, &timeout);
> -
> +        saved_err = err;
>         while (err == -EINTR)
>                 err = XENOMAI_SKINCALL2(__native_muxid,
>                                         __native_cond_wait_epilogue, mutex,
> @@ -62,6 +62,7 @@
>                                  __native_cond_wait_prologue, cond, mutex,
>                                  &saved_lockcnt, XN_RELATIVE, &timeout);
>
> +        saved_err = err;
>         while (err == -EINTR)
>                 err = XENOMAI_SKINCALL2(__native_muxid,
>                                         __native_cond_wait_epilogue, mutex,
> @@ -69,12 +70,12 @@
>
>  #endif /* !CONFIG_XENO_FASTSYNCH */
> 
> -       return err;
> +       return saved_err;
>  }
> 
>  int rt_cond_wait_until(RT_COND *cond, RT_MUTEX *mutex, RTIME timeout)
>  {
> -       int saved_lockcnt, err;
> +  int saved_lockcnt, err, saved_err;
> 
>  #ifdef CONFIG_XENO_FASTSYNCH
>         saved_lockcnt = mutex->lockcnt;
> @@ -82,7 +83,7 @@
>         err = XENOMAI_SKINCALL5(__native_muxid,
>                                 __native_cond_wait_prologue, cond, mutex,
>                                 NULL, XN_REALTIME, &timeout);
> -
> +        saved_err = err;
>         while (err == -EINTR)
>                 err = XENOMAI_SKINCALL2(__native_muxid,
>                                         __native_cond_wait_epilogue, mutex,
> @@ -94,14 +95,14 @@
>         err = XENOMAI_SKINCALL5(__native_muxid,
>                                 __native_cond_wait_prologue, cond, mutex,
>                                 &saved_lockcnt, XN_REALTIME, &timeout);
> -
> +        saved_err = err;
>         while (err == -EINTR)
> -               err = -XENOMAI_SKINCALL2(__native_muxid,
> +               err = XENOMAI_SKINCALL2(__native_muxid,
>                                          __native_cond_wait_epilogue, mutex,
>                                          saved_lockcnt);
>  #endif /* !CONFIG_XENO_FASTSYNCH */
> 
> -       return err;
> +       return saved_err;
>  }
> 
>  int rt_cond_signal(RT_COND *cond)
> 

Yep, the easiest approach for you is to throw away the epilogue error
for now (just loop over -EINTR). That's what 2.4.x did as well. We are
still struggling with proper fixes for all corner cases.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


