xenomai.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* evl_wait_event returning EIDRM
@ 2023-05-01 20:59 Russell Johnson
  2023-05-06 10:08 ` Philippe Gerum
  0 siblings, 1 reply; 2+ messages in thread
From: Russell Johnson @ 2023-05-01 20:59 UTC (permalink / raw)
  To: xenomai; +Cc: Nickolas Heisler

[-- Attachment #1: Type: text/plain, Size: 534 bytes --]

Hello,

We are noticing on occasion evl_wait_event will return an EIDRM error, and
the program will freeze up. I have created a test hosted in github
https://github.com/nickolasheisler/evl_wait_event_error that consistently
reproduces this error. Please take a look at this and let us know if you can
figure out what is going on. Please let me know if you do not have access to
the link, and I will add you. For now, I added rpm@xenomai.org because that
is the only I have seen associated with your correspondence. 

Thanks,

Russell

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 6739 bytes --]

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: evl_wait_event returning EIDRM
  2023-05-01 20:59 evl_wait_event returning EIDRM Russell Johnson
@ 2023-05-06 10:08 ` Philippe Gerum
  0 siblings, 0 replies; 2+ messages in thread
From: Philippe Gerum @ 2023-05-06 10:08 UTC (permalink / raw)
  To: Russell Johnson; +Cc: xenomai, Nickolas Heisler


Russell Johnson <russell.johnson@kratosdefense.com> writes:

> [[S/MIME Signed Part:Undecided]]
> Hello,
>
> We are noticing on occasion evl_wait_event will return an EIDRM error, and
> the program will freeze up. I have created a test hosted in github
> https://github.com/nickolasheisler/evl_wait_event_error that consistently
> reproduces this error. Please take a look at this and let us know if you can
> figure out what is going on. Please let me know if you do not have access to
> the link, and I will add you. For now, I added rpm@xenomai.org because that
> is the only I have seen associated with your correspondence. 
>
> Thanks,
>
> Russell
>
> [[End of S/MIME Signed Part]]

In this particular case, the test code is at fault, not EVL. The patch
below illustrates how to addresses them.

The basic issue of receiving EINTR is caused by assigning both
semaphores to the same global smart pointer (m_dataSemaphore). The
second assignment causes the object dtor to run, which in turn deletes
the underlying EVL event, yielding EIDRM while Thread1 waits on it, as
expected.

Next, there are two causes for the apparent lockup, also in the test
code. The first one is about properly detecting errors in wait loops,
such as semaphore::take(). Failing to do so causes the thread to enter a
runaway loop, since EVL would repeatedly deny the wait syscall after the
event is deleted. Enabling CONFIG_EVL_WATCHDOG would detect such
situations, it consistently triggered when running the test app here.

The other issue is about releasing Thread1 from waiting on the event
before main() leaves, otherwise the app would block in the crt exit
code, waiting for all joinable threads to terminate. I chose to signal
the event in order to force a release, detaching Thread1 when creating
it would be another way.

diff --git a/condition_variable.cpp b/condition_variable.cpp
index 2af9edc..65ec770 100644
--- a/condition_variable.cpp
+++ b/condition_variable.cpp
@@ -20,16 +20,17 @@ condition_variable::~condition_variable()
 }
 
 ///////////////////////////////////////////////////////////////////////////////
-void condition_variable::wait(std::unique_lock<evl::mutex>& lock)
+int condition_variable::wait(std::unique_lock<evl::mutex>& lock)
 {
     int rv = evl_wait_event(&m_cv, lock.mutex()->get_raw_mutex());
 
     // If EIDRM is returned, then the event is no longer valid
     if (rv == -EIDRM)
     {
-        create();
-        evl_printf("evl_wait_event returned EIDRM and a new event was created\n");
+        evl_printf("evl_wait_event returned EIDRM\n");
     }
+
+    return rv;
 }
 
 ///////////////////////////////////////////////////////////////////////////////
@@ -45,9 +46,6 @@ void condition_variable::notify_all()
 ///////////////////////////////////////////////////////////////////////////////
 void condition_variable::create()
 {
-    // Just in case... try to close (ignore errors)
-    evl_close_event(&m_cv);
-
     // Create
     std::string name = "event_" + std::to_string(s_currentCVNum++) + '.' + std::to_string(getpid());
     int rv = evl_create_event(&m_cv, EVL_CLOCK_MONOTONIC, EVL_CLONE_PUBLIC, name.c_str());
diff --git a/condition_variable.h b/condition_variable.h
index f62e422..ca2ab19 100644
--- a/condition_variable.h
+++ b/condition_variable.h
@@ -14,7 +14,7 @@ public:
     condition_variable operator=(const condition_variable&) = delete;
     condition_variable& operator=(condition_variable&&) = default;
     ~condition_variable();
-    void wait(std::unique_lock<evl::mutex>& lock);
+    int wait(std::unique_lock<evl::mutex>& lock);
     void notify_all();
 
 private:
diff --git a/evl_cv_test.cpp b/evl_cv_test.cpp
index 501d045..fbd1e9a 100644
--- a/evl_cv_test.cpp
+++ b/evl_cv_test.cpp
@@ -8,7 +8,8 @@
 static char heap_storage[EVL_HEAP_RAW_SIZE(1024 * 1024)]; /* 1Mb heap */
 static struct evl_heap runtime_heap;
 
-std::unique_ptr<evl::semaphore> m_dataSemaphore;
+std::unique_ptr<evl::semaphore> m_dataSemaphore1;
+std::unique_ptr<evl::semaphore> m_dataSemaphore2;
 
 void* Thread1(void*)
 {
@@ -20,8 +21,8 @@ void* Thread1(void*)
 
     evl_printf("Thread 1 woken up\n");
 
-    m_dataSemaphore = std::unique_ptr<evl::semaphore>(new evl::semaphore());
-    m_dataSemaphore->take();
+    m_dataSemaphore1 = std::unique_ptr<evl::semaphore>(new evl::semaphore());
+    m_dataSemaphore1->take();
 
     evl_printf("Thread 1 Finished\n");
 
@@ -38,8 +39,8 @@ void* Thread2(void*)
 
     evl_printf("Thread 2 woken up\n");
 
-    m_dataSemaphore = std::unique_ptr<evl::semaphore>(new evl::semaphore());
-    m_dataSemaphore->give();
+    m_dataSemaphore2 = std::unique_ptr<evl::semaphore>(new evl::semaphore());
+    m_dataSemaphore2->give();
 
     evl_printf("Thread 2 Finished");
 
@@ -48,9 +49,9 @@ void* Thread2(void*)
 
 int main()
 {
-    // Init EVL
-    int ret = evl_init();
-    if (ret)
+    // Init EVL and attach main() to the core
+    int ret = evl_attach_self("main");
+    if (ret < 0)
     {
         printf("EVL Init failed with error: %d\n", ret);
         return -1;
@@ -86,8 +87,6 @@ int main()
     pthread_attr_setinheritsched(&tattr, PTHREAD_EXPLICIT_SCHED);
     pthread_create(&tid, &tattr, Thread1, NULL);
 
-    sleep(2);
-
     // Thread 2
     pthread_attr_t tattr2;
     sched_param param2;
@@ -107,6 +106,7 @@ int main()
 
     sleep(2);
 
+    m_dataSemaphore1->give();
     pthread_join(tid, NULL);
     pthread_join(tid2, NULL);
 
diff --git a/semaphore.cpp b/semaphore.cpp
index a9204d7..88423e1 100644
--- a/semaphore.cpp
+++ b/semaphore.cpp
@@ -33,12 +33,12 @@ void semaphore::give(unsigned int rcount)
 }
 
 ///////////////////////////////////////////////////////////////////////////////
-void semaphore::take(unsigned int rcount)
+int semaphore::take(unsigned int rcount)
 {
     // Short-circuit on rcount == 0;
     if (!rcount)
     {
-        return;
+        return 0;
     }
 
     // Ensure user isn't over-asking
@@ -51,10 +51,14 @@ void semaphore::take(unsigned int rcount)
 
     while (m_rcount < rcount)
     {
-        m_cv.wait(ulock);
+        int ret = m_cv.wait(ulock);
+	if (ret)
+		return ret;
     }
 
     m_rcount -= rcount;
+
+    return 0;
 }
 
 }
diff --git a/semaphore.h b/semaphore.h
index cbc13b2..54f5123 100644
--- a/semaphore.h
+++ b/semaphore.h
@@ -9,7 +9,7 @@ public:
     semaphore(const semaphore&) = delete;
     semaphore& operator=(const semaphore&) = delete;
     void give(uint32_t rcount = 1);
-    void take(uint32_t rcount = 1);
+    int take(uint32_t rcount = 1);
     
 private:
     evl::mutex m_mutex;

-- 
Philippe.

^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2023-05-06 10:40 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-01 20:59 evl_wait_event returning EIDRM Russell Johnson
2023-05-06 10:08 ` Philippe Gerum

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).