* A potential Xenomai Mutex issue
@ 2019-08-22 18:42 DIAO, Hanson
  2019-08-23  6:47 ` Jan Kiszka
  0 siblings, 1 reply; 11+ messages in thread
From: DIAO, Hanson @ 2019-08-22 18:42 UTC (permalink / raw)
  To: xenomai

Hi all,



I hope you are doing well. I am currently working on a critical deadlock issue with the Xenomai library (version 2.6.4). I found that the Xenomai lock count is not reliable after we call rt_mutex_release. I have included the relevant output below; I hope a developer can help me fix this issue. I know that this version is EOL, but we still use it. Thank you so much.



Issue 1:

Before Mutex Lock:   mutex addr = 0xb7c059e8, count = 0, owner = 0     (status before rt_mutex_acquire)
After Mutex Lock:    mutex addr = 0xb7c059e8, count = 1, owner = 2bd   (status after rt_mutex_acquire; everything is correct in this scenario)

Before Mutex Unlock: mutex addr = 0xb7c059e8, count = 1, owner = 2bd   (status before rt_mutex_release)
After Mutex Unlock:  mutex addr = 0xb7c059e8, count = 1, owner = 0     (status after rt_mutex_release; the lock count does not look correct here)

Issue 2:

When our task takes the lock recursively, the mutex lock count should be more than 1, but it is still 1.
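
To make issue 2 concrete, here is a minimal reproducer sketch of what we expect. The task setup is illustrative only (the names, the priority and the use of rt_printf are my assumptions, not our real application):

#include <rtdk.h>
#include <native/task.h>
#include <native/mutex.h>

int main(void)
{
        RT_TASK self;
        RT_MUTEX m;

        rt_print_auto_init(1);
        /* Attach the calling thread to the native skin, so the
         * user-space fast path is used at all. */
        rt_task_shadow(&self, "lockcnt-test", 50, 0);

        rt_mutex_create(&m, "test-mutex");
        rt_mutex_acquire(&m, TM_INFINITE);      /* first acquisition */
        rt_mutex_acquire(&m, TM_INFINITE);      /* recursive acquisition */

        /* rt_printf rather than printf, so the task is not migrated
         * to secondary mode while holding the mutex. */
        rt_printf("lockcnt = %d\n", m.lockcnt); /* expected 2, we observe 1 */

        rt_mutex_release(&m);
        rt_mutex_release(&m);
        rt_mutex_delete(&m);
        return 0;
}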



For issue 1, I suspect something is wrong in the release function (the code below); I am not sure if it is the root cause.



int rt_mutex_release(RT_MUTEX *mutex)
{
#ifdef CONFIG_XENO_FASTSYNCH
        unsigned long status;
        xnhandle_t cur;

        cur = xeno_get_current();
        if (cur == XN_NO_HANDLE)
                return -EPERM;

        status = xeno_get_current_mode();
        if (unlikely(status & XNOTHER))
                /* See rt_mutex_acquire_inner() */
                goto do_syscall;

        if (unlikely(xnsynch_fast_owner_check(mutex->fastlock, cur) != 0))
                return -EPERM;

        if (mutex->lockcnt > 1) {
                mutex->lockcnt--;
                return 0;
        }

        if (likely(xnsynch_fast_release(mutex->fastlock, cur)))
                return 0;

do_syscall:
#endif /* CONFIG_XENO_FASTSYNCH */

        return XENOMAI_SKINCALL1(__native_muxid, __native_mutex_release, mutex);
}
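
Looking at the fast path above, my (unverified) reading of why the dump still shows count = 1 after a successful release:

        if (likely(xnsynch_fast_release(mutex->fastlock, cur)))
                /* Assumption on my side: lockcnt is left at 1 here and is
                 * only reinitialized by the next successful acquire, so a
                 * dump taken right after release still reports count = 1,
                 * even though the owner word has already been cleared. */
                return 0;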







For the mutex lock function, I am confused by the comment quoted below; I am not sure whether it supports recursive locking.

static int rt_mutex_acquire_inner(RT_MUTEX *mutex, RTIME timeout, xntmode_t mode)
{
        int err;
#ifdef CONFIG_XENO_FASTSYNCH
        unsigned long status;
        xnhandle_t cur;

        cur = xeno_get_current();
        if (cur == XN_NO_HANDLE)
                return -EPERM;

        /*
         * We track resource ownership for non real-time shadows in
         * order to handle the auto-relax feature, so we must always
         * obtain them via a syscall.
         */
        status = xeno_get_current_mode();
        if (unlikely(status & XNOTHER))
                goto do_syscall;

        if (likely(!(status & XNRELAX))) {
                err = xnsynch_fast_acquire(mutex->fastlock, cur);
                if (likely(!err)) {
                        mutex->lockcnt = 1;
                        return 0;
                }

                if (err == -EBUSY) {
                        if (mutex->lockcnt == UINT_MAX)
                                return -EAGAIN;

                        mutex->lockcnt++;
                        return 0;
                }

                if (timeout == TM_NONBLOCK && mode == XN_RELATIVE)
                        return -EWOULDBLOCK;
        } else if (xnsynch_fast_owner_check(mutex->fastlock, cur) == 0) {
                /*
                 * The application is buggy as it jumped to secondary mode
                 * while holding the mutex. Nevertheless, we have to keep the
                 * mutex state consistent.
                 *
                 * We make no efforts to migrate or warn here. There is
                 * XENO_DEBUG(SYNCH_RELAX) to catch such bugs.
                 */
                if (mutex->lockcnt == UINT_MAX)
                        return -EAGAIN;

                mutex->lockcnt++;
                return 0;
        }
do_syscall:
#endif /* CONFIG_XENO_FASTSYNCH */

        err = XENOMAI_SKINCALL3(__native_muxid,
                                __native_mutex_acquire, mutex, mode, &timeout);

#ifdef CONFIG_XENO_FASTSYNCH
        if (!err)
                mutex->lockcnt = 1;
#endif /* CONFIG_XENO_FASTSYNCH */

        return err;
}
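
For what it is worth, my (possibly wrong) reading of the recursion handling in this function:

        /* Fast path: xnsynch_fast_acquire() returning -EBUSY appears to be
         * the recursive case (the owner word already holds our handle), and
         * lockcnt is incremented there, so recursion looks supported.
         *
         * Syscall path: on success, lockcnt is unconditionally reset to 1,
         * even if this was a recursive re-acquisition. My assumption: if a
         * task ever reaches the do_syscall branch (XNOTHER tasks, a relaxed
         * task, or contention), the recursion depth recorded in the
         * user-space lockcnt is lost. */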







* Re: A potential Xenomai Mutex issue
  2019-08-22 18:42 A potential Xenomai Mutex issue DIAO, Hanson
@ 2019-08-23  6:47 ` Jan Kiszka
  2019-08-23 14:02   ` DIAO, Hanson
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Kiszka @ 2019-08-23  6:47 UTC (permalink / raw)
  To: DIAO, Hanson, xenomai

On 22.08.19 20:42, DIAO, Hanson via Xenomai wrote:
> Hi all,
> 
> 
> 
> I hope you are doing well. I am currently working on a critical deadlock issue with the Xenomai library (version 2.6.4). I found that the Xenomai lock count is not reliable after we call rt_mutex_release. I have included the relevant output below; I hope a developer can help me fix this issue. I know that this version is EOL, but we still use it. Thank you so much.
> 

This is on an ARMv7 multicore target, right? Are you already able to reproduce 
the issue reliably, possibly in a synthetic environment? Or does your whole 
stack have to run on the target for a long time to trigger this? Is the mutex 
shared between multiple processes or just between threads of the same process?

Next point: you are on 2.6.4 while the last release was 2.6.5. It contained e.g. 
8047147aff9d (posix/mutex: handle recursion count completely in user-space). 
Maybe something analogous was needed for native as well. And then you could 
look at what happened mutex-wise in 3.x to check that you are not missing a 
conceptual fix in 2.6.
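
For example, something along these lines should list the candidate changes (paths per the 2.6 tree layout, adjust as needed):

  # mutex-related changes between the two 2.6 releases
  git log --oneline v2.6.4..v2.6.5 -- src/skin/posix/mutex.c src/skin/native/mutex.c

  # and the later history of the unified 3.x implementation
  git log --oneline -- lib/cobalt/mutex.c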

> 
> 
> Issue 1:
> 
> Before Mutex Lock:   mutex addr = 0xb7c059e8, count = 0, owner = 0     (status before rt_mutex_acquire)
> After Mutex Lock:    mutex addr = 0xb7c059e8, count = 1, owner = 2bd   (status after rt_mutex_acquire; everything is correct in this scenario)
> 
> Before Mutex Unlock: mutex addr = 0xb7c059e8, count = 1, owner = 2bd   (status before rt_mutex_release)
> After Mutex Unlock:  mutex addr = 0xb7c059e8, count = 1, owner = 0     (status after rt_mutex_release; the lock count does not look correct here)
> 

You seem to be looking at the wrong data structure. You need to examine the 
RT_MUTEX_PLACEHOLDER fields.

> 
> 
> Issue 2:
> 
> When our task takes the lock recursively, the mutex lock count should be more than 1, but it is still 1.
> 
> 
> 
> For issue 1, I suspect something is wrong in the release function (the code below); I am not sure if it is the root cause.
> 

Don't use HTML emails on public lists. They often get filtered, at the latest 
on the receiver side.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* RE: A potential Xenomai Mutex issue
  2019-08-23  6:47 ` Jan Kiszka
@ 2019-08-23 14:02   ` DIAO, Hanson
  2019-08-23 14:16     ` Jan Kiszka
  2019-08-23 14:18     ` DIAO, Hanson
  0 siblings, 2 replies; 11+ messages in thread
From: DIAO, Hanson @ 2019-08-23 14:02 UTC (permalink / raw)
  To: jan.kiszka; +Cc: xenomai

Hi Jan,

Thank you for your reply. I will answer the questions one by one.

Q: This is on an ARMv7 multicore target, right?
HD: This is a PowerPC target.

Q: Are you already able to reproduce the issue reliably, possibly in a synthetic environment?
HD: It reproduces every time, for both the first issue and the second issue (the recursive lock count should be more than 1).

Q: Or does your whole stack have to run on the target for a long time to trigger this?
HD: I got this issue while the system was in the initialization stage. It is easy to trigger, and it happens every time.

Q: Is the mutex shared between multiple processes or just between threads of the same process?
HD: The mutex is shared only within one process with multiple tasks.

Q: Maybe something analogous was needed for native as well. And then you could look at what happened mutex-wise in 3.x to check that you are not missing a conceptual fix in 2.6.
HD: I will check the commit message. I compared version 2.6.4 with version 2.6.5; the user-space mutex code seems to be the same.

Q: You seem to be looking at the wrong data structure. You need to examine the RT_MUTEX_PLACEHOLDER fields.
HD: The data structure I am looking at is the RT_MUTEX_PLACEHOLDER; I attached the code below.

typedef struct rt_mutex_placeholder {

        xnhandle_t opaque;

#ifdef CONFIG_XENO_FASTSYNCH
        xnarch_atomic_t *fastlock;

        int lockcnt;
#endif /* CONFIG_XENO_FASTSYNCH */

} RT_MUTEX_PLACEHOLDER;



* Re: A potential Xenomai Mutex issue
  2019-08-23 14:02   ` DIAO, Hanson
@ 2019-08-23 14:16     ` Jan Kiszka
  2019-08-23 14:29       ` DIAO, Hanson
  2019-08-23 14:18     ` DIAO, Hanson
  1 sibling, 1 reply; 11+ messages in thread
From: Jan Kiszka @ 2019-08-23 14:16 UTC (permalink / raw)
  To: DIAO, Hanson (DI PA CI RC R&D SW2); +Cc: xenomai

On 23.08.19 16:02, DIAO, Hanson (DI PA CI RC R&D SW2) wrote:
> Hi Jan,
> 
> Thank you for your reply. I will answer the questions one by one.
> 
> Q: This is on an ARMv7 multicore target, right?
> HD: This is a PowerPC target.
> 
> Q: Are you already able to reproduce the issue reliably, possibly in a synthetic environment?
> HD: It reproduces every time, for both the first issue and the second issue (the recursive lock count should be more than 1).
> 

Your dump was talking about "count = 1", but the counter variable is called 
"lockcnt".

> Q: Or does your whole stack have to run on the target for a long time to trigger this?
> HD: I got this issue while the system was in the initialization stage. It is easy to trigger, and it happens every time.

Can you extract a simple (and, thus, shareable) test case from that?

> 
> Q: Is the mutex shared between multiple processes or just between threads of the same process?
> HD: The mutex is shared only within one process with multiple tasks.
> 
> Q: Maybe something analogous was needed for native as well. And then you could look at what happened mutex-wise in 3.x to check that you are not missing a conceptual fix in 2.6.
> HD: I will check the commit message. I compared version 2.6.4 with version 2.6.5; the user-space mutex code seems to be the same.

Yes, the mutex patch in 2.6.5 was targeting only the posix skin. I didn't look 
into the details, but maybe that fix should have been applied to the native 
implementation as well. The problem is that at that point development had moved 
on to 3.x, where there is only one implementation (which received such a fix as well).

> 
> Q: You seem to be looking at the wrong data structure. You need to examine the RT_MUTEX_PLACEHOLDER fields.
> HD: The data structure I am looking at is the RT_MUTEX_PLACEHOLDER; I attached the code below.
> 
> typedef struct rt_mutex_placeholder {
> 
>          xnhandle_t opaque;
> 
> #ifdef CONFIG_XENO_FASTSYNCH
>          xnarch_atomic_t *fastlock;
> 
>          int lockcnt;
> #endif /* CONFIG_XENO_FASTSYNCH */
> 
> } RT_MUTEX_PLACEHOLDER;

See my remark above on lockcnt.

Jan


-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* RE: A potential Xenomai Mutex issue
  2019-08-23 14:02   ` DIAO, Hanson
  2019-08-23 14:16     ` Jan Kiszka
@ 2019-08-23 14:18     ` DIAO, Hanson
  1 sibling, 0 replies; 11+ messages in thread
From: DIAO, Hanson @ 2019-08-23 14:18 UTC (permalink / raw)
  To: jan.kiszka; +Cc: xenomai

Hi Jan,

I checked the commit (8047147aff9d). Our platform uses a different mutex implementation (./src/skin/native/mutex.c), and that commit only touches the POSIX code. Thank you so much.

Best Regards,
Hanson Diao



* RE: A potential Xenomai Mutex issue
  2019-08-23 14:16     ` Jan Kiszka
@ 2019-08-23 14:29       ` DIAO, Hanson
  2019-08-23 15:23         ` Jan Kiszka
  0 siblings, 1 reply; 11+ messages in thread
From: DIAO, Hanson @ 2019-08-23 14:29 UTC (permalink / raw)
  To: jan.kiszka; +Cc: xenomai

Hi Jan,

I attached my code here. This is only the lock function; the unlock function is similar.


        printf("Before Mutex Lock Mutext addr = %p,count = %d, owner = %x\n",
                mpMutex,
                mpMutex->lockcnt,
                xnarch_atomic_get(mpMutex->fastlock));
        int err = rt_mutex_acquire( mpMutex, (RTIME)TM_INFINITE );
        if ( err &&
                 // During boot-up and shutdown we run single-threaded
                 // so there is no need to lock an semaphore.
                 !(rc_system_state() != SYSTEM_RUNNING && err == -EPERM) )
        {
                rc_xeno_log( LOG_ERROR , "rt_mutex_acquire" , err);
        }
        printf("After Mutex Lock Mutext addr = %p,count = %d, owner = %x\n",
                mpMutex,
                mpMutex->lockcnt,
                xnarch_atomic_get(mpMutex->fastlock));


For issue 2 the test case is very simple. The lock sequence is as below; after the ReadReg function I checked the lockcnt, and it is 1.
int writeReg()
{
     SMILock();
     ReadReg();
     ......
     SMIUnlock();
}

int ReadReg()
{
     SMILock();
     ........
     ......
     SMIUnlock();
}



* Re: A potential Xenomai Mutex issue
  2019-08-23 14:29       ` DIAO, Hanson
@ 2019-08-23 15:23         ` Jan Kiszka
  2019-08-23 15:49           ` DIAO, Hanson
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Kiszka @ 2019-08-23 15:23 UTC (permalink / raw)
  To: DIAO, Hanson (DI PA CI RC R&D SW2); +Cc: xenomai

On 23.08.19 16:29, DIAO, Hanson (DI PA CI RC R&D SW2) wrote:
> Hi Jan,
> 
> I attached my code here. This is only the lock function; the unlock function is similar.
> 
> 
>          printf("Before Mutex Lock Mutext addr = %p,count = %d, owner = %x\n",
>                  mpMutex,
>                  mpMutex->lockcnt,
>                  xnarch_atomic_get(mpMutex->fastlock));
>          int err = rt_mutex_acquire( mpMutex, (RTIME)TM_INFINITE );
>          if ( err &&
>                   // During boot-up and shutdown we run single-threaded
>                   // so there is no need to lock an semaphore.
>                   !(rc_system_state() != SYSTEM_RUNNING && err == -EPERM) )
>          {
>                  rc_xeno_log( LOG_ERROR , "rt_mutex_acquire" , err);
>          }
>          printf("After Mutex Lock Mutext addr = %p,count = %d, owner = %x\n",
>                  mpMutex,
>                  mpMutex->lockcnt,
>                  xnarch_atomic_get(mpMutex->fastlock));

OK, now I understand the relation between "count" and "lockcnt". Thanks.

Again, for the deadlock case, can you reproduce it with synthetic patterns and 
share them?

> 
> 
> For issue 2 the test case is very simple. The lock sequence is as below; after the ReadReg function I checked the lockcnt, and it is 1.
> int writeReg()
> {
>       SMILock();
>       ReadReg();

So you are reading lockcnt here? Then 1 is obviously the expected value. Is 
owner (fastlock) 0 here?

>       ......
>       SMIUnlock();
> }
> 
> int ReadReg()
> {
>       SMILock();

If you read it here, it should be 2 in a recursive case.

Jan

>       ........
>       ......
>       SMIUnlock();
> }
> 

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* RE: A potential Xenomai Mutex issue
  2019-08-23 15:23         ` Jan Kiszka
@ 2019-08-23 15:49           ` DIAO, Hanson
  2019-08-23 16:18             ` Jan Kiszka
  0 siblings, 1 reply; 11+ messages in thread
From: DIAO, Hanson @ 2019-08-23 15:49 UTC (permalink / raw)
  To: jan.kiszka; +Cc: xenomai

Hi Jan,

Thank you for your reply. Please see my following comments. Thank you so much.


For issue 2 the test case is very simple; the lock sequence is as below. Inside ReadReg, after the second SMILock, I checked the lockcnt: it is still 1.
int writeReg()
{
     SMILock();
     ReadReg();
     ......
     SMIUnlock();
}

int ReadReg()
{
     SMILock();
     /*  Check the lockcnt here. It is still 1; it should be 2. */
     ........
     ......
     SMIUnlock();
}



* Re: A potential Xenomai Mutex issue
  2019-08-23 15:49           ` DIAO, Hanson
@ 2019-08-23 16:18             ` Jan Kiszka
  2019-08-23 17:27               ` DIAO, Hanson
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Kiszka @ 2019-08-23 16:18 UTC (permalink / raw)
  To: DIAO, Hanson (DI PA CI RC R&D SW2); +Cc: xenomai

On 23.08.19 17:49, DIAO, Hanson (DI PA CI RC R&D SW2) wrote:
> Hi Jan,
> 
> Thank you for your reply. Please see my following comments. Thank you so much.
> 
> 
> For issue 2 the test case is very simple; the lock sequence is as below. Inside ReadReg, after the second SMILock, I checked the lockcnt: it is still 1.
> int writeReg()
> {
>       SMILock();
>       ReadReg();
>       ......
>       SMIUnlock();
> }
> 
> int ReadReg()
> {
>       SMILock();
>       /*  Check the lockcnt here. It is still 1; it should be 2. */
>       ........
>       ......
>       SMIUnlock();
> }

Is the thread doing any migrations to secondary mode between the entry of 
writeReg and the checking of lockcnt?
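
One way to check, as a sketch (native skin assumed, handler body illustrative): enable T_WARNSW, so the task receives SIGXCPU on any migration to secondary mode:

#include <signal.h>
#include <stdio.h>
#include <native/task.h>

static void warn_on_relax(int sig)
{
        /* Invoked whenever this task migrates to secondary mode. */
        fprintf(stderr, "secondary-mode migration detected\n");
}

/* ... in the task, before taking the locks: */
        signal(SIGXCPU, warn_on_relax);
        rt_task_set_mode(0, T_WARNSW, NULL);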

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* RE: A potential Xenomai Mutex issue
  2019-08-23 16:18             ` Jan Kiszka
@ 2019-08-23 17:27               ` DIAO, Hanson
  2019-08-23 18:17                 ` Jan Kiszka
  0 siblings, 1 reply; 11+ messages in thread
From: DIAO, Hanson @ 2019-08-23 17:27 UTC (permalink / raw)
  To: jan.kiszka; +Cc: xenomai

No, the thread was not doing any migrations to secondary mode.



* Re: A potential Xenomai Mutex issue
  2019-08-23 17:27               ` DIAO, Hanson
@ 2019-08-23 18:17                 ` Jan Kiszka
  0 siblings, 0 replies; 11+ messages in thread
From: Jan Kiszka @ 2019-08-23 18:17 UTC (permalink / raw)
  To: DIAO, Hanson (DI PA CI RC R&D SW2); +Cc: xenomai

On 23.08.19 19:27, DIAO, Hanson (DI PA CI RC R&D SW2) wrote:
> No, the thread was not doing any migrations to secondary mode.
> 

Then I would suggest using a debugger to step through this fairly simple, 
nicely single-threaded case. That should reveal whether you are really taking 
the same lock, or why rt_mutex_acquire decides not to increment lockcnt.
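
For instance (binary and toolchain names are placeholders), with gdbserver on the target:

  # on the target
  gdbserver :2345 ./your_app

  # on the host
  powerpc-linux-gnu-gdb ./your_app
  (gdb) target remote <target-ip>:2345
  (gdb) break rt_mutex_acquire
  (gdb) continue
  # at each hit, inspect the placeholder fields:
  (gdb) print mutex->lockcnt
  (gdb) print *mutex->fastlock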

Jan


-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



end of thread (newest: 2019-08-23 18:17 UTC)

Thread overview: 11+ messages
-- links below jump to the message on this page --
2019-08-22 18:42 A potential Xenomai Mutex issue DIAO, Hanson
2019-08-23  6:47 ` Jan Kiszka
2019-08-23 14:02   ` DIAO, Hanson
2019-08-23 14:16     ` Jan Kiszka
2019-08-23 14:29       ` DIAO, Hanson
2019-08-23 15:23         ` Jan Kiszka
2019-08-23 15:49           ` DIAO, Hanson
2019-08-23 16:18             ` Jan Kiszka
2019-08-23 17:27               ` DIAO, Hanson
2019-08-23 18:17                 ` Jan Kiszka
2019-08-23 14:18     ` DIAO, Hanson
