* [PATCH v2 1/3] drivers/testing: heapcheck: silence UMR warning (false positive)
@ 2019-11-28 10:02 Philippe Gerum
  2019-11-28 10:02 ` [PATCH v2 2/3] cobalt/process: exit_thread handlers should reschedule as needed Philippe Gerum
  2019-11-28 10:02 ` [PATCH v2 3/3] cobalt/sched: guard against missed rescheduling upon CPU migration Philippe Gerum
  0 siblings, 2 replies; 8+ messages in thread
From: Philippe Gerum @ 2019-11-28 10:02 UTC (permalink / raw)
  To: xenomai

From: Philippe Gerum <rpm@xenomai.org>

Signed-off-by: Philippe Gerum <rpm@xenomai.org>
---
 kernel/drivers/testing/heapcheck.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/drivers/testing/heapcheck.c b/kernel/drivers/testing/heapcheck.c
index 4eb6a5aae..023a06ca6 100644
--- a/kernel/drivers/testing/heapcheck.c
+++ b/kernel/drivers/testing/heapcheck.c
@@ -449,7 +449,7 @@ static int heapcheck_ioctl(struct rtdm_fd *fd,
 	struct rttst_heap_stathdr sthdr;
 	struct rttst_heap_parms parms;
 	int ret;
-	
+
 	switch (request) {
 	case RTTST_RTIOC_HEAP_CHECK:
 		ret = rtdm_copy_from_user(fd, &parms, arg, sizeof(parms));
@@ -464,6 +464,7 @@ static int heapcheck_ioctl(struct rtdm_fd *fd,
 		ret = rtdm_copy_to_user(fd, arg, &parms, sizeof(parms));
 		break;
 	case RTTST_RTIOC_HEAP_STAT_COLLECT:
+		sthdr.buf = NULL;
 		ret = rtdm_copy_from_user(fd, &sthdr, arg, sizeof(sthdr));
 		if (ret)
 			return ret;
-- 
2.21.0




* [PATCH v2 2/3] cobalt/process: exit_thread handlers should reschedule as needed
  2019-11-28 10:02 [PATCH v2 1/3] drivers/testing: heapcheck: silence UMR warning (false positive) Philippe Gerum
@ 2019-11-28 10:02 ` Philippe Gerum
  2019-11-28 10:02 ` [PATCH v2 3/3] cobalt/sched: guard against missed rescheduling upon CPU migration Philippe Gerum
  1 sibling, 0 replies; 8+ messages in thread
From: Philippe Gerum @ 2019-11-28 10:02 UTC (permalink / raw)
  To: xenomai

From: Philippe Gerum <rpm@xenomai.org>

An exit_thread handler wants to be called from the root domain, with
preemption enabled and hard irqs on, in order to keep all options
open, such as using regular sleeping locks.

If such a handler updates the Cobalt scheduler state, it has to
trigger the rescheduling procedure (xnsched_run()) internally as well,
grouping the state changes and the rescheduling call into the same
interrupt-free section, in order to guard against CPU migration.

Relying on the core to kick that procedure later on in order to
commit the pending changes is unreliable.
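
For illustration, the pattern such a handler is expected to follow
would look roughly like the following sketch (not code from the tree;
it reuses the nklock helpers and xnsched_run() visible in the diff):

	spl_t s;

	xnlock_get_irqsave(&nklock, s);
	/* ... update the Cobalt scheduler state ... */
	xnsched_run();	/* commit the changes right here, not later */
	xnlock_put_irqrestore(&nklock, s);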

Signed-off-by: Philippe Gerum <rpm@xenomai.org>
---
 kernel/cobalt/posix/process.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/cobalt/posix/process.c b/kernel/cobalt/posix/process.c
index 90949e519..8351d28fb 100644
--- a/kernel/cobalt/posix/process.c
+++ b/kernel/cobalt/posix/process.c
@@ -1144,10 +1144,11 @@ static void __handle_taskexit_event(struct task_struct *p)
 	if (xnthread_test_state(thread, XNSSTEP))
 		cobalt_unregister_debugged_thread(thread);
 
+	xnsched_run();
+
 	xnlock_put_irqrestore(&nklock, s);
 
 	xnthread_run_handler_stack(thread, exit_thread);
-	xnsched_run();
 
 	if (xnthread_test_state(thread, XNUSER)) {
 		cobalt_umm_free(&cobalt_kernel_ppd.umm, thread->u_window);
-- 
2.21.0




* [PATCH v2 3/3] cobalt/sched: guard against missed rescheduling upon CPU migration
  2019-11-28 10:02 [PATCH v2 1/3] drivers/testing: heapcheck: silence UMR warning (false positive) Philippe Gerum
  2019-11-28 10:02 ` [PATCH v2 2/3] cobalt/process: exit_thread handlers should reschedule as needed Philippe Gerum
@ 2019-11-28 10:02 ` Philippe Gerum
  2019-11-29 18:04   ` Jan Kiszka
  1 sibling, 1 reply; 8+ messages in thread
From: Philippe Gerum @ 2019-11-28 10:02 UTC (permalink / raw)
  To: xenomai

From: Philippe Gerum <rpm@xenomai.org>

Kicking the rescheduling procedure (xnsched_run()) from call sites
running over the root thread with hard irqs enabled may lead to
missing a rescheduling opportunity as a consequence of a CPU
migration, e.g.:

    CPU0					CPU1
    ----                         		----
    t0: ...
            <context switch>
    t1: xnthread_resume(t0)
             xnsched_run()
             <CPU MIGRATION -->
                  |                             |
                  |                             |
                  x                             |
          (missed rescheduling)                 v
                                    ___xnsched_run(CPU1.sched)

To address this issue, make sure every call to xnsched_run() either
runs from the head domain which is not affected by CPU migration, or
moves into the latest atomic section which might have changed the
scheduler state.

Turning on CONFIG_XENO_OPT_DEBUG_COBALT also enables a (oneshot)
assertion which detects invalid calls from the root domain with hard
irqs enabled.
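
In practice, this boils down to one of the two call patterns sketched
below (hypothetical call site; the helpers are the ones used in the
hunks that follow):

	spl_t s;

	/* Either mask hard irqs around both the state change and the
	 * rescheduling call, which rules out any CPU migration: */
	splhigh(s);
	/* ... change the scheduler state ... */
	xnsched_run();
	splexit(s);

	/* ... or group xnsched_run() with the latest atomic section
	 * which changed the scheduler state: */
	xnlock_get_irqsave(&nklock, s);
	/* ... change the scheduler state ... */
	xnsched_run();
	xnlock_put_irqrestore(&nklock, s);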

Signed-off-by: Philippe Gerum <rpm@xenomai.org>
---
 include/cobalt/kernel/rtdm/driver.h | 11 ++++++++-
 include/cobalt/kernel/sched.h       | 36 ++++++-----------------------
 kernel/cobalt/posix/mqueue.c        | 13 ++++-------
 kernel/cobalt/posix/timerfd.c       |  9 +++-----
 kernel/cobalt/sched.c               | 20 ++++++++++++----
 kernel/cobalt/select.c              |  6 ++---
 kernel/cobalt/thread.c              |  4 ++--
 7 files changed, 44 insertions(+), 55 deletions(-)

diff --git a/include/cobalt/kernel/rtdm/driver.h b/include/cobalt/kernel/rtdm/driver.h
index d3d5d6b27..57495f12f 100644
--- a/include/cobalt/kernel/rtdm/driver.h
+++ b/include/cobalt/kernel/rtdm/driver.h
@@ -1044,8 +1044,12 @@ static inline void __deprecated rtdm_task_join_nrt(rtdm_task_t *task,
 static inline void rtdm_task_set_priority(rtdm_task_t *task, int priority)
 {
 	union xnsched_policy_param param = { .rt = { .prio = priority } };
+	spl_t s;
+
+	splhigh(s);
 	xnthread_set_schedparam(task, &xnsched_class_rt, &param);
 	xnsched_run();
+	splexit(s);
 }
 
 static inline int rtdm_task_set_period(rtdm_task_t *task,
@@ -1062,9 +1066,14 @@ static inline int rtdm_task_set_period(rtdm_task_t *task,
 
 static inline int rtdm_task_unblock(rtdm_task_t *task)
 {
-	int res = xnthread_unblock(task);
+	spl_t s;
+	int res;
 
+	splhigh(s);
+	res = xnthread_unblock(task);
 	xnsched_run();
+	splexit(s);
+
 	return res;
 }
 
diff --git a/include/cobalt/kernel/sched.h b/include/cobalt/kernel/sched.h
index 3b0825d67..7fc11a198 100644
--- a/include/cobalt/kernel/sched.h
+++ b/include/cobalt/kernel/sched.h
@@ -298,25 +298,8 @@ void __xnsched_run_handler(void);
 static inline int __xnsched_run(struct xnsched *sched)
 {
 	/*
-	 * NOTE: Since ___xnsched_run() won't run immediately if an
-	 * escalation to primary domain is needed, we won't use
-	 * critical scheduler information before we actually run in
-	 * primary mode; therefore we can first test the scheduler
-	 * status then escalate.
-	 *
-	 * Running in the primary domain means that no Linux-triggered
-	 * CPU migration may occur from that point either. Finally,
-	 * since migration is always a self-directed operation for
-	 * Xenomai threads, we can safely read the scheduler state
-	 * bits without holding the nklock.
-	 *
-	 * Said differently, if we race here because of a CPU
-	 * migration, it must have been Linux-triggered because we run
-	 * in secondary mode; in which case we will escalate to the
-	 * primary domain, then unwind the current call frame without
-	 * running the rescheduling procedure in
-	 * ___xnsched_run(). Therefore, the scheduler slot
-	 * (i.e. "sched") will be either valid, or unused.
+	 * Reschedule if XNSCHED is pending, but never over an IRQ
+	 * handler or in the middle of unlocked context switch.
 	 */
 	if (((sched->status|sched->lflags) &
 	     (XNINIRQ|XNINSW|XNRESCHED)) != XNRESCHED)
@@ -328,18 +311,13 @@ static inline int __xnsched_run(struct xnsched *sched)
 static inline int xnsched_run(void)
 {
 	struct xnsched *sched = xnsched_current();
+	struct xnthread *curr = READ_ONCE(sched->curr);
+
 	/*
-	 * No rescheduling is possible, either if:
-	 *
-	 * - the current thread holds the scheduler lock
-	 * - an ISR context is active
-	 * - we are caught in the middle of an unlocked context switch.
+	 * If running over the root thread, hard irqs must be off
+	 * (asserted out of line in ___xnsched_run()).
 	 */
-	smp_rmb();
-	if (unlikely(sched->curr->lock_count > 0))
-		return 0;
-
-	return __xnsched_run(sched);
+	return curr->lock_count > 0 ? 0 : __xnsched_run(sched);
 }
 
 void xnsched_lock(void);
diff --git a/kernel/cobalt/posix/mqueue.c b/kernel/cobalt/posix/mqueue.c
index 5c644b2c4..cd1cf3ce8 100644
--- a/kernel/cobalt/posix/mqueue.c
+++ b/kernel/cobalt/posix/mqueue.c
@@ -158,22 +158,19 @@ static inline int mq_init(struct cobalt_mq *mq, const struct mq_attr *attr)
 
 static inline void mq_destroy(struct cobalt_mq *mq)
 {
-	int resched;
 	spl_t s;
 
 	xnlock_get_irqsave(&nklock, s);
-	resched = (xnsynch_destroy(&mq->receivers) == XNSYNCH_RESCHED);
-	resched = (xnsynch_destroy(&mq->senders) == XNSYNCH_RESCHED) || resched;
+	xnsynch_destroy(&mq->receivers);
+	xnsynch_destroy(&mq->senders);
 	list_del(&mq->link);
+	xnsched_run();
 	xnlock_put_irqrestore(&nklock, s);
-	xnselect_destroy(&mq->read_select);
-	xnselect_destroy(&mq->write_select);
+	xnselect_destroy(&mq->read_select); /* Reschedules. */
+	xnselect_destroy(&mq->write_select); /* Ditto. */
 	xnregistry_remove(mq->handle);
 	xnheap_vfree(mq->mem);
 	kfree(mq);
-
-	if (resched)
-		xnsched_run();
 }
 
 static int mq_unref_inner(struct cobalt_mq *mq, spl_t s)
diff --git a/kernel/cobalt/posix/timerfd.c b/kernel/cobalt/posix/timerfd.c
index 610206047..29046cfec 100644
--- a/kernel/cobalt/posix/timerfd.c
+++ b/kernel/cobalt/posix/timerfd.c
@@ -131,18 +131,15 @@ timerfd_select(struct rtdm_fd *fd, struct xnselector *selector,
 static void timerfd_close(struct rtdm_fd *fd)
 {
 	struct cobalt_tfd *tfd = container_of(fd, struct cobalt_tfd, fd);
-	int resched;
 	spl_t s;
 
 	xnlock_get_irqsave(&nklock, s);
 	xntimer_destroy(&tfd->timer);
-	resched = xnsynch_destroy(&tfd->readers) == XNSYNCH_RESCHED;
+	xnsynch_destroy(&tfd->readers);
+	xnsched_run();
 	xnlock_put_irqrestore(&nklock, s);
-	xnselect_destroy(&tfd->read_select);
+	xnselect_destroy(&tfd->read_select); /* Reschedules. */
 	xnfree(tfd);
-
-	if (resched)
-		xnsched_run();
 }
 
 static struct rtdm_fd_ops timerfd_ops = {
diff --git a/kernel/cobalt/sched.c b/kernel/cobalt/sched.c
index 74dc7a961..f7e662a38 100644
--- a/kernel/cobalt/sched.c
+++ b/kernel/cobalt/sched.c
@@ -376,10 +376,11 @@ struct xnsched *xnsched_finish_unlocked_switch(struct xnsched *sched)
 void xnsched_lock(void)
 {
 	struct xnsched *sched = xnsched_current();
-	struct xnthread *curr = sched->curr;
+	struct xnthread *curr = READ_ONCE(sched->curr);
 
 	if (sched->lflags & XNINIRQ)
 		return;
+
 	/*
 	 * CAUTION: The fast xnthread_current() accessor carries the
 	 * relevant lock nesting count only if current runs in primary
@@ -398,14 +399,17 @@ EXPORT_SYMBOL_GPL(xnsched_lock);
 void xnsched_unlock(void)
 {
 	struct xnsched *sched = xnsched_current();
-	struct xnthread *curr = sched->curr;
+	struct xnthread *curr = READ_ONCE(sched->curr);
+
+	XENO_WARN_ON_ONCE(COBALT, (curr->state & XNROOT) &&
+			  !hard_irqs_disabled());
 
 	if (sched->lflags & XNINIRQ)
 		return;
-	
+
 	if (!XENO_ASSERT(COBALT, curr->lock_count > 0))
 		return;
-	
+
 	if (--curr->lock_count == 0) {
 		xnthread_clear_localinfo(curr, XNLBALERT);
 		xnsched_run();
@@ -962,6 +966,8 @@ int ___xnsched_run(struct xnsched *sched)
 	int switched, shadow;
 	spl_t s;
 
+	XENO_WARN_ON_ONCE(COBALT, !hard_irqs_disabled() && ipipe_root_p);
+
 	if (xnarch_escalate())
 		return 0;
 
@@ -1000,7 +1006,11 @@ reschedule:
 
 	trace_cobalt_switch_context(prev, next);
 
-	sched->curr = next;
+	/*
+	 * sched->curr is shared locklessly with xnsched_run(), be
+	 * careful about load/store tearing.
+	 */
+	WRITE_ONCE(sched->curr, next);
 	shadow = 1;
 
 	if (xnthread_test_state(prev, XNROOT)) {
diff --git a/kernel/cobalt/select.c b/kernel/cobalt/select.c
index 7f839f159..df3d960db 100644
--- a/kernel/cobalt/select.c
+++ b/kernel/cobalt/select.c
@@ -409,7 +409,6 @@ static void xnselector_destroy_loop(void *cookie)
 	struct xnselect_binding *binding, *tmpb;
 	struct xnselector *selector, *tmps;
 	struct xnselect *fd;
-	int resched;
 	spl_t s;
 
 	xnlock_get_irqsave(&nklock, s);
@@ -430,12 +429,11 @@ static void xnselector_destroy_loop(void *cookie)
 			xnlock_get_irqsave(&nklock, s);
 		}
 	release:
-		resched = xnsynch_destroy(&selector->synchbase) == XNSYNCH_RESCHED;
+		xnsynch_destroy(&selector->synchbase);
+		xnsched_run();
 		xnlock_put_irqrestore(&nklock, s);
 
 		xnfree(selector);
-		if (resched)
-			xnsched_run();
 
 		xnlock_get_irqsave(&nklock, s);
 	}
diff --git a/kernel/cobalt/thread.c b/kernel/cobalt/thread.c
index a5f8e1b00..5260fb1e7 100644
--- a/kernel/cobalt/thread.c
+++ b/kernel/cobalt/thread.c
@@ -1582,9 +1582,9 @@ check_self_cancel:
 	} else
 		__xnthread_kick(thread);
 out:
-	xnlock_put_irqrestore(&nklock, s);
-
 	xnsched_run();
+
+	xnlock_put_irqrestore(&nklock, s);
 }
 EXPORT_SYMBOL_GPL(xnthread_cancel);
 
-- 
2.21.0




* Re: [PATCH v2 3/3] cobalt/sched: guard against missed rescheduling upon CPU migration
  2019-11-28 10:02 ` [PATCH v2 3/3] cobalt/sched: guard against missed rescheduling upon CPU migration Philippe Gerum
@ 2019-11-29 18:04   ` Jan Kiszka
  2019-11-29 18:37     ` Philippe Gerum
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Kiszka @ 2019-11-29 18:04 UTC (permalink / raw)
  To: Philippe Gerum, xenomai

On 28.11.19 11:02, Philippe Gerum via Xenomai wrote:
> From: Philippe Gerum <rpm@xenomai.org>
> 
> Kicking the rescheduling procedure (xnsched_run()) from call sites
> running over the root thread with hard irqs enabled may lead to
> missing a rescheduling opportunity as a consequence of a CPU
> migration, e.g.:
> 
>      CPU0					CPU1
>      ----                         		----
>      t0: ...
>              <context switch>
>      t1: xnthread_resume(t0)
>               xnsched_run()
>               <CPU MIGRATION -->
>                    |                             |
>                    |                             |
>                    x                             |
>            (missed rescheduling)                 v
>                                      ___xnsched_run(CPU1.sched)
> 
> To address this issue, make sure every call to xnsched_run() either
> runs from the head domain which is not affected by CPU migration, or
> moves into the latest atomic section which might have changed the
> scheduler state.
> 
> Turning on CONFIG_XENO_OPT_DEBUG_COBALT also enables a (oneshot)
> assertion which detects invalid calls from the root domain with hard
> irqs enabled.
> 
> Signed-off-by: Philippe Gerum <rpm@xenomai.org>
> ---
>   include/cobalt/kernel/rtdm/driver.h | 11 ++++++++-
>   include/cobalt/kernel/sched.h       | 36 ++++++-----------------------
>   kernel/cobalt/posix/mqueue.c        | 13 ++++-------
>   kernel/cobalt/posix/timerfd.c       |  9 +++-----
>   kernel/cobalt/sched.c               | 20 ++++++++++++----
>   kernel/cobalt/select.c              |  6 ++---
>   kernel/cobalt/thread.c              |  4 ++--
>   7 files changed, 44 insertions(+), 55 deletions(-)
> 
> diff --git a/include/cobalt/kernel/rtdm/driver.h b/include/cobalt/kernel/rtdm/driver.h
> index d3d5d6b27..57495f12f 100644
> --- a/include/cobalt/kernel/rtdm/driver.h
> +++ b/include/cobalt/kernel/rtdm/driver.h
> @@ -1044,8 +1044,12 @@ static inline void __deprecated rtdm_task_join_nrt(rtdm_task_t *task,
>   static inline void rtdm_task_set_priority(rtdm_task_t *task, int priority)
>   {
>   	union xnsched_policy_param param = { .rt = { .prio = priority } };
> +	spl_t s;
> +
> +	splhigh(s);
>   	xnthread_set_schedparam(task, &xnsched_class_rt, &param);
>   	xnsched_run();
> +	splexit(s);
>   }
>   
>   static inline int rtdm_task_set_period(rtdm_task_t *task,
> @@ -1062,9 +1066,14 @@ static inline int rtdm_task_set_period(rtdm_task_t *task,
>   
>   static inline int rtdm_task_unblock(rtdm_task_t *task)
>   {
> -	int res = xnthread_unblock(task);
> +	spl_t s;
> +	int res;
>   
> +	splhigh(s);
> +	res = xnthread_unblock(task);
>   	xnsched_run();
> +	splexit(s);
> +
>   	return res;
>   }

Both become non-trivial this way and should be un-inlined. I will do 
that on top.
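
For reference, the un-inlined variant would presumably end up looking
something like this (only a sketch of the idea, not the actual
follow-up change; a matching prototype would also have to replace the
inline definition in rtdm/driver.h):

	int rtdm_task_unblock(rtdm_task_t *task)
	{
		spl_t s;
		int res;

		splhigh(s);
		res = xnthread_unblock(task);
		xnsched_run();
		splexit(s);

		return res;
	}
	EXPORT_SYMBOL_GPL(rtdm_task_unblock);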

>   
> diff --git a/include/cobalt/kernel/sched.h b/include/cobalt/kernel/sched.h
> index 3b0825d67..7fc11a198 100644
> --- a/include/cobalt/kernel/sched.h
> +++ b/include/cobalt/kernel/sched.h
> @@ -298,25 +298,8 @@ void __xnsched_run_handler(void);
>   static inline int __xnsched_run(struct xnsched *sched)
>   {
>   	/*
> -	 * NOTE: Since ___xnsched_run() won't run immediately if an
> -	 * escalation to primary domain is needed, we won't use
> -	 * critical scheduler information before we actually run in
> -	 * primary mode; therefore we can first test the scheduler
> -	 * status then escalate.
> -	 *
> -	 * Running in the primary domain means that no Linux-triggered
> -	 * CPU migration may occur from that point either. Finally,
> -	 * since migration is always a self-directed operation for
> -	 * Xenomai threads, we can safely read the scheduler state
> -	 * bits without holding the nklock.
> -	 *
> -	 * Said differently, if we race here because of a CPU
> -	 * migration, it must have been Linux-triggered because we run
> -	 * in secondary mode; in which case we will escalate to the
> -	 * primary domain, then unwind the current call frame without
> -	 * running the rescheduling procedure in
> -	 * ___xnsched_run(). Therefore, the scheduler slot
> -	 * (i.e. "sched") will be either valid, or unused.
> +	 * Reschedule if XNSCHED is pending, but never over an IRQ
> +	 * handler or in the middle of unlocked context switch.
>   	 */
>   	if (((sched->status|sched->lflags) &
>   	     (XNINIRQ|XNINSW|XNRESCHED)) != XNRESCHED)
> @@ -328,18 +311,13 @@ static inline int __xnsched_run(struct xnsched *sched)
>   static inline int xnsched_run(void)
>   {
>   	struct xnsched *sched = xnsched_current();
> +	struct xnthread *curr = READ_ONCE(sched->curr);

But now we no longer need that READ_ONCE: I see only one use case below.

> +
>   	/*
> -	 * No rescheduling is possible, either if:
> -	 *
> -	 * - the current thread holds the scheduler lock
> -	 * - an ISR context is active
> -	 * - we are caught in the middle of an unlocked context switch.
> +	 * If running over the root thread, hard irqs must be off
> +	 * (asserted out of line in ___xnsched_run()).
>   	 */
> -	smp_rmb();
> -	if (unlikely(sched->curr->lock_count > 0))
> -		return 0;
> -
> -	return __xnsched_run(sched);
> +	return curr->lock_count > 0 ? 0 : __xnsched_run(sched);
>   }
>   
>   void xnsched_lock(void);
> diff --git a/kernel/cobalt/posix/mqueue.c b/kernel/cobalt/posix/mqueue.c
> index 5c644b2c4..cd1cf3ce8 100644
> --- a/kernel/cobalt/posix/mqueue.c
> +++ b/kernel/cobalt/posix/mqueue.c
> @@ -158,22 +158,19 @@ static inline int mq_init(struct cobalt_mq *mq, const struct mq_attr *attr)
>   
>   static inline void mq_destroy(struct cobalt_mq *mq)
>   {
> -	int resched;
>   	spl_t s;
>   
>   	xnlock_get_irqsave(&nklock, s);
> -	resched = (xnsynch_destroy(&mq->receivers) == XNSYNCH_RESCHED);
> -	resched = (xnsynch_destroy(&mq->senders) == XNSYNCH_RESCHED) || resched;
> +	xnsynch_destroy(&mq->receivers);
> +	xnsynch_destroy(&mq->senders);
>   	list_del(&mq->link);
> +	xnsched_run();
>   	xnlock_put_irqrestore(&nklock, s);
> -	xnselect_destroy(&mq->read_select);
> -	xnselect_destroy(&mq->write_select);
> +	xnselect_destroy(&mq->read_select); /* Reschedules. */
> +	xnselect_destroy(&mq->write_select); /* Ditto. */
>   	xnregistry_remove(mq->handle);
>   	xnheap_vfree(mq->mem);
>   	kfree(mq);
> -
> -	if (resched)
> -		xnsched_run();
>   }
>   
>   static int mq_unref_inner(struct cobalt_mq *mq, spl_t s)
> diff --git a/kernel/cobalt/posix/timerfd.c b/kernel/cobalt/posix/timerfd.c
> index 610206047..29046cfec 100644
> --- a/kernel/cobalt/posix/timerfd.c
> +++ b/kernel/cobalt/posix/timerfd.c
> @@ -131,18 +131,15 @@ timerfd_select(struct rtdm_fd *fd, struct xnselector *selector,
>   static void timerfd_close(struct rtdm_fd *fd)
>   {
>   	struct cobalt_tfd *tfd = container_of(fd, struct cobalt_tfd, fd);
> -	int resched;
>   	spl_t s;
>   
>   	xnlock_get_irqsave(&nklock, s);
>   	xntimer_destroy(&tfd->timer);
> -	resched = xnsynch_destroy(&tfd->readers) == XNSYNCH_RESCHED;
> +	xnsynch_destroy(&tfd->readers);
> +	xnsched_run();
>   	xnlock_put_irqrestore(&nklock, s);
> -	xnselect_destroy(&tfd->read_select);
> +	xnselect_destroy(&tfd->read_select); /* Reschedules. */
>   	xnfree(tfd);
> -
> -	if (resched)
> -		xnsched_run();
>   }
>   
>   static struct rtdm_fd_ops timerfd_ops = {
> diff --git a/kernel/cobalt/sched.c b/kernel/cobalt/sched.c
> index 74dc7a961..f7e662a38 100644
> --- a/kernel/cobalt/sched.c
> +++ b/kernel/cobalt/sched.c
> @@ -376,10 +376,11 @@ struct xnsched *xnsched_finish_unlocked_switch(struct xnsched *sched)
>   void xnsched_lock(void)
>   {
>   	struct xnsched *sched = xnsched_current();
> -	struct xnthread *curr = sched->curr;
> +	struct xnthread *curr = READ_ONCE(sched->curr);

And here we READ_ONCE for the consistency of the XENO_WARN?

>   
>   	if (sched->lflags & XNINIRQ)
>   		return;
> +
>   	/*
>   	 * CAUTION: The fast xnthread_current() accessor carries the
>   	 * relevant lock nesting count only if current runs in primary
> @@ -398,14 +399,17 @@ EXPORT_SYMBOL_GPL(xnsched_lock);
>   void xnsched_unlock(void)
>   {
>   	struct xnsched *sched = xnsched_current();
> -	struct xnthread *curr = sched->curr;
> +	struct xnthread *curr = READ_ONCE(sched->curr);
> +
> +	XENO_WARN_ON_ONCE(COBALT, (curr->state & XNROOT) &&
> +			  !hard_irqs_disabled());

Same here: If the warning doesn't trigger, sched->curr cannot change 
underneath us, can it?

>   
>   	if (sched->lflags & XNINIRQ)
>   		return;
> -	
> +
>   	if (!XENO_ASSERT(COBALT, curr->lock_count > 0))
>   		return;
> -	
> +
>   	if (--curr->lock_count == 0) {
>   		xnthread_clear_localinfo(curr, XNLBALERT);
>   		xnsched_run();
> @@ -962,6 +966,8 @@ int ___xnsched_run(struct xnsched *sched)
>   	int switched, shadow;
>   	spl_t s;
>   
> +	XENO_WARN_ON_ONCE(COBALT, !hard_irqs_disabled() && ipipe_root_p);
> +
>   	if (xnarch_escalate())
>   		return 0;
>   
> @@ -1000,7 +1006,11 @@ reschedule:
>   
>   	trace_cobalt_switch_context(prev, next);
>   
> -	sched->curr = next;
> +	/*
> +	 * sched->curr is shared locklessly with xnsched_run(), be
> +	 * careful about load/store tearing.
> +	 */
> +	WRITE_ONCE(sched->curr, next);

But xnsched_run touches the value only once. Do we rather want a barrier 
here?

>   	shadow = 1;
>   
>   	if (xnthread_test_state(prev, XNROOT)) {
> diff --git a/kernel/cobalt/select.c b/kernel/cobalt/select.c
> index 7f839f159..df3d960db 100644
> --- a/kernel/cobalt/select.c
> +++ b/kernel/cobalt/select.c
> @@ -409,7 +409,6 @@ static void xnselector_destroy_loop(void *cookie)
>   	struct xnselect_binding *binding, *tmpb;
>   	struct xnselector *selector, *tmps;
>   	struct xnselect *fd;
> -	int resched;
>   	spl_t s;
>   
>   	xnlock_get_irqsave(&nklock, s);
> @@ -430,12 +429,11 @@ static void xnselector_destroy_loop(void *cookie)
>   			xnlock_get_irqsave(&nklock, s);
>   		}
>   	release:
> -		resched = xnsynch_destroy(&selector->synchbase) == XNSYNCH_RESCHED;
> +		xnsynch_destroy(&selector->synchbase);
> +		xnsched_run();
>   		xnlock_put_irqrestore(&nklock, s);
>   
>   		xnfree(selector);
> -		if (resched)
> -			xnsched_run();
>   
>   		xnlock_get_irqsave(&nklock, s);
>   	}
> diff --git a/kernel/cobalt/thread.c b/kernel/cobalt/thread.c
> index a5f8e1b00..5260fb1e7 100644
> --- a/kernel/cobalt/thread.c
> +++ b/kernel/cobalt/thread.c
> @@ -1582,9 +1582,9 @@ check_self_cancel:
>   	} else
>   		__xnthread_kick(thread);
>   out:
> -	xnlock_put_irqrestore(&nklock, s);
> -
>   	xnsched_run();
> +
> +	xnlock_put_irqrestore(&nklock, s);
>   }
>   EXPORT_SYMBOL_GPL(xnthread_cancel);
>   
> 

In general, this looks good. I'd just like to sort out any remaining
fuzziness of the lockless / ONCE parts.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* Re: [PATCH v2 3/3] cobalt/sched: guard against missed rescheduling upon CPU migration
  2019-11-29 18:04   ` Jan Kiszka
@ 2019-11-29 18:37     ` Philippe Gerum
  2019-12-02  6:42       ` Jan Kiszka
  0 siblings, 1 reply; 8+ messages in thread
From: Philippe Gerum @ 2019-11-29 18:37 UTC (permalink / raw)
  To: Jan Kiszka, xenomai

On 11/29/19 7:04 PM, Jan Kiszka wrote:
> On 28.11.19 11:02, Philippe Gerum via Xenomai wrote:
>> From: Philippe Gerum <rpm@xenomai.org>
>>
>>     static inline int rtdm_task_set_period(rtdm_task_t *task,
>> @@ -1062,9 +1066,14 @@ static inline int rtdm_task_set_period(rtdm_task_t *task,
>>     static inline int rtdm_task_unblock(rtdm_task_t *task)
>>   {
>> -    int res = xnthread_unblock(task);
>> +    spl_t s;
>> +    int res;
>>   +    splhigh(s);
>> +    res = xnthread_unblock(task);
>>       xnsched_run();
>> +    splexit(s);
>> +
>>       return res;
>>   }
> 
> Both become non-trivial this way and should be un-inlined. I will do that on top.
>

Fair enough. I did not consider moving this out-of-line since those
helpers are infrequently used (or at least should be).

>>            (XNINIRQ|XNINSW|XNRESCHED)) != XNRESCHED)
>> @@ -328,18 +311,13 @@ static inline int __xnsched_run(struct xnsched *sched)
>>   static inline int xnsched_run(void)
>>   {
>>       struct xnsched *sched = xnsched_current();
>> +    struct xnthread *curr = READ_ONCE(sched->curr);
> 
> But now we no longer need that READ_ONCE: I see only one use case below.
>> index 74dc7a961..f7e662a38 100644
>> --- a/kernel/cobalt/sched.c
>> +++ b/kernel/cobalt/sched.c
>> @@ -376,10 +376,11 @@ struct xnsched *xnsched_finish_unlocked_switch(struct
>> xnsched *sched)
>>   void xnsched_lock(void)
>>   {
>>       struct xnsched *sched = xnsched_current();
>> -    struct xnthread *curr = sched->curr;
>> +    struct xnthread *curr = READ_ONCE(sched->curr);
> 
> And here we READ_ONCE for the consistency of the XENO_WARN?
>

This is paired with writing to sched->curr in ___xnsched_run(). In theory, we
might have multiple subsequent stores to sched->curr from a remote CPU in the
time span of a single read from the local CPU. If we allow the compiler to go
wild with load tearing here, this might cause the reader to observe a pointer
value which might actually be a merge between the multiple stores.

Since these are inline helpers, we could indeed use a barrier for preventing
load fusing with subsequent code within the same scope, but I don't think this
would be of any help with load tearing.

>> +    /*
>> +     * sched->curr is shared locklessly with xnsched_run(), be
>> +     * careful about load/store tearing.
>> +     */
>> +    WRITE_ONCE(sched->curr, next);
> 
> But xnsched_run touches the value only once. Do we rather want a barrier here?
> 

This is the converse issue. WRITE_ONCE() is there to prevent the compiler from
writing to sched->curr piecemeal (store tearing) with multiple accesses, while
other folks read that value (like xnsched_lock(), xnsched_run()). In theory, we
might have an issue with store tearing since sched->curr can be considered a
shared memory accessed locklessly, and we don't know how braindamaged the
compiler might be, e.g. in case it feels too much pressure on the register file
when generating code. Typically, having a 64bit pointer store split into a
couple of 32bit writes would look funny at first, funky in the end. Granted,
the access to sched->curr is always aligned, no immediate value is stored
there, but well, WRITE_ONCE() is for peace of mind with other cases.
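
As a generic illustration of the tearing concern discussed above (not
Cobalt code: the variable and helpers below are made up for the
example, only READ_ONCE()/WRITE_ONCE() are the actual Linux primitives
from <linux/compiler.h>):

	/* Pointer shared locklessly between a single writer and
	 * readers possibly running on other CPUs. */
	static struct xnthread *shared_curr;

	static inline void publish_curr(struct xnthread *next)
	{
		/* One full-width store: readers can never observe a
		 * half-written pointer value (no store tearing). */
		WRITE_ONCE(shared_curr, next);
	}

	static inline struct xnthread *fetch_curr(void)
	{
		/* One full-width load, which the compiler may neither
		 * split nor fuse with other loads (no load tearing). */
		return READ_ONCE(shared_curr);
	}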

-- 
Philippe.



* Re: [PATCH v2 3/3] cobalt/sched: guard against missed rescheduling upon CPU migration
  2019-11-29 18:37     ` Philippe Gerum
@ 2019-12-02  6:42       ` Jan Kiszka
  2019-12-02 15:18         ` Philippe Gerum
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Kiszka @ 2019-12-02  6:42 UTC (permalink / raw)
  To: Philippe Gerum, xenomai

On 29.11.19 19:37, Philippe Gerum wrote:
> On 11/29/19 7:04 PM, Jan Kiszka wrote:
>> On 28.11.19 11:02, Philippe Gerum via Xenomai wrote:
>>> From: Philippe Gerum <rpm@xenomai.org>
>>>
>>>      static inline int rtdm_task_set_period(rtdm_task_t *task,
>>> @@ -1062,9 +1066,14 @@ static inline int rtdm_task_set_period(rtdm_task_t *task,
>>>      static inline int rtdm_task_unblock(rtdm_task_t *task)
>>>    {
>>> -    int res = xnthread_unblock(task);
>>> +    spl_t s;
>>> +    int res;
>>>    +    splhigh(s);
>>> +    res = xnthread_unblock(task);
>>>        xnsched_run();
>>> +    splexit(s);
>>> +
>>>        return res;
>>>    }
>>
>> Both become non-trivial this way and should be un-inlined. I will do that on top.
>>
> 
> Fair enough. I did not consider moving this out-of-line since those helpers are
> infrequently used (or at least
> should be).
> 
>>>             (XNINIRQ|XNINSW|XNRESCHED)) != XNRESCHED)
>>> @@ -328,18 +311,13 @@ static inline int __xnsched_run(struct xnsched *sched)
>>>    static inline int xnsched_run(void)
>>>    {
>>>        struct xnsched *sched = xnsched_current();
>>> +    struct xnthread *curr = READ_ONCE(sched->curr);
>>
>> But now we no longer need that READ_ONCE: I see only one use case below.
>>> index 74dc7a961..f7e662a38 100644
>>> --- a/kernel/cobalt/sched.c
>>> +++ b/kernel/cobalt/sched.c
>>> @@ -376,10 +376,11 @@ struct xnsched *xnsched_finish_unlocked_switch(struct
>>> xnsched *sched)
>>>    void xnsched_lock(void)
>>>    {
>>>        struct xnsched *sched = xnsched_current();
>>> -    struct xnthread *curr = sched->curr;
>>> +    struct xnthread *curr = READ_ONCE(sched->curr);
>>
>> And here we READ_ONCE for the consistency of the XENO_WARN?
>>
> 
> This is paired with writing to sched->curr in ___xnsched_run(). In theory, we
> might have multiple subsequent stores to sched->curr from a remote CPU in the
> time span of a single read from the local CPU. If we allow the compiler to go
> wild with load tearing here, this might cause the reader to observe a pointer
> value which might actually be a merge between the multiple stores.
> 
> Since these are inline helpers, we could indeed use a barrier for preventing
> load fusing with subsequent code within the same scope, but I don't think this
> would be of any help with load tearing.
> 
>>> +    /*
>>> +     * sched->curr is shared locklessly with xnsched_run(), be
>>> +     * careful about load/store tearing.
>>> +     */
>>> +    WRITE_ONCE(sched->curr, next);
>>
>> But xnsched_run touches the value only once. Do we rather want a barrier here?
>>
> 
> This is the converse issue. WRITE_ONCE() is there to prevent the compiler from
> writing to sched->curr piecemeal (store tearing) with multiple accesses, while
> other folks read that value (like xnsched_lock(), xnsched_run()). In theory, we
> might have an issue with store tearing since sched->curr can be considered a
> shared memory accessed locklessly, and we don't know how braindamaged the
> compiler might be, e.g. in case it feels too much pressure on the register file
> when generating code. Typically, having a 64bit pointer store split into a
> couple of 32bit writes would look funny at first, funky in the end. Granted,
> the access to sched->curr is always aligned, no immediate value is stored
> there, but well, WRITE_ONCE() is for peace of mind with other cases.

OK, can we add these reasonings as comments to the code?

Merging patches 1 and 2 to next already.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* Re: [PATCH v2 3/3] cobalt/sched: guard against missed rescheduling upon CPU migration
  2019-12-02  6:42       ` Jan Kiszka
@ 2019-12-02 15:18         ` Philippe Gerum
  0 siblings, 0 replies; 8+ messages in thread
From: Philippe Gerum @ 2019-12-02 15:18 UTC (permalink / raw)
  To: Jan Kiszka, xenomai

On 12/2/19 7:42 AM, Jan Kiszka wrote:
> On 29.11.19 19:37, Philippe Gerum wrote:
>> On 11/29/19 7:04 PM, Jan Kiszka wrote:
>>> On 28.11.19 11:02, Philippe Gerum via Xenomai wrote:
>>>> From: Philippe Gerum <rpm@xenomai.org>
>>>>
>>>>      static inline int rtdm_task_set_period(rtdm_task_t *task,
>>>> @@ -1062,9 +1066,14 @@ static inline int rtdm_task_set_period(rtdm_task_t *task,
>>>>      static inline int rtdm_task_unblock(rtdm_task_t *task)
>>>>    {
>>>> -    int res = xnthread_unblock(task);
>>>> +    spl_t s;
>>>> +    int res;
>>>>    +    splhigh(s);
>>>> +    res = xnthread_unblock(task);
>>>>        xnsched_run();
>>>> +    splexit(s);
>>>> +
>>>>        return res;
>>>>    }
>>>
>>> Both become non-trivial this way and should be un-inlined. I will do that on top.
>>>
>>
>> Fair enough. I did not consider moving this out-of-line since those helpers are
>> infrequently used (or at least
>> should be).
>>
>>>>             (XNINIRQ|XNINSW|XNRESCHED)) != XNRESCHED)
>>>> @@ -328,18 +311,13 @@ static inline int __xnsched_run(struct xnsched *sched)
>>>>    static inline int xnsched_run(void)
>>>>    {
>>>>        struct xnsched *sched = xnsched_current();
>>>> +    struct xnthread *curr = READ_ONCE(sched->curr);
>>>
>>> But now we no longer need that READ_ONCE: I see only one use case below.
>>>> index 74dc7a961..f7e662a38 100644
>>>> --- a/kernel/cobalt/sched.c
>>>> +++ b/kernel/cobalt/sched.c
>>>> @@ -376,10 +376,11 @@ struct xnsched *xnsched_finish_unlocked_switch(struct
>>>> xnsched *sched)
>>>>    void xnsched_lock(void)
>>>>    {
>>>>        struct xnsched *sched = xnsched_current();
>>>> -    struct xnthread *curr = sched->curr;
>>>> +    struct xnthread *curr = READ_ONCE(sched->curr);
>>>
>>> And here we READ_ONCE for the consistency of the XENO_WARN?
>>>
>>
>> This is paired with writing to sched->curr in ___xnsched_run(). In theory, we
>> might have multiple subsequent stores to sched->curr from a remote CPU in the
>> time span of a single read from the local CPU. If we allow the compiler to go
>> wild with load tearing here, this might cause the reader to observe a pointer
>> value which might actually be a merge between the multiple stores.
>>
>> Since these are inline helpers, we could indeed use a barrier for preventing
>> load fusing with subsequent code within the same scope, but I don't think this
>> would be of any help with load tearing.
>>
>>>> +    /*
>>>> +     * sched->curr is shared locklessly with xnsched_run(), be
>>>> +     * careful about load/store tearing.
>>>> +     */
>>>> +    WRITE_ONCE(sched->curr, next);
>>>
>>> But xnsched_run touches the value only once. Do we rather want a barrier here?
>>>
>>
>> This is the converse issue. WRITE_ONCE() is there to prevent the compiler from
>> writing to sched->curr piecemeal (store tearing) with multiple accesses, while
>> other folks read that value (like xnsched_lock(), xnsched_run()). In theory, we
>> might have an issue with store tearing since sched->curr can be considered a
>> shared memory accessed locklessly, and we don't know how braindamaged the
>> compiler might be, e.g. in case it feels too much pressure on the register file
>> when generating code. Typically, having a 64bit pointer store split into a
>> couple of 32bit writes would look funny at first, funky in the end. Granted,
>> the access to sched->curr is always aligned, no immediate value is stored
>> there, but well, WRITE_ONCE() is for peace of mind with other cases.
> 
> OK, can we add these reasonings as comments to the code?
> 

Sure, patch follows.

-- 
Philippe.



* [PATCH v2 3/3] cobalt/sched: guard against missed rescheduling upon CPU migration
  2019-11-27 20:08 [PATCH v2 1/3] drivers/testing: heapcheck: silence UMR warning (false positive) Philippe Gerum
@ 2019-11-27 20:08 ` Philippe Gerum
  0 siblings, 0 replies; 8+ messages in thread
From: Philippe Gerum @ 2019-11-27 20:08 UTC (permalink / raw)
  To: xenomai

From: Philippe Gerum <rpm@xenomai.org>

Kicking the rescheduling procedure (xnsched_run()) from call sites
running over the root thread with hard irqs enabled may lead to
missing a rescheduling opportunity as a consequence of a CPU
migration, e.g.:

    CPU0					CPU1
    ----                         		----
    t0: ...
            <context switch>
    t1: xnthread_resume(t0)
             xnsched_run()
             <CPU MIGRATION -->
                  |                             |
                  |                             |
                  x                             |
          (missed rescheduling)                 v
                                    ___xnsched_run(CPU1.sched)

To address this issue, make sure every call to xnsched_run() either
runs from the head domain which is not affected by CPU migration, or
moves into the latest atomic section which might have changed the
scheduler state.

Turning on CONFIG_XENO_OPT_DEBUG_COBALT also enables a (oneshot)
assertion which detects invalid calls from the root domain with hard
irqs enabled.
---
 include/cobalt/kernel/rtdm/driver.h | 11 ++++++++-
 include/cobalt/kernel/sched.h       | 36 ++++++-----------------------
 kernel/cobalt/posix/mqueue.c        | 13 ++++-------
 kernel/cobalt/posix/timerfd.c       |  9 +++-----
 kernel/cobalt/sched.c               | 20 ++++++++++++----
 kernel/cobalt/select.c              |  6 ++---
 kernel/cobalt/thread.c              |  4 ++--
 7 files changed, 44 insertions(+), 55 deletions(-)

diff --git a/include/cobalt/kernel/rtdm/driver.h b/include/cobalt/kernel/rtdm/driver.h
index d3d5d6b27..57495f12f 100644
--- a/include/cobalt/kernel/rtdm/driver.h
+++ b/include/cobalt/kernel/rtdm/driver.h
@@ -1044,8 +1044,12 @@ static inline void __deprecated rtdm_task_join_nrt(rtdm_task_t *task,
 static inline void rtdm_task_set_priority(rtdm_task_t *task, int priority)
 {
 	union xnsched_policy_param param = { .rt = { .prio = priority } };
+	spl_t s;
+
+	splhigh(s);
 	xnthread_set_schedparam(task, &xnsched_class_rt, &param);
 	xnsched_run();
+	splexit(s);
 }
 
 static inline int rtdm_task_set_period(rtdm_task_t *task,
@@ -1062,9 +1066,14 @@ static inline int rtdm_task_set_period(rtdm_task_t *task,
 
 static inline int rtdm_task_unblock(rtdm_task_t *task)
 {
-	int res = xnthread_unblock(task);
+	spl_t s;
+	int res;
 
+	splhigh(s);
+	res = xnthread_unblock(task);
 	xnsched_run();
+	splexit(s);
+
 	return res;
 }
 
diff --git a/include/cobalt/kernel/sched.h b/include/cobalt/kernel/sched.h
index 3b0825d67..7fc11a198 100644
--- a/include/cobalt/kernel/sched.h
+++ b/include/cobalt/kernel/sched.h
@@ -298,25 +298,8 @@ void __xnsched_run_handler(void);
 static inline int __xnsched_run(struct xnsched *sched)
 {
 	/*
-	 * NOTE: Since ___xnsched_run() won't run immediately if an
-	 * escalation to primary domain is needed, we won't use
-	 * critical scheduler information before we actually run in
-	 * primary mode; therefore we can first test the scheduler
-	 * status then escalate.
-	 *
-	 * Running in the primary domain means that no Linux-triggered
-	 * CPU migration may occur from that point either. Finally,
-	 * since migration is always a self-directed operation for
-	 * Xenomai threads, we can safely read the scheduler state
-	 * bits without holding the nklock.
-	 *
-	 * Said differently, if we race here because of a CPU
-	 * migration, it must have been Linux-triggered because we run
-	 * in secondary mode; in which case we will escalate to the
-	 * primary domain, then unwind the current call frame without
-	 * running the rescheduling procedure in
-	 * ___xnsched_run(). Therefore, the scheduler slot
-	 * (i.e. "sched") will be either valid, or unused.
+	 * Reschedule if XNSCHED is pending, but never over an IRQ
+	 * handler or in the middle of unlocked context switch.
 	 */
 	if (((sched->status|sched->lflags) &
 	     (XNINIRQ|XNINSW|XNRESCHED)) != XNRESCHED)
@@ -328,18 +311,13 @@ static inline int __xnsched_run(struct xnsched *sched)
 static inline int xnsched_run(void)
 {
 	struct xnsched *sched = xnsched_current();
+	struct xnthread *curr = READ_ONCE(sched->curr);
+
 	/*
-	 * No rescheduling is possible, either if:
-	 *
-	 * - the current thread holds the scheduler lock
-	 * - an ISR context is active
-	 * - we are caught in the middle of an unlocked context switch.
+	 * If running over the root thread, hard irqs must be off
+	 * (asserted out of line in ___xnsched_run()).
 	 */
-	smp_rmb();
-	if (unlikely(sched->curr->lock_count > 0))
-		return 0;
-
-	return __xnsched_run(sched);
+	return curr->lock_count > 0 ? 0 : __xnsched_run(sched);
 }
 
 void xnsched_lock(void);
diff --git a/kernel/cobalt/posix/mqueue.c b/kernel/cobalt/posix/mqueue.c
index 5c644b2c4..cd1cf3ce8 100644
--- a/kernel/cobalt/posix/mqueue.c
+++ b/kernel/cobalt/posix/mqueue.c
@@ -158,22 +158,19 @@ static inline int mq_init(struct cobalt_mq *mq, const struct mq_attr *attr)
 
 static inline void mq_destroy(struct cobalt_mq *mq)
 {
-	int resched;
 	spl_t s;
 
 	xnlock_get_irqsave(&nklock, s);
-	resched = (xnsynch_destroy(&mq->receivers) == XNSYNCH_RESCHED);
-	resched = (xnsynch_destroy(&mq->senders) == XNSYNCH_RESCHED) || resched;
+	xnsynch_destroy(&mq->receivers);
+	xnsynch_destroy(&mq->senders);
 	list_del(&mq->link);
+	xnsched_run();
 	xnlock_put_irqrestore(&nklock, s);
-	xnselect_destroy(&mq->read_select);
-	xnselect_destroy(&mq->write_select);
+	xnselect_destroy(&mq->read_select); /* Reschedules. */
+	xnselect_destroy(&mq->write_select); /* Ditto. */
 	xnregistry_remove(mq->handle);
 	xnheap_vfree(mq->mem);
 	kfree(mq);
-
-	if (resched)
-		xnsched_run();
 }
 
 static int mq_unref_inner(struct cobalt_mq *mq, spl_t s)
diff --git a/kernel/cobalt/posix/timerfd.c b/kernel/cobalt/posix/timerfd.c
index 610206047..29046cfec 100644
--- a/kernel/cobalt/posix/timerfd.c
+++ b/kernel/cobalt/posix/timerfd.c
@@ -131,18 +131,15 @@ timerfd_select(struct rtdm_fd *fd, struct xnselector *selector,
 static void timerfd_close(struct rtdm_fd *fd)
 {
 	struct cobalt_tfd *tfd = container_of(fd, struct cobalt_tfd, fd);
-	int resched;
 	spl_t s;
 
 	xnlock_get_irqsave(&nklock, s);
 	xntimer_destroy(&tfd->timer);
-	resched = xnsynch_destroy(&tfd->readers) == XNSYNCH_RESCHED;
+	xnsynch_destroy(&tfd->readers);
+	xnsched_run();
 	xnlock_put_irqrestore(&nklock, s);
-	xnselect_destroy(&tfd->read_select);
+	xnselect_destroy(&tfd->read_select); /* Reschedules. */
 	xnfree(tfd);
-
-	if (resched)
-		xnsched_run();
 }
 
 static struct rtdm_fd_ops timerfd_ops = {
diff --git a/kernel/cobalt/sched.c b/kernel/cobalt/sched.c
index 74dc7a961..f7e662a38 100644
--- a/kernel/cobalt/sched.c
+++ b/kernel/cobalt/sched.c
@@ -376,10 +376,11 @@ struct xnsched *xnsched_finish_unlocked_switch(struct xnsched *sched)
 void xnsched_lock(void)
 {
 	struct xnsched *sched = xnsched_current();
-	struct xnthread *curr = sched->curr;
+	struct xnthread *curr = READ_ONCE(sched->curr);
 
 	if (sched->lflags & XNINIRQ)
 		return;
+
 	/*
 	 * CAUTION: The fast xnthread_current() accessor carries the
 	 * relevant lock nesting count only if current runs in primary
@@ -398,14 +399,17 @@ EXPORT_SYMBOL_GPL(xnsched_lock);
 void xnsched_unlock(void)
 {
 	struct xnsched *sched = xnsched_current();
-	struct xnthread *curr = sched->curr;
+	struct xnthread *curr = READ_ONCE(sched->curr);
+
+	XENO_WARN_ON_ONCE(COBALT, (curr->state & XNROOT) &&
+			  !hard_irqs_disabled());
 
 	if (sched->lflags & XNINIRQ)
 		return;
-	
+
 	if (!XENO_ASSERT(COBALT, curr->lock_count > 0))
 		return;
-	
+
 	if (--curr->lock_count == 0) {
 		xnthread_clear_localinfo(curr, XNLBALERT);
 		xnsched_run();
@@ -962,6 +966,8 @@ int ___xnsched_run(struct xnsched *sched)
 	int switched, shadow;
 	spl_t s;
 
+	XENO_WARN_ON_ONCE(COBALT, !hard_irqs_disabled() && ipipe_root_p);
+
 	if (xnarch_escalate())
 		return 0;
 
@@ -1000,7 +1006,11 @@ reschedule:
 
 	trace_cobalt_switch_context(prev, next);
 
-	sched->curr = next;
+	/*
+	 * sched->curr is shared locklessly with xnsched_run(), be
+	 * careful about load/store tearing.
+	 */
+	WRITE_ONCE(sched->curr, next);
 	shadow = 1;
 
 	if (xnthread_test_state(prev, XNROOT)) {
diff --git a/kernel/cobalt/select.c b/kernel/cobalt/select.c
index 7f839f159..df3d960db 100644
--- a/kernel/cobalt/select.c
+++ b/kernel/cobalt/select.c
@@ -409,7 +409,6 @@ static void xnselector_destroy_loop(void *cookie)
 	struct xnselect_binding *binding, *tmpb;
 	struct xnselector *selector, *tmps;
 	struct xnselect *fd;
-	int resched;
 	spl_t s;
 
 	xnlock_get_irqsave(&nklock, s);
@@ -430,12 +429,11 @@ static void xnselector_destroy_loop(void *cookie)
 			xnlock_get_irqsave(&nklock, s);
 		}
 	release:
-		resched = xnsynch_destroy(&selector->synchbase) == XNSYNCH_RESCHED;
+		xnsynch_destroy(&selector->synchbase);
+		xnsched_run();
 		xnlock_put_irqrestore(&nklock, s);
 
 		xnfree(selector);
-		if (resched)
-			xnsched_run();
 
 		xnlock_get_irqsave(&nklock, s);
 	}
diff --git a/kernel/cobalt/thread.c b/kernel/cobalt/thread.c
index a5f8e1b00..5260fb1e7 100644
--- a/kernel/cobalt/thread.c
+++ b/kernel/cobalt/thread.c
@@ -1582,9 +1582,9 @@ check_self_cancel:
 	} else
 		__xnthread_kick(thread);
 out:
-	xnlock_put_irqrestore(&nklock, s);
-
 	xnsched_run();
+
+	xnlock_put_irqrestore(&nklock, s);
 }
 EXPORT_SYMBOL_GPL(xnthread_cancel);
 
-- 
2.21.0



