* [PATCH] um: fix os_idle_sleep() to not hang
@ 2021-01-22 20:40 Johannes Berg
2021-01-25 9:13 ` Anton Ivanov
0 siblings, 1 reply; 2+ messages in thread
From: Johannes Berg @ 2021-01-22 20:40 UTC (permalink / raw)
To: linux-um; +Cc: Johannes Berg
From: Johannes Berg <johannes.berg@intel.com>
Changing os_idle_sleep() to use pause() (I accidentally described
it as an empty select() in the commit log because I had changed it
from that to pause() in a later revision) exposed a race condition
in the idle code. The following can happen:
timer_settime(0, 0, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=0, tv_nsec=624017}}, NULL) = 0
...
<SIGALRM is delivered but we're already on the way to idle>
pause()
and we now hang forever. This was previously possible as well, but
it could never cause UML to hang for more than a second since we
could only sleep for that much, so at most you'd notice a "hiccup"
in the UML. Obviously, any sort of external interrupt also "saves"
it and interrupts pause().
Fix this by properly handling the race, rather than papering over
it again:
- first, block SIGALRM, and obtain the old signal set
- check the timer
- suspend, waiting for any signal out of the old set, if, and only
if, the timer will fire in the future
- restore the old signal mask
This ensures race-free operation: as it's blocked, the signal won't
be delivered while we're looking at the timer even if it were to be
triggered right _after_ we've returned from timer_gettime() with a
non-zero value (telling us the timer will trigger). Thus, despite
getting to sigsuspend() because timer_gettime() told us we're still
waiting, we'll not hang because sigsuspend() will return immediately
due to the pending signal.
Fixes: 49da38a3ef33 ("um: Simplify os_idle_sleep() and sleep longer")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
arch/um/os-Linux/time.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/um/os-Linux/time.c b/arch/um/os-Linux/time.c
index a61cbf73a179..6c5041c5560b 100644
--- a/arch/um/os-Linux/time.c
+++ b/arch/um/os-Linux/time.c
@@ -104,5 +104,18 @@ long long os_nsecs(void)
*/
void os_idle_sleep(void)
{
- pause();
+ struct itimerspec its;
+ sigset_t set, old;
+
+ /* block SIGALRM while we analyze the timer state */
+ sigemptyset(&set);
+ sigaddset(&set, SIGALRM);
+ sigprocmask(SIG_BLOCK, &set, &old);
+
+ /* check the timer, and if it'll fire then wait for it */
+ timer_gettime(event_high_res_timer, &its);
+ if (its.it_value.tv_sec || its.it_value.tv_nsec)
+ sigsuspend(&old);
+ /* either way, restore the signal mask */
+ sigprocmask(SIG_UNBLOCK, &set, NULL);
}
--
2.26.2
_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: [PATCH] um: fix os_idle_sleep() to not hang
2021-01-22 20:40 [PATCH] um: fix os_idle_sleep() to not hang Johannes Berg
@ 2021-01-25 9:13 ` Anton Ivanov
0 siblings, 0 replies; 2+ messages in thread
From: Anton Ivanov @ 2021-01-25 9:13 UTC (permalink / raw)
To: Johannes Berg, linux-um; +Cc: Johannes Berg
On 22/01/2021 20:40, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@intel.com>
>
> Changing os_idle_sleep() to use pause() (I accidentally described
> it as an empty select() in the commit log because I had changed it
> from that to pause() in a later revision) exposed a race condition
> in the idle code. The following can happen:
>
> timer_settime(0, 0, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=0, tv_nsec=624017}}, NULL) = 0
> ...
> <SIGALRM is delivered but we're already on the way to idle>
> pause()
>
> and we now hang forever. This was previously possible as well, but
> it could never cause UML to hang for more than a second since we
> could only sleep for that much, so at most you'd notice a "hiccup"
> in the UML. Obviously, any sort of external interrupt also "saves"
> it and interrupts pause().
>
> Fix this by properly handling the race, rather than papering over
> it again:
>
> - first, block SIGALRM, and obtain the old signal set
> - check the timer
> - suspend, waiting for any signal out of the old set, if, and only
> if, the timer will fire in the future
> - restore the old signal mask
>
> This ensures race-free operation: as it's blocked, the signal won't
> be delivered while we're looking at the timer even if it were to be
> triggered right _after_ we've returned from timer_gettime() with a
> non-zero value (telling us the timer will trigger). Thus, despite
> getting to sigsuspend() because timer_gettime() told us we're still
> waiting, we'll not hang because sigsuspend() will return immediately
> due to the pending signal.
>
> Fixes: 49da38a3ef33 ("um: Simplify os_idle_sleep() and sleep longer")
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> ---
> arch/um/os-Linux/time.c | 15 ++++++++++++++-
> 1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/arch/um/os-Linux/time.c b/arch/um/os-Linux/time.c
> index a61cbf73a179..6c5041c5560b 100644
> --- a/arch/um/os-Linux/time.c
> +++ b/arch/um/os-Linux/time.c
> @@ -104,5 +104,18 @@ long long os_nsecs(void)
> */
> void os_idle_sleep(void)
> {
> - pause();
> + struct itimerspec its;
> + sigset_t set, old;
> +
> + /* block SIGALRM while we analyze the timer state */
> + sigemptyset(&set);
> + sigaddset(&set, SIGALRM);
> + sigprocmask(SIG_BLOCK, &set, &old);
> +
> + /* check the timer, and if it'll fire then wait for it */
> + timer_gettime(event_high_res_timer, &its);
> + if (its.it_value.tv_sec || its.it_value.tv_nsec)
> + sigsuspend(&old);
> + /* either way, restore the signal mask */
> + sigprocmask(SIG_UNBLOCK, &set, NULL);
> }
>
Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com>
--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/
_______________________________________________
linux-um mailing list
linux-um@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-um
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2021-01-25 9:33 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-22 20:40 [PATCH] um: fix os_idle_sleep() to not hang Johannes Berg
2021-01-25 9:13 ` Anton Ivanov
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.