* [PATCH v4 1/2] timers: Fix usleep_range() in the context of wake_up_process()
@ 2016-10-20 21:21 Douglas Anderson
  2016-10-20 21:21 ` [PATCH v4 2/2] timers: Fix documentation for schedule_timeout() and similar Douglas Anderson
  2016-10-20 21:27   ` Thomas Gleixner
  0 siblings, 2 replies; 7+ messages in thread
From: Douglas Anderson @ 2016-10-20 21:21 UTC (permalink / raw)
  To: Thomas Gleixner, John Stultz
  Cc: Andreas Mohr, briannorris, huangtao, tony.xie, linux-rockchip,
	linux, heiko, broonie, djkurtz, tskd08, Douglas Anderson,
	linux-kernel

Users of usleep_range() expect that it will _never_ return in less time
than the minimum passed parameter.  However, nothing in any of the code
ensures this.  Specifically:

usleep_range() => do_usleep_range() => schedule_hrtimeout_range() =>
schedule_hrtimeout_range_clock() just ends up calling schedule() with an
appropriate timeout set using the hrtimer.  If someone else happens to
wake up our task then we'll happily return from usleep_range() early.

msleep() already has code to handle this case since it will loop as long
as there was still time left.  usleep_range() had no such loop.
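
For readers who want to try the idea outside the kernel, here is a rough
userspace analogy of the loop this patch adds.  It is only a sketch and
not the kernel code: sleep_at_least_us() and now_ns() are hypothetical
helpers, nanosleep() stands in for schedule_hrtimeout_range(), and a
signal interrupting nanosleep() (EINTR) stands in for a stray
wake_up_process().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC time in nanoseconds. */
static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Hypothetical helper: sleep until at least min_us microseconds have
 * passed, even if individual sleeps are cut short.
 * Returns the number of early wakeups, or -1 on an unexpected error.
 */
static int sleep_at_least_us(int64_t min_us)
{
	int64_t end, now;
	int early_wakeups = 0;

	assert(min_us >= 0);
	end = now_ns() + min_us * 1000;

	while ((now = now_ns()) < end) {
		int64_t left = end - now;
		struct timespec req = {
			.tv_sec = left / 1000000000LL,
			.tv_nsec = left % 1000000000LL,
		};

		/* 0 means the whole remaining time elapsed, so we are done. */
		if (nanosleep(&req, NULL) == 0)
			break;
		if (errno != EINTR)
			return -1;

		/* Woken early: loop and recompute the time left. */
		early_wakeups++;
	}

	return early_wakeups;
}
```

The deadline is computed once and the remaining time is recomputed from
it after every early wakeup, so repeated wakeups cannot shorten the
total sleep.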

The problem is easily demonstrated with a small bit of test code:

  static int usleep_test_task(void *data)
  {
    atomic_t *done = data;
    ktime_t start, end;

    start = ktime_get();
    usleep_range(50000, 100000);
    end = ktime_get();
    pr_info("Requested 50000 - 100000 us.  Actually slept for %llu us\n",
      (unsigned long long)ktime_to_us(ktime_sub(end, start)));
    atomic_set(done, 1);

    return 0;
  }

  static void run_usleep_test(void)
  {
    struct task_struct *t;
    atomic_t done;

    atomic_set(&done, 0);

    t = kthread_run(usleep_test_task, &done, "usleep_test_task");
    while (!atomic_read(&done)) {
      wake_up_process(t);
      udelay(1000);
    }
    kthread_stop(t);
  }

If you run the above code without this patch you get things like:
  Requested 50000 - 100000 us.  Actually slept for 967 us

If you run the above code _with_ this patch, you get:
  Requested 50000 - 100000 us.  Actually slept for 50001 us

Presumably this problem was not detected before because:
- It's not terribly common to use wake_up_process() directly.
- Other ways for processes to wake up are not typically mixed with
  usleep_range().
- There aren't lots of places that use usleep_range(), since many people
  call either msleep() or udelay().

NOTES:
- An effort was made to look for users relying on the old behavior by
  looking for usleep_range() in the same file as wake_up_process().
  No problems were found by this search, though it is conceivable that
  someone could have put the sleep and wakeup in two different files.
- An effort was made to ask several upstream maintainers if they were
  aware of people relying on wake_up_process() to wake up
  usleep_range().  No maintainers were aware of that but they were aware
  of many people relying on usleep_range() never returning before the
  minimum.

Reported-by: Tao Huang <huangtao@rock-chips.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Andreas Mohr <andim2@users.sf.net>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
---
Changes in v4: None
Changes in v3:
- Add Reviewed-by tags
- Add notes about validation

Changes in v2:
- Fixed stupid bug that snuck in before posting
- Use ktime_before
- Remove delta from the loop

 kernel/time/timer.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 32bf6f75a8fe..219439efd56a 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1898,12 +1898,28 @@ EXPORT_SYMBOL(msleep_interruptible);
 
 static void __sched do_usleep_range(unsigned long min, unsigned long max)
 {
+	ktime_t now, end;
 	ktime_t kmin;
 	u64 delta;
+	int ret;
 
-	kmin = ktime_set(0, min * NSEC_PER_USEC);
+	now = ktime_get();
+	end = ktime_add_us(now, min);
 	delta = (u64)(max - min) * NSEC_PER_USEC;
-	schedule_hrtimeout_range(&kmin, delta, HRTIMER_MODE_REL);
+	do {
+		kmin = ktime_sub(end, now);
+		ret = schedule_hrtimeout_range(&kmin, delta, HRTIMER_MODE_REL);
+
+		/*
+		 * If schedule_hrtimeout_range() returns 0 then we actually
+		 * hit the timeout. If not then we need to re-calculate the
+		 * new timeout ourselves.
+		 */
+		if (ret == 0)
+			break;
+
+		now = ktime_get();
+	} while (ktime_before(now, end));
 }
 
 /**
-- 
2.8.0.rc3.226.g39d4020

* [PATCH v4 2/2] timers: Fix documentation for schedule_timeout() and similar
  2016-10-20 21:21 [PATCH v4 1/2] timers: Fix usleep_range() in the context of wake_up_process() Douglas Anderson
@ 2016-10-20 21:21 ` Douglas Anderson
  2016-10-20 21:27   ` Thomas Gleixner
  1 sibling, 0 replies; 7+ messages in thread
From: Douglas Anderson @ 2016-10-20 21:21 UTC (permalink / raw)
  To: Thomas Gleixner, John Stultz
  Cc: Andreas Mohr, briannorris, huangtao, tony.xie, linux-rockchip,
	linux, heiko, broonie, djkurtz, tskd08, Douglas Anderson,
	linux-kernel

The documentation for schedule_timeout(), schedule_hrtimeout(), and
schedule_hrtimeout_range() all claimed that the routines couldn't
possibly return early if the task state was TASK_UNINTERRUPTIBLE.  This
was simply not true since anyone calling wake_up_process() would cause
those routines to exit early.

As some evidence that the documentation was broken (not the code):
- If we changed the code to match the documentation, msleep() would be
  identical to schedule_timeout_uninterruptible() and
  msleep_interruptible() would be identical to
  schedule_timeout_interruptible().  That doesn't seem likely to have
  been the intention.
- The schedule() function sleeps until a task is woken up.  Logically,
  one would expect that the schedule_timeout() function would sleep
  until a task is woken up or a timeout occurs.

As part of the above observation, it can be seen that
schedule_hrtimeout() and schedule_hrtimeout_range() might return -EINTR
even if the task state was TASK_UNINTERRUPTIBLE.  This isn't terrible
behavior so we'll document it and keep it as-is.  After all, trying to
match schedule_timeout() and return the time left would incur a bunch
of extra calculation cost that isn't needed in all cases.
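
To illustrate the "time left" convention that schedule_timeout() uses
(and that msleep() relies on when it loops), here is a small userspace
sketch.  It is an analogy under stated assumptions, not kernel code:
sleep_ms_remaining() is a hypothetical helper, and nanosleep() with its
"rem" argument plays the role of a routine that reports the time left
when it is woken early.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * Hypothetical helper: sleep for the whole of msecs milliseconds by
 * feeding the reported remaining time back into the next sleep.
 * Returns the number of early wakeups, or -1 on an unexpected error.
 */
static int sleep_ms_remaining(long msecs)
{
	struct timespec req, rem;
	int early_wakeups = 0;

	assert(msecs >= 0);
	req.tv_sec = msecs / 1000;
	req.tv_nsec = (msecs % 1000) * 1000000L;

	while (nanosleep(&req, &rem) != 0) {
		if (errno != EINTR)
			return -1;

		/* Woken early: sleep again for exactly the time left. */
		req = rem;
		early_wakeups++;
	}

	return early_wakeups;
}
```

With this convention the caller needs no clock reads of its own.  With
a bare -EINTR the caller has to track a deadline and recompute the
remaining time itself, which is what patch 1 does in
do_usleep_range().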

Suggested-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
Changes in v4:
- Fixed stray double quotes.
- Updated wording as per Thomas Gleixner.

Changes in v3:
- Documentation fix new for v3.

 kernel/time/hrtimer.c | 20 ++++++++++++++------
 kernel/time/timer.c   | 11 +++++++----
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index bb5ec425dfe0..08be5c99d26b 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1742,15 +1742,19 @@ schedule_hrtimeout_range_clock(ktime_t *expires, u64 delta,
  * You can set the task state as follows -
  *
  * %TASK_UNINTERRUPTIBLE - at least @timeout time is guaranteed to
- * pass before the routine returns.
+ * pass before the routine returns unless the current task is explicitly
+ * woken up, (e.g. by wake_up_process()).
  *
  * %TASK_INTERRUPTIBLE - the routine may return early if a signal is
- * delivered to the current task.
+ * delivered to the current task or the current task is explicitly woken
+ * up.
  *
  * The current task state is guaranteed to be TASK_RUNNING when this
  * routine returns.
  *
- * Returns 0 when the timer has expired otherwise -EINTR
+ * Returns 0 when the timer has expired. If the task was woken before the
+ * timer expired by a signal (only possible in state TASK_INTERRUPTIBLE) or
+ * by an explicit wakeup, it returns -EINTR.
  */
 int __sched schedule_hrtimeout_range(ktime_t *expires, u64 delta,
 				     const enum hrtimer_mode mode)
@@ -1772,15 +1776,19 @@ EXPORT_SYMBOL_GPL(schedule_hrtimeout_range);
  * You can set the task state as follows -
  *
  * %TASK_UNINTERRUPTIBLE - at least @timeout time is guaranteed to
- * pass before the routine returns.
+ * pass before the routine returns unless the current task is explicitly
+ * woken up, (e.g. by wake_up_process()).
  *
  * %TASK_INTERRUPTIBLE - the routine may return early if a signal is
- * delivered to the current task.
+ * delivered to the current task or the current task is explicitly woken
+ * up.
  *
  * The current task state is guaranteed to be TASK_RUNNING when this
  * routine returns.
  *
- * Returns 0 when the timer has expired otherwise -EINTR
+ * Returns 0 when the timer has expired. If the task was woken before the
+ * timer expired by a signal (only possible in state TASK_INTERRUPTIBLE) or
+ * by an explicit wakeup, it returns -EINTR.
  */
 int __sched schedule_hrtimeout(ktime_t *expires,
 			       const enum hrtimer_mode mode)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 219439efd56a..b2ca2a6bc4d2 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1691,11 +1691,12 @@ static void process_timeout(unsigned long __data)
  * You can set the task state as follows -
  *
  * %TASK_UNINTERRUPTIBLE - at least @timeout jiffies are guaranteed to
- * pass before the routine returns. The routine will return 0
+ * pass before the routine returns unless the current task is explicitly
+ * woken up, (e.g. by wake_up_process()).
  *
  * %TASK_INTERRUPTIBLE - the routine may return early if a signal is
- * delivered to the current task. In this case the remaining time
- * in jiffies will be returned, or 0 if the timer expired in time
+ * delivered to the current task or the current task is explicitly woken
+ * up.
  *
  * The current task state is guaranteed to be TASK_RUNNING when this
  * routine returns.
@@ -1704,7 +1705,9 @@ static void process_timeout(unsigned long __data)
  * the CPU away without a bound on the timeout. In this case the return
  * value will be %MAX_SCHEDULE_TIMEOUT.
  *
- * In all cases the return value is guaranteed to be non-negative.
+ * Returns 0 when the timer has expired otherwise the remaining time in
+ * jiffies will be returned.  In all cases the return value is guaranteed
+ * to be non-negative.
  */
 signed long __sched schedule_timeout(signed long timeout)
 {
-- 
2.8.0.rc3.226.g39d4020

* Re: [PATCH v4 1/2] timers: Fix usleep_range() in the context of wake_up_process()
@ 2016-10-20 21:27   ` Thomas Gleixner
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2016-10-20 21:27 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: John Stultz, Andreas Mohr, briannorris, huangtao, tony.xie,
	linux-rockchip, linux, heiko, broonie, djkurtz, tskd08,
	linux-kernel

On Thu, 20 Oct 2016, Douglas Anderson wrote:

Please wait for a full review and do not send out patches 5 seconds after
the first mail hits your inbox.

Thanks,

	tglx

* Re: [PATCH v4 1/2] timers: Fix usleep_range() in the context of wake_up_process()
  2016-10-20 21:27   ` Thomas Gleixner
  (?)
@ 2016-10-20 23:37   ` Doug Anderson
  2016-10-21  6:47       ` Thomas Gleixner
  -1 siblings, 1 reply; 7+ messages in thread
From: Doug Anderson @ 2016-10-20 23:37 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: John Stultz, Andreas Mohr, Brian Norris, 黄涛,
	Tony Xie, open list:ARM/Rockchip SoC...,
	Guenter Roeck, Heiko Stübner, broonie, Daniel Kurtz,
	Akihiro Tsukada, linux-kernel

Hi,

On Thu, Oct 20, 2016 at 2:27 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Thu, 20 Oct 2016, Douglas Anderson wrote:
>
> Please wait for a full review and do not send out patches 5 seconds after
> the first mail hits your inbox.

Since you had previously commented on patch 1 and had already
commented on patch 2, I presumed you were done reviewing, but I was
obviously in error.  My apologies.

-Doug

* Re: [PATCH v4 1/2] timers: Fix usleep_range() in the context of wake_up_process()
@ 2016-10-21  6:47       ` Thomas Gleixner
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2016-10-21  6:47 UTC (permalink / raw)
  To: Doug Anderson
  Cc: John Stultz, Andreas Mohr, Brian Norris, 黄涛,
	Tony Xie, open list:ARM/Rockchip SoC...,
	Guenter Roeck, Heiko Stübner, broonie, Daniel Kurtz,
	Akihiro Tsukada, linux-kernel

On Thu, 20 Oct 2016, Doug Anderson wrote:
> On Thu, Oct 20, 2016 at 2:27 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Thu, 20 Oct 2016, Douglas Anderson wrote:
> >
> > Please wait for a full review and do not send out patches 5 seconds after
> > the first mail hits your inbox.
> 
> Since you had previously commented on patch 1 and had already
> commented on patch 2, I presumed you were done reviewing, but I was
> obviously in error.  My apologies.

No problem. As a general rule you should not send updates like a machine
gun. That issue is years old, so there is no rush to fix it just because
some random (probably out of tree) code trips over it.

Thanks,

	tglx
